11. The EGG waveform and voice qualities
-
Figure 21. Examples of EGG recordings of different
voice qualities
The dependence of the EGG waveform on voice quality
is evident in Fig. 21. Several periods of Lx recordings
of /a:/ spoken by the same speaker are compared using the same amplitude
and time resolution settings. Visual inspection
of the waveforms leads to the following conclusions:
-
-
Modal voice is characterized by a
steep increase of contact in the closing phase. The closing phase is short
and the peak-to-peak amplitude is high. The maximum contact phase (closed phase)
has a parabolic shape and the transition to the
rise of the impedance is smooth. The fall
of the signal lasts longer than the rise (the skewness is substantially smaller
than 100%) and a knee can be observed in the increase of the impedance. The
increase in impedance ends at the minimum located at the beginning of the
no-contact phase; from then on the impedance slowly decreases until the sharp
decrease at the beginning of the next signal period. The signal duty ratio
is about 50% (the Open Quotient), although the location of the opening instant
is not obvious. The waveform directly corresponds to the vocal fold vibration
model and its EGG projection described in section 26 (Fig. 45). Moreover,
the phases of the signal are almost linear (except for the maximum contact
phase).
-
-
Whispery voice periods are slightly longer
than those of modal voice. The decrease in impedance is very fast. The signal
peak is located at the beginning of the maximum contact phase, which is much
shorter than for modal voice. The increase in impedance is also faster and
the knee in the fall is present, although in this particular example it would
be preferable if the opening instant were located at the signal level of
the closure instant (i.e. where the falling edge of the signal changes from
concave to convex, not where the time derivative reaches the minimum). The
skewness of the contact pulse is much smaller than 100% but larger
than for modal voice. The minimum of the signal amplitude is reached at about
the beginning of the no-contact phase. From this point on the waveform rises
linearly up to the discontinuity point at the beginning of the sharp rise
which marks the next period of the signal. The duty ratio is much lower than
for modal voice, indicating a much higher Open Quotient. The high amplitude
of the signal suggests that there is complete adduction of the mucous part
of the vocal folds. Once again, the phases of the EGG waveform are easy to
identify in the EGG model. It is to be expected that the durations of the
individual phases differ from those of modal voice.
-
-
The EGG recordings of creaky voice contrast
in many respects with the other voice qualities. The shape of the waveform
could be described as a "triangle with rounded corners" and the period duration
is about twice as long as for modal voice. The instant of glottal
closure can be identified clearly at the sharp rise of the waveform. The
maximum contact is short compared to the period duration. In the opening
phase the signal gradually decreases and no knee can be identified. For
this reason, the signal duty cycle is not easy to define and it is to be expected
that the measure will be biased and imprecise. With respect to the skewness
of the pulse, one can observe a very strong tendency to the left, presumably
stronger than for the other voices. The no-contact phase has a rather parabolic
form with the signal minimum located at the centre of the phase. Using the
EGG model (see section 9), the description of the waveform is more complicated
due to the undetectable opening instant. The start and end of the opening
phases (see section 12) are approximated with two straight lines with almost
the same slope. Also, the initial part of the closing phase, with a rather
gradual increase in the signal, is difficult to localize. The peak-to-peak
amplitude is still high, which indicates good current flow during the
contact phase, which in turn suggests complete contact between the vocal
folds. Noise is also present in the signal.
-
-
Breathy voice is easy to distinguish from
the other voices presented in Fig. 21. First of all,
the peak amplitude is lower, which may result from poor contact between
the vocal folds or their incomplete closure. Also, the duty ratio of the
signal is obviously greater than in other voice qualities. Considering the
ratio of rise time and fall time, higher values are observed. The skewness
of the pulse is smaller. The maximum contact phase is extremely short, with
the maximum located approximately in the middle of the pulse. The contour
of the waveform can be described as consisting of short triangular pulses
with long, straight, almost zero-valued, flat pauses (plateaus) between them.
The instants of glottal closure and opening can be identified as intersections
with the baseline. In the no-contact phase, a slow fluctuation of the signal
can be observed. The description of the waveform using the proposed model
is still possible, although some phases (for example the start of closing
or the end of opening) are obviously reduced or even absent. The pitch is
low but higher than for creaky voice.
-
-
Tense voice (in Fig. 21) is produced with a strongly increased muscular
tension in the larynx. The EGG waveform typical of tense voice differs in
many respects from the waveforms of other voice qualities. First of all, the
shape is rounded, much more sinusoidal and smoothed when compared to other
voices. The peak-to-peak amplitude is lower, which can be caused by incomplete
glottal closure. The steepness of the increasing signal slope is significantly
reduced and the parabolic part starts earlier and lasts longer during the
maximum contact phase. The increase in the impedance is gradual with a slight
indication of a knee, but using this time instant as an
estimate of the opening instant seems very unreliable. The maximum contact phase
is relatively long. The duty cycle of the signal is comparable to that of
modal voice. The skewness is rather high, but the rise time is relatively long.
The minimum of the signal is reached rather late in the no-contact phase.
The fundamental frequency of the signal is comparable to that of modal voice,
whereas the signal-to-noise ratio seems to be lower. The description of tense
voice in terms of the proposed model of straight lines (section 12) is biased
due to the rounded waveform. However, this also means that the measure of the
distance between the modelled and the original waveform should be notably
higher than for other voices. The start of the closing phase is not
detectable in the waveform.
-
-
The EGG shape of falsetto also differs from
the other voice qualities. The pitch periods are almost three times shorter
than for modal voice, the minimum and maximum peaks are sharp and the waveform
can be accurately approximated with straight lines. The rise of the signal
is extremely fast. The maximum contact is of very short duration. The fall
is rapid and no knee is observed, although in the middle part of the fall the
time derivative reaches its minimum. The position of the opening instant
is uncertain. The skewness of the pulse is limited and so the pulses are
much more symmetrical than in the other examples. The EGG signal reaches
the minimum relatively early in the no-contact phase and a fast, quasi-linear
rise can be observed up to a sharp acceleration at the closure instant.
The approximated duty cycle is rather high when compared to whispery voice.
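The quantities compared in the descriptions above (period length,
peak-to-peak amplitude, duty ratio, ratio of rise time to fall time) could be
measured on a single period roughly as follows. This is a minimal sketch and
not the procedure used for Fig. 21. It assumes contact-positive polarity
(a larger value means more vocal fold contact), a period that starts before
the sharp rise, and a 50% criterion level. The function name
`describe_period`, the `level` parameter and the reading of "skewness" as the
rise-to-fall ratio in percent are illustrative assumptions; the text does not
define them here.

```python
def describe_period(period, sample_rate_hz, level=0.5):
    """Measure one EGG period given as a list of samples.

    Assumption: contact-positive polarity and one contact pulse per period.
    """
    lowest, highest = min(period), max(period)
    peak_to_peak = highest - lowest
    if peak_to_peak == 0:
        raise ValueError("flat signal: no contact pulse to measure")

    # Samples at or above the criterion level count as "contact" (assumption).
    threshold = lowest + level * peak_to_peak
    above = [i for i, value in enumerate(period) if value >= threshold]

    peak_index = period.index(highest)
    rise_samples = peak_index - above[0]   # level crossing up to the peak
    fall_samples = above[-1] - peak_index  # peak down to the level crossing

    return {
        "f0_hz": sample_rate_hz / len(period),
        "peak_to_peak": peak_to_peak,
        "contact_fraction": len(above) / len(period),
        # 100% would be a symmetrical pulse; below 100% the fall is longer.
        "rise_fall_percent": (100.0 * rise_samples / fall_samples
                              if fall_samples else None),
    }


# Synthetic period (not measured data): fast rise, slow fall, flat pause.
period = [4 * i for i in range(10)] + [40 - k for k in range(41)] + [0] * 49
features = describe_period(period, sample_rate_hz=10_000)
print(features)
```

On real recordings the criterion level strongly affects the result, and for
waveforms without a clear knee (creaky, tense, falsetto) any such threshold
is only a rough stand-in for the opening instant.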
It is apparent from the depicted waveforms that the EGG waveform differs
across voice qualities. Below, an objective and automated way of describing
those differences will be presented. The proposed description of the EGG
waveform will be used to distinguish voice qualities that are used on the
linguistic and paralinguistic layers of communication.
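The remark made for tense voice, that a rounded waveform lies further from a
straight-line model than an angular one, can be illustrated with a small
sketch. It is not the description method announced above. The breakpoints,
the root-mean-square distance and the names `piecewise_linear` and
`rms_distance` are assumptions chosen for the illustration, and both
waveforms are synthetic.

```python
from math import cos, pi, sqrt


def piecewise_linear(breakpoints):
    """Sample straight lines through (index, value) breakpoints.

    Returns one value per sample from the first to the last breakpoint index.
    """
    model = []
    for (i0, v0), (i1, v1) in zip(breakpoints, breakpoints[1:]):
        for i in range(i0, i1):
            model.append(v0 + (v1 - v0) * (i - i0) / (i1 - i0))
    model.append(breakpoints[-1][1])
    return model


def rms_distance(waveform, model):
    """Root-mean-square difference between a waveform and its model."""
    squared = sum((w - m) ** 2 for w, m in zip(waveform, model))
    return sqrt(squared / len(waveform))


# Hypothetical breakpoints: minimum, contact maximum, minimum.
corners = [(0, 0.0), (50, 1.0), (100, 0.0)]
model = piecewise_linear(corners)

# Angular pulse (falsetto-like) and rounded pulse (tense-like), synthetic.
triangle = [i / 50 if i <= 50 else (100 - i) / 50 for i in range(101)]
rounded = [0.5 - 0.5 * cos(2 * pi * i / 100) for i in range(101)]

print("angular:", rms_distance(triangle, model))
print("rounded:", rms_distance(rounded, model))
```

Both pulses pass through the same three breakpoints, so the difference in
distance comes only from the curvature between them.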