27. Experiments with the EGG model

From the results of the experiment described in section 19 it was concluded that word stress is realized in laryngeal behavior through an increase in subglottal pressure. An increase in fundamental frequency associated with sentence intonation, by contrast, is realized by an increase in muscular tension (the action of the cricothyroid muscle). It was our hypothesis that these two changes can be distinguished in the EGG waveform.


27.1. Configuration of the model

In order to model the EGG waveform, it is necessary to choose an appropriate overall configuration of the simulation system as well as the initial settings of the model parameters. The model is limited to vowel articulation (as vowels were the subject of the experiments described in section 16.1) and modal voice production. These assumptions greatly simplify the modelling of the vocal tract and the glottal source. Based on the results of Ishizaka and Flanagan (1972:1243), the number of elements of the vocal tract model has been limited to four cylindrical sections of equal length. Assuming the length of the vocal tract to be 16 cm, the length of each element was set to lj = 4 cm. The cross-sectional areas of the vocal tract model used for the synthesis of vowels are given in Table 17. The formant structure of the model-generated sound is used as the criterion for selecting the correct configuration. The inductances and capacitances of the electrical equivalents follow from eq. (37) and (38). The acoustic conductance representing losses due to heat conduction at the tube walls is neglected.

Table 17: The cross-sectional areas (given in cm²) of the vocal tract model segments. The indexes of the sections proceed from glottis to mouth. The length of each segment is lj = 4 cm.

vowel              Area1   Area2   Area3   Area4
[ə] (neutral e)    5       5       5       5
[a:]               0.8     0.4     3       8
[i:]               5       12      1       1
[u:]               12      1       1       12
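The mapping from the areas in Table 17 to electrical element values can be sketched as follows. This is only an illustration: eq. (37) and (38) are not reproduced in this section, so the sketch assumes the standard lossless-tube relations L = rho*l/A and C = A*l/(rho*c^2), which may differ from the formulation used in the model (for instance, a T-section network splits the inductance into two halves). The air density and sound speed are assumed values, and the function names are hypothetical. The quarter-wave estimate is an idealization of a uniform tube closed at the glottis and open at the lips; it gives a rough reference for the formants of the neutral vowel, not the output of the four-section model.

```python
# Assumed physical constants (not stated in this section), in CGS units.
RHO = 1.14e-3      # air density in g/cm^3 (assumption)
C_SOUND = 3.5e4    # speed of sound in cm/s (assumption)

SECTION_LENGTH_CM = 4.0  # lj from the text: 16 cm tract / 4 sections

# Cross-sectional areas from Table 17 (cm^2), ordered glottis to mouth.
AREAS_CM2 = {
    "schwa": [5, 5, 5, 5],
    "a:": [0.8, 0.4, 3, 8],
    "i:": [5, 12, 1, 1],
    "u:": [12, 1, 1, 12],
}


def section_elements(area_cm2, length_cm=SECTION_LENGTH_CM):
    """Return (inductance, capacitance) of one lossless tube section.

    Uses the textbook relations L = rho*l/A and C = A*l/(rho*c^2),
    assumed here in place of the document's eq. (37) and (38).
    """
    inductance = RHO * length_cm / area_cm2
    capacitance = area_cm2 * length_cm / (RHO * C_SOUND ** 2)
    return inductance, capacitance


def quarter_wave_formants(total_length_cm, count=3):
    """Resonances of an ideal uniform tube, closed at one end, open at the other."""
    return [(2 * n - 1) * C_SOUND / (4.0 * total_length_cm)
            for n in range(1, count + 1)]


if __name__ == "__main__":
    for vowel, areas in AREAS_CM2.items():
        elements = [section_elements(a) for a in areas]
        print(vowel, [(f"{l:.3e}", f"{c:.3e}") for l, c in elements])
    print("uniform 16 cm tube:", quarter_wave_formants(16.0))
```

Under these assumptions, narrow sections (small area) yield large inductances and small capacitances, which is why the constrictions in [i:] and [u:] dominate the formant pattern of those vowels.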

The next step in the formulation of the model is the setting of the physiological constraints of the vocal folds model.

The initial values are as follows (Ishizaka & Flanagan, 1972; Titze, 1988; Gubrynowicz, 1997; Stevens, 1994):

The following additional settings were used for the EGG model:

Figure 47. Simulated waveforms for the neutral [ə] model. The following tracks are depicted: a) the EGG and Ug waveforms; b) Ag1, Ag2, the cross-sectional area of the glottis from the superior view Ag, and the contact area Ac; c) the fold displacements x1 and x2; d) the pressure UR and its time derivative at the radiation load (the generated sound).

The simulated waveforms are presented in Fig. 47. They were simulated for the neutral [ə] settings, i.e. the vocal tract was modelled as a tube with a constant cross-sectional area of 5 cm². The phase difference between the motion of the upper and lower masses is about 0.5 ms, F0 = 140 Hz, and the Open Quotient (measured in the Ug waveform) is about 62%, which agrees with the data published by Ishizaka and Flanagan (1972:1251) and Childers et al. (1986). The skewing of the glottal flow Ug waveform in Fig. 47 is worth noticing. Our hypothesis is that the skewness of the glottal airflow increases as the subglottal pressure Ps increases (see section 17.6).
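To put the reported figures on a common time scale, the following minimal sketch converts F0, the Open Quotient and the phase difference between the masses into durations within one glottal cycle. It is plain arithmetic on the values quoted above; the function name and the returned keys are hypothetical, and the Open Quotient is taken here as the open-phase duration divided by the period.

```python
def cycle_timing(f0_hz, open_quotient, phase_lag_ms):
    """Express glottal-cycle quantities as durations and as a phase angle.

    open_quotient is assumed to be open-phase duration / period (0..1).
    """
    period_ms = 1000.0 / f0_hz
    open_ms = open_quotient * period_ms
    closed_ms = period_ms - open_ms
    # Lag between the lower and upper masses as a fraction of the cycle.
    lag_deg = 360.0 * phase_lag_ms / period_ms
    return {
        "period_ms": period_ms,
        "open_ms": open_ms,
        "closed_ms": closed_ms,
        "lag_deg": lag_deg,
    }


if __name__ == "__main__":
    timing = cycle_timing(f0_hz=140.0, open_quotient=0.62, phase_lag_ms=0.5)
    for name, value in timing.items():
        print(f"{name}: {value:.2f}")
```

For the values above, the period is about 7.14 ms, the open phase about 4.43 ms, the closed phase about 2.71 ms, and the 0.5 ms lag between the masses corresponds to roughly 25 degrees of the cycle.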