27. Experiments with the EGG model

From the results of the experiment described in section 19 it was concluded that word stress is realized in laryngeal behavior through an increase in subglottal pressure, whereas an increase in fundamental frequency in sentence intonation is realized through an increase in muscular tension (the action of the cricothyroid muscle). It was our hypothesis that these two kinds of change can be distinguished in the EGG waveform.

27.1. Configuration of the model

To model the EGG waveform, it is necessary to choose an appropriate overall configuration of the simulation system as well as the initial settings of the model parameters. The model is limited to vowel articulation (vowels were the subject of the experiments described in section 16.1) and to modal voice production. These assumptions greatly simplify the modelling of the vocal tract and the glottal source. Based on the results of Ishizaka and Flanagan (1972:1243), the vocal tract model has been limited to four cylindrical sections of equal length. Assuming a vocal tract length of 16 cm, the length of each section was set to lj = 4 cm. The cross-sectional areas of the vocal tract model used for the synthesis of vowels are given in Table 17. The formant structure of the model-generated sound² is used as the criterion for selecting the correct configuration. The inductances and capacitances of the electrical equivalents follow from eq. (37) and (38). The acoustic conductance representing losses due to heat conduction at the tube walls is neglected.

Table 17: The cross-sectional areas (in cm²) of the vocal tract model segments. The sections are indexed from the glottis to the mouth. Each segment has a length of lj = 4 cm.
vowel	Area₁	Area₂	Area₃	Area₄
[ə] (neutral e)	5	5	5	5
[a:]	0.8	0.4	3	8
[i:]	5	12	1	1
[u:]	12	1	1	12
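
The configuration in Table 17 can be turned into lumped acoustic elements with a few lines of code. The sketch below is only illustrative: it assumes the standard lossless-tube relations L = ρl/A and C = Al/(ρc²), which may differ from eq. (37) and (38) in how the inductance is split within each section, and it assumes a speed of sound of c = 35 000 cm/s, a value not stated in this section. The names `AREAS`, `section_elements` and `uniform_tube_formants` are hypothetical. The quarter-wavelength formula is included as a rough reference for the neutral vowel; a four-section lumped model will only approximate these frequencies.

```python
RHO = 0.00129         # air density in g/cm^3 (from the parameter list of this section)
C_SOUND = 35000.0     # speed of sound in cm/s (assumed, not given in the text)
SECTION_LENGTH = 4.0  # length of each section in cm (Table 17)

# Cross-sectional areas in cm^2, ordered from glottis to mouth (Table 17).
AREAS = {
    "neutral": [5, 5, 5, 5],
    "a:": [0.8, 0.4, 3, 8],
    "i:": [5, 12, 1, 1],
    "u:": [12, 1, 1, 12],
}


def section_elements(area, length=SECTION_LENGTH):
    """Return (inductance, capacitance) of one lossless cylindrical section.

    Assumed relations: L = rho * l / A and C = A * l / (rho * c^2).
    """
    inductance = RHO * length / area
    capacitance = area * length / (RHO * C_SOUND ** 2)
    return inductance, capacitance


def uniform_tube_formants(total_length, count=3):
    """Quarter-wavelength resonances of a uniform tube closed at the glottis."""
    return [(2 * n - 1) * C_SOUND / (4 * total_length) for n in range(1, count + 1)]


if __name__ == "__main__":
    for vowel, areas in AREAS.items():
        elements = [section_elements(a) for a in areas]
        print(vowel, [(round(l, 6), round(c, 9)) for l, c in elements])
    print("reference formants of a 16 cm uniform tube:", uniform_tube_formants(16.0))
```

For the neutral vowel this reference calculation places the first three resonances near 547, 1641 and 2734 Hz under the assumed speed of sound.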

The next step in formulating the model is to set the physiological constraints of the vocal fold model.

The initial values are as follows (Ishizaka & Flanagan, 1972; Titze, 1988; Gubrynowicz, 1997; Stevens, 1994):

subglottal pressure Ps = 8 cm H₂O
length of the glottis lg = 1.4 cm
thickness of the vocal folds d1 + d2 = 0.3 cm, with d1 = 0.25 cm and d2 = 0.05 cm
mass of the folds m1 + m2 = 0.15 g, with m1 = 0.125 g and m2 = 0.025 g
the initial configuration of the glottis is rectangular, with initial areas Ag01 = Ag02 = 0.05 cm²
nonlinear coefficients of the springs ηk1 = ηk2 = 100 and ηh1 = ηh2 = 500
the elastic collision coefficients h1, h2 depend on the stiffness of the folds: h1 = 3k1, h2 = 3k2
air density ρ = 0.00129 g/cm³
air viscosity coefficient μ = 18.2466 × 10⁻⁵ dyn·s/cm²
the linear stiffnesses of the fold masses k1, k2 and the coupling stiffness kc are chosen so as to establish self-sustained oscillation of the folds. The region of self-sustained oscillation depends on the ratio of the masses and on the ratio of the damping factors ζ1, ζ2. Typical values are k1 = 80 000 dyn/cm, k2 = 8 000 dyn/cm and kc = 25 000 dyn/cm for the damping factors ζ1 = 0.1 and ζ2 = 0.6
sampling frequency fs = 1/T = 16 000 Hz.
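
As a plausibility check on these initial values, the sketch below collects the mass, stiffness and damping settings in a small data structure and derives a few quantities from them. It is an illustration under stated assumptions, not the simulation code used here: the class `FoldMass` and its methods are hypothetical names, and the damping resistance r = 2ζ√(mk) and the uncoupled natural frequency f = √(k/m)/(2π) are the usual second-order oscillator relations, assumed to apply to each mass separately (the coupling stiffness kc is ignored).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FoldMass:
    """One mass of the two-mass vocal fold model (CGS units)."""
    mass: float           # g
    stiffness: float      # dyn/cm
    damping_ratio: float  # dimensionless

    def natural_frequency(self):
        """Uncoupled natural frequency in Hz (assumed relation sqrt(k/m) / 2pi)."""
        return math.sqrt(self.stiffness / self.mass) / (2 * math.pi)

    def damping_resistance(self):
        """Viscous resistance in dyn*s/cm (assumed relation 2 * zeta * sqrt(m * k))."""
        return 2 * self.damping_ratio * math.sqrt(self.mass * self.stiffness)

    def collision_stiffness(self):
        """Elastic collision coefficient h = 3k, as stated in the parameter list."""
        return 3 * self.stiffness


# Values taken from the parameter list above.
mass1 = FoldMass(mass=0.125, stiffness=80000.0, damping_ratio=0.1)
mass2 = FoldMass(mass=0.025, stiffness=8000.0, damping_ratio=0.6)
SAMPLING_PERIOD = 1.0 / 16000.0  # s

if __name__ == "__main__":
    for name, m in (("mass 1", mass1), ("mass 2", mass2)):
        print(name, m.natural_frequency(), m.damping_resistance(), m.collision_stiffness())
    print("sampling period in microseconds:", SAMPLING_PERIOD * 1e6)
```

Under these assumptions the uncoupled natural frequency of the first mass is about 127 Hz, the same order as the F0 of the simulated waveform, although the oscillation frequency of the coupled, flow-driven system need not coincide with it.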

The following additional settings were used for the EGG model:

the closing angle θc = 0.5° and the opening angle θo = 2°; the values are slightly modified³ compared with those of Childers et al. (1986:1314)
shunt impedance C=0.1
scaling constant k=100
the resulting EGG signal was additionally inverted to represent increased contact by means of an increased amplitude of the waveform (in accordance with the Laryngograph device output).


Figure 47. The simulated waveforms for the neutral [ə] model. The following tracks are depicted: a) the EGG and Ug waveforms, b) Ag1, Ag2, the cross-sectional area of the glottis from the superior view Ag, and the contact area Ac, c) the fold displacements x1, x2, d) the pressure UR and its time derivative at the radiation load (generated sound)

The simulated waveforms are presented in Fig. 47. They were simulated for the neutral [ə] settings, i.e. the vocal tract was modelled as a tube with a constant cross-sectional area of 5 cm². The phase difference between the motion of the upper and lower masses is about 0.5 ms, F0 = 140 Hz, and the Open Quotient (measured in the Ug waveform) is about 62%, which agrees with the data published by Ishizaka and Flanagan (1972:1251) and Childers et al. (1986). The skewing of the glottal flow waveform Ug in Fig. 47 is worth noting. Our hypothesis is that the skewness of the glottal airflow increases with increasing subglottal pressure Ps (see section 17.6).
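
The Open Quotient quoted above can be read off a single period of the Ug waveform as the fraction of the period during which the flow is non-zero. The sketch below shows one way to do this; it is not the measurement procedure used in this work. The function names, the relative threshold of 1% of the peak flow, and the synthetic half-sine pulse are all assumptions made for illustration.

```python
import math

FS = 16000.0  # sampling frequency in Hz (from the parameter list)


def period_ms(f0):
    """Length of one fundamental period in milliseconds."""
    return 1000.0 / f0


def ms_to_samples(duration_ms, fs=FS):
    """Convert a duration in milliseconds to a (possibly fractional) sample count."""
    return duration_ms * fs / 1000.0


def open_quotient(flow_period, rel_threshold=0.01):
    """Fraction of one period in which the flow exceeds rel_threshold * peak."""
    peak = max(flow_period)
    if peak <= 0:
        raise ValueError("flow period contains no positive flow")
    open_samples = sum(1 for u in flow_period if u > rel_threshold * peak)
    return open_samples / len(flow_period)


# Synthetic test period (not simulation output): 62 samples of a half-sine
# pulse followed by 38 samples of complete closure.
synthetic = [math.sin(math.pi * (i + 0.5) / 62) for i in range(62)] + [0.0] * 38

if __name__ == "__main__":
    print("open quotient of synthetic pulse:", open_quotient(synthetic))
    print("period at 140 Hz in ms:", period_ms(140.0))
    print("0.5 ms phase lag in samples:", ms_to_samples(0.5))
```

At F0 = 140 Hz one period lasts about 7.14 ms, so an Open Quotient of 62% corresponds to an open phase of roughly 4.4 ms, and the 0.5 ms phase difference between the masses spans 8 samples at fs = 16 000 Hz.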