21. Recording procedures and measurements

The EGG signals of the subjects were recorded onto cassette tapes in a quiet room. The speech signals were recorded simultaneously but were not used for further analysis. The recordings were made on a commercially available tape recorder, and the EGG signal was provided by the Portable Laryngograph of Laryngograph Ltd. The recordings were then digitally transmitted to a Silicon Graphics Indy computer. Each data file contained an EGG representation of the sustained vowels /i:/, /a:/, /u:/. The data processing was fully automatic. The program (integrated in the ESPS/waves+(TM) speech processing environment) determined the start of phonation in the data and then analyzed one second of the data beginning 0.5 s after the initiation of phonation. This is in line with the advice of Klingholz (1991), who suggested the use of 20 to 30 pitch periods in order to be able to draw conclusions about voice quality.
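The window selection described above can be sketched as follows. This is only an illustration: the text does not say how the program detects the start of phonation, so the simple amplitude threshold used here, as well as the function names `find_onset` and `analysis_window`, are assumptions and not the original implementation.

```python
def find_onset(samples, threshold):
    """Return the index of the first sample whose magnitude exceeds
    `threshold`, or None if there is none.
    Assumption: a plain amplitude threshold stands in for the
    (unspecified) phonation-onset detector."""
    for i, s in enumerate(samples):
        if abs(s) > threshold:
            return i
    return None


def analysis_window(samples, sample_rate, threshold,
                    offset_s=0.5, length_s=1.0):
    """Return (start, end) sample indices of a window of `length_s`
    seconds beginning `offset_s` seconds after the detected onset.
    Returns None if no onset is found or the recording is too short."""
    onset = find_onset(samples, threshold)
    if onset is None:
        return None
    start = onset + int(round(offset_s * sample_rate))
    end = start + int(round(length_s * sample_rate))
    if end > len(samples):
        return None
    return start, end
```

With a hypothetical sampling rate of 1000 Hz and phonation starting at sample 100, the window would cover samples 600 to 1600.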

The signal is divided into single pitch periods, which are described by a number of parameters. The periods of the EGG are determined using an algorithm proposed by Vieira et al. (1996) and already described in section 10.1. The recorded signal is band-pass filtered (60-5000 Hz) using an FIR filter with 60 dB rejection in the stop bands, so that "unique" zero crossings between adjacent negative and positive "significant" peaks can be found (see Vieira et al., 1996b for details). The zero-crossing points are used as markers for the instants of glottal closure, and they also determine the pitch periods in the original EGG signal (after compensating for the shift in the signal that results from the order of the FIR filter). Every pitch period is characterized using the parameters proposed in section 12. The description contains the duration and amplitude parametrization of the waveform. Moreover, the discrepancy between the true waveform and the corresponding straight line segment is computed. In addition, the peak-to-peak amplitude and the relative difference in duration between successive pitch periods (first order perturbation factor, Gubrynowicz et al., 1980) are registered and compared.
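A minimal sketch of the marker and perturbation steps is given below. It makes several simplifying assumptions that are not taken from the text: every negative-to-positive zero crossing of the filtered signal is treated as a marker (the selection of "unique" crossings between "significant" peaks in Vieira et al. is not reproduced), the FIR filter is assumed to be linear-phase so that its delay is half the filter order in samples, and the relative duration difference is normalized by the preceding period. All function names are hypothetical.

```python
def rising_zero_crossings(signal, sample_rate):
    """Times (in seconds) of negative-to-positive zero crossings,
    located by linear interpolation between adjacent samples."""
    times = []
    for i in range(1, len(signal)):
        a, b = signal[i - 1], signal[i]
        if a < 0.0 <= b:
            frac = -a / (b - a)  # fractional position of the crossing
            times.append((i - 1 + frac) / sample_rate)
    return times


def compensate_fir_delay(markers, filter_order, sample_rate):
    """Shift markers back by the group delay of a linear-phase FIR
    filter, which is filter_order / 2 samples (assumed filter type)."""
    delay_s = (filter_order / 2.0) / sample_rate
    return [t - delay_s for t in markers]


def period_durations(markers):
    """Durations between successive markers."""
    return [t1 - t0 for t0, t1 in zip(markers, markers[1:])]


def relative_duration_differences(periods):
    """|T[i] - T[i-1]| / T[i-1] for successive pitch periods.
    Assumption: normalization by the preceding period."""
    return [abs(p1 - p0) / p0 for p0, p1 in zip(periods, periods[1:])]
```

For a perfectly periodic signal all period durations are equal and the relative differences are zero; irregular phonation shows up as nonzero values.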

The parametrization was done for every pitch period, and subsequently the mean and the standard deviation of every parameter were registered. The resulting vector describing each vowel spoken by each speaker consists of 20 parameters.
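The aggregation step can be illustrated with a short sketch. The parameter names in the usage example are placeholders, the use of the sample standard deviation is an assumption (the text does not say which estimator was used), and `summarize_periods` is a hypothetical helper.

```python
from statistics import mean, stdev


def summarize_periods(period_params):
    """Collapse per-period parameters into one feature vector.

    `period_params` is a list of dicts, one per pitch period, all with
    the same keys. For each parameter (in sorted name order) the mean
    and the sample standard deviation are appended to the vector."""
    names = sorted(period_params[0])
    vector = []
    for name in names:
        values = [p[name] for p in period_params]
        vector.append(mean(values))
        # stdev needs at least two values; fall back to 0.0 otherwise
        vector.append(stdev(values) if len(values) > 1 else 0.0)
    return names, vector


# Usage with two placeholder parameters and two pitch periods
names, vector = summarize_periods([
    {"duration": 0.010, "amplitude": 1.0},
    {"duration": 0.012, "amplitude": 3.0},
])
```

Each per-period parameter contributes two entries, so a vector of 20 values would correspond to ten per-period parameters under this scheme.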

The quality of the EGG data was often poor. In many cases the signal-to-noise ratio was low, and the shape of the signal was often unusual due to notches and fluctuations. However, no recording was rejected from further processing. In some cases, it was necessary to move the analysis window, which usually starts 0.5 s after the phonation onset, to another portion of the signal. In general, the part of the voiced segment that was most stable and of maximal duration was used.