As in any phonetic description, the description of voice needs appropriate, unambiguous and distinctive labels. Abercrombie (1967) and later Laver (1991:173) distinguish three strands, all simultaneously and continuously (permanently) present, which describe the segmental features of voice, the features of voice quality and the features of voice dynamics. Thus, the description of voice involves the corresponding labelling of any of these strands. Laver distinguished impressionistic and phonetic labels of voice (Laver, ibid.). The former require an audible demonstration of the type of voice referred to before the listener can construct an accurate interpretation of the label (for example "flat", "thin", "bird-like" or "velvety", or other so-called "imitation labels"). The latter should be part of a well-organized vocabulary and should have an exact and agreed-upon definition which can be assigned to a label by a group of trained phoneticians. Phonetic labels of voices consist of sets of labels that cover all aspects of voice production, assuming standard anatomy and physiology. In fact, they act as instructions for achieving a certain articulation with a certain voice quality (e.g. loud, slow, nasalized, harsh, whispery, creaky, falsetto). Unfortunately, no standardized labelling system of voice quality exists as yet, and phonetic labels are not mutually exclusive and are sometimes ambiguous.
In the linguistic literature voice quality is generally looked at from two perspectives: is it phonemic or non-phonemic?
Phonemic voice quality has a contrastive function in the phonological system of a language. In most languages a contrast between segments is achieved on an articulatory basis rather than by different phonation types (defined in section 5.1), although for example breathiness is phonemic for vowels in Gujarati and for stops in Igbo (Ladefoged & Maddieson, 1996:47, 304). The languages using phonation contrasts are summarized in Table I.
Table I: Examples of languages which use phonation types distinctively (after Ladefoged & Maddieson, 1996)
Language | Contrastive phonation types |
---|---|
most languages | voiced vs. voiceless |
Icelandic | voiced vs. voiceless: contrast between nasals¹ |
Ik, Dafla, Amerindian languages of the Plains and Rockies, Bantu languages of the Congo basin, Indo-Iranian languages of the border region | voiced vs. voiceless |
Gujarati, !Xóõ | modal vs. breathy voice |
Indo-Aryan languages | modal vs. breathy voice |
Mpi | modal vs. stiff (slightly creaky) voice |
Parauk | slightly breathy vs. slightly stiff voice |
Jalapa Mazatec | modal vs. breathy vs. creaky |
Korean | stiff vs. modal voice |
Javanese | stiff vs. slack voice |

¹ Jessen & Pétursson (1997)
The changes in voice source behavior may be associated with segmental or suprasegmental elements on the linguistic layer of communication. Of the different phonation types (see section 5.1), modal, creaky (laryngealized), breathy and harsh (Ní Chasaide & Gobl, 1997:452) are used linguistically. It is rather striking that the tense/lax voice opposition (in the sense of the degree of overall muscular tension) is used linguistically (Maddieson & Ladefoged, 1985). In a segmental context voice quality is used contrastively for vowels and consonants in South African, South East Asian and native North American languages, as shown in Table I (Ladefoged & Maddieson, 1996; Ní Chasaide & Gobl, 1997). Although the laryngeal differences are associated with voice quality distinctions between consonants, they are primarily located at the onset or offset of a vowel (e.g. in the breathy nasals of Tsonga the acoustic effects mostly affect the vowel onset; vocal fold abduction for the breathy voiced nasal begins during the nasal consonant) (Ní Chasaide & Gobl, 1997:454). A suprasegmental property such as intonation, tone or stress also affects the production of voice. In this regard the respective characteristics are perceived to be dependent on the language used. Studies have shown that listeners with different native languages judge voice quality differently (Hurme & Sonninen, 1986). In other words, judgements of voice quality are affected by a listener's phonological system (Lin, 1995:18).
An interesting but still unresearched function of voice quality is that it is perceived unconsciously. Independently of what is said, a voice can be perceived as friendly, curious, vicious, off-putting etc. Helmholtz (1863) named this direct perception of emotions based on voice quality "unbewußtes Schließen" (unconscious inference).
Another issue concerning voice quality is its contribution to what is commonly called pathological voice. As mentioned above, the labelling of different voices is not unambiguous and the perception of voice quality is not universal, as it depends both on cultural differences in general and on the phonological system of a listener's native language. The description of pathological voice, however, attempts to be universal and is based primarily on more abstract laryngeal functions.
Among the various systems of pathological voice description the most common ones concentrate on the degree of "hoarseness" (Hirano, 1981; Nawka & Anders, 1996). Hoarseness is a term used to explain the perceived voice abnormality as originating at the voice source rather than resulting from abnormalities in vocal tract configuration, and it is perceptually related to noise generation during phonation. The perception of voice abnormality through hoarseness can be graded if we provide a detailed and language-independent description of a voice quality. Hirano (ibid.) proposes a scale of voice judgements which includes quantifiable perceptual dimensions related to a set of descriptive parameters for acoustic phenomena (Lin, 1995:20). The factors involved in the classification are grade (G, the overall degree of hoarseness), roughness (R), breathiness (B), asthenicity (A) and strain (S).
Each of those labels can be graded from 0 to 3. This labelling system is known as the GRBAS classification (Isshiki & Takeuchi, 1970; Hirano, 1981, 1989).
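As an illustration of how such a grading could be recorded, the following is a minimal Python sketch, not part of the GRBAS literature itself. The class name `GrbasRating`, its field names (taken from the usual expansion of the acronym: grade, roughness, breathiness, asthenia, strain) and the compact `code()` notation are assumptions made for this example only; the only constraint taken from the text is that each label is graded from 0 to 3.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class GrbasRating:
    """One listener's judgement of one voice (hypothetical helper class)."""

    grade: int        # G: overall degree of hoarseness
    roughness: int    # R
    breathiness: int  # B
    asthenia: int     # A
    strain: int       # S

    def __post_init__(self):
        # Every dimension must lie on the 0-3 scale described in the text.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, int) or not 0 <= value <= 3:
                raise ValueError(
                    f"{f.name} must be an integer from 0 to 3, got {value!r}"
                )

    def code(self) -> str:
        """Compact label built from initial letters, e.g. 'G2R1B2A0S1'.

        This string notation is an assumption of the sketch.
        """
        return "".join(
            f"{f.name[0].upper()}{getattr(self, f.name)}" for f in fields(self)
        )


rating = GrbasRating(grade=2, roughness=1, breathiness=2, asthenia=0, strain=1)
print(rating.code())
```

Rejecting out-of-range values at construction time keeps invalid judgements from entering any later comparison between listeners.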
It is widely used in the US and Japan. In Europe the labelling of asthenicity (A) has been criticized as highly correlated with breathiness. Also, the judgments of the tenseness of voice diverge considerably. For this reason a simpler system, the so-called RBH system (Wendler et al., 1986; Nawka & Anders, 1996:8), which is based on only three perceptual dimensions (roughness, breathiness and hoarseness), has come into use.
Listen to a voice graded R3B2H3 (WAV file, 100 kB)
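A label of the form R3B2H3 can be read mechanically as three dimension and grade pairs. The Python sketch below shows one way to do this; the function name `parse_rbh` and the returned dictionary keys are hypothetical, and the 0 to 3 range for each dimension is assumed here by analogy with the GRBAS scale.

```python
import re

# One digit per dimension, in the fixed order R, B, H.
# The 0-3 range is an assumption of this sketch.
RBH_PATTERN = re.compile(r"R([0-3])B([0-3])H([0-3])")


def parse_rbh(code: str) -> dict:
    """Split a compact RBH label such as 'R3B2H3' into its three grades."""
    match = RBH_PATTERN.fullmatch(code.strip().upper())
    if match is None:
        raise ValueError(f"not a valid RBH label: {code!r}")
    roughness, breathiness, hoarseness = (int(g) for g in match.groups())
    return {
        "roughness": roughness,
        "breathiness": breathiness,
        "hoarseness": hoarseness,
    }


print(parse_rbh("R3B2H3"))
# prints {'roughness': 3, 'breathiness': 2, 'hoarseness': 3}
```

Using `fullmatch` rather than `search` means that partial or over-long labels are rejected instead of being silently truncated.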
In Laver's (1991) framework it is possible to describe non-pathological voice qualities in a relatively objective manner.
Perceived voice quality can be described using phonetic settings (Table II). The settings are grouped into:
The description of a particular setting is usually given in terms of the degree of deviation from a neutral setting. The neutral setting is defined as a normal position relative to possible adjustments (Laver, 1991:186). Within this description voice quality is regarded as a superposition of a setting and an "organic component" which, to a large extent, characterizes the baseline of the speaker's voice, i.e. its neutral setting.
Supralaryngeal Settings | Laryngeal Settings |
---|---|
Longitudinal axis settings: labial (labial protrusion) | Simple phonation types |
Latitudinal axis settings | Compound phonation types |
Velopharyngeal settings | Overall muscular tension settings |