«Year: 2016 On the origin of post-aspirated stops: production and perception of /s/ + voiceless stop sequences in Andalusian Spanish Ruch, Hanna; ...»
(Ruch & Harrington, 2014) holds true for /sp/- and /sk/-sequences as well. This apparent-time study with 48 speakers of Andalusian Spanish, 24 of an Eastern (Granada) and 24 of a Western variety (Seville Spanish) showed that younger WAS speakers produced /sp, st, sk/ with a longer post-aspiration and a shorter pre-aspiration than older WAS speakers. Pre-aspiration and post-aspiration duration were inferred by a semi-automatic procedure (Ruch & Harrington, 2014) that acoustically measures voice termination time (VTT) and voice onset time (VOT), that is, the interval between the offset/onset of voicing in the preceding/following vowel. The results confirmed the hypothesis of a sound change in progress in Andalusian Spanish (O’Neill, 2010; Parrell, 2012; Torreira, 2007a) not only for /st/-, but also for /sp/- and /sk/-sequences: /s/ + voiceless stop sequences are increasingly produced with a long post-aspiration and a very short pre-aspiration in young speakers. This tendency for a longer post- and a shorter pre-aspiration was found for Eastern Andalusian Spanish as well, where younger and older speakers, however, differed significantly only in post-aspiration duration in /st/-sequences, but not in /sp, sk/ and not in pre-aspiration duration.
Another aim of this study has been to investigate the effect of place of articulation on the production of pre- and post-aspiration in the two varieties and age groups in order to tackle the articulatory or perceptual factors that might have given rise to the sound change. VTT was longest preceding velar, and shortest preceding bilabial and dental stops. This pattern is consistent with dialectological (e.g., Alther, 1935) and phonetic studies (e.g., Marrero, 1990; Sánchez Muñoz, 2004) that describe how aspiration resulting from /s/-weakening sometimes disappears preceding /p/ and /t/, but rarely so preceding /k/. It is also in line with findings for languages that have pre-aspiration segmentally such as Scottish Gaelic (Clayton, 2010; Nance & Stuart-Smith, 2013; Ní Chasaide, 1985) or dialects of Swedish (Helgason & Ringen, 2008), where pre-aspiration is reported to be longer in the velar than in the bilabial context. The longer pre-aspiration duration preceding velar stops is likely to be due to articulatory factors, that is, to a slower movement of the tongue back as opposed to the tongue tip and the lips in dental and bilabial stops (Helgason & Ringen, 2008). The lack of interactions between place of articulation, age, and variety in our study suggests that pre-aspiration is fading to equal degrees across stop types.
VOT appeared to be more variable when compared among places of articulation, varieties, and age groups. Intervocalic stops /p, t, k/ exhibited the expected VOT-pattern with the velar displaying the longest, and the bilabial stop displaying the shortest VOT.
The same gradation of VOT was found for hC-sequences in older WAS and in EAS participants. In younger WAS speakers, however, post-aspiration appeared to be as long in /st/- as in /sk/-sequences and therefore deviated from the VOT-pattern that has been attributed to phonetic universals and has been found for many languages (Cho & Ladefoged, 1999). The very long post-aspiration in /st/-sequences in young WAS speakers, and the fact that younger EAS speakers produced a longer VOT than older EAS speakers only in /st/, suggest that the emerging post-aspiration cannot be explained entirely by articulatory factors. The finding that in EAS only /st/-sequences show the change supports the idea that the sound change was actuated in the dental context.
This interpretation is based on the assumption that, up to this point, the trajectory of the change was the same in EAS and WAS. Due to aerodynamic and perceptual factors, a slightly longer VOT might be particularly prone to imitation and further lengthening in the dental, but not in the bilabial and the velar context: There is evidence that the stop release of a [tʰ] contains more energy in the high-frequency range than that of a [pʰ] (Harrington, 2010a, p. 104). The lesser auditory salience of the stop release of [p], due to the “lowest amplitude and spectrally most diffuse burst of any of the voiceless stops” has been suggested to account for the tendency of [p] to become voiced (Ohala, 1983, p. 195, based on Stevens, 1980). Although the distinction between [kʰ] and [tʰ] is less clear, [tʰ] shows a rise of the spectral energy towards the following vowel, while [kʰ] has its spectral moment in the mid-frequencies (Harrington, 2010a, pp. 104–106).7 As far as perception is concerned, there are experiments indicating that the voicing contrast in English is more salient for alveolar than for bilabial stops (Silbert, 2014).
If it is assumed that, due to aerodynamic factors, post-aspiration in /st/-sequences is perceptually more prominent than in /sp/- and /sk/-sequences, then it is possible that listeners first start to imitate the long VOT in /st/-words, and only later generalize it to the velar and the bilabial context.
A slightly longer VOT in hC-sequences than in intervocalic stops is likely to exist as synchronic variation also in speakers that have not taken part in the sound change, that is, speakers who produce mostly short VOT, but a long closure duration and pre-aspiration.
This assumption is supported by the slightly longer VOT in hC- than in C-words found for older EAS speakers (see Figure 2), and by Torreira’s (2007a) inter-dialectal comparison of /sp, st, sk/ where a similar tendency was observed for Puerto Rican /st, sk/ and Buenos Aires Spanish /sk/. How this longer VOT arises has to remain an issue for future studies. One possible explanation is that the long stop closure (in hC- as opposed to C-sequences) results in a higher intra-oral pressure, which in turn leads to a more prominent stop release (Ruch & Harrington, 2014). Another possible explanation is that the coordination between the glottal and the oral gesture is more variable in the offset of the consonant sequence, as was observed for German fricative + stop clusters (Hoole, 2006, p. 145).
Such a looser coupling at the offset of the /s/ + stop sequence would permit an earlier release of the oral stop, with the consequence of a greater voice onset time.
7 These data were calculated for Australian English speakers. All three stops were followed by /a/.
This brings up the question of sound change actuation: why should a listener imitate a slightly different production target instead of compensating and filtering the deviant token out? In the case of slightly post-aspirated stops, as they are likely to exist at the beginning of the sound change, a longer VOT does not necessarily have to be perceived as a deviant token. A perception experiment with listeners of Argentinian Spanish (Ruch & Harrington, 2014) provided evidence that a slightly longer VOT favours the perception of an ambiguous stimulus [ˈpahtʰa] as pasta rather than as pata in a forced-choice task. The authors synthesized two continua between pata [ˈpata] and pasta [ˈpahta] by manipulating the duration of pre-aspiration. One of the two continua was generated with a slightly longer VOT (29 ms instead of 12 ms). In this continuum, the listeners were more inclined to answer pasta than in the continuum with short VOT. The results suggest that post-aspiration may enhance the cues of the underlying phonological /s/.
At least two pre-conditions would have to be met for post-aspiration to be imitated and to spread within the speech community or, in other words, for the sound change to be actuated: First, the longer VOT needs to be acoustically distinct enough to be perceived as a different production target (see Baker et al., 2011). Second, post-aspiration should not be perceived as a deviant /sC/-token because then, post-aspiration would be filtered out and would not be imitated by the listener when the listener in turn becomes a speaker (Garrett & Johnson, 2013). If a listener-speaker then imitates the slightly shifted production target and, eventually, exaggerates the target, these subtle shifts accumulate and, as suggested by Garrett and Johnson (2013), can lead to a gradual sound change. Related to this scenario is the question of whether the sound change originated in the younger or in the older speaker group. The trend towards a longer VOT in hC- but not in C-sequences in older speakers (see Figure 3) could reflect a very early stage of the sound change or it could be the result of older speakers accommodating to younger speakers. In the latter case, however, and according to the model presented above, the VOT-difference between hC and C should be especially marked in /st/. The data of the current study do not allow a conclusive answer to this question. Longitudinal studies of individual Andalusian speakers (following Harrington et al., 2000, for British English) or studies on accommodation between younger and older speakers could shed light on this issue.
In both varieties and age groups, C- and hC-sequences very clearly differed in terms of VTT, duration of the oral stop closure, and the total duration of the voiceless interval: hC-sequences displayed a longer and mostly positive VTT, a longer stop closure, and a total voiceless interval almost double the length of the intervocalic stops /p, t, k/. As the comparison between the two varieties and age groups suggests, the total duration of the voiceless interval is furthermore very stable in apparent time in both varieties (see Figure 5).
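The relation between these duration measures can be sketched in a few lines of Python. This is only an illustration, not the semi-automatic procedure used in the study: the landmark names, the sign convention for VTT (positive when voicing ends before the closure begins, negative when voicing continues into the closure), and the two example tokens are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class StopLandmarks:
    """Hypothetical landmark times (in seconds) for one V(h)CV token."""
    voicing_offset: float  # end of voicing in the preceding vowel
    closure_onset: float   # start of the oral stop closure
    release: float         # stop burst
    voicing_onset: float   # start of voicing in the following vowel


def measures_ms(t: StopLandmarks) -> dict:
    """Return VTT, closure duration, VOT and voiceless interval in ms.

    Assumed sign convention: VTT > 0 means voicing stops before the
    closure (pre-aspiration); VTT < 0 means a partly voiced closure.
    """
    vtt = (t.closure_onset - t.voicing_offset) * 1000.0
    closure = (t.release - t.closure_onset) * 1000.0
    vot = (t.voicing_onset - t.release) * 1000.0
    # The voiceless interval runs from voicing offset to voicing onset,
    # so it equals the sum of the three components.
    return {"vtt": vtt, "closure": closure, "vot": vot,
            "voiceless": vtt + closure + vot}


# Invented example tokens (not measured data).
hc_token = StopLandmarks(0.100, 0.130, 0.210, 0.245)  # e.g. an /st/ token
c_token = StopLandmarks(0.115, 0.100, 0.160, 0.172)   # e.g. a /t/ token

hc = measures_ms(hc_token)
c = measures_ms(c_token)
```

With these invented landmarks, the hC token has a positive VTT and a voiceless interval of roughly 145 ms, while the C token has a negative VTT (voicing continues into the closure) and a voiceless interval of roughly 57 ms.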
In addition to the above-mentioned parameters, young WAS participants displayed significantly longer post-aspiration in hC- than in C-sequences, and young EAS speakers in /st/ than in /t/. The preceding vowel was slightly longer when followed by an intervocalic stop than when followed by an hC-sequence (see Figure 6). This finding runs counter to assumptions formulated for Puerto Rican (Figueroa, 2000; Resnick & Hammond, 1975) and Eastern Andalusian Spanish (Carlson, 2012), where vowel lengthening has been assumed to compensate for /s/-lenition.
These findings further demonstrate that multiple cues are used to distinguish between intervocalic voiceless stops and hC-sequences in production. Among the acoustic parameters that Andalusian speakers use to distinguish C- and hC-sequences in production, VOT is becoming more prominent, and VTT and closure duration are becoming less prominent. From a phonological point of view, the finding of post-aspirated stops in a variety of Spanish is striking because Spanish does not have (post-)aspirated stops, either phonologically or phonetically.8 Voiceless intervocalic stops are realized without aspiration, and, as the results of the present study and previous studies (e.g., O’Neill, 2010; Torreira & Ernestus, 2011) have shown, they often exhibit a partly voiced stop closure (see Figures 2 and 3).
The aim of the forced-choice perception experiment was to test whether post-aspiration is used as an acoustic cue for /st/ when distinguishing a minimal pair such as /pata/ vs. /pasta/. All but 5 of the 74 listeners were able to distinguish the minimal pair that differed only in VOT. This finding challenges Torreira’s (2012) assumption (based on acoustic data) that Western Andalusian post-aspirated stops are the result of coarticulatory overlap and are not intended by the speakers. Although almost all listeners were able to distinguish the minimal pair, there were differences between the two varieties and age groups in the perception of post-aspiration. Younger listeners needed a shorter VOT to perceive pasta, indicating that they were more sensitive to this cue. At the same time, Western Andalusian subjects displayed a steeper psychometric curve, pointing to a more categorical perception of the VOT-difference.
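The two perceptual effects described here, a lower category boundary and a steeper psychometric curve, can be illustrated with a logistic function. This is a minimal sketch, assuming that the probability of a pasta response is modelled as a logistic function of VOT. The function names and all parameter values are hypothetical and are not the fitted values from the experiment.

```python
import math


def p_pasta(vot_ms, midpoint_ms, slope_per_ms):
    """Logistic psychometric function: probability of a 'pasta' response."""
    return 1.0 / (1.0 + math.exp(-slope_per_ms * (vot_ms - midpoint_ms)))


def vot_for_probability(p, midpoint_ms, slope_per_ms):
    """Invert the logistic: VOT (ms) at which the 'pasta' probability is p."""
    return midpoint_ms + math.log(p / (1.0 - p)) / slope_per_ms


def transition_width(midpoint_ms, slope_per_ms):
    """Width (ms) of the VOT range between 25% and 75% 'pasta' responses."""
    return (vot_for_probability(0.75, midpoint_ms, slope_per_ms)
            - vot_for_probability(0.25, midpoint_ms, slope_per_ms))


# Hypothetical parameters, chosen only to illustrate the two effects:
# a lower midpoint means that a shorter VOT suffices to hear 'pasta';
# a larger slope means a more abrupt (more categorical) switch.
shallow = {"midpoint_ms": 40.0, "slope_per_ms": 0.10}
steep = {"midpoint_ms": 30.0, "slope_per_ms": 0.30}

width_shallow = transition_width(**shallow)
width_steep = transition_width(**steep)
```

Under these made-up parameters, the steeper curve moves from 25% to 75% pasta responses within a VOT range about a third as wide as that of the shallower curve, which is the sense in which a steeper psychometric curve points to more categorical perception.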
It has to be kept in mind that the participants might have based their judgements on other cues than post-aspiration. There is evidence that phoneme distinction is made based on several, often co-varying, acoustic cues (Best et al., 1981; Dorman et al., 1977; Raphael, 2005).
In the forced-choice perception experiment in this study, an increased VOT in the stimulus is associated with an increased total duration of the phonological /st/ sequence and, consequently, a greater C:V1 ratio (see Table 4). This could explain why not only younger and WAS listeners, but also older participants were able to distinguish between the two words of the minimal pair (although in a less consistent way). Further research is needed to understand cue-weighting in the present sound change in progress, and to investigate to what degree the same speaker-listener uses these cues also in speech production.
Despite these caveats, the results of the perception study mirror to a certain degree the results of the production study and point to a relationship between production and perception, which was confirmed by intra-subject comparison: speakers who produced a longer post-aspiration in the production task were also more sensitive to this acoustic parameter in the perception experiment. The categorical distinction of a minimal pair based on VOT indicates that speaker-listeners of Andalusian Spanish use post-aspiration as a cue to /st/, and that post-aspirated stops are likely to be phonologized to at least some degree. Parrell (2012, p. 45) assumes this greater or lesser degree of phonologization to be the reason why in his study some WAS speakers showed consistently long VOT values across different speech rates, instead of switches from pre- to post-aspiration as speech rate increased.