The phonetic basis of speech preparation

Silent phases before speech initiation are often seen as the time interval during which the
utterance is planned. In most studies on pauses, the focus is on cognitive and linguistic
factors such as word frequencies or utterance complexity. The aim of our study is to
investigate how phonetic factors affect these silent phases. In particular, we are interested
in the physiological aspects of speech initiation, such as breathing, articulatory posturing and
coordination of breathing and oral gestures. Pilot studies from three areas will be presented
here: (1) the effect of breathing on reaction time, (2) the coordination of respiratory activity
and breathing during interspeech pauses and (3) the effect of answer type on gap duration in

Bangime as a language isolate

West Africa is potentially the birthplace of modern humans, yet the region north of the Bantu-speaking area is among the least studied in the world. Language is a central part of humanity’s present and past: every modern human being communicates through language. Unrecorded prehistoric languages cannot be studied in the same way as present-day speech, but we can still gain insights into our ancient ancestors’ languages by looking at the ways in which people communicate with each other today. Historical linguists search present-day speakers’ sound-meaning patterns to group languages into families, and then reconstruct what each family’s proto-language would have sounded like. A language isolate, one with no known living relatives, presents one of the biggest obstacles for historical linguistic reconstruction. A language isolate spoken by a population that is also a genetic isolate represents a remnant of lost diversity and a key to unlocking the mysteries of our species’ early migration patterns. Bangime is one of Africa’s four confirmed language isolates. Its speakers, the Bangande, are equally unique genetically. The affiliations of the languages and peoples surrounding the Bangande, the Dogon, Mande, and Songhai groups, are among the most debated in Africa. The INSIGHT2020 team will amass existing data and gather new, large-scale data from under-studied languages and compare them with innovative genetics research to expand the search for the earlier pathways of West African populations. We will use ground-breaking computer-assisted technologies to test the hypothesis that the Bangande are the only population to have survived an as yet undiscovered cataclysmic event that predated the Bantu Expansion. Findings will be made available to researchers in an accessible, multimodal, online repository. The Bangande community will also be informed in an ethically sensitive and culturally appropriate manner. Our interdisciplinary methodology can serve as a model for other areas with similar questions.

Articulatory variability and coordination: Speech errors from a dynamical perspective

Speaking accurately is one of the factors that lead to effective communication between people. The seemingly invariant sequence of planned and produced speech units frequently results in an extremely variable output of articulatory movements, sometimes to such an extent that the speaker produces what the listener perceives as a speech error. Interestingly, speakers themselves frequently do not notice the error, suggesting that immediate auditory feedback is not the most important channel for monitoring speech production. For decades, the production and correction of speech errors have been a valuable source of information for linguists modeling speech and language production processes. In general, errors have been interpreted and modeled as originating at the phonological level, as a result of competing phonemes or features. More recent studies suggest that errors are more gradual and, in certain cases, originate at the articulatory level. I will present a series of studies, conducted at the Oral Dynamics lab in Toronto, that examine errors from an articulatory point of view and explore whether speech errors are influenced by phonetic context and thus originate at a lower phonetic or articulatory level. In addition, I will present data on how speakers control for these articulatory speech errors.

Proposing and testing a new nasality index measured using a synchronous multi-sensory system

We have recently developed a non-invasive multi-sensor acquisition set, the hyper-helmet, for recording rare songs with a view to safeguarding intangible cultural heritage. In this presentation, we take advantage of this articulatory sensing system to study and test a new nasality index. The helmet’s acoustic microphone and nasal piezoelectric accelerometer are used to calculate an oral/nasal RMS ratio. An electroglottograph (EGG) is used to estimate the voicing selector parameter. In addition, a non-intrusive tongue imaging sensor (an ultrasonic probe) and a lip movement camera serve as backups for the qualitative interpretation of articulation and nasality. Software has been developed for the synchronous acquisition of all sensors, and it has been used to record an English corpus produced by a middle-aged Canadian man who is a native speaker of English. Multiple tests have been carried out to evaluate several theories of nasality. Some results are shown in this presentation.
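As a rough illustration of the oral/nasal RMS ratio mentioned above, the following Python sketch computes the ratio frame by frame from two time-aligned signals. It is a minimal sketch under stated assumptions, not the hyper-helmet software: the function names, the fixed frame length, the optional per-frame voicing mask (standing in for an EGG-based voicing decision) and the small `eps` guard are all hypothetical choices made for this example.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a non-empty sequence of samples."""
    if not samples:
        raise ValueError("cannot compute RMS of an empty frame")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def oral_nasal_rms_ratio(oral, nasal, frame_len, voiced=None, eps=1e-12):
    """Frame-wise oral/nasal RMS ratio (illustrative only).

    oral, nasal : equal-length, time-aligned sample sequences
                  (assumed: microphone and nasal accelerometer signals).
    frame_len   : number of samples per analysis frame (assumed fixed).
    voiced      : optional per-frame booleans, a hypothetical stand-in for
                  an EGG-derived voicing decision; unvoiced frames give None.
    eps         : small constant guarding against division by zero.
    """
    if len(oral) != len(nasal):
        raise ValueError("oral and nasal signals must have the same length")
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")

    ratios = []
    n_frames = len(oral) // frame_len  # trailing partial frame is ignored
    for i in range(n_frames):
        if voiced is not None and not voiced[i]:
            ratios.append(None)
            continue
        lo, hi = i * frame_len, (i + 1) * frame_len
        ratios.append(rms(oral[lo:hi]) / (rms(nasal[lo:hi]) + eps))
    return ratios
```

Under these assumptions, a lower ratio in a voiced frame would indicate relatively more energy at the nasal sensor; how such a ratio is thresholded or normalized into an index is not specified in the abstract and is left open here.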

Aerodynamic, articulatory and acoustic realization of French /R/

French uvular /ʁ/ is usually considered problematic because of its variability, especially in word-initial and word-final positions.
In this presentation, physiological and aerodynamic analyses allow us to determine its major axes of variation and to validate the use of several acoustic measurements.
An acoustic study of large corpora of continuous speech is then presented, in order to test the variability of French /ʁ/ in light of these results. Finally, a parallel with perception is drawn.

Issues in the morphology and phonology of Arabic

How different are the phonology and morphology of nontemplatic (concatenative) word formation from those of templatic (nonconcatenative) word formation? We will focus on the Arabic verbal system, the prototypical example of templatic morphology, with the aim of deriving some of its distinctive traits from basic principles. The key novel aspect of the approach is its focus on paradigms. The main result is that the paradigm, coupled with general phonotactic constraints, sets limits on the theoretically possible diversity of stems within that paradigm. The core analysis will be on Classical Arabic. However, we will bring in data from dialects which justify the approach and/or permit further theory development.

Linking perception and production in a cue-distractor paradigm (Adamantios Gafos, joint work with Kevin Roon and Chris Kirov)

When speaking words, a person must retrieve the phonological representation of a target lexical item by assembling a set of parameter values that specify the required vocal tract action.
We present a computationally explicit model of the process by which phonological production parameters are set. The model focuses on a specific task that requires the concurrent use of both speech perception and production, which in turn allows us to shed light on the nature of the representations involved in the perception-production link. Specifically, the proposed model formalizes how ongoing response planning is affected by perception and accounts for a range of results reported across previous studies. The key unit of the model is the dynamic field, a distribution of activation over the entire range of values associated with each representational parameter. Parameter values are set when a stable distribution of activation is attained over the entire field, stable in the sense that it persists even after the response cue in the modeled experiments has been removed. This and other properties of representations which have been taken as axiomatic in previous work are derived from the dynamics of the proposed model.

Suggested readings:
- Roon, K., & Gafos, A. (in press). Perceiving while producing. Journal of Memory and Language.
- Erlhagen, W., & Schöner, G. (2002). Dynamic field theory of movement preparation. Psychological Review 109(3), 545–572.

Kinematics and dynamics of gesture (joint work with Tanner Sorensen)

We propose a theory of gestural timing. It is a theory of how a gesture determines change in vocal tract state (e.g., change in constriction degree) based on the vocal tract state. A core postulate of the theory is that no executive time-keeper determines change in vocal tract state. That is, it is a theory of intrinsic timing. We compare the theory against others in which an executive time-keeper determines change in vocal tract state. Theories which employ an executive time-keeper have been proposed to correct for disparities between theoretically predicted and experimentally observed velocity profiles. Such theories of extrinsic timing make the gesture a nonautonomous dynamical system. For a nonautonomous dynamical system, the change in state depends not just on the state, but also on time. We show that this nonautonomous extension makes surprisingly weak kinematic predictions, both qualitatively and quantitatively. We propose instead that the gesture is a theoretically simpler nonlinear autonomous dynamical system. For the proposed nonlinear autonomous dynamical system, the change in state depends nonlinearly on the state (and does not depend on time). This new theory gives formal expression to the notion of intrinsic timing. Furthermore, it predicts specific, experimentally testable relations among kinematic variables, which we verify in the datasets we have examined.
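The contrast between extrinsic and intrinsic timing described above can be written schematically as follows. This is only a notational sketch: the state vector \mathbf{x} (for instance, constriction degree and its velocity) and the function f are generic placeholders assumed here for illustration, and the abstract does not give the specific form of f used in the proposed theory.

```latex
\begin{align}
  \dot{\mathbf{x}} &= f(\mathbf{x}, t)
    && \text{nonautonomous system (extrinsic timing): change in state depends on state and on time} \\
  \dot{\mathbf{x}} &= f(\mathbf{x})
    && \text{autonomous system (intrinsic timing): change in state depends on state only}
\end{align}
```

In the proposed theory, f in the second equation is nonlinear in \mathbf{x}; the explicit time argument t of the first equation is what plays the role of the executive time-keeper.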

Suggested readings:
- Fowler, C. A. (1980). Coarticulation and theories of extrinsic timing. Journal of Phonetics 8, 113–33.
- Mottet, D., & Bootsma, R. J. (1999). The dynamics of goal-directed rhythmical aiming. Biological Cybernetics 80(4), 235–245.

Phonetic nomograms for abstract phonological units (joint work with Jason Shaw, Philip Hoole, Chakir Zeroual and Simon Charlow)

We pursue an analysis of the relation between qualitative syllable parses and their quantitative phonetic consequences. To do this, we express the statistics of a symbolic organization corresponding to a syllable parse in terms of continuous phonetic parameters: consonantal plateau durations, vowel durations, and their variances. These parameters can be estimated from continuous phonetic data. This enables analysis of the link between symbolic phonological form and the continuous phonetics in which this form is manifest. We illustrate the predictions of different syllabic organizations and derive a number of results previously observed in experiments and simulations. Specifically, we derive the canonical phonetic manifestations of different syllabic organizations, but also the result that, under certain conditions which we can make precise, the phonetic indices of one organization can change to a range of values characteristic of the other, phonologically distinct organization. Finally, we explore the behavior of phonetic indices for syllabic organization by progressively increasing the size of the lexical sample and concomitantly diversifying the phonetic context over which these indices are taken.

Suggested readings:
- Shaw, J., & Gafos, A. (2015). Stochastic time models of syllable structure. PLoS ONE 10(5). DOI: 10.1371/journal.pone.0124714.
- Shaw, J., Gafos, A., Hoole, P., & Zeroual, C. (2011). Dynamic invariance in the phonetic expression of syllable structure. Phonology 28, 455–490.

A phonetic and phonological overview of the languages of southwestern Colombia

The talk is divided into three parts. The first takes a look at the linguistic diversity of Latin America and Colombia. We then give an overview of the phonological systems of the languages of the southwest of the country, in particular the Nasa and Nam Trik languages. Finally, we very briefly present a piece of software developed to support the learning of the Nam Trik language in Totoro.