Nowadays, machine learning is everywhere, promising a new golden age in which « artificial intelligence » will effortlessly solve all kinds of problems, such as driving autonomously, developing new vaccines, and talking and behaving just like ordinary human beings. Beyond this naive and oversimplified vision stand some actual facts: the unprecedented collection of data, combined with advances in the mathematical theory of learning and ever-increasing computational resources, is radically changing methodological approaches in various research areas. Linguistics is one of them: with new speech, text, and other types of communication from many languages recorded every day, it is now possible to study languages empirically from a data-centric perspective.
However, data is not sufficient by itself: one also needs to design theoretical and practical models of speech and language. This remains a challenging endeavor that is far from over. In this talk, I will explore the evolution of speech signal modeling from an engineer’s perspective: why a model was necessary in the first place, which aspects of the data were explored, and how the field changed with the advent of deep learning techniques. Then, I will explore how this « speech engineering legacy » can be exploited to study languages. In particular, I will focus on the problem of modeling continuums: how can we model phonetic and linguistic continuums, and can we use these tools to establish a notion of « distance » between sounds and between languages?
Samuel Akinbo (University of Minnesota)
William Havard (DEC-ENS)
Frédéric Isel (MoDyCo)
Malin Svensson Lundmark (Lund University & University of Southern Denmark)