Keynote conferences
Marie Tahon is currently a full professor at Le Mans University and conducts her research at the Laboratoire d’Informatique de l’Université du Mans (LIUM). She received her engineering degree from École Centrale de Lyon as well as a Master’s degree in acoustics from École Centrale de Lyon and INSA Lyon in 2007. She obtained her Ph.D. in Computer Science from Paris-Sud University in 2012. She worked at LIMSI–CNRS (Orsay) on automatic emotion recognition in speech during both her Ph.D. and postdoctoral research. She also held a teaching and research position (ATER) at LMSSC / CNAM (Paris) in acoustics, and later conducted postdoctoral research at IRISA (Lannion) with the “Expression” research team. Her research focuses on expressive speech processing, particularly in the areas of speech synthesis, emotion recognition, and speaker identification. She has also conducted research in musical acoustics, including the automatic analysis of traditional singing and studies in instrumental acoustics and organology.
Title: Automatic processing of spontaneous speech: Applications to media data and telephone conversations.
Abstract: Spontaneous speech is what we use to communicate in everyday life. Both the linguistic content and the manner of expression (prosody) are modulated according to the context of the interaction. These modulations occur at very different levels of speech. First, floor taking must occur at moments that are relevant to the conversation; one may also interrupt the other person's speech. Second, speech production can be affected by an emotional state, which may lead to disfluencies and to phonetic or prosodic variations. One important issue is to characterize such spontaneous speech with data-driven automatic processing models.
This presentation will address the automatic processing of spontaneous speech in two application cases: the processing of media speech, and in particular interruptions; and the estimation of the degree of frustration in telephone calls from call centers. These two application frameworks provide a context for presenting a methodology for collecting and annotating subjective data in order to train neural networks. In the case of media speech, these models will be used to identify when an interruption occurs in a speech signal. In the case of telephone speech, models optimized to predict the degree of satisfaction will be discussed.