CNRS - IRCAM

Department Member, Sound Analysis and Synthesis

Thesis Title: MeLos: Analysis and Modelling of Speech Prosody and Speaking Style

About

R&D in Speech and Music Technologies

• My main interests cover speech and music technologies, with a focus on Voice Conversion and Speech Synthesis. My main research topics include statistical modeling of speech & music signals, speech & music processing, linguistics, and musicology.

• I hold a PhD in computer science on the modeling of speech prosody and speaking style for speech synthesis (2011), and a master's degree in musicology on a comparative study of contemporary music and literature (2007).


• My research initially started with F0 estimation of monophonic musical instruments @ CNMAT, University of California, Berkeley (2005).



• Then, my research moved to speech processing @ IRCAM for Voice Conversion and Speech Synthesis, with the modeling of F0 and spectral envelope correlation for voice conversion (2006). My PhD focused on the modeling of symbolic/acoustic characteristics of speech prosody and speaking style for Speech Synthesis (2011), and includes discrete/continuous HMMs, segmental HMMs, information fusion, transcription and stylization of speech prosody, linguistics, joint short/long term modeling, and speaker-independent modeling.


• I am also active in Arts & Technologies production. This includes participation in the PHASE project (Haptic Platform of Sound Application for Musical Education for Imaginative Play - Centre Pompidou, 2004); co-founder and scientific committee member of EMUS (International Conferences on Speech and Music, 2008) and AGORA music festival ("La Voix, L'icône" - IRCAM, 2008); R&D for the production of "HyperMusic: Prologue" (supervised/unsupervised speech segmentation - Hector Parra, IRCAM, 2009), and "Luna Park" (speech synthesis, real-time control of speech prosody - Georges Aperghis, IRCAM, 2011).


• I am currently a researcher and manager of the VOICE4GAMES project @ IRCAM on speech level normalization, unsupervised speech segmentation, speech mode classification, voice casting, and speaker similarity for Video Games and Audio Gaming.

Specialties
• Speech Synthesis (Unit Selection, HMM-based)
• Speech Recognition (Speech Segmentation, Voice Type, Speaker Similarity)
• Voice Conversion
• Statistical Modeling of Speech & Music Signals (GMM, HMM, UBM, SVM)
• Spoken/Natural Language Processing
• Speech Prosody & Speaking Style
• Linguistics
• Real-Time Signal Processing (Max/MSP)
• Musicology

Contact Information

Homepage:

http://recherche.ircam.fr/equipes/analyse-synthese/obin

Address:

IRCAM
1, place Stravinsky
75004 Paris - FRANCE

 
IEEE Transactions on Audio, Speech, and Language Processing
IEEE Transactions on Pattern Analysis and Machine Intelligence
Speech Communication
