Automatic identification
Contact person
Francois PELLEGRINO
Scientific framework and objectives
Automatic Language Identification (ALI) emerged as a discipline about thirty years ago, but intensive research in the area dates back only to the early 1990s.
Most of the systems developed to date have, quite logically, borrowed methods from automatic speech or speaker recognition. Although these systems are fairly efficient and already provide solutions in some fields (particularly multilingual human-computer interfaces), they fail to address most of the linguistic and cognitive problems linked to the notion of inter-language distance (automatic typology, differences between languages and dialects, modelling of the cognitive processing of linguistic distance, etc.).
Our current research explores approaches based on the modelling of suprasegmental cues (rhythm and intonation) with a view to integrating them into modular automatic identification systems. The efficiency of these models depends on the extraction of relevant parameters (in accordance with existing rhythm typologies, or revealed by perceptual experiments), building adequate models, and validating the models using read and spontaneous speech corpora.
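As an illustration of the parameter-extraction step, here is a minimal sketch of two interval-based rhythm measures often used in rhythm typology: the proportion of vocalic duration (%V) and the standard deviation of consonantal interval durations (ΔC). It is not the method used in this project. The function name `rhythm_parameters`, the `(label, duration)` input format, and the assumption that the speech has already been segmented into vocalic ("V") and consonantal ("C") intervals are all hypothetical choices made for the example.

```python
from statistics import pstdev


def rhythm_parameters(intervals):
    """Compute simple rhythm measures from a segmented utterance.

    `intervals` is a list of (label, duration_in_seconds) pairs, where
    label is "V" (vocalic interval) or "C" (consonantal interval).
    This input format is an assumption made for illustration only.
    """
    vowels = [d for label, d in intervals if label == "V"]
    consonants = [d for label, d in intervals if label == "C"]
    total = sum(vowels) + sum(consonants)
    if total == 0:
        raise ValueError("no vocalic or consonantal intervals found")
    return {
        # %V: share of the utterance duration that is vocalic
        "percent_v": 100.0 * sum(vowels) / total,
        # Delta-C / Delta-V: population standard deviation of durations
        "delta_c": pstdev(consonants) if consonants else 0.0,
        "delta_v": pstdev(vowels) if vowels else 0.0,
    }


# Toy utterance: durations are invented, not measured data.
example = [("C", 0.1), ("V", 0.1), ("C", 0.3), ("V", 0.1)]
params = rhythm_parameters(example)
```

Parameter vectors of this kind could then be passed to a statistical model per language and evaluated on read and spontaneous speech, along the lines described above.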
Financial support
- Projet Emergence, Région Rhône-Alpes, 2000-2003
- ACI Jeune chercheur, MENRT, 2000-2003
- APN / ATIP, CNRS-SHS, 2000-2001
Publications
- Farinas, J., Rouas, J.L., Pellegrino, F., André-Obrecht, R., 2005, "Extraction automatique de paramètres prosodiques pour l’identification automatique des langues", Traitement du Signal, 22:2
- Ferragne, E., Pellegrino, F., 2004, "A comparative account of the suprasegmental and rhythmic features of British English dialects", proc. of MIDL, Paris, 2004
- Ferragne, E., Pellegrino, F., 2004, "Diphthongization as a cue for the automatic identification of British English dialects", proc. of 148th meeting of the Acoustical Society of America, San Diego, 2004
- Ferragne, E., Pellegrino, F., 2004, "Rhythm in read British English: interdialect variability", proc. of 8th International Conference on Spoken Language Processing, Jeju, Korea, 2004
- Ferragne, E., Pellegrino, F., 2006, "Les systèmes vocaliques des dialectes de l'anglais britannique", proc. of XXVIème Journées d’études sur la Parole, Dinard, 12-16 June 2006, pp. 411-414
- Ferragne, E., Pellegrino, F., 2007, "Automatic dialect identification: a study of British English", in Speaker Classification II/2, Müller, C., Schötz, S. (eds), Springer, pp. 243-257, Lecture Notes in Computer Science
- Hamdi, R., Barkat-Defradas, M., Ferragne, E., Pellegrino, F., 2004, "Speech Timing and Rhythmic structure in Arabic dialects", proc. of 8th International Conference on Spoken Language Processing, Jeju, Korea, 2004
- Ohala, J., Marsico, E., 2001, "Differentiating phonetic from phonological events in speech", Premières Journées d'Etudes sur l'Identification Automatique des Langues, Lyon, France, 20 January 2001
- Pellegrino, F., 1998, "Une approche phonétique en identification automatique des langues: la modélisation acoustique des systèmes vocaliques", PhD thesis, Computer Science (speciality: automatic speech processing), defended 22 December 1998, Université Paul Sabatier, Toulouse
- Pellegrino, F. (ed), 2001, "De la caractérisation à l'identification des langues. Actes sélectionnés de la 1ère journée d'étude sur l'identification automatique des langues (19/01/1999, Lyon)", Lyon, online edition
- Rouas, J.L., Farinas, J., Pellegrino, F., André-Obrecht, R., 2005, "Rhythmic Unit Extraction and Modelling for Automatic Language Identification", Speech Communication, 47:4, pp. 436-456
- Rouas, J.L., Barkat-Defradas, M., Pellegrino, F., Hamdi, R., 2006, "Identification automatique des parlers arabes par la prosodie", proc. of XXVIème Journées d’études sur la Parole, Dinard, 12-16 June 2006, pp. 193-196