Tue. 16/11/2021 DiLiS research seminar
"Methodological considerations in corpus-based morphological typology"
ISH, Elise Rivet room (+ Zoom, details below)
Talk by:
  • Matthew Stave (DDL)

As part of DiLiS

A fundamental question in linguistic typology is the validity of morphological profiles such as agglutinating, inflecting/fusional and isolating. These labels presuppose that 1) a variety of morphological properties are strongly correlated, including rates of allomorphy, suppletion, ablaut processes, and the size of morphological paradigms; and 2) these properties are sufficiently uniform across the various grammatical subsystems of individual languages to constitute meaningful characterizations. Ultimately, these categories rest on the basic assumption that similarities and differences between languages are not entirely determined by genealogical and areal proximity, but that unrelated languages tend to converge on certain bundles of linguistic features for language-internal reasons.

In this study, we assess the reliability of two of these categories, agglutinating and inflecting, by comparing results from a bottom-up, usage-based analysis with existing, grammar-based categorizations. We use datasets from 19 languages in the DoReCo database (Paschen et al. 2020). Our sample represents 17 language families and all six macro-areas. The datasets are transcriptions of spontaneous spoken language, morphologically annotated by language experts, each containing roughly 10,000 words or more. The annotations in these datasets permit a number of novel, nuanced analyses, such as examining differences between nominal and verbal morphological behaviors.

We lay out methods for measuring five morphological categories for these languages: separation/cumulation of meanings in morphemes, in/variance of morphological form, locality/extendedness of morphemes in words, paradigm size, and degree of synthesis. We investigate whether these five factors covary across languages, and how the indices are distributed over types and word classes within each language. To the extent that we find substantial language-internal variation, we also examine whether the best predictor for this variation is part of speech (in particular contrasting nouns and verbs) or the frequency of roots and word forms. Particular emphasis is given to the methodological considerations involved in performing quantitative cross-linguistic analysis on morphologically annotated texts.

Join the Zoom meeting
Meeting ID: 876 9464 1039
Passcode: XEiUK6


