Adaptable, Community Controlled Language Technologies
Lori Levin
Language Technologies Institute, Carnegie Mellon University
Pictures by Rodolfo Vega
Pictures by Laura Tomokiyo

The double life of an endangered language researcher
Researchers urgently need to try new things. [endangered [language researcher]]
Speakers of endangered languages urgently need tools that work. [[endangered language] researcher]
Picture by Laura Tomokiyo

Outline
The needs of language communities
The AVENUE project’s experience with:
  Iñupiaq (Alaska)
  Mapudungun (Chile)

Suggested Research Program
Beyond bootstrapping from low resources
Genre and register adaptation
Translation between related languages and dialects
Non-synchronous grammars in order to handle extreme agglutination and polysynthesis
Technologies based on mobile phones
New techniques: Learning in the wild (in the context of use), active learning, self training, etc.

Endangered Languages
Around 6000 human languages are currently spoken
90% are not expected to survive the next century
In the US, about 200 indigenous languages are still spoken
Only a few will survive the next 30 years (Noori p.c.)

Importance of Endangered Languages
Cultural loss
  Stories, songs, ethnic identity
Scientific loss
  The study of human language will suffer from losing 90% of the samples
Another kind of scientific loss
  Names of places, geological formations, plants, animals, etc.

Three Language Communities
North Slope Iñupiat (Alaska)
  Edna MacLean (linguist, lexicographer, native speaker)
  Larry Kaplan (linguist, Alaska Native Language Center, University of Alaska, Fairbanks)
  Aric Bills (linguistics student, UAF)
Mapuche (Chile, Argentina)
  Rosendo Huisca (language expert, lexicographer, native speaker)
  Eliseo Cañulef (bilingual education and language maintenance)
Anishinaabe (Ojibwe, Potawatomi, Odawa) (Great Lakes)
  Margaret Noori (linguist, language revitalization)

Other sources of information
Delyth Prys
  Welsh, native speaker
  Language technologies developer, terminologist, language revitalization
Jonathan Amith
  Nahuatl (Mexico), anthropologist, linguist
  Language technologies developer
Per Langgaard
  Kalaallisut (Greenland), Greenlandic Government
  Language technologies developer

North Slope Iñupiat
Language: North Slope Iñupiaq
About 5000 people
Almost all native speakers are over 40 years old
Some bilingual education and second language education
Status: endangered
Related to languages whose status is better: Inuktitut (Canada), Kalaallisut (Greenland)
Related to languages that are also endangered: Kobuk Pass Inupiaq.

Properties of Iñupiaq (From notes by Lawrence Kaplan)
vowels: a i u aa ii uu ai ia au ua iu ui
consonants:
  p t ch k q ‘
  (f) ł ł s sr kh (x) qh (X) h
  v l ļ z y g (ɣ) ġ (ʁ)
  m n ñ ŋ

Properties of Iñupiaq
Word structure
Stem (noun or verb) – postbase/s (optional) – inflection – enclitic (optional)
  Niġi – ñiaq – tu(q) – guuq.
  Eat – will – s/he – it is said
  ‘It is said that s/he will eat.’

Properties of Iñupiaq
Dual Number
  Niġi-ruŋa. ‘I am eating.’ or ‘I ate.’ (singular)
  Niġi-ruguk. ‘We2 are eating.’ or ‘We2 ate.’ (dual)
  Niġi-rugut. ‘We are eating.’ or ‘We ate.’ (plural)

Properties of Iñupiaq
Ergative Case (transitive sentences)
  Aŋuti-m tuttu niġi-gaa.
  Man-Rel. caribou-Abs. eat-trans. 3s-3s
  ‘The man ate/is eating caribou.’

  Tuttu-m aŋun niġi-gaa.
  caribou-Rel. man-Abs. eat-trans. 3s-3s
  ‘The caribou ate the man.’

Properties of Iñupiaq
Anti-passive (indefinite object)
  Tuttu-mik tautuk-tuŋa.
  ‘I ate caribou.’ or ‘I am eating caribou.’

  Aŋuti-m tuttu niġi-gaa.
  Man-Rel. caribou-Abs. eat-trans. 3s-3s
  ‘The man ate/is eating caribou.’

Properties of Iñupiaq
Long, multi-morphemic words
  Tauqsiġñiaġviŋmuŋniaŋitchugut.
  ‘We won’t go to the store.’
Kalaallisut (Greenlandic, Per Langgaard, p.c.)
  Pittsburghimukarthussaqarnavianngilaq
  Pittsburgh+PROP+Trim+SG+kar+tuq+ssaq+qar+naviar+nngit+v+IND+3SG
  ‘It is not likely that anyone is going to Pittsburgh.’

Type token curves
[Figure: Type-Token Curves for English, Arabic, Hocąk, Inupiaq, and Finnish. Axis labels: Tokens (0 to 10,000), Types (0 to 6,000).]

Type token ratio curves
[Figure: Type-Token Ratio Curves for English, Arabic, Hocąk, and Inupiaq. Axis labels: Tokens, Types; ratio scale 0 to 1.2.]

Iñupiaq Orthography and