
Identifying dialectal features of the Udmurt language with the help of an internet corpus


Slide 2. Udmurt language
Uralic family, Permic branch
Udmurtia and neighboring regions
340,000 speakers
Standard literary language; 4 main dialectal areas


Slide 3. Corpus
Collection of texts
Linguistic annotation:
metadata
lemmatization, morphological annotation
any other kind of annotation (e.g. borrowings)
Search engine
corpus ≠ library
corpus ≠ Yandex/Google
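
To make the difference between a corpus and a plain text collection concrete, here is a minimal sketch of annotated tokens and a search over the annotation. The class names, the tag names (loc, fut, 1pl, rus) and the search function are illustrative assumptions, not the actual format or interface of the Udmurt corpus; the example sentence is the one glossed later in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    wordform: str
    lemma: str
    gram: frozenset = frozenset()   # morphological tags (hypothetical names)
    extra: frozenset = frozenset()  # any other annotation, e.g. {"rus"} for a borrowing


@dataclass
class Sentence:
    tokens: list
    meta: dict = field(default_factory=dict)  # author-related metadata, if any


def search(sentences, lemma=None, gram=(), extra=()):
    """Return (sentence, token) pairs whose token satisfies every given constraint."""
    hits = []
    for sent in sentences:
        for tok in sent.tokens:
            if lemma is not None and tok.lemma != lemma:
                continue
            if not set(gram) <= tok.gram:
                continue
            if not set(extra) <= tok.extra:
                continue
            hits.append((sent, tok))
    return hits


# "Трос интыын снимать каром." with hand-made annotation for illustration.
example = Sentence(tokens=[
    Token("Трос", "трос"),
    Token("интыын", "инты", frozenset({"loc"})),
    Token("снимать", "снимать", extra=frozenset({"rus"})),
    Token("каром", "карыны", frozenset({"fut", "1pl"})),
])

future_forms = search([example], lemma="карыны", gram={"fut"})
borrowings = search([example], extra={"rus"})
```

A full-text search engine would only find the string каром; the annotated search finds every future form of карыны regardless of its surface shape.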


Slide 4. Udmurt vk-corpus
Posts and comments of Udmurt-language Vkontakte groups and users
2.5 million tokens in Udmurt (400 groups, 2000 users)
Sentence-level language recognition (rus/udm), morphological annotation
Author-related metadata: sex, birth year, birth place, current location
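
The slides do not say how sentence-level language recognition is done. The following is only a toy sketch of one possible heuristic: count Udmurt-looking and Russian-looking words and pick the majority. The two word lists are tiny hand-picked samples taken from the example posts in this deck, and the function name and the "unknown" label are assumptions.

```python
import re

# Letters used in Udmurt but not in Russian orthography.
UDM_LETTERS = set("\u04f5\u04df\u04dd\u04e5\u04e7")  # ӵ ӟ ӝ ӥ ӧ
# Tiny illustrative word lists (assumptions, far from complete).
UDM_WORDS = {"мон", "тон", "сое", "али", "тау", "та", "понна", "сярысь"}
RUS_WORDS = {"не", "надо", "привет", "интерес", "лучше", "точно"}


def guess_language(sentence):
    """Return 'udm', 'rus' or 'unknown' for a single sentence."""
    words = re.findall(r"\w+", sentence.lower())
    udm = sum(1 for w in words if w in UDM_WORDS or UDM_LETTERS & set(w))
    rus = sum(1 for w in words if w in RUS_WORDS)
    if udm == rus:
        return "unknown"
    return "udm" if udm > rus else "rus"
```

Sentences with code switching can land in the middle, which is one reason a real system would need much larger lexicons or a trained classifier.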

Slide 5. Udmurt vk-corpus
Мон бы пукысал али и кылзӥськысал Лариса Васильевнаез, сое можно кылзыны вечность. Интерес не пропадёт. Тау та смена понна котькудӥзлы! Алиночка Владимировна, тон прекрасной адями☺
привет ? не надо грустить, Алёна.
А вот лучше малпаськы сессиед сярысь?
Алексей, ? точно

Slide 6. Udmurt vk-corpus
Мон бы пукысал али и кылзӥськысал Лариса Васильевнаез, сое можно кылзыны вечность. Интерес не пропадёт. Тау та смена понна котькудӥзлы! Алиночка Владимировна, тон прекрасной адями☺
привет ? не надо грустить, Алёна.
А вот лучше малпаськы сессиед сярысь?
Алексей, ? точно

sentences in Russian
borrowed words / code switching within a sentence

Slide 7. Udmurt vk-corpus
Web interface: search

Slide 8. Udmurt vk-corpus
Web interface: search results

Slide 9. Dialectology
Phonetics
Lexicon
Morphology
Syntax
traditional dialectology

Slide 10. vk-corpus: phonetics
People try not to deviate from the standard variety; orthography cannot reflect all dialectal features; the diacritics (ӵ, ӟ, ӝ, ӥ, ӧ) are often omitted

* a little too hard
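
Because the diacritics are often omitted, a search over such texts may need to treat ӵ/ч, ӟ/з, ӝ/ж, ӥ/и and ӧ/о as equivalent. The sketch below shows one way to do that; the function names are assumptions, and nothing here is claimed to be what the corpus itself does. An explicit table is used rather than generic accent stripping, because generic stripping would also turn й into и and ё into е.

```python
import unicodedata

# Udmurt letters with diaeresis mapped to their plain Cyrillic counterparts.
FOLD = str.maketrans({
    "\u04f5": "ч", "\u04f4": "Ч",  # ӵ Ӵ
    "\u04df": "з", "\u04de": "З",  # ӟ Ӟ
    "\u04dd": "ж", "\u04dc": "Ж",  # ӝ Ӝ
    "\u04e5": "и", "\u04e4": "И",  # ӥ Ӥ
    "\u04e7": "о", "\u04e6": "О",  # ӧ Ӧ
})


def fold_diacritics(text):
    """Replace the five Udmurt diacritic letters with their base letters."""
    # NFC first, so that base letter + combining diaeresis is handled too.
    return unicodedata.normalize("NFC", text).translate(FOLD)


def same_word(a, b):
    """Compare two word forms ignoring case and the Udmurt diacritics."""
    return fold_diacritics(a).lower() == fold_diacritics(b).lower()
```

Folding loses real distinctions, so it is better suited to matching queries than to storing the texts.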


Slide 11. vk-corpus: lexicon
Many people try to use the standard vocabulary
Nevertheless, dialectal words show up quite often
I have too few tokens for each of Udmurtia’s 25 districts => only high-frequency vocabulary can be studied
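
The frequency constraint can be illustrated with a small sketch that computes the share of each lexical variant per district and skips districts with too few hits. The input format (district, wordform), the threshold and the district labels in the usage are assumptions made for illustration; the counts are invented and are not corpus results.

```python
from collections import Counter, defaultdict


def variant_shares(hits, variants, min_hits=5):
    """Share of each variant per district; districts below min_hits are skipped."""
    counts = defaultdict(Counter)
    for district, word in hits:
        if word in variants:
            counts[district][word] += 1
    shares = {}
    for district, c in counts.items():
        total = sum(c.values())
        if total < min_hits:
            continue  # too little data to say anything about this district
        shares[district] = {v: c[v] / total for v in variants}
    return shares


# Invented counts for three hypothetical districts A, B and C.
toy_hits = ([("A", "бон")] * 6 + [("A", "бен")] * 2
            + [("B", "бен")] * 5 + [("C", "бон")] * 2)
toy_shares = variant_shares(toy_hits, ("бон", "бен"))
```

With a low-frequency word most districts would fall under the threshold, which is why only high-frequency vocabulary can be mapped this way.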

Slide 12. Particle бон / бен

Slide 13. ‘Forest’ (Maksimov 2007)

Slide 14. ‘Plantain’ (Rus. подорожник) (Maksimov 2013)

Slide 15. Borrowed Russian verbs
The standard way of borrowing a Russian verb is to use the construction Vinf + [карыны]:

Трос инты-ын снимать кар-о-м.
many place-loc shoot.rus do-fut-1pl
‘We’re going to shoot [the movie] in many places.’
‘Мы будем снимать во многих местах.’

Slide 16. Borrowed Russian verbs
There is a detransitivising suffix -ськ-/-ск- in Udmurt, which semantically is very close to the Russian suffix -ся:
passive
impersonal modal passive
generic subject/object
autocausative
reflexive
reciprocal

Slide 17. Borrowed Russian verbs
If a reflexive Russian verb is borrowed:
either the light verb карыны has the -ськ- suffix:
Кызьы дозвониться кар-иськ-оно тӥ дор-ы.
how reach.rus do-detr-deb you.pl near-ill
‘How can I reach you guys [by phone]?’
or it does not:
со-ос ю-о, кыск-о, материться кар-о.
s/he-pl drink-prs.3pl smoke-prs.3pl swear.rus do-prs.3pl
‘They drink, smoke, swear.’
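
The two strategies can be told apart on raw text with a rough pattern: a Russian reflexive infinitive followed by a word starting with кар-, classified by whether that word starts with кариськ-. This is a sketch under assumptions: the suffix list (-ться, -тись, -чься), the labels "detr" and "plain" and the function name are illustrative, and a morphologically annotated search would be more reliable, since not every word in кар- is a form of карыны.

```python
import re

# Russian reflexive infinitive immediately followed by a word in кар-.
PATTERN = re.compile(r"\b(\w+(?:ться|тись|чься))\s+(кар\w*)")


def find_reflexive_borrowings(text):
    """Return (infinitive, light_verb_form, strategy) triples found in text."""
    results = []
    for infinitive, light in PATTERN.findall(text.lower()):
        # "detr": the light verb carries the detransitivising suffix.
        strategy = "detr" if light.startswith("кариськ") else "plain"
        results.append((infinitive, light, strategy))
    return results
```

Non-reflexive borrowings such as снимать каром are deliberately not matched, because only reflexive verbs show the variation.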


Slide 18. Borrowed Russian verbs
Possible hypotheses regarding the distribution of the two variants:
lexical (depends on the verb)
depends on the meaning of the -ся suffix
depends on the aspect of the Russian verb
depends on the form of карыны
random

Slide 19. Borrowed Russian verbs
Possible hypotheses regarding the distribution of the two variants:
lexical: the same verbs often occur in both constructions
depends on the meaning of -ся: no correlation
depends on the aspect: no correlation; incidentally, the aspect is not always chosen according to Russian rules
depends on the form of карыны: no correlation
random: no, because people tend to consistently use only one of the strategies
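
The claim about consistency can be checked with a simple per-author measure: among authors with at least two relevant examples, what share uses only one strategy? The sketch below assumes a list of (author, strategy) pairs; the author ids in the usage are invented, and the threshold of two uses is an arbitrary illustrative choice.

```python
from collections import defaultdict


def consistency(uses, min_uses=2):
    """Share of authors (with at least min_uses examples) who use a single strategy."""
    by_author = defaultdict(list)
    for author, strategy in uses:
        by_author[author].append(strategy)
    eligible = [s for s in by_author.values() if len(s) >= min_uses]
    if not eligible:
        return None  # nothing to measure
    consistent = sum(1 for s in eligible if len(set(s)) == 1)
    return consistent / len(eligible)


# Invented toy data: u1 and u2 are consistent, u3 mixes, u4 has too few uses.
toy_uses = [
    ("u1", "detr"), ("u1", "detr"),
    ("u2", "plain"), ("u2", "plain"), ("u2", "plain"),
    ("u3", "detr"), ("u3", "plain"),
    ("u4", "detr"),
]
toy_consistency = consistency(toy_uses)
```

If the choice were random, this share would be expected to drop quickly as the number of examples per author grows.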

Slide 20. Russian verbs: кариськыны / карыны (vk + blogs)

Slide 21. Borrowed Russian verbs
The choice is clearly geographically conditioned
The detransitive-less strategy prevails on the territory of the neighboring Tatarstan and Bashkortostan regions
The light verb construction for verbal borrowings is exactly the same in Tatar and Bashkir (therefore, contact influence may be the driving force behind this distribution)
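
The slides do not report a statistical test. One way to quantify a geographic effect, offered here only as an assumption about how it could be done, is a chi-square statistic on a 2x2 table of region (for example Udmurtia vs. Tatarstan and Bashkortostan) by strategy (with or without -ськ-). The counts in the usage are invented.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column of the table is empty")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Invented counts: rows are regions, columns are strategies.
no_association = chi_square_2x2(10, 10, 10, 10)
full_association = chi_square_2x2(20, 0, 0, 20)
```

With one degree of freedom, a value above roughly 3.84 corresponds to p < 0.05; with small counts an exact test would be a safer choice.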

Slide 22. Conclusion
An internet corpus can provide the data for identifying dialectal features
The phonetic differences are almost impossible to extract from such a corpus
Lexical features can be identified, provided the frequency is high enough
Besides, interesting syntactic features can be identified (which is valuable, since little is known about them)
