  • Number of slides: 24

1st International Conference In association with CIIL-Mysore, IIT-Mumbai, IIIT-

Words unite people.
Words can divide nations – they indulge in ‘war of words’…
Word-smiths fashion texts
Word-mongers talk nineteen to the dozen
Word-lords don’t tell you that they ‘double-speak’
Word-poets open the inner abyss of lanes & byelanes of meaning
And so do WordNets
Which is why we are all here!

Welcome to the 1st Global WordNet Conference

MY ADDRESS HAS TWO PARTS
  • First, I shall tell you a little about what the Indian linguistic scene is like, and what we at CIIL have been doing.
  • Then, we will offer our suggestions on what we in India could do in WordNet.

CENTRAL INSTITUTE OF INDIAN LANGUAGES
भारतीय भाषा संस्थान, शिक्षा विभाग, भारत सरकार
Initiatives in LANGUAGE TECHNOLOGY

CIIL in the first three decades: Equipping Language teachers and Analysts technologically

1. An Apex Institution under Languages Division, MHRD
  • CIIL completed 32 years in July 2001.
  • This 287-person institution works for the development of Indian languages.
  • CIIL has five Centers with Research Groups (16) and Service Groups (6).
  • 7 Regional Language Centers are at Bhubaneswar, Guwahati, Lucknow, Mysore, Patiala, Pune, & Solan.

2. Four Main Objectives
  1. Develops languages by creating content, corpus, techniques and technologies.
  2. Protects & documents Minority & Tribal languages.
  3. Creates linguistic harmony by teaching 15 Indian tongues to non-native learners.
  4. Above all, advises both Central and State governments on matters related to language.

3. Functionality and Multi-disciplinarity
  • Although the mainstay is Indian Languages & Linguistics, the focus of all projects and programmes is on developing materials & products in print, audio, video and computational form.
  • In addition, there is considerable interest in Comp. Lit., Education, Language Technology & NLP, Folklore, Geography, Statistics, Psychology, Sociology & Translation.

4. Coverage of CIIL - sizable
  • Archived data for 118 languages
  • Creating Voice Corpora
  • Studied 80 Tribal languages
  • 35 grammars on-line soon
  • Published 490 books
  • Cassette Courses in: Assamese, Urdu, Bengali, Kashmiri & Marathi
  • Radio courses in Hindi through Kannada

5. Major Publications – 490+ books all produced in-house
  • 22 Grammars
  • 30 Intensive Courses
  • 24 2nd Lg Textbooks
  • 5 Common Vocab.
  • 18 Dictionaries
  • 49 Apni Boli (KVS)
  • 15 Pictorial Glossaries
  • 16 Literacy Books
  • 12 Rhymes/Lg Games
  • 12 Folklore
  • 16 Proceedings
  • 9 Bibliographies

6. The Challenge before CIIL: Enormous

A truly plural world of languages
  • 1,576 rationalized mother-tongues;
  • 1,796 other mother-tongues;
  • 114 languages with 10,000+ speakers;
  • Large variation: Hindi (337 m) to Maram of Manipur with 10,144;
  • Large non-scheduled languages: Bhili (6 m) and Santali (5 m);
  • 146 radio languages / 69 school languages / 35 languages with dailies.

7. Programs - Modes of Delivery
  • 10-month L2 teaching: 8000 teachers trained
  • Distance Courses in Tamil/Telugu/Bengali/Urdu
  • On-line Programs in 15 Indian languages
  • Kannada for officials in Karnataka
  • Radio courses with AIR’s collaboration
  • 3-month Courses in Communication
  • Orientation for Mother-tongue teachers
  • Refresher Courses in Linguistics
  • NLP Training modules

8. Language Technology – Further Goals
  • Enlargement of 3-million word Corpora: 100 m word corpora for Hindi-Urdu
  • Multilingual multidirectional E-Dictionaries
  • On-line Administrative Glossaries
  • Lexical databases for MT Programs
  • Tagging & Corpus Tools
  • E-Zines and E-Journals
  • Language Information Services
  • Anukriti: Web-based Translation services

9. Indian Lgs & IT at CIIL
  • 132-node LAN set up
  • V-SAT through STPI
  • Browsing centre
  • Has 2400 E-Journals & 350 paper journals
  • Collaborating with Schoolnet for electronic materials
  • New generation Lg Labs
  • Focus: Visual Phonetics

10. LIS-India Website
Type Language Name:
Type Area Name:
  • Home or http://www.ciil.org/
  • General Information
  • Language/Area Profile: Geolinguistic; Sociolinguistic; Cultural; Literary
  • Language/Area History: Genealogical; Archaeological; Cultural; Textual
  • Language Vitality: Attitudinal; Utilitarian; Socio-political; Referential
  • Grammatical Information: Phonetic; Graphemic; Phonological; Morphological; Lexical; Syntactic; Semantic; Stylistic
  • Biblio search

11. Anukriti A Translation with NBT/SA
  • WEB-BASED SERVICE SITE called ANUKRUTI.
  • To be maintained with NBT/Sahitya Akademi
  • E-journals
  • Technological Tools

  • Electronic lexicon
  • Corpus & tools
  • Parallel corpora
  • Cultural Glossaries
  • Thesauri
  • Word finders
  • WordNets

12. Bhasha Bharati Project
To be set up in collaboration with
  • Sahitya Akademi
  • Sangeet Natak Academy
  • All India Radio
  • Doordarshan
  • National Library
  • National Archive
  • National Book Trust
  • Major TV Channels
  • Films Division
  • Major Newspaper houses
  • Numerous Foundations
  • Individual writers
  • Heirs of writers
  • Personal libraries
  • Little magazines
This rich manuscriptorium will display the plural literary and linguistic landscape of India.

13. Doctoral Programs under planning
Already available through 22 Universities:
  • Linguistics & Psychology
Now being planned in:
  • NLP
  • Folklore/Communication
  • Translation
  • Indian Gram. Tradition

14. Future Programs
  • Diploma in Experimental Phonetics
  • Masters by Research in Field Linguistics
  • Courses in Statistical Linguistics
  • Diploma in Translation Studies
  • Diploma in Folklore/Comp. Lit. & Semiotics
  • Internship in Linguistic Geography
  • Internship in NLP & Corpus Linguistics

WHAT COULD WE DO TO CREATE AN

India has already had a strong lexicographical tradition
  • Working on WordNet, therefore, should come naturally to us. Efforts have already begun, as we see in Hindi, Tamil, Oriya and a few other languages. There does not seem to be any academic coordination, however.

  • Early 20th-century Indian linguistics was dominated by studies on sound systems and etymologies
  • Mid-20th century focused on word-formation patterns
  • Late 20th century emphasized syntax

We haven’t so far worked seriously on Lexical Semantics
  • While Sociolinguistics was a favourite, serious Psycholinguistics was almost absent.
  • Formal Syntax was highly valued, but the intricacies of Semantics were not so attractive.
  • The making of dictionaries continued throughout, but major concerted efforts in each language were highly individualistic or had happened long ago.
  • While writing software or applying it means money, and is hence a crowded field, Language Technology has so far been neglected.

So, what do we need to do now?
  • Create an Indian WordNet Association
  • Work in a coordinated manner
  • Remember to focus on areal semantic features, because with so much linguistic & cultural diversity, India is ideal to test and validate the concept of WordNet.