In 2018, the Government of the Indian state of Odisha published 21 dictionaries in the 21 indigenous languages native to the state. The dictionaries were developed in collaboration with the native communities for planned use in multilingual primary education programs. The trilingual dictionaries, with translations from the indigenous languages into English and Odia (the official language of Odisha), were published in August 2019 for public use on an online education portal managed by the Government.
On October 17, the online education portal Odisha Virtual Academy relicensed all the dictionaries under a Creative Commons Attribution 4.0 International license.
Governments using CC licenses to share knowledge makes us happy.
The Indian state of Odisha just relicensed 21 dictionaries – in all the 21 indigenous languages that are spoken in the province – under CC-BY 4.0.
They can all be downloaded here: https://t.co/L6O4WvZt7J
– Creative Commons (@creativecommons) October 16, 2019
Eighth Schedule of the Indian Constitution
The Eighth Schedule of the Indian Constitution lays the foundation for the use of a group of 22 Indian languages in governance, education and cultural promotion. It also provides guidelines for public service exams to be conducted in those scheduled languages. The languages currently on the list are Assamese, Bengali, Bodo, Dogri, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Maithili, Malayalam, Meitei, Marathi, Nepali, Odia, Punjabi, Sanskrit, Santali, Sindhi, Tamil, Telugu and Urdu. Inclusion in this list also helps the states to officially recognize locally spoken languages for education and administration. There are requests to add 38 more languages to the list.
This is not the first time that the Odisha Government has released its online resources under a free Creative Commons license. In 2017, it made headlines as the first state government in India to relicense eight of its social media accounts under a free license. This allowed the content to be used on openly licensed platforms such as Wikipedia and its sister projects, including Wikimedia Commons (multimedia library), Wikisource (free online library) and Wiktionary (free online dictionary). Later, Jnanaranjan Sahu, a Wikimedian from OWUG, created a tool to easily migrate openly licensed images from the relicensed social media accounts to Wikimedia Commons so that they can be used on Wikipedia.
Subhashish Panigrahi of Rising Voices interviewed Ranjana Chopra, who heads the Department of Development of Scheduled Castes and Tribes, Minorities and Classes of the Odisha Government, to learn about this project.
To provide some context on how the dictionaries were developed, Chopra explained that the need for bilingual dictionaries in indigenous languages to support multilingual education was the driving force behind this work. With 21 different indigenous languages spoken in the state, grassroots workers need trilingual proficiency and face immense obstacles due to the lack of such resources. Key organizations that took part in the compilation were the Academy of Tribal Languages and Culture (ATLC), the Scheduled Castes and Scheduled Tribes Research and Training Institute (SCSTRTI) and the Special Development Councils (SDC), all state government entities. The Museum of Tribal Arts and Artifacts in Bhubaneswar, the capital of Odisha, remains a resource center for visitors to learn about the indigenous peoples of Odisha, their languages and their cultures.
Rising Voices (RV): Many languages are not well documented. How did you collaborate with the communities of speakers to collect and compile the words? How did existing dictionaries help in this process?
Ranjana Chopra (RC): It is a fact that the indigenous languages have not been properly documented in the (Odisha) state although some sporadic attempts have been made over the years. Indigenous languages are full of dialectical variations. However, despite the variations, there are ‘nucleus’ areas where the “core language” is referred / spoken, although with some mixtures. While preparing bilingual dictionaries and trilingual proficiency modules, resource persons from the various nucleus areas were invited to work on the texts with a well-organized non-overlapping time plan. The ‘nucleus’ area and the relevant resource persons were identified through conducting workshops in respective language localities.
RV: Some languages, such as Bonda (Remosam), have only 8,000 speakers. Half of the Bonda community lives in remote villages. How do you plan multilingual education in their native languages in general and, in particular, how will you make these books available to them?
RC: As stated earlier, the texts prepared on the indigenous languages are meant to strengthen ongoing interventions in multilingual education including such activities in the Bonda language. Language teachers engaged from the same language speaking communities are supposed to facilitate formal education through their native languages. These language texts (bi-lingual dictionaries and trilingual proficiency modules) would help them in delivering the requirements.
As the Academy of Tribal Languages and Culture was in charge of developing these dictionaries and trilingual proficiency modules, it has already dispatched copies of these resources to relevant authorities and pockets where different development activities are going on. Copies are also provided to native-language teachers and front-line workers of line departments, including the Accredited Social Health Activists (ASHA), who work to create awareness among citizens of health planning and the use of existing health services, and the Anganwadi Workers, who educate people living in rural areas about basic health, including contraception and nutrition, and also provide pre-school education.
The author also contacted Manoj Kumar Mishra, secretary of the Department of Electronics and Information Technology of the Government of Odisha, about plans to make the resources available in Unicode. According to Wikipedia, “Unicode is a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems.” Before Unicode was available, many texts, especially in India, used legacy encoding standards that can make web search almost impossible, while text written in Unicode can be searched and shared universally.
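To make the search problem concrete, here is a minimal Python sketch of why text stored in a legacy font encoding cannot be found by a search for the Odia word, while the same text in Unicode can. The legacy mapping below is hypothetical and invented for illustration; it does not reproduce any real Odia font, and real legacy fonts often also store glyphs in visual rather than logical order, which makes conversion harder than this one-to-one lookup suggests.

```python
import unicodedata

# The word "Odia" in Odia script, written as explicit Unicode code points:
# O, DDA, NUKTA, VOWEL SIGN I, AA (the Unicode standard names this block "ORIYA").
ODIA_WORD = "\u0b13\u0b21\u0b3c\u0b3f\u0b06"

# HYPOTHETICAL legacy font mapping, invented for this illustration only.
# A legacy font draws Odia glyphs for ordinary Latin character codes, so the
# stored text is really just Latin characters.
LEGACY_TO_UNICODE = {
    "J": "\u0b13",
    "W": "\u0b21\u0b3c",
    "i": "\u0b3f",
    "@": "\u0b06",
}


def legacy_to_unicode(text: str) -> str:
    """Replace each legacy glyph code with its Unicode equivalent."""
    return "".join(LEGACY_TO_UNICODE.get(ch, ch) for ch in text)


def contains(haystack: str, needle: str) -> bool:
    """Substring search after NFC normalization of both strings."""
    nfc_needle = unicodedata.normalize("NFC", needle)
    nfc_haystack = unicodedata.normalize("NFC", haystack)
    return nfc_needle in nfc_haystack


legacy_page = "JWi@"  # looks like Odia only when the legacy font is installed
unicode_page = legacy_to_unicode(legacy_page)

print(contains(legacy_page, ODIA_WORD))   # searches plain Latin characters
print(contains(unicode_page, ODIA_WORD))  # searches real Odia code points
print([unicodedata.name(ch) for ch in ODIA_WORD])
```

The design point is that a search engine only sees stored character codes, not rendered glyphs, so conversion to Unicode has to happen before the text can be indexed in its own script.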
RV: As you may have noticed, the content of the dictionaries is readable but cannot be searched yet. Is there a plan to publish them in Unicode as well, so that they are universally searchable regardless of device and operating system?
Manoj Kumar Mishra: We are committed to bringing all the resources available on any of the (Odisha) government websites under Unicode, so that the content would be searchable on the web and will reach to every single citizen of the state, residing anywhere. Currently, we are working to improve the infrastructure required to make the text and the Odia font to be compatible and readable on different devices. Over the coming months, we will work to bring the dictionaries searchable on the Internet. We are also exploring to add these files to Odia Wikisource, where the community resource will convert it to Unicode, and will automatically become searchable on the web and consequently making the rich treasure of our written heritage accessible to all.
The dictionaries are useful for linguists and ethnographers to develop resources for languages that have no written form. They will also be used to create multimedia content intended to help younger generations make greater use of the language. India hosts more than 780 languages and approximately 220-250 languages have died in the last 50 years.