Note: Transliteration will not be perfect, as each language and script is unique. Where a character has no exact counterpart, an equivalent is substituted.
- About Unicode and Character table for all Scripts
- Unicode Table with equivalents for Indian Scripts
- Tamil Editor
- Kannada Editor
- Hindi (Devanagari Script) Editor
- Kannada from/to Tamil
- Kannada from/to Hindi (Devanagari Script)
- Hindi (Devanagari Script) from/to Tamil
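The note above says that transliteration is imperfect and sometimes needs substitution. A minimal sketch of why, assuming a simple approach that relies on the Devanagari, Tamil and Kannada Unicode blocks being laid out largely in parallel: each character is shifted by the distance between the two blocks. The function name `transliterate` and the table `BLOCK_START` are hypothetical names for this illustration, not the method used by the editors listed above. The sketch leaves a character unchanged when the target script has no letter at that position, where a real tool would substitute an equivalent, and it does not check the few positions where the blocks are not parallel.

```python
import unicodedata

# First code point of each script's Unicode block (from the Unicode charts).
BLOCK_START = {"devanagari": 0x0900, "tamil": 0x0B80, "kannada": 0x0C80}
BLOCK_SIZE = 0x80  # each of these three blocks spans 128 code points


def transliterate(text, source, target):
    """Shift each character by the distance between the two script blocks."""
    src, dst = BLOCK_START[source], BLOCK_START[target]
    out = []
    for ch in text:
        offset = ord(ch) - src
        if 0 <= offset < BLOCK_SIZE:
            candidate = chr(dst + offset)
            # unicodedata.name returns the default ("") for unassigned code points.
            if unicodedata.name(candidate, ""):
                out.append(candidate)
                continue
        out.append(ch)  # no counterpart in the target: left unchanged in this sketch
    return "".join(out)


kannada_word = "\u0c95\u0ca8\u0ccd\u0ca8\u0ca1"  # the word "Kannada" in Kannada script
print(transliterate(kannada_word, "kannada", "devanagari"))
print(transliterate(kannada_word, "kannada", "tamil"))  # last letter has no Tamil slot
```

Tamil has fewer consonant letters than Kannada or Devanagari, so the Kannada-to-Tamil direction hits empty positions often. That is where equivalent substitution becomes necessary.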
India has at least 33 principal languages in common use, out of more than 1,600 Indian languages that are spoken.
The language diversity of its one billion people makes the communications problems of most other countries seem trivial by comparison.
The 1961 census put the number of languages in India at 1,652. More recent censuses show slightly different numbers, from 1,576 to 1,721 “mother tongues” with separate grammatical structures. The exact figure is not available because of the richness of native Indian languages and their variants, contradictions between language surveys, and the difficulty of distinguishing Indian languages from Indian dialects.
It is hard to determine the exact number of languages and dialects in India. It is like counting the birds in a region, as in the Birbal story. As the saying goes, the taste of water changes every mile and the language or dialect changes every four miles (कोस कोस पर बदले पानी, चार कोस पर वाणी।). There are different theories about how many of these mother tongues qualify to be described as independent languages.
The encyclopaedic People of India series of the Anthropological Survey of India identified 75 "major languages" out of a total of 325 languages used in Indian households. Ethnologue reports India as home to 398 languages, including 387 living and 11 extinct languages.
As per the 2011 Census, there are about 122 languages, of which 23 (including English) are listed as the official languages of the Republic of India. 800 million Indians can speak these 23 languages. The Constitution stresses the primacy of Hindi, which is written in the Devanagari script, but English remains in widespread official use.
Collections and Essays
- Declaration of Human Rights: Tamil
- Declaration of Human Rights: Malayalam
- Languages: Concepts and Myths
- As per the People's Linguistic Survey of India (PLSI), as many as 780 different languages are spoken and 86 different scripts are used in the country. Nearly 250 languages have been lost in the last 50 years. 22 of the 780 languages are scheduled Indian languages. 122 languages have been declared by the census as spoken by a population exceeding 10,000. [hindustantimes 17 Jul 2013]
- For many educated Indians, English is an important language of communication.
- 22 national languages, plus English as the associate official language, for 23 in all. They are: Assamese; Bengali; Bodo; Dogri; English; Gujarati; Hindi; Kannada; Kashmiri; Konkani; Maithili; Malayalam; Marathi; Meitei (Manipuri); Nepali; Odia; Punjabi; Sanskrit; Santali; Sindhi; Tamil; Telugu; and Urdu. These languages have over 720 dialects.
- Formation of linguistic provinces. The acceptance of this policy involved the statutory recognition of all the major regional languages.
- The general script of the Aryan languages is different from the general script of the Dravidian languages. Along with these two main language families, there are others from the Sino-Mongoloid family, spoken in eastern India.
Number of Languages
|Radio programs ||100 plus|
|Newspapers ||90 plus|
|Schools teach||60 plus|
|Language Family||Number||Percentage of population|
|Indo-European or Indo-Aryan||54||70.0|
|Austro-Asiatic languages||25||1.2|
325 recognized/documented Indian languages
Agaria, Ahirani, Aimol, Aiton, Anal, Andamanese, Angani, Angika, Ao, Apatani, Arabic, Armenian, Ashing, Assamese, Asuri, Awadhi
Badaga, Baghelkhandi, Bagri, Baigani, Bajania, Balti, Bangni, Banjari, Basturia, Bauria, Bawm, Bazigar Boli, Bengali, Bhanja-bhumia, Bantu, Bharmauri, Bhairi, Bhili, Bhojpuri, Bhotia, Bhuiya, Bhumij, Bhunjia, Biate, Bilaspuri, Birhor, Birjia, Bishnupriya, Bodo, Bokar, Bondo, Bori, Braj Bhasha, Brijlal, Bugun, Bundelkhandi, Burmese, Bushari
Chakhesang, Chakma, Chambilai, Chameali, Chang, Changpa, Chattisgarhi, Chikari, Chinali, Chiru, Chote, Churasi
Dalu, Deori, Dhanki, Dhimal, Dhodia, Dhundhari, Didayi, Dimasa, Dingal, Dogri, Dommari, Droskhat/Dokpa, Duhlian-Twang
Gadaba, Gadiali, Gallong, Gameti, Gamit, Gangte, Garasia, Garhwali, Garo, Giarahi, Gondi, Gujarati, Gujjari, Gurung, Gutob
Hajong, Halam, Halbi, Harauti, Haryanavi, Hebrew, Himachali, Hindi, Hinduri, Hindusthani, Hmar, Ho, Hrusso, Hualngo
Jabalpuri, Jangali, Jarawa, Jaunsari, Juang
Kabui, Kachanga, Kachari, Kachchi, Kadar, Kagati, Kakbarak, Kanashi, Kangri, Kannada, Karbi, Karen, Karko, Kashmiri, Kathiawari, Khadiboli, Khaka, Khamba, Khampa, Khampti, Khampti-shan, Kharia, Khasi, Khaskura, Khatri, Kherwari, Khiangan, Khorusti, Khotta, Kinnauri, Kiradi, Kisan, Koch, Kodagu, Koi, Koireng, Kokni, Kolami, Kom, Komkar, Konda, Konicha, Konkani, Konyak, Koracha, Koraga, Korava, Korku, Korwa, Kota, Kotwalia, Kudmali, Kui, Kuki, Kulvi, Kumaoni, Kunbi, Kurukh, Kuvi
Ladakhi, Lahauli, Laihawlh, Lakher (Mara), Lalung, Lambani, Lamgang, Laotian, Laria, Lepcha, Limbu, Lisu, Lodha, Lotha, Lushai
Mag, Magahi, Magarkura, Mahal, Maithili, Majhi, Makrani, Malankudi, Malayalam, Malhar, Malto, Malvi, Manchat, Mandiali, Mangari, Mao, Maram, Marathi, Maria, Maring, Marwari, Mavchi, Meitei, Memba, Mewari, Mewati, Milang, Minyong, Miri, Mishing, Mishmi, Mizo, Monpa, Monsang, Moyon, Muduga, Multani, Mundari
Na, Nagari, Nagpuri, Naikadi, Naiki, Nati, Nepali, Nicobarese, Nimari, Nishi, Nocte
Odki, Onge, Oriya
Padam, Pahari, Paharia, Palilibo, Paite, Panchpargania, Pang, Pangi, Pangwali, Parimu, Parji, Paschima, Pasi, Pashto, Pawri, Pengo, Persian, Phom, Pochury, Punchi, Punjabi
Rai (Raikhura), Rajasthani, Ralte, Ramo, Rathi, Rengma, Riang
Sadri, Sajalong, Sambalpuri, Sangtam, Sansi, Santali, Sadra, Saraji, Sarhodi, Saurashtri, Sema, Sentinelese, Shekhawati, Sherdukpen, Sherpa, Shimong, Shina, Shompen, Sikligar, Sindhi, Singpo, Siraji, Sirmauri, Soliga, Sulung, Surajpuri
Tagin, Tai, Tamang, Tamil, Tangam, Tangkhul, Tangsa, Tataotrong, Telugu, Thado, Thar, Tharu, Tibetan, Toda, Toto, Tulu
Yereva, Yerukula, Yimchungre
Zakring (Meyer), Zeliang, Zemi, Zou.
Endangered languages and scripts
An endangered language is a language that is at risk of falling out of use, generally because it has few surviving speakers. If it loses all of its native speakers, it becomes an extinct language. UNESCO defines four levels of language endangerment between "safe" (not endangered) and "extinct": Vulnerable; Definitely endangered; Severely endangered; and Critically endangered. 191 languages of India are classified as vulnerable or endangered.
Some notes on North Indian languages
- Hindi is an Indian language spoken in most states in northern and central India. It is an Indo-European language of the Indo-Iranian subfamily. It evolved from the Middle Indo-Aryan Prakrit languages of the Middle Ages. Hindi derives much of its higher vocabulary from Sanskrit, and also has a large number of Persian, Arabic and Turkish words.
- Marathi is one of the many languages of India, and has a long literary history. It is the language spoken in the state of Maharashtra. Other names for the language are Maharashtra, Maharathi, Malhatee, Marthi, and Muruthu.
- Kashmiri is an Indo-Aryan language spoken in parts of India and Pakistan. It is an SVO* language written in a Persian script.
*Subject–verb–object (SVO) is a sentence structure where the subject comes first, the verb second, and the object third. SVO is the second-most common order by number of known languages, after SOV.
- Sindhi is an Indo-Aryan language spoken in the province of Sindh, Pakistan. The language can be written using Devanagari.
- Nepali is an Indo-Iranian language spoken in Nepal, India and Bhutan.
Some notes on South Indian Languages
South India is surrounded by sea on three sides and is separated from north India by the Vindhya mountain range. This triangular volcanic land is insulated by the Arabian Sea and Western Ghats on the west and the Bay of Bengal and Eastern Ghats on the east. Once part of the geologically primeval Gondwanaland, it remained culturally undisturbed for millennia, evolving an aura of poised tranquillity.
The dominant features of south India are the tropical climate, lush green tropical vegetation in the coastal areas, and the architecture, culture, languages and lifestyle, which have remained essentially Dravidian at the core.
The major languages in South India today are Tamil, Telugu, Kannada and Malayalam. There are several minor languages: Brahui, Gondi, Kui, Malto, Oraon (Kurukh), Toda, Tulu and Konkani (in Kannada script).
- Tamil is the state language of Tamil Nadu. Tamil is also an official language in other countries such as Singapore and Sri Lanka, and is widely spoken in Malaysia.
- Telugu - the state language of Andhra Pradesh
- Kannada - the state language of Karnataka
- Malayalam - the state language of Kerala.
- Of the minor languages, Tulu and Konkani are the only active languages and modern works continue to be published in these languages forming part of the literature of Karnataka State.
Around 18 percent of the Indian populace speak Dravidian languages. Only a few isolated groups of Dravidian speakers, such as the Gonds in Madhya Pradesh and Orissa and the Kurukhs in Madhya Pradesh and Bihar, reside in north India. The Brahuis in Pakistan are also Dravidian speakers.
The Dravidian family includes approximately 26 languages, which appear to be unrelated to languages of other known families. Some scholars include the Dravidian languages in a larger Elamo-Dravidian language family, which includes the ancient Elamite language.
The major languages in India, with over 720 dialects, are written in over 13 different scripts.
The Brahmi script is the earliest writing system after the Indus script. Most of the Indian scripts and several hundred scripts found in Southeast and East Asia are derived from Brahmi. Brahmi is an abugida that thrived in the Indian subcontinent and uses a system of diacritical marks to associate vowels with consonant symbols. It has numerous descendants, such as the Gupta script. A southern form of Brahmi developed into the Grantha script.
The Kharosthi script (also known as 'Indo-Bactrian' script) was more or less contemporary to Brahmi script and was employed to represent a form of Prakrit.
The origin of the Brahmi script is debated and so far unknown. Proposed sources include the Indus script, hieratic, cuneiform, Phoenician and Aramaic.
Prinsep deciphered Brahmi and Kharosthi from bilingual Indo-Greek coins. Tamil-Brahmi script was found at Palani in southern India and scientifically dated to 540 BCE.
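The description of Brahmi as an abugida, where marks attached to a consonant symbol supply the vowel, carries over to its descendants. A minimal sketch using Devanagari as a stand-in (an assumption made for illustration, since Devanagari is one of the scripts derived from Brahmi): a consonant letter carries an inherent vowel, a dependent vowel sign replaces it, and a virama suppresses it. The helper `code_point_names` is a hypothetical name.

```python
import unicodedata

KA = "\u0915"            # DEVANAGARI LETTER KA, read "ka" with its inherent vowel
VOWEL_SIGN_I = "\u093f"  # dependent vowel sign that replaces the inherent "a"
VIRAMA = "\u094d"        # sign that suppresses the inherent vowel


def code_point_names(text):
    """Return the Unicode name of every code point in the text."""
    return [unicodedata.name(ch) for ch in text]


syllable_ki = KA + VOWEL_SIGN_I  # consonant + vowel sign, read "ki"
bare_k = KA + VIRAMA             # consonant with no vowel, read "k"

for sample in (KA, syllable_ki, bare_k):
    print(sample, code_point_names(sample))
```

Each syllable is stored as a consonant letter followed by its mark, even though it is displayed as a single visual unit.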
The Unicode Consortium is a non-profit organization devoted to developing, maintaining, and promoting software internationalization standards and data, particularly the Unicode Standard, which specifies the representation of text in all modern software products and standards. The Unicode Consortium actively develops standards in the area of internationalization, including defining the behaviour of and relationships between Unicode characters. The Consortium works closely with W3C and ISO, in particular with ISO/IEC JTC 1/SC 2/WG 2, which is responsible for maintaining ISO/IEC 10646, the International Standard synchronized with the Unicode Standard.
The latest electronic version of the Unicode Standard can be found at the Unicode site. The publications of the Unicode Consortium include the Unicode Standard, with its Annexes and Character Database (http://www.unicode.org/ucd/), Unicode Technical Standards and Reports (http://unicode.org/reports/), Unicode Technical Notes and the Unicode Locales project, the Common Locale Data Repository.
The Unicode Character Standard primarily encodes scripts rather than languages. That is, where more than one language shares a set of symbols that have a historically related derivation, the union of the set of symbols of each such language is unified into a single collection identified as a single script. These collections of symbols (i.e., scripts) then serve as inventories of symbols which are drawn upon to write particular languages. In many cases, a single script may serve to write tens or even hundreds of languages (e.g., the Latin script).
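The point that Unicode encodes scripts rather than languages can be checked in a few lines. The sketch below guesses a character's script from the first word of its Unicode character name. This is a heuristic chosen because Python's standard library does not expose the Unicode Script property, and `scripts_used` is a hypothetical helper name. The Hindi and Marathi language names both come out as the single script Devanagari.

```python
import unicodedata


def scripts_used(text):
    """Return the set of script names guessed from Unicode character names."""
    scripts = set()
    for ch in text:
        # Look only at letters (L*) and combining marks (M*); skip digits, spaces, etc.
        if unicodedata.category(ch)[0] in "LM":
            # Heuristic: the first word of the character name is the script name.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


hindi = "\u0939\u093f\u0902\u0926\u0940"    # the word "Hindi" in Devanagari
marathi = "\u092e\u0930\u093e\u0920\u0940"  # the word "Marathi" in Devanagari
tamil = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"    # the word "Tamil" in Tamil script

for word in (hindi, marathi, tamil, "English"):
    print(word, scripts_used(word))
```

Two different languages map to one script here, so software that needs to know the language of a text has to get it from metadata or context rather than from the code points alone.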
A bilingual person is, in the broadest definition, anyone with communicative skills in two languages, whether active or passive. In a narrow definition, the term bilingual is often reserved for those speakers with native or native-like proficiency in two languages. Similarly, the terms trilingual and multilingual are used to describe comparable situations in which three or more languages are involved. Many bilingual speakers are able to switch from language to language with ease, sometimes in mid-sentence.
In India, the three-language formula is fairly popular, and many people can speak more than two languages. Speakers of Indian languages tend to maintain their languages over generations and centuries, even when they live away from the region where the language is dominant.
By necessity, a substantial minority are able to speak two Indian languages; even in the so-called linguistic states, there are minorities who do not speak the official language as their native tongue and must therefore learn it as a second language. Many tribal people are bilingual. Rural-urban migrants are frequently bilingual in the regional standard language as well as in their village dialect.
There are 26 major languages (no majority language) with more than one million speakers each!
Popularity of English Language
Around 10 percent of Indians speak English. Even at 5 percent, that is 50 million people, one of the largest populations of English speakers in the world. English is the main vehicle for certain kinds of knowledge, a library language. It is the best source of scientific knowledge today.
English is the lingua franca, or more precisely, the link language of India. It remains the language of the Lok Sabha (the parliament), of the higher courts, of the highest levels of the Indian Civil Service, of the major universities, of multi-nationals and of most large-scale Indian businesses.
Indian software developers and programmers invariably use English in their work.
Digital Divide or Digital Opportunity
But what of the other 90 to 95 percent, who cannot use English?
So, is there a market for local (vernacular) computing in South Asia? The story of the two shoe salesmen who went to a rural village will provide the answer. The first came back to his headquarters and said, "The situation is absolutely hopeless. Of a thousand people, not a single one wears shoes." The second returned to his home company and said, "This is an incredible opportunity: no one owns shoes and we can sell a thousand pairs!"
Points to note:
- For a multilingual nation, India is still lagging behind in language Internet usage and seeks investment.
- The average language net user is a 25-year-old who accesses the Internet from a cyber cafe and reads a regional-language newspaper.
- Language newspapers account for eight of the top-selling newspapers in the country, yet language net users are only 9.6 per cent of the total. So the critical issue is infrastructure.
- The survey, titled 'Net Bhasha: state and future of language Internet services in India', showed that a strong off-line presence results in strong on-line use. A strong regional-language newspaper, which already has a strong reader base, therefore finds it easier to convert that base to its on-line offering.
- The Internet is still an urban, English-dominated medium in the country, with e-mail as its strongest use.
- Cyber cafes have proliferated in towns. Other hitches behind the low usage are ease of use and technology issues such as fonts that have to be downloaded, which put off potential users. "Language use will benefit if cyber cafes offered language keyboards," it is felt.
Some Interesting Facts
- Traffic keeps to the left side of the road (and cars are right-hand drive).
- English used in India is modelled on British English.
- Date format: dd/mm/yyyy
- Number format: 100 thousand = 1 lakh. 10 million or 100 lakhs or 1,00,00,000 = 1 crore.
- Postal Code (PIN): 6 digits.
- Official Measurements: Metric
- Voltage: 220 V, 50 Hz
- Financial Year starts on April 1.
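The number and date conventions in the list above can be sketched in code. This is a minimal illustration, assuming whole numbers only. The function name `format_indian` and the constants `LAKH` and `CRORE` are hypothetical names chosen for this example. Digits are grouped as the last three first, then in pairs, which reproduces the 1,00,00,000 form shown for one crore.

```python
from datetime import date

LAKH = 100_000      # 100 thousand
CRORE = 10_000_000  # 100 lakhs, or 10 million


def format_indian(n):
    """Format an integer with Indian digit grouping (last three digits, then pairs)."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    if len(digits) <= 3:
        return sign + digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    # Peel two digits at a time off the right-hand end of the remaining part.
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.insert(0, head)
    return sign + ",".join(groups + [tail])


def format_date(d):
    """Format a date as dd/mm/yyyy."""
    return d.strftime("%d/%m/%Y")


print(format_indian(LAKH))            # 1,00,000
print(format_indian(CRORE))           # 1,00,00,000
print(format_date(date(2013, 7, 17)))  # 17/07/2013
```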
Old language map of India
References (including sources) and Language Studies websites
- AncientScripts.com URL:http://www.ancientscripts.com/
- India: Languages and Scripts URL:http://www.cs.colostate.edu/~malaiya/scripts.html
- Wikipedia URL:http://en.wikipedia.org/wiki/Indian_languages
- Introduction to Indic scripts URL:http://people.w3.org/rishida/scripts/indic-overview/