A Study of the Statistics of Letters in Bodo News Text

Language Documentation, its Significance and Scopes in NE India

January, 2016

Sanjib Narzary and Mahananda Brahma

Abstract Devanagari has 33 consonants, 11 vowels and 17 conjunctive notations. The frequency with which these letters are used in any language written in the Devanagari script provides information about the morphological structure of that language, and such statistics find application in Human-Computer Interaction, Cryptography, Data Compression, Keyboard Design, Language Processing, etc. Our method is based on systematically collected monogram, bigram and trigram statistics, with special reference to the Bodo language. The statistical data obtained were analysed using the Node.js event-driven JavaScript environment. With the monogram identification technique, the most frequently used consonants and conjunctive notations were found to be among न र ब स य म ल ग ज द फ ख व थ and ा िा ा ा ा ा ा ा ा . The bigram and trigram letter distributions indicate that combined letters are mostly used to form meaningful words in Bodo, e.g., ल + क + ा + र + ा = लक्र (lokhra) (tiger in English).
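The monogram, bigram and trigram counting described in the abstract can be illustrated with a minimal Node.js sketch. It is not the authors' code: the names `countNgrams`, `mostFrequent` and `DEVANAGARI_WORD` are hypothetical, the sample string is an arbitrary Devanagari phrase rather than text from the Bodo news corpus, and treating each Unicode code point as one "letter" is an assumption that may differ from the tokenisation used in the study.

```javascript
// Sketch: count letter n-grams (code-point level) inside Devanagari words.
// Assumption: one "letter" is one Unicode code point, so vowel signs and the
// virama count as separate symbols; the study's own tokenisation may differ.

// Runs of Devanagari letters and signs, excluding the danda marks
// (U+0964, U+0965) and the Devanagari digits (U+0966 to U+096F).
const DEVANAGARI_WORD = /[\u0900-\u0963\u0970-\u097F]+/g;

// Return a Map from each n-gram to its number of occurrences.
// N-grams never cross word boundaries.
function countNgrams(text, n) {
  const counts = new Map();
  const words = text.match(DEVANAGARI_WORD) || [];
  for (const word of words) {
    const letters = Array.from(word); // split the word by code point
    for (let i = 0; i + n <= letters.length; i++) {
      const gram = letters.slice(i, i + n).join("");
      counts.set(gram, (counts.get(gram) || 0) + 1);
    }
  }
  return counts;
}

// Return the k most frequent [ngram, count] pairs, highest count first.
function mostFrequent(counts, k) {
  return [...counts.entries()].sort((a, b) => b[1] - a[1]).slice(0, k);
}

// Arbitrary illustrative input, not taken from the Bodo news corpus.
const sample = "राम राम नाम";
const monograms = countNgrams(sample, 1);
const bigrams = countNgrams(sample, 2);
const trigrams = countNgrams(sample, 3);

console.log(mostFrequent(monograms, 3));
console.log(mostFrequent(bigrams, 3));
console.log(mostFrequent(trigrams, 3));
```

Counting within word boundaries keeps bigrams and trigrams from spanning two words, which matches the abstract's focus on letter combinations that form meaningful words. A grapheme-cluster segmentation could be swapped in if conjuncts should be counted as single units.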

Citation

@inproceedings{narzary-etal-2016-bodo-word-statistics,
    title = "A Study of the Statistics of Letters in Bodo News Text",
    author = "Sanjib Narzary and Mahananda Brahma",
    booktitle = "Language Documentation, its Significance and Scopes in NE India",
    month = jan,
    year = "2016",
    address = "Kokrajhar, India",
    publisher = "CIT Kokrajhar",
    url = "",
}