Deep Learning based Named Entity Recognition for the Bodo Language
Procedia Computer Science Journal
Feb, 2024
Abstract
Named Entity Recognition (NER) is one of the important applications of natural language processing (NLP). It automatically recognises and categorises named entities in a document, such as the names of individuals, groups, and places. NER is crucial to the success of applications including text summarization, machine translation, and information extraction and retrieval, and it is useful across a wide variety of topics and languages. Despite its widespread use and effectiveness in English, NER is still under investigation for Indian languages such as Bodo. The lack of resources and of a high-quality dataset makes NER in Bodo a difficult task. In this research, we investigate a deep learning-based NER tagger for the Bodo language. We generate an NER-tagged dataset for Bodo using Doccano and enlarge it with a data augmentation technique. As there is no baseline Bodo NER model to compare with, we applied several deep learning techniques to the Bodo NER system and compared their results. The character-based LSTM model achieved an accuracy of 99.62%, a precision of 99.75%, a recall of 98.74%, and an F-score of 99.35%. This study also reports the performance of GRU- and CNN-based models on the Bodo NER task.
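The abstract reports accuracy, precision, recall, and F-score but does not state how they were computed. As an illustration only, the following minimal Python sketch shows one common way to derive these four metrics from gold and predicted tag sequences. It assumes token-level, micro-averaged scoring in which every tag other than `O` counts as an entity; the function name `token_prf`, the `outside` parameter, and the tag labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def token_prf(
    gold: Sequence[str], pred: Sequence[str], outside: str = "O"
) -> Tuple[float, float, float, float]:
    """Return (accuracy, precision, recall, f1) at the token level.

    Assumption: any tag different from `outside` is an entity tag, and a
    prediction is a true positive only if it is an entity tag that matches
    the gold tag exactly.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Tokens tagged as an entity whose predicted tag equals the gold tag.
    true_pos = sum(1 for g, p in zip(gold, pred) if p != outside and g == p)
    # All tokens the model tagged as an entity.
    pred_pos = sum(1 for p in pred if p != outside)
    # All tokens that are entities in the gold annotation.
    gold_pos = sum(1 for g in gold if g != outside)
    # All tokens (entity or not) where the prediction matches the gold tag.
    correct = sum(1 for g, p in zip(gold, pred) if g == p)

    accuracy = correct / len(gold) if gold else 0.0
    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


if __name__ == "__main__":
    # Toy example with hypothetical tags; not data from the paper.
    gold_tags = ["PER", "O", "LOC", "O", "ORG"]
    pred_tags = ["PER", "O", "O", "O", "ORG"]
    acc, prec, rec, f1 = token_prf(gold_tags, pred_tags)
    print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```

Because most tokens in running text are tagged `O`, accuracy in this scheme is usually dominated by non-entity tokens, which is why precision, recall, and F-score are reported alongside it. Entity-level (span-based) scoring is a stricter alternative and may give different numbers.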