Khasi Language As Dominant Part-of-Speech (POS) Ascendant in NLP
7 Pages Posted: 20 Mar 2019
Date Written: 2018
Part-of-speech (POS) tagging is a pre-processing step in Natural Language Processing (NLP). A POS tagger is a system that takes a sentence as input and outputs a tag for each word. POS tagging for Indian languages is challenging because these languages are morphologically rich. The hardest problem in POS tagging is ambiguity: a single word can be used in multiple senses depending on its context. Words or other items of the language structure can therefore be disambiguated only from the context in which they are used. This paper presents the grammatical parts of speech and the POS tag-sets designed for the Khasi language. Khasi is an Austro-Asiatic language spoken in the central and eastern parts of the state of Meghalaya, India. Although Khasi is mostly isolating in morphology, some words are derived through certain morphological processes.
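To make the ambiguity problem concrete, the following is a minimal sketch of a rule-based tagger that resolves an ambiguous word from the tag of the preceding word. It is not the tagger or the tag-set described in the paper: the lexicon, the tag names (DET, PRON, NOUN, VERB), the two context rules, and the function name `tag_sentence` are all assumptions made for illustration, and English words are used only because no Khasi data is given here.

```python
# Hypothetical toy lexicon: each word maps to every tag it may carry.
# "book" is deliberately ambiguous between NOUN and VERB.
LEXICON = {
    "the": {"DET"},
    "a": {"DET"},
    "i": {"PRON"},
    "we": {"PRON"},
    "book": {"NOUN", "VERB"},
    "flight": {"NOUN"},
    "read": {"VERB"},
}


def tag_sentence(words):
    """Return (word, tag) pairs using a lexicon lookup plus simple context rules."""
    tagged = []
    prev_tag = None  # tag assigned to the previous word, None at sentence start
    for word in words:
        # Unknown words fall back to NOUN (an assumption of this sketch).
        candidates = LEXICON.get(word.lower(), {"NOUN"})
        if len(candidates) == 1:
            # Unambiguous word: take its only tag.
            tag = next(iter(candidates))
        elif prev_tag == "DET" and "NOUN" in candidates:
            # Context rule 1: after a determiner, prefer the noun reading.
            tag = "NOUN"
        elif prev_tag in (None, "PRON") and "VERB" in candidates:
            # Context rule 2: sentence-initially or after a pronoun, prefer the verb reading.
            tag = "VERB"
        else:
            # No rule applies: pick deterministically (alphabetical order).
            tag = sorted(candidates)[0]
        tagged.append((word, tag))
        prev_tag = tag
    return tagged


if __name__ == "__main__":
    print(tag_sentence(["I", "book", "a", "flight"]))
    print(tag_sentence(["I", "read", "the", "book"]))
```

In this sketch the same word form "book" receives VERB in the first sentence and NOUN in the second, purely because of the tag that precedes it. A stochastic or transformation-based tagger would learn such preferences from annotated data rather than from hand-written rules.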
Keywords: Khasi Language, Natural Language Processing (NLP), Machine Learning, Part of speech (POS), POS tagging, POS tagger, Rule-based, Stochastic, Transformation method