Abstract:
Part-of-Speech (POS) tagging is the process of assigning a syntactic category to each word in a sentence. Identifying POS tags is a fundamental task for many natural language processing applications, such as grammar checking and machine translation. This paper proposes a novel POS tagging approach for the Tamil language that determines hierarchical POS tags for words. The approach combines a Hidden Markov Model, the Viterbi algorithm, Tamil grammar rules, N-grams, and stemming techniques. Test results show that the POS tags determined by this approach for Tamil words are 96% accurate, as verified by a Tamil scholar.
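
To illustrate how a Hidden Markov Model and the Viterbi algorithm can work together in a tagger of this kind, the following is a minimal sketch of first-order Viterbi decoding. It is not the implementation described in this paper: the flat tag set, the transliterated tokens, and every probability value are hypothetical, chosen only to make the example run, and the grammar rules, N-gram handling, stemming, and hierarchical tags are left out.

```python
import math


def viterbi(tokens, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for tokens under a first-order HMM.

    Probabilities missing from the tables are replaced by `floor`, a crude
    stand-in (an assumption of this sketch) for proper smoothing.
    """
    if not tokens:
        return []

    def lg(p):
        # Work in log space to avoid underflow on long sentences.
        return math.log(max(p, floor))

    # best[i][t]: best log-probability of any tag path ending in tag t at position i
    best = [{t: lg(start_p.get(t, 0.0)) + lg(emit_p[t].get(tokens[0], 0.0))
             for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best path

    for i in range(1, len(tokens)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lg(trans_p[p].get(t, 0.0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lg(emit_p[t].get(tokens[i], 0.0))
            back[i][t] = prev

    # Backtrack from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(tokens) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: made-up tags and probabilities, not the paper's data.
TAGS = ["PRON", "NOUN", "VERB"]
START = {"PRON": 0.5, "NOUN": 0.4, "VERB": 0.1}
TRANS = {
    "PRON": {"PRON": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "NOUN": {"PRON": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"PRON": 0.3, "NOUN": 0.5, "VERB": 0.2},
}
EMIT = {
    "PRON": {"avan": 0.9},
    "NOUN": {"puththagam": 0.8},
    "VERB": {"padiththaan": 0.9},
}

print(viterbi(["avan", "puththagam", "padiththaan"], TAGS, START, TRANS, EMIT))
# → ['PRON', 'NOUN', 'VERB']
```

In a fuller system, the start, transition, and emission tables would be estimated from a tagged corpus rather than written by hand, and the fixed probability floor would be replaced by a proper treatment of unseen words.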