Classification of Kannada documents using novel semantic symbolic representation and selection method

Ranganathbabu Kasturi Rangan

Bukahally Somashekar Harish

Chaluvegowda Kanakalakshmi Roopa

International Journal of Artificial Intelligence

Abstract

Kannada is one of the 22 scheduled regional languages of India and is also a low-resource language. Classifying Kannada documents is difficult because of the language's rich vocabulary, agglutinative terms, and lack of resources. A good representation and the selection of prominent features help address these challenges in document classification tasks. In this paper, we propose a semantic symbolic representation and feature selection method that represents Kannada terms as interval values embedded with positional information. We also propose the selection of prominent, discriminative symbolic feature vectors. A symbolic document classifier is then used to classify the Kannada documents. The proposed cluster-based symbolic representation preserves intra-class variance and reduces ambiguity in the classification of Kannada documents. Experiments are performed on two Kannada document datasets that are multilabel and unbalanced. A comparative analysis of the proposed method against other standard methods is also presented.

DOI

10.11591/ijai.v14.i4.pp3354-3365

ISSN Information

2089-4872

Pages

3354-3365

More Information

Volume 14

Issue 4

Published 2025-08-01
