A Comparison of Algorithms for Text Classification of Albanian News Articles

Kadriu, Arbana; Abazi-Bexheti, Lejla

2017 ENTRENOVA Conference Proceedings

7 Pages Posted: 20 Nov 2018

Arbana Kadriu

Lejla Abazi-Bexheti

South East European University (SEEU)

Date Written: September 7, 2017

Abstract

Text classification is an essential task in text mining and information retrieval. Many algorithms have been developed to classify computational data, and most of them have been extended to classify textual data. We used some of these algorithms to train classifiers on one part of our crawled Albanian news articles and to classify the other part with the trained classifiers. The categories are: latest news, economy, sport, showbiz, technology, culture, and world. First, we remove all stop words from the crawled articles; the output of this step is a separate text file for each category. These files are then split into sentences, and each sentence is assigned the category of its file. The sentences are then collected into a single list of sentence/category tuples. This list is shuffled to randomize the order of the categories and is then used to train (80% of the tuples) and to test (the remaining 20%) different classifiers. We trained and tested each classifier separately, measuring its accuracy, and we also analysed the training and testing time.
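
The preparation and evaluation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' implementation: the abstract does not name the classifiers or tools used, so a small multinomial Naive Bayes model stands in for "a classifier", and the stop-word list, function names, and class name are all hypothetical. Stop words are filtered per sentence here rather than per file, which yields the same tokens.

```python
import random
import re
import time
from collections import Counter, defaultdict
from math import log

# Hypothetical, tiny stop-word list; the paper's actual list is not given.
STOP_WORDS = {"dhe", "në", "e", "të", "një"}


def build_pairs(texts_by_category, stop_words=STOP_WORDS):
    """Split each category's text into sentences, drop stop words,
    and return a list of (words, category) tuples."""
    pairs = []
    for category, text in texts_by_category.items():
        for sentence in re.split(r"[.!?]+", text):
            words = [w for w in re.findall(r"\w+", sentence.lower())
                     if w not in stop_words]
            if words:  # skip empty fragments left by the split
                pairs.append((words, category))
    return pairs


def shuffle_and_split(pairs, train_fraction=0.8, seed=0):
    """Shuffle a copy of the list, then split it into train and test parts."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


class NaiveBayes:
    """Stand-in classifier (assumption): multinomial Naive Bayes
    with add-one smoothing."""

    def fit(self, train):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter()
        for words, category in train:
            self.class_counts[category] += 1
            self.word_counts[category].update(words)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, words):
        total = sum(self.class_counts.values())

        def score(category):
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocab)
            s = log(self.class_counts[category] / total)
            for w in words:
                s += log((counts[w] + 1) / denom)
            return s

        return max(self.class_counts, key=score)


def evaluate(classifier, train, test):
    """Return (accuracy, training seconds, testing seconds)."""
    start = time.perf_counter()
    classifier.fit(train)
    train_time = time.perf_counter() - start

    start = time.perf_counter()
    correct = sum(classifier.predict(words) == category
                  for words, category in test)
    test_time = time.perf_counter() - start
    return correct / len(test), train_time, test_time
```

Calling `evaluate` once per classifier on the same `train` and `test` lists would give the per-classifier accuracy and timing figures that the abstract refers to; any object with `fit` and `predict` methods of this shape could be swapped in for the stand-in model.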

Keywords: data mining, text classification, news articles, machine learning

JEL Classification: C00, C30

Suggested Citation

Kadriu, Arbana and Abazi-Bexheti, Lejla, A Comparison of Algorithms for Text Classification of Albanian News Articles (September 7, 2017). 2017 ENTRENOVA Conference Proceedings, Available at SSRN: https://ssrn.com/abstract=3282474

Arbana Kadriu (Contact Author)

SEE University

Ilindenska bb
Tetovo
Macedonia

Lejla Abazi-Bexheti

South East European University (SEEU)

Ilindenska nn
Tetovo, 1200
Macedonia

Paper statistics

Downloads: 43
Abstract Views: 552

Related eJournals

Innovation Measurement & Indicators eJournal

Computation Theory eJournal

Electrical Engineering eJournal

Mechanical Engineering eJournal