Reading Comprehension Based Question Answering System in Bangla Language with Transformer-Based Learning

13 Pages. Posted: 25 May 2022. Publication Status: Published

Tanjim Taharat Aurpa

Bangabandhu Sheikh Mujibur Rahman Digital University

Richita Khandakar Rifat

Jahangirnagar University

Md Shoaib Ahmed

Jahangirnagar University

Md Musfique Anwar

Jahangirnagar University

A. B. M. Shawkat Ali

The University of Fiji

Abstract

A question answering (QA) system, in any language, is a collection of mechanisms for obtaining answers to user questions from data in various forms. Reading comprehension (RC) is one such form, and it is gaining popularity in Natural Language Processing (NLP) research. Work on RC exists in several languages, mainly English. For the Bangla language, no RC dataset is available and no prior work has been done. In this research, we develop a question-answering system based on RC. To do so, we construct a dataset containing 3636 reading comprehensions along with questions and answers. We apply a transformer-based deep neural network model to answer questions about the reading comprehensions accurately and quickly. We train several deep neural network architectures on our dataset: LSTM (Long Short-Term Memory), Bi-LSTM (Bidirectional LSTM) with attention, RNN (Recurrent Neural Network), ELECTRA, and BERT (Bidirectional Encoder Representations from Transformers). Of these, the transformer-based pre-trained language architectures BERT and ELECTRA perform best. The trained BERT model achieves 87.78% testing accuracy and 99% training accuracy, and ELECTRA provides training and testing accuracy of 82.5% and 93%, respectively.

Keywords: Bangla Question Answering, Transformer-Based Learning, Reading Comprehension, Bangla Language, Bangla Reading Comprehension

Suggested Citation

Aurpa, Tanjim Taharat and Rifat, Richita Khandakar and Ahmed, Md Shoaib and Anwar, Md Musfique and Ali, A. B. M. Shawkat, Reading Comprehension Based Question Answering System in Bangla Language with Transformer-Based Learning. Available at SSRN: https://ssrn.com/abstract=4119325 or http://dx.doi.org/10.2139/ssrn.4119325

Tanjim Taharat Aurpa (Contact Author)

Bangabandhu Sheikh Mujibur Rahman Digital University

Gazipur
Bangladesh

Richita Khandakar Rifat

Jahangirnagar University

Savar
Social Science Faculty, Savar
Dhaka, 1342
Bangladesh

Md Shoaib Ahmed

Jahangirnagar University

Savar
Social Science Faculty, Savar
Dhaka, 1342
Bangladesh

Md Musfique Anwar

Jahangirnagar University

Savar
Social Science Faculty, Savar
Dhaka, Dhaka 1342
Bangladesh

A. B. M. Shawkat Ali

The University of Fiji

Saweni Lautoka
Lautoka City, PMB
Fiji
