Towards a Novel Weakly Supervised Joint Approach of Named Entity Recognition and Normalization for Noisy Text

5 Pages. Posted: 21 May 2018. Last revised: 19 Jun 2018.

Assia Mezhar

University Hassan II of Casablanca

Mohammed Ramdani

University Hassan II of Casablanca

Amal El Mzabi

University Hassan II of Casablanca

There are 2 versions of this paper

Date Written: May 15, 2018

Abstract

Applying Natural Language Processing (NLP) tasks to social media corpora is challenging because users often communicate in casual language, using out-of-vocabulary (OOV) words and internet abbreviations (slang). The performance of NLP tasks therefore needs to be improved when they are applied to social media text. We focus on a fundamental NLP task, Named Entity Recognition (NER), which assigns each entity a label (person, location, organization, etc.), as applied to Twitter. We improve NER by converting non-standard entities to their canonical form, a task called Named Entity Normalization (NEN). In this paper, we propose a novel weakly supervised joint approach for named entity recognition and normalization for noisy text. We jointly conduct weakly supervised NER and normalization of both single-token OOV words and multi-token slang, in order to recognize named entities of any type and restore them to their canonical form. This approach can give better results than existing state-of-the-art NER systems, NEN systems, and pipeline approaches.
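To make the idea of joint normalization and recognition concrete, the following is a minimal toy sketch and not the authors' method. It assumes small hand-written lexicons (`OOV_MAP`, `SLANG_MAP`, `GAZETTEER`) and a hypothetical helper `normalize_and_tag`, all invented here for illustration; a weakly supervised system would derive such resources from data rather than hard-code them.

```python
# Toy lexicons: assumptions for illustration only, not taken from the paper.
OOV_MAP = {"nyc": "New York City", "msft": "Microsoft"}   # single-token OOV words
SLANG_MAP = {("big", "apple"): "New York City"}           # multi-token slang
GAZETTEER = {"New York City": "location", "Microsoft": "organization"}


def normalize_and_tag(tokens):
    """Return (canonical_form, label) pairs in a single pass over the tokens.

    The label is None for tokens that are not recognized as named entities.
    """
    result = []
    i = 0
    while i < len(tokens):
        # Try a two-token slang match first, then fall back to a single token.
        pair = tuple(t.lower() for t in tokens[i:i + 2])
        if len(pair) == 2 and pair in SLANG_MAP:
            canonical = SLANG_MAP[pair]
            i += 2
        else:
            token = tokens[i]
            canonical = OOV_MAP.get(token.lower(), token)
            i += 1
        # Recognition is done on the canonical form, so normalization
        # directly informs the entity label.
        result.append((canonical, GAZETTEER.get(canonical)))
    return result


if __name__ == "__main__":
    print(normalize_and_tag("msft hired in nyc".split()))
    print(normalize_and_tag("luv the big apple".split()))
```

In this sketch the entity label is looked up on the canonical form rather than on the raw token, which is the intuition behind treating the two tasks jointly instead of running them as independent steps.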

Keywords: Lexical Normalization, NLP, Weak Supervision, NER

Suggested Citation

Mezhar, Assia and Ramdani, Mohammed and El Mzabi, Amal, Towards a Novel Weakly Supervised Joint Approach of Named Entity Recognition and Normalization for Noisy Text (May 15, 2018). Smart Application and Data Analysis for Smart Cities (SADASC'18). Available at SSRN: https://ssrn.com/abstract=3178707 or http://dx.doi.org/10.2139/ssrn.3178707

Assia Mezhar (Contact Author)

University Hassan II of Casablanca

Casablanca, 20000
Morocco

Mohammed Ramdani

University Hassan II of Casablanca

Casablanca, 20000
Morocco

Amal El Mzabi

University Hassan II of Casablanca

Casablanca, 20000
Morocco
