Unifying Context with Labeled Property Graphs: A Pipeline-Based System for Comprehensive Text Representation in NLP

31 Pages Posted: 7 Jul 2023

Ali Hur

Edith Cowan University

Naeem Janjua

Edith Cowan University

Mohi Ahmed

Independent

Abstract

Extracting valuable insights from vast amounts of unstructured digital text presents significant challenges across diverse domains. This research addresses the problem by proposing a novel pipeline-based system that generates domain-agnostic and task-agnostic text representations. Our approach uses labeled property graphs (LPGs) to encode contextual information, integrating diverse linguistic elements into a unified representation. By modeling context comprehensively and capturing fine-grained semantics, our system enables efficient graph-based querying and manipulation. We demonstrate its effectiveness by implementing NLP components that operate on LPG-based representations. Additionally, we introduce specialized patterns and algorithms to enhance specific NLP tasks, including temporal link detection, event enrichment, nominal mention detection, event participant detection, and named entity disambiguation. An evaluation on the MEANTIME corpus of manually annotated documents provides encouraging results and valuable insights into the strengths of our system. Our pipeline-based framework serves as a solid foundation for future research aimed at refining and optimizing text representations. Taken together, these contributions yield a method that uses LPG-based graph structures to generate comprehensive and semantically rich text representations, addressing the challenges of efficient information extraction and analysis in NLP.
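
To make the idea of a labeled property graph concrete, the following is a minimal illustrative sketch in Python, not the system described in the paper. The class `PropertyGraph`, the node labels (`Entity`, `Event`, `TimeExpression`), the relationship types (`PARTICIPANT`, `TEMPORAL_LINK`), the property names, and the example sentence are all assumptions chosen for illustration; the actual schema used by the authors is not given in this abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    labels: frozenset   # an LPG node may carry several labels
    props: dict         # arbitrary key/value properties


@dataclass
class Edge:
    src: int
    rel: str            # relationship type
    dst: int
    props: dict = field(default_factory=dict)


class PropertyGraph:
    """Tiny in-memory labeled property graph (hypothetical helper)."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, *labels, **props):
        node_id = len(self.nodes)
        self.nodes[node_id] = Node(node_id, frozenset(labels), props)
        return node_id

    def add_edge(self, src, rel, dst, **props):
        self.edges.append(Edge(src, rel, dst, props))

    def match(self, src_label, rel, dst_label):
        """Yield (src, edge, dst) triples matching a one-hop pattern."""
        for e in self.edges:
            s, d = self.nodes[e.src], self.nodes[e.dst]
            if e.rel == rel and src_label in s.labels and dst_label in d.labels:
                yield s, e, d


# Assumed example sentence: "Apple acquired Beats in 2014."
g = PropertyGraph()
apple = g.add_node("Entity", "Organization", text="Apple")
beats = g.add_node("Entity", "Organization", text="Beats")
event = g.add_node("Event", lemma="acquire", tense="PAST")
year = g.add_node("TimeExpression", value="2014")

g.add_edge(apple, "PARTICIPANT", event, role="ARG0")
g.add_edge(beats, "PARTICIPANT", event, role="ARG1")
g.add_edge(event, "TEMPORAL_LINK", year, rel_type="IS_INCLUDED")

# One-hop pattern queries over the unified representation.
participants = [
    (s.props["text"], e.props["role"])
    for s, e, _ in g.match("Entity", "PARTICIPANT", "Event")
]
temporal_links = [
    (s.props["lemma"], d.props["value"])
    for s, _, d in g.match("Event", "TEMPORAL_LINK", "TimeExpression")
]
```

In this sketch, entities, events, and time expressions live in one graph, so a single pattern query can retrieve event participants or temporal links without consulting separate annotation layers. A production system would more likely use a graph database with a declarative query language than an in-memory list of edges.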

Keywords: Natural language processing, Natural language understanding, Text graphs, Text representation, Text mining, Information extraction

Suggested Citation

Hur, Ali and Janjua, Naeem and Ahmed, Mohi, Unifying Context with Labeled Property Graphs: A Pipeline-Based System for Comprehensive Text Representation in NLP. Available at SSRN: https://ssrn.com/abstract=4503138 or http://dx.doi.org/10.2139/ssrn.4503138

Ali Hur

Edith Cowan University

Mount Lawley Campus
Perth
Churchlands 6018 WA
Australia

Naeem Janjua (Contact Author)

Edith Cowan University

Mount Lawley Campus
Perth
Churchlands 6018 WA
Australia

Mohi Ahmed

Independent
