Unifying Context with Labeled Property Graphs: A Pipeline-Based System for Comprehensive Text Representation in NLP
31 Pages. Posted: 7 Jul 2023
Abstract
The extraction of valuable insights from vast amounts of unstructured digital text presents significant challenges across diverse domains. This research addresses these challenges by proposing a novel pipeline-based system that generates domain-agnostic and task-agnostic text representations. Our approach leverages labeled property graphs (LPGs) to encode contextual information, facilitating the integration of diverse linguistic elements into a unified representation. By modeling comprehensive context and fine-grained semantics, our system enables efficient graph-based querying and manipulation. We demonstrate the effectiveness of our system through the implementation of NLP components that operate on LPG-based representations. Additionally, we introduce specialized patterns and algorithms to enhance specific NLP tasks, including temporal link detection, event enrichment, nominal mention detection, event participant detection, and named entity disambiguation. The evaluation of our approach on the MEANTIME corpus, which comprises manually annotated documents, provides encouraging results and valuable insights into the strengths of our system. Our pipeline-based framework serves as a solid foundation for future research aiming to refine and optimize text representations. By combining these contributions, we present a method that uses LPG-based graph structures to generate comprehensive and semantically rich text representations, addressing the challenges associated with efficient information extraction and analysis in NLP.
Keywords: Natural language processing, Natural language understanding, Text graphs, Text representation, Text mining, Information extraction