C-Graph Architectural Evolution

Pasquereau, Benoit; Meyrick, Paul

Not Available for Download

Proceedings of the 5th Annual RELX Search Summit

Posted: 16 Nov 2021 Last revised: 18 Nov 2021

Benoit Pasquereau

Paul Meyrick

Elsevier

Date Written: September 21, 2021

Abstract

The C-Graph project started as a mechanism to calculate document and author metrics in 'real time' and store them in SOLR for Scopus. It replaced a batch solution based on AWS RedShift, which was expensive to run and difficult to extend.

Once the migration of the metric calculation to C-Graph was complete, it was clear that C-Graph could also be used to compute metrics for other Elsevier products, like Science Direct and Engineering Village, as well as to power complex algorithms, like query intent for Search.

This paper presents how the architecture of C-Graph, which was initially developed to compute a finite, known set of metrics on one dataset, had to evolve to handle the following:
* other data sources: some delivered as xocs feeds, like Science Direct; others provided as Kafka topics, like grant awards; and others as RDF triples.
* not just other metrics but also different ways to compute them, for example an option to exclude self-citations.
* the set of documents on which the metrics are calculated: customers want metrics computed using only a subset of the documents.
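
The last two requirements can be illustrated with a minimal sketch. It is not the C-Graph implementation, which the abstract does not describe. The `Document` class, the `citation_count` function, and its parameters are hypothetical names, and a self-citation is assumed here to mean a citation where the citing and cited documents share at least one author.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Set


@dataclass(frozen=True)
class Document:
    """Hypothetical document node: an id plus the ids of its authors."""
    doc_id: str
    author_ids: FrozenSet[str]


def citation_count(
    target: Document,
    citing_docs: Iterable[Document],
    exclude_self_citations: bool = False,
    allowed_doc_ids: Optional[Set[str]] = None,
) -> int:
    """Count the documents citing `target`, with two optional filters.

    exclude_self_citations: skip citing documents that share an author
        with `target` (assumed definition of a self-citation).
    allowed_doc_ids: if given, only citing documents in this subset count.
    """
    count = 0
    for citing in citing_docs:
        # Restrict the calculation to the requested subset of documents.
        if allowed_doc_ids is not None and citing.doc_id not in allowed_doc_ids:
            continue
        # Drop citations from documents sharing at least one author.
        if exclude_self_citations and (citing.author_ids & target.author_ids):
            continue
        count += 1
    return count


# Illustrative data only: d2 shares author a1 with the cited paper.
paper = Document("d1", frozenset({"a1", "a2"}))
citing = [
    Document("d2", frozenset({"a1"})),
    Document("d3", frozenset({"a3"})),
    Document("d4", frozenset({"a4"})),
]

total = citation_count(paper, citing)
no_self = citation_count(paper, citing, exclude_self_citations=True)
subset = citation_count(paper, citing, allowed_doc_ids={"d2", "d3"})
```

Treating both variations as parameters of a single metric function, rather than as separate metrics, is one possible way to keep the set of metric definitions small as requirements grow.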

The initial implementation was also tightly coupled to its first client, SOLR for Scopus, and we present how we incrementally decoupled the two.

The new architecture had to accommodate these new requirements while maintaining good performance and low operating costs.

Keywords: Graph, architecture evolution, Kafka, decoupling, semantic technologies

Suggested Citation

Pasquereau, Benoit and Meyrick, Paul, C-Graph Architectural Evolution (September 21, 2021). Proceedings of the 5th Annual RELX Search Summit, Available at SSRN: https://ssrn.com/abstract=3965027

Benoit Pasquereau (Contact Author)

Elsevier

Paul Meyrick

Elsevier
