Data Observability: Ensuring Trust in Data Pipelines

8 Pages Posted: 10 Dec 2024

Krishna Kanagarla

Sara Software Systems LLC; Independent

Date Written: March 04, 2024

Abstract

Data observability is a key process for ensuring that a data pipeline remains consistent and credible. Checking data quality, data freshness, lineage, and schema changes in real time helps catch problems before they accumulate and affect the pipeline. Because the field of data observability is still in its infancy, this paper aims to identify the components it could be composed of, the tools it would ideally comprise, and the value it can bring to businesses, with an emphasis on recommendations for implementing it in contemporary data workloads.

Keywords: data visibility, data accuracy, dataset delivery, constant monitoring, data origin, schema evolution, outliers, decision-making process, risk management, work productivity

Suggested Citation

Kanagarla, Krishna Prasanth Brahmaji, Data Observability: Ensuring Trust in Data Pipelines (March 04, 2024). Available at SSRN: https://ssrn.com/abstract=5043481 or http://dx.doi.org/10.2139/ssrn.5043481
