Human DCG vs Engagement DCG: Evaluating the Wisdom of the Crowd

Posted: 16 Nov 2021

Date Written: September 21, 2021

Abstract

Human subject matter expert (SME) relevance ratings have long been used to derive search relevance precision and recall evaluations. Another source of relevance data is end customer-facing, wisdom-of-the-crowd result selection. Both methods can be compared directly using conventional search precision metrics such as Discounted Cumulative Gain (DCG), Precision@k, Mean Average Precision, and Mean Relevance Rank.

Human relevance testing (HRT) is performed by human subject matter experts rating query/document pairs in blind studies. Rating results are then tallied and analyzed both at the individual query/document pair level and across the entire dataset of all query/document pairs. Wisdom-of-the-crowd metrics assume that customer engagement with a particular document is proportional to that document's relevance. To measure engagement, a model is defined that uses the type and amount of activity the user performs with the document to create an equivalent relevance rating for the query/document pair. This engagement-based rating is then used to calculate the same relevance precision metrics as in HRT studies.
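The engagement-to-relevance mapping and the DCG computed from it can be sketched as follows. The action weights, grade thresholds, and example engagement data below are illustrative assumptions for exposition, not the model or data used in the study:

```python
import math

# Hypothetical engagement weights: each user action on a document
# contributes to an engagement score, which is then bucketed into a
# 0-3 graded relevance rating (thresholds are also illustrative).
ACTION_WEIGHTS = {"view": 1.0, "print": 2.0, "download": 3.0, "cite": 4.0}

def engagement_rating(actions):
    """Map (action, count) pairs to a graded relevance rating in 0-3."""
    score = sum(ACTION_WEIGHTS.get(a, 0.0) * n for a, n in actions)
    if score >= 6.0:
        return 3  # heavy engagement -> highly relevant
    if score >= 3.0:
        return 2
    if score > 0.0:
        return 1
    return 0      # no engagement -> assumed not relevant

def dcg(ratings, k=None):
    """Discounted Cumulative Gain over a ranked list of graded ratings."""
    ratings = ratings[: k or len(ratings)]
    return sum(r / math.log2(i + 2) for i, r in enumerate(ratings))

# Ranked results for one query: SME grades vs engagement-derived grades.
human = [3, 2, 0, 1]
engaged = [engagement_rating(a) for a in (
    [("view", 2), ("download", 1)],
    [("view", 1), ("print", 1)],
    [("view", 1)],
    [],
)]
print(dcg(human), dcg(engaged))
```

Because both rating streams produce the same 0-3 grade scale, the same `dcg` function (and likewise Precision@k or MAP) can be applied to either, which is what makes the direct comparison possible.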

Over time, assumptions have been made that human-rated DCG (hDCG) and engagement-based DCG (eDCG) may be correlated via some additive offset or multiplier. Unfortunately, this has been shown not to hold consistently. Until now, there has been no known study comparing hDCG- and eDCG-based search precision results. Using the Search Test Framework (STF), developed at LexisNexis, the technical challenges involved in carrying out such a study have been addressed and resolved. A process for directly comparing eDCG and hDCG results has been developed and applied to actual data. This presentation will discuss the issues involved, the metrics employed, the data used, and an analysis of the study's results. In conclusion, we will review the advantages and disadvantages of both methods and make suggestions for creating overall metrics that consider both types of results.
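One way to test the offset-or-multiplier assumption is to fit a linear relationship between per-query hDCG and eDCG scores and inspect the quality of the fit. The per-query values below are hypothetical illustrative numbers, not data from the study:

```python
import statistics

# Per-query DCG scores (hypothetical values for illustration only).
hdcg = [4.7, 3.2, 5.1, 2.8, 4.0]
edcg = [3.8, 3.5, 2.9, 3.1, 3.6]

# Least-squares fit edcg ~ a * hdcg + b; if the additive-offset or
# multiplier assumption held, r^2 would be near 1.
mean_h, mean_e = statistics.fmean(hdcg), statistics.fmean(edcg)
cov = sum((h - mean_h) * (e - mean_e) for h, e in zip(hdcg, edcg))
var_h = sum((h - mean_h) ** 2 for h in hdcg)
a = cov / var_h
b = mean_e - a * mean_h
ss_res = sum((e - (a * h + b)) ** 2 for h, e in zip(hdcg, edcg))
ss_tot = sum((e - mean_e) ** 2 for e in edcg)
r2 = 1 - ss_res / ss_tot
print(f"slope={a:.3f} intercept={b:.3f} r^2={r2:.3f}")
```

A low r² on such a fit is the kind of evidence that would argue against a simple offset or multiplier relationship between the two metrics.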

Keywords: Search Relevance Precision, Discounted Cumulative Gain, Subject Matter Experts, Wisdom of the Crowd, Search Precision

Suggested Citation

Rosenoff, Doug and Kottapuzhackal, Sreenath and Chatelain, Edward and Bandepalli, Venkata, Human DCG vs Engagement DCG: Evaluating the Wisdom of the Crowd (September 21, 2021). Proceedings of the 5th Annual RELX Search Summit, Available at SSRN: https://ssrn.com/abstract=3965010

Doug Rosenoff (Contact Author)

LexisNexis

P. O. Box 933
Dayton, OH 45401
United States

Sreenath Kottapuzhackal

LexisNexis

P. O. Box 933
Dayton, OH 45401
United States

Edward Chatelain

LexisNexis

P. O. Box 933
Raleigh, NC 27606
United States

Venkata Bandepalli

LexisNexis

P. O. Box 933
Dayton, OH 45401
United States
