http://ssrn.com/abstract=1688190
 
 

Relevance-Based Retrieval on Hidden-Web Text Databases Without Ranking Support


Vagelis Hristidis


affiliation not provided to SSRN

Yuheng Hu


affiliation not provided to SSRN

Panagiotis G. Ipeirotis


New York University - Leonard N. Stern School of Business

September 2009

NYU Working Paper No. CEDER-09-05

Abstract:     
Many online or local data sources provide powerful querying mechanisms but limited ranking capabilities. For instance, PubMed allows users to submit highly expressive Boolean keyword queries, but ranks the query results by date only. However, a user would typically prefer a ranking by relevance, measured by an Information Retrieval (IR) ranking function. The naive approach would be to submit a disjunctive query with all query keywords, retrieve the returned documents, and then re-rank them. Unfortunately, such an operation would be very expensive due to the large number of results returned by disjunctive queries. In this paper we present algorithms that return the top results for a query, ranked according to an IR-style ranking function, while operating on top of a source with a Boolean query interface with no ranking capabilities (or a ranking capability of no interest to the end user). The algorithms generate a series of conjunctive queries that return only documents that are candidates for being highly ranked according to a relevance metric. Our approach can also be applied to other settings where the ranking is monotonic on a set of factors (query keywords in IR) and the source query interface is a Boolean expression of these factors. Our comprehensive experimental evaluation on the PubMed database and a TREC dataset shows that we achieve an order-of-magnitude improvement compared to the current baseline approaches.
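
The contrast between the naive disjunctive approach and a series of conjunctive queries can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a simplified illustration that assumes a binary, monotonic score (the sum of per-keyword weights for the keywords a document contains) and a source that accepts negated terms. The toy corpus `DOCS`, the class `BooleanSource`, the functions `naive_top_k` and `conjunctive_top_k`, and the keyword weights are all hypothetical names and values introduced here.

```python
from itertools import combinations

# Hypothetical toy corpus: document id -> set of terms it contains.
DOCS = {
    1: {"cancer", "gene", "therapy"},
    2: {"cancer", "gene"},
    3: {"gene"},
    4: {"therapy", "mouse"},
    5: {"cancer"},
    6: {"mouse"},
}


class BooleanSource:
    """Stand-in for a source that answers Boolean queries but cannot rank."""

    def __init__(self, docs):
        self.docs = docs
        self.retrieved = 0  # total documents returned across all queries

    def disjunctive(self, terms):
        """Return ids of documents containing at least one of `terms`."""
        hits = sorted(d for d, t in self.docs.items() if t & set(terms))
        self.retrieved += len(hits)
        return hits

    def conjunctive(self, must, must_not=()):
        """Return ids of documents with all of `must` and none of `must_not`."""
        must, must_not = set(must), set(must_not)
        hits = sorted(d for d, t in self.docs.items()
                      if must <= t and not (must_not & t))
        self.retrieved += len(hits)
        return hits


def score(doc_terms, weights):
    """Monotonic score: sum of the weights of the query keywords present."""
    return sum(w for term, w in weights.items() if term in doc_terms)


def naive_top_k(source, weights, k):
    """Fetch every document matching any keyword, then re-rank locally."""
    hits = source.disjunctive(weights)
    # Scoring reads the document contents, which the client has downloaded.
    ranked = sorted(hits, key=lambda d: (-score(source.docs[d], weights), d))
    return ranked[:k]


def conjunctive_top_k(source, weights, k):
    """Query keyword subsets in decreasing order of their total weight.

    Each query asks for documents containing exactly that subset of the
    query keywords, so every hit has a score equal to the subset's weight.
    Because subsets are visited from the highest weight down, the loop can
    stop as soon as k documents have been collected.
    """
    keywords = list(weights)
    subsets = [c for r in range(1, len(keywords) + 1)
               for c in combinations(keywords, r)]
    subsets.sort(key=lambda s: -sum(weights[t] for t in s))
    results = []
    for subset in subsets:
        excluded = [t for t in keywords if t not in subset]
        results.extend(source.conjunctive(subset, excluded))
        if len(results) >= k:
            break
    return results[:k]
```

With hypothetical weights such as `{"cancer": 3.0, "gene": 2.0, "therapy": 1.5}` and `k = 2`, both functions return the same two documents, but the naive version transfers every document that matches any keyword while the conjunctive version stops after the first two queries. The sketch enumerates all keyword subsets, which grows exponentially with the number of keywords, so it is only meant to show why monotonicity allows early termination.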

Number of Pages in PDF File: 14

working papers series


Date posted: October 6, 2010  

Suggested Citation

Hristidis, Vagelis and Hu, Yuheng and Ipeirotis, Panagiotis G., Relevance-Based Retrieval on Hidden-Web Text Databases Without Ranking Support (September 2009). Vagelis Hristidis was partly supported by NSF grant IIS-0811922 and DHS grant 2009-ST-062-000016. Pan, Vol. , pp. -, 2009. Available at SSRN: http://ssrn.com/abstract=1688190

Contact Information

Vagelis Hristidis (Contact Author)
affiliation not provided to SSRN
No Address Available
Yuheng Hu
affiliation not provided to SSRN
No Address Available
Panagiotis G. Ipeirotis
New York University - Leonard N. Stern School of Business
44 West Fourth Street
Ste 8-84
New York, NY 10012
United States
+1-212-998-0803 (Phone)
HOME PAGE: http://www.stern.nyu.edu/~panos


Paper statistics
Abstract Views: 286
Downloads: 37
References:  25
