A Missing Data Paradox for Nearest Neighbor Recommender Systems
Daniel M. Fleder, University of Pennsylvania - The Wharton School
Kartik Hosanagar, University of Pennsylvania - The Wharton School
October 1, 2007

Abstract: Recommender systems typically operate on sparse matrices. Most methods assume that the missing entries are missing at random (MAR), yet in practice entries are often not missing at random (NMAR). How problematic is this? We present a puzzle. Some methods explicitly account for NMAR processes, and this has been shown to improve predictions. Many methods, however, assume MAR. While that assumption may be wrong, we show that these methods may nonetheless benefit from its being violated. Given that some data must go missing, NMAR can often pick the "right" values to preserve (i.e., it preserves the more informative data). Thus, despite the perception that NMAR is bad, it can often improve recommendations. This may explain some of the historical success of collaborative filtering even when the MAR assumption has been violated.
Keywords: recommender systems, collaborative filtering, predictive modeling, missing data
Working Paper Series
Date posted: January 04, 2009; Last revised: January 04, 2009
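
To make the MAR/NMAR distinction in the abstract concrete, here is a minimal sketch in Python of two ways a ratings matrix can lose entries. It is an illustration under stated assumptions, not the authors' method: the function names, the uniform 1-to-5 ratings, and the rule that higher ratings are more likely to be kept are all hypothetical choices, and the abstract does not say which values count as "more informative".

```python
import numpy as np

def make_ratings(n_users, n_items, rng):
    # Hypothetical ground truth: integer ratings 1..5 drawn uniformly.
    return rng.integers(1, 6, size=(n_users, n_items)).astype(float)

def mar_mask(ratings, keep_frac, rng):
    # MAR: every entry is kept with the same probability,
    # regardless of its value.
    return rng.random(ratings.shape) < keep_frac

def nmar_mask(ratings, keep_frac, rng):
    # NMAR (assumed form): the keep probability grows with the rating.
    # Weights are normalized to mean 1 so that the expected kept
    # fraction still equals keep_frac when no clipping occurs.
    weights = ratings / ratings.mean()
    probs = np.clip(keep_frac * weights, 0.0, 1.0)
    return rng.random(ratings.shape) < probs

def observe(ratings, mask):
    # Entries that were not kept are marked as missing (NaN).
    return np.where(mask, ratings, np.nan)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    R = make_ratings(200, 300, rng)
    for name, mask_fn in [("MAR", mar_mask), ("NMAR", nmar_mask)]:
        observed = observe(R, mask_fn(R, 0.2, rng))
        kept = np.mean(~np.isnan(observed))
        print(name, "kept fraction:", round(float(kept), 3),
              "mean observed rating:", round(float(np.nanmean(observed)), 3))
```

Both masks keep roughly the same share of entries, but the NMAR mask changes which entries survive, so the observed ratings are no longer a representative sample of the full matrix. Whether that helps or hurts a nearest neighbor recommender is the question the paper addresses; this sketch only shows the two missingness mechanisms.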