The Statistical Properties of Random Bitstreams and the Sampling Distribution of Cosine Similarity


Graham L. Giller


Bloomberg LP; Giller Investments

October 25, 2012


Abstract:
We summarize the statistical properties of statistics computed from independent random bitstreams, including the commonly discussed support and cosine similarity. We derive the moments of the asymptotically normal approximation to the sampling distribution of the cosine similarity of independent random bitstreams and compare those computed moments with those measured by Monte-Carlo simulation. We find agreement for bitstreams of internet scale in length (i.e., of order 10,000 bits) and for much shorter ones (100 and 10 bits), and we demonstrate that the expected value of the cosine similarity of independent bitstreams may be very significantly distant from zero. To compensate for this bias, we propose a new statistic: Support Adjusted Cosine Similarity, or SACS.
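
The bias described in the abstract can be illustrated with a small Monte-Carlo experiment. The following is a minimal sketch, not the paper's own code: the function names, the Bernoulli bit model with probabilities p and q, and the reading of "support" as the number of set bits are all assumptions made for illustration. Under those assumptions, a first-order argument suggests the mean cosine similarity of two independent bitstreams sits near sqrt(p*q) rather than near zero.

```python
import math
import random


def random_bitstream(n, p, rng):
    """Return n independent bits, each equal to 1 with probability p (assumed model)."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def cosine_similarity(x, y):
    """Cosine similarity of two bit vectors; None if either vector is all zeros."""
    support_x = sum(x)  # "support" taken here to mean the count of set bits (assumption)
    support_y = sum(y)
    if support_x == 0 or support_y == 0:
        return None  # cosine similarity is undefined for a zero vector
    overlap = sum(a & b for a, b in zip(x, y))
    return overlap / math.sqrt(support_x * support_y)


def simulate_mean_cosine(n, p, q, trials, seed=0):
    """Average cosine similarity over repeated independent draws of two bitstreams."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        c = cosine_similarity(random_bitstream(n, p, rng),
                              random_bitstream(n, q, rng))
        if c is not None:  # skip draws where the statistic is undefined
            values.append(c)
    return sum(values) / len(values) if values else float("nan")


if __name__ == "__main__":
    # Compare the simulated mean with the first-order value sqrt(p * q).
    for n in (10, 100, 1000):
        mean = simulate_mean_cosine(n, 0.5, 0.5, trials=200)
        print(n, round(mean, 4), round(math.sqrt(0.5 * 0.5), 4))
```

Draws in which either bitstream is all zeros are skipped because the statistic is undefined there; for short bitstreams with small p this can discard a noticeable share of trials, so the reported mean is conditional on both supports being nonzero.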

Number of Pages in PDF File: 5

Keywords: collaborative filtering, cosine similarity, random bitstreams, sampling distribution, support, nested binomial distribution, Monte-Carlo simulation, delta method

working papers series


Date posted: October 26, 2012; Last revised: November 12, 2012

Suggested Citation

Giller, Graham L., The Statistical Properties of Random Bitstreams and the Sampling Distribution of Cosine Similarity (October 25, 2012). Available at SSRN: http://ssrn.com/abstract=2167044 or http://dx.doi.org/10.2139/ssrn.2167044

Contact Information

Graham L. Giller (Contact Author)
Bloomberg LP
731 Lexington Avenue
New York, NY 10022
United States
HOME PAGE: http://www.bloomberg.com
Giller Investments
121 Red Hill Road
Holmdel, NJ 07733
United States


Paper statistics
Abstract Views: 304
Downloads: 81
Download Rank: 184,829
References:  3
Footnotes:  4
