Abstract: This paper examines the effect of recommender systems on the diversity of sales. Two anecdotal views exist about such effects. Some believe recommenders help consumers discover new products and thus increase sales diversity. Others believe recommenders only reinforce the popularity of already popular products. This paper seeks to reconcile these seemingly incompatible views. We explore the question in two ways. First, modeling recommender systems analytically allows us to explore their path-dependent effects. Second, turning to simulation, we increase the realism of our results by combining choice models with actual implementations of recommender systems. We arrive at three main results. First, some well-known recommenders can lead to a reduction in sales diversity. Because common recommenders (e.g., collaborative filters) recommend products based on sales and ratings, they cannot recommend products with limited historical data, even if they would be rated favorably. In turn, these recommenders can create a rich-get-richer effect for popular products and vice versa for unpopular ones. This bias toward popularity can prevent what may otherwise be better consumer-product matches. That diversity can decrease is surprising to consumers who express that recommendations have helped them discover new products. In line with this, result two shows that it is possible for individual-level diversity to increase but aggregate diversity to decrease. Recommenders can push each person to new products, but they often push users toward the same products. Third, we show how basic design choices affect the outcome, and thus managers can choose recommender designs that are more consistent with their sales goals and consumers’ preferences.
recommender systems, collaborative filtering, sales diversity, Lorenz curve, Gini coefficient, long tail, electronic commerce, path dependence, simulation, concentration, diversity
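The second result above (individual-level diversity rising while aggregate diversity falls) can be illustrated with the Gini coefficient named in the keywords. The sketch below is not the paper's model or data: the function name `gini` and the two toy sales scenarios are assumptions made up purely for illustration.

```python
def gini(sales):
    """Gini coefficient of non-negative sales counts (0 = perfectly equal)."""
    xs = sorted(sales)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on ascending data:
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n, with i starting at 1
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical toy data: four users, five products (the last is a "hit").
# Before: each user buys 2 units of a personal niche product (1 distinct item each).
before = [2, 2, 2, 2, 0]
# After: each user buys 1 niche unit plus 1 unit of the same recommended hit
# (2 distinct items each, so individual-level diversity went up).
after = [1, 1, 1, 1, 4]

print(round(gini(before), 3), round(gini(after), 3))
```

In this toy case every user tries something new, yet unit sales become more concentrated on one product, so the aggregate Gini coefficient rises even though total volume is unchanged.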
Abstract: Content Delivery Networks (CDNs) are a vital component of the Internet's content delivery value chain, servicing nearly a third of the Internet's most popular content sites. However, in spite of their strategic importance, little is known about the optimal pricing policies or adoption drivers of CDNs. We address these questions using analytic models of CDN pricing and adoption under Markovian traffic and extend the results to bursty traffic using numerical simulations. When traffic is Markovian, we find that CDNs should provide volume discounts to content providers. In addition, the optimal pricing policy entails lower emphasis on value-based pricing and greater emphasis on cost-based pricing as the relative density of content providers with high outsourcing costs increases. However, when traffic is bursty and content providers have varying levels of traffic burstiness, as expected in reality, volume discounts may be suboptimal and may even be replaced by volume taxes. Finally, a pricing policy that accounts for both the mean and variance in traffic such as percentile-based pricing is more profitable than pure volume-based pricing when there is heterogeneity in burstiness across content providers. This finding is in contrast to the current practices of many CDN firms that use pure volume-based pricing.
Content delivery networks, internet, pricing, bursty traffic, web hosting, media delivery
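To make the contrast between volume-based and percentile-based pricing concrete, the following minimal sketch compares the two billing bases for two hypothetical content providers with identical total traffic but different burstiness. The traffic samples, the 95th-percentile choice, and the helper name `percentile` are illustrative assumptions, not values or methods taken from the paper.

```python
def percentile(samples, p):
    """Nearest-rank p-th percentile for integer p in 1..100 (non-empty input)."""
    xs = sorted(samples)
    # Integer ceiling of p * n / 100 avoids floating-point rounding surprises.
    rank = (p * len(xs) + 99) // 100
    return xs[rank - 1]


# Hypothetical traffic samples (e.g. Mbps per measurement interval).
smooth = [10] * 20               # steady provider
bursty = [5] * 18 + [55] * 2     # same total volume, two large bursts

for name, traffic in (("smooth", smooth), ("bursty", bursty)):
    volume = sum(traffic)              # basis for pure volume-based pricing
    p95 = percentile(traffic, 95)      # basis for percentile-based pricing
    print(name, volume, p95)
```

Both providers have the same billable volume, so pure volume-based pricing cannot tell them apart, while the percentile measure reflects the capacity the bursty provider actually demands.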
Abstract: In Peer-to-Peer (P2P) media distribution, users obtain content from other users who already have it. This form of decentralized product distribution demonstrates several unique features. Only a small fraction of users in the network are queried when a potential adopter seeks a file, and many of these users may even free-ride, i.e., not distribute the content to others. As a result, generated demand may not always be fulfilled immediately. We present mixing models for product diffusion in P2P networks that capture decentralized product distribution by current adopters, incomplete demand fulfillment and other unique aspects of P2P product diffusion. The models serve to demonstrate the important role that the P2P search process and distribution referrals - payments made to users that distribute files - play in efficient P2P media distribution. We demonstrate the ability of our diffusion models to derive normative insights for P2P media distributors by studying the effectiveness of distribution referrals in speeding product diffusion and determining optimal referral policies for fully decentralized and hierarchical P2P networks.
Peer to peer, P2P, product diffusion, P2P diffusion, supply-constrained diffusion
Abstract: Sponsored search accounts for 40% of the total online advertising market. These ads appear as ordered lists along with the regular search results in search engine results pages. The conventional wisdom in the industry is that the top position is the most desirable position for advertisers. This has led to intense competition among advertisers to secure the top positions in the results pages. We evaluate the impact of ad placement on revenues and profits generated from sponsored search using data for several hundred keywords from the ad campaign of an online retailer. Using a hierarchical Bayesian model, we measure the impact of ad placement on both click-through rate and conversion rate for these keywords. We find that while click-through rate decreases with position, conversion rate first increases and then decreases with position. The net effect is that, contrary to conventional wisdom, the topmost position in sponsored search advertisements is not necessarily the revenue- or profit-maximizing position. Using a theoretical model we show that one potential driver of these results is the heterogeneity in search costs across consumers and the additional browsing cost incurred in evaluating products across multiple websites. Our results inform the advertising strategies of firms participating in sponsored search auctions and provide insight into consumer behavior in these environments. Specifically, they help correct a significant misunderstanding among advertisers regarding the value of the top position. Further, they reveal potential inefficiencies in present auction mechanisms used by the search engines.
Sponsored search, ad placement, hierarchical Bayesian estimation, online advertising, online auctions, search engine marketing
Abstract: Peer-to-Peer (P2P) networks, a decentralized content distribution format in which users distribute media to each other, are fast gaining popularity for delivery of digital media such as music and videos. Product diffusion in P2P is unique because free riders (users who download content from others in the network without redistributing it to others) can create a supply constraint that results in the incomplete fulfillment of generated demand. P2P firms offer distribution referrals, i.e., payments to users who distribute content to others, to provide users with incentives to distribute content. In this paper, we study a P2P firm's optimal referral strategy. Starting with a simple model for media diffusion in P2P networks, we apply optimal control theory to determine a dynamic referral strategy. The diffusion model uniquely captures the role of the referral in addressing the supply constraint in P2P diffusion. We find that the referral strategy is governed by two main effects. Early in the diffusion, the referral strategy is dominated by a scarcity effect, namely that there exist very few users distributing the file in the network. Because the availability of users willing to distribute the file increases with time, the referral is non-increasing with time during this phase. If the product is sufficiently diffused in the network, referral policy is dominated by a saturation effect later in the diffusion. In this stage, the referral is non-decreasing with time in order to encourage sales, which usually slow down late in the diffusion. In networks with significant free-riding, the optimal trajectory involves a very high referral at the beginning, followed by a decreasing trajectory. If the product is sufficiently diffused, the referral may start to increase in the final few periods due to the saturation effect mentioned above. Finally, we observe that firm profits under this dynamic strategy can be considerably higher than under a myopic referral policy.
Our research represents a first step towards understanding marketing and operational issues in this emerging distribution format for entertainment goods and other digital media.
Peer to peer, P2P, product diffusion, dynamic referral, Internet marketing, networks and marketing
Abstract: Network caches are the storage centers in the supply chain for content delivery - the digital equivalent of warehouses. Operated by access networks and other operators, they provide benefits to content publishers in the forms of bandwidth cost reduction, response time improvement, and handling of flash crowds. Yet, caching has not been fully embraced by publishers, since its use can interfere with site personalization strategies and/or collection of visitor information for business intelligence purposes. While recent work has focused on technological solutions to these issues, this paper provides the first study of the managerial issues related to the design and provisioning of incentive-compatible caching services. Starting with a single class of caching service, we find conditions under which the profit-maximizing cache operator should offer the service for free. This occurs when the access networks' bandwidth costs are high and a large fraction of content publishers value personalization and business intelligence. Yet, some publishers will still opt out of the service, i.e., cache-bust, as observed in practice. We next derive the conditions under which the profit-maximizing cache operator should provision two vertically differentiated service classes, namely premium and best-effort. Interestingly, caching service differentiation is different from traditional vertical differentiation models, in that the premium and best-effort market segments do not abut. Thus, optimal prices for the two service classes can be set independently, and cannibalization does not occur. It is possible for the cache operator to continue to offer the best-effort service for free while charging for the premium service. Furthermore, consumers are better off because more content is cached and delivered faster to them.
Finally, we find that declining bandwidth costs will put negative pressure on cache operator profits, unless consumer adoption of broadband connectivity and the availability of multimedia content provide the necessary increase in traffic volume for the caches.
Web caching, content delivery, pricing, quality of service
Abstract: Information specialists in enterprises and consumers on the Internet regularly use Distributed Information Retrieval (DIR) systems that query a large number of Information Retrieval (IR) systems, merge the retrieved results and display them to users. There can be considerable heterogeneity in the quality of results returned by different IR servers. Further, since different servers handle collections of different sizes and have different processing and bandwidth capacities, there can be considerable heterogeneity in their response times. The broker in the distributed IR system thus has to decide which servers to query, how long to wait for responses and which retrieved results to display based on the benefits and costs imposed on users. The benefit of querying more servers and waiting longer is the ability to retrieve more documents. The costs may be in the form of access fees charged by IR servers or users' costs associated with waiting for the servers to respond. We formulate the broker's decision problem as a stochastic mixed integer program. We present closed-form results for the optimal query set and wait time in the special case when the relevance scores and response times of the IR servers are independent and identically distributed. When servers are heterogeneous, we present a simulation-based optimization technique and demonstrate how the optimal query set and wait time may be determined. The technique is computationally efficient and can be used to generate decision rules for source selection and query termination that are relatively easy to implement. We use data gathered from two different contexts - a DIR system that queries IR engines of several US federal agencies and a comparison shopping engine that queries multiple stores for price and product information - to validate our technique.
Our research demonstrates that user satisfaction can be considerably improved by modeling user utility and incorporating historical information on performance of the IR servers.
Distributed IR, metasearch, Patent search, Optimal operational decisions, Utility theory, Source selection, Query termination
Abstract: Cooperative caching is a popular mechanism to allow an array of distributed caches to cooperate and serve each other's web requests. Monitoring and controlling duplication of documents across cooperating caches is a challenging operational problem faced by cache managers. In this paper, we analyze optimal duplication in a game-theoretic setting with two cooperating caches. We have three primary findings. First, our results suggest that intermediate levels of duplication - greater than that of CARP, a protocol that allows no duplication, but less than that of ICP, a protocol that does not monitor duplication - are desirable. Second, the game is a game of strategic substitutes wherein an increase in duplication by one cache results in a decrease in duplication by the other. Thus, a cache that can credibly signal that it is incapable of monitoring duplication levels can get higher contribution from other caches, i.e., get other caches to eliminate more duplicate documents. Finally, decentralized decision-making by selfish caches can be quite inefficient (i.e., result in higher average latency) relative to the socially optimal solution. At the same time, the socially optimal solution can be highly asymmetric even when caches are symmetric and thus may not be acceptable to the cache that has to contribute the most resources. These factors should be accounted for when contracts for cooperative caching are structured by independent ISPs.
Content Delivery, Caching, game theory
Abstract: There has been significant recent interest in studying consumer behavior in sponsored search advertising (SSA). Researchers have typically used aggregate data provided to advertisers by search engines containing measures such as advertiser’s bid, average ad position, total impressions, clicks and cost on a daily basis for keywords in the advertiser’s campaign. A variety of random utility models have been proposed and estimated using such aggregate data. Researchers have used these models to understand the factors that drive consumer click and conversion propensities. We show that estimating these random utility models on aggregated data will lead to systematically biased estimates. Specifically, the impact of ad position on click-through rate (CTR) is moderated and the predicted CTR is higher than the actual CTR. We demonstrate the existence and magnitude of this bias, then describe a new model of consumer click behavior to overcome it. We show that the parameters of this model can be accurately estimated from aggregate data.
e-commerce, sponsored search, online advertising, aggregation bias, econometrics, probabilistic modeling, consumer behavior
Abstract: Recommender systems are becoming integral to how consumers discover media. The value that recommenders offer is personalization: in environments with many product choices, recommenders personalize the browsing and consumption experience to each user’s taste. Popular applications include product recommendations at e-commerce sites and online newspapers’ selecting articles to display based on the current reader’s interests. This ability to focus more closely on one's taste and filter all else out has spawned criticism that recommenders will fragment consumers. Critics say recommenders cause consumers to have less in common with one another and that the media should do more to increase exposure to a variety of content. Others, however, contend that recommenders do the opposite: they may homogenize users because they share information among those who would otherwise not communicate. These opposing views have been discussed in the literature for over ten years, yet there is not yet empirical evidence for either. We present an empirical study of recommender systems in the music industry. In contrast to concerns that users are becoming more fragmented, we find that in our setting users become more similar to one another in their purchases. This increase in similarity occurs for two reasons, which we term volume and taste effects. The volume effect is that consumers simply purchase more after recommendations, increasing the chance of having more purchases in common. The taste effect is that, conditional on volume, consumers buy a more similar mix of products after recommendations. When we view consumers as a similarity network before versus after recommendations, we find that the network becomes denser and smaller, or characterized by shorter inter-user distances. These findings suggest that for this setting, recommender systems are associated with an increase in commonality among users and that concerns of fragmentation may be misplaced.
recommender systems, collaborative filtering, fragmentation, personalization, long tail
Abstract: Recommender systems typically work over sparse matrices. Although most methods assume so, these matrices' entries are often not missing at random (NMAR). How problematic is this? We present a puzzle. Some methods explicitly account for NMAR processes. This has been shown to improve predictions. Many methods, however, assume that entries are missing at random (MAR). While they may be wrong in that assumption, we show they may benefit nonetheless from its being violated. Given that some data must go missing, NMAR can often pick the "right" values to preserve (i.e. it preserves the more informative data). Thus despite the perception that NMAR is bad, it can often improve recommendations. This may explain some of the historical success of collaborative filtering even when this assumption has been violated.
recommender systems, collaborative filtering, predictive modeling, missing data
© 2009 Social Science Electronic Publishing, Inc. All Rights Reserved.