A Statistical Model for Social Network Labeling
22 Pages Posted: 7 Sep 2015
Date Written: July 26, 2015
We consider a social network from which one observes not only network structure (i.e., nodes and edges) but also a set of labels (or tags, keywords) for each node (or user). These labels are self-created and closely related to the user's career status, life style, personal interests, and many others. Thus, they are of great interest for online marketing. To model their joint behavior with network structure, a complete data model is developed. The model is based on the classical p1 model but allows the reciprocation parameter to be label-dependent. By focusing on connected pairs only, the complete data model can be generalized into a conditional model. Compared with the complete data model, the conditional model specifies only the conditional likelihood for the connected pairs. As a result, it suffers less risk from model mis-specification. Furthermore, because the conditional model involves connected pairs only, the computational cost is much lower. The resulting estimator is consistent and asymptotically normal. Depending on the network sparsity level, the convergence rate could be different. To demonstrate its finite sample performance, numerical studies (based on both simulated and real datasets) are presented.
Keywords: Conditional Maximum Likelihood; Maximum Likelihood; p1 Model; Sina Weibo; Large-Scale Network
JEL Classification: C30
Suggested Citation: Suggested Citation