Varying Naive Bayes Models with Applications to Classification of Chinese Text Documents

32 Pages. Posted: 9 Feb 2015

Guoyu Guan

Northeast Normal University

Jianhua Guo

Northeast Normal University

Hansheng Wang

Peking University - Guanghua School of Management

Date Written: February 9, 2015

Abstract

Document classification is an important area for which many classification methods have been developed. However, most of these methods cannot generate time-dependent classification rules and are therefore not well suited to problems with time-varying structures. To address this problem, we propose a varying naive Bayes model, a natural extension of the naive Bayes model that allows for time-dependent classification rules. A kernel smoothing method is developed for parameter estimation, and a BIC-type criterion is proposed for feature selection. Asymptotic theory is developed and numerical studies are conducted. Finally, the proposed method is demonstrated on a real dataset generated by the Mayor Public Hotline of Changchun, the capital city of Jilin Province in Northeast China.
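
The abstract does not spell out the estimator, so the following is only an illustrative sketch of how a kernel-smoothed, time-dependent naive Bayes rule could look, not the authors' method. It assumes binary word indicators as features, a Gaussian kernel, and Nadaraya-Watson style weights; the function names, the `bandwidth` parameter, and the smoothing constant `eps` are all hypothetical. The BIC-type feature selection step is omitted.

```python
import math


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel (an assumed choice, not from the paper)."""
    return math.exp(-0.5 * u * u)


def varying_nb_predict(X, y, t, x_new, t_new, bandwidth=0.2, eps=1e-3):
    """Classify x_new observed at time t_new with a kernel-weighted naive Bayes rule.

    X: list of binary feature vectors, y: class labels, t: observation times.
    Training samples close to t_new in time receive larger weights, so the
    estimated class priors and feature probabilities depend on t_new.
    """
    classes = sorted(set(y))
    weights = [gaussian_kernel((ti - t_new) / bandwidth) for ti in t]
    total_mass = sum(weights)
    scores = {}
    for k in classes:
        w_k = [w for w, yi in zip(weights, y) if yi == k]
        X_k = [xi for xi, yi in zip(X, y) if yi == k]
        mass_k = sum(w_k)
        # Kernel-smoothed class prior at time t_new (eps avoids log(0)).
        prior = (mass_k + eps) / (total_mass + eps * len(classes))
        log_score = math.log(prior)
        for j, xj in enumerate(x_new):
            # Kernel-smoothed estimate of P(X_j = 1 | Y = k) at time t_new.
            hits = sum(w * xi[j] for w, xi in zip(w_k, X_k))
            theta = (hits + eps) / (mass_k + 2 * eps)
            log_score += math.log(theta if xj == 1 else 1.0 - theta)
        scores[k] = log_score
    return max(scores, key=scores.get)


# Toy data: the single word signals class "a" early on and class "b" later.
X = [[1], [0], [1], [0], [0], [1], [0], [1]]
y = ["a", "b", "a", "b", "a", "b", "a", "b"]
t = [0.0, 0.0, 0.1, 0.1, 0.9, 0.9, 1.0, 1.0]

early = varying_nb_predict(X, y, t, x_new=[1], t_new=0.05)
late = varying_nb_predict(X, y, t, x_new=[1], t_new=0.95)
print(early, late)
```

In this toy setting the same document is assigned to different classes depending on when it arrives, which is the kind of time-dependent rule a pooled, time-invariant naive Bayes fit cannot express. The bandwidth controls how quickly the rule is allowed to change over time.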

Keywords: BIC; Chinese Document Classification; Screening Consistency; Time-dependent Classification Rule; Varying Naive Bayes

JEL Classification: C35

Suggested Citation

Guan, Guoyu and Guo, Jianhua and Wang, Hansheng, Varying Naive Bayes Models with Applications to Classification of Chinese Text Documents (February 9, 2015). Available at SSRN: https://ssrn.com/abstract=2562219 or http://dx.doi.org/10.2139/ssrn.2562219

Guoyu Guan

Northeast Normal University

Changchun
China

Jianhua Guo

Northeast Normal University

Changchun
China

Hansheng Wang (Contact Author)

Peking University - Guanghua School of Management

Peking University
Beijing, Beijing 100871
China

HOME PAGE: http://hansheng.gsm.pku.edu.cn
