ISBN: 978-981-09-5471-0 DOI: 10.18178/wcse.2015.04.102
A Cost-Sensitive Ensemble Model for Click-Through Rate Prediction
Abstract— Click-through rate (CTR) prediction is crucial to sponsored search because it can be used to
influence the ranking, filtering, and pricing of ads. Estimating CTR precisely therefore makes a
significant difference to the efficiency of advertising on the Internet. CTR prediction can be cast
as a binary classification problem (a user click is the positive class and a non-click is the negative
class) on imbalanced data, because the positive class is represented by very few samples but carries
higher identification importance. In this paper, we describe a new cost-sensitive ensemble model for
CTR prediction. In this model, we use cost items to denote the uneven identification importance among
classes, so that the ensemble strategies can intentionally bias learning towards the classes with
higher identification importance and thereby improve identification performance. For feature
selection, we extracted two sets of predictive features: basic features and synthetic features.
Finally, we conducted experiments on the KDD Cup 2012 Track 2 dataset to test the effectiveness of our
model. The experimental results demonstrate that the cost-sensitive ensemble method significantly
improves the effectiveness of CTR prediction.
Index Terms— Click-Through Rate Prediction; Imbalanced Data; Cost-Sensitive Ensemble Algorithm;
Feature Selection
Hongjian Liu, Defeng Guo
GiantStones Information Technology Co., Ltd., China
Cite: Hongjian Liu, Defeng Guo, "A Cost-Sensitive Ensemble Model for Click-Through Rate Prediction," 2015 The 5th International Workshop on Computer Science and Engineering-Information Processing and Control Engineering (WCSE 2015-IPCE), pp.621-628, Moscow, Russia, April 15-17, 2015.
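
The cost-sensitive idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact update rule, so everything below is an assumption made for illustration: a simplified cost-weighted variant of AdaBoost over decision stumps, with hypothetical names (`train_stump`, `fit_cost_sensitive_boost`, `cost_pos`, `cost_neg`) and labels encoded as +1 (click) and -1 (no click). It is not the authors' model.

```python
import math

def train_stump(X, y, w):
    """Return (error, feature, threshold, polarity) of the stump
    with the lowest weighted error on the training set."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                pred = [pol if row[j] >= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def stump_predict(stump, row):
    _, j, thr, pol = stump
    return pol if row[j] >= thr else -pol

def fit_cost_sensitive_boost(X, y, cost_pos=5.0, cost_neg=1.0, rounds=10):
    """Boosting in which each sample carries a cost item (assumed scheme).

    Costs set the initial weights and rescale the weights in every
    round, so the rare positive class keeps a larger share of the
    weighted error than its sample count alone would give it.
    """
    cost = [cost_pos if yi == 1 else cost_neg for yi in y]
    total = sum(cost)
    w = [c / total for c in cost]
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the log below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Standard exponential update, additionally scaled by the cost item.
        w = [wi * ci * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, ci, yi, row in zip(w, cost, y, X)]
        norm = sum(w)
        w = [wi / norm for wi in w]
    return ensemble

def predict(ensemble, row):
    """Sign of the alpha-weighted vote: +1 (click) or -1 (no click)."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy imbalanced data: one click among six impressions.
X = [[0], [0], [0], [1], [1], [1]]
y = [-1, -1, -1, 1, -1, -1]
plain = fit_cost_sensitive_boost(X, y, cost_pos=1.0, cost_neg=1.0, rounds=1)
costly = fit_cost_sensitive_boost(X, y, cost_pos=5.0, cost_neg=1.0, rounds=1)
```

On this toy data, the equal-cost model predicts "no click" everywhere, while the model with a higher positive cost flags the region containing the single click. That is the bias towards the more important class that the abstract describes, at the price of more false positives.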