ISBN: 978-981-11-0008-6 DOI: 10.18178/wcse.2016.06.098
Automatic Summarization from Indonesian Hashtag on Twitter Using TF-IDF and Phrase Reinforcement Algorithm
Abstract— The objective of this research is to produce a summary of what is currently happening from an
Indonesian hashtag on Twitter. A combination of TF-IDF (term frequency-inverse document frequency) and the
Phrase Reinforcement Algorithm is used to perform the automatic summarization. The final summary consists of
2 sentences, which contain all essential information given by the Twitter data. At the end of this paper, we
describe the evaluation results, which are analyzed using Precision and ROUGE. Based on the results, we
conclude that TF-IDF and the Phrase Reinforcement Algorithm can successfully generate a summary, and that the
approach works well enough on hashtags that do not have many variants of a word. In general, the quality of
the summaries is quite low because the data still contains too much noise. The precision is 0.327 and
ROUGE-1 is 0.3087.
Index Terms— summary, hashtag, Twitter, phrase reinforcement algorithm, automatic summarization, TF-IDF.
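For readers unfamiliar with the two quantities named in the abstract, the following is a minimal Python sketch of a common TF-IDF weighting and of ROUGE-1 computed as unigram recall. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the exact TF-IDF variant, the tokenization, or whether ROUGE-1 is reported as recall or F-score. The function names `tf_idf` and `rouge_1_recall` and the sample tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: weight = (count / document length) * ln(N / document frequency).
    Other TF-IDF variants exist; the abstract does not say which one is used.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def rouge_1_recall(candidate, reference):
    """Clipped unigram overlap divided by the number of reference unigrams."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum(min(cand[term], ref[term]) for term in ref)
    total = sum(ref.values())
    return overlap / total if total else 0.0


# Hypothetical tokenized tweets sharing one hashtag.
tweets = [["banjir", "jakarta"], ["banjir", "bandung"]]
weights = tf_idf(tweets)
score = rouge_1_recall(["a", "b", "c"], ["a", "b", "d", "e"])
```

A term that appears in every tweet (here `banjir`) receives weight zero under this variant, while terms that distinguish a tweet receive positive weight. ROUGE-1 recall counts how many reference unigrams the generated summary recovers.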
Willyh Hariardi, Novita Latief, David Febryanto, Derwin Suhartono
Bina Nusantara University, School of Computer Science, INDONESIA
Cite: Willyh Hariardi, Novita Latief, David Febryanto, Derwin Suhartono, "Automatic Summarization from Indonesian Hashtag on Twitter Using TF-IDF and Phrase Reinforcement Algorithm," Proceedings of 2016 6th International Workshop on Computer Science and Engineering, pp. 575-579, Tokyo, 17-19 June, 2016.