ISBN: 978-981-11-0008-6 DOI: 10.18178/wcse.2016.06.102
ODD Visualizer: Scalable Open Data De-identification Visualizer
Abstract— Because of the significant value it can yield, large-scale open data analysis (or big data analysis)
has attracted considerable attention from researchers and experts in many domains. However, progress in
releasing data for open use has remained slow over the past decade. Only about 10% of the datasets held by
governments worldwide have been released, mainly because of concerns about “privacy
preserving”. According to previous real-world case studies, even after personally identifiable information
has been de-identified, sensitive personal information can still be uncovered by joining heterogeneous or
cross-domain data. Such re-identification risks are usually too complicated or obscure
for data owners to recognize, and the problem becomes more severe as the scale of the data
grows. To the best of our knowledge, no existing research leverages data visualization to
provide a direct and clear way of detecting information re-identification problems. In this project, we
propose a method for scalable open data de-identification visualization consisting of: 1) a platform for scalable
storage and computation for measuring de-identification and 2) a novel data visualization technique depicting
the distribution of de-identification robustness in a global view. Our work is shown to provide not only
efficient estimation and visualization of data de-identification but also a useful guideline that helps
users determine which parts of the data should or should not be released.
Index Terms— data de-identification, data visualization, privacy preserving, personally identifiable
information, sensitive personal information.
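The cross-domain join risk described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records, the field names, the choice of quasi-identifiers, and the group-size measure (a k-anonymity-style count) are hypothetical and are not taken from the paper, which does not specify its robustness metric here.

```python
from collections import Counter

# Hypothetical "de-identified" release: names removed, quasi-identifiers kept.
released = [
    {"zip": "10001", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "10001", "birth_year": 1980, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "10002", "birth_year": 1975, "sex": "M", "diagnosis": "hypertension"},
]

# Hypothetical public dataset from another domain that still carries names.
public = [
    {"name": "Alice", "zip": "10001", "birth_year": 1980, "sex": "F"},
    {"name": "Bob", "zip": "10002", "birth_year": 1975, "sex": "M"},
]

# Assumed quasi-identifiers shared by both datasets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def key(record):
    """Return the quasi-identifier tuple used as the join key."""
    return tuple(record[q] for q in QUASI_IDENTIFIERS)


def group_sizes(records):
    """Count how many records share each quasi-identifier combination."""
    return Counter(key(r) for r in records)


def reidentified(released, public):
    """Join the two datasets and return (name, sensitive value) pairs
    for public records that match exactly one released record."""
    sizes = group_sizes(released)
    by_key = {key(r): r for r in released}
    return [
        (p["name"], by_key[key(p)]["diagnosis"])
        for p in public
        if sizes.get(key(p)) == 1
    ]


if __name__ == "__main__":
    print("smallest group size:", min(group_sizes(released).values()))
    print("re-identified:", reidentified(released, public))
```

In this toy data, the record whose quasi-identifier combination is unique in the release can be linked to a name, while the two records that share a combination cannot be told apart by the join alone. The smallest group size is one simple indicator of how robust a release is against this kind of linkage.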
Chiun-How Kao, Chih-Hung Hsieh, Yu-Feng Chu, Yu-Ting Kuang
Institute for Information Industry, TAIWAN
Chien-Lung Hsu
Department of Information Management, Chang-Gung University, TAIWAN
Cite: Chiun-How Kao, Chih-Hung Hsieh, Chien-Lung Hsu, Yu-Feng Chu, Yu-Ting Kuang, "ODD Visualizer: Scalable Open Data De-identification Visualizer," Proceedings of 2016 6th International Workshop on Computer Science and Engineering, pp. 594-598, Tokyo, 17-19 June, 2016.