Privacy-aware Synthesizing for Crowdsourced Data

Mengdi Huai, Di Wang, Chenglin Miao, Jinhui Xu, Aidong Zhang

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 2542-2548. https://doi.org/10.24963/ijcai.2019/353

Although releasing crowdsourced data lets data analyzers conduct statistical analysis, it may violate crowd users' data privacy. A potential way to address this problem is to employ traditional differential privacy (DP) mechanisms and perturb the data with noise before release. However, because crowdsourced data usually contain conflicts and are usually large in volume, directly applying these mechanisms cannot guarantee good utility when releasing crowdsourced data. To address this challenge, we propose a novel privacy-aware synthesizing method (PrisCrowd) for crowdsourced data, with which the data collector can release users' data with strong protection for their private information while the data analyzer can still achieve good utility from the released data. Both theoretical analysis and extensive experiments on real-world datasets demonstrate the desired performance of the proposed method.
Keywords:
Machine Learning: Data Mining
Multidisciplinary Topics and Applications: Security and Privacy
Humans and AI: Human Computation and Crowdsourcing
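
The abstract refers to the traditional DP baseline of perturbing the data with noise before release. A minimal sketch of that baseline, using the standard Laplace mechanism, might look as follows. This is not PrisCrowd, whose details the abstract does not give. The function names, the clipping bounds `lower` and `upper`, and the choice to perturb each answer independently are assumptions made for illustration only.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_answers(answers, lower, upper, epsilon, seed=None):
    """Clip each answer to [lower, upper], then add Laplace noise.

    Assumption: each answer is perturbed independently, so the
    sensitivity of a single released value is (upper - lower) and
    the noise scale is sensitivity / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    rng = random.Random(seed)
    scale = (upper - lower) / epsilon
    return [
        min(max(a, lower), upper) + laplace_noise(scale, rng)
        for a in answers
    ]


# Hypothetical crowd answers on a 0-5 scale; 10.0 is clipped to 5.0 first.
noisy = perturb_answers([3.0, 4.5, 4.0, 10.0], lower=0.0, upper=5.0, epsilon=1.0, seed=0)
```

With a small `epsilon` the noise scale grows quickly relative to the answer range, which illustrates the kind of utility loss the abstract attributes to applying such mechanisms directly.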