Document Type : Review Paper
2 Associate Professor of Electrical and Computer Engineering Department, Faculty of Engineering, Kharazmi University
3 Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
As internet-based technologies such as social networks, mobile phones, and other electronic devices increase access to large amounts of data, many companies must consider how to access large, varied, and fast-moving data while maintaining data confidentiality. Confidentiality and protection against the disclosure of specific data are therefore among the most challenging topics. This paper reviews data anonymization methods, anonymization operators, and the attacks that can compromise anonymity and lead to the disclosure of sensitive data in big data. It also discusses different aspects of big data, such as data sources, content format, data preparation, data processing, and common data repositories. Privacy attacks and countermeasures such as k-anonymity, t-closeness, and l-diversity are examined, and two main challenges in applying k-anonymity to big data are identified. First, confidential attributes can also act as quasi-identifiers, which increases the number of quasi-identifier attributes and may cause substantial information loss in achieving k-anonymity. Second, the unbounded number of data controllers in big data may lead to the disclosure of sensitive data when k-anonymous releases are published independently. Finally, different anonymization algorithms are presented and compared in terms of their time complexity and space consumption.
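To make the first challenge concrete, the following minimal Python sketch measures the k-anonymity level of a small table and shows how that level drops when a confidential attribute is also treated as a quasi-identifier. The function name `k_of`, the attribute names, and the sample records are hypothetical and chosen only for illustration. They do not come from any algorithm compared in this paper.

```python
from collections import Counter

def k_of(records, quasi_identifiers):
    """Return the largest k for which `records` is k-anonymous
    with respect to the given quasi-identifier attributes."""
    # Group records by their combination of quasi-identifier values.
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers)
        for record in records
    )
    # The smallest equivalence class determines k.
    return min(groups.values()) if groups else 0

# Hypothetical, already generalized records (illustrative data only).
records = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "cancer"},
]

k_basic = k_of(records, ["age", "zip"])
k_extended = k_of(records, ["age", "zip", "disease"])
print(k_basic, k_extended)
```

In this toy table, adding `disease` to the quasi-identifier set splits one equivalence class and lowers the achieved k, so further generalization or suppression, and hence more information loss, would be needed to restore the original level.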