Results 1 - 10
of
48
ℓ-diversity: Privacy beyond k-anonymity
- In ICDE
, 2006
"... Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k − 1 other records with resp ..."
Abstract
-
Cited by 294 (8 self)
- Add to MetaCart
Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k − 1 other records with respect to certain “identifying ” attributes. In this paper we show using two simple attacks that a k-anonymized dataset has some subtle, but severe privacy problems. First, an attacker can discover the values of sensitive attributes when there is little diversity in those sensitive attributes. This kind of attack is a known problem [60]. Second, attackers often have background knowledge, and we show that k-anonymity does not guarantee privacy against attackers using background knowledge. We give a detailed analysis of these two attacks and we propose a novel and powerful privacy criterion called ℓ-diversity that can defend against such attacks. In addition to building a formal foundation for ℓ-diversity, we show in an experimental evaluation that ℓ-diversity is practical and can be implemented efficiently. 1.
Worst-case background knowledge in privacy
- In ICDE
, 2007
"... Recent work has shown the necessity of considering an attacker’s background knowledge when reasoning about privacy in data publishing. However, in practice, the data publisher does not know what background knowledge the attacker possesses. Thus, it is important to consider the worst-case. In this pa ..."
Abstract
-
Cited by 56 (1 self)
- Add to MetaCart
Recent work has shown the necessity of considering an attacker’s background knowledge when reasoning about privacy in data publishing. However, in practice, the data publisher does not know what background knowledge the attacker possesses. Thus, it is important to consider the worst-case. In this paper, we initiate a formal study of worst-case background knowledge. We propose a language that can express any background knowledge about the data. We provide a polynomial time algorithm to measure the amount of disclosure of sensitive information in the worst case, given that the attacker has at most k pieces of information in this language. We also provide a method to efficiently sanitize the data so that the amount of disclosure in the worst case is less than a specified threshold. 1.
Mobiscopes for human spaces
- IEEE Pervasive Computing
, 2007
"... The proliferation of affordable mobile devices with processing and sensing capabilities, together with the rapid growth in ubiquitous network connectivity, herald an era of Mobiscopes; networked sensing applications that rely on multiple mobile sensors to accomplish global tasks. These distributed s ..."
Abstract
-
Cited by 40 (5 self)
- Add to MetaCart
The proliferation of affordable mobile devices with processing and sensing capabilities, together with the rapid growth in ubiquitous network connectivity, herald an era of Mobiscopes; networked sensing applications that rely on multiple mobile sensors to accomplish global tasks. These distributed sensing systems extend the model of traditional sensor networks, introducing challenges in data management, data integrity, privacy, and network system design. While several applications that fit the above description exist in prior literature, they provide tailored one-time solutions to what essentially is the same set of problems. It is time to work towards a general architecture that identifies common challenges and provides a generalizable methodology for the design of future Mobiscopes. Towards that end, this paper surveys a variety of current and emerging mobile, networked, sensing applications; articulates their common challenges; and provides architectural guidelines and design directions for this important
Privacy-Preserving Data Publishing: A Survey on Recent Developments
"... The collection of digital information by governments, corporations, and individuals has created tremendous opportunities for knowledge- and information-based decision making. Driven by mutual benefits, or by regulations that require certain data to be published, there is a demand for the exchange an ..."
Abstract
-
Cited by 31 (0 self)
- Add to MetaCart
The collection of digital information by governments, corporations, and individuals has created tremendous opportunities for knowledge- and information-based decision making. Driven by mutual benefits, or by regulations that require certain data to be published, there is a demand for the exchange and publication of data among various parties. Data in its original form, however, typically contains sensitive information about individuals, and publishing such data will violate individual privacy. The current practice in data publishing relies mainly on policies and guidelines as to what types of data can be published, and agreements on the use of published data. This approach alone may lead to excessive data distortion or insufficient protection. Privacy-preserving data publishing (PPDP) provides methods and tools for publishing useful information while preserving data privacy. Recently, PPDP has received considerable attention in research communities, and many approaches have been proposed for different data publishing scenarios. In this survey, we will systematically summarize and evaluate different approaches to PPDP, study the challenges in practical data publishing, clarify the differences and requirements that distinguish PPDP from other related problems, and propose future research directions.
Attacks on privacy and de finetti’s theorem
- In SIGMOD
, 2009
"... In this paper we present a method for reasoning about privacy using the concepts of exchangeability and deFinetti’s theorem. We illustrate the usefulness of this technique by using it to attack a popular data sanitization scheme known as Anatomy. We stress that Anatomy is not the only sanitization s ..."
Abstract
-
Cited by 24 (2 self)
- Add to MetaCart
In this paper we present a method for reasoning about privacy using the concepts of exchangeability and deFinetti’s theorem. We illustrate the usefulness of this technique by using it to attack a popular data sanitization scheme known as Anatomy. We stress that Anatomy is not the only sanitization scheme that is vulnerable to this attack. In fact, any scheme that uses the random worlds model, i.i.d. model, or tuple-independent model needs to be re-evaluated. The difference between the attack presented here and others that have been proposed in the past is that we do not need extensive background knowledge. An attacker only needs to know the nonsensitive attributes of one individual in the data, and can carry out this attack just by building a machine learning model over the sanitized data. The reason this attack is successful is that it exploits a subtle flaw in the way prior work computed the probability of disclosure of a sensitive attribute. We demonstrate this theoretically, empirically, and with intuitive examples. We also discuss how this generalizes to many other privacy schemes.
On the Anonymization of Sparse High-Dimensional Data
"... Abstract — Existing research on privacy-preserving data publishing focuses on relational data: in this context, the objective is to enforce privacy-preserving paradigms, such as kanonymity and ℓ-diversity, while minimizing the information loss incurred in the anonymizing process (i.e. maximize data ..."
Abstract
-
Cited by 15 (1 self)
- Add to MetaCart
Abstract — Existing research on privacy-preserving data publishing focuses on relational data: in this context, the objective is to enforce privacy-preserving paradigms, such as kanonymity and ℓ-diversity, while minimizing the information loss incurred in the anonymizing process (i.e. maximize data utility). However, existing techniques adopt an indexing- or clusteringbased approach, and work well for fixed-schema data, with low dimensionality. Nevertheless, certain applications require privacy-preserving publishing of transaction data (or basket data), which involves hundreds or even thousands of dimensions, rendering existing methods unusable. We propose a novel anonymization method for sparse highdimensional data. We employ a particular representation that captures the correlation in the underlying data, and facilitates the formation of anonymized groups with low information loss. We propose an efficient anonymization algorithm based on this representation. We show experimentally, using real-life datasets, that our method clearly outperforms existing state-of-the-art in terms of both data utility and computational overhead. I.
An Attacker’s View of Distance Preserving Maps for Privacy Preserving Data Mining
- Proc. PKDD
, 2006
"... Abstract. We examine the effectiveness of distance preserving transformations in privacy preserving data mining. These techniques are potentially very useful in that some important data mining algorithms can be efficiently applied to the transformed data and produce exactly the same results as if ap ..."
Abstract
-
Cited by 11 (1 self)
- Add to MetaCart
Abstract. We examine the effectiveness of distance preserving transformations in privacy preserving data mining. These techniques are potentially very useful in that some important data mining algorithms can be efficiently applied to the transformed data and produce exactly the same results as if applied to the original data e.g. distance-based clustering, k-nearest neighbor classification. However, the issue of how well the original data is hidden has, to our knowledge, not been carefully studied. We take a step in this direction by assuming the role of an attacker armed with two types of prior information regarding the original data. We examine how well the attacker can recover the original data from the transformed data and prior information. Our results offer insight into the vulnerabilities of distance preserving transformations. 1
A random rotation perturbation approach to privacy preserving data classification
- in Proceedings of International Conference on Data Mining (ICDM
, 2005
"... This paper presents a random rotation perturbation approach for privacy preserving data classification. Concretely, we identify the importance of classification-specific information with respect to the loss of information factor, and present a random rotation perturbation framework for privacy prese ..."
Abstract
-
Cited by 11 (8 self)
- Add to MetaCart
This paper presents a random rotation perturbation approach for privacy preserving data classification. Concretely, we identify the importance of classification-specific information with respect to the loss of information factor, and present a random rotation perturbation framework for privacy preserving data classification. Our approach has two unique characteristics. First, we identify that many classification models utilize the geometric properties of datasets, which can be preserved by geometric rotation. We prove that the three types of classifiers will deliver the same performance over the rotation perturbed dataset as over the original dataset. Second, we propose a multi-column privacy model to address the problems of evaluating privacy quality for multidimensional perturbation. With this metric, we develop a local optimal algorithm to find the good rotation perturbation in terms of privacy guarantee. We also analyze both naive estimation and ICA-based reconstruction attacks with the privacy model. Our initial experiments show that the random rotation approach can provide high privacy guarantee while maintaining zero-loss of accuracy for the discussed classifiers. 1
PDA: privacypreserving data aggregation in wireless sensor networks
- in: Proceedings of the IEEE Infocom2007
, 2007
"... Abstract — Providing efficient data aggregation while preserving data privacy is a challenging problem in wireless sensor networks research. In this paper, we present two privacy-preserving data aggregation schemes for additive aggregation functions. The first scheme – Cluster-based Private Data Agg ..."
Abstract
-
Cited by 11 (1 self)
- Add to MetaCart
Abstract — Providing efficient data aggregation while preserving data privacy is a challenging problem in wireless sensor networks research. In this paper, we present two privacy-preserving data aggregation schemes for additive aggregation functions. The first scheme – Cluster-based Private Data Aggregation (CPDA)– leverages clustering protocol and algebraic properties of polynomials. It has the advantage of incurring less communication overhead. The second scheme – Slice-Mix-AggRegaTe (SMART)– builds on slicing techniques and the associative property of addition. It has the advantage of incurring less computation overhead. The goal of our work is to bridge the gap between collaborative data collection by wireless sensor networks and data privacy. We assess the two schemes by privacy-preservation efficacy, communication overhead, and data aggregation accuracy. We present simulation results of our schemes and compare their performance to a typical data aggregation scheme – TAG, where no data privacy protection is provided. Results show the efficacy and efficiency of our schemes. To the best of our knowledge, this paper is among the first on privacy-preserving data aggregation in wireless sensor networks. I.

