Privacy-preserving distributed mining of association rules on horizontally partitioned data

by M Kantarcioglu
Venue: IEEE TKDE

Results 1 - 10 of 92

ℓ-diversity: Privacy beyond k-anonymity

by Ashwin Machanavajjhala, Daniel Kifer, Johannes Gehrke, Muthuramakrishnan Venkitasubramaniam - In ICDE, 2006
Cited by 294 (8 self)
Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k − 1 other records with respect to certain “identifying” attributes. In this paper we show using two simple attacks that a k-anonymized dataset has some subtle, but severe privacy problems. First, an attacker can discover the values of sensitive attributes when there is little diversity in those sensitive attributes. This kind of attack is a known problem [60]. Second, attackers often have background knowledge, and we show that k-anonymity does not guarantee privacy against attackers using background knowledge. We give a detailed analysis of these two attacks and we propose a novel and powerful privacy criterion called ℓ-diversity that can defend against such attacks. In addition to building a formal foundation for ℓ-diversity, we show in an experimental evaluation that ℓ-diversity is practical and can be implemented efficiently.
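
The k-anonymity definition in the abstract above, and the "little diversity" attack it mentions, can be illustrated with a minimal Python sketch. The attribute names (zip, age, disease) and the toy records are hypothetical, and counting distinct sensitive values per group is only the simplest reading of "diversity"; it is an assumption here, not necessarily the paper's formal ℓ-diversity criterion.

```python
from collections import defaultdict

def group_by_quasi_identifiers(records, quasi_ids):
    """Group records that share the same values on the identifying attributes."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[a] for a in quasi_ids)
        groups[key].append(rec)
    return groups

def is_k_anonymous(records, quasi_ids, k):
    """True if every group of indistinguishable records has at least k members."""
    groups = group_by_quasi_identifiers(records, quasi_ids)
    return all(len(g) >= k for g in groups.values())

def min_distinct_sensitive(records, quasi_ids, sensitive):
    """Smallest number of distinct sensitive values found in any group.

    Assumes records is non-empty. A result of 1 means some group leaks
    its sensitive value outright.
    """
    groups = group_by_quasi_identifiers(records, quasi_ids)
    return min(len({r[sensitive] for r in g}) for g in groups.values())

# Hypothetical, already-generalized toy table.
records = [
    {"zip": "130**", "age": "<30", "disease": "flu"},
    {"zip": "130**", "age": "<30", "disease": "flu"},
    {"zip": "148**", "age": ">=40", "disease": "cancer"},
    {"zip": "148**", "age": ">=40", "disease": "flu"},
]

two_anonymous = is_k_anonymous(records, ["zip", "age"], 2)
weakest_group = min_distinct_sensitive(records, ["zip", "age"], "disease")
```

On this toy table the 2-anonymity check passes, yet one group contains a single disease value, so anyone known to be in that group has their diagnosis revealed.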

Privacy Preserving Association Rule Mining in Vertically Partitioned Data

by Jaideep Vaidya, Chris Clifton - In The Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002
Cited by 159 (18 self)
Privacy considerations often constrain data mining projects. This paper addresses the problem of association rule mining where transactions are distributed across sources. Each site holds some attributes of each transaction, and the sites wish to collaborate to identify globally valid association rules. However, the sites must not reveal individual transaction data. We present a two-party algorithm for efficiently discovering frequent itemsets with minimum support levels, without either site revealing individual transaction values.

Maintaining Data Privacy in Association Rule Mining

by Shariq Rizvi, Jayant R. Haritsa - In Proceedings of the 28th VLDB Conference, Hong Kong, 2002
Cited by 102 (2 self)
Data mining services require accurate input data for their results to be meaningful, but privacy concerns may influence users to provide spurious information. We investigate here, with respect to mining association rules, whether users can be encouraged to provide correct information by ensuring that the mining process cannot, with any reasonable degree of certainty, violate their privacy. We present a scheme, based on probabilistic distortion of user data, that can simultaneously provide a high degree of privacy to the user and retain a high level of accuracy in the mining results. The performance of the scheme is validated against representative real and synthetic datasets.
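
As a rough illustration of what "probabilistic distortion" can mean for boolean transaction data, the sketch below keeps each 0/1 entry with probability p, flips it otherwise, and then inverts the expected observed frequency to estimate the true single-item support. This is a generic randomized-response-style sketch written for this listing; the function names and the parameter p are assumptions, and it is not claimed to be the scheme the paper proposes.

```python
import random

def distort_row(row, p, rng):
    """Keep each 0/1 entry with probability p, flip it otherwise."""
    return [bit if rng.random() < p else 1 - bit for bit in row]

def estimate_support(observed_fraction, p):
    """Invert E[observed] = p*s + (1-p)*(1-s) to recover the true fraction s."""
    if p == 0.5:
        raise ValueError("p = 0.5 destroys all information")
    return (observed_fraction - (1 - p)) / (2 * p - 1)

# Hypothetical column: 300 of 1000 transactions contain the item.
rng = random.Random(0)
true_column = [1] * 300 + [0] * 700
distorted = [distort_row([b], 0.9, rng)[0] for b in true_column]
estimate = estimate_support(sum(distorted) / len(distorted), 0.9)
```

The miner sees only the distorted column, while the estimate approaches the true support as the number of transactions grows. Values of p closer to 0.5 hide individual entries better but make the estimate noisier.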

On the privacy preserving properties of random data perturbation techniques

by Hillol Kargupta, Souptik Datta - In ICDM, 2003
Cited by 95 (4 self)
Privacy is becoming an increasingly important issue in many data mining applications. This has triggered the development of many privacy-preserving data mining techniques. A large fraction of them use randomized data distortion techniques to mask the data for preserving the privacy of sensitive data. This methodology attempts to hide the sensitive data by randomly modifying the data values, often using additive noise. This paper questions the utility of the random value distortion technique in privacy preservation. The paper notes that random objects (particularly random matrices) have “predictable” structures in the spectral domain and it develops a random matrix-based spectral filtering technique to retrieve original data from the dataset distorted by adding random values. The paper presents the theoretical foundation of this filtering method and extensive experimental results to demonstrate that in many cases random data distortion preserves very little data privacy.

Privacy-Preserving K-Means Clustering over Vertically Partitioned Data

by Jaideep Vaidya, Chris Clifton - In SIGKDD, 2003
Cited by 83 (4 self)
Privacy and security concerns can prevent sharing of data, derailing data mining projects. Distributed knowledge discovery, if done correctly, can alleviate this problem. The key is to obtain valid results, while providing guarantees on the (non)disclosure of data. We present a method for k-means clustering when different sites contain different attributes for a common set of entities. Each site learns the cluster of each entity, but learns nothing about the attributes at other sites.

Random projection-based multiplicative data perturbation for privacy preserving distributed data mining

by Kun Liu, Hillol Kargupta, Jessica Ryan - IEEE Transactions on Knowledge and Data Engineering, 2006
Cited by 36 (5 self)
This paper explores the possibility of using multiplicative random projection matrices for privacy preserving distributed data mining. It specifically considers the problem of computing statistical aggregates like the inner product matrix, correlation coefficient matrix, and Euclidean distance matrix from distributed privacy sensitive data possibly owned by multiple parties. This class of problems is directly related to many other data-mining problems such as clustering, principal component analysis, and classification. This paper makes primary contributions on two different grounds. First, it explores Independent Component Analysis as a possible tool for breaching privacy in deterministic multiplicative perturbation-based models such as random orthogonal transformation and random rotation. Then, it proposes an approximate random projection-based technique to improve the level of privacy protection while still preserving certain statistical characteristics of the data. The paper presents extensive theoretical analysis and experimental results. Experiments demonstrate that the proposed technique is effective and can be successfully used for different types of privacy-preserving data mining applications.
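
The claim that a random projection can preserve statistical aggregates such as inner products can be checked numerically with a small sketch. It assumes a projection matrix with independent standard Gaussian entries scaled by 1/sqrt(k); the helper names and this particular construction are assumptions made for illustration and may differ from the paper's technique. The number of rows k is set large here only to make the approximation visible; a privacy-oriented use would more plausibly choose k below the data dimension.

```python
import math
import random

def random_projection_matrix(k, m, rng):
    """k x m matrix with independent N(0, 1) entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(k)]

def project(matrix, x):
    """Return (1 / sqrt(k)) * R x for a k-row matrix R."""
    scale = 1.0 / math.sqrt(len(matrix))
    return [scale * sum(r * xi for r, xi in zip(row, x)) for row in matrix]

def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

rng = random.Random(42)
R = random_projection_matrix(4000, 3, rng)
x = [1.0, 2.0, 3.0]
y = [3.0, 2.0, 1.0]

exact = dot(x, y)                              # 10.0
approx = dot(project(R, x), project(R, y))     # equals exact only in expectation
```

Both parties would need to apply the same matrix R for the projected inner product to be meaningful; the estimate is unbiased, with an error that shrinks as k grows.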

Secure set intersection cardinality with application to association rule mining

by Christopher Clifton, Jaideep Vaidya, Chris Clifton - Accepted for Publication in the Journal of Computer Security, IOS
Cited by 26 (4 self)
There has been concern over the apparent conflict between privacy and data mining. There is no inherent conflict, as most types of data mining produce summary results that do not reveal information about individuals. The process of data mining may use private data, leading to the potential for privacy breaches. Secure Multiparty Computation shows that results can be produced without revealing the data used to generate them. The problem is that general techniques for secure multiparty computation do not scale to data-mining size computations. This paper presents an efficient protocol for securely determining the size of set intersection, and shows how this can be used to generate association rules where multiple parties have different (and private) information about the same set of individuals.

Privacy Preserving Frequent Itemset Mining

by Stanley R.M. Oliveira, Osmar R. Zaïane, 2002
Cited by 25 (2 self)
One crucial aspect of privacy preserving frequent itemset mining is the fact that the mining process deals with a trade-off: privacy and accuracy, which are typically contradictory, and improving one usually incurs a cost in the other. One alternative to address this particular problem is to look for a balance between hiding restrictive patterns and disclosing nonrestrictive ones. In this paper, we propose a new framework for enforcing privacy in mining frequent itemsets. We combine, in a single framework, techniques for efficiently hiding restrictive patterns: a transaction retrieval engine relying on an inverted file and Boolean queries; and a set of algorithms to "sanitize" a database. In addition, we introduce performance measures for mining frequent itemsets that quantify the fraction of mining patterns which are preserved after sanitizing a database. We also report the results of a performance evaluation of our research prototype and an analysis of the results.

Strong Conditional Oblivious Transfer and Computing on Intervals

by Ian F. Blake, Vladimir Kolesnikov - In Advances in Cryptology - ASIACRYPT 2004, 2004
Cited by 24 (8 self)
We consider the problem of securely computing the Greater Than (GT) predicate and its generalization -- securely determining membership in a union of intervals. We approach these problems from the point of view of Q-Conditional Oblivious Transfer (Q-COT), introduced by Di Crescenzo, Ostrovsky and Rajagopalan [4]. Q-COT is an oblivious transfer that occurs iff predicate Q evaluates to true on the parties' inputs. We are working in the semi-honest model with computationally unbounded receiver. In this paper

On Addressing Efficiency Concerns in Privacy-Preserving Mining

by Shipra Agrawal, Vijay Krishnan, Jayant R. Haritsa - Proc. of 9th Intl. Conf. on Database Systems for Advanced Applications (DASFAA), 2004
Cited by 20 (1 self)
Data mining services require accurate input data for their results to be meaningful, but privacy concerns may influence users to provide spurious information. To encourage