A framework for high-accuracy privacy-preserving mining (2005)
Venue: In Proceedings of the 21st IEEE International Conference on Data Engineering
Citations: 70 (1 self)
Citations
3610 | Fast Algorithms for Mining Association Rules
- Agrawal, Srikant
Citation Context ...emonstrate in this section how it can be used for enhancing privacy-preserving mining of association rules, a popular mining model that identifies interesting correlations between database attributes =-=[7]-=-. The core computation of association rule mining is to identify “frequent itemsets”, that is, all those itemsets whose support (i.e. frequency) in the database is in excess of a user-specified thresh... |
3329 | Mining association rules between sets of items in large databases
- Agrawal, Swami
- 1993
Citation Context ...ss FRAPP’s utility, we specifically evaluate the performance of our new perturbation mechanisms on the popular mining task of identifying frequent itemsets, the cornerstone of association rule mining =-=[4]-=-. Our experiments on a variety of real datasets indicate that both identity and support errors are substantially lower than those incurred by the prior privacy-preserving techniques. Another important... |
3275 | An introduction to probability theory and its applications. Vol. II. Second edition
- Feller
- 1971
Citation Context ...[…] kept constant, assumes its maximum value when […]. In other words, the variability of […], or its lack of uniformity, decreases the magnitude of chance fluctuations =-=[15]-=-. By using a random matrix […] instead of a deterministic […], we increase the variability of […] (now […] assumes variable values for all […]), hence decreasing the fluctuation of […] from its expectatio... |
1038 | Linear Algebra and its Applications
- Strang
- 1988
Citation Context ...rix of these transition probabilities, […]. This random process maps to a Markov process, and the perturbation matrix […] should therefore satisfy the following properties =-=[24]-=-: […] (1). Due to the constraints imposed by Equation 1, the domain of […] is a subset of […]. This domain is further restricted by the ... |
841 | Privacy-preserving data mining
- Agrawal, Srikant
Citation Context ...rmation on Web forms to E-commerce service providers. To encourage users to submit correct inputs, a variety of privacy-preserving data mining techniques have been proposed in the last few years (e.g. =-=[1, 8, 14, 20]-=-). Their goal is to ensure the privacy of the raw local data but, at the same time, to support accurate reconstruction of the global data mining models. Most ... |
443 | Mining quantitative association rules in large relational tables
- Srikant, Agrawal
Citation Context ...emonstrate in this section how it can be used for enhancing privacy-preserving mining of association rules, a popular mining model that identifies interesting correlations between database attributes =-=[1, 21]-=-. The core of association rule mining is to identify “frequent itemsets”, that is, all those itemsets whose support (i.e. frequency) in the database is in excess of a user-specified threshold. Equa... |
412 | On the design and quantification of privacy preserving data mining algorithms
- Agrawal, Aggarwal
- 2001
Citation Context ...ts for discovery of long patterns. 9. Related Work The issue of maintaining privacy in data mining has attracted considerable attention in the recent past. The work closest to our approach is that of =-=[2, 8, 12, 13, 14, 18, 20]-=-. In the pioneering work of [8], privacy-preserving data classifiers based on adding noise to the record values were proposed. This approach was extended in [2] and [18] to address a variety of subtle... |
330 | Privacy preserving mining of association rules
- Evfimievski, Srikant, et al.
Citation Context ...opriate value of […]. For […], this value turned out to be […] and […] for the CENSUS and HEALTH datasets, respectively. C&P. This is the Cut-and-Paste perturbation scheme proposed in =-=[14]-=-, with algorithmic parameters […] and […]. To choose […], we varied […] from […] to […], and for each […], […] was chosen such that the matrix (Equation 11) satisfies the privacy constraints (Equation 2). The res... |
314 | Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression
- Samarati, Sweeney
Citation Context ...ix itself random which, to the best of our knowledge, has not been previously explored in the context of privacy-preserving mining. Another model of privacy-preserving data mining is the k-anonymity model =-=[23]-=-. The perturbation approach used in the random perturbation model works under the strong privacy requirement that even the dataset-forming server is not allowed to learn or recover precise records. Users ... |
298 | Limiting privacy breaches in privacy preserving data mining
- Evfimievski, Gehrke, et al.
- 2003
Citation Context ...ork that facilitates a systematic approach to the design of random perturbation schemes for privacy-preserving mining. It supports “amplification”, a particularly strong notion of privacy proposed in =-=[13]-=-, which guarantees strict limits on privacy breaches of individual user information, independent of the distribution of the original data. The distinguishing feature of FRAPP is its quantitative chara... |
295 | Privacy preserving association rule mining in vertically partitioned data
- Vaidya, Clifton
- 2002
Citation Context ...ata miner – this work is complementary to ours since it addresses concerns about output privacy, whereas our focus is on the privacy of the input data. Maintaining input data privacy is considered in =-=[17, 25, 26, 27]-=- in the context of databases that are distributed across a number of sites with each site only willing to share data mining results, but not the source data. 10. Conclusions and Future Work In this pa... |
250 | Hippocratic databases
- Agrawal, Kiernan, et al.
- 2002
Citation Context ...mediate database forming server can learn or recover precise records. Hippocratic databases are database systems that take responsibility for the privacy of the data they manage, and are discussed in =-=[3, 5, 6, 19]-=-. It involves specification of how the data is to be used in a privacy policy, and enforcing limited disclosure rules for regulatory concerns prompted by legislation. Finally, the problem addressed in... |
240 | Privacy-preserving distributed mining of association rules on horizontally partitioned data
- Kantarcioglu, Clifton
- 2004
193 | Generalizing data to provide anonymity when disclosing information
- Samarati, Sweeney
Citation Context ...rbation matrix elements themselves, which has not been, to the best of our knowledge, previously discussed in the literature. Another model of privacy-preserving data mining is the k-anonymity model =-=[21, 1]-=-, where each record value is replaced with a corresponding generalized value. Specifically, each perturbed record cannot be distinguished from at least k other records in the data. However, the constr... |
190 | On the privacy preserving properties of random data perturbation techniques
- Kargupta, Datta, et al.
- 2003
Citation Context ...ts for discovery of long patterns. 9. Related Work The issue of maintaining privacy in data mining has attracted considerable attention in the recent past. The work closest to our approach is that of =-=[2, 8, 12, 13, 14, 18, 20]-=-. In the pioneering work of [8], privacy-preserving data classifiers based on adding noise to the record values were proposed. This approach was extended in [2] and [18] to address a variety of subtle... |
169 | Maintaining data privacy in association rule mining
- Rizvi, Haritsa
- 2002
Citation Context ...rmation on Web forms to E-commerce service providers. To encourage users to submit correct inputs, a variety of privacy-preserving data mining techniques have been proposed in the last few years (e.g. =-=[1, 8, 14, 20]-=-). Their goal is to ensure the privacy of the raw local data but, at the same time, to support accurate reconstruction of the global data mining models. Most ... |
165 | Privacy-preserving k-means clustering over vertically partitioned data
- Vaidya, Clifton
- 2003
Citation Context ...ata miner – this work is complementary to ours since it addresses concerns about output privacy, whereas our focus is on the privacy of the input data. Maintaining input data privacy is considered in =-=[17, 25, 26, 27]-=- in the context of databases that are distributed across a number of sites with each site only willing to share data mining results, but not the source data. 10. Conclusions and Future Work In this pa... |
112 | Disclosure Limitation of Sensitive Rules
- Atallah, Bertino, et al.
- 1999
Citation Context ... It involves specification of how the data is to be used in a privacy policy, and enforcing limited disclosure rules for regulatory concerns prompted by legislation. Finally, the problem addressed in =-=[10, 11, 22, 23]-=- is how to prevent sensitive rules from being inferred by the data miner – this work is complementary to ours since it addresses concerns about output privacy, whereas our focus is on the privacy of t... |
101 | A condensation approach to privacy preserving data mining
- Aggarwal, Yu
- 2004
Citation Context ...rmation on Web forms to E-commerce service providers. To encourage users to submit correct inputs, a variety of privacy-preserving data mining techniques have been proposed in the last few years (e.g. =-=[1, 8, 14, 20]-=-). Their goal is to ensure the privacy of the raw local data but, at the same time, to support accurate reconstruction of the global data mining models. Most ... |
96 | Using Unknowns to Prevent Discovery of Association Rules
- Saygin, Verykios, et al.
- 2001
Citation Context ... It involves specification of how the data is to be used in a privacy policy, and enforcing limited disclosure rules for regulatory concerns prompted by legislation. Finally, the problem addressed in =-=[10, 11, 22, 23]-=- is how to prevent sensitive rules from being inferred by the data miner – this work is complementary to ours since it addresses concerns about output privacy, whereas our focus is on the privacy of t... |
84 | Limiting disclosure in hippocratic databases
- LeFevre, Agrawal, et al.
- 2004
Citation Context ...mediate database forming server can learn or recover precise records. Hippocratic databases are database systems that take responsibility for the privacy of the data they manage, and are discussed in =-=[3, 5, 6, 19]-=-. It involves specification of how the data is to be used in a privacy policy, and enforcing limited disclosure rules for regulatory concerns prompted by legislation. Finally, the problem addressed in... |
61 | Hiding Association Rules by using Confidence and Support
- Dasseni, Verykios, et al.
- 2001
Citation Context ... It involves specification of how the data is to be used in a privacy policy, and enforcing limited disclosure rules for regulatory concerns prompted by legislation. Finally, the problem addressed in =-=[10, 11, 22, 23]-=- is how to prevent sensitive rules from being inferred by the data miner – this work is complementary to ours since it addresses concerns about output privacy, whereas our focus is on the privacy of t... |
58 | Privacy preserving naïve Bayes classifier for vertically partitioned data
- Vaidya, Clifton
Citation Context ...ata miner – this work is complementary to ours since it addresses concerns about output privacy, whereas our focus is on the privacy of the input data. Maintaining input data privacy is considered in =-=[17, 25, 26, 27]-=- in the context of databases that are distributed across a number of sites with each site only willing to share data mining results, but not the source data. 10. Conclusions and Future Work In this pa... |
50 | Privacy-preserving distributed mining of association rules on horizontally partitioned data
- Clifton
- 2004
Citation Context ...ata miner – this work is complementary to ours since it addresses concerns about output privacy, whereas our focus is on the privacy of the input data. Maintaining input data privacy is considered in =-=[17, 25, 26, 27]-=- in the context of databases that are distributed across a number of sites with each site only willing to share data mining results, but not the source data. 10. Conclusions and Future Work In this pa... |
48 | Auditing compliance with a Hippocratic database
- Agrawal, Bayardo, et al.
- 2004
Citation Context ...mediate database forming server can learn or recover precise records. Hippocratic databases are database systems that take responsibility for the privacy of the data they manage, and are discussed in =-=[3, 5, 6, 19]-=-. It involves specification of how the data is to be used in a privacy policy, and enforcing limited disclosure rules for regulatory concerns prompted by legislation. Finally, the problem addressed in... |
45 | Post Randomisation For Statistical Disclosure Control: Theory and Implementation
- Gouweleeuw, Kooiman, et al.
- 1998
Citation Context ...dology for limiting them, were given in the foundational work of [13]. Techniques for probabilistic perturbation have also been investigated in the statistics literature. For example, the PRAM method =-=[12, 16]-=-, intended for disclosure limitation in microdata files, considers the use of Markovian perturbation matrices. However, the ideal choice of matrix is left as an open research issue, and an iterative r... |
44 | Privacy preserving association rule mining
- Saygin, Verykios, et al.
- 2002
Citation Context ... It involves specification of how the data is to be used in a privacy policy, and enforcing limited disclosure rules for regulatory concerns prompted by legislation. Finally, the problem addressed in =-=[10, 11, 22, 23]-=- is how to prevent sensitive rules from being inferred by the data miner – this work is complementary to ours since it addresses concerns about output privacy, whereas our focus is on the privacy of t... |
26 | On Addressing Efficiency Concerns in Privacy-Preserving Mining
- Agrawal, Krishnan, et al.
- 2004
Citation Context ...one using the Apriori [7] algorithm, with an additional support reconstruction phase at the end of each pass to recover the original supports from the perturbed database supports computed during the pass =-=[9, 20]-=-. Specifically, the perturbation mechanisms evaluated in our study are the following: DET-GD. This scheme uses the deterministic gamma-diagonal perturbation matrix (Section 3) for perturbation and re... |
25 | On the number of successes in independent trials
- Wang
- 1993
Citation Context ...ability of success varies from trial $i$ to trial $j$, depending on the values of $U_i$ and $U_j$, respectively. The distribution of such a random variable $Y_v$ is known as the Poisson-Binomial distribution =-=[28]-=-. From Equation 3, the expectation of $Y_v$ is given by $E(Y_v) = \sum_{i=1}^{N} E(Y_v^i) = \sum_{i=1}^{N} P(Y_v^i = 1)$ (4). Using $X_u$ to denote the number of records with value $u$ in the original database, a... |
13 | Using Unknowns to Prevent Discovery
- Saygin, Verykios, et al.
- 2001
Citation Context .... It involves specification of how the data is to be used in a privacy policy and enforcing limited disclosure rules for regulatory concerns prompted by legislation. Finally, the problem addressed in =-=[19, 10, 11, 20]-=- is how to prevent sensitive rules from being inferred by the data miner – this work is complementary to ours since it addresses concerns about output privacy, whereas our focus is on the privacy of t... |
9 | Managing Healthcare Data Hippocratically
- Agrawal, Kini, et al.
Citation Context ...mediate database forming server can learn or recover precise records. Hippocratic databases are database systems that take responsibility for the privacy of the data they manage, and are discussed in =-=[3, 5, 6, 19]-=-. It involves specification of how the data is to be used in a privacy policy, and enforcing limited disclosure rules for regulatory concerns prompted by legislation. Finally, the problem addressed in... |
3 | On the number of successes in independent trials, Statistica Sinica 3
- Wang
- 1993
Citation Context ...ess in a trial $i$ varies from another trial $j$ and actually depends on the values of $U_i$ and $U_j$, respectively. The distribution of such a random variable $Y_v$ is known as the Poisson-Binomial distribution =-=[25]-=-. Now, from Equation 3, the expectation of $Y_v$ is given by $E(Y_v) = \sum_{i=1}^{N} E(Y_v^i) = \sum_{i=1}^{N} P(Y_v^i = 1)$ (4). Let $X_u$ denote the number of records with value $u$ in the original database. Since P(Y_v^i = 1... |
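The expectation quoted in the citation context above (Equation 4) says that the mean of a Poisson-Binomial count is the sum of the per-trial success probabilities. The following is a minimal sketch that checks this numerically. It is not code from the paper: the function names and the example probabilities are hypothetical, chosen only for illustration, and the exact distribution is built with a standard dynamic-programming convolution.

```python
from typing import List


def poisson_binomial_pmf(probs: List[float]) -> List[float]:
    """Exact pmf of the number of successes in independent trials.

    probs[i] is the success probability of trial i (the trials need not
    share the same probability). Entry k of the result is P(count == k).
    """
    pmf = [1.0]  # with zero trials, the count is 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this trial fails: count stays k
            nxt[k + 1] += mass * p       # this trial succeeds: count becomes k + 1
        pmf = nxt
    return pmf


def expected_successes(probs: List[float]) -> float:
    """Expectation as in Equation 4: the sum of per-trial success probabilities."""
    return sum(probs)


# Hypothetical per-trial probabilities, standing in for P(Y_v^i = 1).
probs = [0.9, 0.1, 0.1, 0.5]
pmf = poisson_binomial_pmf(probs)
mean_from_pmf = sum(k * mass for k, mass in enumerate(pmf))

print(mean_from_pmf, expected_successes(probs))
```

The two printed values agree up to floating-point rounding, which is the linearity-of-expectation statement behind Equation 4; no independence assumption is needed for the mean itself, only for the pmf computed here.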