## Privacy-preserving distributed mining of association rules on horizontally partitioned data

Venue: | In The ACM SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD’02) |

Citations: | 187 - 18 self |

### BibTeX

@INPROCEEDINGS{Kantarcioglu_privacy-preservingdistributed,

author = {Murat Kantarcioglu and Chris Clifton},

title = {Privacy-preserving distributed mining of association rules on horizontally partitioned data},

booktitle = {The ACM SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD’02)},

year = {},

pages = {24--31}

}

### Abstract

Data mining can extract important knowledge from large data collections, but sometimes these collections are split among various parties. Privacy concerns may prevent the parties from directly sharing the data and some types of information about the data. This paper addresses secure mining of association rules over horizontally partitioned data. The methods incorporate cryptographic techniques to minimize the information shared, while adding little overhead to the mining task.

Index Terms: Data mining, security, privacy.

### Citations

3184 | A method for obtaining digital signatures and public-key cryptosystems - Rivest, Shamir, et al. - 1978 |

2961 | New Directions in Cryptography
- Diffie, Hellman
- 1976
Citation Context ...d be correct. KANTARCIOGLU AND CLIFTON: PRIVACY-PRESERVING DISTRIBUTED MINING OF ASSOCIATION RULES ON HORIZONTALLY 1037 and further definitions and discussion of their security, can be found in [21], [22], [23], [24]. ACKNOWLEDGMENTS The authors wish to acknowledge the contributions of Mike Atallah and Jaideep Vaidya. Discussions with them have helped to tighten the proofs, giving clear bounds on the ... |

2899 | Fast algorithms for mining association rules
- Agrawal, Srikant
- 1994
Citation Context ...(i); $\mathit{support}_{AB} = \frac{\sum_i \mathit{support\_count}_{AB}(i)}{\sum_i \mathit{database\_size}(i)}$; $\mathit{confidence}_{AB \Rightarrow C} = \frac{\mathit{support}_{AB \Rightarrow C}}{\mathit{support}_{AB}}$. Note that this does not require sharing any individual transactions. We can easily extend an algorithm such as a priori [1] to the distributed case using the following . The authors are with the Department of Computer Sciences, Purdue University, 250 N. University St., W. Lafayette, IN 47907. E-mail: {kanmurat, clifton}@c... |

1230 | A public key cryptosystem and a signature scheme based on discrete logarithms
- ElGamal
- 1985
Citation Context ...orrect. and further definitions and discussion of their security, can be found in [21], [22], [23], [24]. ACKNOWLEDGMENTS The authors wish to acknowledge the contributions of Mike Atallah and Jaideep Vaidya. Discussions with them have helped to tighten the proofs, giving clear bounds on the inform... |

717 | Crowds: Anonymity for Web Transactions
- Reiter, Rubin
- 2000
Citation Context ...fully encrypted values of its own itemsets. 1 Phase 4 decrypts the merged frequent itemsets. Commutativity of encryption allows us to decrypt 1. An alternative would be to use an anonymizing protocol [16] to send all fully encrypted itemsets to Site 0, thus preventing Site 0 from knowing which were its own itemsets. The separate odd/even merging is lower cost and achieves sufficient security for pract... |

665 | Privacy-preserving data mining - Agrawal, Srikant - 2000 |

597 | How to generate and exchange secrets
- Yao
- 1986
Citation Context ...ped to some number that is bigger than or equal to $m/2$. ($-k = m - k \bmod m$.) The last site needs to test if this sum minus $x_r \pmod{m}$ is less than $m/2$. This can be done securely using Yao’s generic method [11]. Clearly, this algorithm is secure as long as there is no collusion as no site can distinguish what it receives from a random number. Alternatively, the first site can simply send $x_r$ to the last site... |

418 | Privacy preserving data mining
- Lindell, Pinkas
Citation Context ...TEMBER 2004 just that no one is allowed to see all the data. In return, we are able to get exact, rather than approximate, results. The other approach uses cryptographic tools to build decision trees [8]. In this work, the goal is to securely build an ID3 decision tree where the training set is distributed between two parties. The basic idea is that finding the attribute that maximizes information ga... |

340 | On the design and quantification of privacy preserving data mining algorithms
- Agrawal, Aggarwal
- 2001
Citation Context ...ghtened the bounds on what private information is disclosed by showing that the ability to reconstruct the distribution can be used to tighten estimates of original values based on the distorted data [5]. More recently, the data distortion approach has been applied to Boolean association rules [6], [7]. Again, the idea is to modify data values such that reconstruction of the values for any individual... |

329 | An improved algorithm for computing logarithms over GF(p) and its cryptographic significance
- Pohlig, Hellman
- 1978
Citation Context ...ata mining methods to be applied in situations where privacy concerns would appear to restrict such mining. APPENDIX CRYPTOGRAPHIC NOTES ON COMMUTATIVE ENCRYPTION The Pohlig-Hellman encryption scheme [15] can be used for a commutative encryption scheme meeting the requirements of Section 2.3. Pohlig-Hellman works as follows: Given a large prime $p$ with no small factors of $p - 1$, each party chooses a rand... |

265 | Privacy preserving mining of association rules
- Evfimievski, Srikant, et al.
- 2002
Citation Context ...nstruct the distribution can be used to tighten estimates of original values based on the distorted data [5]. More recently, the data distortion approach has been applied to Boolean association rules [6], [7]. Again, the idea is to modify data values such that reconstruction of the values for any individual transaction is difficult, but the rules learned on the distorted data are still valid. One int... |

232 | Privacy preserving association rule mining in vertically partitioned data
- Vaidya, Clifton
- 2002
Citation Context ...ult. The ability to share nonsensitive data enables highly efficient solutions. The problem of privately computing association rules in vertically partitioned distributed data has also been addressed [10]. The vertically partitioned problem occurs when each transaction is split across multiple sites, with each site having a different set of attributes for the entire set of transactions. With horizonta... |

145 | Maintaining data privacy in association rule mining
- Rizvi, Haritsa
Citation Context ...ct the distribution can be used to tighten estimates of original values based on the distorted data [5]. More recently, the data distortion approach has been applied to Boolean association rules [6], [7]. Again, the idea is to modify data values such that reconstruction of the values for any individual transaction is difficult, but the rules learned on the distorted data are still valid. One interest... |

122 | Secure multi-party computation, working draft
- Goldreich
Citation Context ...pression $(v_1 + v_2)\log(v_1 + v_2)$ and show how to use this function for building the ID3 securely. This approach treats privacy-preserving data mining as a special case of secure multiparty computation [9] and not only aims for preserving individual privacy, but also tries to preserve leakage of any information other than the final result. We follow this approach, but address a different problem (assoc... |

121 | One-way accumulators: A decentralized alternative to digital signatures
- Benaloh, Mare
- 1993
Citation Context ...s would be correct. and further definitions and discussion of their security, can be found in [21], [22], [23], [24]. ACKNOWLEDGMENTS The authors wish to acknowledge the contributions of Mike Atallah and Jaideep Vaidya. Discussions with them have helped to tighten the proofs, giving clear bounds o... |

115 | Secret sharing homomorphisms: keeping shares of a secret secret
- Benaloh
- 1986
Citation Context ...es its input into $n$ parts and sends the $n - 1$ pieces to different sites. To reveal any parties input, $n - 1$ parties must collude. The following is a brief summary of the protocol, details can be found in [17]. (A slightly more efficient version can be found in [18].) 1. Each site $i$ randomly chooses $n$ elements such that $x_i = \sum_{j=1}^{n} z_{i,j} \bmod m$, where $x_i$ is the input of site $i$. Site $i$ sends $z_{i,j}$ to site $j$. 2... |

112 | A fast distributed algorithm for mining association rules
- Cheung, Han, et al.
- 1996
Citation Context ...te the global support of each rule and (from the lemma) be certain that all rules with support at least k have been found. More thorough studies of distributed association rule mining can be found in [2], [3]. The above approach protects individual data privacy, but it does require that each site disclose what rules it supports and how much it supports each potential global rule. What if this informa... |

78 | Efficient Mining of Association Rules in Distributed Databases
- Cheung, Ng, et al.
- 1996
Citation Context ...e global support of each rule and (from the lemma) be certain that all rules with support at least k have been found. More thorough studies of distributed association rule mining can be found in [2], [3]. The above approach protects individual data privacy, but it does require that each site disclose what rules it supports and how much it supports each potential global rule. What if this information ... |

41 | Mental poker
- Shamir, Rivest, et al.
- 1981
Citation Context .... and further definitions and discussion of their security, can be found in [21], [22], [23], [24]. ACKNOWLEDGMENTS The authors wish to acknowledge the contributions of Mike Atallah and Jaideep Vaidya. Discussions with them have helped to tighten the proofs, giving clear bounds on the information ... |

25 | Defining Privacy For Data Mining
- Clifton, Kantarcioglu, et al.
Citation Context ...t allow parties to choose their desired level of security are needed, allowing efficient solutions that maintain the desired security. Some suggested directions for research in this area are given in [19]. One line of research is to predict the value of information for a particular organization, allowing trade off between disclosure cost, computation cost, and benefit from the result. We believe some ... |

12 | A communication-privacy tradeoff for modular addition
- Chor, Kushilevitz
- 1993
Citation Context ...ferent sites. To reveal any parties input, $n - 1$ parties must collude. The following is a brief summary of the protocol, details can be found in [17]. (A slightly more efficient version can be found in [18].) 1. Each site $i$ randomly chooses $n$ elements such that $x_i = \sum_{j=1}^{n} z_{i,j} \bmod m$, where $x_i$ is the input of site $i$. Site $i$ sends $z_{i,j}$ to site $j$. 2. Every site $i$ computes $w_i = \sum_{j=1}^{n} z_{j,i} \bmod m$ and sends ... |

9 | Enhancing Privacy and Trust
- Huberman, Franklin, et al.
- 2000
Citation Context ...it is possible to determine if one is the square of the other (even though the base values are not revealed.) This violates the security requirement of Section 2.3. Huberman et al. provide a solution [20]. Rather than encrypting items directly, a hash of the items is encrypted. The hash occurs only at the originating site, the second and later encryption of items can use Pohlig-Hellman directly. The h... |

4 | An efficient protocol for Yao’s millionaires’ problem - Ioannidis, Grama |

2 | Encryption Schemes (working draft) - Goldreich - 2003 |
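
The secret-sharing secure sum summarized in the citation contexts for Benaloh and for Chor and Kushilevitz above can be sketched in a few lines. This is a minimal single-process simulation, not the authors' implementation: the function names `split_into_shares` and `secure_sum` are hypothetical, the network exchange between sites is replaced by list indexing, and the final step (adding up the published values $w_i$) is an assumption, since the quoted context is cut off after step 2.

```python
import secrets


def split_into_shares(x, n, m):
    """Split x into n random shares whose sum is x modulo m."""
    # n - 1 shares are uniformly random; the last one fixes the sum.
    shares = [secrets.randbelow(m) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % m)
    return shares


def secure_sum(inputs, m):
    """Simulate the share-based sum of all inputs modulo m."""
    n = len(inputs)
    # Step 1: site i chooses z[i][0..n-1] with x_i = sum_j z[i][j] mod m
    # and would send z[i][j] to site j.
    z = [split_into_shares(x % m, n, m) for x in inputs]
    # Step 2: site i adds the shares it received: w_i = sum_j z[j][i] mod m.
    w = [sum(z[j][i] for j in range(n)) % m for i in range(n)]
    # Step 3 (assumed, not in the quoted text): the w_i are combined.
    return sum(w) % m
```

For example, `secure_sum([5, 17, 42], 1000)` returns 64, the same value as the plain sum, while each simulated site only ever handles random-looking shares of the other sites' inputs. The modulus `m` must exceed the largest possible total for the result to equal the true sum.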