## On the privacy preserving properties of random data perturbation techniques (2003)

Venue: ICDM

Citations: 148 (5 self)

### BibTeX

@INPROCEEDINGS{Kargupta03onthe,
    author = {Hillol Kargupta and Souptik Datta},
    title = {On the privacy preserving properties of random data perturbation techniques},
    booktitle = {In ICDM},
    year = {2003},
    pages = {99--106}
}

### Abstract

Privacy is becoming an increasingly important issue in many data mining applications. This has triggered the development of many privacy-preserving data mining techniques. A large fraction of them use randomized data distortion techniques to mask the data for preserving the privacy of sensitive data. This methodology attempts to hide the sensitive data by randomly modifying the data values, often using additive noise. This paper questions the utility of the random value distortion technique in privacy preservation. The paper notes that random objects (particularly random matrices) have “predictable” structures in the spectral domain and it develops a random matrix-based spectral filtering technique to retrieve original data from the dataset distorted by adding random values. The paper presents the theoretical foundation of this filtering method and extensive experimental results to demonstrate that in many cases random data distortion preserves very little data privacy.
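The idea of spectral filtering described in the abstract can be illustrated with a minimal sketch. It is not the authors' exact procedure. It assumes the noise is i.i.d. with zero mean and a variance known to the attacker, that the original data is close to low rank, and that the upper edge of the noise eigenvalue spectrum follows the Marchenko-Pastur law. The function name `spectral_filter` and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def spectral_filter(perturbed, noise_var):
    """Estimate original records from additively perturbed ones.

    perturbed : (m, n) array, m records by n features, original + noise
    noise_var : variance of the i.i.d. zero-mean noise (assumed known)
    """
    m, n = perturbed.shape
    mean = perturbed.mean(axis=0)
    centered = perturbed - mean

    # Sample covariance of the perturbed data and its eigendecomposition.
    cov = centered.T @ centered / m
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Upper edge of the noise eigenvalue spectrum (Marchenko-Pastur law)
    # for an m-by-n matrix of i.i.d. noise entries.
    lam_max = noise_var * (1.0 + np.sqrt(n / m)) ** 2

    # Eigenvectors whose eigenvalues exceed the noise edge are treated as signal.
    signal_vecs = eigvecs[:, eigvals > lam_max]

    # Project onto the estimated signal subspace; noise outside it is discarded.
    return centered @ signal_vecs @ signal_vecs.T + mean


# Synthetic demonstration: low-rank "original" data plus Gaussian noise.
rng = np.random.default_rng(0)
m, n, rank = 2000, 20, 3
original = rng.normal(size=(m, rank)) @ rng.normal(size=(rank, n))
perturbed = original + rng.normal(scale=1.0, size=(m, n))
estimate = spectral_filter(perturbed, noise_var=1.0)

print("RMSE perturbed:", np.sqrt(np.mean((perturbed - original) ** 2)))
print("RMSE filtered: ", np.sqrt(np.mean((estimate - original) ** 2)))
```

In this sketch the filtered estimate keeps only the noise that falls inside the estimated signal subspace, so the reconstruction error shrinks when the original data occupies few dimensions relative to the number of features.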

### Citations

753 | Random Graphs
- Janson, Łuczak, et al.
- 2000
Citation Context ...signal processing literature [12] offers many filters to remove white noise from data and they often work reasonably well. Randomly generated structures like graphs demonstrate interesting properties [7]. In short, randomness does seem to have “structure” and this structure may be used to compromise privacy issues unless we pay careful attention. The rest of this paper illustrates this challenge in t...

675 | Random Matrices
- Mehta
- 1991
Citation Context ...bit us from extracting the hidden information? This section presents a discussion on the properties of random matrices and presents some results that will be used later in this paper. Random matrices [13] exhibit many interesting properties that are often exploited in high energy physics [13], signal processing [16], and even data mining [10]. The random noise added to the data can be viewed as a rand...

657 | Privacy-preserving data mining
- Agrawal, Srikant
- 2000
Citation Context ...omains is facing growing concerns. Therefore, we need to develop data mining techniques that are sensitive to the privacy issue. This has fostered the development of a class of data mining algorithms [2, 9] that try to extract the data patterns without directly accessing the original data and guarantee that the mining process does not get sufficient information to reconstruct the original data. This pa...

334 | On the design and quantification of privacy preserving data mining algorithms
- Agrawal, Aggarwal
- 2001
Citation Context ...ut that in many cases the noise can be separated from the perturbed data by studying the spectral properties of the data and as a result its privacy can be seriously compromised. Agrawal and Aggarwal [1] have also considered the approach in [2] and have provided an expectation-maximization (EM) algorithm for reconstructing the distribution of the original data from perturbed observations. They also pr...

261 | Privacy preserving mining of association rules
- Evfimievski, Srikant, et al.
- 2002
Citation Context ... the original data (which could be used to guess the data value to a higher level of accuracy). However, [1] provides no explicit procedure to reconstruct the original data values. Evfimievski et al. [5, 4] and Rizvi [15] have also considered the approach in [2] in the context of association rule mining and suggest techniques for limiting privacy breaches. Our primary contribution is to provide an expli...

257 | Limiting privacy breaches in privacy preserving data mining
- Evfimievski, Gehrke, et al.
- 2003
Citation Context ... the original data (which could be used to guess the data value to a higher level of accuracy). However, [1] provides no explicit procedure to reconstruct the original data values. Evfimievski et al. [5, 4] and Rizvi [15] have also considered the approach in [2] in the context of association rule mining and suggest techniques for limiting privacy breaches. Our primary contribution is to provide an expli...

230 | Privacy preserving association rule mining in vertically partitioned data
- Vaidya, Clifton
- 2002
Citation Context ...by exchanging only the minimal necessary information among the participating nodes without transmitting the raw data. Privacy preserving association rule mining from homogeneous [9] and heterogeneous [19] distributed data sets are a few examples. The second approach is based on data-swapping, which works by swapping data values within the same feature [3]. There is also an approach which works by adding rand...

185 | Privacy-Preserving Distributed Mining of Association Rules on Horizontally Partitioned Data
- Kantarcioglu, Clifton
- 2004
Citation Context ...omains is facing growing concerns. Therefore, we need to develop data mining techniques that are sensitive to the privacy issue. This has fostered the development of a class of data mining algorithms [2, 9] that try to extract the data patterns without directly accessing the original data and guarantee that the mining process does not get sufficient information to reconstruct the original data. This pa...

144 | Maintaining data privacy in association rule mining
- Rizvi, Haritsa
- 2002
Citation Context ...a (which could be used to guess the data value to a higher level of accuracy). However, [1] provides no explicit procedure to reconstruct the original data values. Evfimievski et al. [5, 4] and Rizvi [15] have also considered the approach in [2] in the context of association rule mining and suggest techniques for limiting privacy breaches. Our primary contribution is to provide an explicit filtering p...

116 | Error and perturbation bounds for subspaces associated with certain eigenvalue problems
- Stewart
- 1973
Citation Context ...be the two-norm of the perturbation, where [...] is the largest singular value of [...]. Then there exists an eigenvalue-eigenvector pair [...] of [...] satisfying [20, 17] [...] where [...] is the distance between [...] and the closest eigenvalue of [...], provided [...]. This shows that the eigenvalues of [...] and [...] are in gene...

100 | Some limit theorems of the eigenvalues of sample covariance matrix
- Jonsson
- 1982
Citation Context ...a variable [...]. We will consider asymptotics such that in the limit as [...], we have [...], [...], and [...], where [...]. Under these assumptions, it can be shown that [8] the empirical c.d.f. [...] converges in probability to a continuous distribution function [...] for every [...], whose probability density function (p.d.f.) is given by ...

63 | The statistical security of a statistical database
- Traub, Yemini, et al.
- 1984
Citation Context ...omized value distortion technique for learning decision trees [2] and association rule learning [6] are examples of this approach. Additional work on randomized masking of data can be found elsewhere [18]. This paper explores the third approach [2]. It points out that in many cases the noise can be separated from the perturbed data by studying the spectral properties of the data and as a result its pr...

53 | Inequalities between two kinds of eigenvalues of a linear transformation
- Weyl
- 1949
Citation Context ...here [...] are orthogonal matrices whose column vectors are eigenvectors of [...], [...], [...], respectively, and [...] are diagonal matrices with the corresponding eigenvalues on their diagonals. The following result from matrix perturbation theory [20] gives a relationship between [...]. Theorem 1 [20] [...] Suppos...

32 | Signal detection via spectral theory of large dimensional random matrices
- Silverstein, Combettes
- 1992
Citation Context ...rices and presents some results that will be used later in this paper. Random matrices [13] exhibit many interesting properties that are often exploited in high energy physics [13], signal processing [16], and even data mining [10]. The random noise added to the data can be viewed as a random matrix and therefore its properties can be understood by studying the properties of random matrices. In this p...

12 | Dependency Detection in MobiMine and Random Matrices
- Kargupta, Sivakumar, et al.
- 2002
Citation Context ...ults that will be used later in this paper. Random matrices [13] exhibit many interesting properties that are often exploited in high energy physics [13], signal processing [16], and even data mining [10]. The random noise added to the data can be viewed as a random matrix and therefore its properties can be understood by studying the properties of random matrices. In this paper we shall develop a spe...

3 | Random projection and privacy preserving correlation computation from distributed data
- Liu, Kargupta, et al.
- 2003
Citation Context ...ke independent component analysis. However, projection matrices that satisfy certain conditions may be more appealing for such applications. More details about this possibility can be found elsewhere [11]. Acknowledgments The authors acknowledge support from the United States National Science Foundation CAREER award IIS-0093353, NASA (NRA) NAS2-37143, and TEDCO, Maryland Technology Development Center...

2 | Randomization techniques for privacy preserving association rule mining
- Evfimievski
- 2002
Citation Context ...sing randomized techniques. The perturbed data is then used to extract the patterns and models. The randomized value distortion technique for learning decision trees [2] and association rule learning [6] are examples of this approach. Additional work on randomized masking of data can be found elsewhere [18]. This paper explores the third approach [2]. It points out that in many cases the noise can be...

1 | Data swapping: Balancing privacy against precision in mining for logic rules
- Estivill-Castro, Brankovic
- 1999