## Mondrian multidimensional k-anonymity (2006)

Venue: ICDE

Citations: 187 (5 self)

### BibTeX

@INPROCEEDINGS{Lefevre06mondrianmultidimensional,

author = {Kristen LeFevre and David J. DeWitt and Raghu Ramakrishnan},

title = {Mondrian multidimensional k-anonymity},

booktitle = {ICDE},

year = {2006}

}

### Abstract

K-anonymity has been proposed as a mechanism for protecting privacy in microdata publishing, and numerous recoding “models” have been considered for achieving k-anonymity. This paper proposes a new multidimensional model, which provides an additional degree of flexibility not seen in previous (single-dimensional) approaches. Often this flexibility leads to higher-quality anonymizations, as measured both by general-purpose metrics and more specific notions of query answerability. Optimal multidimensional anonymization is NP-hard (like previous optimal k-anonymity problems). However, we introduce a simple greedy approximation algorithm, and experimental results show that this greedy algorithm frequently leads to more desirable anonymizations than exhaustive optimal algorithms for two single-dimensional models.
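The greedy algorithm mentioned in the abstract is described in the citing contexts below as a recursive procedure reminiscent of kd-tree construction. The following is only a minimal sketch of that idea for numeric quasi-identifiers, not the authors' implementation. The names `mondrian_partition` and `summarize` are hypothetical, and two details are assumptions made for illustration: the split attribute is chosen by raw (not normalized) value range, and records equal to the median go to the right-hand side.

```python
from typing import List, Sequence, Tuple

# One numeric value per quasi-identifier attribute (assumed representation).
Record = Tuple[float, ...]


def mondrian_partition(records: Sequence[Record], k: int) -> List[List[Record]]:
    """Greedy strict partitioning: recursively cut at a median value.

    Every returned partition holds at least k records, provided the
    input itself holds at least k records.
    """
    records = list(records)
    if len(records) < 2 * k:
        # Too small to cut into two parts that each keep k records.
        return [records]

    dims = range(len(records[0]))
    # Assumption: try attributes from widest to narrowest raw value range.
    by_width = sorted(
        dims,
        key=lambda d: max(r[d] for r in records) - min(r[d] for r in records),
        reverse=True,
    )
    for d in by_width:
        ordered = sorted(records, key=lambda r: r[d])
        median = ordered[len(ordered) // 2][d]
        # Strict partitioning: records with equal values stay on the same side.
        left = [r for r in ordered if r[d] < median]
        right = [r for r in ordered if r[d] >= median]
        if len(left) >= k and len(right) >= k:
            return mondrian_partition(left, k) + mondrian_partition(right, k)

    # No attribute admits an allowable cut, so this region is final.
    return [records]


def summarize(partition: Sequence[Record]) -> List[Tuple[float, float]]:
    """Generalize a partition to one (min, max) range per attribute."""
    dims = range(len(partition[0]))
    return [
        (min(r[d] for r in partition), max(r[d] for r in partition))
        for d in dims
    ]
```

Calling `mondrian_partition(records, k)` on a list of (Age, Zipcode) tuples and then `summarize` on each returned partition yields the generalized ranges that would be published in place of the original values.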

### Citations

10921 | Computers and Intractability: A Guide to the Theory of NP-Completeness
- Garey, Johnson
Citation Context ...hat every resulting multidimensional region $P_i$ contains either $|P_i| \ge k$ or $|P_i| = 0$ points and $C_{AVG} \le$ positive constant $c$? Our proof is based on a straightforward reduction from integer partitioning [7]. Integer Partitioning: Consider a set $A$ of $n$ positive integers $\{a_1, \dots, a_n\}$. Is there some $A' \subseteq A$ such that $\sum_{a_i \in A'} a_i = \sum_{a_j \in A - A'} a_j$? Theorem 1: The k-anonymous strict multidimensional partition...

2868 | UCI Repository of machine learning databases
- Blake, Merz
- 1998
Citation Context ...ues of each attribute into equal-width ranges. The parameters are described in Figure 8. In addition to synthetic data, we also used the Adults database from the UC Irvine Machine Learning Repository [3], which contains census data, and has become a de facto benchmark for k-anonymity. We configured this data set as it was configured for the experiments reported in [2], using eight regular attributes,...

795 | k-anonymity: a model for protecting privacy
- Sweeney
Citation Context ... Name and Social Security Number) must be removed. In addition, this process must account for the possibility of combining certain other attributes with external data to uniquely identify individuals [15]. For example, an individual might be “re-identified” by joining the released data with another (public) database on Age, Sex, and Zipcode. Figure 1 shows such an attack, where Ahmed’s medical informa...

587 | An algorithm for finding best matches in logarithmic expected time
- Friedman, Bentley, et al.
- 1977
Citation Context ... each region. In the previous sections, we alluded to a recursive algorithm for the first step. In this section we outline a simple scalable algorithm, reminiscent of those used to construct kd-trees [5], that can be adapted to either strict or relaxed partitioning. The second step is described in more detail in Section 5. The strict partitioning algorithm is shown in Figure 6. Each iteration must cho...

337 | Protecting respondents' identities in microdata release
- Samarati
Citation Context ...ck, where Ahmed’s medical information is determined by joining the released patient data with a public voter registration list. K-anonymity has been proposed to reduce the risk of this type of attack [12, 13, 15]. The primary goal of k-anonymization is to protect the privacy of the individuals to whom the data pertains. However, subject to this constraint, it is important that the released data remain as “usef...

293 | Achieving k-anonymity privacy protection using generalization and suppression
- Sweeney
- 2002
Citation Context ...l in the worst case. • The greedy multidimensional algorithm often produces higher-quality results than optimal single-dimensional algorithms (as well as the many existing single-dimensional heuristic [6, 14, 16] and stochastic search [8, 18] algorithms). 1.1. Basic Definitions. Quasi-Identifier Attribute Set: A quasi-identifier is a minimal set of attributes $X_1, \dots, X_d$ in table $T$ that can be joined with extern...

259 | Data privacy through optimal k-anonymization
- Bayardo, Agrawal
Citation Context ...k-anonymization, an approach with several important advantages: • The greedy algorithm is substantially more efficient than proposed optimal k-anonymization algorithms for single-dimensional models [2, 9, 12]. The time complexity of the greedy algorithm is $O(n \log n)$, while the optimal algorithms are exponential in the worst case. • The greedy multidimensional algorithm often produces higher-quality results...

224 | Incognito: Efficient full-domain k-anonymity
- LeFevre, DeWitt, et al.
- 2005
Citation Context ... [Figure 8. Parameters of synthetic generator] ...comparing these results with those produced by optimal algorithms for two other models: full-domain generalization [9, 12], and single-dimensional partitioning [2, 8]. The specific algorithms used in the comparison (Incognito [9] and K-Optimize [2]) were chosen for efficiency, but any exhaustive algorithm for these model...

212 | Protecting Privacy when Disclosing Information: k-Anonymity and Its Enforcement through Generalization and Suppression
- Samarati, Sweeney
- 1998
Citation Context ...re Ahmed’s medical information is determined by joining a released table of patient data with a public voter registration list. K-anonymity has been proposed to reduce the risk of this type of attack [12, 13, 15]. The primary goal of k-anonymization is to protect the privacy of the individuals to whom the data pertains. However, subject to this constraint, it is important that the released data remain as “usef...

210 | On the complexity of optimal k-anonymity
- Meyerson, Williams
- 2004
Citation Context ...pertains. However, subject to this constraint, it is important that the released data remain as “useful” as possible. Numerous recoding models have been proposed in the literature for k-anonymization [8, 9, 13, 17, 10], and often the “quality” of the published data is dictated by the model that is used. The main contributions of this paper are a new multidimensional recoding model and a greedy algo...

183 | Transforming data to satisfy privacy constraints
- Iyengar
- 2002
Citation Context ...pertains. However, subject to this constraint, it is important that the released data remain as “useful” as possible. Numerous recoding models have been proposed in the literature for k-anonymization [8, 9, 13, 17, 10], and often the “quality” of the published data is dictated by the model that is used. The main contributions of this paper are a new multidimensional recoding model and a greedy algo...

122 | Top-down specialization for information and privacy preservation
- Fung, Wang, et al.
- 2005
Citation Context ...l in the worst case. • The greedy multidimensional algorithm often produces higher-quality results than optimal single-dimensional algorithms (as well as the many existing single-dimensional heuristic [6, 14, 16] and stochastic search [8, 18] algorithms). 1.1. Basic Definitions. Quasi-Identifier Attribute Set: A quasi-identifier is a minimal set of attributes $X_1, \dots, X_d$ in table $T$ that can be joined with extern...

95 | Rainforest: A framework for fast decision tree construction of large datasets
- Gehrke, Ganti
- 1998

89 | Toward privacy in public databases
- Chawla, Dwork, et al.
- 2005
Citation Context ...al data cells. Several approximation algorithms have been proposed for the problem of finding the k-anonymization that suppresses the fewest cells [1, 10]. In another related direction, Chawla et al. [4] propose a theoretical framework for privacy in data publishing based on private histograms. This work describes a recursive sanitization algorithm in multidimensional space. However, in their problem...

82 | Bottom-up generalization: A data mining solution to privacy protection
- Wang, Yu, et al.
- 2004
Citation Context ...l in the worst case. • The greedy multidimensional algorithm often produces higher-quality results than optimal single-dimensional algorithms (as well as the many existing single-dimensional heuristic [6, 14, 16] and stochastic search [8, 18] algorithms). 1.1. Basic Definitions. Quasi-Identifier Attribute Set: A quasi-identifier is a minimal set of attributes $X_1, \dots, X_d$ in table $T$ that can be joined with extern...

81 | Anonymizing tables
- Aggarwal, Feder, et al.
- 2005
Citation Context ... (total records / total equiv classes) / (k). 1.3. Paper Overview. The first contribution of this paper is a new multidimensional model for k-anonymization (Section 2). Like previous optimal k-anonymity problems [1, 10], optimal multidimensional k-anonymization is NP-hard. However, for numeric data, we find that under reasonable assumptions the worst-case maximum size of equivalence classes is O(k) in the multidimen...

42 | On rectangular partitionings in two dimensions: Algorithms, complexity, and applications
- Muthukrishnan, Poosala, et al.
- 1999

37 | Elements of Statistical Disclosure Control
- Willenborg, de Waal
- 2001
Citation Context ...pertains. However, subject to this constraint, it is important that the released data remain as “useful” as possible. Numerous recoding models have been proposed in the literature for k-anonymization [8, 9, 13, 17, 10]. Often

Voter Registration Data

| Name | Age | Sex | Zipcode |
|---|---|---|---|
| Ahmed | 25 | Male | 53711 |
| Brooke | 28 | Female | 55410 |
| Casey | 31 | Female | 90210 |
| Dave | 19 | Male | 02174 |
| Evelyn | 40 | Female | 02237 |

Patient Data (columns: Age, Sex, Zipcode, Disease): 25 ...

16 | Using simulated annealing for k-anonymity
- Winkler
- 2002
Citation Context ...multidimensional algorithm often produces higher-quality results than optimal single-dimensional algorithms (as well as the many existing single-dimensional heuristic [6, 14, 16] and stochastic search [8, 18] algorithms). 1.1. Basic Definitions. Quasi-Identifier Attribute Set: A quasi-identifier is a minimal set of attributes $X_1, \dots, X_d$ in table $T$ that can be joined with external information to re-identify ...

4 | Using simulated annealing for k-anonymity. Research Report 2002-07
- Winkler
- 2002
Citation Context ...hm often produces better quality results than optimal single-dimensional algorithms, thus producing better results than the many existing single-dimensional heuristic [6, 14, 16] and stochastic search [8, 18] algorithms. 1.1. Basic Definitions. Quasi-Identifier Attribute Set: A quasi-identifier is a minimal set of attributes $X_1, \dots, X_d$ in table $T$ that can be joined with external information to re-identify i...