Security-control methods for statistical databases: a comparative study
- ACM Computing Surveys, 1989
- Cited by 264 (0 self)
This paper considers the problem of providing security to statistical databases against disclosure of confidential information. Security-control methods suggested in the literature are classified into four general approaches: conceptual, query restriction, data perturbation, and output perturbation. Criteria for evaluating the performance of the various security-control methods are identified. Security-control methods that are based on each of the four approaches are discussed, together with their performance with respect to the identified evaluation criteria. A detailed comparative analysis of the most promising methods for protecting dynamic-online statistical databases is also presented. To date no single security-control method prevents both exact and partial disclosures. There are, however, a few perturbation-based methods that prevent exact disclosure and enable the database administrator to exercise “statistical disclosure control.” Some of these methods, however, introduce bias into query responses or suffer from the 0/1 query-set-size problem (i.e., partial disclosure is possible in the case of a null query set or a query set of size 1). We recommend directing future research efforts toward developing new methods that prevent exact disclosure and provide statistical-disclosure control, while at the same time not suffering from the bias problem and the 0/1 query-set-size problem. Furthermore, efforts directed toward developing a bias-correction mechanism and solving the general problem of small query-set-size would help salvage a few of the current perturbation-based methods.
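The query-restriction approach and the small query-set problem mentioned in the abstract above (a null query set or a query set of size 1) can be illustrated with a minimal sketch. It is not taken from the survey: the table, the field names, and the `min_size` threshold are hypothetical, and only a lower bound on the query-set size is enforced.

```python
from statistics import mean

# Hypothetical toy table; names and values are made up for illustration.
RECORDS = [
    {"name": "A", "dept": "sales", "salary": 50},
    {"name": "B", "dept": "sales", "salary": 60},
    {"name": "C", "dept": "sales", "salary": 70},
    {"name": "D", "dept": "legal", "salary": 90},
]


def restricted_mean(records, predicate, field, min_size=2):
    """Return the mean of `field` over the matching records.

    The query is refused (None is returned) when fewer than `min_size`
    records match, so an empty query set or a query set of size 1
    never produces an answer.
    """
    query_set = [r[field] for r in records if predicate(r)]
    if len(query_set) < min_size:
        return None  # refuse: answering would expose a single record
    return mean(query_set)
```

With this table, a mean over the "sales" department is answered, while a mean over "legal" is refused because it would return one person's salary exactly. A lower bound alone is known to be insufficient against combinations of overlapping queries, which is one reason the abstract reports that no single method prevents all disclosures.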
Computational Disclosure Control - A Primer on Data Privacy Protection
- Massachusetts Institute of Technology, 2001
- Cited by 23 (2 self)
Today's globally networked society places great demand on the dissemination and sharing of person-specific data for many new and exciting uses. Even situations where aggregate statistical information was once the reporting norm now rely heavily on the transfer of microscopically detailed transaction and encounter information. This happens at a time when more and more historically public information is also electronically available. When these data are linked together, they provide an electronic shadow of a person or organization that is as identifying and personal as a fingerprint even when the information contains no explicit identifiers, such as name and phone number. Other distinctive data, such as birth date and ZIP code, often combine uniquely and can be linked to publicly available information to re-identify individuals. Producing anonymous data that remain specific enough to be useful is often a very difficult task, and practice today tends either to assume incorrectly that confidentiality is maintained when it is not or to produce data that are practically useless. The goal of the work presented in this book is to explore computational techniques for releasing useful information in such a way that the identity of any individual or entity contained in data cannot be recognized while the data remain practically useful. I begin by demonstrating ways to learn information about entities from publicly available information. I then provide a formal framework for reasoning about disclosure control and the ability to infer the identities of entities contained within the data. I formally define and present null-map, k-map and wrong-map as models of protection. Each model provides protection by ensuring that released information maps to no, k or incorrect entities, respectively. The book ends by examining four computational systems that attempt to maintain privacy while releasing electronic information.
These systems are: (1) my Scrub System, which locates personally-identifying information in letters between doctors and notes written by clinicians; (2) my Datafly II System, which generalizes and suppresses values in field-structured data sets; (3) Statistics Netherlands' m-Argus System, which is becoming a European standard for producing public-use data; and, (4) my k-Similar algorithm, which finds optimal solutions such that data are minimally distorted while still providing adequate protection. By introducing anonymity and quality metrics, I show that Datafly II can overprotect data, Scrub and m-Argus can fail to provide adequate protection, but k-Similar finds optimal results.
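The observation that birth date and ZIP code often combine uniquely, and the idea of generalizing and suppressing values so that each released record matches several entities, can be sketched as follows. This is only an illustration under stated assumptions: the rows, the `generalize` rule (year of birth, three-digit ZIP prefix), and the threshold `k` are hypothetical, and the code is not Datafly II, m-Argus, or k-Similar.

```python
from collections import Counter

# Hypothetical records: (birth_date, zip_code). Values are invented.
ROWS = [
    ("1970-03-12", "02138"),
    ("1970-07-30", "02139"),
    ("1970-11-02", "02139"),
    ("1982-01-15", "02141"),
]


def generalize(row):
    """Coarsen a row: keep only the birth year and a 3-digit ZIP prefix."""
    birth_date, zip_code = row
    return (birth_date[:4], zip_code[:3] + "**")


def smallest_group(rows):
    """Size of the smallest set of rows sharing identical values."""
    return min(Counter(rows).values())


def release(rows, k):
    """Generalize every row, then suppress rows whose group is smaller than k."""
    generalized = [generalize(r) for r in rows]
    counts = Counter(generalized)
    return [r for r in generalized if counts[r] >= k]
```

Before generalization every row is unique (`smallest_group(ROWS)` is 1). After generalization the three 1970 rows become indistinguishable, and `release(ROWS, 2)` suppresses the single 1982 row. The sketch also shows the trade-off the abstract describes: coarser values protect more but make the data less useful.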
Data mining, national security, privacy and civil liberties
- SIGKDD Explorations, 2002
"... (On Leave from the MITRE Corporation, ..."
Maximizing Sharing of Protected Information
- 2002
- Cited by 10 (7 self)
... In this paper we address the problem of classifying information by enforcing explicit data classification as well as inference and association constraints. We formulate the problem of determining a classification that ensures satisfaction of the constraints, while at the same time guaranteeing that information will not be overclassified. We present an approach to the solution of this problem and give an algorithm implementing it which is linear in simple cases, and quadratic in the general case. We also analyze a variant of the problem that is NP-complete.
Specification and Enforcement of Classification and Inference Constraints
- IEEE Symposium on Security and Privacy, 1999
- Cited by 9 (3 self)
Although mandatory access control in database systems has been extensively studied in recent years, and several models and systems have been proposed, capabilities for enforcement of mandatory constraints remain limited. Lack of support for expressing and combating inference channels that improperly leak protected information remains a major limitation in today’s multilevel systems. Moreover, the working assumption that data are classified at insertion time makes previous approaches inapplicable to the classification of existing, possibly historical, data repositories that need to be classified for release. Such a capability would be of great benefit to, and appears to be in demand by, governmental, public, and private institutions. We address the problem of classifying existing data
A Practical Formalism for Imprecise Inference Control
- Proceedings of the 8th IFIP WG11.3 Workshop on Database Security, 1994
- Cited by 8 (4 self)
This paper describes a powerful, yet practical, formalism for modeling and controlling imprecise FD-based inference in relational database systems. The formalism provides a canonical representation of inference which unifies precise inference and the primitive imprecise inference mechanisms of abduction and partial deduction. Whereas other imprecise (partial) inference models estimate the probability of making inferences, the formalism supports the analysis of the actual imprecise values inferred in a database extension. Imprecise inference is analyzed by transforming a precise database augmented with additional "catalytic" relations, conveying possibly imprecise a priori knowledge, into an equivalent imprecise database. The analysis of imprecise inference and the related inference control methodology are highly flexible and robust. They can be directly applied to classical, MLS, and imprecise databases. With minimal modifications, they also can be used in knowledge discovery or databa...
Structural Signatures for Tree Data Structures
- 2008
- Cited by 7 (3 self)
Data sharing with multiple parties over a third-party distribution framework requires that both data integrity and confidentiality be assured. One of the most widely used data organization structures is the tree structure. When such structures encode sensitive information (such as in XML documents), it is crucial that integrity and confidentiality be assured not only for the content, but also for the structure. Digital signature schemes are commonly used to authenticate the integrity of the data. The most widely used such technique for tree structures is the Merkle hash technique, which however is known to be “not hiding”, thus leading to unauthorized leakage of information. Most techniques in the literature are based on the Merkle hash technique and thus suffer from the problem of unauthorized information leakages. Assurance of integrity and confidentiality (no leakages) of tree-structured data is an important problem in the context of secure data publishing and content distribution systems. In this paper, we propose a signature scheme for tree structures, which assures both confidentiality and integrity and is also efficient, especially in third-party distribution environments. Our integrity assurance technique, which we refer to as the “Structural signature scheme”, is based on the structure of the tree as defined by tree traversals (pre-order, post-order, in-order) and is defined using a randomized notion of such traversal numbers. In addition to formally defining the technique, we prove that it protects against violations of content and structural integrity and information leakages. We also show through complexity and performance analysis that the structural signature scheme is efficient; with respect to the Merkle hash technique, it incurs comparable cost for signing the trees and incurs lower cost for user-side integrity verification.
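The leakage attributed to the Merkle hash technique in the abstract above can be made concrete with a small sketch. It is a generic Merkle hash over a tree, not the structural signature scheme the paper proposes; the tree encoding `(content, [children])`, the sample tree, and the function names are assumptions made for illustration, and a plain hash stands in for the signed value.

```python
import hashlib


def node_hash(content, child_hashes):
    """Hash a node's content together with its children's hashes, in order."""
    h = hashlib.sha256()
    h.update(hashlib.sha256(content.encode("utf-8")).digest())
    for child_hash in child_hashes:
        h.update(child_hash)
    return h.digest()


def merkle_hash(node):
    """Recursively hash a tree given as (content, [children])."""
    content, children = node
    return node_hash(content, [merkle_hash(child) for child in children])


# Hypothetical document tree: the user may read "public" but not "secret".
TREE = ("record", [("public", []), ("secret", [])])
ROOT_HASH = merkle_hash(TREE)  # stands in for the value the owner would sign


def verify_public_part(root_content, public_subtree, later_sibling_hashes):
    """Recompute the root hash from the visible subtree plus sibling hashes."""
    child_hashes = [merkle_hash(public_subtree)] + later_sibling_hashes
    return node_hash(root_content, child_hashes) == ROOT_HASH
```

Verification only succeeds when the user is also handed the hash of the hidden sibling, for example `verify_public_part("record", ("public", []), [merkle_hash(("secret", []))])`. Receiving that hash tells the user that a hidden sibling exists and where it sits in the tree, which is the kind of structural information a "not hiding" scheme gives away.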
Multilevel Secure Rules: Integrating the Multilevel Secure and Active Data Models
- Proceedings of the 6th IFIP WG11.3 Working Conference on Database Security, 1992
- Cited by 6 (3 self)
Traditional database security is made more complex by the addition of rules to the data model. The security policy must control access privileges and accessibility for rule descriptions, executing rules, and database transitions (events). In this paper we extend the multilevel secure relational model to capture the functionality required of an active database, i.e., a database with production rules, able to respond to events. Database rules and events are given explicit security classifications by introducing multilevel secure relations for each. Database rule descriptions are treated as MLS objects. All new user-definable active components (rule actions, trigger detection daemons) conform to mandatory security constraints for subjects. An execution algorithm is given which employs cascading transactions to hide secure rule processing. Implications for implementing the new active functionality in an MLS relational database are also discussed. Keyword Codes: H.2.4; I.2.4; K.6.5 Keyw...
Analyzing FD Inference in Relational Databases
- Data and Knowledge Engineering, 1996
- Cited by 5 (0 self)
Imprecise inference models the ability to infer sets of values or information chunks. Imprecise database inference is just as important as precise inference. In fact, it is more prevalent than its precise counterpart even in precise databases. Analyzing the extent of imprecise inference is important in knowledge discovery and database security. Imprecise inference analysis can be used to "mine" rule-based knowledge from database data. In database security, imprecise inference analysis can help determine whether or not a system is safe from imprecise inference attacks. This paper deals with the general problem of analyzing fuzzy inference based on functional dependencies (FDs) in database relations. Fuzzy inference, the ability to infer fuzzy set values, generalizes imprecise (set-valued) inference and precise inference. Likewise, fuzzy relational databases generalize their classical and imprecise counterparts by supporting fuzzy information storage and retrieval. Inference analysis is p...
Minimal Data Upgrading to Prevent Inference and Association Attacks
- Proc. of the 18th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), 1999
- Cited by 5 (4 self)
Despite advances in recent years in the area of mandatory access control in database systems, today's information repositories remain vulnerable to inference and data association attacks that can result in serious information leakage. Such information leakage can be prevented by properly classifying information according to constraints that express relationships among the security levels of data objects. In this paper we address the problem of classifying information by enforcing explicit data classification as well as inference and association constraints. We formulate the problem of determining a classification that ensures satisfaction of the constraints, while at the same time guaranteeing that information will not be unnecessarily overclassified. We present an approach to the solution of this problem and give an algorithm implementing it which is linear in simple cases, and low-order polynomial (n²) in the general case. We also analyze a variation of the problem which is NP-hard.
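The goal of a classification that satisfies the constraints without unnecessary overclassification can be illustrated for a deliberately simplified case. The sketch below is not the paper's algorithm: it assumes totally ordered integer levels (0 is the lowest), and only two hypothetical constraint forms, an explicit minimum level per item and "x must be classified at least as high as y". The inference and association constraints treated in the paper are more general and are not modeled here.

```python
from collections import defaultdict, deque


def minimal_levels(items, explicit, dominates):
    """Compute the lowest levels that satisfy simple lower-bound constraints.

    items:     iterable of item names
    explicit:  dict mapping an item to its required minimum level
    dominates: list of (x, y) pairs meaning level[x] >= level[y]
    """
    items = list(items)
    # Start every item at the lowest level, then apply explicit minimums.
    level = {item: explicit.get(item, 0) for item in items}

    # For each y, remember which items must be at least as high as y.
    dependents = defaultdict(list)
    for x, y in dominates:
        dependents[y].append(x)

    # Propagate: raise an item only when a constraint forces it.
    queue = deque(items)
    while queue:
        y = queue.popleft()
        for x in dependents[y]:
            if level[x] < level[y]:
                level[x] = level[y]
                queue.append(x)
    return level
```

For example, with `explicit={"salary": 2}` and `dominates=[("bonus", "salary"), ("dept", "name")]`, only `salary` and `bonus` end up at level 2, while `name` and `dept` stay at 0. Because levels are raised only when a constraint forces it, no item is classified higher than this restricted constraint set requires.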

