Pattern-Preserving k-Anonymization of Sequences and its Application to Mobility Data Mining
Cited by 10 (2 self)
Abstract. Sequential pattern mining is a major research field in knowledge discovery and data mining. Thanks to the increasing availability of transaction data, it is now possible to provide new and improved services based on users’ and customers’ behavior. However, this puts the citizen’s privacy at risk. Thus, it is important to develop new privacy-preserving data mining techniques that do not significantly alter the analysis results. In this paper we propose a new approach for anonymizing sequential data by hiding infrequent, and thus potentially sensitive, subsequences. Our approach guarantees that the disclosed data are k-anonymous and preserve the quality of the extracted patterns. An application to a real-world moving object database is presented, which shows the effectiveness of our approach in complex contexts as well.
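The idea of treating infrequent subsequences as potentially sensitive can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the function names, the toy database, and the simple support threshold `k` are assumptions made only for illustration.

```python
from typing import List, Sequence, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(pattern: Sequence[str], database: List[Sequence[str]]) -> int:
    """Number of sequences in the database that contain the pattern."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


def split_by_k(patterns: List[Pattern], database: List[Sequence[str]], k: int):
    """Separate patterns supported by at least k sequences from the rest."""
    kept, hidden = [], []
    for p in patterns:
        (kept if support(p, database) >= k else hidden).append(p)
    return kept, hidden


# Hypothetical toy database of location sequences.
toy_db = [("A", "B", "C"), ("A", "C"), ("A", "B", "C", "D"), ("B", "D")]
candidates = [("A", "C"), ("B", "D"), ("C", "D")]
kept, hidden = split_by_k(candidates, toy_db, k=2)
print("kept:", kept)      # patterns shared by at least k sequences
print("hidden:", hidden)  # infrequent patterns, candidates for hiding
```

In this toy run, only the pattern supported by a single sequence falls below the threshold; a real anonymization method would additionally have to transform the data so that such patterns no longer appear.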
Privacy Preserving Publication of Moving Object Data
Cited by 1 (1 self)
Abstract. The increasing availability of space-time trajectories left by location-aware devices is expected to enable novel classes of applications where the discovery of consumable, concise, and actionable knowledge is the key step. However, the analysis of mobility data is a critical task from the privacy point of view: the peculiar nature of location data might enable intrusive inferences about the lives of the individuals whose data are analyzed. It is thus important to develop privacy-preserving techniques for the publication and the analysis of mobility data. This chapter provides a brief survey of the research on anonymity-preserving data publishing of moving objects databases. While only a few papers so far have tackled the problem of anonymity in the off-line case of publication of a moving objects database, a rather large body of work has been developed for anonymity on relational data on one side, and for location privacy in the on-line, dynamic context of location-based services (LBS) on the other side. In this chapter we first briefly review the basic concepts of k-anonymity on relational data. Then we focus on the body of research about privacy in LBS: we try to identify some useful concepts for our static context, while highlighting the differences and discussing the inapplicability of some of the LBS solutions to the static case. Next we present in detail some of the papers that have recently attacked the problem of moving objects anonymization in the static context. We discuss in detail the problems addressed and the solutions proposed, highlighting the merits and limits of each work, as well as the various problems still open.
Concealing Sequential and Spatiotemporal Patterns using Polynomial Sanitization
The discovery of relevant patterns present in a database was identified early on as a threat to database protection. Over time, various approaches for hiding knowledge have emerged, mainly focused on association rules and frequent itemset mining. This paper views the problem from a different angle: knowledge hiding in contexts where both the data and the extracted knowledge have a sequential structure. The sequential pattern hiding problem is observed to be NP-hard. A polynomial sanitization algorithm was adopted and applied to spatiotemporal patterns extracted from moving objects databases. Disseminating datasets of this kind presents a considerable opportunity for discovering knowledge patterns of interest. The developed model is subjected to an attack that exploits knowledge of the underlying road network.
On the use of aggregation operators for location privacy
- IFSA-EUSFLAT 2009, 2009
Nowadays, the management of sequential and temporal data is an increasing need in many data mining processes. Therefore, the development of new privacy-preserving data mining techniques for sequential data is crucial to ensure that sequence data analysis is performed without disclosing sensitive information. Although data analysis and protection are very different processes, they share a few common components, such as similarity measurement. In this paper we propose a new similarity function for categorical sequences of events based on OWA operators and fuzzy quantifiers. The main advantage of this new similarity function is the possibility of incorporating user preferences into the similarity computation. We describe the implications of applying different user preference policies in the similarity measurement when microaggregation, a well-known data protection method, is applied to sequential data.
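As background for the OWA-based similarity mentioned above, here is a minimal Python sketch of a generic ordered weighted averaging (OWA) operator whose weights are derived from a fuzzy quantifier. It is not the similarity function proposed in the paper: the quantifier Q(r) = r^alpha, the function names, and the per-event match scores are assumptions chosen only to show how a preference policy changes the aggregate.

```python
from typing import List


def quantifier_weights(n: int, alpha: float = 1.0) -> List[float]:
    """OWA weights w_i = Q(i/n) - Q((i-1)/n) for the assumed quantifier Q(r) = r**alpha."""
    def q(r: float) -> float:
        return r ** alpha
    return [q(i / n) - q((i - 1) / n) for i in range(1, n + 1)]


def owa(values: List[float], weights: List[float]) -> float:
    """Ordered weighted average: weights apply to the values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical per-event match scores between two categorical sequences
# (1.0 = events agree, 0.0 = events differ).
scores = [1.0, 0.0, 1.0, 1.0]

neutral = owa(scores, quantifier_weights(len(scores), alpha=1.0))  # plain mean
strict = owa(scores, quantifier_weights(len(scores), alpha=2.0))   # stresses the worst matches
print(neutral, strict)
```

With alpha = 1 the operator reduces to the arithmetic mean, while alpha > 1 shifts weight toward the lowest scores, so a single mismatch lowers the similarity more strongly.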
Hiding Co-Occurring Frequent Itemsets
Knowledge hiding, that is, hiding rules/patterns that are inferable from published data and deemed sensitive, is extensively studied in the literature in the context of frequent itemset and association rule mining from transactional data. The research in this thread is focused mainly on developing sophisticated methods that achieve less distortion in data quality. With this work, we extend frequent itemset hiding to the co-occurring frequent itemset hiding problem. Co-occurring frequent itemsets are those itemsets that co-exist in the output of frequent itemset mining. What differs from classical frequent itemset hiding is the new sensitivity definition: a set of itemsets is sensitive if its itemsets appear all together within the frequent itemset mining results. In other words, co-occurrence is defined with reference to the mining results rather than to the raw input dataset, and thus it is a kind of meta-knowledge. Our notion of co-occurrence is also very different from association rules: the itemsets in an association rule need to be frequently present in the same set of transactions, whereas co-occurrence does not require joint occurrence in the same set of transactions. In this paper, we briefly review the frequent itemset/association rule hiding problems and define co-occurrence hiding along with its real-world motivations. We explore its fundamental properties and show that frequent itemset hiding is a special case of co-occurring frequent itemset hiding. As a solution, we propose a two-stage sanitization framework, essentially a reduction, where an instance of frequent itemset hiding is constructed in the first stage and the instance is solved in the second stage. Since the task is shown to be NP-hard and the reduction is one-to-many, we propose heuristics only for the first stage, as the second stage is a well-established field. Finally, an experimental evaluation is carried out on a couple of datasets, and the results are presented.
O. Abul is fully supported by TUBITAK under the grant number
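The sensitivity definition in the abstract above can be made concrete with a small Python sketch. It only checks whether a set of itemsets co-occurs in the mining output and does not implement the two-stage sanitization framework; the brute-force miner, the function names, and the toy transactions are assumptions for illustration.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set


def frequent_itemsets(transactions: List[Set[str]], min_support: int) -> Set[FrozenSet[str]]:
    """Brute-force frequent itemset mining (only suitable for tiny toy data)."""
    items = sorted(set().union(*transactions))
    frequent = set()
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                frequent.add(itemset)
    return frequent


def cooccur_in_results(sensitive: Iterable[Iterable[str]], frequent: Set[FrozenSet[str]]) -> bool:
    """True if every itemset of the sensitive set appears in the mining results."""
    return all(frozenset(s) in frequent for s in sensitive)


# Hypothetical toy transactions: {a, b} and {c, d} never share a transaction.
transactions = [{"a", "b"}, {"a", "b"}, {"c", "d"}, {"c", "d"}, {"a", "c"}]
frequent = frequent_itemsets(transactions, min_support=2)

print(cooccur_in_results([{"a", "b"}, {"c", "d"}], frequent))  # both are frequent
print(cooccur_in_results([{"a", "c"}], frequent))              # singleton set: classical case
```

In the toy data, {a, b} and {c, d} never appear in the same transaction, yet both are frequent and therefore co-occur in the results. A sensitive set containing a single itemset corresponds to the classical frequent itemset hiding case.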
TRI-Patternization on Generic Visualized Time Series Data Sathiya.M
In recent years, privacy-preserving techniques for time-series data have been actively studied in fields such as finance, medicine, and weather analysis. We focus on preserving the data through anonymity and generalization techniques. We first investigate which privacy requirements apply to time-series data; after identifying the data that need to be preserved, various perturbation approaches were identified and worked out in the direction of secure multi-party computation (SMC) and encryption techniques for distributed computing. Our project focuses on a generalization technique in which the data are filtered or generalized into a grouped structure by a time-series grouping algorithm and shown in approximate form, so that the exact data are not disclosed. The second technique displays the data in graphical form, providing no clue about the exact values, and the approximation technique incorporates exact preservation of the data. The third technique arranges the data in a binary tree pattern, which provides an efficient way of ordering the data on a performance basis. The proposed system incorporates all the necessary features; in addition, we are trying to incorporate security by adding a deformable/detectable noise to the time-series data.