From data mining to knowledge discovery in databases
AI Magazine, 1996
Cited by 215 (0 self)
Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. What is all the excitement about? This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery in databases are related both to each other and to related fields, such as machine learning, statistics, and databases. The article mentions particular real-world applications, specific data-mining techniques, challenges involved in real-world applications of knowledge discovery, and current and future research directions in the field. Across a wide variety of fields, data are
Knowledge Discovery and Data Mining: Towards a Unifying Framework
1996
Cited by 108 (0 self)
This paper presents a first step towards a unifying framework for Knowledge Discovery in Databases. We describe links between data mining, knowledge discovery, and other related fields. We then define the KDD process and basic data mining algorithms, discuss application issues and conclude with an analysis of challenges facing practitioners in the field.
Keywords: Knowledge Discovery in Databases (KDD), data mining, overview article, large databases, automated analysis, issues and challenges in data mining.
To appear: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), Portland, Oregon, August 2-4, 1996, AAAI Press. http://wwwaig. jpl.nasa.gov/kdd96
Authors: Usama Fayyad (Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA, fayyad@microsoft.com); Gregory Piatetsky-Shapiro (GTE Laboratories, MS 44, Waltham, MA 02154, USA, gps@gte.com); Padhraic Smyth (Information and Computer S...
Using General Impressions to Analyze Discovered Classification Rules
1997
Cited by 79 (13 self)
One of the important problems in data mining is the evaluation of the subjective interestingness of the discovered rules. Past research has found that in many real-life applications it is easy to generate a large number of rules from the database, but most of the rules are not useful or interesting to the user. Due to the large number of rules, it is difficult for the user to analyze them manually in order to identify the interesting ones. Whether a rule is of interest to a user depends on his/her existing knowledge of the domain and his/her interests. In this paper, we propose a technique that analyzes the discovered rules against a specific type of existing knowledge, which we call general impressions, to help the user identify interesting rules. We first propose a representation language to allow general impressions to be specified. We then present some algorithms to analyze the discovered classification rules against a set of general impressions. The results of the analysis tell us ...
Post-Analysis of Learned Rules
Cited by 57 (10 self)
Rule induction research implicitly assumes that after producing the rules from a dataset, these rules will be used directly by an expert system or a human user. In real-life applications, the situation may not be as simple as that, particularly when the user of the rules is a human being. The human user almost always has some previous concepts or knowledge about the domain represented by the dataset. Naturally, he/she wishes to know how the new rules compare with his/her existing knowledge. In dynamic domains where the rules may change over time, it is important to know what the changes are. These aspects of research have largely been ignored in the past. With the increasing use of machine learning techniques in practical applications such as data mining, this issue of post-analysis of rules warrants greater emphasis and attention. In this paper, we propose a technique to deal with this problem. A system has been implemented to perform the post-analysis of classificat...
Spatial Data Mining: Progress and Challenges
SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD), 1996
Cited by 47 (0 self)
Spatial data mining, i.e., mining knowledge from large amounts of spatial data, is a highly demanding field because huge amounts of spatial data have been collected in various applications, ranging from remote sensing to geographical information systems (GIS), computer cartography, environmental assessment and planning, etc. The collected data far exceed humans' ability to analyze them. Recent studies on data mining have extended the scope of data mining from relational and transactional databases to spatial databases. This paper summarizes recent work on spatial data mining, from spatial data generalization to spatial data clustering, mining spatial association rules, etc. It shows that spatial data mining is a promising field, with fruitful research results and many challenging issues.
Knowledge discovery and interestingness measures: A survey
1999
Cited by 44 (1 self)
Knowledge discovery in databases, also known as data mining, is the efficient discovery of previously unknown, valid, novel, potentially useful, and understandable patterns in large databases. It encompasses many different techniques and algorithms which differ in the kinds of data that can be analyzed and the form of knowledge representation used to convey the discovered knowledge. An important problem in the area of data mining is the development of effective measures of interestingness for ranking the discovered knowledge. In this report, we provide a general overview of the more successful and widely known data mining techniques and algorithms, and survey seventeen interestingness measures from the literature that have been successfully employed in data mining applications.
Analyzing the Subjective Interestingness of Association Rules
2000
Cited by 35 (1 self)
Association rules are a class of important regularities in databases. They are found to be very useful in practical applications. However, association rule mining algorithms tend to produce a huge number of rules, most of which are of no interest to the user. Due to the large number of rules, it is very difficult for the user to analyze them manually in order to identify those truly interesting ones. In this paper, we propose a new approach to assist the user in finding interesting rules (in particular, unexpected rules) from a set of discovered association rules. This technique is characterized by analyzing the discovered association rules using the user's existing knowledge about the domain and then ranking the discovered rules according to various interestingness criteria, e.g., conformity and various types of unexpectedness. This technique has been implemented and successfully used in a number of applications.
Keywords: subjective interestingness, association rules, interestingness analysis in data mining.
Finding Interesting Patterns Using User Expectations
IEEE Transactions on Knowledge and Data Engineering, 1996
Cited by 30 (4 self)
One of the important issues in data mining is the interestingness problem. This problem is described as finding the interesting patterns from a large number of discovered patterns. Typically, in a data mining application, it is all too easy to discover a huge number of patterns. Most of these patterns are actually useless or uninteresting to the user. But due to the huge number of patterns, it is difficult for a user to comprehend all the patterns and to identify those interesting to him/her. To prevent the user from being overwhelmed by the large number of patterns, techniques are needed to analyze and to rank the patterns according to their interestingness. This paper proposes such a technique. It performs post-analysis of the discovered patterns to help the user identify those interesting ones. The technique is based on fuzzy matching of the discovered patterns with a set of user-specified patterns. The degrees of match are then used to rank the discovered patterns according to vari...
An analysis of quantitative measures associated with rules
Proceedings of PAKDD’99, 1999
Cited by 29 (22 self)
In this paper, we analyze quantitative measures associated with if-then type rules. Basic quantities are identified and many existing measures are examined using the basic quantities. The main objective is to provide a synthesis of existing results in a simple and unified framework. The quantitative measure is viewed as a multi-facet concept, representing the confidence, uncertainty, applicability, quality, accuracy, and interestingness of rules. Roughly, they may be classified as representing one-way and two-way supports.
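As a rough illustration of the kind of quantities such an analysis works with, the sketch below computes three commonly used measures for a rule "if A then B" from simple co-occurrence counts. The choice of support, confidence, and lift, along with every name in the code, is an assumption made for illustration; the abstract does not say which measures the paper examines or how it defines its basic quantities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleCounts:
    """Hypothetical container for the counts behind a rule 'if A then B'."""
    n: int     # total number of records
    n_a: int   # records satisfying the antecedent A
    n_b: int   # records satisfying the consequent B
    n_ab: int  # records satisfying both A and B


def support(c: RuleCounts) -> float:
    # Fraction of all records in which A and B occur together.
    return c.n_ab / c.n


def confidence(c: RuleCounts) -> float:
    # Estimated P(B | A): how often B holds when A holds.
    return c.n_ab / c.n_a


def lift(c: RuleCounts) -> float:
    # Ratio of P(B | A) to P(B); values above 1 suggest positive association.
    return confidence(c) / (c.n_b / c.n)


# Illustrative counts only, not taken from any dataset.
counts = RuleCounts(n=1000, n_a=200, n_b=500, n_ab=150)
print(support(counts), confidence(counts), lift(counts))
```

Confidence here depends only on how B behaves given A, whereas lift also compares against the base rate of B, which is one informal way to see why different measures can rank the same rules differently.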
On Objective Measures of Rule Surprisingness
Proceedings of the Second European Conference on the Principles of Data Mining and Knowledge Discovery (PKDD'98), 1998
Cited by 26 (4 self)
Most of the literature argues that surprisingness is an inherently subjective aspect of the discovered knowledge, which cannot be measured in objective terms. This paper departs from this view, and it has a twofold goal: (1) showing that it is indeed possible to define objective (rather than subjective) measures of discovered rule surprisingness; (2) proposing new ideas and methods for defining objective rule surprisingness measures.
1 Introduction
A crucial aspect of data mining is that the discovered knowledge (usually expressed in the form of "if-then" rules) should be somehow interesting, where the term interestingness is arguably related to the properties of surprisingness (unexpectedness), usefulness and novelty of the rule [Fayyad et al. 96]. In this paper we are interested in quantitative, objective measures of one of the above three properties, namely rule surprisingness. In general, the evaluation of the interestingness of discovered rules has both an objective (data-driv...

