Results 1 - 9 of 9
From data mining to knowledge discovery in databases
- AI Magazine, 1996
Cited by 215 (0 self)
Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. What is all the excitement about? This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery in databases are related both to each other and to related fields, such as machine learning, statistics, and databases. The article mentions particular real-world applications, specific data-mining techniques, challenges involved in real-world applications of knowledge discovery, and current and future research directions in the field. Across a wide variety of fields, data are ...
Discovery of Relational Association Rules
- Relational data mining, 2000
Cited by 25 (0 self)
Within KDD, the discovery of frequent patterns has been studied in a variety of settings. In its simplest form, known from association rule mining, the task is to discover all frequent item sets, i.e., all combinations of items that are found in a sufficient number of examples.
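The task defined in this abstract, finding every combination of items that occurs in a sufficient number of examples, can be illustrated with a minimal sketch. The function name `frequent_item_sets`, the `min_support` parameter, and the shopping-basket data are assumptions made for illustration; this is a plain level-wise search, not the relational algorithm the chapter itself develops.

```python
from itertools import combinations


def frequent_item_sets(examples, min_support):
    """Return {item set: count} for sets found in at least min_support examples."""
    examples = [frozenset(e) for e in examples]
    items = sorted(set().union(*examples)) if examples else []
    frequent = {}
    size = 1
    # Level-wise search: candidates of size k use only items that still
    # appear in some frequent set of size k - 1.
    while items:
        level = {}
        for candidate in combinations(items, size):
            count = sum(1 for e in examples if e.issuperset(candidate))
            if count >= min_support:
                level[frozenset(candidate)] = count
        if not level:
            break
        frequent.update(level)
        items = sorted(set().union(*level))
        size += 1
    return frequent


# Hypothetical examples: each one is the set of items in a single basket.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = frequent_item_sets(baskets, min_support=2)
```

Here `min_support` is an absolute count of examples; a relative threshold would divide each count by `len(examples)` before comparing.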
Propositionalization-based relational subgroup discovery with RSD
- Machine Learning, 2006
Cited by 16 (3 self)
Relational rule learning algorithms are typically designed to construct classification and prediction rules. However, relational rule learning can also be adapted to subgroup discovery. This paper proposes a propositionalization approach to relational subgroup discovery, achieved through appropriately adapting rule learning and first-order feature construction. The proposed approach was successfully applied to standard ILP problems (East-West trains, King-Rook-King chess endgame and mutagenicity prediction) and two real-life problems (analysis of telephone calls and traffic accident analysis).
When distribution is part of the semantics: A new problem class for distributed knowledge discovery
- In Ubiquitous Data Mining for Mobile and Distributed Environments Workshop associated with the Joint 12th European Conference on Machine Learning (ECML’01) and 5th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD’01), 2001
Cited by 14 (1 self)
Within a research project at DaimlerChrysler we use vehicles as mobile data sources for distributed knowledge discovery. We realized that current approaches are not suitable for our purposes. They aim to infer a global model and try to approximate the results one would get from a single joined data source. Thus, they treat distribution as a technical issue only and ignore that the distribution itself may have a meaning and that models depend on the context in which they were derived. The main contribution of this paper is the identification of a practically relevant new problem class for distributed knowledge discovery which addresses the semantics of distribution. We show that this problem class is the proper framework for many important applications in which it should become an integral part of the knowledge discovery process, affecting the results as well as the process itself. We outline a novel solution, called Knowledge Discovery from Models, which uses models as primary input and combines content-driven and context-driven analyses. Finally, we discuss challenging research questions raised by the new problem class.
Decision support through subgroup discovery: Three case studies and the lessons learned
- Machine Learning
Cited by 11 (3 self)
This paper presents ways to use subgroup discovery to generate actionable knowledge for decision support. Actionable knowledge is explicit symbolic knowledge, typically presented in the form of rules, that allows the decision maker to recognize some important relations and to perform an appropriate action, such as targeting a direct marketing campaign, or planning a population screening campaign aimed at detecting individuals with high disease risk. Different subgroup discovery approaches are outlined, and their advantages over using standard classification rule learning are discussed. Three case studies, one medical and two in marketing, are used to present the lessons learned in solving problems requiring actionable knowledge generation for decision support. Keywords: data mining, subgroup discovery, decision support, actionability, lessons learned
Interactive exploration of interesting findings in the Telecommunication Network Alarm Sequence Analyzer TASA
- Information and Software Technology, 1999
Cited by 3 (0 self)
In this paper we describe the final version of a knowledge discovery system, Telecommunication Network Alarm Sequence Analyzer (TASA), for analyzing telecommunication network alarm data. The system is based on the discovery of recurrent, temporal patterns of alarms in databases; these patterns, called episode rules, can be used in the construction of real-time alarm correlation systems. Association rules are also used for identifying relationships between alarm properties. TASA uses a methodology for knowledge discovery in databases (KDD) where one first discovers large collections of patterns at once, and then performs interactive retrievals from the collection of patterns. The proposed methodology is well suited to KDD formalisms such as association and episode rules, where large collections of potentially interesting rules can be found efficiently. When searching for the most interesting rules, simple threshold-like restrictions, such as rule frequency and confidence, may be satisfied by a large number of rules. In TASA, this problem can be alleviated by templates and pattern expressions that describe the form of rules that are to be selected or rejected. Using templates, the user can flexibly specify the focus of interest and iteratively refine it. Different versions of TASA have been in prototype use in four telecommunication companies since the beginning of 1995. TASA has been found useful in, e.g., finding long-term, rather frequently occurring dependencies, creating an overview of a short-term alarm sequence, and evaluating the consistency and correctness of the alarm database. © 1999 Elsevier Science B.V. All rights reserved.
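The retrieval step described in this abstract, narrowing a large rule collection first by frequency and confidence thresholds and then by templates that select or reject rules of a given form, might be sketched as follows. The `Rule` fields, the `select_rules` function, and the use of plain Python predicates as templates are all assumptions for illustration; the abstract does not specify TASA's actual template or pattern-expression syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    lhs: frozenset      # alarm types on the left-hand side (hypothetical encoding)
    rhs: str            # alarm type on the right-hand side
    frequency: int      # number of occurrences supporting the rule
    confidence: float   # fraction of lhs occurrences followed by rhs


def select_rules(rules, min_frequency, min_confidence, include=(), exclude=()):
    """Keep rules that pass both thresholds, match some include template
    (if any are given), and match no exclude template."""
    selected = []
    for rule in rules:
        if rule.frequency < min_frequency or rule.confidence < min_confidence:
            continue
        if include and not any(template(rule) for template in include):
            continue
        if any(template(rule) for template in exclude):
            continue
        selected.append(rule)
    return selected


# Hypothetical rule collection discovered in an earlier pass.
rules = [
    Rule(frozenset({"A"}), "link_failure", 40, 0.90),
    Rule(frozenset({"A", "B"}), "link_failure", 5, 0.95),
    Rule(frozenset({"C"}), "high_temp", 60, 0.80),
    Rule(frozenset({"B"}), "link_failure", 30, 0.50),
]

# Thresholds alone leave two rules; a template narrows the focus to one.
by_threshold = select_rules(rules, 10, 0.7)
focus = select_rules(rules, 10, 0.7, include=[lambda r: r.rhs == "link_failure"])
```

Because the rule collection is computed once and only filtered afterwards, the user can change templates and thresholds repeatedly without rerunning the discovery step.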
Relational subgroup discovery for gene expression data mining
- In EMBEC: 3rd IFMBE European Medical & Biological Engineering Conf, 2005
Cited by 1 (0 self)
We propose a methodology for predictive classification from gene expression data, able to combine the robustness of high-dimensional statistical classification methods with the comprehensibility and interpretability of simple logic-based models. We first construct a robust classifier combining contributions of a large number of gene expression values, and then search for compact summarizations of subgroups among genes associated in the classifier with a given class. The subgroups are described by means of relational logic features extracted from publicly available gene annotations. The curse of dimensionality pertaining to the gene expression based classification problem due to the large number of attributes (genes) is turned into an advantage in the secondary subgroup discovery task, as here the original attributes become learning examples.
Keyword-Based Browsing and Analysis of Large Document Sets
Knowledge discovery in databases (KDD) focuses on the computerized exploration of large amounts of data and on the discovery of interesting patterns within them. While most work on KDD has been concerned with structured databases, there has been little work on handling the huge amount of information that is available only in unstructured textual form. This paper describes the KDT system for Knowledge Discovery in Texts. It is built on top of a text-categorization paradigm where text articles are annotated with keywords organized in a hierarchical structure. Knowledge discovery is performed by analyzing the co-occurrence frequencies of keywords from this hierarchy in the various documents. We show how this term-frequency approach supports a range of KDD operations, providing a general framework for knowledge discovery and exploration in collections of unstructured text.
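The core operation named in this abstract, counting how often keywords co-occur across annotated documents, can be sketched in a few lines. The function `keyword_cooccurrence` and the sample annotations are assumptions for illustration; the sketch ignores the keyword hierarchy and is not the KDT system's implementation.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(documents):
    """Count, for each unordered keyword pair, the documents annotated with both."""
    counts = Counter()
    for keywords in documents:
        # Deduplicate and sort so each pair has one canonical (a, b) form.
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts


# Hypothetical keyword annotations, one list per document.
docs = [
    ["iran", "oil"],
    ["oil", "usa", "iran"],
    ["usa", "gold"],
]
pairs = keyword_cooccurrence(docs)
```

A hierarchy-aware variant could first replace each keyword with one of its ancestors and then count pairs in the same way.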
Dynamic Predicate Construction for Learning Relational Concepts
The aim of this work is to enrich the search space of relational rule learning by allowing dynamic construction of predicates. Specifically, the use (resp. non-use) of large predicates leads to hypotheses that are overly specific (resp. general). Without suitable predicates predefined, the space between these two hypotheses is inaccessible to the learner. We seek to address this problem by extensional predicate construction from domain clusters, thus allowing for the kind of intermediate hypotheses of interest here. We show that doing so leads not only to the discovery of interesting domain subsets, but also to better leveraging of predictive accuracy and overfitting. We develop a dynamic programming method for effectively achieving clusters of individuals, and demonstrate empirical results on synthetic as well as real-world datasets.

