Results 1 – 8 of 8
Graphical Models for Discovering Knowledge
, 1995
Abstract

Cited by 29 (2 self)
There are many different ways of representing knowledge, and for each of these ways there are many different discovery algorithms. How can we compare different representations? How can we mix, match and merge representations and algorithms on new problems with their own unique requirements? This chapter introduces probabilistic modeling as a philosophy for addressing these questions and presents graphical models for representing probabilistic models. Probabilistic graphical models are a unified qualitative and quantitative framework for representing and reasoning with probabilities and independencies. 4.1 Introduction Perhaps one common element of the discovery systems described in this and previous books on knowledge discovery is that they are all different. Since the class of discovery problems is a challenging one, we cannot write a single program to address all of knowledge discovery. The KEFIR discovery system applied to health care by Matheus, Piatetsky-Shapiro, and McNeill (199...
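As a small illustration of the factorization idea this abstract refers to (a generic sketch with made-up numbers, not an example from the chapter itself), a two-node Bayesian network encodes a joint distribution as a product of local conditionals:

```python
# Hypothetical two-node Bayesian network Rain -> WetGrass, factorizing
# P(Rain, WetGrass) = P(Rain) * P(WetGrass | Rain). All numbers are invented.

p_rain = {True: 0.2, False: 0.8}                    # prior P(Rain)
p_wet_given_rain = {True:  {True: 0.9, False: 0.1}, # P(WetGrass | Rain)
                    False: {True: 0.1, False: 0.9}}

def joint(rain, wet):
    """Joint probability read off the graphical model's factorization."""
    return p_rain[rain] * p_wet_given_rain[rain][wet]

# Marginalize out Rain to get P(WetGrass = True).
p_wet = sum(joint(r, True) for r in (True, False))
print(round(p_wet, 2))  # 0.2*0.9 + 0.8*0.1 = 0.26
```

The graph structure (here, a single edge) is the qualitative statement of independencies; the conditional tables are the quantitative part.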
Discretization and grouping: preprocessing steps for Data Mining
 in Principles of Data Mining and Knowledge Discovery
, 1998
Abstract

Cited by 8 (1 self)
Unlike the on-line discretization performed by a number of machine learning (ML) algorithms for building decision trees or decision rules, we propose off-line algorithms for discretizing numerical attributes and grouping values of nominal attributes. The number of resulting intervals obtained by discretization depends only on the data; the number of groups corresponds to the number of classes. Since both discretization and grouping are done with respect to the goal classes, the algorithms are suitable only for classification/prediction tasks. As a side effect of the off-line processing, the number of objects in the datasets and the number of attributes may be reduced. It should also be mentioned that although the discretization procedure was originally proposed for the Kex system, the algorithms show good performance together with other machine learning algorithms. 1 Introduction The Knowledge Discovery in Databases (KDD) process can involve significant iteration and may ...
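To make the class-driven ("supervised") discretization idea concrete, here is a minimal sketch: candidate cut points are placed where the goal class changes along the sorted numeric attribute. This illustrates the general principle only; it is not the Kex procedure described in the paper.

```python
# Hedged sketch of class-driven discretization: interval boundaries are the
# midpoints between adjacent sorted values whose class labels differ.
# Illustrative only -- not the actual Kex off-line algorithm.

def boundary_cut_points(values, classes):
    """Return candidate interval boundaries for a numeric attribute."""
    pairs = sorted(zip(values, classes))
    cuts = []
    for (v1, c1), (v2, c2) in zip(pairs, pairs[1:]):
        if c1 != c2 and v1 != v2:
            cuts.append((v1 + v2) / 2.0)
    return cuts

# Toy data: attribute values with their goal classes.
vals = [1.0, 2.0, 3.0, 8.0, 9.0]
cls  = ['a', 'a', 'b', 'b', 'a']
print(boundary_cut_points(vals, cls))  # [2.5, 8.5]
```

Note how the number of intervals falls out of the data itself (two cuts, three intervals here), matching the abstract's remark that the interval count depends only on the data.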
Empirical Comparisons of Various Discretization Procedures
, 1995
Abstract

Cited by 4 (0 self)
The genuine symbolic machine learning (ML) algorithms are capable of processing symbolic, categorical data only. However, real-world problems, e.g. in medicine or finance, involve both symbolic and numerical attributes. Therefore, an important issue in ML is to discretize (categorize) numerical attributes. Quite a few discretization procedures exist in the ML field. This paper describes two newer algorithms for categorization (discretization) of numerical attributes. The first one is implemented in KEX (Knowledge EXplorer) as its preprocessing procedure. Its idea is to discretize the numerical attributes in such a way that the resulting categorization fits the way KEX creates a knowledge base. Nevertheless, the resulting categorization is also suitable for other machine learning algorithms. The other discretization procedure is implemented in CN4, a large extension of the well-known CN2 machine learning algorithm. The range of numerical attributes is divided into int...
Kočka T.: Rule induction for clickstream analysis: set covering and compositional approach
 In: Proc. IIPMW 2005
, 2005
Mining Clickstream Data With Statistical and Rule-based Methods
Abstract
Abstract. We present an analysis of the clickstream data provided for the ECML/PKDD data mining challenge. We primarily focus on predicting the next page that a user will visit based on the history of visited pages. We compare the results of one statistical and two rule-based methods, and discuss interesting patterns that appear in the data.
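As a hedged illustration of the next-page prediction task (a generic first-order Markov sketch on invented sessions, not any of the paper's three methods), one can predict the most frequent successor of the current page in the training clickstreams:

```python
# Generic first-order Markov next-page predictor -- illustrative only,
# not the statistical or rule-based methods compared in the paper.
from collections import Counter, defaultdict

def train(sessions):
    """Count page -> next-page transitions over all sessions."""
    succ = defaultdict(Counter)
    for session in sessions:
        for cur, nxt in zip(session, session[1:]):
            succ[cur][nxt] += 1
    return succ

def predict(succ, page):
    """Most frequent page observed after `page`, or None if unseen."""
    return succ[page].most_common(1)[0][0] if succ[page] else None

# Invented toy sessions.
sessions = [["home", "shop", "cart"], ["home", "shop", "home"], ["home", "news"]]
model = train(sessions)
print(predict(model, "home"))  # shop  ("shop" follows "home" twice, "news" once)
```

Longer histories (second-order models, or rules conditioning on several preceding pages) trade data sparsity against predictive accuracy.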
Quantity of sentences
, 2007
Abstract
We used a method based on an inductively constructed set of PROSPECTOR-like rules for the classification of literary works. The syntactic and morphological features of the texts, described according to the grammar of Russian, were computed by an expert system that is an integrated component of the information retrieval system "SMALT". Inductive generation of rules and recognition of literary works were carried out by the "STATCOP" programs for inductive construction of the knowledge base. AMICT'2007
Continuous Classes in Rule Induction: Empirical Comparison of Two Approaches
Abstract
This paper introduces modifications of two existing rule-inducing ML algorithms; they are now capable of processing continuous classes. The first one is the CN4 algorithm, a large extension of the well-known CN2. It invokes its discretization procedure and also deals with continuous classes within the inductive process itself, i.e., 'on-line'. The other ML system, KEX (Knowledge EXplorer), treats numerical attributes in its preprocessor, that is, they are discretized 'off-line'; the continuous classes are processed within the inductive algorithm itself. Experimental results show a comparison of KEX and CN4 on real-world as well as artificial data. 1 Introduction The genuine symbolic machine learning (ML) algorithms were designed to process symbolic data accompanied by symbolic classes (concepts) only. However, real-world data, e.g., in medicine or finance, consist of both symbolic and numerical data (attributes), and some applications even exhibit continuous classes (concepts). ...