Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey
- Data Mining and Knowledge Discovery
, 1997
Cited by 122 (1 self)
Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial neural networks. Researchers in these disciplines, sometimes working on quite different problems, identified similar issues and heuristics for decision tree construction. This paper surveys existing work on decision tree construction, attempting to identify the important issues involved, directions the work has taken and the current state of the art. Keywords: classification, tree-structured classifiers, data compaction 1. Introduction Advances in data collection methods, storage and processing technology are providing a unique challenge and opportunity for automated data exploration techniques. Enormous amounts of data are being collected daily from major scientific projects e.g., Human Genome...
Inductive and Bayesian learning in medical diagnosis
- Applied Artificial Intelligence
, 1993
Cited by 56 (9 self)
Although successful in medical diagnostic problems, inductive learning systems were not widely accepted in medical practice. In this paper two different approaches to machine learning in medical applications are compared: the system for inductive learning of decision trees Assistant, and the naive Bayesian classifier. Both methodologies were tested in four medical diagnostic problems: localization of primary tumor, prognostics of recurrence of breast cancer, diagnosis of thyroid diseases, and rheumatology. The accuracy of automatically acquired diagnostic knowledge from stored data records is compared and the interpretation of the knowledge and the explanation ability of the classification process of each system is discussed. Surprisingly, the naive Bayesian classifier is superior to Assistant in classification accuracy and explanation ability, while the interpretation of the acquired knowledge seems to be equally valuable. In addition, two extensions to the naive Bayesian classifier are briefly described: dealing with continuous attributes, and discovering the dependencies among attributes.
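For readers unfamiliar with the naive Bayesian classifier named in this abstract, the following is a minimal sketch for discrete attributes. It is not the implementation evaluated in the paper: the function names, the toy records, and the use of Laplace (add-one) smoothing are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count classes and per-class attribute values (hypothetical helper)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values = defaultdict(set)            # attribute index -> values seen in training
    for row, c in zip(rows, labels):
        for a, v in enumerate(row):
            value_counts[(a, c)][v] += 1
            values[a].add(v)
    return class_counts, value_counts, values


def posterior(model, row):
    """Return P(class | row), assuming attributes are independent given the class."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        p = n_c / total  # prior P(class)
        for a, v in enumerate(row):
            # Laplace-smoothed estimate of P(value | class); smoothing is an assumption.
            p *= (value_counts[(a, c)][v] + 1) / (n_c + len(values[a]))
        scores[c] = p
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}


# Invented toy records: (symptom present, test result) -> diagnosis.
rows = [("yes", "high"), ("yes", "high"), ("no", "low"), ("no", "low")]
labels = ["ill", "ill", "healthy", "healthy"]
model = train(rows, labels)
probs = posterior(model, ("yes", "high"))
```

Each factor in the product is the contribution of one attribute value to the decision, so the factors can be inspected one by one; this may be part of why the abstract credits the classifier with good explanation ability.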
Overcoming the myopia of inductive learning algorithms with RELIEFF
- Applied Intelligence
, 1997
Cited by 30 (11 self)
Current inductive machine learning algorithms typically use greedy search with limited lookahead. This prevents them from detecting significant conditional dependencies between the attributes that describe training objects. Instead of myopic impurity functions and lookahead, we propose to use RELIEFF, an extension of RELIEF developed by Kira and Rendell [10], [11], for heuristic guidance of inductive learning algorithms. We have reimplemented Assistant, a system for top down induction of decision trees, using RELIEFF as an estimator of attributes at each selection step. The algorithm is tested on several artificial and several real world problems and the results are compared with some other well known machine learning algorithms. Excellent results on artificial data sets and two real world problems show the advantage of the presented approach to inductive learning. Keywords: learning from examples, estimating attributes, impurity function, RELIEFF, empirical evaluation 1. Introduction ...
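The idea behind the RELIEF family of estimators can be sketched roughly as follows. This is the basic two-class RELIEF for discrete attributes, not the RELIEFF extension described in the paper and not the authors' code. As simplifying assumptions, it visits every training instance once instead of drawing a random sample, and ties between equally near neighbours are broken by list order. The XOR data set is invented for illustration.

```python
from itertools import product


def diff(x, y, a):
    """Difference of two instances on discrete attribute a: 0 if equal, else 1."""
    return 0.0 if x[a] == y[a] else 1.0


def distance(x, y):
    """Sum of per-attribute differences (Hamming distance for discrete data)."""
    return sum(diff(x, y, a) for a in range(len(x)))


def relief(instances, labels):
    """Estimate attribute quality from nearest hits and nearest misses."""
    n_attr = len(instances[0])
    m = len(instances)
    weights = [0.0] * n_attr
    for i, (r, c) in enumerate(zip(instances, labels)):
        # Candidates of the same class (excluding r itself) and of the other class.
        hits = [x for j, (x, y) in enumerate(zip(instances, labels)) if y == c and j != i]
        misses = [x for x, y in zip(instances, labels) if y != c]
        hit = min(hits, key=lambda x: distance(r, x))
        miss = min(misses, key=lambda x: distance(r, x))
        for a in range(n_attr):
            # Reward attributes that separate r from its nearest miss,
            # penalise attributes that separate r from its nearest hit.
            weights[a] += (diff(r, miss, a) - diff(r, hit, a)) / m
    return weights


# Class is the XOR of the first two attributes; the third is irrelevant.
instances = list(product([0, 1], repeat=3))
labels = [a ^ b for a, b, _ in instances]
weights = relief(instances, labels)
```

On this data the two XOR attributes receive positive weights and the irrelevant attribute a negative one, although neither relevant attribute is informative when examined alone. Because the estimate is computed in the local neighbourhood of each instance, it takes the context of the other attributes into account.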
Machine learning for medical diagnosis: history, state of the art and perspective
- Artificial Intelligence in Medicine
, 2001
Cited by 25 (0 self)
The paper provides an overview of the development of intelligent data analysis in medicine from a machine learning perspective: a historical view, a state of the art view and a view on some future trends in this subfield of applied artificial intelligence. The paper is not intended to provide a comprehensive overview but rather describes some subareas and directions which from my personal point of view seem to be important for applying machine learning in medical diagnosis. In the historical overview I emphasize the naive Bayesian classifier, neural networks and decision trees. I present a comparison of some state of the art systems, representatives from each branch of machine learning, when applied to several medical diagnostic tasks. The future trends are illustrated by two case studies. The first describes a recently developed method for dealing with reliability of decisions of classifiers, which seems to be promising for intelligent data analysis in medicine. The second describes an approach to using machine learning in order to verify some unexplained phenomena from complementary medicine, which is not (yet) approved by the orthodox medical community but could in the future play an important role in overall medical diagnosis and treatment.
Naive Bayesian classifier within ILP-R
- Department of Computer Science, Katholieke Universiteit Leuven
, 1995
Cited by 22 (1 self)
When dealing with classification problems, current ILP systems often lag behind state-of-the-art attributional learners. Part of the blame can be ascribed to a much larger hypothesis space which, therefore, cannot be as thoroughly explored. However, sometimes it is due to the fact that ILP systems do not take into account the probabilistic aspects of hypotheses when classifying unseen examples. This paper proposes just that. We developed a naive Bayesian classifier within our ILP-R first order learner. The learner itself uses a clever RELIEF based heuristic which is able to detect strong dependencies within the literal space when such dependencies exist. We conducted a series of experiments on artificial and real-world data sets. The results show that the combination of ILP-R together with the naive Bayesian classifier sometimes significantly improves the classification of unseen instances as measured by both classification accuracy and average information score. 1 Introduction Th...
Linear Space Induction in First Order Logic with RELIEFF
- In
, 1995
Cited by 11 (8 self)
Current ILP algorithms typically use variants and extensions of the greedy search. This prevents them from detecting significant relationships between the training objects. Instead of myopic impurity functions, we propose the use of the heuristic based on RELIEF for guidance of ILP algorithms. At each step, in our ILP-R system, this heuristic is used to determine a beam of candidate literals. The beam is then used in an exhaustive search for a potentially good conjunction of literals. From the efficiency point of view we introduce an interesting declarative bias which enables us to keep the growth of the training set, when introducing new variables, within linear bounds (linear with respect to the clause length). This bias prohibits cross-referencing of variables in the variable dependency tree. The resulting system has been tested on various artificial problems. The advantages and deficiencies of our approach are discussed. 1 Introduction ILP algorithms typically use variants of the greedy searc...
Analysing and Improving the Diagnosis of Ischaemic Heart Disease with Machine Learning
, 1999
Cited by 10 (6 self)
Ischaemic heart disease is one of the world's most important causes of mortality, so improvements and rationalization of diagnostic procedures would be very useful. The four diagnostic levels consist of evaluation of signs and symptoms of the disease and ECG (electrocardiogram) at rest, sequential ECG testing during the controlled exercise, myocardial scintigraphy, and finally coronary angiography (which is considered to be the reference method).
Machine Learning in Prognosis of the Femoral Neck Fracture Recovery
, 1996
Cited by 8 (1 self)
We compare the performance of several machine learning algorithms in the problem of prognostics of the femoral neck fracture recovery: the K-nearest neighbours algorithm, the semi-naive Bayesian classifier, backpropagation with weight elimination learning of the multilayered neural networks, the LFC (lookahead feature construction) algorithm, and the Assistant-I and Assistant-R algorithms for top down induction of decision trees using information gain and RELIEFF as search heuristics, respectively. We compare the prognostic accuracy and the explanation ability of different classifiers. Among the different algorithms the semi-naive Bayesian classifier and Assistant-R seem to be the most appropriate. We analyze the combination of decisions of several classifiers for solving prediction problems and show that the combined classifier improves both performance and the explanation ability. Keywords: learning from examples, estimating attributes, explanation ability, impurity function, empirica...
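The information gain heuristic that this abstract attributes to Assistant-I can be illustrated with a short sketch. This is the textbook Shannon-entropy definition, not Assistant's actual code, and the function names and the four-row data set are assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((k / n) * log2(k / n) for k in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Class entropy minus the expected entropy after splitting on attr."""
    n = len(rows)
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [c for row, c in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


# Invented data: two binary attributes and two alternative class definitions.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
copy_labels = [a for a, _ in rows]      # class equals the first attribute
xor_labels = [a ^ b for a, b in rows]   # class is the XOR of both attributes

gain_copy = information_gain(rows, copy_labels, 0)
gain_xor = information_gain(rows, xor_labels, 0)
```

The first attribute scores one full bit when it determines the class by itself, but zero when the class depends on it only jointly with the second attribute. Because information gain scores each attribute in isolation, it cannot see such joint dependencies.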
Intelligent Data Analysis in Medicine and Pharmacology
- in: Intelligent Data Analysis in Medicine and Pharmacology
, 1997
Cited by 4 (2 self)
Anaplastic thyroid carcinoma is a rare but very aggressive tumor. Many factors that might influence the survival of patients have been suggested. The aim of our study was to determine which of the factors, known at the time of admission to the hospital, might predict survival of the patients with anaplastic thyroid carcinoma. Our aim was also to assess the relative importance of the factors and to identify potentially useful decision and regression trees generated by machine learning algorithms. Our study included 126 patients with anaplastic thyroid carcinoma treated at the Institute of Oncology Ljubljana from 1972 to 1992. In this chapter, we compare the machine learning approach with previous statistical evaluations of the problem (univariate and multivariate analysis) and show that it can provide a more thorough analysis and improve the understanding of the data. 1.1 Introduction Anaplastic thyroid carcino...
Prognosing the Survival Time of the Patients with the Anaplastic Thyroid Carcinoma with Machine Learning
, 1997
Cited by 2 (0 self)
Anaplastic thyroid carcinoma is a rare but very aggressive tumor. Many factors that might influence the survival of patients have been suggested. The aim of our study was to determine which of the factors, known at the time of admission to the hospital, might predict survival of patients with anaplastic thyroid carcinoma. Our aim was also to assess the relative importance of the factors and to identify potentially useful decision and regression trees generated by machine learning algorithms. Our study included 126 patients (90 females and 36 males; mean age was 66.7 years) with anaplastic thyroid carcinoma treated at the Institute of Oncology Ljubljana from 1972 to 1992. Patients were classified into categories according to 11 attributes: sex, age, history, physical findings, extent of disease on admission, and tumor morphology. In this paper we compare the machine learning approach with the previous statistical evaluations on the problem (univariate and multivariate analysis) and show...

