Comparative Experiments on Disambiguating Word Senses: An Illustration of the Role of Bias in Machine Learning
, 1996
Cited by 99 (1 self)
This paper describes an experimental comparison of seven different learning algorithms on the problem of learning to disambiguate the meaning of a word from context. The algorithms tested include statistical, neural-network, decision-tree, rule-based, and case-based classification techniques. The specific problem tested involves disambiguating six senses of the word "line" using the words in the current and preceding sentence as context. The statistical and neural-network methods perform the best on this particular problem, and we discuss a potential reason for this observed difference. We also discuss the role of bias in machine learning and its importance in explaining performance differences observed on specific problems.
Parcel: Feature Subset Selection in Variable Cost Domains
, 1998
Cited by 20 (1 self)
The vast majority of classification systems are designed with a single set of features, and optimised to a single specified cost. However, in examples such as medical and financial risk modelling, costs are known to vary subsequent to system design. In this paper, we present a design method for feature selection in the presence of varying costs. Starting from the Wilcoxon nonparametric statistic for the performance of a classification system, we introduce a concept called the maximum realisable receiver operating characteristic (MRROC), and prove a related theorem. A novel criterion for feature selection, based on the area under the MRROC curve, is then introduced. This leads to a framework which we call Parcel. This has the flexibility to use different combinations of features at different operating points on the resulting MRROC curve. Empirical support for each stage in our approach is provided by experiments on real world problems, with Parcel achieving superior results.
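The two quantities named in this abstract can be illustrated with a minimal sketch. The Wilcoxon statistic is computed as the fraction of positive/negative pairs that a scorer ranks correctly, which equals the area under its ROC curve. The hull step assumes that the MRROC can be read as the upper convex hull of the available operating points; the abstract does not spell this out, so treat that part as an assumption. The function names `wilcoxon_auc`, `upper_hull`, and `area_under` are illustrative and are not taken from the paper.

```python
from itertools import product


def wilcoxon_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def _cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(roc_points):
    """Upper convex hull of (fpr, tpr) points, with the trivial classifiers
    (0, 0) and (1, 1) added. Assumed here to stand in for the MRROC."""
    pts = sorted(set(roc_points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last point while it lies on or below the segment to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def area_under(curve):
    """Trapezoidal area under a piecewise-linear curve sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(curve, curve[1:]))


if __name__ == "__main__":
    print(wilcoxon_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
    hull = upper_hull([(0.2, 0.8), (0.5, 0.5)])
    print(hull, area_under(hull))
```

Under this reading, two feature subsets could be compared by the area under their respective hulls, and a point that falls below the hull, such as (0.5, 0.5) in the usage lines, does not contribute.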
How to Shift Bias: Lessons from the Baldwin Effect
, 1996
Cited by 18 (3 self)
An inductive learning algorithm takes a set of data as input and generates a hypothesis as output. A set of data is typically consistent with an infinite number of hypotheses; therefore, there must be factors other than the data that determine the output of the learning algorithm. In machine learning, these other factors are called the bias of the learner. Classical learning algorithms have a fixed bias, implicit in their design. Recently developed learning algorithms dynamically adjust their bias as they search for a hypothesis. Algorithms that shift bias in this manner are not as well understood as classical algorithms. In this paper, we show that the Baldwin effect has implications for the design and analysis of bias shifting algorithms. The Baldwin effect was proposed in 1896 to explain how phenomena that might appear to require Lamarckian evolution (inheritance of acquired characteristics) can arise from purely Darwinian evolution. Hinton and Nowlan presented a computational model of the Baldwin effect in 1987. We explore a variation on their model, which we constructed explicitly to illustrate the lessons that the Baldwin effect has for research in bias shifting algorithms. The main lesson is that a good strategy for shifting bias in a learning algorithm appears to be to begin with a weak bias and gradually shift to a strong bias.
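The fitness evaluation in the Hinton and Nowlan model can be sketched roughly as follows. This follows the model as it is commonly described (a 20-position genome with alleles `0`, `1`, and a learnable `?`, up to 1000 random guessing trials, and fitness 1 + 19n/1000 where n is the number of trials remaining after the target is found). It is not the variation explored in this paper, whose details the abstract does not give. The all-ones target and the names `fitness` and `TRIALS` are assumptions made for illustration.

```python
import random

TRIALS = 1000  # lifetime learning trials, as commonly reported for the model


def fitness(genome, rng, trials=TRIALS):
    """Fitness of one individual; the target is assumed to be all '1' alleles.

    '1' and '0' are fixed alleles (a strong, innate bias); '?' is a plastic
    allele that is guessed anew on every learning trial (a weak bias).
    """
    if "0" in genome:
        return 1.0  # a wrong fixed allele cannot be repaired by learning
    plastic = genome.count("?")
    if plastic == 0:
        return 1.0 + 19.0  # correct at birth: all trials remain
    for t in range(1, trials + 1):
        # Each plastic position guesses 0 or 1 with equal probability.
        if all(rng.random() < 0.5 for _ in range(plastic)):
            remaining = trials - t
            return 1.0 + 19.0 * remaining / trials
    return 1.0  # never found the target


if __name__ == "__main__":
    rng = random.Random(0)
    for g in ("1" * 20, "?" * 5 + "1" * 15, "0" + "1" * 19):
        print(g, fitness(g, rng))
```

Genomes with many `?` alleles can still reach the target but pay for it in search time, which gives selection a gradient toward fixing correct alleles. That is one way to read the abstract's lesson about starting with a weak bias and moving to a strong one.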
DLAB: A Declarative Language Bias Formalism
- In Proceedings of the International Symposium on Methodologies for Intelligent Systems (ISMIS-96)
, 1996
Cited by 17 (1 self)
We describe the principles and functionalities of Dlab (Declarative LAnguage Bias). Dlab can be used in inductive learning systems to define syntactically and traverse efficiently finite subspaces of first order clausal logic, be it a set of propositional formulae, association rules, Horn clauses, or full clauses. A Prolog implementation of Dlab is available by ftp access.
Keywords: declarative language bias, concept learning, knowledge discovery
1 Introduction
The notion of bias, generally circumscribed as "a tendency to show prejudice against one group and favouritism towards another" (Collins Cobuild, 1987), has been adapted to the field of computational inductive reasoning to become a generic term for "any basis for choosing one generalization over another, other than strict consistency with the instances" (Mitchell [14]). We borrow a more fine-tuned definition of inductive bias from Utgoff [20]. Definition 1 (inductive bias). Except for the presented examples and counterexamples ...
Integrating Declarative Knowledge in Hierarchical Clustering Tasks
- Proceedings of the International Symposium on Intelligent Data Analysis
, 1999
Cited by 16 (0 self)
The capability of making use of existing prior knowledge is an important challenge for Knowledge Discovery tasks. As an unsupervised learning task, clustering appears to be one of the tasks that might benefit most from prior knowledge. In this paper, we propose a method for providing declarative prior knowledge to a hierarchical clustering system, stressing the interactive component. Preliminary results suggest that declarative knowledge is a powerful bias for improving the quality of clustering in domains where the internal biases of the system are inappropriate or there is not enough evidence in the data, and that it can lead the system to build more comprehensible clusterings.
DOGMA: A GA-based relational learner
- Proceedings of the 8th International Conference on Inductive Logic Programming
, 1998
Cited by 9 (1 self)
We describe a GA-based concept learning/theory revision system DOGMA and discuss how it can be applied to relational learning. The search for better theories in DOGMA is guided by a novel fitness function that combines the minimal description length and information gain measures. To show the efficacy of the system we compare it to other learners in three relational domains.
Learning preconditions for planning from plan traces and HTN structure
- Computational Intelligence
, 2005
Cited by 9 (2 self)
A great challenge in developing planning systems for practical applications is the difficulty of acquiring the domain information needed to guide such systems. This paper describes a way to learn some of that knowledge. More specifically, the following points are discussed. (1) We introduce a theoretical basis for formally defining algorithms that learn preconditions for Hierarchical Task Network (HTN) methods. (2) We describe Candidate Elimination Method Learner (CaMeL), a supervised, eager, and incremental learning process for preconditions of HTN methods. We state and prove theorems about CaMeL’s soundness, completeness, and convergence properties. (3) We present empirical results about CaMeL’s convergence under various conditions. Among other things, CaMeL converges the fastest on the preconditions of the HTN methods that are needed the most often. Thus CaMeL’s output can be useful even before it has fully converged.
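As a rough illustration of incremental precondition learning in the version-space style, the sketch below keeps only the most specific hypothesis consistent with the positive examples seen so far. This is a deliberate simplification of candidate elimination and is not CaMeL itself: it ignores negative examples and the general boundary, and it assumes preconditions are plain conjunctions of ground atoms. The class name `SpecificBoundaryLearner` and the atom names in the usage lines are hypothetical.

```python
class SpecificBoundaryLearner:
    """Tracks the most specific conjunctive precondition seen so far.

    A state is a set of atoms (strings). The hypothesis is the set of atoms
    present in every state where the method was observed to be applicable.
    """

    def __init__(self):
        self.specific = None  # None until the first positive example arrives

    def observe(self, state, applicable):
        """Incrementally update the hypothesis from one labelled state."""
        if not applicable:
            return  # this simplified sketch does not use negative examples
        atoms = frozenset(state)
        if self.specific is None:
            self.specific = atoms
        else:
            # Generalise minimally: keep only atoms shared by all positives.
            self.specific = self.specific & atoms

    def predicts(self, state):
        """True if the current hypothesis says the method applies in state."""
        return self.specific is not None and self.specific <= frozenset(state)


if __name__ == "__main__":
    learner = SpecificBoundaryLearner()
    learner.observe({"at_depot", "truck_empty", "raining"}, applicable=True)
    learner.observe({"at_depot", "truck_empty"}, applicable=True)
    print(sorted(learner.specific))
    print(learner.predicts({"at_depot", "truck_empty", "night"}))
```

Because each positive example can only shrink the hypothesis, the learner's predictions are usable after every update. This loosely mirrors the abstract's remark that output can be useful before full convergence.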
How the brain might work: A hierarchical and temporal model for learning and recognition
- STANFORD UNIVERSITY
, 2008
Machine Learning Techniques for Civil Engineering Problems
, 1997
Cited by 6 (3 self)
The growing volume of information databases presents opportunities for advanced data analysis techniques from machine learning (ML) research. Practical applications of ML are very different from theoretical or empirical studies, involving organizational and human aspects, and various other constraints. Despite the importance of applied ML, little has been discussed in the general ML literature on this topic. In order to remedy this situation, we studied practical applications of ML and developed a proposal for a seven-step process that can guide practical applications of ML in engineering. The process is illustrated by relevant applications of ML in civil engineering. This illustration shows that the potential of ML has only begun to be explored, but also cautions that in order to be successful, the application process must carefully address the issues related to the seven-step process.
1 Introduction
Over the last several decades we have witnessed an explosion in information generat...
Generalización y Atención Selectiva para la Formación de Conceptos (Generalization and Selective Attention for Concept Formation)
Cited by 6 (5 self)
... of inductive learning [1]. Within this paradigm, two distinct approaches can be distinguished according to the degree of guidance they require from an external tutor. In supervised learning, the observations are assumed to arrive preclassified, and the task is to infer concepts that adequately describe each class. In unsupervised learning, by contrast, since there is no tutor, the goal is to discover the groupings that underlie a given set of observations, together with a concept for each group. Although the origin of the groupings handled by these approaches differs, both must face a common problem: that of characterization. Given this premise, it is logical that the initial approaches to unsupervised learning [2] were very similar in spirit to those used in the supervised setting [3].

