Results 11–20 of 2,272
A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-three Old and New Classification Algorithms
, 2000
Abstract

Cited by 214 (8 self)
Twenty-two decision tree, nine statistical, and two neural network algorithms are compared on thirty-two datasets in terms of classification accuracy, training time, and (in the case of trees) number of leaves. Classification accuracy is measured by mean error rate and mean rank of error rate. Both criteria place a statistical, spline-based algorithm called Polyclass at the top, although it is not statistically significantly different from twenty other algorithms. Another statistical algorithm, logistic regression, is second with respect to the two accuracy criteria. The most accurate decision tree algorithm is Quest with linear splits, which ranks fourth and fifth, respectively. Although spline-based statistical algorithms tend to have good accuracy, they also require relatively long training times. Polyclass, for example, is third from last in terms of median training time. It often requires hours of training compared to seconds for other algorithms. The Quest and logistic regression algor...
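The two accuracy criteria used here, mean error rate and mean rank of error rate, can be sketched in a few lines. The algorithm names and error values below are invented for illustration, and ties in error rate are broken arbitrarily rather than given averaged ranks as a careful ranking would.

```python
def mean_error_rate(errors_by_dataset):
    """Average one algorithm's error rates over all datasets."""
    return sum(errors_by_dataset) / len(errors_by_dataset)

def mean_rank(error_table):
    """error_table[dataset][algo] -> error rate.
    Returns algo -> mean rank, where rank 1 is the lowest error on a
    dataset. Ties are broken arbitrarily here, not averaged."""
    algos = list(next(iter(error_table.values())).keys())
    ranks = {a: [] for a in algos}
    for errors in error_table.values():
        ordered = sorted(algos, key=lambda a: errors[a])
        for r, a in enumerate(ordered, start=1):
            ranks[a].append(r)
    return {a: sum(rs) / len(rs) for a, rs in ranks.items()}

# Invented error rates for three algorithms on two datasets:
table = {
    "d1": {"polyclass": 0.10, "logistic": 0.12, "quest": 0.15},
    "d2": {"polyclass": 0.20, "logistic": 0.18, "quest": 0.25},
}
```

With these invented numbers, `mean_rank(table)` assigns polyclass and logistic a mean rank of 1.5 each and quest a mean rank of 3.0, showing how two algorithms can be close on one criterion while a third trails on both.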
Word sense disambiguation using a second language monolingual corpus
 COMPUTATIONAL LINGUISTICS
, 1994
Abstract

Cited by 163 (1 self)
This paper presents a new approach for resolving lexical ambiguities in one language using statistical data from a monolingual corpus of another language. This approach exploits the differences between mappings of words to senses in different languages. The paper concentrates on the problem of target word selection in machine translation, for which the approach is directly applicable. The presented algorithm identifies syntactic relations between words, using a source language parser, and maps the alternative interpretations of these relations to the target language, using a bilingual lexicon. The preferred senses are then selected according to statistics on lexical relations in the target language. The selection is based on a statistical model and on a constraint propagation algorithm, which simultaneously handles all ambiguities in the sentence. The method was evaluated using three sets of Hebrew and German examples and was found to be very useful for disambiguation. The paper includes a detailed comparative analysis of statistical sense disambiguation methods.
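The target-word selection step can be illustrated with a minimal sketch: given alternative translations of an ambiguous source word standing in a syntactic relation to a head word, prefer the candidate whose lexical relation is most frequent in the target-language corpus. The corpus counts and words below are invented for illustration; the paper's full method additionally uses a statistical model and constraint propagation over all ambiguities in the sentence, which this one-relation sketch omits.

```python
def select_translation(relation, head, candidates, corpus_counts):
    """Prefer the candidate translation whose lexical relation
    (relation, head, candidate) occurs most often in the target corpus.
    Unseen relations count as zero."""
    return max(candidates,
               key=lambda c: corpus_counts.get((relation, head, c), 0))

# Invented counts of (relation, head, word) tuples in a target corpus:
counts = {
    ("obj", "sign", "treaty"): 20,
    ("obj", "sign", "contract"): 5,
}
```

For example, `select_translation("obj", "sign", ["treaty", "contract"], counts)` prefers "treaty" because that object relation is four times as frequent in the invented counts.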
A longitudinal study of engineering student performance and retention. I. Success and failure in the introductory course
 J. Engr. Education
, 1993
Abstract

Cited by 147 (11 self)
A cohort of chemical engineering students has been taught in an experimental sequence of five chemical engineering courses, beginning with the introductory course in the Fall 1990 semester. Differences in academic performance have been observed between students from rural and small town backgrounds (“rural students,” N=55) and students from urban and suburban backgrounds (“urban students,” N=65), with the urban students doing better on almost every measure investigated. In the introductory course, 80% of the urban students and 55% of the rural students passed with a grade of C or better, with average grades of 2.63 for the urban students and 1.80 for the rural students (A=4.0). The urban group continued to earn higher grades in subsequent chemical engineering courses. After four years, 79% of the urban students and 64% of the rural students had graduated or were still enrolled in chemical engineering; the others had either transferred out of engineering or were no longer attending the university. This paper presents data on the students’ home and school backgrounds and speculates on possible causes of observed performance differences between the two populations. * Journal of Engineering Education, 83(3), 209–217 (1994). Charts in the published version have been converted to
Unbiased recursive partitioning: a conditional inference framework
 J. Comput. Graph. Statist
, 2006
Abstract

Cited by 127 (12 self)
Recursive binary partitioning is a popular tool for regression analysis. Two fundamental problems of exhaustive search procedures usually applied to fit such models have been known for a long time: overfitting and a selection bias towards covariates with many possible splits or missing values. While pruning procedures are able to solve the overfitting problem, the variable selection bias still seriously affects the interpretability of tree-structured regression models. For some special cases unbiased procedures have been suggested, but they lack a common theoretical foundation. We propose a unified framework for recursive partitioning which embeds tree-structured regression models into a well-defined theory of conditional inference procedures. Stopping criteria based on multiple test procedures are implemented, and it is shown that the predictive performance of the resulting trees is as good as the performance of established exhaustive search procedures. It turns out that the partitions and therefore the models induced by both approaches are structurally different, indicating the need for an unbiased variable selection. The methodology presented here is applicable to all kinds of regression problems, including nominal, ordinal, numeric, censored as well as multivariate response variables and arbitrary measurement scales of the covariates. Data from studies on animal abundance, glaucoma classification, node-positive breast cancer, and mammography experience are reanalyzed.
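The split-selection idea, test each covariate's association with the response and stop when no multiplicity-adjusted p-value is significant, can be sketched with a simple permutation test. This is only an illustration of the unbiased-selection and stopping logic: the absolute-covariance statistic, the Bonferroni adjustment, and the alpha threshold below are simplifying assumptions, not the conditional inference machinery the paper actually develops.

```python
import random

def perm_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value for association between x and y, using the
    absolute covariance-like cross-product as the test statistic."""
    rng = random.Random(seed)
    def stat(a, b):
        ma = sum(a) / len(a)
        mb = sum(b) / len(b)
        return abs(sum((u - ma) * (v - mb) for u, v in zip(a, b)))
    observed = stat(x, y)
    hits = 0
    yy = list(y)
    for _ in range(n_perm):
        rng.shuffle(yy)
        if stat(x, yy) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def select_split_variable(covariates, y, alpha=0.05):
    """Pick the covariate most associated with y; return None (stop
    splitting) if no Bonferroni-adjusted p-value falls below alpha."""
    m = len(covariates)
    pvals = {name: perm_pvalue(x, y) for name, x in covariates.items()}
    best = min(pvals, key=pvals.get)
    return best if pvals[best] * m < alpha else None
```

Because variable selection is a hypothesis test rather than an exhaustive impurity search, a covariate with many distinct values gains no automatic advantage, which is the bias the abstract refers to.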
Pruning Adaptive Boosting
, 1997
Abstract

Cited by 124 (1 self)
The boosting algorithm AdaBoost, developed by Freund and Schapire, has exhibited outstanding performance on several benchmark problems when using C4.5 as the "weak" algorithm to be "boosted." Like other ensemble learning approaches, AdaBoost constructs a composite hypothesis by voting many individual hypotheses. In practice, the large amount of memory required to store these hypotheses can make ensemble methods hard to deploy in applications. This paper shows that by selecting a subset of the hypotheses, it is possible to obtain nearly the same levels of performance as the entire set. The results also provide some insight into the behavior of AdaBoost.
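One way to select a subset of hypotheses, sketched below under the assumption of binary labels in {-1, +1}, is simply to keep the k members with the largest voting weights. This keep-the-heaviest heuristic is an illustrative assumption, not the paper's actual pruning method; the stumps and weights are likewise invented.

```python
def vote(hypotheses, weights, x):
    """Weighted-majority prediction of an ensemble, labels in {-1, +1}."""
    total = sum(w * h(x) for h, w in zip(hypotheses, weights))
    return 1 if total >= 0 else -1

def prune_by_weight(hypotheses, weights, k):
    """Illustrative pruning heuristic: keep only the k hypotheses
    with the largest voting weights."""
    keep = sorted(range(len(weights)), key=lambda i: -weights[i])[:k]
    return [hypotheses[i] for i in keep], [weights[i] for i in keep]

# Hypothetical decision stumps with AdaBoost-style weights:
stumps = [lambda x: 1 if x > 0 else -1, lambda x: -1, lambda x: 1]
alphas = [2.0, 0.5, 0.3]
pruned_stumps, pruned_alphas = prune_by_weight(stumps, alphas, k=1)
```

In this toy ensemble the single heaviest stump already reproduces the full ensemble's votes, which mirrors the paper's point that a small subset can approach the accuracy of the whole while needing far less memory.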
Ontology Matching: A Machine Learning Approach
 Handbook on Ontologies in Information Systems
, 2003
Abstract

Cited by 122 (2 self)
Finally, we describe a set of experiments on several real-world domains, and show that GLUE proposes highly accurate semantic mappings. 1 A Motivating Example: the Semantic Web The current World-Wide Web has well over 1.5 billion pages [2], but the vast majority of them are in human-readable format only (e.g., HTML). As a consequence, software agents (softbots) cannot understand and process this information, and much of the potential of the Web has so far remained untapped. In response, researchers have created the vision of the Semantic Web [5], where data has structure and ontologies describe the semantics of the data. When data is marked up using ontologies, softbots can better understand the semantics and therefore more intelligently locate and integrate data for a wide variety of tasks. The following example illustrates the vision of the Semantic Web. Example 1. Suppose you want to fi
Learning to Match Ontologies on the Semantic Web
, 2003
Abstract

Cited by 120 (2 self)
On the Semantic Web, data will inevitably come from many different ontologies, and information processing across ontologies is not possible without knowing the semantic mappings between them. Manually finding such mappings is tedious, error-prone, and clearly not possible at the Web scale. Hence, the development of tools to assist in the ontology mapping process is crucial to the success of the Semantic Web. We describe GLUE, a system that employs machine learning techniques to find such mappings. Given two ontologies, for each concept in one ontology GLUE finds the most similar concept in the other ontology. We give well-founded probabilistic definitions to several practical similarity measures, and show that GLUE can work with all of them. Another key feature of GLUE is that it uses multiple learning strategies, each of which exploits well a different type of information either in the data instances or in the taxonomic structure of the ontologies. To further improve matching accuracy, we extend GLUE to incorporate commonsense knowledge and domain constraints into the matching process. Our approach is thus distinguished in that it works with a variety of well-defined similarity notions and that it efficiently incorporates multiple types of knowledge. We describe a set of experiments on several real-world domains, and show that GLUE proposes highly accurate semantic mappings. Finally, we extend GLUE to find complex mappings between ontologies, and describe experiments that show the promise of the approach.
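One practical similarity measure of the kind described, a Jaccard-style measure P(A and B) / P(A or B), can be estimated from a shared set of instances. The sketch below uses invented concepts expressed as exact membership predicates; GLUE itself estimates these joint probabilities with multiple learners over instance data rather than with predicates, so treat this as a minimal illustration of the matching step only.

```python
def jaccard_similarity(instances, in_a, in_b):
    """Estimate P(A and B) / P(A or B) over a shared instance set;
    in_a and in_b are membership predicates for two concepts."""
    both = sum(1 for x in instances if in_a(x) and in_b(x))
    either = sum(1 for x in instances if in_a(x) or in_b(x))
    return both / either if either else 0.0

def best_match(concepts_a, concepts_b, instances):
    """For each concept in one ontology, find the most similar
    concept in the other ontology."""
    return {
        name_a: max(concepts_b,
                    key=lambda name_b: jaccard_similarity(
                        instances, concepts_a[name_a], concepts_b[name_b]))
        for name_a in concepts_a
    }

# Invented toy ontologies over a shared instance set:
instances = list(range(10))
a_concepts = {"Even": lambda x: x % 2 == 0}
b_concepts = {"EvenNumber": lambda x: x % 2 == 0,
              "OddNumber": lambda x: x % 2 == 1}
```

Here "Even" maps to "EvenNumber" with similarity 1.0 and to "OddNumber" with similarity 0.0, mirroring the one-best-concept matching the abstract describes.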
Learning Bayesian Networks from Data: An Information-Theory Based Approach
, 2001
Abstract

Cited by 118 (4 self)
This paper provides algorithms that use an information-theoretic analysis to learn Bayesian network structures from data. Based on our three-phase learning framework, we develop efficient algorithms that can effectively learn Bayesian networks, requiring only polynomial numbers of conditional independence (CI) tests in typical cases. We provide precise conditions that specify when these algorithms are guaranteed to be correct as well as empirical evidence (from real-world applications and simulation tests) that demonstrates that these systems work efficiently and reliably in practice.
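A conditional independence test of the kind such algorithms rely on can be approximated by estimating the conditional mutual information I(X;Y|Z) from sample counts and comparing it to a threshold. This is a minimal sketch for discrete variables; the threshold eps is an arbitrary illustrative choice, not a calibrated significance level.

```python
from collections import Counter
from math import log2

def conditional_mutual_information(triples):
    """Estimate I(X;Y|Z) in bits from a list of (x, y, z) samples.
    A value near zero suggests X and Y are independent given Z."""
    n = len(triples)
    xyz = Counter(triples)
    xz = Counter((x, z) for x, y, z in triples)
    yz = Counter((y, z) for x, y, z in triples)
    z_cnt = Counter(z for _, _, z in triples)
    cmi = 0.0
    for (x, y, z), c in xyz.items():
        p_xyz = c / n
        # p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) )
        cmi += p_xyz * log2((p_xyz * (z_cnt[z] / n)) /
                            ((xz[(x, z)] / n) * (yz[(y, z)] / n)))
    return cmi

def ci_test(triples, eps=0.01):
    """Declare X independent of Y given Z when estimated CMI < eps."""
    return conditional_mutual_information(triples) < eps
```

For example, samples where X and Y are each determined by Z pass the test (CMI is 0), while samples where X always equals Y regardless of Z fail it (CMI is 1 bit when X is a fair coin).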
RainForest: A Framework for Fast Decision Tree Construction of Large Datasets
 In VLDB
, 1998
Abstract

Cited by 112 (8 self)
Classification of large datasets is an important data mining problem. Many classification algorithms have been proposed in the literature, but studies have shown that so far no algorithm uniformly outperforms all other algorithms in terms of quality. In this paper, we present a unifying framework for decision tree classifiers that separates the scalability aspects of algorithms for constructing a decision tree from the central features that determine the quality of the tree. This generic algorithm is easy to instantiate with specific algorithms from the literature (including C4.5, CART,
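The separation the framework describes, scalability concerns in a generic growing loop versus quality concerns in a plug-in split criterion, can be sketched as follows. The Gini-based chooser and the in-memory row lists are illustrative assumptions, not the paper's actual data-access strategy; the point is only that the recursion never inspects how `chooser` ranks splits.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Plug-in split selector: (feature_index, threshold) minimizing
    the weighted Gini impurity of the two children, or None."""
    best, best_score, n = None, float("inf"), len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_score, best = score, (f, t)
    return best

def grow(rows, labels, chooser=best_split, depth=2):
    """Generic recursive builder: data handling lives here, while
    split quality lives entirely in the `chooser` plug-in."""
    if depth == 0 or len(set(labels)) == 1:
        return max(set(labels), key=labels.count)  # majority-class leaf
    split = chooser(rows, labels)
    if split is None:
        return max(set(labels), key=labels.count)
    f, t = split
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            grow([rows[i] for i in li], [labels[i] for i in li], chooser, depth - 1),
            grow([rows[i] for i in ri], [labels[i] for i in ri], chooser, depth - 1))
```

Swapping in a different `chooser` (an entropy- or error-based criterion, say) changes the quality of the tree without touching the scan-and-partition logic, which is the kind of instantiation flexibility the abstract claims for the framework.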