Results 1  10
of
120
Optimal Bit Allocation via the Generalized BFOS Algorithm
 IEEE Transactions on Information Theory
, 1991
"... We analyze the use of the generalized Breiman, Friedman, Olshen, and Stone (BFOS) algorithm, a recently developed technique for variable rate vector quantizer design, for optimal bit allocation. It is shown that if each source has a convex quantizer function then the complexity of the algorithm is l ..."
Abstract

Cited by 71 (6 self)
 Add to MetaCart
We analyze the use of the generalized Breiman, Friedman, Olshen, and Stone (BFOS) algorithm, a recently developed technique for variable rate vector quantizer design, for optimal bit allocation. It is shown that if each source has a convex quantizer function then the complexity of the algorithm is low.
A Hierarchical Stochastic Model for Automatic Prediction of Prosodic Boundary Location
 COMPUTATIONAL LINGUISTICS
, 1994
"... Prosodic phrase structure ..."
Tree Based Discretization for Continuous State Space Reinforcement Learning
, 1998
"... Reinforcement learning is an effective technique for learning action policies in discrete stochastic environments, but its efficiency can decay exponentially with the size of the state space. In many situations significant portions of a large state space may be irrelevant to a specific goal and can ..."
Abstract

Cited by 57 (6 self)
 Add to MetaCart
(Show Context)
Reinforcement learning is an effective technique for learning action policies in discrete stochastic environments, but its efficiency can decay exponentially with the size of the state space. In many situations significant portions of a large state space may be irrelevant to a specific goal and can be aggregated into a few, relevant, states. The U Tree algorithm generates a tree based state discretization that efficiently finds the relevant state chunks of large propositional domains. In this paper, we extend the U Tree algorithm to challenging domains with a continuous state space for which there is no initial discretization. This Continuous U Tree algorithm transfers traditional regression tree techniques to reinforcement learning. We have performed experiments in a variety of domains that show that Continuous U Tree effectively handles large continuous state spaces. In this paper, we report on results in two different domains, one gives a clear visualization of the algorithm and another empirically demonstrates an effective state discretization in a simple multiagent environment.
Learning DNF by decision trees
 Proceedings of the Eleventh International Joint Conference on Artificial Intelligence
, 1989
"... We investigate the problem of learning DNF concepts from examples using decision trees as a concept description language. Due to the replication problem, DNF concepts do not always have a concise decision tree description when the tests at the nodes are limited to the initial attributes. However, th ..."
Abstract

Cited by 48 (1 self)
 Add to MetaCart
We investigate the problem of learning DNF concepts from examples using decision trees as a concept description language. Due to the replication problem, DNF concepts do not always have a concise decision tree description when the tests at the nodes are limited to the initial attributes. However, the representational complexity may be overcome by using high level attributes as tests. We present a novel algorithm that modifies the initial bias determined by the primitive attributes by adaptively enlarging the attribute set with high level attributes. We show empirically that this algorithm outperforms a standard decision tree algorithm for learning small random DNF with and without noise, when the examples are drawn from the uniform distribution. 1
Learning Text Analysis Rules For DomainSpecific Natural Language Processing
, 1997
"... An enormous amount of knowledge is needed to infer the meaning of unrestricted natural language. The problem can be reduced to a manageable size by restricting attention to a specific domain, which is a corpus of texts together with a predefined set of concepts that are of interest to that domain. T ..."
Abstract

Cited by 37 (5 self)
 Add to MetaCart
An enormous amount of knowledge is needed to infer the meaning of unrestricted natural language. The problem can be reduced to a manageable size by restricting attention to a specific domain, which is a corpus of texts together with a predefined set of concepts that are of interest to that domain. Two widely different domains are used to illustrate this domainspecific approach. One domain is a collection of Wall Street Journal articles in which the target concept is management succession events: identifying persons moving into corporate management positions or moving out. A second domain is a collection of hospital discharge summaries in which the target concepts are various classes of diagnosis or symptom.
Analysis of a Random Forests Model
, 2012
"... Random forests are a scheme proposed by Leo Breiman in the 2000’s for building a predictor ensemble with a set of decision trees that grow in randomly selected subspaces of data. Despite growing interest and practical use, there has been little exploration of the statistical properties of random for ..."
Abstract

Cited by 34 (2 self)
 Add to MetaCart
Random forests are a scheme proposed by Leo Breiman in the 2000’s for building a predictor ensemble with a set of decision trees that grow in randomly selected subspaces of data. Despite growing interest and practical use, there has been little exploration of the statistical properties of random forests, and little is known about the mathematical forces driving the algorithm. In this paper, we offer an indepth analysis of a random forests model suggested by Breiman (2004), which is very close to the original algorithm. We show in particular that the procedure is consistent and adapts to sparsity, in the sense that its rate of convergence depends only on the number of strong features and not on how many noise variables are present.
A Global Optimization Technique for Statistical Classifier Design
 IEEE Transactions on Signal Processing
"... A global optimization method is introduced for the design of statistical classifiers that minimize the rate of misclassification. We first derive the theoretical basis for the method, based on which we develop a novel design algorithm and demonstrate its effectiveness and superior performance in the ..."
Abstract

Cited by 28 (10 self)
 Add to MetaCart
(Show Context)
A global optimization method is introduced for the design of statistical classifiers that minimize the rate of misclassification. We first derive the theoretical basis for the method, based on which we develop a novel design algorithm and demonstrate its effectiveness and superior performance in the design of practical classifiers for some of the most popular structures currently in use. The method, grounded in ideas from statistical physics and information theory, extends the deterministic annealing approach for optimization, both to incorporate structural constraints on data assignments to classes and to minimize the probability of error as the cost objective. During the design, data are assigned to classes in probability, so as to minimize the expected classification error given a specified level of randomness, as measured by Shannon's entropy. The constrained optimization is equivalent to a free energy minimization, motivating a deterministic annealing approach in which the entropy...
Comparative Analysis of Serial Decision Tree Classification Algorithms
"... Classification of data objects based on a predefined knowledge of the objects is a data mining and knowledge management technique used in grouping similar data objects together. It can be defined as supervised learning algorithms as it assigns class labels to data objects based on the relationship b ..."
Abstract

Cited by 21 (0 self)
 Add to MetaCart
Classification of data objects based on a predefined knowledge of the objects is a data mining and knowledge management technique used in grouping similar data objects together. It can be defined as supervised learning algorithms as it assigns class labels to data objects based on the relationship between the data items with a predefined class label. Classification algorithms have a wide range of applications like churn prediction, fraud detection, artificial intelligence, and credit card rating etc. Also there are many classification algorithms available in literature but decision trees is the most commonly used because of its ease of implementation and easier to understand compared to other classification algorithms. Decision Tree classification algorithm can be implemented in a serial or parallel fashion based on the volume of data, memory space available on the computer resource and scalability of the algorithm. In this paper we will review the serial implementations of the decision tree algorithms, identify those that are commonly used. We will also use experimental analysis based on sample data records (Statlog data sets) to evaluate the performance of the commonly used serial decision tree algorithms.
Treebased modeling of prosodic phrasing and segmental duration for Korean TTS systems
"... This study describes the treebased modeling of prosodic phrasing, pause duration between phrases and segmental duration for Korean TTS systems. We collected 400 sentences from various genres and built a corresponding speech corpus uttered by a professional female announcer. The phonemic and prosodi ..."
Abstract

Cited by 21 (0 self)
 Add to MetaCart
This study describes the treebased modeling of prosodic phrasing, pause duration between phrases and segmental duration for Korean TTS systems. We collected 400 sentences from various genres and built a corresponding speech corpus uttered by a professional female announcer. The phonemic and prosodic boundaries were manually marked on the recorded speech, and morphological analysis, graphemetophoneme conversion and syntactic analysis were also done on the text. A decision tree and regression trees were trained on 240 sentences (of approximately 20 minutes length), and tested on 160 sentences (of approximately 13 minutes length). Features for modeling prosody are proposed, and their effectiveness is measured by interpreting the resulting trees. The misclassification rate of the decision tree was 14.46%, the RMSEs of the regression trees, which predict pause duration and segmental duration, were 132 ms and 22 ms respectively for the test set. To understand the performance of our approac...