Results 1 - 10 of 25,403
A Maximum-Entropy-Inspired Parser
, 1999
"... We present a new parser for parsing down to Penn treebank style parse trees that achieves 90.1% average precision/recall for sentences of length 40 and less, and 89.5% for sentences of length 100 and less when trained and tested on the previously established [5,9,10,15,17] "standard" se ..."
Cited by 971 (19 self)
" sections of the Wall Street Journal tree bank. This represents a 13% decrease in error rate over the best single-parser results on this corpus [9]. The major technical innovation is the use of a "maximum-entropy-inspired" model for conditioning and smoothing that let us successfully to test
Divergence measures based on the Shannon entropy
 IEEE Transactions on Information Theory
, 1991
"... Abstract: A new class of information-theoretic divergence measures based on the Shannon entropy is introduced. Unlike the well-known Kullback divergences, the new measures do not require the condition of absolute continuity to be satisfied by the probability distributions involved. More importantly, ..."
Cited by 666 (0 self)
Entropy-Based Algorithms for Best Basis Selection
 IEEE Transactions on Information Theory
, 1992
"... pretations (position, frequency, and scale), and we have experimented with feature-extraction methods that use best-basis compression for front-end complexity reduction. The method relies heavily on the remarkable orthogonality properties of the new libraries. It is obviously a nonlinear transformat ..."
Cited by 675 (20 self)
, we can use information cost functionals defined for signals with normalized energy, since all expansions in a given library will conserve energy. Since two expansions will have the same energy globally, it is not necessary to normalize expansions to compare their costs. This feature greatly enlarges
Maximum entropy Markov models for information extraction and segmentation
, 2000
"... Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many text-related tasks, such as part-of-speech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial ..."
Cited by 561 (18 self)
, capitalization, formatting, part-of-speech), and defines the conditional probability of state sequences given observation sequences. It does this by using the maximum entropy framework to fit a set of exponential models that represent the probability of a state given an observation and the previous state. We
Shallow Parsing with Conditional Random Fields
, 2003
"... Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling tasks in language processing, shallow parsing has received much attention, with the development of standard evaluati ..."
Cited by 581 (8 self)
Global Optimization with Polynomials and the Problem of Moments
 SIAM JOURNAL ON OPTIMIZATION
, 2001
"... We consider the problem of finding the unconstrained global minimum of a real-valued polynomial p(x): R^n → R, as well as the global minimum of p(x), in a compact set K defined by polynomial inequalities. It is shown that this problem reduces to solving an (often finite) sequence of convex linear ma ..."
Cited by 577 (48 self)
matrix inequality (LMI) problems. A notion of Karush-Kuhn-Tucker polynomials is introduced in a global optimality condition. Some illustrative examples are provided.
Conditional random fields: Probabilistic models for segmenting and labeling sequence data
, 2001
"... We present conditional random fields, a framework for building probabilistic models to segment and label sequence data. Conditional random fields offer several advantages over hidden Markov models and stochastic grammars for such tasks, including the ability to relax strong independence assumptions ..."
Cited by 3485 (85 self)
made in those models. Conditional random fields also avoid a fundamental limitation of maximum entropy Markov models (MEMMs) and other discriminative Markov models based on directed graphical models, which can be biased towards states with few successor states. We present iterative parameter estimation
Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms
, 2002
"... We describe new algorithms for training tagging models, as an alternative to maximum-entropy models or conditional random fields (CRFs). The algorithms rely on Viterbi decoding of training examples, combined with simple additive updates. We describe theory justifying the algorithms through a modific ..."
Cited by 660 (13 self)
A Fast Marching Level Set Method for Monotonically Advancing Fronts
 PROC. NAT. ACAD. SCI
, 1995
"... We present a fast marching level set method for monotonically advancing fronts, which leads to an extremely fast scheme for solving the Eikonal equation. Level set methods are numerical techniques for computing the position of propagating fronts. They rely on an initial value partial differential eq ..."
Cited by 630 (24 self)
describe a particular case of such methods for interfaces whose speed depends only on local position. The technique works by coupling work on entropy conditions for interface motion, the theory of viscosity solutions for Hamilton-Jacobi equations and fast adaptive narrow band level set methods
Training Support Vector Machines: an Application to Face Detection
, 1997
"... We investigate the application of Support Vector Machines (SVMs) in computer vision. SVM is a learning technique developed by V. Vapnik and his team (AT&T Bell Labs.) that can be seen as a new method for training polynomial, neural network, or Radial Basis Functions classifiers. The decision sur ..."
Cited by 727 (1 self)
global optimality, and can be used to train SVM's over very large data sets. The main idea behind the decomposition is the iterative solution of subproblems and the evaluation of optimality conditions which are used both to generate improved iterative values, and also establish the stopping