Results 1  10
of
135,061
Maximum entropy markov models for information extraction and segmentation
, 2000
"... Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many textrelated tasks, such as partofspeech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial ..."
Abstract

Cited by 554 (18 self)
 Add to MetaCart
, capitalization, formatting, partofspeech), and defines the conditional probability of state sequences given observation sequences. It does this by using the maximum entropy framework to fit a set of exponential models that represent the probability of a state given an observation and the previous state. We
Antide Sitter Space, Thermal Phase Transition, and Confinement in Gauge Theories
 Adv. Theor. Math. Phys
, 1998
"... The correspondence between supergravity (and string theory) on AdS space and boundary conformal field theory relates the thermodynamics of N = 4 super YangMills theory in four dimensions to the thermodynamics of Schwarzschild black holes in Antide Sitter space. In this description, quantum phenome ..."
Abstract

Cited by 1087 (4 self)
 Add to MetaCart
phenomena such as the spontaneous breaking of the center of the gauge group, magnetic confinement, and the mass gap are coded in classical geometry. The correspondence makes it manifest that the entropy of a very large AdS Schwarzschild black hole must scale “holographically ” with the volume of its horizon
TnT  A Statistical PartOfSpeech Tagger
, 2000
"... Trigrams'n'Tags (TnT) is an efficient statistical partofspeech tagger. Contrary to claims found elsewhere in the literature, we argue that a tagger based on Markov models performs at least as well as other current approaches, including the Maximum Entropy framework. A recent comparison h ..."
Abstract

Cited by 525 (5 self)
 Add to MetaCart
Trigrams'n'Tags (TnT) is an efficient statistical partofspeech tagger. Contrary to claims found elsewhere in the literature, we argue that a tagger based on Markov models performs at least as well as other current approaches, including the Maximum Entropy framework. A recent comparison
Shallow Parsing with Conditional Random Fields
, 2003
"... Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling tasks in language processing, shallow parsing has received much attention, with the development of standard evaluati ..."
Abstract

Cited by 575 (8 self)
 Add to MetaCart
optimization algorithms were critical in achieving these results. We present extensive comparisons between models and training methods that confirm and strengthen previous results on shallow parsing and training methods for maximumentropy models.
Graphical models, exponential families, and variational inference
, 2008
"... The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building largescale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fiel ..."
Abstract

Cited by 800 (26 self)
 Add to MetaCart
of probability distributions — are best studied in the general setting. Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing
Iterative decoding of binary block and convolutional codes
 IEEE Trans. Inform. Theory
, 1996
"... Abstract Iterative decoding of twodimensional systematic convolutional codes has been termed “turbo ” (de)coding. Using loglikelihood algebra, we show that any decoder can he used which accepts soft inputsincluding a priori valuesand delivers soft outputs that can he split into three terms: the ..."
Abstract

Cited by 600 (43 self)
 Add to MetaCart
is controlled by a stop criterion derived from cross entropy, which results in a minimal number of iterations. Optimal and suboptimal decoders with reduced complexity are presented. Simulation results show that very simple component codes are sufficient, block codes are appropriate for high rates
Fuzzy extractors: How to generate strong keys from biometrics and other noisy data. Technical Report 2003/235, Cryptology ePrint archive, http://eprint.iacr.org, 2006. Previous version appeared at EUROCRYPT 2004
 34 [DRS07] [DS05] [EHMS00] [FJ01] Yevgeniy Dodis, Leonid Reyzin, and Adam
, 2004
"... We provide formal definitions and efficient secure techniques for • turning noisy information into keys usable for any cryptographic application, and, in particular, • reliably and securely authenticating biometric data. Our techniques apply not just to biometric information, but to any keying mater ..."
Abstract

Cited by 532 (38 self)
 Add to MetaCart
We provide formal definitions and efficient secure techniques for • turning noisy information into keys usable for any cryptographic application, and, in particular, • reliably and securely authenticating biometric data. Our techniques apply not just to biometric information, but to any keying material that, unlike traditional cryptographic keys, is (1) not reproducible precisely and (2) not distributed uniformly. We propose two primitives: a fuzzy extractor reliably extracts nearly uniform randomness R from its input; the extraction is errortolerant in the sense that R will be the same even if the input changes, as long as it remains reasonably close to the original. Thus, R can be used as a key in a cryptographic application. A secure sketch produces public information about its input w that does not reveal w, and yet allows exact recovery of w given another value that is close to w. Thus, it can be used to reliably reproduce errorprone biometric inputs without incurring the security risk inherent in storing them. We define the primitives to be both formally secure and versatile, generalizing much prior work. In addition, we provide nearly optimal constructions of both primitives for various measures of “closeness” of input data, such as Hamming distance, edit distance, and set difference.
High confidence visual recognition of persons by a test of statistical independence
 IEEE Trans. on Pattern Analysis and Machine Intelligence
, 1993
"... Abstruct A method for rapid visual recognition of personal identity is described, based on the failure of a statistical test of independence. The most unique phenotypic feature visible in a person’s face is the detailed texture of each eye’s iris: An estimate of its statistical complexity in a samp ..."
Abstract

Cited by 596 (8 self)
 Add to MetaCart
Abstruct A method for rapid visual recognition of personal identity is described, based on the failure of a statistical test of independence. The most unique phenotypic feature visible in a person’s face is the detailed texture of each eye’s iris: An estimate of its statistical complexity in a sample of the human population reveals variation corresponding to several hundred independent degreesoffreedom. Morphogenetic randomness in the texture expressed phenotypically in the iris trabecular meshwork ensures that a test of statistical independence on two coded patterns originating from different eyes is passed almost certainly, whereas the same test is failed almost certainly when the compared codes originate from the same eye. The visible texture of a person’s iris in a realtime video image is encoded into a compact sequence of multiscale quadrature 2D Gabor wavelet coefficients, whose mostsignificant bits comprise a 256byte “iris code. ” Statistical decision theory generates identification decisions from ExclusiveOR comparisons of complete iris codes at the rate of 4000 per second, including calculation of decision confidence levels. The distributions observed empirically in such comparisons imply a theoretical “crossover ” error rate of one in 131000 when a decision criterion is adopted that would equalize the false accept and false reject error rates. In the typical recognition case, given the mean observed degree of iris code agreement, the decision confidence levels correspond formally to a conditional false accept probability of one in about lo”’. Index Terms Image analysis, statistical pattern recognition, biometric identification, statistical decision theory, 2D Gabor filters, wavelets, texture analysis, morphogenesis. I.
The information bottleneck method
 University of Illinois
, 1999
"... We define the relevant information in a signal x ∈ X as being the information that this signal provides about another signal y ∈ Y. Examples include the information that face images provide about the names of the people portrayed, or the information that speech sounds provide about the words spoken. ..."
Abstract

Cited by 545 (38 self)
 Add to MetaCart
We define the relevant information in a signal x ∈ X as being the information that this signal provides about another signal y ∈ Y. Examples include the information that face images provide about the names of the people portrayed, or the information that speech sounds provide about the words spoken. Understanding the signal x requires more than just predicting y, it also requires specifying which features of X play a role in the prediction. We formalize this problem as that of finding a short code for X that preserves the maximum information about Y. That is, we squeeze the information that X provides about Y through a ‘bottleneck ’ formed by a limited set of codewords ˜X. This constrained optimization problem can be seen as a generalization of rate distortion theory in which the distortion measure d(x, ˜x) emerges from the joint statistics of X and Y. This approach yields an exact set of self consistent equations for the coding rules X → ˜ X and ˜ X → Y. Solutions to these equations can be found by a convergent re–estimation method that generalizes the Blahut–Arimoto algorithm. Our variational principle provides a surprisingly rich framework for discussing a variety of problems in signal processing and learning, as will be described in detail elsewhere. 1 1
FeatureRich PartofSpeech Tagging with a Cyclic Dependency Network
 IN PROCEEDINGS OF HLTNAACL
, 2003
"... We present a new partofspeech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective ..."
Abstract

Cited by 660 (23 self)
 Add to MetaCart
We present a new partofspeech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional loglinear models, and (iv) finegrained modeling of unknown word features. Using these ideas together, the resulting tagger gives a 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result.
Results 1  10
of
135,061