Incorporating nonlocal information into information extraction systems by Gibbs sampling
In ACL, 2005
"... Most current statistical natural language processing models use only local features so as to permit dynamic programming in inference, but this makes them unable to fully account for the long distance structure that is prevalent in language use. We show how to solve this dilemma with Gibbs sampling, a simple Monte Carlo method used to perform approximate inference in factored probabilistic models. By using simulated annealing in place of Viterbi decoding in sequence models such as HMMs, CMMs, and CRFs, it is possible to incorporate nonlocal structure while preserving tractable inference. We ..."
Cited by 730 (25 self)
A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood
2003
"... The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The ..."
Cited by 2182 (27 self)
The rate-distortion function for source coding with side information at the decoder
IEEE Trans. Inform. Theory, 1976
"... Abstract: Let $\{(X_k, Y_k)\}_{k=1}^{\infty}$ be a sequence of independent drawings of a pair of dependent random variables X, Y. Let us say that X takes values in the finite set $\mathcal{X}$. It is desired to encode the sequence $\{X_k\}$ in blocks of length n into a binary stream of rate R, which can in turn be decoded as a seque ..."
Cited by 1060 (1 self)
1. FORGETTING CONCATENATION AND REDUCTION SEQUENCE
"... this paper. 1. FORGETTING CONCATENATION AND REDUCTION SEQUENCE Let p, q be finite sequences. The functor p q yielding a finite sequence is defined by: (Def. 1)(i) p q = p q if p = ∅ or q = ∅, (ii) there exists a natural number i and there exists a finite sequence r such that len p = i + ..."
The Nature and Growth of Vertical Specialization in World Trade
 Journal of International Economics
"... Abstract: Dramatic changes are occurring in the nature of international trade. Production processes increasingly involve a sequential, vertical trading chain stretching across many countries, with each country specializing in particular stages of a good’s production sequence. We document a key aspe ..."
Cited by 481 (20 self)
A Post-Processing System To Yield Reduced Word Error Rates: Recognizer Output Voting Error Reduction (ROVER)
1997
"... This paper describes a system developed at NIST to produce a composite Automatic Speech Recognition (ASR) system output when the outputs of multiple ASR systems are available, and for which, in many cases, the composite ASR output has lower error rate than any of the individual systems. The system implements a "voting" or rescoring process to reconcile differences in ASR system outputs. We refer to this system as the NIST Recognizer Output Voting Error Reduction (ROVER) system. As additional knowledge sources are added to an ASR system (e.g., acoustic and language models), error rates ..."
Cited by 422 (2 self)
On the Hardness of Computing Maximum Self-Reduction Sequences
2000
"... Various combinatorial problems on graphs can be approached by reducing the size of the graph according to certain rules. Given an instance of a graph problem, it is desirable to apply a sequence of self-reductions that is as long as possible so that the remaining graph is as small as possible. We s ..."
AN INSERTION OPERATOR PRESERVING INFINITE REDUCTION SEQUENCES
2008
"... A common way to show the termination of the union of two abstract reduction systems, provided both systems terminate, is to prove they enjoy a specific property (some sort of “commutation” for instance). This specific property is actually used to show that, for the union not to terminate, one out ..."
Cited by 1 (0 self)
Temporal sequence learning and data reduction for anomaly detection
ACM TRANSACTIONS ON INFORMATION SYSTEMS SECURITY, 1999
"... The anomaly detection problem can be formulated as one of learning to characterize the behaviors of an individual, system, or network in terms of temporal sequences of discrete data. We present an approach to this problem based on instance-based learning (IBL) techniques. To cast the anomaly detecti ..."
Cited by 191 (6 self)
Early Results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons
2003
"... This paper presents a feature induction method for CRFs. Founded on the principle of constructing only those feature conjunctions that significantly increase log-likelihood, the approach builds on that of Della Pietra et al. (1997), but is altered to work with conditional rather than joint probabilities, and with a mean-field approximation and other additional modifications that improve efficiency specifically for a sequence model. In comparison with traditional approaches, automated feature induction offers both improved accuracy and significant reduction in feature count; it enables the use ..."
Cited by 267 (12 self)
