Results 11 - 16 of 16
Exploiting Syntactic and Distributional Information for Spelling Correction with Web-Scale N-gram Models
Abstract
We propose a novel way of incorporating dependency parse and word co-occurrence information into a state-of-the-art web-scale n-gram model for spelling correction. The syntactic and distributional information provides extra evidence in addition to that provided by a web-scale n-gram corpus and especially helps with data sparsity problems. Experimental results show that introducing syntactic features into n-gram based models significantly reduces errors by up to 12.4% over the current state of the art. The word co-occurrence information shows potential but only improves overall accuracy slightly.
NADA: A Robust System for Non-Referential Pronoun Detection
Abstract
Nada is a novel, publicly available program that accurately distinguishes between the referential and non-referential pronoun it in raw English text. Like recent state-of-the-art approaches, Nada uses very large-scale web N-gram features, but Nada makes these features practical by compressing the N-gram counts so they can fit into computer memory. Nada therefore operates as a fast, stand-alone system. Nada also improves over previous web-scale systems by considering the entire sentence, rather than narrow context windows, via long-distance lexical features. Nada very substantially outperforms other state-of-the-art systems in non-referential detection accuracy.
Unsupervised Learning on an Approximate Corpus
Abstract
Unsupervised learning techniques can take advantage of large amounts of unannotated text, but the largest text corpus (the Web) is not easy to use in its full form. Instead, we have statistics about this corpus in the form of n-gram counts (Brants and Franz, 2006). While n-gram counts do not directly provide sentences, a distribution over sentences can be estimated from them in the same way that n-gram language models are estimated. We treat this distribution over sentences as an approximate corpus and show how unsupervised learning can be performed on such a corpus using variational inference. We compare hidden Markov model (HMM) training on exact and approximate corpora of various sizes, measuring speed and accuracy on unsupervised part-of-speech tagging.
Rel-grams: A Probabilistic Model of Relations in Text
Abstract
We introduce the Rel-grams language model, which is analogous to an n-grams model, but is computed over relations rather than over words. The model encodes the conditional probability of observing a relational tuple R, given that R′ was observed in a window of prior relational tuples. We build a database of Rel-grams co-occurrence statistics from ReVerb extractions over 1.8M newswire documents and show that a graphical model based on these statistics is useful for automatically discovering event templates. We make this database freely available and hope it will prove a useful resource for a wide variety of NLP tasks.
A Beam-Search Decoder for Grammatical Error Correction
Abstract
We present a novel beam-search decoder for grammatical error correction. The decoder iteratively generates new hypothesis corrections from current hypotheses and scores them based on features of grammatical correctness and fluency. These features include scores from discriminative classifiers for specific error categories, such as articles and prepositions. Unlike all previous approaches, our method is able to perform correction of whole sentences with multiple and interacting errors while still taking advantage of powerful existing classifier approaches. Our decoder achieves an F1 correction score significantly higher than all previous published scores on the Helping Our Own (HOO) shared task data set.

