A Bit of Progress in Language Modeling, 2001
Cited by 70 (1 self)
Language modeling is the art of determining the probability of a sequence of words. This is useful in a large variety of areas including speech recognition, optical character recognition, handwriting recognition, machine translation, and spelling correction (Church, 1988; Brown et al., 1990; Hull, 1992; Kernighan et al., 1990; Srihari and Baltus, 1992). The most commonly used language models are very simple (e.g., a Katz-smoothed trigram model). There are many improvements over this simple model, however, including caching, clustering, higher-order n-grams, skipping models, and sentence-mixture models, all of which we will describe below. Unfortunately, these more complicated techniques have rarely been examined in combination. It is entirely possible that two techniques that work well separately will not work well together, and, as we will show, even possible that some techniques will work better together than either one does by itself. In this...
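As a rough illustration of what "determining the probability of a sequence of words" means for a trigram model, the sketch below scores a sentence as a product of per-word probabilities, each conditioned on the two preceding words. It is not the method from the paper: it uses add-one (Laplace) smoothing in place of Katz smoothing to stay short, and the function names, the `<s>`/`</s>` padding tokens, and the toy corpus are all assumptions made for this example.

```python
import math
from collections import Counter


def train_trigram_counts(sentences):
    """Count trigrams and their two-word contexts over padded sentences."""
    trigrams, contexts, vocab = Counter(), Counter(), set()
    for words in sentences:
        # Two start symbols give the first real word a full context.
        padded = ["<s>", "<s>"] + list(words) + ["</s>"]
        vocab.update(padded[2:])  # predictable tokens, including </s>
        for i in range(2, len(padded)):
            trigrams[tuple(padded[i - 2:i + 1])] += 1
            contexts[tuple(padded[i - 2:i])] += 1
    return trigrams, contexts, vocab


def sentence_log_prob(words, trigrams, contexts, vocab):
    """Log-probability of a sentence under an add-one-smoothed trigram model.

    Assumption: add-one smoothing stands in for Katz smoothing here.
    Words outside the training vocabulary are not handled specially.
    """
    padded = ["<s>", "<s>"] + list(words) + ["</s>"]
    vocab_size = len(vocab)
    total = 0.0
    for i in range(2, len(padded)):
        context = tuple(padded[i - 2:i])
        trigram = tuple(padded[i - 2:i + 1])
        total += math.log((trigrams[trigram] + 1) / (contexts[context] + vocab_size))
    return total


if __name__ == "__main__":
    corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]  # toy data
    tri, ctx, vocab = train_trigram_counts(corpus)
    print(sentence_log_prob(["the", "cat", "sat"], tri, ctx, vocab))
    print(sentence_log_prob(["sat", "cat", "the"], tri, ctx, vocab))
```

A word order seen in training receives a higher score than a scrambled one, which is the property that the applications listed in the abstract rely on.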
Robust Knowledge Discovery from Parallel Speech and Text Sources, 2001
INTRODUCTION
As a by-product of the recent information explosion, the same basic facts are often available from multiple sources such as the Internet, television, radio and newspapers. We present here a project currently in its early stages that aims to take advantage of the redundancies in parallel sources to achieve robustness in automatic knowledge extraction. Consider, for instance, the following sampling of actual news from various sources on a particular day:
CNN: James McDougal, President Bill Clinton's former business partner in Arkansas and a cooperating witness in the Whitewater investigation, died Sunday while serving a federal prison term. He was 57.
MSNBC: Fort Worth, Texas, March 8. Whitewater figure James McDougal died of an apparent heart attack in a private community hospital in Fort Worth, Texas, Sunday. He was 57.
ABC News: Washington, March 8. James McDougal, a key figure in Independent Counsel Kenneth Starr's Whitewater investigation, is dead.
The Detroit Ne

