Robust Bilingual Word Alignment for Machine Aided Translation
In Proceedings of the Workshop on Very Large Corpora, 1993
Cited by 64 (2 self)
We have developed a new program called word_align for aligning parallel text, text such as the Canadian Hansards that are available in two or more languages. The program takes the output of char_align (Church, 1993), a robust alternative to sentence-based alignment programs, and applies word-level constraints using a version of Brown et al.'s Model 2 (Brown et al., 1993), modified and extended to deal with robustness issues. Word_align was tested on a subset of Canadian Hansards supplied by Simard (Simard et al., 1992). The combination of word_align plus char_align reduces the variance (average square error) by a factor of 5 over char_align alone. More importantly, because word_align and char_align were designed to work robustly on texts that are smaller and more noisy than the Hansards, it has been possible to successfully deploy the programs at AT&T Language Line Services, a commercial translation service, to help them with difficult terminology.
Probabilistic Tree-Adjoining Grammar As A Framework
In Proceedings of the 14th International Conference on Computational Linguistics, 1992
In this paper, I argue for the use of a probabilistic form of tree-adjoining grammar (TAG) in statistical natural language processing. I first discuss two previous statistical approaches --- one that concentrates on the probabilities of structural operations, and another that emphasizes co-occurrence relationships between words. I argue that a purely structural approach, exemplified by probabilistic context-free grammar, lacks sufficient sensitivity to lexical context, and, conversely, that lexical co-occurrence analyses require a richer notion of locality that is best provided by importing some notion of structure.

