Results 1 - 3 of 3
CCG Supertags in Factored Statistical Machine Translation
- In ACL Workshop on Statistical Machine Translation, 2007
Cited by 10 (0 self)
Combinatorial Categorial Grammar (CCG) supertags present phrase-based machine translation with an opportunity to access rich syntactic information at a word level. The challenge is incorporating this information into the translation process. Factored translation models allow the inclusion of supertags as a factor in the source or target language. We show that this results in an improvement in the quality of translation and that the value of syntactic supertags in flat structured phrase-based models is largely due to better local reorderings.
HKUST Statistical Machine Translation Experiments for IWSLT 2007
Cited by 1 (1 self)
This paper describes the HKUST experiments in the IWSLT 2007 evaluation campaign on spoken language translation. Our primary objective was to compare the open-source phrase-based statistical machine translation toolkit Moses against Pharaoh. We focused on Chinese to English translation, but we also report results on the Arabic to English, Italian to English, and Japanese to English tasks.
Automatic Sentence Structure Annotation for Spoken Language Processing
- 2008
Cited by 1 (0 self)
Increasing amounts of easily available electronic data are precipitating a need for automatic processing that can aid humans in digesting large amounts of data. Speech and video are becoming an increasingly significant portion of on-line information, from news and television broadcasts, to oral histories, on-line lectures, or user generated content. Automatic processing of audio and video sources requires automatic speech recognition (ASR) in order to provide transcripts. Typical ASR generates only words, without punctuation, capitalization, or further structure. Many techniques available from natural language processing therefore suffer when applied to speech recognition output, because they assume the presence of reliable punctuation and structure. In addition, errors from automatic transcription also degrade the performance of downstream processing such as machine translation, name detection, or information retrieval. We develop approaches for automatically annotating structure in speech, including sentence and sub-sentence segmentation, and then turn towards optimizing ASR and annotation for downstream applications.

