Integrated transcription and identification of named entities in broadcast speech (1999)

by S Renals, Y Gotoh
Venue: In Proc. Eurospeech
Results 1 - 2 of 2

Automatic Sentence Structure Annotation for Spoken Language Processing

by Dustin Lundring Hillard, 2008
Abstract - Cited by 1 (0 self)
Increasing amounts of easily available electronic data are precipitating a need for automatic processing that can aid humans in digesting large amounts of data. Speech and video are becoming an increasingly significant portion of on-line information, from news and television broadcasts, to oral histories, on-line lectures, or user generated content. Automatic processing of audio and video sources requires automatic speech recognition (ASR) in order to provide transcripts. Typical ASR generates only words, without punctuation, capitalization, or further structure. Many techniques available from natural language processing therefore suffer when applied to speech recognition output, because they assume the presence of reliable punctuation and structure. In addition, errors from automatic transcription also degrade the performance of downstream processing such as machine translation, name detection, or information retrieval. We develop approaches for automatically annotating structure in speech, including sentence and sub-sentence segmentation, and then turn towards optimizing ASR and annotation for downstream applications.

In memory of my brother,

by Ingrid Ahmer, Thor Christopher Ahmer, 1955
Abstract
This thesis addresses the application of automatic speech recognition to the task of offline closed-captioning of television programs, and describes the collection of corpora to support such research and an exploration of issues to be addressed. The use of automatic speech recognition (ASR) for transcription of broadcast speech and as an aid to captioning is reviewed. As background to the task, the methodology for large vocabulary continuous speech recognition (LVCSR) is presented, with particular attention given to the issues of large vocabulary language modelling and consideration of the acoustic complexity arising in broadcast material. A speech corpus of segmented and transcribed speech utterances for ten program episodes was developed for a typical genre of television programming (travelogues) for which offline closed-captions are applied. The development of this corpus demonstrates the feasibility of using existing closed-caption sources for generating labelled acoustic data suitable for speech recognition research. The speech corpus exhibits far greater acoustic complexity and much lower signal-to-noise ratios than occur in broadcast news data (which has been systematically evaluated in ASR research). Noise-tolerant speech recognisers were developed and effectively