Results 1 - 10 of 21
Weighted Finite-State Transducers in Speech Recognition
, 2001
Cited by 101 (3 self)
We survey the use of weighted finite-state transducers (WFSTs) in speech recognition. We show that WFSTs provide a common and natural representation for HMM models, context-dependency, pronunciation dictionaries, grammars, and alternative recognition outputs. Furthermore, general transducer operations combine these representations flexibly and efficiently. Weighted
An Efficient Compiler for Weighted Rewrite Rules
- In 34th Annual Meeting of the Association for Computational Linguistics
, 1996
Cited by 67 (23 self)
Context-dependent rewrite rules are used in many areas of natural language and speech processing. Work in computational phonology has demonstrated that, given certain conditions, such rewrite rules can be represented as finite-state transducers (FSTs). We describe a new algorithm for compiling rewrite rules into FSTs. We show the algorithm to be simpler and more efficient than existing algorithms. Further, many
Effective Use of Natural Language Processing Techniques for Automatic Conflation of Multi-Word Terms: The Role of Derivational Morphology, Part of Speech Tagging, and Shallow Parsing
- In Research and Development in Information Retrieval
Cited by 20 (3 self)
We present a corpus-based system to expand multi-word index terms using a part-of-speech tagger and a full-fledged derivational morphological system, combined with a shallow parser. The system has been applied to French. The unique contribution of the research is in using these linguistically based tools with safety filters in order to avoid the problems of degradation typically associated with derivational analysis and generation. The successful expansion, and thus conflation, of terms increases indexing coverage up to 30% with precision of nearly 90% for correct identification of related terms. The fully implemented system is described with particular attention to the role of derivational morphology and phrasal relations. Results and evaluation are presented in terms of precision and recall, with an analysis and discussion of errors. This paper illustrates how natural language processing tools, when combined effectively for tasks to which they are especially suited, indicates the pote...
Heterogeneous Relation Graphs as a Mechanism for Representing Linguistic Information
- Speech Communications
, 2001
Cited by 13 (2 self)
This paper describes Heterogeneous Relation Graphs (HRG), a formalism for describing linguistic structures. HRG was developed for use in a speech synthesis system (Festival (Black et al., 1996-1999)), and its design reflects the specific needs of such a system. However, in essence HRG can also be used to store any type of linguistic structure and we have found it useful for database annotation and other speech and language applications. Storing linguistic information in speech synthesis systems presents some particularly interesting problems that distinguish this from some other formalisms used in speech and language processing. Foremost of these is that the linguistic data processed in a synthesis system is linguistically heterogeneous. That is, rather than dealing with syntax or phonology independently, synthesizers can be involved in text analysis, syntactic analysis, morphology, phonology, phonetics, prosody, articulatory control and acoustics. It is highly des...
Mixed-Lingual Text Analysis for Polyglot TTS Synthesis
, 2003
Cited by 12 (5 self)
Text-to-speech (TTS) synthesis is more and more confronted with the language mixing phenomenon. An important step towards the solution of this problem and thus towards a so-called polyglot TTS system is an analysis component for mixed-lingual texts. In this paper it is shown how such an analyzer can be realized for a set of languages, starting from a corresponding set of monolingual analyzers which are based on DCGs and chart parsing.
Issues in Text-to-Speech Conversion for Mandarin
, 1996
Cited by 6 (3 self)
Research on text-to-speech (TTS) conversion for Mandarin Chinese is a much younger enterprise than comparable research for English or other European languages. Nonetheless, impressive progress has been made over the last couple of decades, and Mandarin Chinese systems now exist which approach, or in some ways even surpass in quality available systems for English. This article has two goals. The first is to summarize the published literature on Mandarin synthesis, with a view to clarifying the similarities or differences among the various efforts. One property shared by a great many systems is the dependence on the syllable as the basic unit of synthesis. We shall argue that this property stems both from the accidental fact that Mandarin has a small number of syllable types, and from traditional Sinological views of the linguistic structure of Chinese. Despite the popularity of the syllable, though, there are problems with using it as the basic synthesis unit, as we shall show. The seco...
Integrating Geometrical and Linguistic Analysis for E-Mail Signature Block Parsing
, 1999
Cited by 6 (0 self)
H. Chen, J. Hu and R. W. Sproat. The rapidly increasing use of the Internet in recent years has made e-mail one of the most common forms of business and personal communication. How to manage the large and dynamic collections of e-mail documents for efficient storage and information retrieval, and how to provide conversions between e-mail and other forms of messages (e.g., voice mail and fax) to allow convenient access whenever and wherever the user needs, are some of the most important research areas in multimedia messaging. The content of modern-day e-mail has expanded beyond text to include encoded docum...
Multi-Context Rules for Phonological Processing in Polyglot TTS Synthesis
, 2004
Cited by 6 (4 self)
Polyglot text-to-speech synthesis, i.e. the synthesis of sentences containing one or more inclusions from other languages, primarily depends on an accurate morphosyntactic analyzer for such mixed-lingual texts. From the output of this analyzer, the pronunciation can be derived by means of phonological transformations which are language-specific and depend on various contexts. In this paper a new rule formalism for such phonological transformations is presented, which complies also with the requirements of the mixed-lingual situation.
A formal computational analysis of Indic scripts
- In International Symposium on Indic Scripts: Past and Future
, 2003
Cited by 5 (0 self)
The Brahmi-derived Indic scripts occupy a special place in the study of writing systems. They are alphasyllabic scripts (Bright, 1996a) (though Daniels (1996) prefers the term abugida), meaning that they are basically segmental in that almost all segments are represented in
Corpus-based unit selection for natural-sounding speech synthesis
, 2003
Cited by 2 (0 self)
Speech synthesis is an automatic encoding process carried out by machine through which symbols conveying linguistic information are converted into an acoustic waveform. In the past decade or so, a recent trend toward a non-parametric, corpus-based approach has focused on using real human speech as source material for producing novel natural-sounding speech. This work proposes a communication-theoretic formulation in which unit selection is a noisy channel through which an input sequence of symbols passes and an output sequence, possibly corrupted due to the coverage limits of the corpus, emerges. The penalty of approximation is quantified by substitution and concatenation costs which grade what unit contexts are interchangeable and where concatenations are not perceivable. These costs are semi-automatically derived from data and are found to agree with acoustic-phonetic knowledge.

