Functional Phonology -- Formalizing the interactions between articulatory and perceptual drives
, 1998
A Multichannel Articulatory Database and its Application for Automatic Speech Recognition
- In Proceedings 5th Seminar of Speech Production
, 2000
Cited by 21 (1 self)
The goal of this research is to improve the performance of a speaker-independent Automatic Speech Recognition (ASR) system by using directly measured articulatory parameters in the training phase. This paper examines the need for a multi-channel/multi-speaker articulatory database and describes the design of such a database and the processes involved in its creation.
Against formal phonology
- Language
, 2005
Cited by 16 (10 self)
Chomsky and Halle (1968) and many formal linguists rely on the notion of a universally available phonetic space defined in discrete time. This assumption plays a central role in phonological theory. Discreteness at the phonetic level guarantees the discreteness of all other levels of language. But decades of phonetics research demonstrate that there exists no universal inventory of phonetic objects. We discuss three kinds of evidence: first, phonologies differ incommensurably. Second, some phonetic characteristics of languages depend on intrinsically temporal patterns, and, third, some linguistic sound categories within a language are different from each other despite a high degree of overlap that precludes distinctness. Linguistics has mistakenly presumed that speech can always be spelled with letter-like tokens. A variety of implications of these conclusions for research in phonology are discussed. The generative paradigm of language description (Chomsky 1964, 1965, Chomsky & Halle 1968) has dominated linguistic thinking in the United States for many years. Its specific claims about the phonetic basis of linguistic analysis still provide the cornerstone of most linguistic research. Many criticisms have been raised against the phonetic claims of the Sound pattern of English (Chomsky & Halle 1968), some from early on
Polysp: a polysystemic, phonetically-rich approach to speech understanding
- Italian Journal of Linguistics - Rivista di Linguistica
, 2001
Lexicalized phonotactic word segmentation
- Proceedings of the Association for Computational Linguistics (ACL)
, 2008
Cited by 6 (0 self)
This paper presents a new unsupervised algorithm (WordEnds) for inferring word boundaries from transcribed adult conversations. Phone n-grams before and after observed pauses are used to bootstrap a simple discriminative model of boundary marking. This fast algorithm delivers high performance even on morphologically complex words in English and Arabic, and promising results on accurate phonetic transcriptions with extensive pronunciation variation. Expanding training data beyond the traditional miniature datasets pushes performance numbers well above those previously reported. This suggests that WordEnds is a viable model of child language acquisition and might be useful in speech understanding.
Towards Formal Structural Representation of Spoken Language: An Evolving Transformation System (ETS) Approach
, 2005
Cited by 5 (0 self)
Speech recognition has been a very active area of research over the past twenty years. Despite evident progress, it is generally agreed by practitioners in the field that the performance of current speech recognition systems is rather suboptimal and that new approaches are needed. The motivation behind the undertaken research is the observation that the notion of representation of objects and concepts, once considered central in the early days of pattern recognition, has been largely marginalised by the advent of statistical approaches. As a consequence of a predominantly statistical approach to the speech recognition problem, and because of the numeric, feature-vector-based nature of the representation, the classes inductively discovered from real data using decision-theoretic techniques have little meaning outside the statistical framework. This is because decision surfaces or probability distributions are difficult to analyse linguistically. Because of the latter limitation, it is doubtful that the gap between speech recognition and linguistic research can be bridged by numeric representations. This thesis investigates an alternative, structural, approach to spoken language representation and categorisa-
Exploring Syllable Structure in Connectionist Networks
- In Proceedings from the 14th International Congress of Phonetic Sciences (ICPhS-99)
, 1999
Cited by 1 (0 self)
This work explores the typological observation that syllables are asymmetric in their treatment of onsets and codas; many languages permit only onsets, few require codas, and none prohibit onsets. It is theorized that phonetic and computational factors are responsible for these types of syllabic structures, and that optimal syllables are those that enhance production and perception. This theory is investigated using connectionist models of word recognition and production. Results indicate that certain consonant classes are more easily perceived in CV syllables than in VC syllables, and that this will lead to statistical preferences for these types of phoneme sequences within and across languages. These results are contrasted with the more traditional view which posits innate symbolic mechanisms to account for phonological patterns. 1. INTRODUCTION Formal syllable theory holds that words are not merely sequences of phonemes, but are instead composed of hierarchically organized units s...
A Diphone-Based Text-to-Speech System for Scottish Gaelic
, 1997
Cited by 1 (1 self)
In this thesis, a diphone-based text-to-speech system for Scottish Gaelic, a language spoken by about 80,000 native speakers in Scotland and Canada, is presented. Text-to-speech systems convert orthographic text input into speech output. The present system consists of two main parts: an automatic phonetic transcription module, which produces an orthophonic transcription of the orthographic input text, and a speech synthesis module, which synthesizes an utterance from its transcription by concatenating and modifying previously recorded speech units. Diphones, speech units that cover two sounds and the transition between them, form the basis of the synthesis module. Duration and intonation are modelled on the basis of simple heuristics. The diphone inventory was designed for the Gaelic of Bayble, Lewis. Scottish Gaelic distinguishes four main phonetic settings: velarised, palatalised, nasalised, and neutral. As the domain of these settings is the syllable, they are difficult t...

