Results 1 - 2 of 2
ARTIMIS: Natural dialogue meets rational agency
- in Proceedings of IJCAI-97, 1997
Abstract - Cited by 39 (2 self)
We present an effective generic communicating rational agent, ARTIMIS, and its application to cooperative spoken dialogue. ARTIMIS' kernel is the implementation of a formal theory of interaction. This theory involves a set of generic axioms which models, in a homogeneous logical framework, principles of rational behaviour, communication, and cooperation. The theory is interpreted by a specifically designed reasoning engine. When applied to the context of natural dialogue, ARTIMIS includes specialised components for speech and natural language processing.
A hidden Markov-model-based trainable speech synthesizer
- 1999
Abstract - Cited by 19 (0 self)
This paper presents a new approach to speech synthesis in which a set of cross-word decision-tree state-clustered context-dependent hidden Markov models are used to define a set of subphone units to be used in a concatenation synthesizer. The models, trees, waveform segments and other parameters representing each clustered state are obtained completely automatically through training on a 1-hour single-speaker continuous-speech database. During synthesis the required utterance, specified as a string of words of known phonetic pronunciation, is generated as a sequence of these clustered states using a TD-PSOLA waveform concatenation synthesizer. The system produces speech which, though in a monotone, is both natural sounding and highly intelligible. A Modified Rhyme Test conducted to measure segmental intelligibility yielded a 5.0% error rate. The speech produced by the system mimics the voice of the speaker used to record the training database. The system can be retrained on...

