Results 1 - 10 of 136

Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm

by Junichi Yamagishi, Takao Kobayashi, Yuji Nakano, Katsumi Ogata, Juri Isogai - IEEE Trans. Audio Speech Lang. Process., 2009
Cited by 87 (28 self)
Abstract—In this paper, we analyze the effects of several factors and configuration choices encountered during training and model construction when we want to obtain better and more stable adaptation in HMM-based speech synthesis. We then propose a new adaptation algorithm called constrained structural maximum a posteriori linear regression (CSMAPLR) whose derivation is based on the knowledge obtained in this analysis and on the results of comparing several conventional adaptation algorithms. Here, we investigate six major aspects of the speaker adaptation: initial models; the amount of the training data for the initial models; the transform functions, estimation criteria, and sensitivity of several linear regression adaptation algorithms; and combination algorithms. Analyzing the effect of the initial model, we compare speaker-dependent models, gender-independent models, and the simultaneous use of the gender-dependent models to single use of the gender-dependent models. Analyzing the effect of the transform functions, we compare the transform function for only mean vectors with that for mean vectors and covariance matrices. Analyzing the effect of the estimation criteria, we compare the ML criterion with a robust estimation criterion called structural MAP. We evaluate the sensitivity of several thresholds for the piecewise linear regression algorithms and take up methods combining MAP adaptation with the linear regression algorithms. We incorporate these adaptation algorithms into our speech synthesis system and present several subjective and objective evaluation results showing the utility and effectiveness of these algorithms in speaker adaptation for HMM-based speech synthesis. Index Terms—Average voice, hidden Markov model (HMM)-based speech synthesis, speaker adaptation, speech synthesis, voice conversion.

unknown title

by Simon King, Vasilis Karaiskos (School of Informatics)
again organised by the University of Edinburgh with assistance from the other members of the Blizzard Challenge committee – Prof. Keiichi Tokuda and Prof. Alan Black. Two English corpora were used: the ‘rjs’ corpus provided by Phonetic Arts, and the ‘roger’ corpus from the University of Edinburgh

Parsing Techniques: A Practical Guide

by Dick Grune, Ceriel J. Jacobs, 1990
Cited by 62 (0 self)
1.1 Parsing as a craft; 1.2 The approach used; 1.3 Outline of the contents

Robust speaker-adaptive HMM-based text-to-speech synthesis

by Junichi Yamagishi, Takashi Nose, Heiga Zen, Zhen-hua Ling, Tomoki Toda, Keiichi Tokuda, Simon King, Steve Renals - IEEE Trans. on Audio, Speech and Language Processing, 2009
Cited by 58 (18 self)
Abstract—This paper describes a speaker-adaptive HMM-based speech synthesis system. The new system, called “HTS-2007,” employs speaker adaptation (CSMAPLR+MAP), feature-space adaptive training, mixed-gender modeling, and full-covariance modeling using CSMAPLR transforms, in addition to several other techniques that have proved effective in our previous systems. Subjective evaluation results show that the new system generates significantly better quality synthetic speech than speaker-dependent approaches with realistic amounts of speech data, and that it bears comparison with speaker-dependent approaches even when large amounts of speech data are available. In addition, a comparison study with several speech synthesis techniques shows the new system is very robust: it is able to build voices from less-than-ideal speech data and synthesize good-quality speech even for out-of-domain sentences. Index Terms—Average voice, HMM-based speech synthesis, HMM Speech Synthesis System, HTS, speaker adaptation, speech synthesis, voice conversion.

Session II Results from Our Collaboration Model

by Michiaki Yasumura [chair], Tetsuya Onoda (Ph.D. student), Phillip Codognet, David Farber
Michiaki Yasumura [Chair] The title of the session was “Results from Our Collaboration Model”, and it was quite unique since it consisted of research results from different parts of the three research layers. Each talk contained many speakers, coming from different laboratories, which is quite unique in Japan. The listeners may have understood how much we have collaborated. The first speakers were Hideaki Ogawa, Taizo Zushi and Junichi Yura. Their talk was on two projects, the Color project and

(Diptera: Cecidomyiidae: Asphondyliini) Inducing Leaf Galls on Illicium anisatum (Illiciaceae) in Japan

by Makoto Tokuda, 2004
Abstract. A new genus, Illiciomyia Tokuda, is erected for Illiciomyia yukawai sp. n. that induces leaf galls on Illicium anisatum (Illiciaceae) in Japan. The new genus belongs to the subtribe Asphondyliina (Diptera: Cecidomyiidae: Asphondyliini) and can be distinguished from other genera by the lack ...

Tracker

by Junichi Tokuda, PhD
3D Slicer’s data I/O in OR
  • Import images from MRI/CT/Ultrasound
  • Import tool tracking data
  • Send commands to robotic devices

The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system

by Junichi Yamagishi, Heiga Zen, Yi-jian Wu, Tomoki Toda, Keiichi Tokuda - in Proc. Blizzard Challenge 2008, 2008
Cited by 23 (6 self)
For the 2008 Blizzard Challenge, we used the same speaker-adaptive approach to HMM-based speech synthesis that was used in the HTS entry to the 2007 challenge, but an improved system was built in which the multi-accented English average voice model was trained on 41 hours of speech data with high-order mel-cepstral analysis using an efficient forward-backward algorithm for the HSMM. The listener evaluation scores for the synthetic speech generated from this system were much better than in 2007: the system had the equal best naturalness on the small English data set and the equal best intelligibility on both small and large data sets for English, and had the equal best naturalness on the Mandarin data. In fact, the English system was found to be as intelligible as human speech. Index Terms: speech synthesis, HMM, HTS, speaker adaptation

Thousands of voices for HMM-based speech synthesis

by Junichi Yamagishi, Bela Usabaev, Simon King, Oliver Watts, John Dines, Jilei Tian, Rile Hu, Keiichiro Oura, Keiichi Tokuda, Reima Karhila, Mikko Kurimo - in Proc. Interspeech 2009, 2009
Cited by 20 (8 self)
Our recent experiments with HMM-based speech synthesis systems have demonstrated that speaker-adaptive HMM-based speech synthesis (which uses an ‘average voice model’ plus model adaptation) is robust to non-ideal speech data that are recorded under various conditions and with varying microphones, that are not perfectly clean, and/or that lack phonetic balance. This enables us to consider building high-quality voices on ‘non-TTS’ corpora such as ASR corpora. Since ASR corpora generally include a large number of speakers, this leads to the possibility of producing an enormous number of voices automatically. In this paper we show thousands of voices for HMM-based speech synthesis that we have made from several popular ASR corpora such as the Wall Street Journal databases (WSJ0/WSJ1/WSJCAM0), Resource Management, Globalphone and Speecon. We report some perceptual evaluation results and outline the outstanding issues. Index Terms: speech synthesis, HMMs, speaker adaptation

Projection Profile Matching for Intraoperative MRI Registration embedded in MR imaging sequence

by Nobuhiko Hata, Junichi Tokuda, Shigeo Morikawa, Takeyoshi Dohi
Abstract. Fast image registration for magnetic resonance image (MRI)-guided surgery using projection profile matching embedded in the MR pulse sequence is proposed. The method can perform two-dimensional image registration by matching projection profiles acquired with zero-degree phase encoding. The matching process continuously measures displacement by optimizing the cross-correlation value from profiles acquired 64 times by a special pulse excitation and echo acquisition in one imaging cycle. A phantom experiment showed that the method can perform the registration in 25 ms with an accuracy of 0.50 mm over a 100 mm field of view. The paper also includes an in-vivo experiment registering MR images of an arm in motion. Unlike previously reported image registration by post-processing, the method is suitable for intraoperative settings, where fast registration is in great need.