## Time-domain and frequency-domain techniques for prosodic modification of speech (1995)

Venue: | Elsevier Science B.V |

Citations: | 12 - 1 self |

### BibTeX

@INPROCEEDINGS{Moulines95time-domainand,

author = {E. Moulines},

title = {Time-domain and frequency-domain techniques for prosodic modification of speech},

booktitle = {Elsevier Science B.V},

year = {1995},

pages = {519--555},

publisher = {Elsevier}

}

### Citations

273 |
Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
- Moulines, Charpentier
- 1990
Citation Context ... noticeable in unvoiced portions of the signal. A possible improvement could be obtained if the segments that are used more than once in the output signal are time-reversed whenever they are repeated [13]. This helps to break up the repetitive structure and corresponds to sign-inverting the phase, which is allowed in unvoiced speech but cannot be used for voiced segments. 5. Pitch-scaling transformat... |
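The time-reversal idea in the context above can be illustrated with a minimal sketch. It is not taken from the cited paper: the function name `assemble_frames`, its parameters, and the choice to alternate orientation on each reuse (rather than reversing every repeat) are assumptions made for illustration only.

```python
from collections import Counter


def assemble_frames(segments, selection, unvoiced):
    """Build output frames from reused input segments.

    segments  : list of sample lists (hypothetical input segments)
    selection : indices into `segments`, one per output frame
    unvoiced  : set of segment indices considered unvoiced
    """
    uses = Counter()
    frames = []
    for idx in selection:
        frame = list(segments[idx])
        # Assumption: flip orientation on every other reuse, so that
        # consecutive copies of an unvoiced segment are never identical.
        # Voiced segments are always copied unchanged.
        if idx in unvoiced and uses[idx] % 2 == 1:
            frame.reverse()
        uses[idx] += 1
        frames.append(frame)
    return frames


segments = [[1, 2, 3], [4, 5, 6]]
frames = assemble_frames(segments, selection=[0, 0, 1, 1, 0], unvoiced={0})
```

Here segment 0 is treated as unvoiced, so its second use is reversed, while segment 1 (voiced) is repeated unchanged.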

72 |
High quality time-scale modification for speech
- Roucos, Wilgus
- 1985
Citation Context ...iterative procedure that slowly converges to a local optimum, it becomes important that a good initial estimate $Y^{0}(t_s(u), \omega)$ or, equivalently, a good choice for $y^{1}(n)$ can be proposed. Roucos and Wilgus [9] experimentally studied the convergence of OLA time-scaling and found that for initial estimates like Gaussian white noise 100 iterations were typically required to obtain high quality results. In th... |

70 |
An overlap-add technique based on waveform similarity (WSOLA) for high quality time-scale modification of speech
- Verhelst, Roelands
- 1993
Citation Context ... $+ D^{-1}(uR) - uR + \Delta_u)$, (4.6) where $\Delta_u$ are chosen such as to ensure sufficient signal continuity at waveform segment boundaries according to some criterion. WSOLA [10] proposes a synchronization strategy inspired by a time-scaling criterion. 4.3.2. A waveform similarity criterion for time-scaling We considered that a time-scaled version of an original signal should... |

58 |
Shape Invariant Time-Scale and Pitch Modification of Speech
- Quatieri, McAulay
- 1992
Citation Context ..., from both theoretical and practical points of view. In particular, it maintains the temporal structure of the original waveform during voicing: TD-PSOLA is shape invariant in the sense defined in [19]; it is worth noting in this respect that the shape-invariant time-scale and pitch-scale transformations based on the sinusoidal representation proposed in this chapter make use of pitch pulse onse... |

50 |
Non parametric techniques for pitch-scale and time-scale modification of speech
- Moulines, Laroche
- 1995
Citation Context ...atory time-scale modification is applied in a fourth and final step, to restore the original signal duration. The first non-parametric method to modify the pitch scale was proposed by [14] (see also [15, 16] for more recent references). It was derived from the phase vocoder and follows exactly the scheme outlined above. Around each time instant $t_a(u) = uR$ (where R is a small portion of the window lengt... |

41 |
Diphone synthesis using an overlap-add technique for speech waveforms concatenation
- Charpentier, Stella
- 1986
Citation Context ...been uttered by another speaker' 5.5. FD-PSOLA Historically, the FD-PSOLA technique was the first pitch-synchronous time-scale and pitch-scale modification technique proposed in the literature (see [20]). At that time, FD-PSOLA was primarily thought of as a pitch-synchronous implementation of the speech modification technique proposed in [14]. It was remarked that the pitch synchronicity of the proc... |

36 | A weighted overlap-add method of short-time Fourier analysis/synthesis - Crochiere - 1980 |

33 |
Multirate digital signal processing
- Crochiere, Rabiner
- 1983
Citation Context ...n resampling plays a key role in many pitch modification algorithms (time-domain PSOLA is a notable exception). For clarity, we describe in this appendix the basic concepts. More details can be found in [4]. The time-domain resampling method described below applies for constant and rational sampling-rate conversion factors $\alpha = D/U$. This resampling method consists of 1)... |

27 |
Short-time Fourier analysis of sampled speech
- Portnoff
- 1981
Citation Context ...itch-scaling 2.1. A simple model for voiced speech In discussing the problems of time-scaling and pitch-scaling, we will find it helpful to refer once in a while to a specific model for voiced speech ([1]). It should be stressed, however, that this model will only serve explanatory purposes and is not actually used in the methods presented in this chapter. For methods that do explicitly rely on such ... |

23 | Short-time Fourier transform - Nawab, Quatieri - 1988 |

15 |
A Frobenius norm approach to glottal closure detection from the speech signal
- Ma, Kamp, et al.
- 1994
Citation Context ...espect to the linear relationship is small in the closed glottis region and large at the instants of glottal closure. Several works in this direction have been developed recently (see, for example, [17] and the references therein). Another approach consists in using the pointwise Teager operator associated with some kind of band-pass filtering (see [18]). In either case, the use of a pitch detector ... |

11 |
System to independently modify excitation and/or spectrum of speech waveform without explicit pitch extraction
- Seneff
Citation Context ...mation: compensatory time-scale modification is applied in a fourth and final step, to restore the original signal duration. The first non-parametric method to modify the pitch scale was proposed by [14] (see also [15, 16] for more recent references). It was derived from the phase vocoder and follows exactly the scheme outlined above. Around each time instant $t_a(u) = uR$ (where R is a small portion ... |

6 |
Energy Onset Times for Speaker Identification
- Quatieri, Jankowski, et al.
- 1994
Citation Context ...e been developed recently (see, for example, [17] and the references therein). Another approach consists in using the pointwise Teager operator associated with some kind of band-pass filtering (see [18]). In either case, the use of a pitch detector as a post-processor is mandatory; it helps to avoid incorrect labeling by maintaining the coherence between the successive detection of the pitch onsets a... |

2 | Application of the short-time Fourier transform to speech processing and spectral analysis - Allen - 1982 |

2 |
Autocorrelation method for high quality pitch/time scaling
- Laroche
- 1993
Citation Context ...ures related to a speech-production model are made and modified in this time-scaling). In fact, some useful results have even been shown in experiments on WSOLA time-scaling for digital audio signals [11, 12]. As synchronized OLA algorithms are easily interpreted as automatic waveform editing methods, we can also easily see that they do have limitations to their capabilities. For one, the structure of sign... |

1 |
Acoustical quanta and the theory of hearing
- Gabor
Citation Context ...ve expectation related to hearing a time-scaled version of an original acoustic signal. This is because it is our most elementary experience that sound has a time pattern as well as a frequency pattern [2] and that these patterns are relatively independent as they are related to the rhythm and the melody, respectively. We therefore require a different type of time-scaling (one that does not affect the ... |

1 |
Discrete-Time Processing of Speech Signals
- Deller, Proakis, et al.
- 1993
Citation Context ...ficient implementations are available, based on the FFT algorithm and simple modifications of overlap-and-add synthesis methods [3-6] (a good comprehensive treatment of this concept can be found in [7]). The basic idea is to use a windowing function w(n) to restrict the analysis to short segments of x(n) around the analysis time instants in such a way that x(n) can be considered to have fixed char... |

1 |
Signal estimation from modified short-time Fourier transform
- Griffin, Lim
- 1984
Citation Context ...a is required which leads to the correct result if $Y(t_s(u), \omega)$ is a STFT and to a reasonable result otherwise. One such synthesis method uses overlap-addition (OLA). As introduced by Griffin and Lim [8], this procedure consists of seeking the synthetic signal $y(n)$ whose short-time Fourier transform (around time instants $t_s(u)$) $Y(t_s(u), \omega) = \sum_m w(m)\, y(t_s(u) + m) \exp(-j\omega m)$ best fits the modified... |
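The overlap-add synthesis mentioned in the context above can be sketched as follows. The truncated snippet does not state the resulting formula; the sketch assumes the commonly cited least-squares solution, in which each modified frame is weighted by the window, added at its synthesis instant, and the sum is normalized by the accumulated squared window. The function name `ola_synthesis`, the causal window indexing (m = 0 .. L-1), and the use of time-domain frames (already inverse-transformed) are illustrative assumptions.

```python
def ola_synthesis(frames, positions, window, length):
    """Least-squares overlap-add of time-domain frames.

    frames    : list of sample lists, one per synthesis instant
    positions : start index t_s(u) of each frame in the output
    window    : analysis window w(m), indexed m = 0 .. len(window) - 1
    length    : number of output samples
    """
    num = [0.0] * length   # accumulates w(m) * frame(m)
    den = [0.0] * length   # accumulates w(m)^2
    for frame, start in zip(frames, positions):
        for m, (w, v) in enumerate(zip(window, frame)):
            n = start + m
            if 0 <= n < length:
                num[n] += w * v
                den[n] += w * w
    # Samples not covered by any window are set to zero.
    return [a / b if b > 1e-12 else 0.0 for a, b in zip(num, den)]


# Consistency check: frames cut from a real signal are resynthesized exactly.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
window = [0.5, 1.0, 0.5]
positions = [0, 1, 2, 3]
frames = [[w * x[t + m] for m, w in enumerate(window)] for t in positions]
y = ola_synthesis(frames, positions, window, len(x))
```

When the frames really are windowed pieces of one signal, as in this usage example, the normalization cancels the window and the original samples are recovered; with modified frames the same formula gives a compromise signal instead.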

1 |
On the application of automatic waveform editing for time warping digital and analog recordings
- Spleesters, Verhelst, et al.
- 1994
Citation Context ...ures related to a speech-production model are made and modified in this time-scaling). In fact, some useful results have even been shown in experiments on WSOLA time-scaling for digital audio signals [11, 12]. As synchronized OLA algorithms are easily interpreted as automatic waveform editing methods, we can also easily see that they do have limitations to their capabilities. For one, the structure of sign... |

1 | Traitement de la parole par analyse-synthèse de Fourier. Application à la synthèse par diphones - Charpentier - 1988 |