Cepstral Estimation of Speech Signals
Homomorphic deconvolution is a common method in speech analysis and synthesis, geology and other applications,
e.g., for pitch extraction, for blind deconvolution of reverberation signals from microphones, for seismic
deconvolution, and in vibratory diagnosis of gear systems. Cepstrum analysis is a well-known method for performing
homomorphic deconvolution, where the cepstrum is calculated as the inverse Fourier transform of the logarithm
of the spectrum. In many applications the convolved signals are separated by liftering. This windowing
technique relies on the fact that the cepstra of the two signals occupy different time intervals. One of the
signals is assumed to be a periodic impulse train, whose cepstrum is also a periodic impulse train with
the same period as the original signal. The other signal should have a smooth spectrum, which
the logarithm and inverse Fourier transform convert into a cepstrum of short duration,
ideally ending before the first impulse of the periodic signal occurs, so that the two do not occupy the same
time interval. Liftering then separates the two signals, using a highpass lifter to
extract the periodic impulse train and a lowpass lifter to extract the signal of short duration. However,
when the period is short or the other signal has large spectral dynamics, the cepstra of
the two signals overlap severely. The errors caused by bias and variance might then be large, and
algorithms based on robust spectrum analysis techniques could improve performance.
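
The liftering procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in the projects: the frame is synthetic, the pulse period (16 samples), the pulse decay (0.9 per period, chosen so that the log spectrum stays finite), the short impulse response and the lifter cutoff (8 samples) are all hypothetical values, and the function names are introduced here for illustration only. A naive DFT is used to keep the sketch free of dependencies.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(x):
    """Inverse DFT of the log magnitude spectrum."""
    log_mag = [math.log(abs(v)) for v in dft(x)]
    return [v.real for v in dft(log_mag, inverse=True)]


def lifter(cep, cutoff, keep):
    """Keep quefrencies below (keep='low') or at/above (keep='high') the cutoff."""
    n = len(cep)
    out = []
    for i, c in enumerate(cep):
        quefrency = min(i, n - i)  # the real cepstrum is symmetric
        is_low = quefrency < cutoff
        out.append(c if is_low == (keep == "low") else 0.0)
    return out


def estimate_period(cep, cutoff):
    """Index of the largest cepstral value between the cutoff and N/2."""
    half = len(cep) // 2
    return max(range(cutoff, half + 1), key=lambda i: cep[i])


# Synthetic frame (assumed values): pulses every 16 samples, slightly decaying,
# circularly convolved with a short, smooth impulse response.
N, PERIOD, CUTOFF = 128, 16, 8
excitation = [0.9 ** (i // PERIOD) if i % PERIOD == 0 else 0.0 for i in range(N)]
response = [0.5 ** i for i in range(8)]
frame = [sum(response[m] * excitation[(i - m) % N] for m in range(len(response)))
         for i in range(N)]

cep = real_cepstrum(frame)
low_part = lifter(cep, CUTOFF, "low")    # short-duration (smooth spectrum) part
high_part = lifter(cep, CUTOFF, "high")  # periodic impulse part
period_estimate = estimate_period(high_part, CUTOFF)
print("estimated period:", period_estimate)
```

In this idealized frame the two cepstra barely overlap, so the highpass lifter isolates the impulse train cleanly. Shortening the period towards the cutoff, or lengthening the impulse response, reproduces the overlap problem noted above.
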
A collaboration with Dr. Elisabeth Zetterholm at the Department of Linguistics initiated several master's projects
in which multitaper time-frequency analysis and cepstrum techniques were applied to speaker verification.
The aim of these studies was to find the characteristic properties of the speaker even in cases when the voice
is deliberately changed. The data recordings were made by Elisabeth Zetterholm and the professional impersonator
Anders Mårtensson. This has also initiated the recent work on optimal cepstrum analysis of stochastic
processes. We have applied the PM MW technique to speaker verification. This is the initial work in a co-operation
between us and the Department of Computer Science and Statistics, University of Eastern Finland, Joensuu. In that work,
the PM MW was shown to give a somewhat better result than the usual Hamming window for speaker verification. Several
new ideas on how to design even more appropriate multiple windows for cepstrum analysis are under investigation, e.g.,
mean squared error optimal multiple windows and optimal smoothing.
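
The general idea of a multiple-window cepstrum, compared with a single Hamming window, can be sketched as follows. This is an illustration only: it does not implement the PM MW design, and simple sine tapers are swapped in as a stand-in set of orthonormal windows. The frame length, the number of tapers, the white-noise test signal and all function names are assumptions introduced here.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def sine_tapers(n, count):
    """Orthonormal sine tapers (a stand-in here; this is not the PM MW design)."""
    scale = math.sqrt(2.0 / (n + 1))
    return [[scale * math.sin(math.pi * (k + 1) * (t + 1) / (n + 1)) for t in range(n)]
            for k in range(count)]


def hamming_window(n):
    """Hamming window scaled to unit energy, for a like-for-like comparison."""
    w = [0.54 - 0.46 * math.cos(2.0 * math.pi * t / (n - 1)) for t in range(n)]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]


def log_spectrum(x, windows):
    """Log of the average of the windowed periodograms."""
    n = len(x)
    avg = [0.0] * n
    for w in windows:
        spec = dft([a * b for a, b in zip(x, w)])
        for k in range(n):
            avg[k] += abs(spec[k]) ** 2 / len(windows)
    return [math.log(max(v, 1e-300)) for v in avg]  # floor guards against log(0)


def cepstrum(x, windows):
    """Inverse DFT of the log power spectrum estimate."""
    return [v.real for v in dft(log_spectrum(x, windows), inverse=True)]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


random.seed(0)
N, K = 64, 4  # assumed frame length and number of tapers
noise = [random.gauss(0.0, 1.0) for _ in range(N)]
tapers = sine_tapers(N, K)
single = log_spectrum(noise, [hamming_window(N)])
multi = log_spectrum(noise, tapers)
cep_multi = cepstrum(noise, tapers)
print("log-spectrum variance, Hamming:", variance(single[1:N // 2]))
print("log-spectrum variance, sine tapers:", variance(multi[1:N // 2]))
```

For white noise the true log spectrum is flat, so the spread of the estimate across frequency reflects estimator variance. Averaging several windowed periodograms before taking the logarithm is expected to reduce that spread, at the cost of frequency resolution, which is the trade-off that window design for cepstrum analysis has to balance.
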