Pitch (i.e., fundamental frequency F0 and fundamental period T0)
occupies a key position in the acoustic speech signal. The prosodic
information of an utterance is predominantly determined by this
parameter. The ear is more sensitive to changes of fundamental frequency
than to changes of other speech signal parameters by an order of
magnitude. The quality of vocoded speech is essentially influenced by
the quality and faultlessness of the pitch measurement. Hence the
importance of this parameter necessitates using good and reliable
measurement methods. At first glance the task looks simple: one just has
to detect the fundamental frequency or period of a quasi-periodic
signal. For a number of reasons, however, the task of pitch
determination has to be counted among the most difficult problems in
speech analysis. 1) In principle, speech is a nonstationary process; the
momentary position of the vocal tract may change abruptly at any time.
This leads to drastic variations in the temporal structure of the
signal, even between subsequent pitch periods, and assuming a
quasi-periodic signal is often far from realistic. 2) Due to the
flexibility of the human vocal tract and the wide variety of voices,
there exist a multitude of possible temporal structures. Narrow-band
formants at low harmonics (especially at the second or third harmonic)
are an additional source of difficulty. 3) For an arbitrary speech
signal uttered by an unknown speaker, the fundamental frequency can vary
over a range of almost four octaves (50 to 800 Hz).
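As a rough illustration of the range quoted in point 3, the following minimal sketch works out the octave span and the corresponding fundamental periods, assuming only the relation T0 = 1/F0 and taking the endpoints 50 Hz and 800 Hz at face value. The function and constant names are hypothetical and chosen for this example only.

```python
import math

# Endpoints of the F0 range quoted in the text (in Hz).
F0_MIN_HZ = 50.0
F0_MAX_HZ = 800.0


def octave_span(f_low_hz: float, f_high_hz: float) -> float:
    """Number of octaves between two frequencies (one octave = factor of 2)."""
    return math.log2(f_high_hz / f_low_hz)


def period_ms(f0_hz: float) -> float:
    """Fundamental period T0 in milliseconds for a given F0 in Hz (T0 = 1/F0)."""
    return 1000.0 / f0_hz


span_octaves = octave_span(F0_MIN_HZ, F0_MAX_HZ)
longest_period_ms = period_ms(F0_MIN_HZ)    # lowest F0 gives the longest period
shortest_period_ms = period_ms(F0_MAX_HZ)   # highest F0 gives the shortest period

print(f"span: {span_octaves:.2f} octaves")
print(f"T0 range: {shortest_period_ms:.2f} ms to {longest_period_ms:.2f} ms")
```

Under these assumptions the period to be measured varies by a factor of 16 (1.25 ms to 20 ms), so a single fixed analysis window cannot suit both ends of the range.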