The volume addresses issues concerning prosody generation in speech
synthesis, including prosody modeling, the conveyance of para- and
non-linguistic information, and prosody control (including prosody
conversion). A high level of
quality has already been achieved in speech synthesis by using
selection-based methods with segments of human speech. Although this
approach can produce synthetic speech with various voice qualities and
speaking styles, it requires large speech corpora with the targeted
quality and style.
Accordingly, speech conversion techniques are now of growing interest
among researchers. HMM/GMM-based methods are widely used, but they pose
several major problems from the perspective of prosody: prosodic
features span longer time intervals than segmental features, so
frame-by-frame processing is not always appropriate for them. The book
offers a good overview of state-of-the-art studies on prosody in speech
synthesis.