Paralinguistic transformations (PTs) change the information about a
speaker's pitch, age and sex in a speech signal while leaving the
linguistic content invariant. PTs are spectrum transformations and
can be used to normalize speech, i.e. to compensate for variations in
the speech signal that arise from differently shaped speech organs across
speakers. The major such variations are caused by differing vocal
tract lengths, which are directly related to the speakers' pitch. The
corresponding PTs are therefore direct functions of the fundamental frequency F0.
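To make the idea of an F0-driven spectrum transformation concrete, the following is a minimal sketch of a linear frequency warp whose factor depends only on F0. It is not the project's method: the names `warp_factor` and `warp_spectrum`, the reference pitch of 150 Hz and the power-law exponent of 1/3 are all assumptions chosen purely for illustration.

```python
def warp_factor(f0_hz, f0_ref_hz=150.0, exponent=1.0 / 3.0):
    """Hypothetical F0-to-warp mapping: alpha = (f0 / f0_ref) ** exponent.

    The reference pitch and the exponent are illustrative assumptions,
    not values taken from the project.
    """
    if f0_hz <= 0 or f0_ref_hz <= 0:
        raise ValueError("F0 values must be positive")
    return (f0_hz / f0_ref_hz) ** exponent


def warp_spectrum(magnitudes, alpha):
    """Linearly warp a magnitude spectrum along the frequency axis.

    Output bin k takes the (linearly interpolated) input value at
    position k * alpha, so a peak at input bin p moves to about p / alpha.
    Positions beyond the last input bin reuse the last input value.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    n = len(magnitudes)
    warped = []
    for k in range(n):
        pos = k * alpha
        lo = int(pos)
        if lo >= n - 1:
            warped.append(magnitudes[-1])
            continue
        frac = pos - lo
        warped.append((1.0 - frac) * magnitudes[lo] + frac * magnitudes[lo + 1])
    return warped
```

Because the warp factor is computed from a single F0 estimate, such a scheme needs no enrollment data from the speaker: calling `warp_spectrum(frame, warp_factor(f0))` on each frame is enough once F0 is known.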
The focus of the project is on the following issues:
- exploration of various F0-based PTs known from acoustic
phonetics and the evaluation of their performance with a standard
speech recognition system. F0-based PTs allow immediate adaptation
to a new speaker, which is an advantage over commonly used
normalization schemes.
- development of a new feature extraction method that locates
the spectral peaks shaped by the formants using a pattern matching
approach. Spectral peaks are assumed to be the most robust
information-conveying acoustic features in a noisy environment.
The new feature extraction method will allow a more direct
evaluation of the PT-based normalization schemes.
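The peak-locating idea in the second item can be sketched as template matching on a magnitude spectrum. The version below slides a triangular peak pattern over the spectrum, scores each position by normalized cross-correlation, and keeps local score maxima above a threshold. The triangular template, the function names and the threshold of 0.8 are hypothetical choices for illustration; the source does not specify which pattern or matching criterion the project uses.

```python
import math


def triangular_template(half_width):
    """Hypothetical peak pattern: a symmetric triangle of length 2*half_width + 1."""
    return [1.0 - abs(i) / (half_width + 1)
            for i in range(-half_width, half_width + 1)]


def match_scores(spectrum, template):
    """Normalized cross-correlation of the template with every spectrum window."""
    m = len(template)
    t_mean = sum(template) / m
    t = [v - t_mean for v in template]
    t_norm = math.sqrt(sum(v * v for v in t))
    scores = []
    for start in range(len(spectrum) - m + 1):
        window = spectrum[start:start + m]
        w_mean = sum(window) / m
        w = [v - w_mean for v in window]
        w_norm = math.sqrt(sum(v * v for v in w))
        if t_norm == 0.0 or w_norm == 0.0:
            scores.append(0.0)  # flat window: no peak shape to match
        else:
            scores.append(sum(a * b for a, b in zip(t, w)) / (t_norm * w_norm))
    return scores


def locate_peaks(spectrum, half_width=3, threshold=0.8):
    """Return bin indices where the peak pattern matches best.

    A bin is reported when its score reaches the (assumed) threshold
    and is a local maximum among neighbouring window positions.
    """
    scores = match_scores(spectrum, triangular_template(half_width))
    peaks = []
    for i, s in enumerate(scores):
        left = scores[i - 1] if i > 0 else -1.0
        right = scores[i + 1] if i + 1 < len(scores) else -1.0
        if s >= threshold and s > left and s >= right:
            peaks.append(i + half_width)  # window start -> window centre
    return peaks
```

Since the score is normalized, it depends on the shape of the peak rather than on its amplitude, which is one plausible reason a pattern matching approach could remain usable when the overall spectral level is disturbed by noise.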
More information can be found in [Gla03].
Supported by:
This project was partly supported by NCCR IM2.
Last updated: Tue Jan 24 09:54:29 CET 2012
by: Beat Pfister