Near-End Speech Enhancement
A speech pre-processing algorithm is presented that improves speech intelligibility in noise for the near-end listener. It does so by optimally redistributing the speech energy over time and frequency according to a perceptual distortion measure, which is based on a spectro-temporal auditory model.
Because this auditory model takes short-time information into account, transients receive more amplification than stationary vowels, which has been shown to benefit the intelligibility of speech in noise.
The proposed method is compared to unprocessed speech and two reference methods using an intelligibility listening test. Results show that the proposed method leads to significant intelligibility gains while still preserving quality.
The attached file contains MATLAB code that implements the algorithm.
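As a rough illustration of what redistributing speech energy over time and frequency under a fixed energy budget can look like, the following Python sketch may help. It is not the attached code, and it does not implement the perceptual distortion measure or its optimal solution. The function name `redistribute_energy`, the `exponent` parameter, and the simple noise-to-speech weighting rule are all assumptions made purely for illustration.

```python
import numpy as np


def redistribute_energy(speech_energy, noise_energy, exponent=0.5, eps=1e-12):
    """Toy redistribution of speech energy over time-frequency units.

    Both inputs are arrays of shape (bands, frames) holding non-negative
    energies per time-frequency unit. Returns power gains of the same shape,
    scaled so that the total speech energy is unchanged.

    NOTE: the weighting rule below is a hypothetical heuristic, not the
    perceptual distortion measure described in the text.
    """
    speech_energy = np.asarray(speech_energy, dtype=float)
    noise_energy = np.asarray(noise_energy, dtype=float)

    # Heuristic weight: favour units where noise dominates the speech.
    weights = ((noise_energy + eps) / (speech_energy + eps)) ** exponent

    # Rescale so that sum(gains * speech_energy) equals sum(speech_energy).
    weighted_total = (weights * speech_energy).sum()
    scale = speech_energy.sum() / max(weighted_total, eps)
    return weights * scale


if __name__ == "__main__":
    # Two bands, two frames: strong (vowel-like) and weak (transient-like) units.
    speech = np.array([[4.0, 1.0], [1.0, 4.0]])
    noise = np.ones_like(speech)
    gains = redistribute_energy(speech, noise)
    print("gains:\n", gains)
    print("energy before:", speech.sum(), "after:", (gains * speech).sum())
```

In this toy setting the weaker units receive the larger gains while the total speech energy stays the same, which mirrors the behaviour described above only qualitatively.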
- Speech Energy Redistribution for Intelligibility Improvement in Noise Based on a Perceptual Distortion Measure
C. H. Taal; R. C. Hendriks; R. Heusdens
Computer Speech and Language,
Volume 2013, 2013.
- A Speech Preprocessing Strategy for Intelligibility Improvement in Noise Based on a Perceptual Distortion Measure
Cees H. Taal; Richard C. Hendriks; Richard Heusdens
In Proc. IEEE Int. Conf. Acoustics, Speech, Signal Proc. (ICASSP),
pp. 4061-4064, May 2012.