Intelligibility Enhancement Based on Mutual Information

Abstract: We consider speech intelligibility enhancement for multiple-microphone acquisition and single-loudspeaker rendering. The approach is based on the mutual information between the message spoken in the far-end environment and the message perceived by a listener at the near end. We prove that the jointly optimal processing can be decomposed into far-end and near-end processing. The former is a minimum variance distortionless response (MVDR) beamformer that reduces the noise in the talker environment; the latter is a post-filter that redistributes the power over the frequency bands. This disjoint processing is optimal provided that the post-filter is aware of the residual noise left by the beamformer. Our results show that both processing steps are necessary for the effective conveyance of a message. In addition, we study the use of mutual information applied to the perceptually more relevant powers per critical band.
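
The far-end step named in the abstract can be illustrated for a single frequency bin. The sketch below is not taken from JOINT_INT_ENH_MI.zip; it is a minimal textbook-style MVDR computation under the assumption that the noise covariance matrix and the steering vector are known. The names `solve`, `mvdr`, `R` and `d` are hypothetical and chosen for illustration only. It returns the beamformer weights together with the residual noise power, which is the quantity the abstract says the near-end post-filter has to be aware of.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting.

    Works for real or complex entries; A is a list of rows.
    """
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        # Pick the row with the largest pivot magnitude for stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def mvdr(R, d):
    """MVDR weights for one frequency bin (illustrative helper).

    R: Hermitian positive-definite noise covariance (assumed known).
    d: steering vector from the talker to the microphones (assumed known).
    Returns (w, residual_noise_power) with w = R^{-1} d / (d^H R^{-1} d).
    """
    u = solve(R, d)  # u = R^{-1} d
    denom = sum(di.conjugate() * ui for di, ui in zip(d, u)).real  # d^H R^{-1} d
    w = [ui / denom for ui in u]
    return w, 1.0 / denom


# Two-microphone toy example with made-up numbers.
R = [[2.0 + 0j, 0.5 + 0j],
     [0.5 + 0j, 1.0 + 0j]]
d = [1.0 + 0j, 1j]

w, residual = mvdr(R, d)

# Distortionless constraint: the beamformer output is w^H x, so w^H d should be 1.
gain = sum(wi.conjugate() * di for wi, di in zip(w, d))
```

In this toy setting `gain` evaluates to 1 up to rounding, and `residual` is lower than the noise power at either microphone alone. A post-filter that redistributes power over frequency bands would take one such `residual` value per band as an input; how the repository code does this is not shown here.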

Related publications

Intelligibility Enhancement Based on Mutual Information
S. Khademi; R. C. Hendriks; W. B. Kleijn
IEEE/ACM Trans. Audio, Speech, Language Process.,
Volume 25, Issue 8, pp. 1694-1708, August 2017. ISSN 2329-9290. DOI: 10.1109/TASLP.2017.2714424
Jointly optimal near-end and far-end multi-microphone speech intelligibility enhancement based on mutual information
S. Khademi; R. C. Hendriks; W. B. Kleijn
In Proc. IEEE Int. Conf. Acoustics, Speech, Signal Proc. (ICASSP),
Shanghai, China, March 2016.

Repository data

File:	JOINT_INT_ENH_MI.zip
Size:	1.5 MB
Modified:	18 August 2017
Type:	software
Authors:	Seyran Khademi, Richard Hendriks, Bastiaan Kleijn
Date:	March 2016
Contact:	Richard Hendriks