A. Narayanan and D. Wang, Investigation of speech separation as a front-end for noise robust speech recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, issue.4, pp.826-835, 2014.

H. Kayser, C. Spille, D. Marquardt, and B. T. Meyer, Improving automatic speech recognition in spatially-aware hearing aids, Sixteenth Annual Conference of the International Speech Communication Association, 2015.

J. Chen and D. Wang, DNN based mask estimation for supervised speech separation, Audio Source Separation, pp.207-235, 2018.

M. Ahmadi, V. L. Gross, and D. G. Sinex, Perceptual learning for speech in noise after application of binary time-frequency masks, The Journal of the Acoustical Society of America, vol.133, issue.3, pp.1687-1692, 2013.

D. Wang, U. Kjems, M. S. Pedersen, J. B. Boldt, and T. Lunner, Speech intelligibility in background noise with ideal binary time-frequency masking, The Journal of the Acoustical Society of America, vol.125, issue.4, pp.2336-2347, 2009.

H. Erdogan, J. R. Hershey, S. Watanabe, and J. Le Roux, Deep recurrent networks for separation and recognition of single-channel speech in nonstationary background audio, New Era for Robust Speech Recognition, pp.165-186, 2017.

A. Narayanan and D. Wang, Ideal ratio mask estimation using deep neural networks for robust speech recognition, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.7092-7096, 2013.

N. Ito, S. Araki, and T. Nakatani, Permutation-free convolutive blind source separation via full-band clustering based on frequency-independent source presence priors, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 2013.

N. Ito, S. Araki, T. Yoshioka, and T. Nakatani, Relaxed disjointness based clustering for joint blind source separation and dereverberation, Acoustic Signal Enhancement (IWAENC), pp.268-272, 2014.

D. H. Vu and R. Haeb-Umbach, Blind speech separation employing directional statistics in an expectation maximization framework, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.241-244, 2010.

J. Heymann, L. Drude, and R. Haeb-Umbach, Neural network based spectral mask estimation for acoustic beamforming, Acoustics, Speech and Signal Processing, pp.196-200, 2016.

J. Chen and D. Wang, Long short-term memory for speaker generalization in supervised speech separation, The Journal of the Acoustical Society of America, vol.141, issue.6, pp.4705-4714, 2017.

D. Websdale and B. Milner, A comparison of perceptually motivated loss functions for binary mask estimation in speech separation, 18th Annual Conference of the International Speech Communication Association, pp.2003-2007, 2017.

Q. Liu, W. Wang, and P. Jackson, Audio-visual convolutive blind source separation, Sensor Signal Processing for Defence (SSPD 2010), pp.1-5, 2010.

J. Chen, Y. Wang, S. E. Yoho, D. Wang, and E. W. Healy, Large-scale training to increase speech intelligibility for hearing-impaired listeners in novel noises, The Journal of the Acoustical Society of America, vol.139, issue.5, pp.2604-2612, 2016.

M. Gogate, A. Adeel, and A. Hussain, A novel brain-inspired compression-based optimised multimodal fusion for emotion recognition, 2017 IEEE Symposium Series on Computational Intelligence (SSCI), pp.1-7, 2017.

M. Cooke, J. Barker, S. Cunningham, and X. Shao, An audiovisual corpus for speech perception and automatic speech recognition, The Journal of the Acoustical Society of America, vol.120, issue.5, pp.2421-2424, 2006.

J. Barker, R. Marxer, E. Vincent, and S. Watanabe, The third CHiME speech separation and recognition challenge: Dataset, task and baselines, Automatic Speech Recognition and Understanding (ASRU), pp.504-511, 2015.

P. Viola and M. Jones, Rapid object detection using a boosted cascade of simple features, Computer Vision and Pattern Recognition, vol.1, p.511, 2001.

D. A. Ross, J. Lim, R. Lin, and M. Yang, Incremental learning for robust visual tracking, International Journal of Computer Vision, vol.77, issue.1-3, pp.125-141, 2008.

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, 2014.