References

P. Comon and C. Jutten, Handbook of Blind Source Separation: Independent Component Analysis and Applications. Academic Press, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00460653

T. Virtanen, Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria, IEEE Transactions on Audio, Speech and Language Processing, vol.15, issue.3, pp.1066-1074, 2007.
DOI : 10.1109/TASL.2006.885253

C. Févotte, N. Bertin, and J. Durrieu, Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis, Neural Computation, vol.21, issue.3, pp.793-830, 2009.
DOI : 10.1162/neco.2008.04-08-771

A. Liutkus, D. Fitzgerald, Z. Rafii, B. Pardo, and L. Daudet, Kernel Additive Models for Source Separation, IEEE Transactions on Signal Processing, vol.62, issue.16, pp.4298-4310, 2014.
DOI : 10.1109/TSP.2014.2332434
URL : https://hal.archives-ouvertes.fr/hal-01011044

P. Huang, M. Kim, M. Hasegawa-Johnson, and P. Smaragdis, Deep learning for monaural speech separation, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.
DOI : 10.1109/ICASSP.2014.6853860

A. Liutkus and R. Badeau, Generalized Wiener filtering with fractional power spectrograms, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015.
DOI : 10.1109/ICASSP.2015.7177973
URL : https://hal.archives-ouvertes.fr/hal-01110028

P. Magron, R. Badeau, and B. David, Model-Based STFT Phase Recovery for Audio Source Separation, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.26, issue.6, pp.1095-1105, 2018.
DOI : 10.1109/TASLP.2018.2811540
URL : https://hal.archives-ouvertes.fr/hal-01718718

J. Le Roux, N. Ono, and S. Sagayama, Explicit consistency constraints for STFT spectrograms and their application to phase reconstruction, Proc. ISCA Workshop on Statistical and Perceptual Audition (SAPA), 2008.

J. Le Roux and E. Vincent, Consistent Wiener filtering for audio source separation, IEEE Signal Processing Letters, vol.20, issue.3, pp.217-220, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00725350

J. Le Roux, H. Kameoka, E. Vincent, N. Ono et al., Complex NMF under spectrogram consistency constraints, Proc, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00544149

J. Bronson and P. Depalle, Phase constrained complex NMF: Separating overlapping partials in mixtures of harmonic musical sources, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.
DOI : 10.1109/ICASSP.2014.6855053

E. M. Grais, M. U. Sen, and H. Erdogan, Deep neural networks for single channel source separation, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.
DOI : 10.1109/ICASSP.2014.6854299

E. M. Grais, G. Roma, A. J. Simpson, and M. D. Plumbley, Two-Stage Single-Channel Audio Source Separation Using Deep Neural Networks, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.25, issue.9, pp.1773-1783, 2017.
DOI : 10.1109/TASLP.2017.2716443

A. A. Nugraha, A. Liutkus, and E. Vincent, Multichannel Audio Source Separation With Deep Neural Networks, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.24, issue.9, pp.1652-1664, 2016.
DOI : 10.1109/TASLP.2016.2580946
URL : https://hal.archives-ouvertes.fr/hal-01163369

N. Takahashi and Y. Mitsufuji, Multi-scale multi-band DenseNets for audio source separation, 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2017.
DOI : 10.1109/WASPAA.2017.8169987

S. I. Mimilakis, K. Drossos, T. Virtanen, and G. Schuller, A recurrent encoder-decoder approach with skip-filtering connections for monaural singing voice separation, 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), 2017.
DOI : 10.1109/MLSP.2017.8168117

S. I. Mimilakis, K. Drossos, J. F. Santos, G. Schuller, T. Virtanen et al., Monaural singing voice separation with skip-filtering connections and recurrent inference of time-frequency mask, Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018.

K. Drossos, S. I. Mimilakis, D. Serdyuk, G. Schuller, T. Virtanen et al., MaD TwinNet: Masker-denoiser architecture with twin networks for monaural sound source separation, Proc. IEEE International Joint Conference on Neural Networks (IJCNN), 2018.

A. Liutkus, F. Stöter, Z. Rafii, D. Kitamura, B. Rivet et al., The 2016 Signal Separation Evaluation Campaign, Proc. International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), 2017.
DOI : 10.1007/978-3-319-53547-0_31
URL : https://hal.archives-ouvertes.fr/hal-01472932

D. Serdyuk, N. Ke, A. Sordoni, A. Trischler, C. Pal et al., Twin Networks: Matching the future for sequence generation, Proc. International Conference on Learning Representations (ICLR), 2018.

D. Griffin and J. S. Lim, Signal estimation from modified short-time Fourier transform, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.32, issue.2, pp.236-243, 1984.
DOI : 10.1109/TASSP.1984.1164317

M. Krawczyk and T. Gerkmann, STFT Phase Reconstruction in Voiced Speech for an Improved Single-Channel Speech Enhancement, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, issue.12, pp.1931-1940, 2014.
DOI : 10.1109/TASLP.2014.2354236

J. Laroche and M. Dolson, Improved phase vocoder time-scale modification of audio, IEEE Transactions on Speech and Audio Processing, vol.7, issue.3, pp.323-332, 1999.
DOI : 10.1109/89.759041

P. Magron, R. Badeau, and B. David, Phase-dependent anisotropic Gaussian model for audio source separation, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
DOI : 10.1109/ICASSP.2017.7952212
URL : https://hal.archives-ouvertes.fr/hal-01416355

P. Magron, J. Le Roux, and T. Virtanen, Consistent anisotropic Wiener filtering for audio source separation, 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2017.
DOI : 10.1109/WASPAA.2017.8170037
URL : https://hal.archives-ouvertes.fr/hal-01593126

K. Drossos, S. I. Mimilakis, D. Serdyuk, G. Schuller, T. Virtanen et al., MaD TwinNet pre-trained weights, Available: https, Feb. 2018.

M. Abe and J. O. Smith, Design criteria for simple sinusoidal parameter estimation based on quadratic interpolation of FFT magnitude peaks, Audio Engineering Society Convention 117, 2004.

E. Vincent, R. Gribonval, and C. Févotte, Performance measurement in blind audio source separation, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.4, pp.1462-1469, 2006.
DOI : 10.1109/TSA.2005.858005
URL : https://hal.archives-ouvertes.fr/inria-00544230

C. Raffel, B. McFee, E. J. Humphrey, J. Salamon, O. Nieto et al., mir_eval: A transparent implementation of common MIR metrics, Proc. International Society for Music Information Retrieval Conference (ISMIR), 2014.

W. Lim and T. Lee, Harmonic and percussive source separation using a convolutional auto encoder, 2017 25th European Signal Processing Conference (EUSIPCO), 2017.
DOI : 10.23919/EUSIPCO.2017.8081520
URL : https://zenodo.org/record/1159650/files/1570346835.pdf