J. A. Benediktsson and P. Ghamisi, Spectral-Spatial Classification of Hyperspectral Remote Sensing Images, 2015.

G. Moser and J. Zerubia, Mathematical Models for Remote Sensing Image Processing, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01419170

Y. Chen, H. Jiang, C. Li, X. Jia, and P. Ghamisi, Deep feature extraction and classification of hyperspectral images based on convolutional neural networks, IEEE Trans. Geosci. Remote Sens, vol.54, issue.10, pp.6232-6251, 2016.

Y. Yuan, J. Fang, X. Lu, and Y. Feng, Remote sensing image scene classification using rearranged local features, IEEE Trans. Geosci. Remote Sens, vol.57, issue.3, pp.1779-1792, 2019.

X. Lu, W. Ji, X. Li, and X. Zheng, Bidirectional adaptive feature fusion for remote sensing scene classification, Neurocomputing, vol.328, pp.135-146, 2019.

G. Camps-Valls and L. Bruzzone, Kernel Methods for Remote Sensing Data Analysis, 2009.

M. Belgiu and L. Drăguţ, Random forest in remote sensing: a review of applications and future directions, ISPRS J. Photogramm. Remote Sens, vol.114, pp.24-31, 2016.

C. Chang, Statistical detection theory approach to hyperspectral image classification, IEEE Trans. Geosci. Remote Sens, vol.57, issue.4, pp.2057-2074, 2019.

G. Hughes, On the mean accuracy of statistical pattern recognizers, IEEE Trans. Inf. Theory, vol.14, issue.1, pp.55-63, 1968.

L. Qi, X. Lu, and X. Li, Exploiting spatial relation for fine-grained image classification, Pattern Recognit, vol.91, pp.47-55, 2019.

J. M. Bioucas-Dias, A. Plaza, N. Dobigeon, M. Parente, Q. Du, P. Gader, and J. Chanussot, Hyperspectral unmixing overview: geometrical, statistical, and sparse regression-based approaches, IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens, vol.5, pp.354-379, 2012.

O. Eches, N. Dobigeon, and J. Tourneret, Enhancing hyperspectral image unmixing with spatial correlations, IEEE Trans. Geosci. Remote Sens, vol.49, pp.4239-4247, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00548759

M. Aharon, M. Elad, and A. Bruckstein, K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation, IEEE Trans. Signal Process, vol.54, issue.11, p.4311, 2006.

M. Zibulevsky and B. A. Pearlmutter, Blind source separation by sparse decomposition in a signal dictionary, Neural Comput, vol.13, issue.4, pp.863-882, 2001.

C. J. Porta, A. A. Bekit, B. H. Lampe, and C. Chang, Hyperspectral image classification via compressive sensing, IEEE Trans. Geosci. Remote Sens, vol.57, issue.10, pp.8290-8303, 2019.

Y. C. Cavalcanti, T. Oberlin, N. Dobigeon, S. Stute, M. Ribeiro et al., Unmixing dynamic PET images with variable specific binding kinetics, Med. Image Anal, vol.49, pp.117-127, 2018.

Y. Koren, R. Bell, and C. Volinsky, Matrix factorization techniques for recommender systems, Computer, issue.8, pp.30-37, 2009.

E. Elhamifar and R. Vidal, Sparse subspace clustering: algorithm, theory, and applications, IEEE Trans. Pattern Anal. Mach. Intell, vol.35, issue.11, pp.2765-2781, 2013.

D. D. Lee and H. S. Seung, Learning the parts of objects by non-negative matrix factorization, Nature, vol.401, issue.6755, p.788, 1999.

D. Donoho and V. Stodden, When does non-negative matrix factorization give a correct decomposition into parts?, Adv. in Neural Information Process. Systems, pp.1141-1148, 2004.

J. Mairal, F. Bach, and J. Ponce, Task-driven dictionary learning, IEEE Trans. Pattern Anal. Mach. Intell, vol.34, issue.4, pp.791-804, 2012.
URL : https://hal.archives-ouvertes.fr/inria-00521534

X. Zheng, Y. Yuan, and X. Lu, Dimensionality reduction by spatial-spectral preservation in selected bands, IEEE Trans. Geosci. Remote Sens, vol.55, issue.9, pp.5185-5197, 2017.

Z. Zhang, W. Jiang, J. Qin, L. Zhang, F. Li et al., Jointly learning structured analysis discriminative dictionary and analysis multiclass classifier, IEEE Trans. Neural Netw. Learn. Syst, vol.29, issue.8, pp.3798-3814, 2018.

Q. Zhang and B. Li, Discriminative K-SVD for dictionary learning in face recognition, Proc. Int. Conf. on Computer Vision and Pattern Recognition (CVPR), pp.2691-2698, 2010.

Z. Jiang, Z. Lin, and L. S. Davis, Learning a discriminative dictionary for sparse coding via label consistent K-SVD, Proc. Int. Conf. on Computer Vision and Pattern Recognition (CVPR), pp.1697-1704, 2011.

C. Wang and D. M. Blei, Collaborative topic modeling for recommending scientific articles, Proc. ACM SIGKDD Int. Conf. Knowledge Discovery Data Mining, pp.448-456, 2011.

J. Yoo, M. Kim, K. Kang, and S. Choi, Nonnegative matrix partial co-factorization for drum source separation, Proc. IEEE Int. Conf. Acoust., Speech and Signal Process. (ICASSP), pp.1942-1945, 2010.

N. Yokoya, T. Yairi, and A. Iwasaki, Coupled nonnegative matrix factorization unmixing for hyperspectral and multispectral data fusion, IEEE Trans. Geosci. Remote Sens, vol.50, issue.2, pp.528-537, 2012.

N. Akhtar and A. Mian, Nonparametric coupled bayesian dictionary and classifier learning for hyperspectral classification, IEEE Trans. Neural Netw. Learn. Syst, vol.29, issue.9, pp.4038-4050, 2018.

A. Lagrange, M. Fauvel, S. May, and N. Dobigeon, Hierarchical Bayesian image analysis: from low-level modeling to robust supervised learning, Pattern Recognit, vol.85, pp.26-36, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01545393

J. Bolte, S. Sabach, and M. Teboulle, Proximal alternating linearized minimization for nonconvex and nonsmooth problems, Math. Program, vol.146, issue.1, pp.459-494, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00916090

C. Févotte and J. Idier, Algorithms for nonnegative matrix factorization with the beta-divergence, Neural Comput, vol.23, issue.9, pp.2421-2456, 2011.

I. Jolliffe, Principal component analysis, International Encyclopedia of Statistical Science, pp.1094-1096, 2011.

P. Paatero and U. Tapper, Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values, Environmetrics, vol.5, issue.2, pp.111-126, 1994.

A. M. Bruckstein, M. Elad, and M. Zibulevsky, On the uniqueness of nonnegative sparse solutions to underdetermined systems of equations, IEEE Trans. Inf. Theory, vol.54, issue.11, pp.4813-4820, 2008.

T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2009.

D. M. Kline and V. L. Berardi, Revisiting squared-error and cross-entropy functions for training neural network classifiers, Neural Comput. Appl, vol.14, issue.4, pp.310-318, 2005.

I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, 2016.

L. Condat, A convex approach to K-means clustering and image segmentation, Int. Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, pp.220-234, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01504799

F. Pompili, N. Gillis, P. Absil, and F. Glineur, Two algorithms for orthogonal nonnegative matrix factorization with application to clustering, Neurocomputing, vol.141, pp.15-25, 2014.

X. Sun, N. M. Nasrabadi, and T. D. Tran, Task-driven dictionary learning for hyperspectral image classification with structured sparsity constraints, IEEE Trans. Geosci. Remote Sens, vol.53, issue.8, pp.4457-4471, 2015.

Y. Yu, On decomposing the proximal map, Adv. in Neural Information Process. Systems, pp.91-99, 2013.

L. Drumetz, M. Veganzones, S. Henrot, R. Phlypo, J. Chanussot et al., Blind hyperspectral unmixing using an extended linear mixing model to address spectral variability, IEEE Trans. Image Process, vol.25, issue.8, pp.3890-3905, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01336279

Y. Zhang, M. Brady, and S. Smith, Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm, IEEE Trans. Med. Imaging, vol.20, pp.45-57, 2001.

M. Yang, L. Zhang, X. Feng, and D. Zhang, Fisher discrimination dictionary learning for sparse representation, Proc. IEEE Int. Conf. Computer Vision (ICCV), pp.543-550, 2011.

Y. Liu, F. Condessa, J. M. Bioucas-Dias, J. Li, P. Du et al., Convex formulation for multiband image classification with superpixel-based spatial regularization, IEEE Trans. Geosci. Remote Sens, vol.56, issue.5, pp.2704-2721, 2018.

T. Uezato, M. Fauvel, and N. Dobigeon, Hyperspectral image unmixing with LiDAR data-aided spatial regularization, IEEE Trans. Geosci. Remote Sens, vol.56, issue.2, pp.4098-4108, 2018.

P. J. Huber, Robust estimation of a location parameter, Ann. Math. Stat, vol.35, issue.1, pp.73-101, 1964.

N. Gillis and R. Luce, A fast gradient method for nonnegative sparse regression with self dictionary, IEEE Trans. Image Process, vol.27, issue.1, pp.24-37, 2018.

D. M. Strong, P. Blomgren, and T. F. Chan, Spatially adaptive local-feature-driven total variation minimizing image restoration, SPIE Statistical and Stochastic Methods in Image Processing II, vol.3167, pp.222-234, 1997.

J. M. Bioucas-Dias and M. A. Figueiredo, Alternating direction algorithms for constrained sparse regression: application to hyperspectral unmixing, Proc. IEEE GRSS Workshop Hyperspectral Image Signal Process.: Evolution in Remote Sens. (WHISPERS), pp.1-4, 2010.

M. E. Paoletti, J. M. Haut, R. Fernandez-Beltran, J. Plaza, A. J. Plaza et al., Deep pyramidal residual networks for spectral-spatial hyperspectral image classification, IEEE Trans. Geosci. Remote Sens, vol.57, issue.2, pp.740-754, 2019.

M. P. Uddin, M. A. Mamun, and M. A. Hossain, Effective feature extraction through segmentation-based folded-PCA for hyperspectral image classification, Int. J. Remote Sens, vol.40, issue.18, pp.7190-7220, 2019.

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion et al., Scikit-learn: machine learning in python, J. Mach. Learn. Res, vol.12, pp.2825-2830, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00650905

A. Lagrange, M. Fauvel, S. May, J. Bioucas-Dias, and N. Dobigeon, Matrix Cofactorization for Joint Representation Learning and Supervised Classification - Application to Hyperspectral Image Analysis. Complementary results, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02887755

R. G. Congalton and K. Green, Assessing the Accuracy of Remotely Sensed Data: Principles and Practices, 2008.

A. Stoian, V. Poulain, J. Inglada, V. Poughon, and D. Derksen, Land cover maps production with high resolution satellite image time series and convolutional neural networks: adaptations and limits for operational systems, Remote Sens, vol.11, issue.17, p.1986, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02627773

A. Lagrange, M. Fauvel, and M. Grizonnet, Large-scale feature selection with Gaussian mixture models for the classification of high dimensional remote sensing images, IEEE Trans. Comput. Imaging, vol.3, issue.2, pp.230-242, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01382500

L. Chaari, T. Vincent, F. Forbes, M. Dojat, and P. Ciuciu, Fast joint detection-estimation of evoked brain activity in event-related fMRI using a variational approach, IEEE Trans. Med. Imaging, vol.32, issue.5, pp.821-837, 2013.
URL : https://hal.archives-ouvertes.fr/inserm-00753873

P. Getreuer, Rudin-Osher-Fatemi total variation denoising using split Bregman, Image Process. On Line, vol.2, pp.74-95, 2012.

L. Condat, Fast projection onto the simplex and the l1 ball, Math. Program, vol.158, issue.1-2, pp.575-585, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01056171

R. Jenatton, J. Mairal, G. Obozinski, and F. Bach, Proximal methods for hierarchical sparse coding, J. Mach. Learn. Res, vol.12, pp.2297-2334, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00516723