K. Som and K. Sparse, average over 100 maps and standard deviation between parentheses) for the "trajectories" dataset. Parameters for the methods are given between parentheses after the method name (% of entropy preserved in the projection for K-PCA SOM, and maximum mass, ?, and update parameter, ?, for random ascending updates in sparse K-SOM). Columns: Methods, QE (×100), ICI, TE (%), CPU time, Stability (%), Dimension.

B. Penn, Using self-organizing maps to visualize high-dimensional data, Computers & Geosciences, vol.31, issue.5, pp.531-544, 2005.
DOI : 10.1016/j.cageo.2004.10.009

G. Pölzlbauer, M. Dittenbach, and A. Rauber, Advanced visualization of Self-Organizing Maps with vector fields, Self Organising Maps - WSOM'05, pp.911-922, 2006.
DOI : 10.1016/j.neunet.2006.05.013

P. Sarlin and S. Rönnqvist, Cluster Coloring of the Self-Organizing Map: An Information Visualization Perspective, 2013 17th International Conference on Information Visualisation, pp.532-538
DOI : 10.1109/IV.2013.72

A. Neme, J. Pulido, A. Muñoz, S. Hernández et al., Stylistics analysis and authorship attribution algorithms based on self-organizing maps, Self-Organizing Maps Subtitle of the special issue: Selected Papers from the Workshop on Self-Organizing Maps 2012, pp.147-159, 2012.
DOI : 10.1016/j.neucom.2014.03.064

Z. Yu, H. Wong, J. You, and G. Han, Visual query processing for efficient image retrieval using a SOM-based filter-refinement scheme, Information Sciences, vol.203, pp.83-101, 2012.
DOI : 10.1016/j.ins.2012.03.012

A. Abbott and J. Forrest, Optimal Matching Methods for Historical Sequences, Journal of Interdisciplinary History, vol.16, issue.3, pp.471-494, 1986.
DOI : 10.2307/204500

C. Elzinga, Sequence Similarity, Sociological Methods & Research, vol.32, issue.1, pp.3-29
DOI : 10.1177/0049124103253373

C. Lozupone, M. Hamady, S. Kelley, and R. Knight, Quantitative and qualitative β diversity measures lead to different insights into factors that structure microbial communities, Applied and Environmental Microbiology, pp.1576-1585, 2007.
DOI : 10.1128/aem.01996-06

URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1828774

Z. Yu, J. You, L. Li, H. Wong, and G. Han, Representative Distance: A New Similarity Measure for Class Discovery From Gene Expression Data, IEEE Transactions on NanoBioscience, vol.11, issue.4, pp.341-351, 2012.
DOI : 10.1109/TNB.2012.2208198

M. Cottrell and P. Letrémy, How to use the Kohonen algorithm to simultaneously analyze individuals and modalities in a survey, Neurocomputing, vol.63, pp.193-207, 2005.
DOI : 10.1016/j.neucom.2004.04.011

T. Kohonen and P. Somervuo, Self-organizing maps of symbol strings, Neurocomputing, vol.21, issue.1-3, pp.19-30, 1998.
DOI : 10.1016/S0925-2312(98)00031-9

B. Conan-Guez, F. Rossi, and A. El Golli, Fast algorithm and implementation of dissimilarity self-organizing maps, Neural Networks, vol.19, issue.6-7, pp.855-863, 2006.
DOI : 10.1016/j.neunet.2006.05.002

URL : https://hal.archives-ouvertes.fr/inria-00174196

N. Aronszajn, Theory of reproducing kernels, Transactions of the American Mathematical Society, vol.68, issue.3, pp.337-404, 1950.
DOI : 10.1090/S0002-9947-1950-0051437-7

L. Goldfarb, A unified approach to pattern recognition, Pattern Recognition, vol.17, issue.5, pp.575-582, 1984.
DOI : 10.1016/0031-3203(84)90056-6

D. MacDonald and C. Fyfe, The kernel self organising map, Proceedings of 4th International Conference on knowledge-based Intelligence Engineering Systems and Applied Technologies, pp.317-320, 2000.

R. Boulet, B. Jouve, F. Rossi, and N. Villa, Batch kernel SOM and related Laplacian methods for social network analysis, Neurocomputing, vol.71, issue.7-9, pp.1257-1273, 2008.
DOI : 10.1016/j.neucom.2007.12.026

URL : https://hal.archives-ouvertes.fr/hal-00202339

M. Olteanu and N. Villa-Vialaneix, On-line relational and multiple relational SOM, Neurocomputing, vol.147, pp.15-30, 2015.
DOI : 10.1016/j.neucom.2013.11.047

URL : https://hal.archives-ouvertes.fr/hal-01063831

B. Hammer and A. Hasenfuss, Topographic Mapping of Large Dissimilarity Data Sets, Neural Computation, vol.22, issue.9, pp.2229-2284, 2010.
DOI : 10.1162/jmlr.2003.4.6.1001

D. Hofmann, F. Schleif, B. Paaßen, and B. Hammer, Learning interpretable kernelized prototype-based models, Neurocomputing, vol.141, pp.84-96, 2014.
DOI : 10.1016/j.neucom.2014.03.003

C. Chu, S. Kim, Y. Lin, Y. Yu, G. Bradski et al., Map-Reduce for machine learning on multicore, Advances in Neural Information Processing Systems (NIPS 2010), pp.281-288, 2010.

X. Chen and M. Xie, A split-and-conquer approach for analysis of extraordinarily large data, Statistica Sinica, vol.24, pp.1655-1684, 2014.

S. del Río, V. López, J. Benítez, and F. Herrera, On the use of MapReduce for imbalanced big data using Random Forest, Information Sciences, vol.285, pp.112-137, 2014.
DOI : 10.1016/j.ins.2014.03.043

M. Bădoiu, S. Har-Peled, and P. Indyk, Approximate clustering via coresets, Proceedings of the 34th annual ACM Symposium on Theory of Computing, pp.250-257, 2002.

D. Yan, L. Huang, and M. Jordan, Fast approximate spectral clustering, Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '09, pp.907-916, 2009.
DOI : 10.1145/1557019.1557118

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.148.1997

A. Kleiner, A. Talwalkar, P. Sarkar, and M. Jordan, A scalable bootstrap for massive data, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.76, issue.4, pp.795-816, 2014.
DOI : 10.1111/rssb.12050

URL : http://arxiv.org/abs/1112.5016

N. Laptev, K. Zeng, and C. Zaniolo, Early accurate results for advanced analytics on MapReduce, Proceedings of the 28th International Conference on Very Large Data Bases of Proceedings of the VLDB Endowment, 2012.
DOI : 10.14778/2336664.2336675

URL : http://arxiv.org/abs/1207.0142

X. Meng, Scalable simple random sampling and stratified sampling, Proceedings of the 30th International Conference on Machine Learning (ICML 2013), 2013.

A. Saffari, C. Leistner, J. Santner, M. Godec, and H. Bischof, On-line Random Forests, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, pp.1393-1400, 2009.
DOI : 10.1109/ICCVW.2009.5457447

M. Denil, D. Matheson, and N. de Freitas, Consistency of online random forests, Proceedings of the 30th International Conference on Machine Learning, pp.1256-1264, 2013.

C. Williams and M. Seeger, Using the Nyström method to speed up kernel machines, Advances in Neural Information Processing Systems (Proceedings of NIPS 2000) Neural Information Processing Systems Foundation, 2000.

R. Hocking, The analysis and selection of variables in linear regression, Biometrics

R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society, series B, vol.58, issue.1, pp.267-288, 1996.
DOI : 10.1111/j.1467-9868.2011.00771.x

Z. Yu, P. Luo, J. You, H. S. Wong, H. Leung et al., Incremental Semi-Supervised Clustering Ensemble for High Dimensional Data Clustering, IEEE Transactions on Knowledge and Data Engineering, vol.28, issue.3, pp.701-714, 2016.
DOI : 10.1109/TKDE.2015.2499200

C. Bouveyron and C. Brunet-Saumard, Model-based clustering of high-dimensional data: A review, Computational Statistics & Data Analysis, vol.71, pp.52-78, 2014.
DOI : 10.1016/j.csda.2012.12.008

URL : https://hal.archives-ouvertes.fr/hal-00750909

F. Rossi, A. Hasenfuss, and B. Hammer, Accelerating relational clustering algorithms with sparse prototype representation, Proceedings of the 6th Workshop on Self-Organizing Maps (WSOM 07), 2007.

D. Hofmann, A. Gisbrecht, and B. Hammer, Efficient approximations of robust soft learning vector quantization for non-vectorial data, Neurocomputing, vol.147, pp.96-106, 2015.
DOI : 10.1016/j.neucom.2013.11.044

A. Gisbrecht, B. Mokbel, and B. Hammer, The Nyström approximation for relational generative topographic mappings, NIPS workshop on challenges of Data Visualization, 2010.

X. Zhu, A. Gisbrecht, F. Schleif, and B. Hammer, Approximation techniques for clustering dissimilarity data, Neurocomputing, vol.90, pp.72-84, 2012.
DOI : 10.1016/j.neucom.2012.01.033

A. Gisbrecht, A. Schulz, and B. Hammer, Parametric nonlinear dimensionality reduction using kernel t-SNE, Neurocomputing, vol.147, pp.71-82, 2015.
DOI : 10.1016/j.neucom.2013.11.045

URL : https://pub.uni-bielefeld.de/download/2671047/2905035

B. Schölkopf, A. Smola, and K. Müller, Nonlinear Component Analysis as a Kernel Eigenvalue Problem, Neural Computation, vol.10, issue.5, pp.1299-1319, 1998.
DOI : 10.1162/089976698300017467

S. Kumar, M. Mohri, and A. Talwalkar, Sampling techniques for the Nyström method, Journal of Machine Learning Research, vol.13, pp.981-1006, 2012.

M. Olteanu and N. Villa-Vialaneix, Sparse Online Self-Organizing Maps for Large Relational Data, Advances in Self-Organizing Maps and Learning Vector Quantization (Proceedings of WSOM 2016) of Advances in Intelligent Systems and Computing, pp.27-37, 2016.
DOI : 10.1007/978-3-319-28518-4_6

URL : https://hal.archives-ouvertes.fr/hal-01270710

Y. Chen, E. Garcia, M. Gupta, A. Rahimi, and L. Cazzanti, Similarity-based classification: concepts and algorithm, Journal of Machine Learning Research, vol.10, pp.747-776, 2009.

G. Pölzlbauer, Survey and comparison of quality measures for self-organizing maps, Proceedings of the Fifth Workshop on Data Analysis, pp.67-82, 2004.

L. Danon, A. Díaz-Guilera, J. Duch, and A. Arenas, Comparing community structure identification, Journal of Statistical Mechanics, p.P09008, 2005.
DOI : 10.1088/1742-5468/2005/09/p09008

URL : http://arxiv.org/abs/cond-mat/0505245

M. Newman and M. Girvan, Finding and evaluating community structure in networks, Physical Review E, vol.69, issue.2, p.026113, 2004.
DOI : 10.1103/PhysRevE.69.026113

URL : http://arxiv.org/abs/cond-mat/0308217

J. Boelaert, L. Bendhaïba, M. Olteanu, and N. Villa-Vialaneix, SOMbrero: An R Package for Numeric and Non-numeric Self-Organizing Maps, Advances in Self-Organizing Maps and Learning Vector Quantization (Proceedings of WSOM 2014) of Advances in Intelligent Systems and Computing, pp.219-228, 2014.
DOI : 10.1007/978-3-319-07695-9_21

URL : https://hal.archives-ouvertes.fr/hal-01018732

P. Hebert, E. Penton, J. Burns, D. Janzen, and W. Hallwachs, Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator, Proceedings of the National Academy of Sciences, vol.101, issue.41, pp.14812-14817, 2004.
DOI : 10.1073/pnas.0406166101

M. Kimura, A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences, Journal of Molecular Evolution, vol.16, issue.2, pp.111-120, 1980.
DOI : 10.1007/BF01731581

L. Adamic and N. Glance, The political blogosphere and the 2004 US election: divided they blog, Proceedings of the 3rd LinkKDD Workshop, pp.36-43, 2005.

C. Meyer and G. Paulay, DNA Barcoding: Error Rates Based on Comprehensive Sampling, PLoS Biology, vol.3, issue.12
DOI : 10.1371/journal.pbio.0030422

URL : http://doi.org/10.1371/journal.pbio.0030422

P. Cortez, A. Cerdeira, F. Almeida, T. Matos, and J. Reis, Modeling wine preferences by data mining from physicochemical properties, Decision Support Systems, vol.47, issue.4, pp.547-553, 2009.
DOI : 10.1016/j.dss.2009.05.016

E. Côme, M. Cottrell, and P. Gaubert, Analysis of professional trajectories using disconnected self-organizing maps, Self-Organizing Maps Subtitle of the special issue: Selected Papers from the Workshop on Self-Organizing Maps 2012, pp.185-196, 2012.
DOI : 10.1016/j.neucom.2013.12.058

S. Needleman and C. Wunsch, A general method applicable to the search for similarities in the amino acid sequence of two proteins, Journal of Molecular Biology, vol.48, issue.3, pp.443-453, 1970.
DOI : 10.1016/0022-2836(70)90057-4

J. Mariette and N. Villa-Vialaneix, Aggregating Self-Organizing Maps with Topology Preservation, Advances in Self-Organizing Maps and Learning Vector Quantization (Proceedings of WSOM 2016) of Advances in Intelligent Systems and Computing, pp.27-37, 2016.
DOI : 10.1007/978-3-319-28518-4_2

URL : https://hal.archives-ouvertes.fr/hal-01270640