Handbook of massive data sets, vol.4, 2013.
Probabilistic polynomials and Hamming nearest neighbors, IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, pp.136-150, 2015.
Beyond locality-sensitive hashing, Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pp.1018-1028, 2014.
Optimal data-dependent hashing for approximate near neighbors, Proceedings of the forty-seventh annual ACM symposium on Theory of computing, pp.793-801, 2015.
On the fine-grained complexity of empirical risk minimization: Kernel methods and neural networks, Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems, pp.4311-4321, 2017.
A discriminative framework for clustering via similarity functions, Proceedings of the fortieth annual ACM symposium on Theory of computing, pp.671-680, 2008.
Subquadratic approximation algorithms for clustering problems in high dimensional spaces, Proceedings of the thirty-first annual ACM symposium on Theory of computing, pp.435-444, 1999.
Genome-wide expression analysis of plant cell cycle modulated genes, Current opinion in plant biology, vol.4, issue.2, pp.136-142, 2001.
Characterization, stability and convergence of hierarchical clustering methods, Journal of machine learning research, vol.11, pp.1425-1470, 2010.
Approximate hierarchical clustering via sparsest cut and spreading metrics, Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pp.841-854, 2017.
Hierarchical clustering better than average-linkage, Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pp.2291-2304, 2019.
Hierarchical clustering for Euclidean data, 2018.
On coresets for k-median and k-means clustering in metric and Euclidean spaces and their applications, SIAM Journal on Computing, vol.39, issue.3, pp.923-947, 2009.
Twister tries: Approximate hierarchical agglomerative clustering for average distance in linear time, Proceedings of the 2015 ACM SIGMOD international conference on Management of data, pp.505-517, 2015.
Hierarchical clustering beyond the worst-case, Advances in Neural Information Processing Systems, pp.6201-6209, 2017.
Hierarchical clustering: Objective functions and algorithms, Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pp.378-397, 2018.
URL : https://hal.archives-ouvertes.fr/hal-02169539
A cost function for similarity-based hierarchical clustering, 2015.
Locality-sensitive hashing scheme based on p-stable distributions, Proceedings of the twentieth annual symposium on Computational geometry, pp.253-262, 2004.
A novel brain partition highlights the modular skeleton shared by structure and function, Scientific reports, vol.5, p.10532, 2015.
Fast agglomerative clustering using a k-nearest neighbor graph, IEEE transactions on pattern analysis and machine intelligence, vol.28, pp.1875-1881, 2006.
The elements of statistical learning, Springer series in statistics, vol.1, 2001.
Approximate nearest neighbor: Towards removing the curse of dimensionality, Theory of computing, vol.8, issue.1, pp.321-350, 2012.
On the complexity of k-SAT, Journal of Computer and System Sciences, vol.62, issue.2, pp.367-375, 2001.
Which problems have strongly exponential complexity?, Journal of Computer and System Sciences, vol.63, issue.4, pp.512-530, 2001.
NC-link: A new linkage method for efficient hierarchical clustering of large-scale data, IEEE Access, vol.5, pp.5594-5608, 2017.
On closest pair in Euclidean metric: Monochromatic is as hard as bichromatic, 10th Innovations in Theoretical Computer Science Conference, ITCS 2019, vol.17, p.16, 2019.
Fast approximate hierarchical clustering using similarity heuristics, BioData mining, vol.1, issue.1, p.9, 2008.
Mining of massive datasets, 2014.
Approximation bounds for hierarchical clustering: Average linkage, bisecting k-means, and local search, Advances in Neural Information Processing Systems, pp.3094-3103, 2017.
Scalable nearest neighbor algorithms for high dimensional data, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.36, 2014.
A survey of recent advances in hierarchical clustering algorithms, The Computer Journal, vol.26, issue.4, pp.354-359, 1983.
Comments on 'parallel algorithms for hierarchical clustering and cluster validity', IEEE Trans. Pattern Anal. Mach. Intell., vol.14, issue.10, pp.1056-1057, 1992.
Approximate k-nearest neighbour based spatial clustering using kd tree, 2013.
Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, vol.12, pp.2825-2830, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00650905
Hierarchical clustering via spreading metrics, Advances in Neural Information Processing Systems, pp.2316-2324, 2016.
Hardness of approximate nearest neighbor search, Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pp.1260-1268, 2018.
Introduction to information retrieval, vol.39, 2008.
argmin_j avg(C, ?_j). We now argue that avg(C, ?_{j?}) + w_C + w_{?_{j?}} ? ?(1 ?
Consider the data structure D_{i,?}. By its correctness, D_{i,?} returned a point p? such that ||q?(C) − p?||_1 ? ?p ||q?(C) − d_i(?)||_1. Thus, applying Claim 1 yields that avg_w(C, ??) + w_C + w_{??} ? ?(1 + ?)(avg(C, ?) + w_C + w_?). By the choice of j?, Let ? = argmin_{C' ? C} (avg(C, C') + w_C + w_{C'}) and ? = |?|