Neocognitron: a self-organizing neural network for a mechanism of pattern recognition unaffected by shift in position, Biol. Cybern, vol.36, issue.4, pp.193-202, 1980. ,
Gradient-based learning applied to document recognition, Proc. IEEE, vol.86, issue.11, pp.2278-2324, 1998. ,
ImageNet classification with deep convolutional neural networks, Proceedings of the Advances in Neural Information Processing Systems, pp.1097-1105, 2012. ,
ImageNet Large scale visual recognition challenge, Int. J. Comput. Vis. (IJCV), vol.115, issue.3, pp.211-252, 2015. ,
, , 2015.
Large-scale machine learning with stochastic gradient descent, Proceedings of the COMPSTAT'2010, pp.177-186, 2010. ,
Deep residual learning for image recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.770-778, 2016. ,
Very deep convolutional networks for large-scale image recognition, 2014. ,
, Proceedings of the Advances in Neural Information Processing Systems 28, pp.685-693, 2015.
Large scale distributed deep networks, Proceedings of the Advances in Neural Information Processing Systems 25, pp.1223-1231, 2012. ,
Randomized gossip algorithms, IEEE Trans. Inf. Theory, vol.52, 2006. ,
Gossip dual averaging for decentralized optimization of pairwise functions, Proceedings of the Thirty-Third International Conference on Machine Learning, Proceedings of Machine Learning Research, vol.48, pp.1388-1396, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01329315
Asynchronous gossip principal components analysis, Neurocomputing, vol.169, pp.262-271, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01148639
Epidemic k-means clustering, Proceedings of the IEEE Eleventh International Conference on Data Mining Workshops (ICDMW), pp.151-158, 2011. ,
Decentralized k-means using randomized gossip protocols for clustering large datasets, Proceedings of the IEEE Thirteenth International Conference on Data Mining Workshops, pp.599-606, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00915822
, The Nature of Statistical Learning Theory, 1995.
, Linear and Nonlinear Programming, 2008.
Adam: a method for stochastic optimization, Proceedings of the International Conference on Learning Representations, 2015. ,
Overview of mini-batch gradient descent, Neural Networks for Machine Learning, vol.6, 2013. ,
Densely connected convolutional networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.4700-4708, 2017. ,
Theano-MPI: a theano-based distributed training framework, Proceedings of the European Conference on Parallel Processing, pp.800-813, 2016. ,
The loss surfaces of multilayer networks, Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, vol.38, pp.192-204, 2015. ,
Gossip-based computation of aggregate information, Proceedings of the Forty-Fourth Annual IEEE Symposium on Foundations of Computer Science, pp.4-82, 2003. ,
Distributed optimization and statistical learning via the alternating direction method of multipliers, Found. Trends Mach. Learn, vol.3, issue.1, pp.1-122, 2011. ,
Learning Multiple Layers of Features From Tiny Images, 2009. ,
, Proceedings of the International Conference on Machine Learning, pp.1058-1066, 2013.
, Torch, vol.7