Abstract: Sampling is a popular way of scaling machine learning algorithms up to large datasets. The central question is often how many samples are needed. Adaptive stopping algorithms monitor performance online and can stop early, saving valuable resources. We consider problems where probabilistic guarantees are desired and demonstrate how recently introduced empirical Bernstein bounds can be used to design efficient stopping rules. We provide upper bounds on the sample complexity of the new rules, as well as empirical results on model selection and on boosting in the filtering setting.
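To make the idea of an adaptive stopping rule concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a simplified absolute-error target: sampling stops once the empirical Bernstein half-width falls to `eps` or below. The half-width uses one commonly stated form of the bound, `std * sqrt(2 * log(3 / delta_t) / t) + 3 * R * log(3 / delta_t) / t`, for samples with range `R`. The function name `eb_stop`, its parameters, and the per-step schedule `delta_t = 6 * delta / (pi^2 * t^2)` (a union bound over all steps) are illustrative assumptions.

```python
import math
import random


def eb_stop(sample, value_range, eps, delta, max_samples=1_000_000):
    """Draw samples until the empirical Bernstein half-width is <= eps.

    Hypothetical helper for illustration only. `sample` is a zero-argument
    callable returning values in an interval of width `value_range`.
    Returns (mean estimate, samples used, final half-width).
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's update)
    half_width = math.inf
    while n < max_samples:
        x = sample()
        n += 1
        d = x - mean
        mean += d / n
        m2 += d * (x - mean)

        # Assumed schedule: per-step failure probabilities sum to delta.
        delta_t = 6.0 * delta / (math.pi ** 2 * n ** 2)
        log_term = math.log(3.0 / delta_t)

        # Empirical standard deviation (biased variance estimate).
        std = math.sqrt(max(m2 / n, 0.0))

        # Variance-dependent term plus range-dependent term.
        half_width = (std * math.sqrt(2.0 * log_term / n)
                      + 3.0 * value_range * log_term / n)
        if half_width <= eps:
            break
    return mean, n, half_width


if __name__ == "__main__":
    rng = random.Random(0)
    bernoulli = lambda: 1.0 if rng.random() < 0.3 else 0.0
    mean, n, hw = eb_stop(bernoulli, value_range=1.0, eps=0.1, delta=0.05)
    print(f"estimate={mean:.3f} after {n} samples (half-width {hw:.3f})")
```

Because the half-width depends on the observed variance, a low-variance source stops after fewer samples than a high-variance one under the same `eps` and `delta`, which is the resource saving the abstract refers to.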
https://hal-enpc.archives-ouvertes.fr/hal-00834983
Contributor: Pascal Monasse
Submitted on: Tuesday, June 18, 2013 - 2:46:49 PM
Last modification on: Thursday, March 17, 2022 - 10:08:39 AM
Long-term archiving on: Thursday, September 19, 2013 - 4:08:00 AM