J. Weng, J. McClelland, A. Pentland, O. Sporns, I. Stockman et al., Artificial Intelligence: Autonomous Mental Development by Robots and Animals, Science, vol.291, issue.5504, pp.599-600, 2001.
DOI : 10.1126/science.291.5504.599

P. Fitzpatrick, A. Needham, L. Natale, and G. Metta, Shared challenges in object perception for robots and infants, Infant and Child Development, vol.17, issue.1, pp.7-24, 2008.
DOI : 10.1002/icd.541

F. Kaplan and P. Oudeyer, The progress-drive hypothesis: an interpretation of early imitation, in Models and mechanisms of imitation and social learning: Behavioural, social and communication dimensions, 2006.

J. Piaget, Play, dreams and imitation in childhood, 1999.

R. Rouanet, P. Oudeyer, and D. Filliat, An integrated system for teaching new visually grounded words to a robot for non-expert users using a mobile device, 2009 9th IEEE-RAS International Conference on Humanoid Robots, 2009.
DOI : 10.1109/ICHR.2009.5379540
URL : https://hal.archives-ouvertes.fr/inria-00420249

M. Fiala, ARTag, a Fiducial Marker System Using Digital Techniques, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.590-596, 2005.
DOI : 10.1109/CVPR.2005.74

P. Viola and M. J. Jones, Robust Real-Time Face Detection, International Journal of Computer Vision, vol.57, issue.2, pp.137-154, 2004.
DOI : 10.1023/B:VISI.0000013087.49260.fb
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.9805

J. Fritsch, S. Lang, M. Kleinehagenbrock, G. A. Fink, and G. Sagerer, Improving adaptive skin color segmentation by incorporating results from face detection, Proceedings. 11th IEEE International Workshop on Robot and Human Interactive Communication, pp.337-343, 2002.
DOI : 10.1109/ROMAN.2002.1045645

D. Beale, P. Iravani, and P. Hall, Probabilistic models for robot-based object segmentation, Robotics and Autonomous Systems, vol.59, issue.12, pp.1080-1089, 2011.
DOI : 10.1016/j.robot.2011.08.003

L. Natale, F. Orabona, F. Berton, G. Metta, and G. Sandini, From sensorimotor development to object perception, 5th IEEE-RAS International Conference on Humanoid Robots, pp.226-231, 2005.
DOI : 10.1109/ICHR.2005.1573572

Z. W. Pylyshyn, Visual indexes, preconceptual objects, and situated vision, Cognition, vol.80, issue.1-2, pp.127-158, 2001.
DOI : 10.1016/S0010-0277(00)00156-6

D. Walther and C. Koch, Modeling attention to salient proto-objects, Neural Networks, vol.19, issue.9, pp.1395-1407, 2006.
DOI : 10.1016/j.neunet.2006.10.001

H. Wersing, S. Kirstein, M. Götting, H. Brandl, M. Dunn et al., Online Learning of Objects in a Biologically Motivated Visual Architecture, International Journal of Neural Systems, vol.17, issue.04, pp.219-230, 2007.
DOI : 10.1142/S0129065707001081

D. G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004.
DOI : 10.1023/B:VISI.0000029664.99615.94
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.4931

H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, Speeded-Up Robust Features (SURF), Computer Vision and Image Understanding, vol.110, issue.3, pp.346-359, 2008.
DOI : 10.1016/j.cviu.2007.09.014
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.205.738

W. Förstner, A framework for low level feature extraction, European Conf. on Computer Vision (ECCV), pp.383-394, 1994.
DOI : 10.1007/BFb0028370

T. Dickscheid, F. Schindler, and W. Förstner, Detecting interpretable and accurate scale-invariant keypoints, IEEE Int. Conf. on Computer Vision, pp.2256-2263, 2009.

K. Mikolajczyk and C. Schmid, Scale & Affine Invariant Interest Point Detectors, International Journal of Computer Vision, vol.60, issue.1, pp.63-86, 2004.
DOI : 10.1023/B:VISI.0000027790.02288.f2
URL : https://hal.archives-ouvertes.fr/inria-00548554

J. Shi and C. Tomasi, Good features to track, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp.593-600, 1994.

T. Dickscheid, F. Schindler, and W. Förstner, Coding Images with Local Features, International Journal of Computer Vision, vol.94, issue.2, pp.154-174, 2011.
DOI : 10.1007/s11263-010-0340-z

C. Tomasi and T. Kanade, Detection and tracking of point features, 1991.

D. Filliat, A visual bag of words method for interactive qualitative localization and mapping, Proceedings 2007 IEEE International Conference on Robotics and Automation, pp.3921-3926, 2007.
DOI : 10.1109/ROBOT.2007.364080
URL : https://hal.archives-ouvertes.fr/hal-00640996

J. Sivic and A. Zisserman, Video Google: a text retrieval approach to object matching in videos, Proceedings Ninth IEEE International Conference on Computer Vision, pp.1470-1477, 2003.
DOI : 10.1109/ICCV.2003.1238663

B. Micusik and J. Kosecka, Semantic segmentation of street scenes by superpixel co-occurrence and 3D geometry, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, pp.625-632, 2009.
DOI : 10.1109/ICCVW.2009.5457645

A. R. Smith, Color gamut transform pairs, ACM SIGGRAPH Computer Graphics, vol.12, issue.3, pp.12-19, 1978.
DOI : 10.1145/965139.807361

R. Fergus, P. Perona, and A. Zisserman, Object class recognition by unsupervised scale-invariant learning, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.264-271, 2003.
DOI : 10.1109/CVPR.2003.1211479
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.114.7863

C. Kemp and A. Edsinger, What can I control?: The development of visual categories for a robot's body and the world that it influences, IEEE Int. Conf. on Development and Learning (ICDL), Special Session on Autonomous Mental Development, 2006.