Abstract : In the present study, we propose a model of multimodal place cells merging visual and proprioceptive primitives. First we will briefly present our previous sensory-motor architecture, highlighting limitations of a visual-only based system. Then we will introduce a new model of proprioceptive localization, giving rise to the so-called grid cells, wich are congruent with neurobiological studies made on rodent. Finally we will show how a simple conditionning rule between both modalities can outperform visual-only driven models by producing ro- bust multimodal place cells. Experiments show that this model enhances robot localization and also allows to solve some benchmark problems for real life robotics applications.