Abstract: This paper investigates the use of synthetic 3D scenes to generate ground-truth pedestrian segmentations for 2D crowd video data. Manual segmentation of objects in video is among the most time-consuming forms of assisted labeling, and the resulting lack of temporally dense, precise segmentation ground truth on large video samples leaves a significant gap in computer vision research. Such data is essential for applying machine learning techniques to automatic pedestrian segmentation, as well as to many other applications involving occluded people. We present a new dataset of 1.8 million pedestrian silhouettes exhibiting the human-to-human occlusion patterns typical of real crowd video. To our knowledge, it is the first publicly available large dataset of pedestrian-in-crowd silhouettes. We detail our solutions for generating and representing this data, discuss how this ground truth can serve a wide range of computer vision applications, and demonstrate its use on a camera calibration toy problem.