
A multi-cue spatio-temporal framework for automatic frontal face clustering in video sequences

Abstract: Clustering detections of a specific object class is a challenging problem for video summarization. In this article, we present a method that forms tracks by grouping the face detections of a video sequence. Our clustering method is based on a probabilistic maximum a posteriori (MAP) data association framework, which we apply to face detections in a visual surveillance context. The optimal solution is found with a procedure based on network-flow algorithms described in previous pedestrian tracking-by-detection work. To address difficult cases of small detections in scenes with multiple moving people, and because the face detections are located in a video sequence, we use dissimilarities that combine appearance and spatio-temporal information. The main contribution is the use of optical flow or local front-back tracking to handle complex situations that appear in real sequences. The resulting algorithm can deal with situations where people cross one another and where face detections are scattered because of head rotation. The clustering step of our framework is compared with generic clustering methods (hierarchical clustering and affinity propagation) on several challenging real sequences; the evaluations indicate that it is better suited to video-based detection clustering. We propose a new evaluation criterion, derived from the purity and inverse purity of a clustering estimate, to assess the performance of such methods. Results also show that optical flow and a skin color prior added to face detections improve the clustering quality.
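The abstract refers to purity and inverse purity without defining them. As a minimal sketch, assuming the standard definitions of these two measures (the new criterion that the article derives from them is not reproduced here), they can be computed as follows. The function names and the toy labels are illustrative and do not come from the article.

```python
from collections import Counter


def purity(predicted, reference):
    """Standard purity: each predicted cluster is credited with the size of
    its largest overlap with a single reference identity, and the credits
    are summed and divided by the number of detections."""
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have the same length")
    if not predicted:
        raise ValueError("label sequences must not be empty")
    overlaps = {}
    for cluster, identity in zip(predicted, reference):
        overlaps.setdefault(cluster, Counter())[identity] += 1
    best = sum(max(counts.values()) for counts in overlaps.values())
    return best / len(predicted)


def inverse_purity(predicted, reference):
    """Standard inverse purity: the same computation with the roles of the
    predicted clusters and the reference identities swapped."""
    return purity(reference, predicted)


# Toy example (hypothetical data): six face detections of two people,
# grouped by a clustering method into three clusters.
reference = ["A", "A", "A", "B", "B", "B"]
predicted = [0, 0, 1, 1, 2, 2]

p = purity(predicted, reference)
ip = inverse_purity(predicted, reference)
```

In this toy case, purity is 5/6 because one cluster mixes the two people, and inverse purity is 2/3 because each person is split across clusters. Purity penalizes clusters that merge identities, while inverse purity penalizes identities that are fragmented into several clusters, which is why the two are usually reported together.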
Document type: Journal articles
Contributor: Laurent Trassoudaine
Submitted on: Wednesday, December 19, 2018 - 12:27:08 PM
Last modification on: Wednesday, April 21, 2021 - 8:34:03 AM
Long-term archiving on: Wednesday, March 20, 2019 - 6:35:08 PM


Publisher files allowed on an open archive


Distributed under a Creative Commons Attribution - NonCommercial 4.0 International License



Siméon Schwab, Thierry Chateau, Christophe Blanc, Laurent Trassoudaine. A multi-cue spatio-temporal framework for automatic frontal face clustering in video sequences. EURASIP Journal on Image and Video Processing, Springer, 2013, 2013 (1), pp.10. ⟨10.1186/1687-5281-2013-10⟩. ⟨hal-01877613⟩


