Unsupervised mining of audiovisually consistent segments in videos with application to structure analysis

Mathieu Ben 1, Guillaume Gravier 1
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : In this paper, a multimodal event mining technique is proposed to discover repeating video segments exhibiting audio and visual consistency in a fully unsupervised manner. The mining strategy first exploits independent audio and visual cluster analysis to provide segments that are consistent in both their visual and audio modalities and thus likely correspond to a unique underlying event. A subsequent modeling stage using discriminative models enables accurate detection of the underlying event throughout the video. Event mining is applied to unsupervised video structure analysis, using simple heuristics on occurrence patterns of the events discovered to select those relevant to the video structure. Results on TV programs ranging from news to talk shows and games show that structurally relevant events are discovered with precision ranging from 87% to 98% and recall from 59% to 94%.
Document type : Conference paper

https://hal.archives-ouvertes.fr/hal-00646603
Contributor : Guillaume Gravier
Submitted on : Wednesday, November 30, 2011 - 1:26:43 PM
Last modification on : Friday, November 16, 2018 - 1:23:01 AM
Long-term archiving on : Thursday, March 1, 2012 - 2:30:27 AM

File

paper_434.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00646603, version 1

Citation

Mathieu Ben, Guillaume Gravier. Unsupervised mining of audiovisually consistent segments in videos with application to structure analysis. IEEE Intl. Conf. on Multimedia and Expo, 2011, Spain. ⟨hal-00646603⟩
