Book chapter, 2007

Statistical Audio-Visual Data Fusion for Video Scene Segmentation

Abstract

Automatic segmentation of video into semantic units is important for organizing effective content-based access to long videos. In this work we focus on segmenting video into narrative units called scenes: aggregates of shots unified by a common dramatic event or locale. We derive a statistical scene segmentation approach that detects scene boundaries in a single pass, fusing multimodal audio-visual features in a symmetric and scalable manner. The approach properly handles the variability of real-valued features, models their conditional dependence on the context, and integrates prior information about scene duration. Two kinds of features, extracted in the visual and audio domains, are proposed. We report the results of experimental evaluations carried out on ground-truth video; they show that our approach effectively fuses multiple modalities and achieves higher performance than an alternative rule-based fusion technique.
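The kind of one-pass statistical fusion the abstract describes can be illustrated with a minimal sketch. This is not the chapter's actual model: the feature values, the naive-Bayes-style log-odds combination, the geometric duration prior, and the `MEAN_SCENE_LEN` parameter are all illustrative assumptions, shown only to make the idea of symmetrically fusing per-modality scores with a scene-duration prior concrete.

```python
import math

# Hypothetical per-shot-boundary scores (not from the chapter's data):
# each entry is (visual_dissimilarity, audio_dissimilarity) in [0, 1].
shot_boundaries = [(0.2, 0.1), (0.9, 0.8), (0.3, 0.2), (0.7, 0.9), (0.1, 0.1)]

MEAN_SCENE_LEN = 3  # assumed mean scene length in shots, for the duration prior

def boundary_log_odds(visual, audio, shots_since_cut):
    """Fuse the two modalities symmetrically in the log domain
    (naive-Bayes style) and add a simple geometric duration prior."""
    eps = 1e-6
    # Treat each dissimilarity score as a pseudo-probability of a scene cut.
    lv = math.log((visual + eps) / (1 - visual + eps))
    la = math.log((audio + eps) / (1 - audio + eps))
    # Geometric prior: a cut becomes more likely as the scene grows longer.
    p_cut = 1 - (1 - 1 / MEAN_SCENE_LEN) ** shots_since_cut
    prior = math.log((p_cut + eps) / (1 - p_cut + eps))
    return lv + la + prior

def segment(boundaries, threshold=0.0):
    """Single pass over candidate boundaries; return indices declared as cuts."""
    cuts, since = [], 1
    for i, (v, a) in enumerate(boundaries):
        if boundary_log_odds(v, a, since) > threshold:
            cuts.append(i)
            since = 1
        else:
            since += 1
    return cuts

print(segment(shot_boundaries))  # → [1, 3]
```

Working in log-odds makes the fusion symmetric and scalable in the sense the abstract mentions: each additional modality contributes one additive term, and the duration prior enters the same sum rather than being applied as a separate rule.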
No file deposited

Dates and versions

hal-01589551 , version 1 (18-09-2017)

Identifiers

Cite

Vyacheslav Parshin, Liming Chen. Statistical Audio-Visual Data Fusion for Video Scene Segmentation. In Yujin Zhang (Ed.), Semantic-Based Visual Information Retrieval, Idea Group Inc., pp. 68-89, 2007. ⟨10.4018/978-1-59904-370-8.ch004⟩. ⟨hal-01589551⟩