Real-Time Audio-Visual Analysis for Multiperson Videoconferencing

Petr Motlicek; Stefan Duffner; Danil Korchagin; Hervé Bourlard; Carl Scheffler; Jean-Marc Odobez; Giovanni del Galdo; Markus Kallinger; Oliver Thiergart

doi:10.1155/2013/175745

Journal Article, Advances in Multimedia, Year: 2013

Petr Motlicek

Role: Author

IDIAP Research Institute

Stefan Duffner

Role: Author
PersonId : 3908
IdHAL : stefan-duffner
ORCID : 0000-0003-0374-3814
IdRef : 203322940

IDIAP Research Institute

Extraction de Caractéristiques et Identification

Danil Korchagin

Role: Author

IDIAP Research Institute

Hervé Bourlard

Role: Author

IDIAP Research Institute

Carl Scheffler

Role: Author

IDIAP Research Institute

Jean-Marc Odobez

Role: Author

IDIAP Research Institute

Giovanni del Galdo

Role: Author

Fraunhofer Institute for Integrated Circuits

Markus Kallinger

Role: Author

Fraunhofer Institute for Integrated Circuits

Oliver Thiergart

Role: Author

Fraunhofer Institute for Integrated Circuits

Abstract

We describe the design of a system consisting of several state-of-the-art real-time audio and video processing components enabling multimodal stream manipulation (e.g., automatic online editing for multiparty videoconferencing applications) in open, unconstrained environments. The underlying algorithms are designed to allow multiple people to enter, interact, and leave the observable scene with no constraints. They comprise continuous localisation of audio objects and its application for spatial audio object coding; detection and tracking of faces; estimation of head poses and visual focus of attention; detection and localisation of verbal and paralinguistic events; and the association and fusion of these different events. Combined, they represent multimodal streams with audio objects and semantic video objects and provide semantic information for stream manipulation systems (like a virtual director). Various experiments have been performed to evaluate the performance of the system. The obtained results demonstrate the effectiveness of the proposed design, the various algorithms, and the benefit of fusing different modalities in this scenario.

Domains

Computer Science [cs]

Main file

Liris-6300.pdf (4.96 MB)

Origin: Publisher files allowed on an open archive

https://hal.science/hal-01339256

Submitted on: Monday, 13 March 2017, 10:21:06

Last modified on: Wednesday, 5 July 2023, 15:28:04

Long-term archiving on: Wednesday, 14 June 2017, 12:57:08

Dates and versions

hal-01339256 , version 1 (13-03-2017)

License

Attribution

Identifiers

HAL Id : hal-01339256 , version 1
DOI : 10.1155/2013/175745

Cite

Petr Motlicek, Stefan Duffner, Danil Korchagin, Hervé Bourlard, Carl Scheffler, et al. Real-Time Audio-Visual Analysis for Multiperson Videoconferencing. Advances in Multimedia, 2013, 2013, pp.175745:1-175745:21. ⟨10.1155/2013/175745⟩. ⟨hal-01339256⟩

Collections

CNRS UNIV-LYON1 UNIV-LYON2 INSA-LYON EC-LYON LIRIS INSA-GROUPE UDL

