VIDEO SCENE SEGMENTATION OF TV SERIES USING MULTI-MODAL NEURAL FEATURES
Abstract
Scene segmentation of a video, a book, or a TV series organizes it into logical story units (LSUs) and is an essential step for representing, extracting, and understanding its narrative structure. We propose an automatic scene segmentation method for TV series that groups adjacent shots using a combination of multimodal neural features: visual and textual features, further augmented with temporal information, which may improve the clustering of adjacent shots. The reported experiments compare different feature combinations, video frame sub-sampling, and various shot clustering algorithms. The proposed method achieved good results across several metrics when tested on several seasons of two popular TV series.
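The idea of grouping adjacent shots from fused features can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not specify the features, the fusion scheme, or the clustering algorithms, so the code below assumes simple concatenation of L2-normalized visual and textual vectors plus a weighted temporal coordinate, and it substitutes a plain distance threshold between consecutive shots for a real clustering algorithm. All names (`fuse`, `segment_scenes`, `time_weight`, `threshold`) and the toy vectors are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(visual, textual, time_pos, time_weight=1.0):
    """Assumed fusion: concatenate normalized modalities and append a
    weighted temporal position, so shots far apart in time look less similar."""
    return l2_normalize(visual) + l2_normalize(textual) + [time_weight * time_pos]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def segment_scenes(shot_features, threshold):
    """Group adjacent shots into scenes (lists of shot indices).

    A new scene starts whenever the distance between two consecutive
    shots exceeds `threshold`. This thresholding is a stand-in for the
    shot clustering algorithms compared in the paper.
    """
    if not shot_features:
        return []
    scenes = [[0]]
    for i in range(1, len(shot_features)):
        if euclidean(shot_features[i - 1], shot_features[i]) > threshold:
            scenes.append([i])  # large jump: open a new scene
        else:
            scenes[-1].append(i)  # similar to previous shot: same scene
    return scenes


# Toy data: (visual vector, textual vector, normalized time position).
toy_shots = [
    fuse([1.0, 0.0], [1.0, 0.0], 0.0),
    fuse([1.0, 0.1], [1.0, 0.0], 0.1),
    fuse([0.0, 1.0], [0.0, 1.0], 0.2),
    fuse([0.1, 1.0], [0.0, 1.0], 0.3),
]
scenes = segment_scenes(toy_shots, threshold=1.0)
print(scenes)
```

In this sketch the `time_weight` parameter controls how strongly temporal distance contributes relative to the visual and textual features; in practice both it and `threshold` would need to be tuned on annotated episodes.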
Domains
Computer Science [cs]
Main file
VIDEO SCENE SEGMENTATION OF TV SERIES USING MULTI-MODAL NEURAL FEATURES.pdf (502.44 KB)
Origin: Publisher files allowed on an open archive