A Scalable Video Search Engine Based on Audio Content Indexing and Topic Segmentation

Abstract : News broadcasts are an important class of online video. Most news organisations provide near-immediate access to topical news broadcasts over the Internet, through RSS streams or podcasts. Until recently, however, technology did not allow a user to jump automatically to the shorter segments within a longer broadcast that might interest them. Recent advances in speech recognition and natural language processing have produced a number of robust tools that let us give users quicker, more focussed access to relevant segments of one or more news broadcast videos. Here we present our new interface for browsing or searching news broadcasts (video/audio), which exploits these language processing tools to let users (i) access topical passages within news broadcasts immediately, (ii) browse news broadcasts by events as well as by people, places and organisations, (iii) perform cross-lingual search of news broadcasts, (iv) search for news through a map interface, (v) browse news by trending topics, and (vi) see automatically generated textual clues for news segments before listening. Our publicly searchable demonstrator currently indexes daily broadcast news content from 50 sources in English, French, Chinese, Arabic, Spanish, Dutch and Russian.

Cited literature : 11 references

Contributor : Pascale Sébillot
Submitted on : Sunday, November 27, 2011 - 3:06:11 PM
Last modification on : Saturday, May 4, 2019 - 1:20:00 AM
Long-term archiving on : Tuesday, February 28, 2012 - 2:21:55 AM




  • HAL Id : hal-00645228, version 1
  • ARXIV : 1111.6265


Julien Lawto, Jean-Luc Gauvain, Lori Lamel, Gregory Grefenstette, Guillaume Gravier, et al. A Scalable Video Search Engine Based on Audio Content Indexing and Topic Segmentation. 2011 Networked and Electronic Media (NEM) Summit : Implementing Future Media Internet, Sep 2011, Torino, Italy. 160 p. ⟨hal-00645228⟩


