déposer
version française rss feed
HAL : hal-00005544, version 1

Fiche détaillée  Récupérer au format
Speech Communication 33 (2001) p. 79-96
Linguistic documents synchronizing sound and text
Michel Jacobson 1, Boyd Michailovsky 1, John B. Lowe 1
(2001)

The goal of the LACITO linguistic archive project is to conserve and to make available for research recorded and transcribed oral traditions and other linguistic materials in (mainly) unwritten languages, giving simultaneous access to sound recordings and text annotation. The project uses simple, TEI-inspired XML markup for the kinds of annotation traditionally used in field linguistics. Transcriptions are segmented at the levels of, roughly, the sentence and the word, and annotation associated with different levels: metadata at the text level, free translation at the sentence level, interlinear glosses at the word level, etc. Time alignment is at the sentence (and optionally the word) level. To minimize in-house development and maintenance, the project uses standard software to the extent possible. Marked-up data is processed using widely-available XML/XSL/XSLT/XQL software tools, and displayed using standard browsers. The project has developed (1) an authoring tool, SoundIndex, to facilitate time-alignment, (2) a Java applet which enables standard browsers to access time-aligned speech, (3) XSL stylesheets which determine \"views\" on the data, and (4) a simple CGI interface permitting the user to choose documents and views and to enter queries. The paper describes these elements in detail. Current objectives are further development of the annotation with a view to linguistic research beyond simple browsing, and of a querying system (using a standard XML query processor) to exploit the annotated material.
1 :  Laboratoire des langues et civilisations à tradition orale (LACITO)
CNRS : UMR7107 – Université Paris III - Sorbonne nouvelle – Université Paris IV - Paris Sorbonne
Sciences cognitives/Linguistique
speech annotation – field linguistics
Liste des fichiers attachés à ce document : 
PDF
speechCom33.pdf(639.9 KB)

tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...
tous les articles de la base du CCSd...