Concurrent Speech Synthesis to Improve Document First Glance for the Blind

Fabrice Maurel; Gaël Dias; Stéphane Ferrari; Judith-Jeyafreeda Andrew; Emmanuel Giguet

Conference paper, Year: 2019


Fabrice Maurel

Role: Author
PersonId: 5132
IdHAL: fabrice-maurel
IdRef: 087982013

Hultech team - GREYC laboratory - UMR6072

Gaël Dias

Role: Author
PersonId: 3735
IdHAL: gael-dias
ORCID: 0000-0002-5840-1603
IdRef: 113779747

Hultech team - GREYC laboratory - UMR6072

Stéphane Ferrari

Role: Author
PersonId: 10052
IdHAL: stephane-ferrari
IdRef: 17490343X

Hultech team - GREYC laboratory - UMR6072

Judith-Jeyafreeda Andrew

Role: Author
PersonId: 1055717

Emmanuel Giguet

Role: Author
PersonId: 740232
IdHAL: emmanuel-giguet
ORCID: 0000-0001-8617-0091
IdRef: 159042585

Hultech team - GREYC laboratory - UMR6072

Abstract

Skimming and scanning are two well-known reading processes that are combined to access document content as quickly and efficiently as possible. While both are available in visual reading mode, they are difficult to use in non-visual environments because they rely mainly on typographical and layout properties. In this article, we introduce the concept of tag thunder as a way (1) to achieve the oral transposition of the web 2.0 concept of tag cloud and (2) to produce an innovative interactive stimulus for observing the emergence of self-adapted strategies for non-visual skimming of written texts. We first present our general and theoretical approach to the problem of fast, global, non-visual access in web browsing; we then detail the progress of the development and evaluation of the components that make up our software architecture. We start from the hypothesis that the semantics of the visual architecture of web pages can be transposed into new sensory modalities through three main steps: web page segmentation, keyword extraction and sound spatialization. We note the difficulty of simultaneously (1) evaluating a modular system as a whole at the end of the processing chain and (2) identifying, at the level of each software component, the exact origin of its limitations; despite this issue, the results of the first evaluation campaign seem promising.
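The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe their algorithms, so every name here (`Zone`, `top_keyword`, `stereo_gains`, `tag_thunder_plan`), the tiny stopword list, the frequency-based keyword choice and the constant-power stereo panning are assumptions made for illustration only. Segmentation is assumed to have already produced zones with a text and a horizontal position.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical stopword list, only for illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


@dataclass
class Zone:
    """A page zone, assumed to come from an earlier segmentation step."""
    text: str
    center_x: float  # horizontal center of the zone: 0.0 (left) to 1.0 (right)


def top_keyword(text: str) -> str:
    """Return the most frequent non-stopword token (stand-in for keyword extraction)."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    if not tokens:
        return ""
    return Counter(tokens).most_common(1)[0][0]


def stereo_gains(center_x: float) -> tuple[float, float]:
    """Constant-power panning: map a horizontal position to (left, right) gains."""
    angle = center_x * math.pi / 2
    return math.cos(angle), math.sin(angle)


def tag_thunder_plan(zones: list[Zone]) -> list[tuple[str, float, float]]:
    """Pair each zone's keyword with the stereo gains that would place it in space."""
    plan = []
    for zone in zones:
        left, right = stereo_gains(zone.center_x)
        plan.append((top_keyword(zone.text), left, right))
    return plan


# Two assumed zones: one at the far left of the page, one at the far right.
zones = [
    Zone("Weather forecast: rain, rain and more rain in the north", 0.0),
    Zone("Sports results and sports news for the weekend", 1.0),
]
plan = tag_thunder_plan(zones)
for keyword, left, right in plan:
    print(f"{keyword}: left={left:.2f} right={right:.2f}")
```

In this sketch each keyword would be synthesized and mixed with the listed gains, so that a zone on the left of the page is heard on the left. A real system would need an actual segmenter, a stronger keyword extractor and a speech synthesis back end, none of which are specified in the abstract.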

Keywords

Web Accessibility; Document Layout; Oral Transposition; Non Visual Skimming

Domains

Computer Science [cs]; Artificial Intelligence [cs.AI]; Document and Text Processing; Web; Human-Computer Interaction [cs.HC]

Main file

HDI_2019-Concurrent Speech Synthesis.pdf (12.47 MB)

Origin: Files produced by the author(s)


https://hal.science/hal-02309647

Submitted on: Wednesday, 9 October 2019, 14:23:27

Last modified on: Wednesday, 20 March 2024, 16:20:04

Dates and versions

hal-02309647 , version 1 (09-10-2019)

Identifiers

HAL Id : hal-02309647 , version 1

Cite

Fabrice Maurel, Gaël Dias, Stéphane Ferrari, Judith-Jeyafreeda Andrew, Emmanuel Giguet. Concurrent Speech Synthesis to Improve Document First Glance for the Blind. 2nd International Workshop on Human-Document Interaction (HDI 2019) in conjunction with IAPR/IEEE ICDAR 2019, Sep 2019, Sydney, Australia. ⟨hal-02309647⟩


Collections

CNRS GREYC GREYC-HULTECH COMUE-NORMANDIE ENSICAEN UNICAEN

79 views

47 downloads
