Journal article. Digital Scholarship in the Humanities. Year: 2021

The Diachronic Spanish Sonnet Corpus (DISCO): TEI and Linked Open Data Encoding, Data Distribution and Metrical Findings

Abstract

How has the sonnet form in Spanish evolved over the centuries? How are metrical patterns and their combinations distributed across periods, regions and social groups? Which rhyme schemes are favoured in different periods and regions? How is enjambment distributed within the sonnet? Answering such questions quantitatively requires a corpus that spans several centuries, is annotated for the relevant literary features and contains author metadata. The absence of suitable digital resources for a macroanalytical study of the evolution of the sonnet in Spanish led us to create the DISCO corpus. This paper presents how the corpus was designed to provide quantitative evidence on the evolution of sonnets in Spanish, and reports our findings regarding metrics and enjambment. The corpus contains 4085 sonnets by 1204 Spanish and Latin American authors (15th to 19th centuries), encoded in TEI with RDFa attributes. It aims at breadth, including many peripheral authors alongside some major ones. Author metadata (dates, origin, gender) were encoded. Scansion and enjambment were annotated automatically with the ADSO and ANJA tools. The range of authors and periods, the use of TEI and RDFa for interoperability, and the combination of metrical and enjambment annotations go beyond previously available digital resources. The corpus allowed us to examine the evolution of metrical patterns and their combinations after the Golden Age, complementing earlier studies. We also observed an increase in enjambment across the tercets in the 19th century, which may indicate increased variety in the discourse organization of sonnets in that period.
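To make the idea of a "distribution of metrical patterns" concrete, the following is a minimal Python sketch of how scansion annotations in a TEI file could be tallied. It is an illustration under stated assumptions, not the authors' pipeline: it assumes each verse line is a TEI `<l>` element carrying its scansion in the standard TEI `met` attribute, with `+` for a stressed and `-` for an unstressed syllable. The sample document, its placeholder verse text and its scansion strings are invented for the example and are not taken from the corpus.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard TEI namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample, NOT corpus data: three placeholder lines with
# hypothetical scansion strings ("+" stressed, "-" unstressed).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <lg type="sonnet">
      <l n="1" met="-+---+---+-">placeholder line one</l>
      <l n="2" met="--+--+---+-">placeholder line two</l>
      <l n="3" met="-+---+---+-">placeholder line three</l>
    </lg>
  </body></text>
</TEI>"""


def count_patterns(tei_xml: str) -> Counter:
    """Count how often each scansion string occurs on <l> elements."""
    root = ET.fromstring(tei_xml)
    lines = root.iterfind(".//tei:l", TEI_NS)
    # Skip lines that carry no scansion annotation.
    return Counter(l.get("met") for l in lines if l.get("met"))


def stress_positions(pattern: str) -> list[int]:
    """Return the 1-based syllable positions marked as stressed."""
    return [i for i, mark in enumerate(pattern, start=1) if mark == "+"]


if __name__ == "__main__":
    for pattern, freq in count_patterns(SAMPLE).most_common():
        print(pattern, stress_positions(pattern), freq)
```

Grouping the same counts by author metadata such as century or origin would give the kind of diachronic and geographical breakdown the abstract describes. How that metadata is laid out in the actual files would need to be checked against the corpus itself.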

Dates and versions

hal-02661650, version 1 (30-05-2020)

Cite

Pablo Ruiz, Helena Bermúdez Sabel, Clara I Martínez Cantón, Elena Gonzalez-Blanco. The Diachronic Spanish Sonnet Corpus (DISCO): TEI and Linked Open Data Encoding, Data Distribution and Metrical Findings. Digital Scholarship in the Humanities, 2021, 36 (Supplement_1), pp.i68-i80. ⟨10.1093/llc/fqaa035⟩. ⟨hal-02661650⟩

