An interactive visualization of Google Books Ngrams with R and Shiny - HAL open archive
Preprint, Working Paper. Year: 2019

An interactive visualization of Google Books Ngrams with R and Shiny

Fabian Vetter
  • Role: Author
  • PersonId : 1048365

Abstract

Using the re-emergence of the /h/ onset from Early Modern to Present-Day English as a case study, we illustrate the making and the functions of a purpose-built web application for the interactive visualization of the raw n-gram data provided by Google Books Ngrams (GBN). The database has been compiled from the full text of over 4.5 million books in English, totalling over 468 billion words and covering roughly five centuries. We focus on bigrams consisting of words beginning with graphic <h> preceded by the indefinite article allomorphs a and an, which serve as a diagnostic of the consonantal strength of the initial /h/. The sheer size of this mega-corpus affords us the possibility to attain a maximal diachronic resolution, to distinguish highly specific groups of <h>-initial lexical items, and even to trace the diffusion of the observed changes across individual lexical units. The functions programmed into the app enable us to explore the data interactively by filtering, selecting and viewing them according to various parameters that were manually annotated into the data frame. We also discuss limitations of the database and of the explorative data analysis.
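
The diagnostic described in the abstract (the choice between a and an before an <h>-initial word) can be illustrated with a minimal sketch. This is not the authors' code: their application is built with R and Shiny, and the Python below is only an illustration. The function name an_share_by_year, the (bigram, year, count) row layout, and the toy counts are all assumptions invented for this example; they are not GBN figures.

```python
from collections import defaultdict


def an_share_by_year(rows):
    """Return {year: share of 'an' among 'a'/'an' + h-initial word bigrams}.

    `rows` is an iterable of (bigram, year, count) tuples. This layout is a
    hypothetical simplification, not the actual GBN file format.
    """
    totals = defaultdict(lambda: {"a": 0, "an": 0})
    for bigram, year, count in rows:
        parts = bigram.lower().split()
        if len(parts) != 2:
            continue  # skip anything that is not a two-word sequence
        article, word = parts
        # Keep only indefinite article + word spelled with initial <h>.
        if article in ("a", "an") and word.startswith("h"):
            totals[year][article] += count
    return {
        year: c["an"] / (c["a"] + c["an"])
        for year, c in sorted(totals.items())
        if c["a"] + c["an"] > 0
    }


# Toy counts, invented purely for illustration.
toy_rows = [
    ("an historical", 1700, 30),
    ("a historical", 1700, 10),
    ("an historical", 2000, 20),
    ("a historical", 2000, 80),
    ("a table", 2000, 500),  # ignored: second word is not <h>-initial
]

shares = an_share_by_year(toy_rows)
for year, share in shares.items():
    print(year, round(share, 2))
```

A falling share of an over time would be read, under the abstract's reasoning, as a sign that the initial /h/ is being treated as a stronger consonant. Grouping by lexical item instead of pooling all <h>-initial words would follow the same pattern with a (word, year) key.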
Main file
JDMDH_190606_Schlüter_ Vetter_An interactive visualization of Google Books Ngrams with R and Shiny.pdf (1.25 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-02149498 , version 1 (18-06-2019)
hal-02149498 , version 2 (02-03-2020)
hal-02149498 , version 3 (08-12-2020)
hal-02149498 , version 4 (15-12-2020)

Identifiers

  • HAL Id: hal-02149498, version 1

Cite

Julia Schlüter, Fabian Vetter. An interactive visualization of Google Books Ngrams with R and Shiny: Exploring a(n) historical increase in onset strength in a(n) huge database. 2019. ⟨hal-02149498v1⟩
486 views
1081 downloads
