Database of word-level statistics for Mandarin Chinese (DoWLS-MAN)

Abstract: In this article we present the Database of Word-Level Statistics for Mandarin Chinese (DoWLS-MAN). The database addresses the lack of agreement on phonological syllable segmentation in Mandarin by offering phonological features for each lexical item according to 16 schematic representations of the syllable (8 with tone and 8 without tone). Lexical statistics that differ for a given phonological word or nonword depending on syllable segmentation form the variant category; they include subtitle lexical frequency, phonological neighborhood density measures, homophone density, and network science measures. The invariant characteristics include each item's lexical tone, phonological transcription, and syllable structure, among others. The goal of DoWLS-MAN is to give researchers both the ability to choose stimuli derived from a segmentation schema that supports an existing model of Mandarin speech processing, and the ability to choose stimuli that allow hypotheses on phonological segmentation to be tested across multiple schemas. In an exploratory analysis we illustrate how multiple schematic representations of the phonological mental lexicon can aid hypothesis generation, specifically with respect to phonological processing during the reading of Chinese orthography. Users of the database can search among over 92,000 words, over 1,600 out-of-vocabulary Chinese characters, and 4,300 phonological nonwords by Chinese orthography, pinyin, or ASCII phonetic script. Users can also generate a list of phonological words and nonwords according to user-defined ranges and categories of lexical characteristics. DoWLS-MAN is available to the public for search or download at
Document type: Journal articles
Contributor: James S. German
Submitted on: Monday, August 30, 2021 - 10:30:51 AM
Last modification on: Tuesday, May 3, 2022 - 10:48:26 AM

Karl David Neergaard, Hongzhi Xu, James Sneed German, Chu-Ren Huang. Database of word-level statistics for Mandarin Chinese (DoWLS-MAN). Behavior Research Methods, Psychonomic Society, Inc, 2022, 54, pp.987-1009. ⟨10.3758/s13428-021-01620-7⟩. ⟨hal-03328510⟩