Opening new doors for psycholinguistic research: word and sound frequencies of Cantonese

Jane Li

Published: Apr 30, 2020

Abstract

Our study analyzed and compared three Cantonese language corpora (large textual databases) to document the type (unique units) and token (total units) frequencies of words and the sounds within them. Frequencies were calculated for each corpus, and the correlations and similarities among the corpora were analyzed. The frequency analyses reveal that, while the three corpora are similar in many respects, they are statistically independent and should not be used interchangeably. We hope this study will help future empirical studies make an informed choice about which corpus to use as the basis of investigation. Furthermore, the documentation of word and sound frequencies can support a range of natural language research, such as the study of speech errors, which have been shown to be sensitive to the frequencies of words and their component sounds. By pinning down the impact of frequencies on speech errors, we can better understand the nature of normal (non-erroneous) speech processes.
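
The distinction between type and token frequency can be illustrated with a minimal sketch. It is not the study's actual procedure: the function name `type_token_counts` and the toy word list (Jyutping-style romanizations invented for illustration, not drawn from any of the three corpora) are assumptions.

```python
from collections import Counter


def type_token_counts(tokens):
    """Return (number of types, number of tokens, per-unit frequencies).

    Hypothetical helper for illustration only; `tokens` is any list of
    units (words, syllables, or sounds) in the order they occur.
    """
    counts = Counter(tokens)          # how often each unique unit occurs
    n_types = len(counts)             # type count: unique units
    n_tokens = sum(counts.values())   # token count: total units
    return n_types, n_tokens, counts


# Invented toy data, not taken from any corpus in the study.
toy_corpus = ["ngo5", "hou2", "ngo5", "nei5", "hou2", "ngo5"]

n_types, n_tokens, freq = type_token_counts(toy_corpus)
print(n_types, n_tokens, freq.most_common())
```

In this toy list there are three types and six tokens, and "ngo5" has a token frequency of three. The same counting could be applied to sounds by passing a list of segments instead of words.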

Issue

Vol. 1 No. 1 (2020): SFU Undergraduate Research Symposium 2020

Section

Linguistic investigations: foreign language teaching methods, psycholinguistics
