By William R. Jones
For quite some time, I did not submit articles, not because of writer's block but because I had difficulty coming up with inviting, opinion-worthy topics. So now, at the risk of sounding a bit pretentious, I write about what I know best: the American English language.
I give my adult ESL (English as a Second Language) class lists of three-letter words because most have one syllable and most of our elementary-level students already know them. I hand these out alphabetically, from A to Z with the exception of Q and X, along with accompanying pictures for each word. These handouts supplement the textbook we use. In addition, each class period, the students receive a list of 10 to 20 of the most frequently used words, with "the" at the top of the list. We cover only the first 500. Each student forms a sentence with an assigned word, then comes forward and writes it on the whiteboard. They read it to the class, and corrections are made.
It is worth noting that in the Brown Corpus (the first large, computer-readable collection of written American English compiled for the scientific study of the language), 57 percent of word tokens, counting every occurrence, have four letters or fewer, whereas in constructed dictionaries, words of four or fewer characters account for less than 9 percent of the entries. These statistics suggest that language balances efficiency with redundancy to achieve both communicative usefulness and reliability, and this is true of most languages in their natural evolution.
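For readers who would like to see how the two percentages differ, here is a minimal Python sketch. It counts short words twice: once over every occurrence (tokens) and once over distinct words only (types, a rough stand-in for dictionary entries). The function name `short_word_shares` and the sample sentence are my own illustrations, not taken from the Brown Corpus, so the results will not match the 57 percent and 9 percent figures above.

```python
import re
from collections import Counter


def short_word_shares(text, max_len=4):
    """Return (token_share, type_share) for words of at most max_len letters.

    token_share counts every occurrence of a word in the text;
    type_share counts each distinct word once, like a dictionary entry.
    """
    # Simplifying assumption: a "word" is any unbroken run of the letters a-z.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    short_tokens = sum(n for word, n in counts.items() if len(word) <= max_len)
    short_types = sum(1 for word in counts if len(word) <= max_len)
    return short_tokens / len(tokens), short_types / len(counts)


# Hypothetical sample sentence, used only for illustration.
sample = (
    "the teacher gave the students a list of the words and "
    "the students wrote sentences on the whiteboard"
)
token_share, type_share = short_word_shares(sample)
print(f"short tokens: {token_share:.0%}, short types: {type_share:.0%}")
```

Because short words such as "the" repeat so often, they take up a larger share of the running text than of the list of distinct words. On a large corpus that gap would be expected to widen considerably.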
Today, one of the latest and most comprehensive text corpora for American English is the Corpus of Contemporary American English. It is impressive in scope, embracing texts from 1990 to the present, with regular updates and more than 1 billion words. We thank word processing for that.
Not all languages have corpora to guide linguists, teachers, translators and researchers. However, it will not surprise you that the Korean language has a few text corpora available. Especially notable is the Sejong Corpus, also known as the Korean National Corpus, which contains tens of millions of words. There are more! I recently learned through these corpora that the most frequent word in the Korean language is the particle "ui," which is used to indicate possession or relation, similar to the English word "of."
Depending on your source, you may instead find that the most common word in the Korean language, based on frequency lists, is "geot," meaning "thing" or "object." Common usage examples with this word are "Igeoseun mueosimnikka?" which translates to "What is this thing?" and "Geugeoseun nae geosimnida," which means "That is mine." Mastering a number of these high-frequency words will give you a solid foundation in your language studies.
Finally, I give you food for thought or, rather, for research. It is often cited that Shakespeare's complete works consist of 884,647 words of text containing a grand total of 29,066 different words, including proper names. In your native language, is your personal vocabulary greater? I would guess yes, especially if you are well-read or a university graduate.
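As a starting point for that research, the two Shakespeare figures can be combined into simple ratios. The short Python sketch below uses only the numbers cited above. The variable names are my own, and the "type-token ratio" label is the usual linguistic term for the first calculation, not something stated in the cited figures.

```python
# Figures as cited in the column for Shakespeare's complete works.
total_words = 884_647      # every word occurrence (tokens)
different_words = 29_066   # distinct words, including proper names (types)

# Type-token ratio: the share of the running text made up of distinct words.
type_token_ratio = different_words / total_words

# Average number of times each distinct word is used.
average_uses = total_words / different_words

print(f"type-token ratio: {type_token_ratio:.4f}")
print(f"average uses per distinct word: {average_uses:.1f}")
```

In other words, each distinct word appears roughly 30 times on average, which again shows how much of any text is repetition of a relatively small vocabulary.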
The author ([email protected]) published the novella “Beyond Harvard” and teaches English as a second language.