F is for Frequency

Here are two key questions related to frequency: What are the most frequent words in the English language? and How frequently do we need to be exposed to new words in order to acquire them?

To answer the first question, in every language certain words tend to appear more often than others. A common list of the most frequent twenty words in English typically includes the, be, and, of, a, in, to, have, to, it, I, that, for, you, he, with, on, do, say, and this.

Many would agree that these twenty words seem extremely common, but any comprehensive list of words tends to be drawn from a selective corpus, or body of words. The selective nature of any corpus influences what words will appear to be most common. For example, the Cambridge International Corpus replaces some of the above top twenty words with you, uh, yeah, know, like, they, have, so, was and but. Many teachers would take exception to teaching uh and yeah as among the most important words for their beginner students to learn.

The sources for corpora are often dialect-specific (for example, American English or British English) and may be based on written English, spoken English, or a combination. Some corpora only collect words related to particular themes and time periods (e.g., 18th century novels) or specific genres of speech or writing, such as medical English. Some corpora document dead or obscure forms of English, such as those drawn from Old English, Middle English, and Early English texts.

The practice of building corpora goes back to the Middle Ages, but truly useful corpora only appeared in the 1960s when computer databases could be used to quickly compile and search them. The Brown Corpus of 1961 featured what was then an astounding million words. Modern corpora are larger, much larger. The Google N-gram corpus is at 155 billion words and growing. By searching such a database of words, a computer can determine which are the most frequently used.
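For teachers curious about how a computer arrives at such a list, the following is a minimal sketch in Python, not the method used by any particular corpus. It assumes the corpus is simply a string of text and that a word is any run of letters or apostrophes; the function name most_frequent_words and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

def most_frequent_words(text, n=5):
    """Return the n most common words in text as (word, count) pairs."""
    # Lowercase the text and treat runs of letters/apostrophes as words.
    # This is a simplifying assumption; real corpora use richer tokenizers.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)

# A tiny made-up sample standing in for a real corpus.
sample = "The cat sat on the mat. The dog sat by the door and the cat ran."
print(most_frequent_words(sample, 3))
```

Even in this tiny sample, the function word the comes out well ahead of the content words cat and sat, which mirrors what the large frequency lists show.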

However, looking over the list of twenty words above, it seems that only a few meaningful sentences can be fashioned from them, e.g., I have that. You say it. This is because the list contains few content words (such as nouns and main verbs) and many function words (such as articles, prepositions, and pronouns).

Content words are the ones we first learn as children. Even before babies are taught the correct words for everyday people, places, and things, they will improvise and attach sounds to things of interest. In this way, a pet dog might be called an emphatic “Ba!” A study by Hochmann, Endress & Mehler (2010) tried to trick children into learning a different part of speech (determiners such as a, an, the) instead of content words but their experiment failed. The learning of content words seems to be more natural. It might be because young children are often monosyllabic. They get away with just using content words because their parents and peers are happy to infer the larger meanings. For example, a baby using the word milk is generally understood to be announcing her interest in having some.

If we can identify the most frequent words, are they the ones we should teach? It would seem sensible, and since 1953, when Michael West produced the General Service List of 2,300 words that were thought to be essential to communication, other lists have been introduced on a regular basis. Some, like Averil Coxhead’s Academic Word List, are more specialized. Textbooks tend to focus on such word lists to challenge students to learn the most appropriate vocabulary for their particular ages and areas of study.

But what about the second question, How frequently do we need to be exposed to new words in order to acquire them? In 1990, Paul Nation suggested that students need to be exposed to a new word between five and sixteen times in order to acquire it. But according to Gu (2003), researchers have since tested that hypothesis with wildly different findings.

Some researchers suggest that what matters is not the frequency with which one encounters words but the intervals at which students are exposed to new words. Those in favor of interval theories suggest that having a schedule for reviewing flashcards of new vocabulary items is a useful task, and most teachers would agree. As each word becomes firmly set in memory, it is discarded from the flashcard deck and new words are added. Most textbooks tend to recognize the importance of exposing students to both words and structures repeatedly. In publishing parlance, this is called recycling vocabulary and it’s something good teachers tend to do naturally.
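The flashcard routine described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a scheme taken from the research cited here: the review gaps of 1, 3, and 7 days, the rule that a forgotten word starts over, and the names Card, Deck, and review are all hypothetical.

```python
from dataclasses import dataclass, field

# Assumed gaps (in days) after the 1st, 2nd, and 3rd successful review.
INTERVALS = [1, 3, 7]

@dataclass
class Card:
    word: str
    stage: int = 0    # number of successful reviews in a row
    due_day: int = 0  # day number on which the card is next reviewed

@dataclass
class Deck:
    active: list = field(default_factory=list)   # cards being studied
    waiting: list = field(default_factory=list)  # words not yet introduced
    learned: list = field(default_factory=list)  # words retired from the deck

    def due(self, today):
        """Cards that should be reviewed on the given day."""
        return [c for c in self.active if c.due_day <= today]

    def review(self, card, remembered, today):
        if not remembered:
            # A forgotten word starts over and comes back tomorrow.
            card.stage = 0
            card.due_day = today + 1
            return
        card.stage += 1
        if card.stage > len(INTERVALS):
            # Firmly set in memory: discard it and add a new word.
            self.active.remove(card)
            self.learned.append(card.word)
            if self.waiting:
                self.active.append(Card(self.waiting.pop(0), due_day=today))
        else:
            card.due_day = today + INTERVALS[card.stage - 1]

deck = Deck(active=[Card("window")], waiting=["llama"])
deck.review(deck.active[0], remembered=True, today=0)
print(deck.active[0].due_day)  # next review falls one day later
```

The widening gaps are the point of the design: a word that keeps being remembered is shown less and less often, which frees class time for newer items.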

But regardless of the method, part of the problem is that students tend to have difficulties retaining new vocabulary items that they don’t perceive as being useful to them.

This probably explains why beginner students seem to acquire so many new words while advanced students struggle to expand their vocabularies. Consider the basic and high-frequency word window and the lower-frequency word llama. A beginner student learning the word window will have seen windows since birth and will continue to see them every day. Moreover, there will be frequent opportunities to hear and use the word window in everyday conversation: “Look out the window!” “Open the window, please.”

On the other hand, the word llama might be part of a short story or an article on South American animal husbandry, but it’s unlikely to enter a learner’s everyday vocabulary unless the learner’s parents are llama herders.

This last point is not as silly as it sounds as it helps to point out a fault in many frequency lists: Such lists fail to acknowledge individual or local vocabulary needs. Frequency lists tend to be developed on national or international scales and although they are perfectly suitable for most challenges students face in reading, writing, speaking, and listening, they don’t allow students to talk about many of the things that are most important to them.

Imagine the differences between a student who lives in a small snowy mountain village where skiing and other winter sports are the most common pastimes and another student who lives in a large tropical city where life revolves around the beach. Imagine the different foods they eat, the clothes they wear, and the transportation they use. It’s natural that they would need different vocabularies to narrate their respective lives.

Students need to be exposed to high-frequency vocabulary and exposed to it in meaningful ways. Beyond reading and hearing new words, students require opportunities to use vocabulary in speaking and writing tasks. More importantly, they need to be exposed to the vocabulary that they need to explore and explain their everyday lives.

“Oh! Look out the window. There’s a llama!”

Tasks for Teachers
1. A localized dictionary
Work with students to create your own frequency list of words that your students need to learn, based on individual and local needs. Do this in an online document set out with each letter of the alphabet and you and your students will be able to use and expand your dictionary of local vocabulary over many years.

2. Test the hypothesis that new words are easier to learn if they relate to the students’ lives.
Consider the level of your students and teach them ten low-frequency words, five that name things they are likely to see every day, and five that they are unlikely to encounter. For example, the low-frequency word for the tip of a shoelace is aglet. Another five-letter, low-frequency word students are unlikely to know is reeve, a nautical verb meaning to pass a line through a hole, eye or block.
With a group of beginners, you might teach the names of ten animals, five they are likely to see and five they are not.
With advanced students you might consider using Nadsat, the Russian-based invented language created by Anthony Burgess for his novel A Clockwork Orange. Dictionaries of Nadsat are available online.
Follow Paul Nation’s suggestion of exposing the students to each word 5 to 16 times. After a week, and after a month or more, test the students’ memory of the words to see if there are differences between their retention of obscure words they see in their everyday contexts compared to those they do not.

Tasks for Students
1. A corpus task
Ask students to search online and locate the British National Corpus.
In the search box, each student should type in a different everyday word, such as field. The BNC will produce up to 50 random examples, each a sentence showing the word in context.
Have students examine the context sentences and see what different meanings there are for the word they’ve chosen. Which are common and easy to understand? Which are uncommon and difficult to understand?
Ask students to compare their lists with other students.

2. A vocabulary game to encourage peer teaching
Have each student prepare a list of ten difficult words with definitions. The words might best be taken from books the students are currently reading rather than combing the dictionary for obscure language.
One student begins by sharing a word that he or she knows and asks if someone can define it. Students are welcome to guess.
When a correct answer is given, the student who has given it can go next or can be awarded a point.

References
Coxhead, A. (2012). Academic Word List. Victoria University of Wellington. Retrieved from: http://www.victoria.ac.nz/lals/resources/academicwordlist/
Ellis, N.C. (1995). The psychology of foreign language vocabulary acquisition: Implications for CALL. Computer Assisted Language Learning, 8(2-3), 103-128.
Gu, P.Y. (2003). Vocabulary Learning in a Second Language: Person, Task, Context and Strategies. TESL-EJ, 7(2). Retrieved from: http://www.tesl-ej.org/wordpress/issues/volume7/ej26/ej26a4/
Hochmann, J.R., Endress, A.D. & Mehler, J. (2010). Word frequency as a cue for identifying function words in infancy. Cognition. Retrieved from: http://www.endress.org/publications/freq_function_words.pdf
Nation, I.S.P. (1990). Teaching and learning vocabulary. Boston: Heinle & Heinle.
Richards, J.C. (2013). Curriculum approaches in language teaching: Forward, central, and backward design. RELC Journal, 44(1), 5-33.

Acknowledgement: I thank my graduate student, David Penton, for observations on the informal language in the Cambridge International Corpus.