
How many English words do you know?

Estimating how many words of a foreign language you have learned and retained is mainly useful for understanding how far you have advanced in the "passive" perception of information: texts, speech, films, and so on. I'd like to share several methods I have used, some found online and some home-made. Below are a couple of tests for assessing vocabulary, a technique for finding important words that haven't yet stuck in your head, some reflections, and a few links.


Online tests


Of the many tests that estimate vocabulary size, I liked two. A couple of years ago I came across the rather simple Test Your Vocabulary. You go through three screens of words, ticking those that (you think) you know, and then get an estimate of the total number of words you have learned. Many of my friends complained that it is inaccurate: they got a lower score than "someone I know for sure knows the language worse". But another kind of error is possible while taking it: you think you know a word when in fact you have already forgotten it. The hand reaches by itself to tick a word that seems vaguely familiar, so you can subconsciously inflate your own score.

Another interesting test is my.vocabularysize.com from Victoria University of Wellington, New Zealand. You can even choose a Russian interface. After 140 questions, each asking you to pick one of 4-5 definitions, you get an estimate of your vocabulary. There is also a test on knowledge of word parts.
The authors of the test refer to two articles (PDF), from 1990 and 2006, which describe so-called word-family lists.
Your results

You know at least 10,500 English word families!

What do my results mean?

In general, there is no minimum vocabulary size: the more words you know, the more you will be able to understand. However, Paul Nation's (2006) research suggests the following sizes as useful targets:

How large a vocabulary is needed for reading and listening?

Skill           Size estimate                 Notes
Reading         8,000 - 9,000 word families   Nation (2006)
Listening       6,000 - 7,000 word families   Nation (2006)
Native speaker  20,000 word families          Goulden, Nation, & Read (1990); Zechmeister, Chronis, Cull, D'Anna, & Healy (1995)

What is a word family?

If you know one member of a word family, it is assumed that you can recognize the other forms. For example, nation, a noun, can also be an adjective (national), a verb (nationalize), or an adverb (nationally). Other members of a family can also be built with affixes. For a test of receptive vocabulary knowledge such as this, word families are the unit that is counted.


Frequency dictionaries


After registering on www.wordfrequency.info, you can download a frequency dictionary of American English in Excel format. There is also a text version.

It looks like this:

Rank Word Part Of Speech Frequency Dispersion

1 the - a 22038615 0.98
2 be - v 12545825 0.97
3 and - c 10741073 0.99
4 of - i 10343885 0.97
5 a - a 10144200 0.98
6 in - i 6996437 0.98
7 to - t 6332195 0.98
8 have - v 4303955 0.97
...
...
4996 immigrant - j 0.97
4997 kid - v 5094 0.92
4998 middle-class - j 5025 0.93
4999 apology - n 4972 0.94
5000 till - i 5079 0.92

The file contains 5,000 English words sorted by frequency of occurrence. The frequencies were counted over a huge, heterogeneous body of English texts. Recently I watched a friend go through it looking for words he didn't know, checking his vocabulary. In the first 500 he found no unknown words. He showed me the list he had saved on his smartphone: about a dozen words from the second thousand (ranks 1,000 to 2,000) and about 20 from the third. Funnily enough, as you go through the list you come across runs of words that happen to form phrases or even short sentences. The logic is very simple: if a word is very frequent and you don't know it, it is worth learning and looking at examples of its use.
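The per-thousand tally my friend kept by hand can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `unknown_by_band`, the `(rank, word)` input format, and the choice of which words count as "known" are all made up for the example, and a real run would first parse the downloaded list.

```python
from collections import Counter

def unknown_by_band(entries, known_words, band_size=1000):
    """Count unknown words per rank band (1-1000, 1001-2000, ...).

    entries: iterable of (rank, word) pairs (assumed input format).
    known_words: set of lowercase words the learner already knows.
    Returns {first_rank_of_band: number_of_unknown_words}.
    """
    counts = Counter()
    for rank, word in entries:
        if word.lower() not in known_words:
            band_start = (rank - 1) // band_size * band_size + 1
            counts[band_start] += 1
    return dict(sorted(counts.items()))

# A few rows taken from the excerpt above; the "known" set is hypothetical.
entries = [(1, "the"), (2, "be"), (3, "and"), (4996, "immigrant"),
           (4997, "kid"), (4998, "middle-class"), (4999, "apology"), (5000, "till")]
known = {"the", "be", "and", "immigrant", "kid", "apology"}

print(unknown_by_band(entries, known))
```

Bands with many unknown words are the ones worth studying first, since those words are still relatively frequent.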

When I read his list of unknown words (with translations already added), I noticed the following. I knew about 50-60% of these words, but some of the meanings recorded there were new to me, and several words were completely unknown to me.
The site itself tries to be commercial: they sell lists longer than 5,000 words, but that is not so interesting.

That friend of mine is now writing a program with a user-friendly interface for finding unknown words, for learning purposes. I suggested that for an overall estimate he use not this list but a thinned one: every seventh word of the full 60,000-word list. In fact, even looking through the first couple of thousand words is dispiriting, and not everyone will make it to 5,000. Although I can't vouch for it 100%, a thinned dictionary will most likely show at least one word from each "family", and it will take 7 or 10 times less time (depending on the thinning step).
By the way, similar frequency dictionaries of Russian contain about 160 thousand words, including abbreviations and acronyms. There are several similar corpora of English words from different organizations.
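The thinning idea can be sketched as follows. This is a rough illustration, not my friend's program: the names `thin` and `estimate_known` are hypothetical, and the estimate assumes that the words you know are spread evenly enough that each sampled word "stands for" `step` words of the full list.

```python
def thin(words, step=7, offset=0):
    """Keep every `step`-th word of a frequency-sorted list, starting at `offset`."""
    return words[offset::step]

def estimate_known(known_in_sample, step=7):
    """Scale the number of known sampled words back up to the full list.

    Assumption: each sampled word represents `step` words of the full list.
    """
    return known_in_sample * step

# Placeholder 60,000-entry list; a real one would come from a frequency dictionary.
full_list = [f"word{i}" for i in range(1, 60001)]
sample = thin(full_list, step=7)

print(len(sample))                # how many words the learner has to review
print(estimate_known(1500, 7))    # estimate if 1,500 sampled words were known
```

A step of 10 instead of 7 shortens the review further at the price of a coarser estimate.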

Another question interests me: how accurate are the tests that estimate the number of words you know? Perhaps this could be checked against the frequency dictionary itself, by comparing the list of words marked as unknown: how many there are and which "families" they fall into.

There are general laws of remembering and forgetting. One of the main ones: if a person has learned something and neither repeats nor uses it, the information is forgotten exponentially with time. On the other hand, a few repetitions stretch out the falling exponential and keep retention at an acceptable level. I was very surprised when an acquaintance who tutored schoolchildren told me that there is a sequence of intervals for deep memorization: say, 20 minutes, then 8 hours, then a day, and so on, after which the information is firmly lodged in the brain. That is, the brain produces a statistically maximal excitation signal when it encounters this information.

[Figure: Ebbinghaus forgetting curve, from Wikipedia.]
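The "falling exponential" can be written down as a toy model. This is a sketch under a strong assumption: retention is modeled as exp(-t / S), where the "stability" S (in hours) is a made-up parameter that grows with each repetition. The function name and the numbers are illustrative only, not measured values.

```python
import math

def retention(hours_since_review, stability_hours):
    """Fraction remembered after `hours_since_review`, assuming a pure
    exponential decay with time constant `stability_hours` (hypothetical model)."""
    return math.exp(-hours_since_review / stability_hours)

# Same 24-hour gap, two hypothetical stabilities:
# a larger stability (after repetitions) gives a flatter curve.
before_repetitions = retention(24, stability_hours=10)
after_repetitions = retention(24, stability_hours=30)

print(before_repetitions < after_repetitions)
```

In this picture each repetition does not stop forgetting, it only makes the curve fall more slowly, which matches the idea of "stretching" the exponential.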

How I learned words at the institute


Apart from the standard course, whose requirements in the first three years were fairly strict, I tried to read fiction. The first big book was an old Soviet edition of Conan Doyle's The Lost World. I don't know how heavily adapted it was, but the text was full of Victorian words and expressions, and this greatly slowed my progress towards the end. Of course, I could have looked words up in Lingvo on the computer, but I didn't like reading at the computer, and running to it for every new word quickly got tiring. Tablets were not common then, and a pocket electronic translator was an expensive rarity, so I devised a paper system for myself.

In a thick 96-sheet notebook, each spread was divided into 6 columns. I tried to find the notebook just now, but it has got lost, so I'll have to describe it in words. I divided the alphabet into groups of letters, for example a..d, e..f, g..j, k..n, o..q, r..t, u..w, x..z. Roughly, by eye, I estimated the percentage of words that begin with these letters and divided the columns of the spread into rectangles accordingly. For example, the group a..d got 2/3 of the first column, and so on. The group x..z got the last and smallest remaining piece in the 6th column.

After that everything is simple. When I met an unknown word, I wrote it with its translation in the appropriate rectangle. Inside a block nothing is in alphabetical order, but it doesn't take long to find a word. To get the translation while lying on the bed, you have to reach for a paper dictionary. That is, the cost of getting a translation is quite high, higher than looking it up now in Lingvo or an online translator like multitran. Writing it in with a pen also takes a while. But the brain remembers better, because it is not very pleasant to tear yourself away from an interesting plot and dig into the dictionary. Sometimes a word ended up in the notebook again on the next spread, and then again two spreads later; that is just overhead. As the reading went on, I had to reach for the notebook less and less.
Then it turned out that the meaning of a sizable share of new words can be guessed from context.
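What I estimated by eye for the notebook layout could also be computed from any word list. A minimal sketch, assuming the letter groups from the example above and a 6-column spread; the function names and the tiny word list are made up for illustration.

```python
from collections import Counter

# Letter groups from the example above, as (first, last) letter pairs.
GROUPS = ["ad", "ef", "gj", "kn", "oq", "rt", "uw", "xz"]

def group_of(word):
    """Return the label of the letter group a word belongs to, or None."""
    if not word:
        return None
    first = word[0].lower()
    for lo, hi in GROUPS:
        if lo <= first <= hi:
            return f"{lo}..{hi}"
    return None

def column_shares(words, columns=6):
    """How many columns of the spread each letter group should get,
    in proportion to the number of words starting with its letters."""
    counts = Counter(g for g in map(group_of, words) if g)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {g: columns * n / total for g, n in counts.items()}

# Hypothetical mini word list; a real estimate would use a full dictionary.
print(column_shares(["apple", "bear", "cat", "dog", "egg", "zebra"]))
```

Fractional results map naturally onto the rectangles: a share of 0.67 is two thirds of one column.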

It would be interesting to hear what other approaches exist. In my opinion, the best way is a long, full immersion in the language environment, but that is not available to everyone.

Interesting links


British word base BNC
Learn English with Anki
From the forum lingvo
Review Lexiconer
Russian language dictionaries for download

UPD: It turns out Habr already had a great, interesting article about vocabulary that I hadn't noticed.

Source: https://habr.com/ru/post/168419/

