http://www.wordfrequency.info/
Word frequency data: Corpus of Contemporary American English
This site contains what we believe is the most accurate frequency data for English, and it comes in a number of different formats (see samples: 100,000 and 60,000 word lists, and a comparison of the two lists). For the 5,000-60,000 word lists, the data is available as a simple word list, as frequency by genre, or as an eBook or printed frequency dictionary. For the 100,000 word list, you can see detailed frequency information for many genres in several different corpora. In addition to word frequency data, you can also download up to 155 million n-grams and 4.3 million collocates.

Any frequency list is only as good as the corpus (collection of texts) that it is based on. The 5,000-60,000 word lists are based on the only large, genre-balanced, up-to-date corpus of American English: the 450 million word Corpus of Contemporary American English (COCA). The 100,000 word list supplements this COCA data with detailed frequency data from the 400 million word Corpus of Historical American English, the British National Corpus, and the Corpus of American Soap Operas (for very informal language).