Full-text corpus data



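In March 2020 we released the most recent (and probably final) version of the Corpus of Contemporary American English (COCA). This version is considerably larger than the previous one and improves on it significantly: it adds 2018-2019 texts to the five existing genres and introduces three new genres (blogs, other web pages, and TV / movies), bringing the corpus from 577 million to one billion words.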
 
Genre         Previously (1990-2017)   Added in 2020             New total
Spoken        119 million              8 million (2018-2019)     127 million
Fiction       112 million              8 million (2018-2019)     120 million
Magazine      119 million              8 million (2018-2019)     127 million
Newspaper     115 million              8 million (2018-2019)     123 million
Academic      113 million              8 million (2018-2019)     121 million
Blogs         -                        125 million               125 million
Other Web     -                        130 million               130 million
TV / Movies   -                        128 million               128 million
TOTAL         577 million words        423 million words         One billion words
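
As a quick consistency check, the per-genre figures can be summed. The following is a minimal Python sketch, not part of the corpus distribution; the figures are the rounded values (in millions of words) copied from the table, and the variable names are chosen purely for illustration. Because the inputs are rounded, the sums need not match the stated totals exactly: the "Previously" column sums to 578 rather than 577, presumably a rounding effect.

```python
# Word counts in millions, copied from the table: (previously, added in 2020).
# Genres that are new in 2020 have 0 in the "previously" slot.
counts = {
    "Spoken": (119, 8),
    "Fiction": (112, 8),
    "Magazine": (119, 8),
    "Newspaper": (115, 8),
    "Academic": (113, 8),
    "Blogs": (0, 125),
    "Other Web": (0, 130),
    "TV / Movies": (0, 128),
}

# Sum each column over all genres.
previous_total = sum(previous for previous, _ in counts.values())
added_total = sum(added for _, added in counts.values())
new_total = previous_total + added_total

print(f"Previously:    {previous_total} million")
print(f"Added in 2020: {added_total} million")
print(f"New total:     {new_total} million")
```

The new total comes to roughly 1,001 million words, consistent with the "one billion words" figure.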

Order / download this data