Full-text corpus data


You can now download the Coronavirus Corpus for offline use. To date, this is about 1.5 billion words of data that you would have on your own machine. The Coronavirus Corpus contains data on the medical, social, cultural, and economic impact of the coronavirus (COVID-19), drawn from texts in online magazines and newspapers in 20 different English-speaking countries from 1 Jan 2020 to 31 Dec 2022.

#1 below contains links to samples from the first five months of the corpus (Jan - May 2020). #2 contains links to a sample from one of the later months (Dec 2022). When you purchase the data, you receive the data for every month from Jan 2020 to Dec 2022.

#   Period                Samples
1   Jan - May 2020        Database, WordLemPoS, Text, Sources, Lexicon
2   Jun 2020 - Dec 2022   Database, WordLemPoS, Text, Sources, Lexicon