Full-text corpus data


The NOW corpus contains several billion words of text from online magazines and newspapers in 20 different English-speaking countries, from 2010 to the present (see sources). If you're interested in what's going on in English up to and including right now, this is by far the best corpus available.

When you purchase the full-text data from NOW, you get all of the data up through the month of purchase. You can also purchase an annual subscription, which will give you another year's worth of data (typically about 1.7 billion words each year).

For example, if you purchase both datasets on 15 July 2019, you would receive the data from January 2010 through June 2019 (the June data was released on July 2; see the samples in #2 below), and the annual subscription would add the data for one more year: July 2019 through June 2020.
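The date arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function names `add_months` and `coverage` are hypothetical, and the sketch assumes that the most recent release at purchase time always covers the month before the purchase month, as it does in the 15 July 2019 example.

```python
from datetime import date


def add_months(year, month, delta):
    """Shift a (year, month) pair by `delta` months (delta may be negative)."""
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1


def coverage(purchase):
    """Return the (start, end) months for the one-time purchase and the subscription.

    Assumption (not stated as a rule in the text): the latest available release
    covers the month before the purchase month.
    """
    last_released = add_months(purchase.year, purchase.month, -1)
    one_time = ((2010, 1), last_released)          # January 2010 through last release
    subscription = (add_months(*last_released, 1),  # first month after the one-time data
                    add_months(*last_released, 12))  # twelve months of updates in total
    return one_time, subscription


one_time, subscription = coverage(date(2019, 7, 15))
print(one_time)      # ((2010, 1), (2019, 6))
print(subscription)  # ((2019, 7), (2020, 6))
```

A purchase made before that month's release is out (for instance, on the first day of a month) might fall one month earlier than this sketch predicts.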

#  Option               Time period                                  Size                               Samples
1  One-time purchase    2010 - month of purchase                     (Currently) About 6 billion words  Database, WordLemPoS, Text, Sources, Lexicon
2  Annual subscription  The 12-month period after month of purchase                                     Database, WordLemPoS, Text, Sources, Lexicon

If you purchase only #1 above, you pay the price of one corpus. If you purchase both (#1 and #2) at the same time, a discount applies.

Note also that each monthly update will be released at the beginning of the following month (at the latest). You will be notified as soon as the update is available, and you will then have ten days to download the data.