Full-text corpus data



After you have purchased the data, we will send you links to download it. Each corpus comes as five files: the corpus itself in three formats (text, word / lemma / PoS, and database), a sources file (with metadata on each of the texts), and a "lexicon" (all distinct word forms + lemma + PoS). You can download just the formats and files that you need.

You must have access to Google Drive to download the data, so the person who will be downloading it should check their access first. If they can download and view this file (which says "You have access to Google Drive"), then things are fine.

Things work best if you have a Google account, but you can still download the data without one. In that case, you will simply need to click on a link in an email that you receive from Google, to confirm the email address that we will enter.