Note: this data is based on corpora that were created solely by Mark Davies,
Professor of Linguistics at Brigham Young University. As the result of an
agreement between BYU and Mark Davies, all transactions regarding payments and
licenses for this data are made solely with Mark Davies, rather than with BYU.
* The "complete" full-text data also includes the URLs data
* When you purchase the complete full-text data, you have access
to all three formats: database, word/lemma/PoS, and text
* The iWeb full-text data is about 20% more expensive than the full-text
data for the other corpora, but iWeb is much larger than those corpora
(e.g. 25x as large as COCA)
* Approx. size of full-text data (uncompressed): Text: 78 GB;
Database: 480 GB; WordLemPoS: 660 GB
Steps to obtain the data:
1. Download and fill out the license agreement. This states that you
will not give the data to anyone else outside of your university or
company (which also means that you cannot post it on the web). You just
need to fill in your name and company (if that is applicable), and then
send it back to us as an attachment.
* Note that you must use an
academic email address (e.g. *.edu or *.ac.edu) for an academic license.
2. Once we receive the license agreement, we'll send you a request for
payment from PayPal.
3. Make the payment with a credit card at PayPal. Note that you
do not need a PayPal account to make the payment.
4. As soon as we receive confirmation of the payment, we'll send you the
link to download the data.