This corpus is intended for authorship attribution. It comprises 80 books by 8 different authors, with 10 books per author.
Steps to reproduce
The books were downloaded from Project Gutenberg and then trimmed: only the text between the "*** START OF THIS PROJECT GUTENBERG EBOOK ***" and "*** END OF THIS PROJECT GUTENBERG EBOOK ***" markers was kept.
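
The trimming step could be reproduced roughly as follows. This is a minimal sketch, not the script used to build the corpus: the function name `extract_body` is hypothetical, and it assumes each marker sits on its own line and may carry the book title before the closing asterisks.

```python
import re

# Assumption: marker lines start with these prefixes; anything after the
# prefix on the same line (for example the book title) is ignored.
START_RE = re.compile(r"^\*\*\* ?START OF THIS PROJECT GUTENBERG EBOOK.*$", re.MULTILINE)
END_RE = re.compile(r"^\*\*\* ?END OF THIS PROJECT GUTENBERG EBOOK.*$", re.MULTILINE)


def extract_body(raw: str) -> str:
    """Return the text between the START and END marker lines (hypothetical helper)."""
    start = START_RE.search(raw)
    if start is None:
        raise ValueError("START marker not found")
    # Look for the END marker only after the START marker line.
    end = END_RE.search(raw, start.end())
    if end is None:
        raise ValueError("END marker not found")
    # Drop the marker lines themselves and surrounding blank space.
    return raw[start.end():end.start()].strip()
```

A file would be read with something like `open(path, encoding="utf-8").read()` and passed to `extract_body`; the encoding is also an assumption and may differ between downloads.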