Corpora used for the experiments in 'Corpus characteristics-based method to centroids number determination for clustering text documents'.
Steps to reproduce
Step 1: Unzip the file.
Step 2: Load each corpus separately with Lucene.
Step 3: Run the proposed method to determine the number of centroids for each corpus.
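
Step 1 could be scripted roughly as follows. This is a minimal sketch, not part of the original materials. It assumes the archive stores each corpus under its own top-level folder, which is not confirmed here, and the function name `extract_corpora` is hypothetical. Loading into Lucene (Step 2) and the proposed method (Step 3) are not covered.

```python
import zipfile
from collections import defaultdict
from pathlib import Path


def extract_corpora(archive_path, dest_dir):
    """Extract a zip archive and group its files by top-level entry.

    Assumption: each corpus sits in its own top-level folder inside the
    archive. If the real layout differs, the grouping key must change.
    """
    dest = Path(dest_dir)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
        # Zip member names always use "/" as separator; skip directory entries.
        names = [n for n in archive.namelist() if not n.endswith("/")]

    corpora = defaultdict(list)
    for name in sorted(names):
        top_level = name.split("/")[0]
        corpora[top_level].append(dest / name)
    return dict(corpora)
```

Calling `extract_corpora("corpora.zip", "corpora")`, where both names are placeholders, returns a dictionary that maps each top-level entry to the list of extracted file paths, so each corpus can then be handed to Lucene on its own.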