Thematic analysis results (LDA results and community network results) in construction and demolition waste management
Description
The analysis results and node data for the research. The LDA analysis results provide the topic information of the research and are used as node data in the lateral thematic model. The edge data are provided by the word2vec model. By combining the edge data with the node data, the social network can be constructed and visualized.
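As a rough illustration of how node data and edge data can be combined into a network, the sketch below uses only the Python standard library. The topic names, the toy vectors, the 0.8 similarity threshold, and the helper functions are all hypothetical and are not taken from the dataset. Grouping by connected components is used here as a simple stand-in for a community detection algorithm, not as the method applied in the research.

```python
import math
from itertools import combinations

# Hypothetical node data: topic label -> toy embedding vector.
# In practice the nodes would come from the LDA results and the
# vectors from the word2vec model; these values are invented.
topic_vectors = {
    "T1_recycling": [0.9, 0.1, 0.0],
    "T2_reuse": [0.8, 0.2, 0.1],
    "T3_policy": [0.0, 0.9, 0.3],
    "T4_regulation": [0.1, 0.8, 0.4],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_network(vectors, threshold=0.8):
    """Return a weighted adjacency dict, keeping edges at or above threshold."""
    adjacency = {node: {} for node in vectors}
    for a, b in combinations(vectors, 2):
        weight = cosine(vectors[a], vectors[b])
        if weight >= threshold:
            adjacency[a][b] = weight
            adjacency[b][a] = weight
    return adjacency


def connected_groups(adjacency):
    """Group nodes by connected component (a crude proxy for communities)."""
    seen, groups = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node])
        seen |= group
        groups.append(sorted(group))
    return groups


network = build_network(topic_vectors)
groups = connected_groups(network)
print(groups)
```

With these toy vectors, the two waste-treatment topics end up in one group and the two governance topics in another. A real analysis would likely replace `connected_groups` with a modularity-based method from a graph library.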
Files
Steps to reproduce
The workflow has three steps. The first step is data collection and partition for follow-up analysis. The second step is data preprocessing, in which TF-IDF is used to identify high-frequency invalid words. The third step is knowledge discovery, which involves three analysis algorithms: LDA, word2vec, and community detection. In detail, we first collect the academic articles from the database (Web of Science). Then, we manually eliminate irrelevant and duplicate articles. After that, we use the R language to convert the articles from PDF to TXT format. The stopword list provided is used as the criterion for eliminating meaningless words in the LDA model. After preprocessing, the data are used for LDA analysis and word2vec respectively, and the results are provided.
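The preprocessing step can be sketched as follows, using only the Python standard library. The three toy documents, the zero threshold, and the function names are assumptions made for illustration. The exact rule used in the research to flag invalid words is not stated here, so this shows only one plausible reading: a word whose TF-IDF score never rises above a threshold in any document carries little discriminating information and is treated as a stopword candidate.

```python
import math
from collections import Counter

# Hypothetical mini-corpus; the real input would be the TXT files
# converted from the collected articles.
docs = [
    "waste recycling study construction waste",
    "demolition waste policy study",
    "construction waste landfill study",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def inverse_document_frequency(tokenized_docs):
    """IDF = log(N / document frequency) for every word in the corpus."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter(w for doc in tokenized_docs for w in set(doc))
    return {w: math.log(n_docs / df) for w, df in doc_freq.items()}


def tfidf(doc, idf):
    """TF-IDF scores for one document, with TF as relative frequency."""
    counts = Counter(doc)
    return {w: (c / len(doc)) * idf[w] for w, c in counts.items()}


def flag_invalid_words(tokenized_docs, max_score=0.0):
    """Flag words whose best TF-IDF score in any document is <= max_score."""
    idf = inverse_document_frequency(tokenized_docs)
    best = {}
    for doc in tokenized_docs:
        for word, score in tfidf(doc, idf).items():
            best[word] = max(best.get(word, 0.0), score)
    return sorted(w for w, s in best.items() if s <= max_score)


tokenized = [tokenize(d) for d in docs]
invalid_words = flag_invalid_words(tokenized)
cleaned = [[w for w in doc if w not in invalid_words] for doc in tokenized]
print(invalid_words)
print(cleaned)
```

In this toy corpus the flagged words are the ones that occur in every document, since their IDF is zero. The cleaned token lists would then be passed to the LDA and word2vec models, together with the provided stopword list.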