Dataset Supporting a Glaserian Systematic Mapping Study of Grounded Theory in Software Engineering
Description
This dataset investigates how Grounded Theory (GT) has been applied, reported, and classified in Software Engineering research, and examines the limitations of traditional Systematic Mapping Studies (SMS) in adequately classifying GT-based qualitative studies. The research hypothesis is that conventional SMS approaches provide limited classification coverage, and that integrating SMS with Glaserian Grounded Theory principles improves analytical depth, classification accuracy, and theoretical insight.

The data were generated through a Glaserian Systematic Mapping Study (GSMS), combining systematic literature mapping with iterative qualitative coding and constant comparison. A corpus of 70 peer-reviewed research articles was selected through structured search, screening, and snowballing processes, and analysed using ATLAS.ti with open, selective, and theoretical coding.

The dataset demonstrates that fine-grained iterative coding identifies a richer and more meaningful set of classificatory elements than SMS alone. The resulting codes and quotations reveal patterns in GT application across software development contexts and methodological practices. The dataset supports replication of the GSMS process, secondary qualitative analysis, and mixed-methods research through the provided ATLAS.ti project and tabular exports.

Files Included in the Dataset

1. Guide GSMS Data.docx
Word document providing an overview of the dataset purpose, data process, and file structure, together with guidance on how to interpret data.

2. ATLAS.ti Project File
• GSMS_Software_GroundedTheory.atlasti
Complete ATLAS.ti project containing all documents, codes, quotations, memos, and groupings used in the qualitative analysis.

3. Document Manager Files (70 Documents)
• Document Manager70.xlsx
• Document Manager_70.csv
Each row corresponds to one research paper included in the GSMS analysis.

4. Code Manager Files (454 Codes)
• Code Manager454.xlsx
• Code Manager_454.csv
These files contain all codes generated during the GSMS analysis.

5. Quotation Manager Files (888 Quotations)
• Quotation Manager888.xlsx
• Quotation Manager_888.csv
Each row corresponds to a coded quotation extracted from the analysed papers.

6. Code-Grouped Quotation Manager (888 Quotations)
• Code-Grouped Quotation Manager888.xlsx
This file reorganises the same 888 quotations by code. Each worksheet corresponds to a specific code and contains all quotations associated with that code.

7. Code Manager – Grounded Theory Elements (183 Codes)
• Code Manager GTe183.xlsx
• Code Manager GTe_183.csv
These files contain the subset of 183 codes associated specifically with Grounded Theory Elements (GTe).
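After downloading, the stated counts (70 documents, 454 codes, 888 quotations, 183 GTe codes) can be checked against the CSV exports. The following is a minimal sketch, not part of the dataset. It assumes the CSV files are comma-delimited, UTF-8 encoded, and carry one header row followed by one record per row. The function names `count_data_rows` and `check_exports` are hypothetical.

```python
import csv
from pathlib import Path

# File names and expected record counts as listed in the dataset description.
EXPECTED_ROWS = {
    "Document Manager_70.csv": 70,
    "Code Manager_454.csv": 454,
    "Quotation Manager_888.csv": 888,
    "Code Manager GTe_183.csv": 183,
}


def count_data_rows(path, encoding="utf-8-sig"):
    """Count records after the header row, ignoring completely empty rows.

    Assumption: the first row is a header and the delimiter is a comma.
    newline="" lets the csv module handle quoted cells that span lines.
    """
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the assumed header row
        return sum(1 for row in reader if any(cell.strip() for cell in row))


def check_exports(folder, expected=EXPECTED_ROWS):
    """Return {file name: (rows found, rows expected)} for files present in folder."""
    report = {}
    for name, want in expected.items():
        path = Path(folder) / name
        if path.exists():
            report[name] = (count_data_rows(path), want)
    return report
```

Calling `check_exports("path/to/download")` returns one `(found, expected)` pair per file. A mismatch may only mean that the export uses a different delimiter or header layout than assumed here.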
Steps to reproduce
The dataset was produced through a Glaserian Systematic Mapping Study (GSMS). A systematic search was conducted in four major scientific databases (Web of Science, IEEE Xplore, Scopus, and ACM Digital Library) using search strings combining Grounded Theory and Software Engineering. The initial results were filtered through predefined inclusion and exclusion criteria, followed by a snowballing process, resulting in a final corpus of 70 articles.

Data analysis was carried out using ATLAS.ti as the primary qualitative analysis software. Each article was imported as a primary document and analysed through iterative cycles of open, selective, and theoretical coding. Quotations were extracted directly from the source texts, and codes were continuously refined using constant comparison across documents and iterations. Document groups, code groups, and analytic memos were used to support abstraction, traceability, and theory development. The GSMS workflow involved multiple iterations of keywording, mapping, and synthesis until theoretical saturation was achieved.

All qualitative artefacts (documents, codes, quotations, and relationships) were preserved in the ATLAS.ti project file and exported into structured Excel and CSV formats to enable reuse and reproduction of the analysis without reliance on proprietary software.

Researchers can reproduce or extend this study by applying the same GSMS protocol to a new corpus of articles, reusing the provided coding structures, or re-analysing the dataset using alternative qualitative or mixed-methods approaches.
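As an illustration of re-analysis without ATLAS.ti, the sketch below tallies how many quotations carry each code in a quotation export. It is not the authors' procedure. The column name `Codes` and the line-break separator between multiple codes in one cell are assumptions about the export layout and should be checked against the actual header row of the Quotation Manager CSV. The function name `tally_codes` is hypothetical.

```python
import csv
import io
from collections import Counter


def tally_codes(csv_text, code_column="Codes", separator="\n"):
    """Count quotations per code from the text of a quotation export.

    Assumptions: `code_column` names the column listing the codes applied
    to a quotation, and multiple codes in one cell are split by `separator`.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        cell = row.get(code_column) or ""
        for code in cell.split(separator):
            code = code.strip()
            if code:  # ignore quotations with no code in this column
                counts[code] += 1
    return counts
```

With the real file, the text can be read via `Path("Quotation Manager_888.csv").read_text(encoding="utf-8-sig")` and passed to `tally_codes`; `counts.most_common(10)` then lists the ten most frequently applied codes.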
Institutions
- Universidad Politécnica de Madrid (Madrid, Madrid)
- Politecnica Salesiana University (Azuay, Cuenca)