India's Social Science Trajectory: A Curated Open Dataset for Research and Policy Analysis

Published: 31 July 2025 | Version 1 | DOI: 10.17632/t5hm4h32zv.1
Contributor:
Sanjoy Kar

Description

The dataset titled India's Social Science Trajectory: A Curated Open Dataset for Research and Policy Analysis provides a longitudinal and multi-dimensional overview of the Indian social science research ecosystem spanning the period from 1948 to 2025, i.e., the post-Independence era. It is designed to support empirical investigations into research capacity, output growth, scholarly integrity, and infrastructural development. The dataset includes several interlinked components.

First, the Activity Index (AI) is presented for each year between 1948 and 2024, measuring India's relative research activity compared to the global average. This metric is computed by normalizing India's annual publication share against its overall share during the study period and comparing it with corresponding global figures.

Second, the Relative Growth Rate (RGR) component presents a comparative analysis of publication growth for India, the global average, and the Indian Council of Social Science Research (ICSSR) using the standard logarithmic growth formula.

A third component includes historical PhD statistics, offering counts of doctoral theses awarded in social sciences from 1930 to 2025. This provides insight into long-term capacity building in higher education.

A fourth module presents period-wise publication output, tabulating the total and average annual research publications from India and ICSSR across seven major temporal segments.

Fifth, the dataset features data on retractions in Indian social sciences, derived from the Retraction Watch database. It includes both a subject-wise breakdown of the 382 identified retracted papers and a year-wise time series analysis covering 2004 to 2025, enabling scrutiny of research misconduct trends.

Sixth, the dataset includes a consolidated list of 41 Indian social science journals indexed in the 2025 JCR. Each journal entry provides metadata such as title, publisher, ISSN, WoS category, JCR edition (SCIE, SSCI), JIF, JCI, and quartile rank (Q1–Q4). The majority of these journals are clustered in the Q3 and Q4 categories, reflecting the persistent challenges of international visibility, citation performance, and editorial capacity in India's social science publishing landscape.

Additionally, the dataset offers a visual representation of the socio-technical information infrastructure that supports social science research in India. This systems diagram illustrates the interconnections among funding sources, institutional arrangements, data repositories, and knowledge dissemination pathways. Complementing this, a conceptual model of a proposed Open Research Information (ORI) system outlines a future-oriented, PID-enabled, multilingual, FAIR-aligned national infrastructure for responsible research assessment and open knowledge dissemination.

Finally, the dataset includes a comprehensive list of India's Global South collaboration partners in social science publishing, based on Scopus co-authorship data from 1948 to 2024.
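The two indicators described above can be sketched in a few lines of Python. This is a minimal illustration and not the contributor's Excel workbook: the function names, the scaling of the Activity Index by 100, the use of cumulative counts for the Relative Growth Rate, and the publication counts in the example are all assumptions made for demonstration.

```python
import math


def activity_index(country_by_year, world_by_year):
    """Activity Index per year (assumed form, scaled by 100):
    (country's share of its own total in a year)
    / (world's share of its own total in that year) * 100.
    """
    country_total = sum(country_by_year.values())
    world_total = sum(world_by_year.values())
    ai = {}
    for year, n_country in country_by_year.items():
        country_share = n_country / country_total
        world_share = world_by_year[year] / world_total
        ai[year] = 100.0 * country_share / world_share
    return ai


def relative_growth_rate(cumulative_by_year):
    """RGR between consecutive years (assumed form on cumulative counts):
    (ln W2 - ln W1) / (T2 - T1).
    """
    years = sorted(cumulative_by_year)
    rgr = {}
    for t1, t2 in zip(years, years[1:]):
        w1, w2 = cumulative_by_year[t1], cumulative_by_year[t2]
        rgr[t2] = (math.log(w2) - math.log(w1)) / (t2 - t1)
    return rgr


# Made-up counts, for illustration only (not taken from the dataset).
india = {2000: 10, 2001: 30}
world = {2000: 100, 2001: 100}
print(activity_index(india, world))  # {2000: 50.0, 2001: 150.0}

india_cumulative = {2000: 10, 2001: 40}
print(relative_growth_rate(india_cumulative))
```

Under this scaling, a value above 100 would indicate that the country's activity in that year is higher than the world pattern would suggest, and a value below 100 the opposite. Users should check the formulas documented in the dataset's Excel sheets before relying on this form.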

Steps to reproduce

This dataset was developed to trace the post-Independence trajectory (1948–2025) of India's social science research system using a combination of bibliometric extraction, institutional mapping, and retraction analytics. Data were sourced primarily from Scopus, the Retraction Watch database, and India's Shodhganga thesis repository. The following steps outline how the data were gathered and how other researchers may reproduce the dataset.

1. Bibliometric Data Extraction (Scopus)

Data on global and Indian social science publications were retrieved from the Scopus database using structured advanced queries. The first search string targeted global social science output:

    SUBJAREA(SOCI) AND PUBYEAR > 1947 AND PUBYEAR < 2025 AND (LIMIT-TO(LANGUAGE, "English"))

The second query focused on India-specific outputs:

    SUBJAREA (SOCI) AND PUBYEAR > 1947 AND PUBYEAR < 2025 AND (LIMIT-TO (AFFILCOUNTRY, "India")) AND (LIMIT-TO (LANGUAGE, "English"))

To isolate publications from ICSSR-affiliated institutions, a third query was constructed using 75 Scopus Affiliation IDs (AF-IDs) associated with ICSSR research institutes. The syntax was:

    (AF-ID(60075901) OR AF-ID(127194134) OR ... ) AND PUBYEAR > 1969 AND PUBYEAR < 2025

These queries were executed using the Scopus advanced search portal. Results were exported in .csv format with metadata fields including publication title, year, authors, affiliations, DOI, document type, and funding information.

2. Retraction Data Retrieval

Retraction data were collected from the Retraction Watch database hosted on GitLab (commit SHA: 372791fe4897b937216008427f14262ab0a1ec21). The CSV was filtered in Excel to extract entries with India-based affiliations and social science subject tags. Additional parsing was performed using Excel functions (SEARCH, MID, FIND) to identify subject categories and compute annual retraction trends.

3. Doctoral Thesis Data

Data on PhD awards in social science disciplines were manually compiled from the Shodhganga repository. Entries were filtered by subject and then aggregated by decade to map trends from 1948 to 2025.

4. Data Cleaning and Indicator Computation

Data cleaning and indicator calculations were performed in Excel, and visualizations were generated using Excel and Adobe Illustrator. The dataset is fully reproducible, with raw files and computed tables organized under a consistent naming convention (e.g., Figure 1.2a, Table 1.6).

5. File Organization and Reusability

Data are shared in .xlsx, .csv, .jpg, and .png formats with transparent naming (e.g., Figure_1.2a_Activity_Index.png). Intermediate calculations are documented within the Excel sheets. The entire dataset is licensed under CC BY 4.0 for reuse, and users are encouraged to replicate the analysis using the provided raw data.

The dataset should be cited as: Kar, S. (2025). India's Social Science Trajectory: A Curated Open Dataset for Research and Policy Analysis. Mendeley Data, V1, DOI: https://doi.org/10.17632/t5hm4h32zv.1.
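For readers who prefer a scripted alternative to the Excel filtering described in step 2, the retraction workflow could be approximated as in the sketch below. It is not the contributor's procedure. The column names (Subject, Country, RetractionDate), the "(SOC)" subject prefix, the semicolon-separated country field, and the date layout are assumptions about the Retraction Watch CSV that should be verified against the actual file, and the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter

# Invented sample rows; the column layout is an assumption about the real CSV.
SAMPLE_CSV = """Record ID,Subject,Country,RetractionDate
1,(SOC) Sociology;(B/T) Business - General;,India,3/14/2019 0:00
2,(BLS) Biology - Cellular;,India,7/2/2020 0:00
3,(SOC) Education;,India;United States,11/30/2020 0:00
4,(SOC) Political Science;,China,1/5/2021 0:00
"""


def india_social_science_retractions(rows):
    """Keep rows with an India affiliation and a social science subject tag."""
    selected = []
    for row in rows:
        countries = [c.strip() for c in row["Country"].split(";")]
        # "(SOC)" as the social science marker is an assumed convention.
        if "India" in countries and "(SOC)" in row["Subject"]:
            selected.append(row)
    return selected


def retractions_per_year(rows):
    """Count retractions per calendar year, assuming dates like '3/14/2019 0:00'."""
    counts = Counter()
    for row in rows:
        date_part = row["RetractionDate"].split()[0]
        year = int(date_part.split("/")[-1])
        counts[year] += 1
    return dict(sorted(counts.items()))


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
selected = india_social_science_retractions(rows)
yearly = retractions_per_year(selected)
print(yearly)  # {2019: 1, 2020: 1}
```

To run this against the real data, replace the in-memory sample with the CSV downloaded at the commit cited in step 2. Because the filtering rules here are assumptions, the resulting counts should be reconciled with the 382 retracted papers reported in the dataset.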

Institutions

  • Indian Council of Social Science Research

Categories

Social Sciences, Anthropology, Sociology, Political Science, Open System, Social Science Education, Gender, India, Research Evaluation, Bibliometrics, Open Data, History of the Social Sciences

Licence

CC BY 4.0