A multi-year dataset of hospital admission flows from the world’s largest public healthcare system: the Brazilian Sistema Único de Saúde, SUS

Published: 5 December 2025 | Version 1 | DOI: 10.17632/c7t4xybd6k.1
Contributors:

Description

This multi-year dataset compiles nationwide inpatient hospital authorizations issued within Brazil’s public healthcare system, the Sistema Único de Saúde (SUS), into annual origin–destination (OD) tables of intercity patient movements from 2008 to 2024. The underlying microdata were programmatically retrieved, restricted to principal authorizations, deduplicated, and harmonized to standard identifiers and dates. Two derived OD tables are provided, based on 198,218,410 unique hospital authorizations. The first aggregates annual interregional movements stratified by procedure complexity (medium or high) and contains 1,879,572 records. The second aggregates the same movements by procedure category (diagnostic, clinical, surgical, or transplants) and contains 2,351,471 records. All files are distributed in the columnar Parquet format to support efficient single-node filtering and grouping with modern data engines such as DuckDB and Apache Spark. R scripts accompany the data; they document the processing steps and let users generate customized aggregations of flows beyond the provided stratifications.

The repository is organized into three main folders. The /ha_microdata folder contains the yearly hospital admission microdata from 2008 to 2024, stored as one Parquet file per year, named sih_2008.parquet through sih_2024.parquet. The /ha_flows folder contains the two intercity origin–destination tables: ha_flow_c.parquet, stratified by procedure complexity (medium = 02, high = 03), and ha_flow_p.parquet, stratified by procedure category derived from the procedure code (diagnostic = 02, clinical = 03, surgical = 04, transplants = 05). The /r_scripts folder provides the R code: the acquisition script (gen_microdata.R), the flow summarization script (gen_flows.R), and an R vignette with examples of data visualization and alternative aggregations (flows_eda.R).

An additional folder, /ancillary_data, contains the spreadsheet mun_hr_br.xlsx, which lists Brazilian municipalities and their corresponding Health Regions, and the geospatial file health_regions_geo.gpkg, which provides the Health Region polygons.
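The file layout and stratum codes described above can be illustrated with a minimal Python sketch. It is not part of the distributed R scripts. The field names (origin, destination, code, admissions), the storage of codes as zero-padded strings, and the sample rows are all assumptions made for illustration; the actual column names should be checked against the Parquet schema before use.

```python
from collections import defaultdict

# Stratum codes as stated in the dataset description.
COMPLEXITY_LABELS = {"02": "medium", "03": "high"}
CATEGORY_LABELS = {
    "02": "diagnostic",
    "03": "clinical",
    "04": "surgical",
    "05": "transplants",
}


def microdata_files(first_year=2008, last_year=2024):
    """List the yearly microdata file paths, one Parquet file per year."""
    return [
        f"ha_microdata/sih_{year}.parquet"
        for year in range(first_year, last_year + 1)
    ]


def total_flows(rows, labels):
    """Sum admissions per (origin, destination, stratum label) across years.

    `rows` is an iterable of dicts with hypothetical keys
    'origin', 'destination', 'code' and 'admissions'.
    """
    totals = defaultdict(int)
    for row in rows:
        key = (row["origin"], row["destination"], labels[row["code"]])
        totals[key] += row["admissions"]
    return dict(totals)


# Made-up rows standing in for records of ha_flow_c.parquet (not real data).
sample_rows = [
    {"year": 2008, "origin": "A", "destination": "B", "code": "02", "admissions": 10},
    {"year": 2009, "origin": "A", "destination": "B", "code": "02", "admissions": 5},
    {"year": 2009, "origin": "A", "destination": "C", "code": "03", "admissions": 2},
]

flows = total_flows(sample_rows, COMPLEXITY_LABELS)
```

With the real tables, the same grouping would normally be pushed down to a Parquet-aware engine rather than done row by row in Python. The loop here only makes the aggregation logic explicit.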

Files

Institutions

  • Universidade Federal da Bahia

Categories

Public Health, Hospital Inpatient, Regional Planning

Licence