Bangladesh Air Quality Index (AQI) Dataset (2000–2025): Historical Hourly Air Pollution Data Across 103 Cities

Published: 21 January 2026 | Version 2 | DOI: 10.17632/9j447cynb9.2

Description

The Bangladesh Air Quality Index (AQI) Dataset (2000–2025) is a large-scale, hourly-resolution environmental dataset of historical air pollution measurements across 103 cities in Bangladesh. It spans 25 years and contains 1,048,551 hourly records, making it one of the most comprehensive publicly available air quality datasets for the country. Coverage includes urban, divisional, district, sub-district, coastal, and border locations.
Each record reports concentrations of seven pollutants (PM10, PM2.5, CO, CO2, NO2, SO2, and O3) together with a composite Air Quality Index (AQI) value, giving eight air quality variables in total. The remaining attributes are a city identifier, city name, geographic coordinates (latitude and longitude), and a timestamp in ISO 8601 format, which allow records to be aligned in space and time. The data are provided in a standardized CSV format with 13 columns, allowing straightforward integration with statistical analysis, visualization, and machine learning pipelines.
Temporal coverage varies by location: earlier records are concentrated in major metropolitan areas such as Dhaka, Chittagong, and Sylhet, while recent years add smaller cities and upazilas. The hourly resolution supports both short-term variability analysis and long-term trend assessment. Geographic coverage spans all major regions of Bangladesh, including the Ganges–Brahmaputra–Meghna delta, coastal zones such as Cox’s Bāzār and Teknāf, and northern border areas such as Panchagarh and Thākurgaon.
AQI values are calculated according to the United States Environmental Protection Agency (US EPA) standard and fall into six categories: Good, Moderate, Unhealthy for Sensitive Groups, Unhealthy, Very Unhealthy, and Hazardous. Some values are missing because of sensor maintenance, calibration, or data transmission limitations, and differences in monitoring equipment may introduce location- and time-dependent variation in the measurements.
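The record layout and AQI categories described above can be illustrated with a minimal Python sketch that uses only the standard library. The column names (`city_id`, `pm2_5`, `aqi`, and so on) are assumptions, since the description does not list the actual CSV header, and the two sample rows are made-up placeholders rather than real measurements. The category thresholds are the standard US EPA AQI ranges (0–50, 51–100, 101–150, 151–200, 201–300, 301 and above). Empty cells are read as missing values.

```python
import csv
import io
from datetime import datetime

# ASSUMPTION: hypothetical column names and made-up placeholder rows.
# Check the header row of the real CSV file before reusing this.
SAMPLE_CSV = """\
city_id,city_name,latitude,longitude,timestamp,pm10,pm2_5,co,co2,no2,so2,o3,aqi
1,Dhaka,23.81,90.41,2024-01-15T08:00:00,180.5,110.2,950.0,,42.1,12.3,30.0,179
2,Sylhet,24.89,91.87,2024-01-15T08:00:00,60.0,28.4,300.0,410.0,10.5,4.2,55.0,85
"""

# The eight air quality variables; any of them may be empty in a given row.
MEASUREMENT_COLUMNS = ["pm10", "pm2_5", "co", "co2", "no2", "so2", "o3", "aqi"]

# Upper bound of each US EPA AQI category; anything above 300 is Hazardous.
AQI_UPPER_BOUNDS = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy for Sensitive Groups"),
    (200, "Unhealthy"),
    (300, "Very Unhealthy"),
]


def parse_optional_float(text):
    """Convert a CSV cell to float, returning None for an empty cell."""
    text = (text or "").strip()
    return float(text) if text else None


def aqi_category(aqi):
    """Return the US EPA category name for an AQI value, or None if missing."""
    if aqi is None:
        return None
    for upper_bound, name in AQI_UPPER_BOUNDS:
        if aqi <= upper_bound:
            return name
    return "Hazardous"


def load_records(handle):
    """Read rows from an open CSV handle into a list of typed dictionaries."""
    records = []
    for row in csv.DictReader(handle):
        record = {
            "city_id": row["city_id"],
            "city_name": row["city_name"],
            "latitude": float(row["latitude"]),
            "longitude": float(row["longitude"]),
            # fromisoformat accepts ISO 8601 strings such as 2024-01-15T08:00:00;
            # a trailing "Z" is only accepted from Python 3.11 onward.
            "timestamp": datetime.fromisoformat(row["timestamp"]),
        }
        for column in MEASUREMENT_COLUMNS:
            record[column] = parse_optional_float(row[column])
        record["aqi_category"] = aqi_category(record["aqi"])
        records.append(record)
    return records


records = load_records(io.StringIO(SAMPLE_CSV))
for record in records:
    print(record["city_name"], record["timestamp"].isoformat(), record["aqi_category"])
```

To read the real file, pass an open file handle in place of the `io.StringIO` wrapper, for example `open(path, newline="", encoding="utf-8")`, after adjusting the column names to match the actual header.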
This dataset supports a wide range of applications, including air quality monitoring, environmental and public health research, climate and urban studies, policy evaluation, and machine learning tasks such as pollutant prediction, city clustering, anomaly detection, and AQI forecasting with models such as LSTM, ARIMA, and Prophet. The dataset is released under the Creative Commons Attribution 4.0 (CC BY 4.0) license, which permits use, distribution, and adaptation provided appropriate attribution is given.
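For the forecasting use case, a simple reference point can help before fitting any of the models named above. The sketch below is a seasonal-naive baseline, which is not one of those models: it predicts each hour's AQI as the value observed 24 hours earlier and scores the result with mean absolute error, skipping missing values. The function names are illustrative, and the input series is synthetic, generated by a formula rather than taken from the dataset.

```python
def seasonal_naive_forecast(series, season=24):
    """Predict each point as the value observed one season earlier.

    The first `season` predictions are None because no earlier value exists.
    A missing (None) source value yields a missing prediction.
    """
    return [None if i < season else series[i - season] for i in range(len(series))]


def mean_absolute_error(actual, predicted):
    """Average absolute difference over positions where both values are present."""
    pairs = [
        (a, p) for a, p in zip(actual, predicted)
        if a is not None and p is not None
    ]
    if not pairs:
        return None
    return sum(abs(a - p) for a, p in pairs) / len(pairs)


# Synthetic two-day hourly AQI series (NOT real data): the second day is
# the first day shifted up by 5, with one missing reading on each day.
day_one = [100.0 + hour for hour in range(24)]
day_two = [105.0 + hour for hour in range(24)]
hourly_aqi = day_one + day_two
hourly_aqi[3] = None   # missing reading on day one
hourly_aqi[30] = None  # missing reading on day two

forecast = seasonal_naive_forecast(hourly_aqi, season=24)
baseline_mae = mean_absolute_error(hourly_aqi, forecast)
print("Seasonal-naive MAE:", baseline_mae)
```

A model that cannot beat this baseline on held-out data is probably not capturing more than the daily cycle. The 24-hour season is itself an assumption that suits hourly records and should be revisited if the data are resampled.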

Categories

Air Quality, Machine Learning

Licence

CC BY 4.0