A Comprehensive Retail Market Price Dataset for Wheat Flour and Macroeconomic Indicators in Bangladesh (2007–2025)
Description
This dataset contains monthly retail wheat flour prices across Bangladesh alongside macroeconomic indicators and global commodity prices from January 2007 to September 2025. It includes 17,099 observations covering all eight administrative divisions and supports price forecasting, structural break analysis, and studies of how domestic markets respond to international prices and exchange rate shocks.

Price data come from domestic market monitoring networks across Bangladesh, with missing values filled by spatial interpolation. Macroeconomic indices and international commodity prices are sourced from the World Bank and national statistical registries. Observations are monthly and span the national, divisional, district, and market levels.

The core file contains 18 columns. Spatial identifiers include adm1_name (one of eight divisions: Dhaka, Chittagong, Khulna, Barisal, Mymensingh, Rajshahi, Rangpur, Sylhet), adm2_name (district), mkt_name (specific market), lat, and lon. Each observation has a unique ID in eo_id (format: OBS_XXXXX). Temporal variables (dates, year, and month) are standardized to the first day of each calendar month.

Data quality is tracked through data_coverage_recent (reporting completeness), spatially_interpolated (1 = estimated via spatial models; 0 = original field observation), and trust_wheat_flour (a confidence score based on reporting consistency).

The target variable is wheat_flour_price, in Bangladeshi Taka (BDT) per kilogram. inflation_wheat_flour captures its year-over-year percentage change. Economic context comes from four variables: c_food_price_index (national food basket), exchange_rate_unofficial (parallel-market BDT-to-USD rate, reflecting import costs), cpi_food_index (food category inflation), and international_wheat_price_usd (global wheat benchmark from the World Bank Pink Sheet).
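As an illustration of the year-over-year change described for inflation_wheat_flour, the sketch below compares each month's price with the price twelve months earlier. The exact formula used to build the column is not stated here, so the function `yoy_pct_change` and the sample prices are hypothetical and only show one common definition.

```python
from datetime import date


def yoy_pct_change(prices):
    """Year-over-year percentage change for a monthly price series.

    prices: dict mapping a first-of-month date to a price in BDT per kg.
    Returns a dict with the same keys; the value is None when the price
    twelve months earlier is missing or zero.
    """
    changes = {}
    for month, price in prices.items():
        previous = prices.get(date(month.year - 1, month.month, 1))
        if previous is None or previous == 0:
            changes[month] = None
        else:
            changes[month] = 100.0 * (price - previous) / previous
    return changes


# Hypothetical prices for a single market (not taken from the dataset).
prices = {
    date(2023, 1, 1): 50.0,
    date(2024, 1, 1): 55.0,
    date(2024, 2, 1): 56.0,
}
yoy = yoy_pct_change(prices)
```

Months without a matching observation one year earlier are left as None rather than guessed, which keeps gaps visible to downstream models.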
The dataset is suitable for time-series forecasting, econometric modeling, and machine learning applications targeting food security and market dynamics in import-dependent economies. Researchers can apply time-series cross-validation (rolling windows or forward chaining), train XGBoost, LightGBM, or LSTM models, or apply spatial econometrics using the lat and lon coordinates. The files can be opened and analyzed in Python (pandas, numpy, scikit-learn, optuna, ruptures), R, Stata, SPSS, or MATLAB.
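A minimal sketch of forward-chaining validation follows. It splits on calendar months rather than on rows, on the assumption that many markets share each month and that a month should never appear in both training and test data. The function name `forward_chaining_splits` and its parameters are hypothetical.

```python
def forward_chaining_splits(months, n_splits, test_size):
    """Yield (train_months, test_months) pairs with an expanding train window.

    months: iterable of sortable month labels, e.g. "2007-01".
    n_splits: number of folds.
    test_size: number of months in each test block.
    """
    ordered = sorted(set(months))
    first_test = len(ordered) - n_splits * test_size
    if first_test <= 0:
        raise ValueError("not enough months for the requested folds")
    for fold in range(n_splits):
        start = first_test + fold * test_size
        # Training data is everything strictly before the test block.
        yield ordered[:start], ordered[start:start + test_size]


# Twelve months of labels, three folds of two test months each.
months = [f"2007-{m:02d}" for m in range(1, 13)]
splits = list(forward_chaining_splits(months, n_splits=3, test_size=2))
```

Rows can then be assigned to a fold by checking whether their month label falls in the training or the test list. A rolling-window variant would instead drop the oldest months from each training list.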
Steps to reproduce
1. Data Sourcing & Integration: Monthly market-level price observations from 64 districts in Bangladesh were collected and combined with macroeconomic time series from official global sources, including the World Bank Real-Time Prices dataset and World Bank Commodity Price Data (The Pink Sheet).
2. Spatial Formatting: Each retail market was georeferenced with latitude and longitude coordinates to capture regional variation and cross-border logistics effects.
3. Quality Control & Imputation: Missing historical price values were estimated using cross-market spatial interpolation, and every estimated value is marked with a binary flag ('spatially_interpolated').
4. External Feature Engineering: The time series were extended with national consumer price inflation indices, unofficial (parallel-market) exchange rates, and international wheat price indices to capture external macroeconomic shocks.
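The interpolation algorithm used in step 3 is not specified, so the sketch below substitutes inverse-distance weighting over great-circle distances as one plausible stand-in, not the authors' method. The function names, the `power` parameter, and the sample markets and coordinates are all hypothetical. Only the column names lat, lon, wheat_flour_price, and spatially_interpolated come from the dataset description.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def idw_fill(records, power=2.0):
    """Fill missing prices for one month by inverse-distance weighting.

    records: list of dicts with lat, lon, and wheat_flour_price
    (None when missing). Returns new dicts with a spatially_interpolated
    flag: 1 for estimated values, 0 otherwise.
    """
    observed = [r for r in records if r["wheat_flour_price"] is not None]
    filled = []
    for rec in records:
        row = dict(rec)
        row["spatially_interpolated"] = 0
        if rec["wheat_flour_price"] is None and observed:
            num = den = 0.0
            for obs in observed:
                dist = haversine_km(rec["lat"], rec["lon"],
                                    obs["lat"], obs["lon"])
                weight = 1.0 / max(dist, 1e-6) ** power
                num += weight * obs["wheat_flour_price"]
                den += weight
            row["wheat_flour_price"] = num / den
            row["spatially_interpolated"] = 1
        filled.append(row)
    return filled


# Illustrative markets and prices (not taken from the dataset).
records = [
    {"mkt_name": "Market A", "lat": 23.81, "lon": 90.41, "wheat_flour_price": 50.0},
    {"mkt_name": "Market B", "lat": 22.36, "lon": 91.78, "wheat_flour_price": 54.0},
    {"mkt_name": "Market C", "lat": 23.46, "lon": 91.18, "wheat_flour_price": None},
]
filled = idw_fill(records)
```

If no market reports a price in a given month, the missing value stays None with the flag at 0, so the estimate never relies on data that does not exist.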
Institutions
- Bangladesh Agricultural University, Mymensingh Division, Mymensingh