CalcFi Open Data: 34 Free CC-BY Financial and Macro Time Series Mirrored from Primary Sources
Description
Free, CC-BY-4.0 financial and macro datasets: a daily-refreshed mirror of CalcFi (https://calcfi.app) with full primary-source provenance.

The deposit holds 34 time series across 11 categories: mortgage rates (Freddie Mac PMMS, Treasury yields), interest rates (Federal Funds, Prime), inflation (CPI, PCE), employment (BLS unemployment, labor participation, hourly earnings), energy (WTI, Brent, gasoline), foreign exchange (USD/EUR, USD/GBP, JPY/USD), cryptocurrency (BTC, ETH, SOL), FDIC deposit and CD rates, credit card and personal loan rates, copper and corn benchmarks, and World Bank global indicators.

Each series is a complete Frictionless Data Package containing three files:
- data.csv: the full history, preceded by provenance headers
- datapackage.json: the schema descriptor
- README.md: the per-series source citation, live URL, and quickstart code

Data is mirrored verbatim from primary sources (FRED, BLS, BEA, US Treasury, Freddie Mac, FDIC, EIA, Federal Reserve H.10, World Bank, CoinGecko, IMF). No smoothing, imputation, or seasonal adjustment is applied beyond what the primary source publishes. The data is refreshed daily at 05:00 UTC.

This deposit is one of four copies of the same dataset that carry a permanent identifier. The other three are:
- Figshare: https://doi.org/10.6084/m9.figshare.32332290
- Kaggle: https://doi.org/10.34740/kaggle/dsv/16356447
- OSF: https://doi.org/10.17605/OSF.IO/PUMKT

The dataset is also available from:
- Hugging Face mirror: https://huggingface.co/datasets/iizy/calcfi-open-data
- GitHub repository: https://github.com/jerehere/calcfi-open-data
- Live API + documentation: https://calcfi.app/developers

An accompanying methodology paper has been submitted to SSRN and is currently being indexed. The paper describes the system architecture, the k-anonymity protocol applied to calculator-run aggregates (k≥10 enforced at write time), and the dataset schema in detail.

License: CC BY 4.0 on data, CC0 1.0 on scaffold code. Attribution is requested to https://calcfi.app/developers and to the primary source named in each dataset's README.
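Because data.csv carries provenance headers ahead of the history rows, a reader has to separate the two before parsing. The Python sketch below shows one way to do that. It is an illustration only: the `# key: value` header layout, the `date` and `value` column names, and every value in `SAMPLE` are assumptions made for the example, not the documented CalcFi format. Each series' README.md holds the authoritative quickstart.

```python
import csv
import io


def split_provenance(text):
    """Separate '#'-prefixed provenance lines from the CSV body."""
    provenance = {}
    body_lines = []
    for line in text.splitlines():
        if line.startswith("#"):
            # Assumed layout: "# key: value". Lines without a colon are skipped.
            key, sep, value = line.lstrip("#").partition(":")
            if sep:
                provenance[key.strip()] = value.strip()
        elif line.strip():
            body_lines.append(line)
    rows = list(csv.DictReader(io.StringIO("\n".join(body_lines))))
    return provenance, rows


# Hypothetical file contents, invented for this example.
SAMPLE = """\
# source: Example Source
# retrieved_at: 2024-01-02T05:00:00Z
date,value
2024-01-01,1.5
2024-01-02,1.6
"""

provenance, rows = split_provenance(SAMPLE)
print(provenance["source"], len(rows))  # prints "Example Source 2"
```

If only the history rows are needed, `pandas.read_csv(path, comment="#")` skips lines that begin with `#`, at the cost of discarding the provenance.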
Files
Steps to reproduce
Data is mirrored daily from primary sources via the CalcFi read-only API (https://calcfi.app/api/data/{slug}/history.csv). The mirror process is:

1. A daily extract-transform-load pipeline at 05:00 UTC pulls each series from primary sources (FRED, BLS, BEA, US Treasury, Freddie Mac, FDIC, EIA, Federal Reserve H.10, World Bank, CoinGecko, IMF).
2. Values are stored without smoothing, imputation, or seasonal adjustment.
3. Provenance headers (#-prefixed lines) are written into each CSV, recording the source name, primary URL, source series identifier, retrieval timestamp, and license.
4. A GitHub Actions workflow at 06:00 UTC mirrors the resulting files to the public repository.

To reproduce locally, run `node scripts/refresh.mjs` from the open-data repository. The script reads the series manifest from https://calcfi.app/api/llms/rates and downloads each history CSV to the matching subdirectory under datasets/. No authentication is required. Refer to scripts/refresh.mjs in the GitHub repository for the canonical implementation.
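The download step of the local refresh can be approximated in Python as follows. This is a hedged sketch, not a port of scripts/refresh.mjs. The URL template comes from the text above, but the `datasets/<slug>/data.csv` target path is an assumption, and the slugs are passed in directly because the format of the manifest at https://calcfi.app/api/llms/rates is not described here. The names `refresh`, `history_url`, and `fetch_bytes` are hypothetical.

```python
from pathlib import Path
from urllib.request import urlopen

# URL template as given in the reproduction notes.
HISTORY_URL = "https://calcfi.app/api/data/{slug}/history.csv"


def history_url(slug):
    """Build the history CSV URL for one series slug."""
    return HISTORY_URL.format(slug=slug)


def fetch_bytes(url):
    """Download a URL without authentication and return the raw bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def refresh(slugs, root="datasets", fetch=fetch_bytes):
    """Write each series' history CSV under root/<slug>/ (assumed layout)."""
    written = []
    for slug in slugs:
        target = Path(root) / slug / "data.csv"  # assumed file name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch(history_url(slug)))
        written.append(target)
    return written
```

Calling `refresh([...])` with real slugs performs one HTTP GET per series. The `fetch` parameter exists so the function can be exercised offline by passing a stub in place of `fetch_bytes`.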