Replication Package for "Familiar Strangers: The Role of Diaspora Networks in Foreign Investment and Long-run Development" (INEC-D-24-00067)

Published: 26 May 2026 | Version 1 | DOI: 10.17632/243jg3s2y8.1
Contributors:
Fanghao Chen

Description

This package replicates "Familiar Strangers: The Role of Diaspora Networks in Foreign Investment and Long-run Development" (Journal of International Economics, forthcoming) by Fanghao Chen, Ruichi Xiong, and Xiaobo Zhang. It contains the full Stata 17 pipeline (36 do-files, single-entry Main.do with module toggles) that reproduces every table and figure in the paper. Note: the two primary microdata sources (the SAIC firm-registration database and the 2005 China 1% Population Survey) are restricted-access and cannot be redistributed here; please contact the authors to arrange controlled remote access for full replication. See README.md for environment setup and runtime instructions, and INDEX.md for the mapping from each paper table and figure to the do-file that produces it.

Steps to reproduce

Full instructions are in README.md; INDEX.md maps each paper table/figure to its producing do-file.

1. Data. The two primary microdata sources (SAIC firm-registration database; 2005 China 1% Population Survey with surname) are restricted-access and NOT included. Contact any author (chenfanghao@jnu.edu.cn; ruichixiong@whu.edu.cn; x.zhang@gsm.pku.edu.cn) or CER, Peking University (datasupport@pkucer.onaliyun.com) to arrange remote access on a controlled server where the full pipeline runs. Other datasets are public (paper Table 2). Place raw exports in Crude_Data/ and cached .dta files in raw_data/.

2. Environment. Stata 17+. One-time setup: install the following packages with ssc install: reghdfe, ftools, estout, outreg2, coefplot, gtools, asdoc, ppmlhdfe, did_multiplegt, shufflevar.

3. Run. cd to the package root and run: do Main.do. It sets all path globals and sources every do-file (01_prepare -> 02_construct -> 03_describe -> 04_analyze -> 05_robust). Outputs go to tables/, figures/, logfiles/. Runtime is ~3-4 h on a 16 GB / 8-core machine; the one-time CSV import (set REBUILD_CACHE=1 to run it) adds 2-3 h.

4. Selective re-runs. To skip a module, set its toggle at the top of Main.do (RUN_PREPARE/CONSTRUCT/DESCRIBE/ANALYZE/ROBUST) to 0, or run a single script directly with do after cd-ing to the root (see INDEX.md).

Note: TabA4 and FigA7 are typeset in LaTeX and not code-generated.
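The environment and run steps can be sketched in Stata as follows. This is a minimal illustration rather than an excerpt from the package: ssc install accepts one package name per call, so the package list is looped over, and the directory path is a placeholder (an assumption) that must be replaced with the actual location of the package root.

```stata
* One-time setup: install each required community package from SSC.
* ssc install takes a single package per call, so loop over the list.
foreach pkg in reghdfe ftools estout outreg2 coefplot gtools asdoc ///
    ppmlhdfe did_multiplegt shufflevar {
    ssc install `pkg', replace
}

* Run the full pipeline from the package root.
* NOTE: the path below is a hypothetical placeholder, not part of the package.
cd "/path/to/replication_package"
do Main.do
```

The replace option reinstalls a package that is already present, so the loop can be re-run safely on a machine that has some of the packages installed.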

Categories

Firm Location, China, Diaspora, Foreign Market Entry, Multinational Firm, Migrant Development
