Disease heritability inferred from familial relationships reported in medical records

Published: 16-07-2018 | Version 1 | DOI: 10.17632/j8239bz4n5.1
Fernanda C G Polubriaginof,
Rami Vanguri,
Kayla Quinnies,
Gillian M Belbin,
Alexandre Yahi,
Hojjat Salmasian,
Tal Lorberbaum,
Victor Nwankwo,
Li Li,
Mark Shervey,
Patricia Glowe,
Iuliana Ionita-Laza,
Mary Simmerling,
George Hripcsak,
Suzanne Bakken,
David Goldstein,
Krzysztof Kiryluk,
Eimear Kenny,
Joel Dudley,
David K Vawdrey,
Nicholas Tatonetti


This dataset includes clinical data along with familial relationships inferred from electronic health record data using RIFTEHR (Relationship Inference from the Electronic Health Record), a method that identifies familial relationships from patients' emergency contact information. A detailed description of RIFTEHR and of the use of this dataset is available at https://www.biorxiv.org/content/early/2017/05/24/066068. The dataset was modified to follow the Safe Harbor rules provided by the U.S. Department of Health and Human Services. Additionally, conditions with fewer than 1,000 individuals, families with unique family structures, and family structures with fewer than 100 families were excluded from this data release. Race and ethnicity data are not released for any category in which fewer than 20 individuals are reported. We generated a new random map of patient identifiers for every individual trait, so the same patient carries unrelated identifiers in different traits. Unfortunately, this precludes the use of this dataset for comorbidity analysis. The raw clinical data is available at http://riftehr.tatonettilab.org. In addition to this dataset, we released genotype data on dbGaP for patients who provided written consent for data release. Data from The Charles Bronfman Institute for Personalized Medicine is available at https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000388.v1.p1 and data from Columbia University Medical Center will soon be available at dbGaP.
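The two release rules that most affect downstream analysis, the minimum condition size and the per-trait identifier map, can be illustrated with a minimal Python sketch. This is not the code used to prepare the release. The function names, the (patient_id, trait) record layout, and the toy data are assumptions made for illustration only.

```python
import random


def filter_small_traits(records, min_individuals=1000):
    """Keep only records whose trait has at least `min_individuals` distinct patients.

    `records` is an iterable of (patient_id, trait) pairs (hypothetical layout).
    """
    records = list(records)
    patients_by_trait = {}
    for patient_id, trait in records:
        patients_by_trait.setdefault(trait, set()).add(patient_id)
    return [
        (patient_id, trait)
        for patient_id, trait in records
        if len(patients_by_trait[trait]) >= min_individuals
    ]


def remap_ids_per_trait(records, seed=None):
    """Replace patient identifiers with a fresh random numbering for each trait.

    Within one trait the mapping is consistent, so relationships between
    patients in that trait could still be expressed. Across traits the
    numberings are independent, so equal identifiers do not imply the
    same patient.
    """
    rng = random.Random(seed)
    patients_by_trait = {}
    for patient_id, trait in records:
        patients_by_trait.setdefault(trait, set()).add(patient_id)

    released = []
    for trait, patient_ids in patients_by_trait.items():
        ordered = sorted(patient_ids)
        pseudo_ids = list(range(1, len(ordered) + 1))
        rng.shuffle(pseudo_ids)  # new random map for this trait only
        mapping = dict(zip(ordered, pseudo_ids))
        released.extend((mapping[pid], trait) for pid in ordered)
    return released


# Toy data: patient "a" has both conditions.
toy_records = [
    ("a", "asthma"),
    ("b", "asthma"),
    ("a", "diabetes"),
    ("c", "diabetes"),
]
kept = filter_small_traits(toy_records, min_individuals=2)
released = remap_ids_per_trait(kept, seed=0)
```

In the remapped output, an identifier such as 1 may appear under both traits without referring to the same person, and patient "a" cannot be traced from one trait to the other. This is why comorbidity cannot be recovered from the released files.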