Associations of observational and genetically determined caffeine intake with coronary artery disease and diabetes - GWAS summary statistics data files

Published: 01-11-2020 | Version 1 | DOI: 10.17632/d8nwkm7p9p.1
Abdullah Said


GWAS summary statistics data files related to the article "Associations of observational and genetically determined caffeine intake with coronary artery disease and diabetes" by Said et al.


Steps to reproduce

Ascertainment of coffee and tea intake

During the first visit to the assessment center, daily coffee and tea intake were assessed by asking participants “How many cups of coffee do you drink each day? (Include decaffeinated coffee)” and “How many cups of tea do you drink each day? (Include black and green tea)”. Participants were asked to provide the average number of cups of either beverage they drink daily, based on their intake over the last year. We excluded participants who answered “Less than one”, “Do not know” or “Prefer not to answer”. Participants who reported drinking more than 10 cups of coffee or 20 cups of tea daily were asked to confirm their input.

In addition, coffee drinkers were asked what type of coffee they usually drink, to which they could answer “Decaffeinated coffee (any type)”, “Instant coffee”, “Ground coffee (include espresso, filter etc)”, “Other type of coffee”, “Do not know” or “Prefer not to answer”. Amongst coffee drinkers, we additionally excluded those who did not provide information on the type of coffee they usually drink.

Coffee and tea intake were truncated at 20 cups per day. Decaffeinated coffee was considered to contain 3 mg of caffeine per cup, instant coffee 60 mg, ground coffee 85 mg, and tea 30 mg. Combined caffeine intake from both coffee and tea was calculated as the sum of the daily caffeine intake from coffee and tea for individuals who provided data on both.

Genotyping and imputation

Genomic quality control and imputation were performed by the Wellcome Trust Centre for Human Genetics. Participants with a mismatch between genetic and reported sex, high missingness, excess heterozygosity, or who were not of white British descent were excluded.

Genome Wide Association Study

We performed GWAS for inverse rank normalized combined caffeine intake, caffeine from coffee, and caffeine from tea.
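
The caffeine calculation and the inverse rank normalization described above can be illustrated with a minimal Python sketch. It is not the authors' code. The per-cup caffeine values and the 20-cup truncation come from the text. The function names, the dictionary keys, and the specific transform (a rank-based inverse normal transform with a Blom-style offset of 3/8 and average ranks for ties) are assumptions, since the text does not state which variant was used. No caffeine value is given here for “Other type of coffee”, so that category is left out.

```python
from statistics import NormalDist

# Caffeine content per cup in mg, as stated in the text.
# The dictionary keys are hypothetical labels chosen for this sketch.
CAFFEINE_MG_PER_CUP = {
    "decaffeinated": 3,
    "instant": 60,
    "ground": 85,
    "tea": 30,
}
MAX_CUPS = 20  # intake was truncated at 20 cups per day


def caffeine_mg(cups, beverage):
    """Daily caffeine (mg) from one beverage, truncating cups at MAX_CUPS."""
    return min(cups, MAX_CUPS) * CAFFEINE_MG_PER_CUP[beverage]


def combined_caffeine_mg(coffee_cups, coffee_type, tea_cups):
    """Sum of daily caffeine from coffee and tea for one participant."""
    return caffeine_mg(coffee_cups, coffee_type) + caffeine_mg(tea_cups, "tea")


def inverse_rank_normalize(values, offset=3 / 8):
    """Rank-based inverse normal transform.

    ASSUMPTION: Blom-style offset (3/8) and average ranks for ties;
    the source text does not specify the exact variant.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        # Find the end of the current group of tied values.
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # 1-based average rank of the group
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1
    normal = NormalDist()
    # Map each rank to a probability strictly inside (0, 1), then to a z-score.
    return [normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]
```

For example, `combined_caffeine_mg(2, "instant", 3)` gives 2 × 60 + 3 × 30 = 210 mg per day, and a reported intake of 25 cups of tea is truncated to 20 cups before conversion.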
GWAS were performed using BOLT-LMM v2.3.1, which uses a linear mixed model that corrects for population structure and cryptic relatedness. In total, 19,400,838 SNPs were included per GWAS. The GWAS were adjusted for age at inclusion, sex, genotyping array (UK Biobank Axiom or UK BiLEVE Axiom), and the first 30 principal components provided by UK Biobank. To obtain a set of independent SNPs per phenotype, SNPs with P<5×10⁻⁸ were clumped together based on linkage disequilibrium R²>0.005 and 5-Mb distance using the clumping procedure integrated in PLINK version 1.9. To account for multiple testing of the 3 GWAS, we considered only SNPs with Bonferroni-corrected P<1.67×10⁻⁸ (traditional GWAS significance threshold of 5×10⁻⁸/3) as statistically significant. SNPs were excluded if the minor allele frequency was <0.005 or the INFO score was <0.3. A locus was defined as a 1-Mb region on either side of the sentinel SNP. Positions are based on GRCh37/hg19. A total of 362,316 individuals were included in the GWAS for combined caffeine intake, 373,522 for caffeine from coffee and 395,866 for caffeine from tea.
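
The thresholds in this paragraph can be written out as a short Python sketch, which may help when filtering the summary statistics files. The numeric cut-offs are taken from the text. The function names and the idea of passing minor allele frequency, INFO score, P value and position as plain arguments are assumptions made for illustration; the actual column names in the data files are not described here. Clamping the lower locus bound at position 1 is also an assumption.

```python
# Significance threshold: traditional GWAS threshold divided by the 3 GWAS.
GWAS_THRESHOLD = 5e-8
N_GWAS = 3
BONFERRONI_THRESHOLD = GWAS_THRESHOLD / N_GWAS  # approximately 1.67e-8

# Quality-control cut-offs stated in the text.
MIN_MAF = 0.005   # SNPs with minor allele frequency below this are excluded
MIN_INFO = 0.3    # SNPs with INFO score below this are excluded

# A locus spans 1 Mb on either side of the sentinel SNP.
LOCUS_FLANK_BP = 1_000_000


def passes_qc(maf, info):
    """True if a SNP is kept under the MAF and INFO exclusion rules."""
    return maf >= MIN_MAF and info >= MIN_INFO


def is_significant(p_value):
    """True if the P value is below the Bonferroni-corrected threshold."""
    return p_value < BONFERRONI_THRESHOLD


def locus_bounds(sentinel_pos):
    """Start and end (bp, GRCh37/hg19) of the locus around a sentinel SNP.

    ASSUMPTION: the lower bound is clamped at position 1.
    """
    return max(1, sentinel_pos - LOCUS_FLANK_BP), sentinel_pos + LOCUS_FLANK_BP
```

Under these rules, a SNP with P = 3×10⁻⁸ would pass the traditional threshold used for clumping but would not count as statistically significant after correction for the 3 GWAS.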