Databases Available for GWAS Research (Continuously Updated)

Date: 2022-04-18 16:45:29

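1. For each database, the list records its name, the phenotypes covered, whether genotypes can be downloaded, and whether GWAS result files (P-values, effect sizes, SNP positions) can be downloaded. The databases collected so far are listed below.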

A paper that draws on these databases: Genome-wide association study identifies 74 loci associated with educational attainment
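As a rough illustration of what a downloadable GWAS result file (P-values, effect sizes, SNP positions) lets you do, the sketch below filters a tab-separated summary-statistics table for genome-wide significant SNPs. The column names (`SNP`, `CHR`, `BP`, `BETA`, `P`) and the example rows are assumptions made for illustration only; real files differ from study to study, so check the header of the file you actually download.

```python
import csv
import io

# Conventional genome-wide significance threshold.
GENOME_WIDE_P = 5e-8


def significant_hits(handle, p_threshold=GENOME_WIDE_P):
    """Return the rows whose P-value is below p_threshold.

    Assumes a tab-separated file with the hypothetical columns
    SNP, CHR, BP, BETA and P; adjust the names to the real header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    hits = []
    for row in reader:
        p_value = float(row["P"])
        if p_value < p_threshold:
            hits.append({
                "snp": row["SNP"],
                "chrom": row["CHR"],
                "pos": int(row["BP"]),
                "beta": float(row["BETA"]),
                "p": p_value,
            })
    return hits


# Made-up rows, for illustration only (not real association results).
example = io.StringIO(
    "SNP\tCHR\tBP\tBETA\tP\n"
    "rs0000001\t1\t1000\t0.021\t3.2e-9\n"
    "rs0000002\t2\t2000\t-0.004\t0.41\n"
)
hits = significant_hits(example)
print(hits)
```

For a real file, pass an open file handle instead of the `io.StringIO` object.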

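2. The Japanese Genotype-phenotype Archive (JGA): this archive holds individual-level genotype and phenotype data, and access requires an application. GWAS have already been performed on these data. Database link: https://www.ddbj.nig.ac.jp/jga/index-e.html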

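3. ExAC: does not provide individual-level genotypes, but does provide VCF, CNV, coverage, and related files. Phenotypes are limited to those that have already been published, such as type 2 diabetes.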

Populations and sample counts in ExAC:

Population                      Male Samples  Female Samples   Total
African/African American (AFR)         1,888           3,315   5,203
Latino (AMR)                           2,254           3,535   5,789
East Asian (EAS)                       2,016           2,311   4,327
Finnish (FIN)                          2,084           1,223   3,307
Non-Finnish European (NFE)            18,740          14,630  33,370
South Asian (SAS)                      6,387           1,869   8,256
Other (OTH)                              275             179     454
Total                                 33,644          27,062  60,706
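A quick sanity check on the ExAC sample counts, and a way to see how the cohort is distributed across populations, can be sketched as follows. The counts are copied from the table above; the function names are illustrative only.

```python
# (male, female, total) sample counts per population, copied from the ExAC table.
EXAC_COUNTS = {
    "AFR": (1888, 3315, 5203),
    "AMR": (2254, 3535, 5789),
    "EAS": (2016, 2311, 4327),
    "FIN": (2084, 1223, 3307),
    "NFE": (18740, 14630, 33370),
    "SAS": (6387, 1869, 8256),
    "OTH": (275, 179, 454),
}
# The "Total" row as reported in the table.
REPORTED_TOTAL = (33644, 27062, 60706)


def rows_are_consistent(counts):
    """True if male + female equals the listed total in every row."""
    return all(m + f == t for m, f, t in counts.values())


def column_sums(counts):
    """Sum each column (male, female, total) over all populations."""
    males = sum(m for m, _, _ in counts.values())
    females = sum(f for _, f, _ in counts.values())
    totals = sum(t for _, _, t in counts.values())
    return (males, females, totals)


def population_shares(counts):
    """Fraction of all samples contributed by each population."""
    grand_total = sum(t for _, _, t in counts.values())
    return {pop: t / grand_total for pop, (_, _, t) in counts.items()}


print(rows_are_consistent(EXAC_COUNTS))
print(column_sums(EXAC_COUNTS) == REPORTED_TOTAL)
for pop, share in population_shares(EXAC_COUNTS).items():
    print(f"{pop}: {share:.1%}")
```

The shares make the ancestry composition explicit, which is worth keeping in mind when ExAC frequencies are used as a reference for a cohort of a different ancestry.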

Data available for download from ExAC:

FTP Link                    Description
Sites VCF                   VCF of Variant Sites
CNV                         CNV Counts and Intolerance Scores
Coverage                    Per Base Coverage
Functional Gene Constraint  Functional Gene Constraint Scores for ExAC and Subsets
Manuscript Data             Variant Tables Used in Manuscript
Resources                   Exome Calling and Purcell5k Intervals
Subsets                     Non-TCGA VCF Subset

Database link: http://exac.broadinstitute.org/downloads
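Because a sites VCF carries per-site summaries rather than individual genotypes, the usual first step is to pull allele counts out of the INFO column. The sketch below does this for a single VCF data line, assuming the INFO field contains the standard `AC` (alternate allele count) and `AN` (total allele number) keys. The example line is made up, and real files carry many more annotations.

```python
def parse_info(info_field):
    """Split a VCF INFO string into a dict; flag entries map to True."""
    info = {}
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


def allele_frequencies(vcf_line):
    """Return one (chrom, pos, ref, alt, frequency) tuple per ALT allele.

    The frequency is computed as AC / AN; it is None when AN is zero.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
    info = parse_info(fields[7])
    allele_number = int(info["AN"])
    allele_counts = [int(x) for x in info["AC"].split(",")]
    results = []
    for alt, count in zip(alts.split(","), allele_counts):
        freq = count / allele_number if allele_number else None
        results.append((chrom, pos, ref, alt, freq))
    return results


# Made-up multi-allelic site, for illustration only.
line = "1\t12345\t.\tA\tG,T\t100.0\tPASS\tAC=3,1;AN=200;DB"
records = allele_frequencies(line)
for record in records:
    print(record)
```

When reading a real file, skip header lines starting with `#` before passing lines to `allele_frequencies`.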

4. Simons Genome Diversity Project (SGDP)

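Provides 279 samples from populations in the Americas, Africa, East Asia, South Asia, West Eurasia, and Oceania. Available data include VCF files, phased genotypes, STRs, and BAMs for Y chromosomes.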

Link: http://reichdata.hms.harvard.edu/pub/datasets/sgdp/

5. Chinese Millionome Database (CMDB)

Website: https://db.cngb.org/cmdb/

The Chinese Millionome Database (CMDB) is a unique large-scale Chinese genomics database produced by BGI and hosted in the National GeneBank. The CMDB delivers periodical and useful variation information and scientific insights derived from the analysis of millions of Chinese sequencing data. The results aim to promote genetic research and precision medicine actions in China.

The information delivered includes the detected variants and their corresponding allele frequencies, annotations, frequency comparisons with global populations from existing databases, etc.

In short, CMDB provides variant allele frequencies, annotations, and frequency comparisons with other populations.
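A frequency comparison of this kind can also be done locally once allele frequencies from two sources are in hand. The sketch below is only an illustration: the variant keys, the frequencies, and the 0.1 cutoff are all made up, and it does not use any CMDB API.

```python
def frequency_differences(freq_a, freq_b, min_abs_diff=0.1):
    """List variants present in both tables whose allele frequencies
    differ by at least min_abs_diff.

    Returns (variant, freq_a, freq_b, freq_a - freq_b) tuples,
    sorted by variant key.
    """
    shared = freq_a.keys() & freq_b.keys()
    flagged = []
    for variant in sorted(shared):
        diff = freq_a[variant] - freq_b[variant]
        if abs(diff) >= min_abs_diff:
            flagged.append((variant, freq_a[variant], freq_b[variant], diff))
    return flagged


# Made-up frequencies keyed by "chrom:pos:ref:alt", for illustration only.
population_a = {"1:1000:A:G": 0.30, "2:2000:C:T": 0.05, "3:3000:G:A": 0.50}
population_b = {"1:1000:A:G": 0.10, "2:2000:C:T": 0.06}

flagged = frequency_differences(population_a, population_b)
for row in flagged:
    print(row)
```

Variants found in only one table are skipped here; whether a missing variant should count as frequency zero depends on how each source was called, so that choice is left to the reader.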