Here, we developed a comprehensive human variation annotation database (VARAdb,
http://www.licpathway.net/VARAdb), which aims to provide a large number of variations and annotate their
potential roles with a large amount of regulatory information.
The current version of VARAdb catalogs a total of 577,283,813 variations and
provides five annotation sections: ‘Variation
information’, ‘Regulatory information’, ‘Related genes’, ‘Chromatin
accessibility’ and ‘Chromatin interaction’. These sections contain considerably more information than similar databases. The information includes motif changes, risk SNPs, LD SNPs,
eQTLs, clinical variant-drug-gene pairs, sequence conservation, somatic mutations, enhancers, super
enhancers, promoters, TFs, ChromHMM states, histone modifications, ATAC accessible regions and chromatin
interactions from Hi-C and ChIA-PET.
Importantly, we considered two types of variation-related genes: 1) a variation located in an enhancer may be associated with that enhancer's target genes, as predicted by the Lasso method; 2) genes related to a variation based on distance. In addition, VARAdb can prioritize variations based on score, annotate novel variants and conveniently perform downstream pathway analysis. Together, VARAdb is a user-friendly database to query, browse and visualize variations of interest. We believe VARAdb will help researchers gain insight into the regulatory roles of variations in complex diseases.
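As an illustration of the distance-based gene assignment mentioned above, the following minimal Python sketch finds the gene whose transcription start site (TSS) lies closest to a variation. It is not VARAdb's own implementation: the gene table, the coordinates, the use of the TSS as the reference point, and the function names `build_index` and `nearest_gene` are all assumptions made for this example.

```python
from bisect import bisect_left

# Hypothetical gene table: (chromosome, TSS position, gene symbol).
# Symbols and coordinates are invented for illustration only.
GENES = [
    ("chr1", 1_000, "GENE_A"),
    ("chr1", 5_000, "GENE_B"),
    ("chr1", 12_000, "GENE_C"),
    ("chr2", 3_000, "GENE_D"),
]


def build_index(genes):
    """Group genes by chromosome and sort each group by TSS position."""
    index = {}
    for chrom, tss, symbol in genes:
        index.setdefault(chrom, []).append((tss, symbol))
    for entries in index.values():
        entries.sort()
    return index


def nearest_gene(index, chrom, pos):
    """Return (symbol, distance) for the gene with the closest TSS, or None."""
    entries = index.get(chrom)
    if not entries:
        return None
    # Binary search for the insertion point, then compare the neighbours
    # immediately to the left and right of that point.
    i = bisect_left(entries, (pos, ""))
    candidates = entries[max(i - 1, 0): i + 1]
    tss, symbol = min(candidates, key=lambda e: abs(e[0] - pos))
    return symbol, abs(tss - pos)


index = build_index(GENES)
print(nearest_gene(index, "chr1", 4_000))
```

A production pipeline would more likely work from a full gene annotation and might apply a maximum distance cutoff; neither detail is specified in the text above.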
We collected variations not only from dbSNP but also from multiple other resources. Notably, 577,098,938 SNVs were collected from dbSNP v151 and 79,482,384 common SNPs were collected from the 1000 Genomes Project. Each common SNP has a minor allele frequency ≥ 1% in at least one 1000 Genomes population. Millions of LD SNPs for five super-populations (4,477,132 from African; 4,548,152 from Admixed American; 3,693,208 from East Asian; 4,011,947 from European; and 3,838,175 from South Asian) were calculated using the phased genotype information accompanying 1000 Genomes Project phase 3. In addition, we integrated 1,515,001 risk SNPs from the GWAS Catalog, GWASdb v2.0, GAD, Johnson and O'Donnell, and GRASP v2.0. We also obtained 3,998,301 eQTLs from GTEx v7, PancanQTL, and HaploReg v4.1.
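The common-SNP criterion described above (a minor allele frequency of at least 1% in at least one population) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the SNPs are taken to be biallelic, the frequencies are invented, the population codes are used purely as dictionary keys, and the function names `minor_allele_frequency` and `is_common` are hypothetical rather than part of VARAdb.

```python
def minor_allele_frequency(alt_freq):
    """Minor allele frequency of a biallelic site, given the alternate allele frequency."""
    return min(alt_freq, 1.0 - alt_freq)


def is_common(pop_alt_freqs, threshold=0.01):
    """True if the minor allele frequency reaches the threshold in at least one population."""
    return any(
        minor_allele_frequency(freq) >= threshold
        for freq in pop_alt_freqs.values()
    )


# Invented alternate allele frequencies per population, for illustration only.
snp_a = {"AFR": 0.002, "AMR": 0.004, "EAS": 0.000, "EUR": 0.030, "SAS": 0.001}
snp_b = {"AFR": 0.001, "AMR": 0.000, "EAS": 0.999, "EUR": 0.002, "SAS": 0.003}

print(is_common(snp_a))  # common: 3% in one population
print(is_common(snp_b))  # rare in every population once the minor allele is considered
```

Taking the minimum of the allele frequency and its complement matters for sites such as `snp_b`, where the alternate allele is nearly fixed in one population and the minor allele is therefore the reference allele.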
SEdb: The comprehensive human Super-Enhancer database
SEanalysis: a web tool for super-enhancer associated regulatory analysis
KnockTF: a comprehensive human gene expression profile database with knockdown/knockout of transcription factors
ENdb: An experimentally supported enhancer database for human and mouse