SNPselector & High Quality SNP Database

Center for Human Genetics, DUMC

SNPselector will be retired from public use on September 30, 2010

Introduction

This is the HQSNP DB (high-quality SNP database) developed by the CHG bioinformatics group. A high-quality SNP is defined as a SNP that has allele frequency or genotyping data. The majority of the HQSNPs come from HapMap; others come from sources including JSNP (Japanese SNP database), TSC (The SNP Consortium), and Affymetrix 120K SNP.

A SNP selection tool (SNPselector) is built upon HQSNP. It takes a SNP ID list, a gene name list, or a genome region list as input and searches for SNPs suitable for genome scans or gene association studies. It can take an optional ABI SNP file (exported from the ABI SNP search web page) as input to check whether a candidate SNP is available from ABI. It can also take an optional Illumina SNP pre-score file as input to select SNPs for the Illumina SNP assay. It generates results sorted by tag SNP in LD block, SNP quality, SNP function, SNP regulatory potential, and SNP mutation risk.

User Instruction (manual)

There are four kinds of SNP searches you can do:
  • Get SNPs by dbSNP rs#: Choose this search if you have already selected a list of SNPs and you just want to get the SNP information. The program will generate an Excel file containing the SNP flanking sequence, variation, quality, function, etc. In the Excel file, there are 10 highlighted fields. You need to send only those highlighted fields to Illumina to get a SNP pre-score. (The same fields are present in the other types of searches as well.)
  • Get gene SNPs by gene names: Choose this search if you have a list of gene names and you want to get the SNP information for these genes. The gene name can be an official gene symbol, Ensembl gene ID, RefSeq accession ID, LocusLink number, etc.
  • Get gene SNPs by genome regions: Choose this search if you have a list of genome regions and you want to get all gene SNP information in these regions. The software will find all the Ensembl genes in the regions and find the SNPs associated with each Ensembl gene.
  • Get genome scan SNPs by genome regions: Choose this search if you have a list of genome regions and you want to get evenly spaced SNPs in these regions.

    Note:

  • For each gene, the program takes about 1 minute to generate the SNP list. Please be patient when you submit a long list of genes or genome regions.
  • For the ABI upload file, please use the exported ABI text file.
  • For the Illumina upload file, please use the original Illumina csv file that Illumina sends back to you after running their SNP scoring program.
  • (03/12/05): A user can now set the r2 and MAF (minor allele frequency) values for LD selection. The default values for r2 and MAF are 0.7 and 0.05, respectively.
  • (03/20/05): A user can now upload a list of SNP rs# (see the example file - tag.txt) as user-defined tag SNPs for LD selection. This is useful when the user has already genotyped a set of SNPs and wants to use them as tag SNPs to define LD bins.
  • (07/19/05): The selection of race, r2, and minor allele frequency (MAF) for LD bin analysis has been added at the gene or genomic region level. Please see the table below to find the number code for the genotyping races available in SNPselector. Please also check the example files (gene.txt, loca.txt, spac.txt) for searching SNPs within a gene or genomic region. For example, in "gene.txt", the race number "9" for gene "ACE" means that the population selected for LD analysis is "Perlegen African American".
  • New (12/12/07): The user MUST limit the segment size in the function "Get genome scan SNPs by genome regions" to 1Mb to get a response from the server. If the user requires information on a larger segment, it is suggested that the user submit segments of 1Mb or smaller that overlap by at least 100Kb and then integrate the separate results.

    Number_code  Race
    1            All Caucasian (HapMap Caucasian, Perlegen Caucasian)
    2            All African (HapMap African, Perlegen African American)
    3            All Asian (HapMap Chinese & Japanese, Perlegen Chinese)
    4            HapMap Caucasian
    5            HapMap African
    6            HapMap Chinese
    7            HapMap Japanese
    8            Perlegen Caucasian
    9            Perlegen African American
    10           Perlegen Chinese
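
When preparing query files with a script, it can help to keep the number codes in a small lookup so that a mistyped code is caught before upload. The following is only a convenience sketch: the dictionary values are copied from the table above, while the names RACE_CODES and race_label are hypothetical and not part of SNPselector.

```python
# Number codes copied from the SNPselector race table.
# RACE_CODES and race_label are hypothetical helper names for illustration.
RACE_CODES = {
    1: "All Caucasian (HapMap Caucasian, Perlegen Caucasian)",
    2: "All African (HapMap African, Perlegen African American)",
    3: "All Asian (HapMap Chinese & Japanese, Perlegen Chinese)",
    4: "HapMap Caucasian",
    5: "HapMap African",
    6: "HapMap Chinese",
    7: "HapMap Japanese",
    8: "Perlegen Caucasian",
    9: "Perlegen African American",
    10: "Perlegen Chinese",
}


def race_label(code):
    """Return the population label for a number code, or raise ValueError."""
    try:
        return RACE_CODES[int(code)]
    except (KeyError, ValueError):
        raise ValueError(f"unknown race number code: {code!r}")


if __name__ == "__main__":
    print(race_label(9))
```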

    Search (Human Genome: NCBI build 36.2)

    Get SNPs by dbSNP rs#

    Upload a text file containing a list of dbSNP accession IDs (e.g. rs228), one SNP ID per line. Please see the example file - snp.txt. You can upload an optional ABI SNP file and/or an optional Illumina SNP score file. The program will merge the ABI and/or Illumina data with the other SNP data by matching them on the same rs#.
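
A query file in this format can be checked locally before upload. The sketch below is an illustration only: it assumes that an rs# is the lowercase prefix "rs" followed by digits, and it skips blank lines and drops duplicates, which are choices made for this example rather than documented server behavior. The function name read_rs_ids is hypothetical.

```python
import re

# Assumption: a dbSNP rs# is "rs" followed by one or more digits.
RS_PATTERN = re.compile(r"^rs\d+$")


def read_rs_ids(lines):
    """Return unique rs IDs in input order; raise ValueError on a malformed line."""
    seen = set()
    ids = []
    for lineno, raw in enumerate(lines, start=1):
        rs = raw.strip()
        if not rs:
            continue  # blank lines are ignored in this sketch
        if not RS_PATTERN.match(rs):
            raise ValueError(f"line {lineno}: not a dbSNP rs#: {rs!r}")
        if rs not in seen:
            seen.add(rs)
            ids.append(rs)
    return ids


if __name__ == "__main__":
    # With a real file: read_rs_ids(open("snp.txt"))
    print(read_rs_ids(["rs228\n", "\n", "rs228\n"]))
```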

     

    Email  

    Upload Query File: 

    ABI SNP file (Optional): 

    Illumina score file (Optional): 

    Get gene SNPs by gene names

    Upload a text file containing a list of gene names, one gene per line. Each line has 4 tab-delimited fields: gene_name, 5'_Flanking(bp), 3'_Flanking(bp), SNPnum_perGene. Please see the example file - gene.txt. You can upload an optional ABI SNP file and/or an optional Illumina SNP score file. The program will merge the ABI and/or Illumina data with the other SNP data by matching them on the same rs#.
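
The four tab-delimited fields described above can be assembled with a few lines of code. This is a minimal sketch: gene_line is a hypothetical helper, and the flanking sizes and SNP count shown for ACE are made-up illustration values. The example file gene.txt may carry additional columns (for instance the race, r2, and MAF settings mentioned in the notes), which this sketch does not cover.

```python
def gene_line(gene_name, flank5_bp, flank3_bp, snps_per_gene):
    """Format one query line: gene_name, 5' flank (bp), 3' flank (bp), SNPs per gene."""
    if not gene_name or "\t" in gene_name:
        raise ValueError("gene_name must be non-empty and contain no tab")
    numbers = (int(flank5_bp), int(flank3_bp), int(snps_per_gene))
    if any(n < 0 for n in numbers):
        raise ValueError("flanking sizes and SNP count must be non-negative")
    return "\t".join([gene_name] + [str(n) for n in numbers])


if __name__ == "__main__":
    # Hypothetical values: 2000 bp of flanking sequence on each side, 10 SNPs.
    print(gene_line("ACE", 2000, 2000, 10))
```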

     

    Email  

    Upload Query File: 

    ABI SNP file (Optional): 

    Illumina score file (Optional): 

    LD selection parameters (Optional):

    User defined tag SNP file: 

    Get gene SNPs by genome regions

    Upload a text file containing a list of genome regions, one region per line. Each line has 4 tab-delimited fields: Chr_name, Chr_start, Chr_end, SNPnum_perGene. The program will find all SNPs associated with each Ensembl gene within the user-specified genome regions. Please see the example file - loca.txt. You can upload an optional ABI SNP file and/or an optional Illumina SNP score file. The program will merge the ABI and/or Illumina data with the other SNP data by matching them on the same rs#.
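
Matching on rs# can be pictured as a keyed join: every SNP row is kept, and the columns of an uploaded file are attached wherever the rs# agrees. The sketch below illustrates that idea only and is not the program's actual code; the function merge_by_rs and the column names "rs", "chr", and "score" are hypothetical, since the layouts of the ABI and Illumina files are not described here.

```python
def merge_by_rs(snp_rows, extra_rows, prefix):
    """Left-join extra_rows onto snp_rows using the 'rs' key.

    Columns from extra_rows are copied with the given prefix so they
    cannot overwrite columns that already exist in snp_rows.
    """
    extra_by_rs = {row["rs"]: row for row in extra_rows}
    merged = []
    for row in snp_rows:
        out = dict(row)
        match = extra_by_rs.get(row["rs"], {})
        for key, value in match.items():
            if key != "rs":
                out[f"{prefix}_{key}"] = value
        merged.append(out)
    return merged


if __name__ == "__main__":
    # Hypothetical rows for illustration only.
    snps = [{"rs": "rs228", "chr": "1"}, {"rs": "rs1234", "chr": "2"}]
    illumina = [{"rs": "rs228", "score": "0.9"}]
    for row in merge_by_rs(snps, illumina, "illumina"):
        print(row)
```

SNPs with no match in the uploaded file simply keep their original columns, which is why a left join is used in this sketch.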

     

    Email  

    Upload Query File: 

    ABI SNP file (Optional): 

    Illumina score file (Optional): 

    LD selection parameters (Optional):

    User defined tag SNP file: 

    Get genome scan SNPs by genome regions

    PLEASE LIMIT EACH REGION TO 1Mb. TO SCAN A CONTIGUOUS REGION OF >1Mb, USE SEPARATE 1Mb REGIONS OVERLAPPING BY AT LEAST 100Kb. To use this method, upload a text file containing a list of genome regions, one region per line. Each line has 4 tab-delimited fields: Chr_name, Chr_start, Chr_end, Spacing(bp). The "Spacing" field specifies the desired average spacing between SNPs. Please see the example file - spac.txt. You can upload an optional ABI SNP file and/or an optional Illumina SNP score file. The program will merge the ABI and/or Illumina data with the other SNP data by matching them on the same rs#.
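
Splitting a larger region into overlapping segments is easy to script. The sketch below is one way to do it under stated assumptions: it treats 1Mb as 1,000,000 bp and 100Kb as 100,000 bp, and it emits lines in the four-field layout described above. The function tile_region, the chromosome name "7", and the coordinates are hypothetical; check spac.txt for the exact chromosome naming the server expects.

```python
def tile_region(chrom, start, end, spacing_bp,
                max_size=1_000_000, overlap=100_000):
    """Split [start, end] into segments of at most max_size bp.

    Consecutive segments overlap by `overlap` bp. Returns tab-delimited
    lines: Chr_name, Chr_start, Chr_end, Spacing(bp).
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    if not 0 <= overlap < max_size:
        raise ValueError("overlap must be smaller than max_size")
    lines = []
    seg_start = start
    while True:
        seg_end = min(seg_start + max_size, end)
        lines.append("\t".join(
            [chrom, str(seg_start), str(seg_end), str(spacing_bp)]))
        if seg_end >= end:
            break
        # The next segment starts `overlap` bp before the current one ends.
        seg_start = seg_end - overlap
    return lines


if __name__ == "__main__":
    # Hypothetical 2.5Mb region with a desired average spacing of 5000 bp.
    for line in tile_region("7", 1_000_000, 3_500_000, 5000):
        print(line)
```

Because each result set covers an overlapping stretch, SNPs in the shared 100Kb windows will appear twice and need to be de-duplicated (for example by rs#) when the separate results are integrated.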

     

    Email  

    Upload Query File: 

    ABI SNP file (Optional): 

    Illumina score file (Optional): 

    LD selection parameters (Optional):

    User defined tag SNP file: 

     

    Last Modified: $Date: Fri Dec 14 10:37:33 EST 2007$