FUNCISNP: AN R/BIOCONDUCTOR TOOL INTEGRATING FUNCTIONAL NON-CODING DATASETS WITH GENETIC ASSOCIATION STUDIES TO IDENTIFY CANDIDATE REGULATORY SNPs

We established a workflow for integrating variants identified through genetic association studies with epigenomic data, including data generated by next-generation sequencing. The workflow is available as FunciSNP, a free software package in the R/Bioconductor software archive.

Abstract

Single nucleotide polymorphisms (SNPs) are increasingly used to tag genetic loci associated with phenotypes such as risk of complex diseases. Technically, this is done genome-wide, without prior restriction or knowledge of biological feasibility, in scans referred to as genome-wide association studies (GWAS). Depending on the linkage disequilibrium (LD) structure at a particular locus, such tagSNPs may be surrogates for many thousands of other SNPs, and it is difficult to distinguish those that may play a functional role in the phenotype from those that are simply genetically linked. Because a large proportion of tagSNPs have been identified within non-coding regions of the genome, distinguishing functional from non-functional SNPs has been an even greater challenge. A strategy was recently proposed that prioritizes surrogate SNPs based on non-coding chromatin and epigenomic mapping techniques that have become feasible with the advent of massively parallel sequencing. Here, we introduce an R/Bioconductor software package that enables the identification of candidate functional SNPs by integrating information from tagSNP locations, lists of linked SNPs from the 1000 Genomes Project, and locations of chromatin features that may have functional significance.
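
The integration described in the abstract can be illustrated with a minimal conceptual sketch. It is written in Python for brevity and is not FunciSNP's actual API or implementation: the names `Snp`, `Feature`, and `candidate_functional_snps`, the r² cutoff of 0.5, and all SNP identifiers and coordinates are hypothetical, chosen only to show the idea of keeping SNPs that are both in LD with a tagSNP and located inside a chromatin feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    """A SNP linked to a tagSNP (all values used below are made up)."""
    snp_id: str
    chrom: str
    pos: int     # genomic position
    r2: float    # LD (r^2) between this SNP and the tagSNP


@dataclass(frozen=True)
class Feature:
    """A chromatin feature interval, e.g. a peak from an epigenomic assay."""
    name: str
    chrom: str
    start: int   # inclusive
    end: int     # inclusive


def candidate_functional_snps(linked, features, min_r2=0.5):
    """Return (snp_id, r2, [feature names]) for each linked SNP that passes
    the LD cutoff and lies inside at least one feature, sorted by r2
    from highest to lowest. The default cutoff is an assumption."""
    hits = []
    for snp in linked:
        if snp.r2 < min_r2:
            continue  # too weakly linked to the tagSNP
        overlapping = [
            f.name
            for f in features
            if f.chrom == snp.chrom and f.start <= snp.pos <= f.end
        ]
        if overlapping:
            hits.append((snp.snp_id, snp.r2, overlapping))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


# Hypothetical inputs: SNPs linked to one tagSNP and three feature intervals.
linked = [
    Snp("rsA", "chr1", 1050, 0.90),  # strong LD, inside two features
    Snp("rsB", "chr1", 5000, 0.95),  # strong LD, no overlapping feature
    Snp("rsC", "chr1", 2010, 0.30),  # inside a feature, but weak LD
    Snp("rsD", "chr1", 2050, 0.60),  # moderate LD, inside one feature
]
features = [
    Feature("enhancer_mark", "chr1", 1000, 1100),
    Feature("open_chromatin", "chr1", 2000, 2100),
    Feature("open_chromatin_2", "chr1", 1040, 1060),
]

result = candidate_functional_snps(linked, features)
for snp_id, r2, names in result:
    print(snp_id, r2, names)
```

In this toy input only `rsA` and `rsD` survive: `rsB` is tightly linked but overlaps no feature, and `rsC` overlaps a feature but falls below the assumed LD cutoff. A real analysis would rely on an interval-overlap library rather than the nested loop used here.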


RELEVANCE TO OC

We have used this software repeatedly to publish papers proposing functional mechanisms in non-protein-coding regions associated with ovarian and other cancers.

PUBLICATION LINK