Software review (Open Access)

ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing

Kazuharu Misawa (1) and Naoyuki Kamatani (2)

(1) Research Program for Computational Science, Research and Development Group for Next-Generation Integrated Living Matter Simulation, Fusion of Data and Analysis Research and Development Team, RIKEN, 4-6-1 Shirokane-dai, Minato-ku, Tokyo 108-8639, Japan

(2) Laboratory for Statistical Analysis, RIKEN Center for Genomic Medicine, Tokyo, Japan


Source Code for Biology and Medicine 2009, 4:7. doi:10.1186/1751-0473-4-7

Published: 21 October 2009

Abstract

Background

Since more than a million single-nucleotide polymorphisms (SNPs) are analyzed in any given genome-wide association study (GWAS), performing multiple comparisons can be problematic. To cope with multiple-comparison problems in GWAS, haplotype-based algorithms were developed to correct for multiple comparisons at multiple SNP loci in linkage disequilibrium. A permutation test can also control problems inherent in multiple testing; however, both the calculation of exact probability and the execution of permutation tests are time-consuming. Faster methods for calculating exact probabilities and executing permutation tests are required.
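To make the permutation idea concrete, the following is a minimal sketch, not the ParaHaplo algorithm itself. It assumes chromosomes are coded as tuples of 0/1 alleles and case/control status as 1/0 labels, and it uses the largest per-SNP chi-square as the test statistic so that a single P-value covers all loci at once. The function names (`chi2_2x2`, `max_chi2`, `permutation_pvalue`) and the `(hits + 1) / (n_perm + 1)` estimator are illustrative choices, not taken from the paper.

```python
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # A table with an empty margin carries no signal; report 0.
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def max_chi2(haplotypes, labels):
    """Largest per-SNP chi-square over all loci (assumed statistic)."""
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case
    best = 0.0
    for j in range(len(haplotypes[0])):
        a = sum(h[j] for h, y in zip(haplotypes, labels) if y == 1)
        c = sum(h[j] for h, y in zip(haplotypes, labels) if y == 0)
        best = max(best, chi2_2x2(a, n_case - a, c, n_ctrl - c))
    return best


def permutation_pvalue(haplotypes, labels, n_perm=1000, seed=0):
    """Estimate a multiplicity-adjusted P-value by shuffling labels."""
    rng = random.Random(seed)
    observed = max_chi2(haplotypes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count permutations at least as extreme as the observed data.
        if max_chi2(haplotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Because every permutation recomputes the statistic at every locus, the cost grows with the product of the number of permutations and the number of SNPs, which is why such tests are slow at genome scale.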

Methods

We developed a set of computer programs for the parallel computation of accurate P-values in haplotype-based GWAS. Our program, ParaHaplo, is intended for workstation clusters that use Intel's implementation of the Message Passing Interface (MPI). We compared the performance of our algorithm with that of the regular permutation test on the JPT (Japanese in Tokyo) and CHB (Han Chinese in Beijing) population samples of HapMap.
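Permutation tests parallelize naturally because each permutation is independent. The sketch below illustrates that general pattern only: it does not reproduce ParaHaplo's MPI code, and it substitutes a toy statistic (absolute difference in group means) for the haplotype-based one. The names `split_permutations`, `count_hits`, `parallel_pvalue`, and the `map_fn` parameter are hypothetical. Each worker gets a share of the permutations and its own seed, and the per-worker hit counts are summed at the end, much as an MPI reduction would do across ranks.

```python
import random


def mean_difference(values, labels):
    """Toy statistic: |mean of cases - mean of controls|."""
    cases = [v for v, y in zip(values, labels) if y == 1]
    controls = [v for v, y in zip(values, labels) if y == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def split_permutations(total, n_workers):
    """Divide `total` permutations as evenly as possible among workers."""
    base, extra = divmod(total, n_workers)
    return [base + (1 if w < extra else 0) for w in range(n_workers)]


def count_hits(task):
    """Work done by one worker: run its share and return its hit count."""
    values, labels, observed, n_perm, seed = task
    rng = random.Random(seed)  # separate seed per worker (a simplification)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mean_difference(values, shuffled) >= observed:
            hits += 1
    return hits


def parallel_pvalue(values, labels, n_perm, n_workers, map_fn=map, base_seed=0):
    """Combine per-worker counts into one permutation P-value.

    `map_fn` defaults to the serial built-in `map`; a parallel map with
    the same calling convention can be passed instead.
    """
    observed = mean_difference(values, labels)
    shares = split_permutations(n_perm, n_workers)
    tasks = [(values, labels, observed, n, base_seed + w)
             for w, n in enumerate(shares)]
    total_hits = sum(map_fn(count_hits, tasks))  # the "reduce" step
    return (total_hits + 1) / (n_perm + 1)
```

To run the shares in separate processes on one machine, the `map` method of a `concurrent.futures.ProcessPoolExecutor` can be passed as `map_fn`. On a cluster, the same split-and-sum structure would map onto MPI ranks and a sum reduction.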

Results

ParaHaplo can detect smaller differences between two populations than SNP-based GWAS can. We also found that parallel-computing techniques made ParaHaplo 100-fold faster than a non-parallel version of the program.

Conclusion

ParaHaplo is a useful tool for conducting haplotype-based GWAS. Since the data sizes of such projects continue to increase, fast computation with parallel computing, such as that used in ParaHaplo, will become increasingly important. The executable binaries and program sources of ParaHaplo are available at the following address: http://sourceforge.jp/projects/parallelgwas/?_sl=1

