<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1751-0473-5-5</ui>
   <ji>1751-0473</ji>
   <fm>
      <dochead>Software review</dochead>
      <bibl>
         <title>
            <p>ParaHaplo 2.0: a program package for haplotype-estimation and haplotype-based whole-genome association study using parallel computing</p>
         </title>
         <aug>
            <au ca="yes" id="A1">
               <snm>Misawa</snm>
               <fnm>Kazuharu</fnm>
               <insr iid="I1"/>
               <email>kazumisawa@riken.jp</email>
            </au>
            <au id="A2">
               <snm>Kamatani</snm>
               <fnm>Naoyuki</fnm>
               <insr iid="I2"/>
               <email>kamatani@msb.biglobe.ne.jp</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Research Program for Computational Science, Research and Development Group for Next-Generation Integrated Living Matter Simulation, and Fusion of Data and Analysis Research and Development Team, RIKEN, 4-6-1 Shirokane-dai, Minato-ku, Tokyo 108-8639, Japan</p>
            </ins>
            <ins id="I2">
               <p>Laboratory for Statistical Analysis, RIKEN Center for Genomic Medicine, Tokyo, Japan</p>
            </ins>
         </insg>
         <source>Source Code for Biology and Medicine</source>
         <issn>1751-0473</issn>
         <pubdate>2010</pubdate>
         <volume>5</volume>
         <issue>1</issue>
         <fpage>5</fpage>
         <url>http://www.scfbm.org/content/5/1/5</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">20525312</pubid>
               <pubid idtype="doi">10.1186/1751-0473-5-5</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>16</day>
               <month>4</month>
               <year>2010</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>4</day>
               <month>6</month>
               <year>2010</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>4</day>
               <month>6</month>
               <year>2010</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2010</year>
         <collab>Misawa and Kamatani; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The use of haplotype-based association tests can improve the power of genome-wide association studies. Since the observed genotypes are unordered pairs of alleles, haplotype phase must be inferred. However, estimating haplotype phase is time consuming. When millions of single-nucleotide polymorphisms (SNPs) are analyzed in a genome-wide association study, faster methods for haplotype estimation are required.</p>
            </sec>
            <sec>
               <st>
                  <p>Methods</p>
               </st>
               <p>We developed a program package for parallel computation of haplotype estimation. Our program package, ParaHaplo 2.0, is intended for use in workstation clusters using the Intel Message Passing Interface (MPI). We compared the performance of our algorithm to that of the regular permutation test using the Japanese in Tokyo, Japan (JPT) and Han Chinese in Beijing, China (CHB) samples of the HapMap dataset.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>With 256 processors, the parallel version of ParaHaplo 2.0 can estimate haplotypes approximately 100 times faster than the non-parallel version of ParaHaplo.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>ParaHaplo 2.0 is an invaluable tool for conducting haplotype-based genome-wide association studies (GWAS). The need for fast haplotype estimation using parallel computing will become increasingly important as the data sizes of such projects continue to increase. The executable binaries and program sources of ParaHaplo are available at the following address: <url>http://en.sourceforge.jp/projects/parallelgwas/releases/</url></p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification id="endnote" subtype="user_supplied_xml" type="bmc"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Recent advances in various high-throughput genotyping technologies have allowed us to test allele frequency differences between case and control populations on a genome-wide scale <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Genome-wide association studies (GWAS) are used to compare the frequency of alleles or genotypes of a particular variant between cases and controls for a particular disease across a given genome <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. More than a million single-nucleotide polymorphisms (SNPs) are analyzed in SNP-based GWAS. One difficulty faced when conducting SNP-based GWAS is performing corrections for multiple comparisons. Under the assumption that all SNPs are independent, a Bonferroni correction for a P value is usually used to account for multiple tests. When SNP loci are in linkage disequilibrium, Bonferroni corrections are known to be too conservative <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. As a result, SNP-based GWAS may exclude the truly significant SNPs from analysis <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>.</p>
         <p>To cope with problems related to multiple comparisons in GWAS, haplotype-based algorithms were developed to correct for multiple comparisons at multiple SNP loci in linkage disequilibrium <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. A permutation test can also help control inherent problems with multiple testing <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. The use of haplotype-based association tests can improve the power of GWAS <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. To conduct haplotype-based GWAS within a short time period, Misawa and Kamatani <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> developed ParaHaplo 1.0, a set of computer programs for the parallel computation of accurate P values in haplotype-based GWAS by using the MCMC <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> and RAT <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> algorithms.</p>
         <p>Despite this, haplotype estimation is still time consuming <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, and therefore, faster methods for haplotype estimation are required. We developed a software package for the parallel computation of haplotype estimation called ParaHaplo 2.0. ParaHaplo 2.0 contains all of the functions of ParaHaplo 1.0 <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Additionally, ParaHaplo 2.0 can conduct haplotype estimation by using the PHASE 2.1 <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> and SNPHAP 1.3.1 <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> algorithms. ParaHaplo 2.0 is based on the principle of data parallelism, a programming technique in which a large dataset is split into smaller ones that can be processed concurrently <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. ParaHaplo 2.0 is intended for use in workstation clusters using the Intel Message Passing Interface (MPI).</p>
         <p>Using ParaHaplo 2.0, we estimated haplotypes from the genotype data of the Japanese from Tokyo (JPT) and Han Chinese from Beijing (CHB); these data sets were obtained from the HapMap dataset <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. We also measured how the speed of haplotype estimation by parallel computation changes with the number of processors.</p>
      </sec>
      <sec>
         <st>
            <p>Implementation</p>
         </st>
         <sec>
            <st>
               <p>Software overview</p>
            </st>
            <p>ParaHaplo supports genotype data in the HapMap format <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> as well as in the BioBank Japan format <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. For input, ParaHaplo 2.0 also requires a file of haplotype block boundaries. ParaHaplo 2.0 conducts haplotype estimation by using the PHASE 2.1 <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> and SNPHAP 1.3.1 <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> algorithms. ParaHaplo 2.0 can also conduct haplotype-based GWAS, as version 1.0 does <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Parallel computing using MPI methods</p>
            </st>
            <p>ParaHaplo 2.0 is implemented in an MPI-C multithreaded package. The MPI package allows us to construct parallel computing programs on multiprocessors. The genome-wide polymorphism data is broken down into user-defined haplotype blocks, and the MPI_Bcast function is used to distribute a single block of haplotype data to each processor. Each processor executes the PHASE 2.1 <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> and SNPHAP 1.3.1 <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> algorithms and estimates the haplotypes of a single linkage disequilibrium (LD) block. Once the haplotypes of each LD block are completely estimated, the results are compiled into a single genome-wide dataset by using the MPI_Gatherv function. ParaHaplo 2.0 is compatible with OpenMPI 1.2.5 as well as with MPICH 1.2.7p1. Users can compile the source code using a GCC compiler or an Intel C compiler.</p>
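            <p>The distribute, estimate, and gather steps described above can be illustrated with a minimal sketch. It is not taken from the ParaHaplo source code and does not call MPI; it only imitates the data-parallel pattern in a single Python process. The round-robin assignment policy and the names assign_blocks, run_rank, gather, and stub_phase are assumptions made for illustration, and stub_phase is a placeholder that stands in for PHASE or SNPHAP.</p>
            <p>
```python
from typing import Callable, Dict, List, Sequence

Block = Sequence[str]  # hypothetical: one genotype string per SNP in the block


def assign_blocks(n_blocks: int, n_procs: int) -> List[List[int]]:
    """Round-robin assignment of block indices to ranks (an assumed policy)."""
    return [list(range(rank, n_blocks, n_procs)) for rank in range(n_procs)]


def run_rank(blocks: Sequence[Block], indices: Sequence[int],
             phase_block: Callable[[Block], str]) -> Dict[int, str]:
    """Work done by one rank: phase each assigned block independently."""
    return {i: phase_block(blocks[i]) for i in indices}


def gather(per_rank: Sequence[Dict[int, str]]) -> List[str]:
    """Merge per-rank results back into genome order, as a gather step would."""
    merged: Dict[int, str] = {}
    for partial in per_rank:
        merged.update(partial)
    return [merged[i] for i in sorted(merged)]


def stub_phase(block: Block) -> str:
    """Placeholder for PHASE or SNPHAP; it only reports the block size."""
    return "phased:" + str(len(block))


# Five toy blocks shared between two simulated ranks.
blocks = [["AG", "CT"], ["AA"], ["GT", "CC", "AT"], ["CT"], ["AG", "AG"]]
plan = assign_blocks(len(blocks), 2)
results = gather([run_rank(blocks, idx, stub_phase) for idx in plan])
print(plan)
print(results)
```
            </p>
            <p>Because no block depends on another, the ranks never need to exchange intermediate results; the only coordination is the initial distribution and the final gather, which is what makes the block-wise design suitable for data parallelism.</p>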
         </sec>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Hardware</p>
            </st>
            <p>Computation time was measured on a CentOS PC cluster at RIKEN. The program was compiled using an Intel C compiler. The numbers of processing units used were 1, 2, 4, 8, 16, 32, 64, 128, and 256.</p>
         </sec>
         <sec>
            <st>
               <p>Example data</p>
            </st>
            <p>An example of GWAS is presented here. We used ParaHaplo 2.0 to compare the genome-wide genotype data of JPT and CHB from HapMap <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>; the numbers of individuals therein were 44 and 45, respectively. Haplotype blocks were obtained as LD blocks, using the method outlined by Gabriel <it>et al</it>. <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> and by using the Haploview program <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. The entire genomes of JPT and CHB were divided into 106,149 haplotype blocks by Haploview <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. PHASE 2.1 does not work with a large number of SNPs <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B18">18</abbr></abbrgrp>; therefore, when the number of SNPs in an LD block was greater than 40, we split the block into sub-blocks of at most 40 SNPs.</p>
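            <p>The 40-SNP limit can be illustrated with a short sketch. The text does not state how the split positions were chosen, so cutting the block into consecutive runs of SNPs is an assumption here, and the function name split_block and the SNP identifiers are hypothetical.</p>
            <p>
```python
from typing import List, Sequence

MAX_SNPS = 40  # limit stated in the text for PHASE 2.1


def split_block(snp_ids: Sequence[str], max_snps: int = MAX_SNPS) -> List[List[str]]:
    """Cut one LD block into consecutive sub-blocks of at most max_snps SNPs."""
    return [list(snp_ids[start:start + max_snps])
            for start in range(0, len(snp_ids), max_snps)]


block = ["rs" + str(i) for i in range(95)]  # hypothetical SNP identifiers
sub_blocks = split_block(block)
print([len(b) for b in sub_blocks])  # prints [40, 40, 15]
```
            </p>
            <p>A block that already has 40 or fewer SNPs is returned unchanged as a single sub-block, so the same function can be applied to every block in the boundary file.</p>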
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Haplotype Estimation of JPT and CHB</p>
            </st>
            <p>Figure <figr fid="F1">1</figr> shows the result of haplotype phasing. Each line displays the SNP number, the position of the SNP in the chromosome, and the phased haplotypes at that SNP. Each haplotype column displays one haplotype. Individuals are separated by a tab; haplotypes are separated by a space. The data format is identical to that of the results from ParaHaplo 1.0 <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>The result of haplotype phasing</p>
               </caption>
               <text>
                  <p><b>The result of haplotype phasing</b>. The first column shows the SNP number. The second column shows the position of the SNP in the chromosome. The additional columns display phased haplotypes; each column shows a haplotype. Individuals are separated by a tab; haplotypes are separated by a space.</p>
               </text>
               <graphic file="1751-0473-5-5-1" hint_layout="double"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Calculation Time</p>
            </st>
            <p>The speedup ratio is the ratio of the computation time of a single processor to that of multiple processors. Table <tblr tid="T1">1</tblr> shows the elapsed times and the speedups obtained when ParaHaplo 2.0 was used for haplotype estimation on the genotype data of chromosome 22. As Table <tblr tid="T1">1</tblr> shows, the calculation time decreased as the number of processors increased. When 256 processors were used, ParaHaplo was more than 100 times faster than the non-parallel program.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Elapsed times and speedups obtained with ParaHaplo applied to the HapMap 3 JPT and CHB data of chromosome 22.</p>
               </caption>
               <tblbdy cols="8">
                  <r>
                     <c ca="left" cspan="8">
                        <p>
                           <b>Elapsed times and speedups obtained with ParaHaplo on the phasing process</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Number of Processing Units</b>
                        </p>
                     </c>
                     <c ca="left" cspan="6">
                        <p>
                           <b>Calculation Time</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Speedup Ratio</b>
                           <sup>a</sup>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>h</p>
                     </c>
                     <c ca="left">
                        <p>56</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>54</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="left">
                        <p>h</p>
                     </c>
                     <c ca="left">
                        <p>56</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>13</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>h</p>
                     </c>
                     <c ca="left">
                        <p>26</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>40</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>h</p>
                     </c>
                     <c ca="left">
                        <p>21</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>39</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>16</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>39</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>32</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>21</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>28</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>64</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>49</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>50</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>128</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>85</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>256</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>32</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>108</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>a</sup>Ratio of computation time of a single processor to computation time of multiple processors</p>
               </tblfn>
            </tbl>
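            <p>The speedup ratio defined above can be recomputed from the elapsed times in Table <tblr tid="T1">1</tblr>. The following sketch does so; the parallel efficiency (speedup divided by the number of processors) is a standard additional measure that is not reported in the table, and the names ELAPSED, to_seconds, speedup, and efficiency are chosen here for illustration only.</p>
            <p>
```python
from typing import Dict, Tuple

# Elapsed times from Table 1 as (hours, minutes, seconds), keyed by processor count.
ELAPSED: Dict[int, Tuple[int, int, int]] = {
    1: (9, 56, 54), 2: (4, 56, 13), 4: (2, 26, 40), 8: (1, 21, 39),
    16: (0, 39, 2), 32: (0, 21, 5), 64: (0, 11, 49), 128: (0, 7, 4),
    256: (0, 5, 32),
}


def to_seconds(hms: Tuple[int, int, int]) -> int:
    """Convert an (hours, minutes, seconds) triple to seconds."""
    hours, minutes, seconds = hms
    return hours * 3600 + minutes * 60 + seconds


def speedup(n_procs: int) -> float:
    """Single-processor time divided by the time on n_procs processors."""
    return to_seconds(ELAPSED[1]) / to_seconds(ELAPSED[n_procs])


def efficiency(n_procs: int) -> float:
    """Speedup per processor; 1.0 would be ideal linear scaling."""
    return speedup(n_procs) / n_procs


# One line per row of Table 1: processors, speedup, efficiency.
for n in sorted(ELAPSED):
    print(n, round(speedup(n), 1), round(efficiency(n), 2))
```
            </p>
            <p>With 256 processors the recomputed speedup rounds to 108, in agreement with the last row of the table, while the efficiency falls well below 1, which reflects the uneven block sizes considered in the Discussion.</p>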
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>We developed ParaHaplo 2.0, a set of computer programs for the parallel computation of haplotype estimation as well as of accurate P values in haplotype-based GWAS. ParaHaplo is intended for use in workstation clusters using the Intel MPI. By using ParaHaplo, we conducted haplotype estimation of the genotype data of JPT and CHB from the HapMap dataset <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>.</p>
         <sec>
            <st>
               <p>Parallel Computation of Haplotype-based GWAS</p>
            </st>
            <p>The results showed that, for haplotype estimation, the parallel version of ParaHaplo 2.0 was approximately 100 times faster than the non-parallel version of ParaHaplo 2.0. In this study, we used a total of 89 JPT and CHB individuals whose genotypes had been determined during the HapMap project <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. When a single processor was used, haplotype estimation for chromosome 22 took more than 9 h; if 9,000 individuals were to be analyzed under the same conditions, it would take approximately 1 month. However, if ParaHaplo 2.0 were used on a workstation cluster with 256 processors, the same analysis would take approximately 9 h.</p>
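            <p>The arithmetic behind this projection can be sketched as follows, assuming that computation time grows linearly with the number of individuals. That linear scaling is an assumption made for this illustration; the constants are the rounded values quoted in the text, and the function name projected_hours is hypothetical.</p>
            <p>
```python
SINGLE_CPU_HOURS = 9.0      # text: "more than 9 h" for 89 individuals
STUDY_INDIVIDUALS = 89
TARGET_INDIVIDUALS = 9000
SPEEDUP_256 = 100.0         # rounded speedup quoted in the text


def projected_hours(individuals: int, speedup: float = 1.0) -> float:
    """Scale run time linearly with sample size (an assumption), then divide by speedup."""
    scale = individuals / STUDY_INDIVIDUALS
    return SINGLE_CPU_HOURS * scale / speedup


serial = projected_hours(TARGET_INDIVIDUALS)
parallel = projected_hours(TARGET_INDIVIDUALS, SPEEDUP_256)
print(round(serial / 24, 1), "days on one processor")
print(round(parallel, 1), "hours on 256 processors")
```
            </p>
            <p>Under these assumptions the single-processor estimate is a little over five weeks and the 256-processor estimate is about 9 h, which is consistent with the approximate figures given above.</p>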
            <p>Algorithms for faster haplotype estimation, such as FastPHASE <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> and GERBIL <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, have been developed. However, we chose PHASE 2.1 <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> because it estimates haplotypes more accurately than these methods <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
            <p>Even when 256 processors were used, the speedup ratio was only 108 (Table <tblr tid="T1">1</tblr>) because of the variation in LD block size. Since ParaHaplo is based on data parallelism, the computation time of each haplotype estimation is approximately proportional to the number of SNPs within the LD block <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>; therefore, we believe that a large LD block may become a computational bottleneck. PHASE 2.1 <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> in ParaHaplo 2.0 does not work when the number of SNPs in a haplotype block is greater than 40. Because most of the SNPs in a large LD block are in strong LD, a smaller number of tag SNPs must be chosen in order to estimate haplotypes by using PHASE 2.1 <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Alternatively, SNPHAP 1.3.1 <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> in ParaHaplo 2.0 can be used.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>The results indicated that, when the number of processors is sufficient, ParaHaplo running in parallel is approximately 100 times faster than the non-parallel program. There are more than a million SNPs for which accurate and complete genotypes have been obtained <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, and more than ten thousand people are now being genotyped <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. The need for fast haplotype estimation using parallel computing will become increasingly important as the data sizes of such projects continue to increase.</p>
      </sec>
      <sec>
         <st>
            <p>Availability and Requirements</p>
         </st>
         <p>&#8226; <b>Project name</b>: ParaHaplo 2.0</p>
         <p>&#8226; <b>Project home page</b>: <url>http://sourceforge.jp/projects/parallelgwas/releases/46982</url></p>
         <p>&#8226; <b>Operating systems</b>: Platform independent</p>
         <p>&#8226; <b>Programming language</b>: Java and C</p>
         <p>&#8226; <b>Other requirements</b>: OpenMPI version 1.2.5, or MPICH version 1.2.7p1</p>
         <p>&#8226; <b>License</b>: MIT license</p>
         <p>&#8226; <b>Any restrictions to use by non-academics</b>: License required</p>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>RAT: Rapid Association Test; SPT: Standard Permutation Test; MCMC: Markov-chain Monte Carlo; JPT: Japanese from Tokyo; CHB: Han Chinese from Beijing.</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The authors declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>KM wrote the software and the manuscript, and NK supervised the project. Both authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The present study was supported in part by grants from the Research Project for Personalized Medicine (MEXT). This study was supported by the "Next-generation Integrated Living Matter Simulation" - a national project of the Ministry of Education, Culture, Sports, Science, and Technology (MEXT).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Genome-wide association studies for common diseases and complex traits</p>
            </title>
            <aug>
               <au>
                  <snm>Hirschhorn</snm>
                  <fnm>JN</fnm>
               </au>
               <au>
                  <snm>Daly</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>95</fpage>
            <lpage>108</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg1521</pubid>
                  <pubid idtype="pmpid" link="fulltext">15716906</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Functional SNPs in the lymphotoxin-alpha gene that are associated with susceptibility to myocardial infarction</p>
            </title>
            <aug>
               <au>
                  <snm>Ozaki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ohnishi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Iida</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sekine</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Yamada</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Tsunoda</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sato</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hori</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Tanaka</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2002</pubdate>
            <volume>32</volume>
            <fpage>650</fpage>
            <lpage>654</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1047</pubid>
                  <pubid idtype="pmpid" link="fulltext">12426569</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>ITPKC functional polymorphism associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms</p>
            </title>
            <aug>
               <au>
                  <snm>Onouchi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Gunji</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Burns</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Shimizu</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Newburger</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Yashiro</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yanagawa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Wakui</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Fukushima</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kishi</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Hamamoto</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Terai</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sato</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Ouchi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Saji</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nariai</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kaburagi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yoshikawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Tanaka</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nagai</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Cho</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Fujino</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sekine</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nakamichi</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Tsunoda</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kawasaki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hata</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2008</pubdate>
            <volume>40</volume>
            <fpage>35</fpage>
            <lpage>42</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng.2007.59</pubid>
                  <pubid idtype="pmcid">2876982</pubid>
                  <pubid idtype="pmpid" link="fulltext">18084290</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>An intronic SNP in a RUNX1 binding site of SLC22A4, encoding an organic cation transporter, is associated with rheumatoid arthritis</p>
            </title>
            <aug>
               <au>
                  <snm>Tokuhiro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Yamada</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kochi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Sawada</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nagasaki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ohtsuki</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ono</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Furukawa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Nagashima</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yoshino</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mabuchi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sekine</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Saito</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Takahashi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tsunoda</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yamamoto</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2003</pubdate>
            <volume>35</volume>
            <fpage>341</fpage>
            <lpage>348</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1267</pubid>
                  <pubid idtype="pmpid" link="fulltext">14608356</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations</p>
            </title>
            <aug>
               <au>
                  <snm>Misawa</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Fujii</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Yamazaki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Takahashi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Takasaki</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Yanagisawa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ohnishi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kamatani</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>J Hum Genet</source>
            <pubdate>2008</pubdate>
            <volume>53</volume>
            <fpage>789</fpage>
            <lpage>801</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s10038-008-0312-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">18651098</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>A fast method for computing high-significance disease association in large population-based studies</p>
            </title>
            <aug>
               <au>
                  <snm>Kimmel</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Shamir</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Am J Hum Genet</source>
            <pubdate>2006</pubdate>
            <volume>79</volume>
            <fpage>481</fpage>
            <lpage>492</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1086/507317</pubid>
                  <pubid idtype="pmcid">1559554</pubid>
                  <pubid idtype="pmpid" link="fulltext">16909386</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Evaluating associations of haplotypes with traits</p>
            </title>
            <aug>
               <au>
                  <snm>Schaid</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Genet Epidemiol</source>
            <pubdate>2004</pubdate>
            <volume>27</volume>
            <fpage>348</fpage>
            <lpage>364</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/gepi.20037</pubid>
                  <pubid idtype="pmpid" link="fulltext">15543638</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Efficient multilocus association testing for whole genome association studies using localized haplotype clustering</p>
            </title>
            <aug>
               <au>
                  <snm>Browning</snm>
                  <fnm>BL</fnm>
               </au>
               <au>
                  <snm>Browning</snm>
                  <fnm>SR</fnm>
               </au>
            </aug>
            <source>Genet Epidemiol</source>
            <pubdate>2007</pubdate>
            <volume>31</volume>
            <fpage>365</fpage>
            <lpage>375</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/gepi.20216</pubid>
                  <pubid idtype="pmpid" link="fulltext">17326099</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing</p>
            </title>
            <aug>
               <au>
                  <snm>Misawa</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kamatani</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Source Code Biol Med</source>
            <pubdate>2009</pubdate>
            <volume>4</volume>
            <fpage>7</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/1751-0473-4-7</pubid>
                  <pubid idtype="pmcid">2774321</pubid>
                  <pubid idtype="pmpid" link="fulltext">19845960</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>A comparison of phasing algorithms for trios and unrelated individuals</p>
            </title>
            <aug>
               <au>
                  <snm>Marchini</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Cutler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Patterson</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Stephens</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Eskin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Halperin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Qin</snm>
                  <fnm>ZS</fnm>
               </au>
               <au>
                  <snm>Munro</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Abecasis</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Donnelly</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Am J Hum Genet</source>
            <pubdate>2006</pubdate>
            <volume>78</volume>
            <fpage>437</fpage>
            <lpage>450</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1086/500808</pubid>
                  <pubid idtype="pmcid">1380287</pubid>
                  <pubid idtype="pmpid" link="fulltext">16465620</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Complete Sequence of the Duckweed (Lemna minor) Chloroplast Genome: Structural Organization and Phylogenetic Relationships to Other Angiosperms</p>
            </title>
            <aug>
               <au>
                  <snm>Mardanov</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Ravin</snm>
                  <fnm>NV</fnm>
               </au>
               <au>
                  <snm>Kuznetsov</snm>
                  <fnm>BB</fnm>
               </au>
               <au>
                  <snm>Samigullin</snm>
                  <fnm>TH</fnm>
               </au>
               <au>
                  <snm>Antonov</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Kolganova</snm>
                  <fnm>TV</fnm>
               </au>
               <au>
                  <snm>Skryabin</snm>
                  <fnm>KG</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2008</pubdate>
            <volume>66</volume>
            <issue>6</issue>
            <fpage>555</fpage>
            <lpage>564</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00239-008-9091-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">18463914</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>SNPHAP - A program for estimating frequencies of large haplotypes of SNPs</p>
            </title>
            <url>http://www-gene.cimr.cam.ac.uk/clayton/software/</url>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Parallel Computer Architecture: A Hardware/Software Approach</p>
            </title>
            <aug>
               <au>
                  <snm>Culler</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Gupta</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Singh</snm>
                  <fnm>JP</fnm>
               </au>
            </aug>
            <publisher>San Francisco, CA: Morgan Kaufmann Publishers</publisher>
            <pubdate>1997</pubdate>
         </bibl>
         <bibl id="B14">
            <title>
               <p>The International HapMap Project</p>
            </title>
            <aug>
               <au>
                  <cnm>The International HapMap Consortium</cnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>426</volume>
            <fpage>789</fpage>
            <lpage>796</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature02168</pubid>
                  <pubid idtype="pmpid" link="fulltext">14685227</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>The BioBank Japan Project</p>
            </title>
            <aug>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Clin Adv Hematol Oncol</source>
            <pubdate>2007</pubdate>
            <volume>5</volume>
            <fpage>696</fpage>
            <lpage>697</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">17982410</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>The structure of haplotype blocks in the human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Gabriel</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Schaffner</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Nguyen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Moore</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Roy</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Blumenstiel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>DeFelice</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lochner</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Faggart</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Liu-Cordero</snm>
                  <fnm>SN</fnm>
               </au>
               <au>
                  <snm>Rotimi</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Adeyemo</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Cooper</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ward</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Daly</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Altshuler</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>296</volume>
            <fpage>2225</fpage>
            <lpage>2229</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1069424</pubid>
                  <pubid idtype="pmpid" link="fulltext">12029063</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Haploview: analysis and visualization of LD and haplotype maps</p>
            </title>
            <aug>
               <au>
                  <snm>Barrett</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Fry</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Maller</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Daly</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>263</fpage>
            <lpage>265</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bth457</pubid>
                  <pubid idtype="pmpid" link="fulltext">15297300</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Missing data imputation and haplotype phase inference for genome-wide association studies</p>
            </title>
            <aug>
               <au>
                  <snm>Browning</snm>
                  <fnm>SR</fnm>
               </au>
            </aug>
            <source>Hum Genet</source>
            <pubdate>2008</pubdate>
            <volume>124</volume>
            <fpage>439</fpage>
            <lpage>450</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00439-008-0568-7</pubid>
                  <pubid idtype="pmcid">2731769</pubid>
                  <pubid idtype="pmpid" link="fulltext">18850115</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase</p>
            </title>
            <aug>
               <au>
                  <snm>Scheet</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Stephens</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Am J Hum Genet</source>
            <pubdate>2006</pubdate>
            <volume>78</volume>
            <fpage>629</fpage>
            <lpage>644</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1086/502802</pubid>
                  <pubid idtype="pmcid">1424677</pubid>
                  <pubid idtype="pmpid" link="fulltext">16532393</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>GERBIL: Genotype resolution and block identification using likelihood</p>
            </title>
            <aug>
               <au>
                  <snm>Kimmel</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Shamir</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <fpage>158</fpage>
            <lpage>162</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.0404730102</pubid>
                  <pubid idtype="pmcid">544046</pubid>
                  <pubid idtype="pmpid" link="fulltext">15615859</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Genome-wide association study of hematological and biochemical traits in a Japanese population</p>
            </title>
            <aug>
               <au>
                  <snm>Kamatani</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Matsuda</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Okada</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kubo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hosono</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Daigo</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kamatani</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2010</pubdate>
            <volume>42</volume>
            <fpage>210</fpage>
            <lpage>215</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng.531</pubid>
                  <pubid idtype="pmpid" link="fulltext">20139978</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>

