An efficient algorithm to perform multiple testing in epistasis screening

Van Lishout, François; Mahachie John, Jestinah M.; Gusareva, Elena S.; Urrea Gales, Víctor; Cleynen, Isabelle; Théâtre, Emilie; Charloteaux, Benoît; Calle, M. Luz; Wehenkel, Louis; Van Steen, Kristel

dc.contributor	Universitat de Vic. Escola Politècnica Superior
dc.contributor	Universitat de Vic. Grup de Recerca en Bioinformàtica i Estadística Mèdica
dc.contributor.author	Van Lishout, François
dc.contributor.author	Mahachie John, Jestinah M.
dc.contributor.author	Gusareva, Elena S.
dc.contributor.author	Urrea Gales, Víctor
dc.contributor.author	Cleynen, Isabelle
dc.contributor.author	Théâtre, Emilie
dc.contributor.author	Charloteaux, Benoît
dc.contributor.author	Calle, M. Luz
dc.contributor.author	Wehenkel, Louis
dc.contributor.author	Van Steen, Kristel
dc.date.accessioned	2013-06-06T16:26:12Z
dc.date.available	2013-06-06T16:26:12Z
dc.date.created	2013
dc.date.issued	2013
dc.identifier.citation	François Van Lishout, Jestinah M Mahachie John, Elena S Gusareva, Victor Urrea, Isabelle Cleynen, Emilie Théâtre, Benoît Charloteaux, Malu Luz Calle, Louis Wehenkel and Kristel Van Steen. "An efficient algorithm to perform multiple testing in epistasis screening." In: BMC Bioinformatics 2013, 14:138. doi:10.1186/1471-2105-14-138	ca_ES
dc.identifier.issn	1471-2105
dc.identifier.uri	http://hdl.handle.net/10854/2274
dc.description.abstract	Background: Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis and attempts to integrate different omics information sources into the epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the number of hypothesis tests. Gene-gene interaction studies require memory proportional to the square of the number of SNPs. A genome-wide epistasis search would therefore require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. In this work we present a new version of maxT, requiring an amount of memory independent of the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MBMDR-3.0.3. We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn’s disease. Results: In the case of a binary (affected/unaffected) trait, the parallel workflow of MBMDR-3.0.3 analyzes all gene-gene interactions with a dataset of 100,000 SNPs typed on 1000 individuals within 4 days and 9 hours, using 999 permutations of the trait to assess statistical significance, on a cluster composed of 10 blades, each containing four Quad-Core AMD Opteron(tm) 2352 processors at 2.1 GHz. In the case of a continuous trait, a similar run takes 9 days. Our program found 14 SNP-SNP interactions with a multiple-testing corrected p-value of less than 0.05 on real-life Crohn’s disease (CD) data.
Conclusions: Our software is the first implementation of the MB-MDR methodology able to solve large-scale SNP-SNP interaction problems within a few days, without using much memory, while adequately controlling the type I error rates. A new implementation aimed at genome-wide epistasis screening is under development. In the context of Crohn’s disease, MBMDR-3.0.3 could identify epistasis involving regions that are well known in the field and could be explained from a biological point of view. This demonstrates the power of our software to find relevant phenotype-genotype higher-order associations.	en
dc.description.sponsorship	This paper presents research results of the Belgian Network DYSCO (Dynamical Systems, Control, and Optimization), funded by the Interuniversity Attraction Poles Programme, initiated by the Belgian State, Science Policy Office. The scientific responsibility rests with its author(s). Their work was also supported in part by the IST Programme of the European Community, under the PASCAL2 Network of Excellence (Pattern Analysis, Statistical Modelling and Computational Learning), IST-2007-216886. FVL, LW and KVS also acknowledge support from Alma in Silico, funded by the European Commission and Walloon Region through the Interreg IV Program. For MC and VU, this work was partially supported by Grant MTM2008-06747-C02-02 from the Ministerio de Educación y Ciencia (Spain), Grant 050831 from La Marató de TV3 Foundation, and Grant 2009SGR-581 from AGAUR-Generalitat de Catalunya. VU is the recipient of a pre-doctoral FPU fellowship award from the Spanish Ministry of Education (MEC).
dc.format	application/pdf
dc.format.extent	10 p.	ca_ES
dc.language.iso	eng	ca_ES
dc.publisher	BioMed Central	ca_ES
dc.relation	MEC/PN2008-2011/MTM2008-06747-C02-00
dc.relation	AGAUR/2009-2014/2009SGR-581
dc.rights	This document is subject to this Creative Commons license	ca_ES
dc.rights.uri	http://creativecommons.org/licenses/by/3.0/es/	ca_ES
dc.subject.other	Bioinformatics	ca_ES
dc.subject.other	Genetic epidemiology	ca_ES
dc.subject.other	Biometrics	ca_ES
dc.title	An efficient algorithm to perform multiple testing in epistasis screening	en
dc.type	info:eu-repo/semantics/article	ca_ES
dc.identifier.doi	https://doi.org/10.1186/1471-2105-14-138
dc.rights.accessRights	info:eu-repo/semantics/openAccess	ca_ES
dc.type.version	info:eu-repo/publishedVersion	ca_ES
dc.indexacio	Indexed in SCOPUS
dc.indexacio	Indexed in WOS/JCR	ca_ES
dc.contribution.funder	Ministerio de Ciencia e Innovación (España)
dc.contribution.funder	Generalitat de Catalunya. Agència de Gestió d'Ajuts Universitaris i de Recerca