DSpace Repository

Evaluation of the performance of commonly applied global ancestry algorithms in complex spatial demographic scenarios

Simple record

dc.contributor Universitat de Vic - Universitat Central de Catalunya. Facultat de Ciències i Tecnologia
dc.contributor Universitat de Vic - Universitat Central de Catalunya. Màster Universitari en Anàlisi de Dades Òmiques
dc.contributor.author Roig Aubeso, Roger
dc.date.accessioned 2018-06-18T17:30:45Z
dc.date.available 2018-06-18T17:30:45Z
dc.date.created 2016-09
dc.date.issued 2016-09
dc.identifier.uri http://hdl.handle.net/10854/5470
dc.description Academic year 2015-2016
dc.description.abstract The development of new methods for inferring ancestral origins in human populations has attracted renewed interest among human population geneticists, both for better understanding recent human evolutionary history and for correcting for hidden population substructure in genome-wide association studies (GWAS). Algorithms for detecting population substructure have several limitations: their results depend on the assumptions of the algorithm, the type and number of DNA markers considered, the underlying demographic relationships among the populations considered, and the sample size of the target populations. With these concerns in mind, we have constructed an experimental framework for testing the performance of the algorithms currently applied to estimate population substructure. It starts by designing two idealized models of spatially structured populations (2D stepping stone and anisotropic). From each model we have generated a pool of 78 experimental datasets by simulating genomic molecular diversity with Fastsimcoal2 under various migration rate conditions, sampling individuals and populations, and applying different filtering strategies: Minor Allele Frequency (MAF) and Linkage Disequilibrium (LD). Those 78 datasets (PLINK bed files) have been processed to evaluate the response of algorithms commonly applied to SNP data for quantifying individual population substructure: Principal Components Analysis (smartPCA), Multidimensional Scaling (MDS-PLINK), Spatial Ancestry Analysis (SPA), ADMIXTURE and SNMF. For those algorithms whose output is a set of coordinates (PCA, MDS and SPA), we have evaluated the correlation (via Mantel and Procrustes tests) between the estimated coordinates and the geographic sampling coordinates of individuals in our original idealized models.
For ADMIXTURE and SNMF we have applied different methods for choosing the best number of ancestries, K, and we have used the CLUMPP software to compare their output matrices. This idealized framework has enabled us to establish the robustness of the five algorithms, identify the best-performing algorithms and determine the impact of the imposed conditions on the results of these programs. es
dc.format application/pdf es
dc.format.extent 83 p. es
dc.language.iso eng es
dc.rights This document is subject to this Creative Commons license es
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/es/ es
dc.subject.other Algorithms es
dc.subject.other Human population genetics es
dc.title Evaluation of the performance of commonly applied global ancestry algorithms in complex spatial demographic scenarios es
dc.type info:eu-repo/semantics/masterThesis es
dc.description.version Supervisor: Oscar Lao
dc.rights.accessRights info:eu-repo/semantics/openAccess es
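
The abstract mentions Mantel tests for correlating estimated coordinates (from PCA, MDS or SPA) with the geographic sampling coordinates. As a rough illustration only, and not the thesis's own implementation, a permutation-based Mantel test could be sketched as follows. The function names (`pairwise_distances`, `mantel_test`), the use of Euclidean distances and Pearson correlation, and the toy 3x3 grid are all assumptions made for this example.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distance matrix for a list of coordinate tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def upper_triangle(matrix, order=None):
    """Flatten the upper triangle, optionally relabelling rows/columns by `order`."""
    n = len(matrix)
    if order is None:
        order = list(range(n))
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """One-sided Mantel test: correlation of two distance matrices, with a
    p-value obtained by jointly permuting the rows and columns of dist_b."""
    rng = random.Random(seed)
    flat_a = upper_triangle(dist_a)
    observed = pearson(flat_a, upper_triangle(dist_b))
    n = len(dist_b)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        if pearson(flat_a, upper_triangle(dist_b, order)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data (hypothetical): individuals sampled on a 3x3 grid, and "estimated"
# coordinates that are a rotated, rescaled copy of the geographic ones.
geographic = [(float(x), float(y)) for x in range(3) for y in range(3)]
angle = math.radians(30)
estimated = [
    (0.5 * (x * math.cos(angle) - y * math.sin(angle)),
     0.5 * (x * math.sin(angle) + y * math.cos(angle)))
    for x, y in geographic
]

r, p = mantel_test(pairwise_distances(geographic), pairwise_distances(estimated))
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Because the test works on pairwise distances, it is insensitive to rotation and uniform scaling of the estimated coordinates, which is why the toy example yields a correlation of essentially 1.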

Full text of this document

This document is subject to this Creative Commons license