DSpace/Manakin Repository

HRAS physical feature analysis: Predicting protein activation through a random forest classifier

Simple record

dc.contributor Universitat de Vic - Universitat Central de Catalunya. Màster Universitari en Anàlisi de Dades Òmiques
dc.contributor Universitat de Vic - Universitat Central de Catalunya. Facultat de Ciències i Tecnologia
dc.contributor.author González García, Jorge
dc.date.accessioned 2024-01-31T12:44:03Z
dc.date.available 2024-01-31T12:44:03Z
dc.date.created 2023-09-10
dc.date.issued 2023-09-10
dc.identifier.uri http://hdl.handle.net/10854/7719
dc.description Academic year 2022-2023 es
dc.description.abstract Over the past decade, increased computational capabilities have made it possible to address biological questions with data-driven methods, particularly where traditional techniques are limiting. We hypothesize that these computational methods can be used to predict enzyme activation status. To test this hypothesis, we selected a benchmark protein, human HRAS, and sourced a comprehensive set of experimentally labelled structures from available databases. Seven computationally inexpensive physical features were extracted from these structures at the alpha-carbon level of each amino acid and aligned to the canonical sequence so that each metric is expressed per residue. Three-dimensional tensors were then generated from every possible combination of these features. A random forest model was trained on t-SNE-preprocessed tensors to find the best-performing combination of features. Our results strongly suggest that activation status in human HRAS, and probably in other proteins, is mainly encoded in the electrostatic and van der Waals forces, with solvation forces playing a lesser role. These forces, when processed through machine learning models, offer substantial predictive capability. In contrast, methods based on the physical three-dimensional position of residues, such as coordinate-based data and pairwise root-mean-square deviation (RMSD), were not independently effective in distinguishing activation states. es
dc.format application/pdf es
dc.format.extent 41 p. es
dc.language.iso eng es
dc.rights This document is subject to this Creative Commons license es
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/4.0/deed.ca es
dc.subject.other Proteins -- Research es
dc.title HRAS physical feature analysis: Predicting protein activation through a random forest classifier es
dc.type info:eu-repo/semantics/masterThesis es
dc.description.version Academic tutor: Jordi Villà i Freixa.
dc.rights.accessRights info:eu-repo/semantics/openAccess es
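
The workflow summarized in the abstract (build one tensor per feature combination, embed it with t-SNE, score a random forest on the embedding) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the thesis code: the data are synthetic, the feature names in `FEATURES` are hypothetical, and the tensor shape, t-SNE settings, and 5-fold cross-validation are arbitrary choices. Because scikit-learn's `TSNE` cannot transform unseen samples, the embedding is fitted on all structures before cross-validation, which may make the scores optimistic.

```python
from itertools import combinations

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.manifold import TSNE
from sklearn.model_selection import cross_val_score

# Hypothetical feature names (the record only says seven features were used).
FEATURES = ["electrostatic", "van_der_waals", "coordinates"]


def make_synthetic_data(n_structures=60, n_residues=30, seed=0):
    """Return per-residue feature matrices and binary activation labels."""
    rng = np.random.default_rng(seed)
    labels = np.repeat([0, 1], n_structures // 2)
    data = {name: rng.normal(size=(n_structures, n_residues)) for name in FEATURES}
    # Assumption for the demo: only one feature carries the activation signal.
    data["electrostatic"] += 3.0 * labels[:, None]
    return data, labels


def score_combination(data, labels, combo, seed=0):
    """Cross-validated accuracy of a random forest on a t-SNE embedding."""
    # Tensor of shape (structures, residues, features in this combination).
    tensor = np.stack([data[name] for name in combo], axis=-1)
    flat = tensor.reshape(len(labels), -1)
    embedded = TSNE(
        n_components=2, perplexity=10, init="pca", random_state=seed
    ).fit_transform(flat)
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    return cross_val_score(clf, embedded, labels, cv=5).mean()


def rank_combinations(data, labels, max_size=2):
    """Score every feature combination up to max_size, best first."""
    results = []
    for size in range(1, max_size + 1):
        for combo in combinations(FEATURES, size):
            results.append((score_combination(data, labels, combo), combo))
    return sorted(results, reverse=True)


if __name__ == "__main__":
    data, labels = make_synthetic_data()
    for score, combo in rank_combinations(data, labels):
        print(f"{score:.2f}  {' + '.join(combo)}")
```

In this toy setting, combinations that include the informative feature should rank highest, which mirrors the kind of comparison the abstract describes without reproducing its data or results.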
