2024-03-29T05:48:50Z
http://repisalud.isciii.es/oai/request
oai:repisalud.isciii.es:20.500.12105/15668
2024-01-30T19:03:39Z
com_20.500.12105_2145
com_20.500.12105_2051
com_20.500.12105_2144
col_20.500.12105_2146
Repisalud
author
Cabrera-Alarcón, José Luis
author
García Martinez, Jorge
author
Enriquez, Jose Antonio
author
Sánchez-Cabo, Fátima
funder
Ministerio de Ciencia, Innovación y Universidades (España)
funder
Fundación ProCNIC
funder
Ministerio de Ciencia e Innovación. Centro de Excelencia Severo Ochoa (España)
2023-03-17T12:08:32Z
2023-03-17T12:08:32Z
2022-05
Eur J Hum Genet. 2022 May;30(5):555-559
http://hdl.handle.net/20.500.12105/15668
35079159
10.1038/s41431-021-01034-1
1476-5438
European journal of human genetics : EJHG
Accurate detection of pathogenic single nucleotide variants (SNVs) is a key challenge in whole exome and whole genome sequencing studies. To date, several in silico tools have been developed to predict deleterious variants from this type of data. However, these tools have limited power to detect new pathogenic variants, especially in non-coding regions. In this study, we evaluate a new metric, the Shannon Entropy of Locus Variability (SELV), calculated as the Shannon entropy of the variant frequencies reported in genome-wide population studies at a given locus, as a predictor of potentially pathogenic variants in non-coding nuclear and mitochondrial DNA and also in coding regions under a selective pressure other than that imposed by the genetic code, e.g., splice sites. For benchmarking, SELV was compared to predictors of pathogenicity in different genomic contexts. In nuclear non-coding DNA, SELV outperformed CDTS (AUC_SELV = 0.97 in the ROC curve and PR-AUC_SELV = 0.96 in the precision-recall curve). For non-coding mitochondrial variants, SELV outperformed HmtVar (AUC_SELV = 0.98 in the ROC curve and PR-AUC_SELV = 1.00 in the precision-recall curve). Moreover, SELV was compared against two state-of-the-art ensemble predictors of pathogenicity in splice sites, ada-score and rf-score, matching their overall performance in both ROC (AUC_SELV = 0.95) and precision-recall curves (PR-AUC = 0.97), with the advantage that SELV, unlike ada-score and rf-score, can be easily calculated for every position in the genome. Therefore, we suggest that information about the genetic variability observed at a locus, as reported by large-scale population studies, could improve the prioritization of SNVs in splice sites and in non-coding regions.
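As a rough illustration of the metric described in the abstract, the Shannon entropy of the allele frequencies observed at one locus can be sketched as follows. This is a minimal sketch and not the authors' implementation: the function name `locus_entropy`, the use of base-2 logarithms, and the choice to normalize the reference and alternate allele frequencies so that they sum to one are all assumptions made here for illustration; the record does not specify these details.

```python
import math


def locus_entropy(frequencies):
    """Shannon entropy (in bits) of the allele frequencies at one locus.

    Hypothetical helper for illustration only. `frequencies` holds one
    non-negative value per allele observed at the locus (reference and
    alternates); values are normalized to sum to 1 before use.
    """
    if any(f < 0 for f in frequencies):
        raise ValueError("frequencies must be non-negative")
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("at least one frequency must be positive")

    entropy = 0.0
    for f in frequencies:
        p = f / total
        if p > 0:  # the term for p == 0 is taken as 0 by convention
            entropy -= p * math.log2(p)
    return entropy


# Invented frequencies for the four nucleotides at two example loci.
invariant_locus = [1.0, 0.0, 0.0, 0.0]      # no variation observed
variable_locus = [0.25, 0.25, 0.25, 0.25]   # all four alleles equally common

print(locus_entropy(invariant_locus))
print(locus_entropy(variable_locus))
```

Under these assumptions, a locus where population studies report no variation scores zero, and a locus where all four nucleotides are equally frequent reaches the maximum of 2 bits.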
eng
Variant pathogenic prediction by locus variability: the importance of the current picture of evolution.
journal article
URL
https://repisalud.isciii.es/bitstream/20.500.12105/15668/1/Variant%20pathogenic%20prediction%20Eur%20J%20Hum%20Genet%202022.pdf
File
MD5
654ec7cda459a9c7bdf7b27eeaba02aa
1001066
application/pdf
Variant pathogenic prediction Eur J Hum Genet 2022.pdf
URL
https://repisalud.isciii.es/bitstream/20.500.12105/15668/4/Variant%20pathogenic%20prediction%20Eur%20J%20Hum%20Genet%202022.pdf.txt
File
MD5
2e78e32caca1d94a6df10cbc2425385b
25804
text/plain
Variant pathogenic prediction Eur J Hum Genet 2022.pdf.txt