Cabrera-Alarcón, José Luis
author
García Martinez, Jorge
author
Enriquez, Jose Antonio
author
Sánchez-Cabo, Fátima
author
2022-05
Accurate detection of pathogenic single nucleotide variants (SNVs) is a key challenge in whole exome and whole genome sequencing studies. To date, several in silico tools have been developed to predict deleterious variants from this type of data. However, these tools have limited power to detect new pathogenic variants, especially in non-coding regions. In this study, we evaluate a new metric, the Shannon Entropy of Locus Variability (SELV), calculated as the Shannon entropy of the variant frequencies reported in genome-wide population studies at a given locus, as a predictor of potentially pathogenic variants in non-coding nuclear and mitochondrial DNA, and also in coding regions under a selective pressure other than that imposed by the genetic code, e.g., splice sites. For benchmarking, SELV was compared to predictors of pathogenicity in different genomic contexts. In nuclear non-coding DNA, SELV outperformed CDTS (AUC_SELV = 0.97 in the ROC curve and PR-AUC_SELV = 0.96 in the precision-recall curve). For non-coding mitochondrial variants, SELV outperformed HmtVar (AUC_SELV = 0.98 in the ROC curve and PR-AUC_SELV = 1.00 in the precision-recall curve). Moreover, SELV was compared against two state-of-the-art ensemble predictors of pathogenicity in splice sites, ada-score and rf-score, matching their overall performance in both the ROC curve (AUC_SELV = 0.95) and the precision-recall curve (PR-AUC_SELV = 0.97), with the advantage that SELV can be easily calculated for every position in the genome, as opposed to ada-score and rf-score. Therefore, we suggest that information about the observed genetic variability at a locus, as reported by large-scale population studies, could improve the prioritization of SNVs in splice sites and in non-coding regions.
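The abstract defines SELV only as the Shannon entropy of the variant frequencies reported at a locus. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes that the reference allele takes whatever frequency the alternate alleles leave over, that entropy is measured in bits (log base 2), and that no further normalization is applied; none of these details are stated in the abstract. The names `shannon_entropy` and `locus_entropy` are hypothetical.

```python
import math


def shannon_entropy(freqs, base=2.0):
    """Shannon entropy of a set of non-negative frequencies.

    Frequencies are renormalized to sum to 1; zero entries contribute nothing.
    """
    if any(f < 0 for f in freqs):
        raise ValueError("frequencies must be non-negative")
    total = sum(freqs)
    if total <= 0:
        raise ValueError("at least one frequency must be positive")
    entropy = 0.0
    for f in freqs:
        if f > 0:
            p = f / total
            entropy -= p * math.log(p, base)
    return entropy


def locus_entropy(alt_freqs):
    """Entropy of the allele distribution at one locus (illustrative only).

    alt_freqs: population frequencies of the alternate alleles at the locus.
    Assumption: the reference allele carries the remaining frequency.
    """
    alt_total = sum(alt_freqs)
    if alt_total > 1.0 + 1e-9:
        raise ValueError("alternate allele frequencies sum to more than 1")
    ref_freq = max(0.0, 1.0 - alt_total)
    return shannon_entropy([ref_freq, *alt_freqs])


# Illustrative frequencies (made up, not taken from the study):
conserved = locus_entropy([0.0001])       # almost no observed variation
variable = locus_entropy([0.30, 0.05])    # common variation at the locus
print(conserved < variable)
```

Under these assumptions, a locus with little observed variation in the population gets an entropy near zero, while a locus with common variants gets a higher value. This is the kind of signal the abstract proposes for prioritizing SNVs.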
Eur J Hum Genet. 2022 May;30(5):555-559
http://hdl.handle.net/20.500.12105/15668
35079159
10.1038/s41431-021-01034-1
1476-5438
European journal of human genetics : EJHG
Variant pathogenic prediction by locus variability: the importance of the current picture of evolution.