Publication: An audit of the PeptideAtlas database uncovers evidence for repurposed pseudogenes and co-opted retroviral ORFs.
| dc.contributor.author | Rodriguez, Jose Manuel | |
| dc.contributor.author | Maquedano, Miguel | |
| dc.contributor.author | Cerdán-Vélez, Daniel | |
| dc.contributor.author | Laguillo-Gómez, Andrea | |
| dc.contributor.author | Calvo, Enrique | |
| dc.contributor.author | Abascal, Federico | |
| dc.contributor.author | Vázquez, Jesús | |
| dc.contributor.author | Tress, Michael L | |
| dc.contributor.funder | Ministerio de Ciencia e Innovación (España) | |
| dc.contributor.funder | Unión Europea. Comisión Europea. NextGenerationEU | |
| dc.contributor.funder | Comunidad de Madrid (España) | |
| dc.contributor.funder | Fundación La Caixa | |
| dc.contributor.funder | Ministerio de Ciencia e Innovación. Centro de Excelencia Severo Ochoa (España) | |
| dc.date.accessioned | 2025-12-15T13:56:58Z | |
| dc.date.available | 2025-12-15T13:56:58Z | |
| dc.date.issued | 2025-11-21 | |
| dc.description.abstract | The human genome has been the subject of scrutiny for more than two decades, yet new protein coding genes are still being uncovered. Recently, ribosome profiling experiments have provided evidence for the translation of thousands of novel open reading frames (ORFs). To determine how many of these novel ORFs have peptide support, we carried out an in-depth investigation of an entire mass spectrometry proteomics database. We analysed the peptides housed in the human build of the PeptideAtlas database and identified reliable evidence for 35 potential coding genes not annotated in the Ensembl/GENCODE reference gene set. Evidence from complementary sources confirmed that 16 were almost certainly coding genes, but we believe that at least 14 are most likely to be undergoing aberrant translation. These 14 genes had reading frames that were not preserved beyond human and their peptides were restricted to cancers or cell lines. Remarkably, three of the sixteen likely coding genes were derived from endogenous retroviral ORFs and were expressed only in placenta. All three had evidence of purifying selection. Retroviral ORFs (syncytins) with distinct origins are expressed in almost all mammalian placentae and these results suggest that co-opted ORFs may also play an important role in placental development. Our analysis shows that proteomics data can be used in conjunction with evolutionary evidence to confirm the existence of new coding genes. The evidence suggests that both testis and placenta are the tissues most likely to express still to be identified coding genes, and that there may be other transposon-derived ORFs that have been co-opted as coding genes. The strong evidence for the translation of regions under dysregulated conditions has important implications for the annotation of coding genes and in the analysis of cancer and other degenerative diseases. The online version contains supplementary material available at 10.1186/s12864-025-12238-w. | |
| dc.description.peerreviewed | Yes | |
| dc.description.tableofcontents | This study was supported by competitive grants PID2021-122348NB-I00 and PID2024-155650NB-I00 funded by MICIU/AEI/10.13039/501100011033 and by “ERDF/EU”, PLEC2022-009298, PLEC2022-009235 and EQC2021-007053-P funded by MICIU/AEI/10.13039/501100011033 and by “European Union NextGenerationEU/PRTR”, and S2022/BMD-7333-CM (INMUNOVAR-CM) funded by Comunidad de Madrid. The project leading to these results has received funding from “la Caixa” Foundation under the project code LCF/PR/HR22/52420019. The CNIC is supported by the Instituto de Salud Carlos III (ISCIII), the Ministerio de Ciencia, Innovación y Universidades (MICIU) and the Pro CNIC Foundation, and is a Severo Ochoa Center of Excellence (grant CEX2020-001041-S funded by MICIU/AEI/10.13039/501100011033). | |
| dc.identifier.citation | BMC Genomics. 2025 Nov 21;26(1):1087. | |
| dc.identifier.journal | BMC GENOMICS | |
| dc.identifier.pubmedID | 41272456 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12105/27024 | |
| dc.language.iso | eng | |
| dc.publisher | BMC | |
| dc.relation.isreferencedby | PubMed | |
| dc.relation.projectID | info:eu-repo/grantAgreement/ES/PID2021-122348NB-I00 | |
| dc.relation.projectID | info:eu-repo/grantAgreement/ES/PID2024-155650NB-I00 | |
| dc.relation.projectID | info:eu-repo/grantAgreement/ES/MICIU/AEI/10.13039/501100011033 | |
| dc.relation.projectID | info:eu-repo/grantAgreement/ES/PLEC2022-009298 | |
| dc.relation.projectID | info:eu-repo/grantAgreement/ES/PLEC2022-009235 | |
| dc.relation.projectID | info:eu-repo/grantAgreement/ES/EQC2021-007053-P | |
| dc.relation.projectID | info:eu-repo/grantAgreement/ES/S2022/BMD-7333-CM | |
| dc.relation.projectID | info:eu-repo/grantAgreement/ES/LCF/PR/HR22/52420019 | |
| dc.relation.projectID | info:eu-repo/grantAgreement/ES/CEX2020-001041-S | |
| dc.relation.publisherversion | https://doi.org/10.1186/s12864-025-12238-w | |
| dc.repisalud.institucion | CNIC | |
| dc.repisalud.orgCNIC | CNIC::Grupos de investigación::Proteómica cardiovascular | |
| dc.rights.accessRights | open access | |
| dc.rights.license | Attribution-NonCommercial-NoDerivatives 4.0 International | |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | |
| dc.subject | Co-option | |
| dc.subject | Coding genes | |
| dc.subject | Endogenous retrovirus | |
| dc.subject | Proteomics | |
| dc.subject | Pseudogenes | |
| dc.title | An audit of the PeptideAtlas database uncovers evidence for repurposed pseudogenes and co-opted retroviral ORFs. | |
| dc.type | research article | |
| dc.type.hasVersion | VoR | |
| dspace.entity.type | Publication |
Files
Original bundle
- Name: An audit of the PeptideAtlas database_BMC Genomics_2025.pdf
- Size: 4.35 MB
- Format: Adobe Portable Document Format


