A Genome-Based Model to Predict the Virulence of Pseudomonas aeruginosa Isolates

Pincus, Nathan B; Ozer, Egon A; Allen, Jonathan P; Nguyen, Marcus; Davis, James J; Winter, Deborah R; Chuang, Chih-Hsien; Chiu, Cheng-Hsun; Zamorano, Laura; Oliver, Antonio; Hauser, Alan R


dc.contributor.author	Pincus, Nathan B
dc.contributor.author	Ozer, Egon A
dc.contributor.author	Allen, Jonathan P
dc.contributor.author	Nguyen, Marcus
dc.contributor.author	Davis, James J
dc.contributor.author	Winter, Deborah R
dc.contributor.author	Chuang, Chih-Hsien
dc.contributor.author	Chiu, Cheng-Hsun
dc.contributor.author	Zamorano, Laura
dc.contributor.author	Oliver, Antonio
dc.contributor.author	Hauser, Alan R
dc.date.accessioned	2024-09-13T09:11:41Z
dc.date.available	2024-09-13T09:11:41Z
dc.date.issued	2020-07
dc.description.abstract	Variation in the genome of Pseudomonas aeruginosa, an important pathogen, can have dramatic impacts on the bacterium's ability to cause disease. We therefore asked whether it was possible to predict the virulence of P. aeruginosa isolates based on their genomic content. We applied a machine learning approach to a genetically and phenotypically diverse collection of 115 clinical P. aeruginosa isolates using genomic information and corresponding virulence phenotypes in a mouse model of bacteremia. We defined the accessory genome of these isolates through the presence or absence of accessory genomic elements (AGEs), sequences present in some strains but not others. Machine learning models trained using AGEs were predictive of virulence, with a mean nested cross-validation accuracy of 75% using the random forest algorithm. However, individual AGEs did not have a large influence on the algorithm's performance, suggesting instead that virulence predictions are derived from a diffuse genomic signature. These results were validated with an independent test set of 25 P. aeruginosa isolates whose virulence was predicted with 72% accuracy. Machine learning models trained using core genome single-nucleotide variants and whole-genome k-mers also predicted virulence. Our findings are a proof of concept for the use of bacterial genomes to predict pathogenicity in P. aeruginosa and highlight the potential of this approach for predicting patient outcomes. IMPORTANCE Pseudomonas aeruginosa is a clinically important Gram-negative opportunistic pathogen. P. aeruginosa shows a large degree of genomic heterogeneity both through variation in sequences found throughout the species (core genome) and through the presence or absence of sequences in different isolates (accessory genome). P. aeruginosa isolates also differ markedly in their ability to cause disease. In this study, we used machine learning to predict the virulence level of P. aeruginosa isolates in a mouse bacteremia model based on genomic content. We show that both the accessory and core genomes are predictive of virulence. This study provides a machine learning framework to investigate relationships between bacterial genomes and complex phenotypes such as virulence.	en
dc.description.sponsorship	This work was supported by the National Institute of General Medical Sciences (grants T32 GM008061 and T32 GM008152 [N.B.P.]), by the American Cancer Society (grant MRSG-13-220-01-MPC [E.A.O.]), and by the National Institute of Allergy and Infectious Diseases (grants R01 AI118257, R21 129167, K24 AI104831, and U19 AI135964 [A.R.H.]). J.J.D. and M.N. are supported by the United States Defense Advanced Research Projects Agency Friend or Foe program iSENTRY award (contract HR0011937807 [J.J.D.]) and by the U.S. National Institute of Allergy and Infectious Diseases Bacterial and Viral Bioinformatics Resource Center award (contract 75N93019C00076 [principal investigator Rick Stevens]). A.O. is supported by Instituto de Salud Carlos III, Subdireccion General de Redes y Centros de Investigacion Cooperativa, Ministerio de Economia y Competitividad, Spanish Network for Research in Infectious Diseases (REIPI RD16/0016/0004), cofinanced by the European Development Regional Fund "A way to achieve Europe" and operative program Intelligent Growth 2014-2020. This research was supported in part through the computational resources and staff contributions provided by the Genomics Compute Cluster, which is jointly supported by the Feinberg School of Medicine, the Center for Genetic Medicine, and Feinberg's Department of Biochemistry and Molecular Genetics, the Office of the Provost, the Office for Research, and Northwestern Information Technology. The Genomics Compute Cluster is part of Quest, Northwestern University's high-performance computing facility, with the purpose of advancing research in genomics. We acknowledge the University of Maryland School of Medicine Institute for Genome Sciences for performance of PacBio whole-genome sequencing.	es_ES
dc.format.number	4	es_ES
dc.format.page	e01527-20	es_ES
dc.format.volume	11	es_ES
dc.identifier.citation	Pincus NB, Ozer EA, Allen JP, Nguyen M, Davis JJ, Winter DR, et al. A Genome-Based Model to Predict the Virulence of Pseudomonas aeruginosa Isolates. mBio. 2020 Jul;11(4):e01527-20.	en
dc.identifier.doi	10.1128/mBio.01527-20
dc.identifier.issn	2150-7511
dc.identifier.journal	mBio	es_ES
dc.identifier.other	http://hdl.handle.net/20.500.13003/10165
dc.identifier.pubmedID	32843552	es_ES
dc.identifier.pui	L2005052842
dc.identifier.scopus	2-s2.0-85089927546
dc.identifier.uri	https://hdl.handle.net/20.500.12105/22849
dc.identifier.wos	572063000015
dc.language.iso	eng	en
dc.publisher	American Society for Microbiology (ASM)
dc.relation.publisherversion	https://dx.doi.org/10.1128/mBio.01527-20	en
dc.rights.accessRights	open access	en
dc.rights.license	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/	*
dc.subject	Pseudomonas aeruginosa
dc.subject	Genome analysis
dc.subject	Machine learning
dc.subject	Modeling
dc.subject	Prediction
dc.subject	Virulence
dc.title	A Genome-Based Model to Predict the Virulence of Pseudomonas aeruginosa Isolates	en
dc.type	research article	en
dspace.entity.type	Publication
relation.isPublisherOfPublication	30cd8aef-e018-40d1-b05e-19af778995bd
relation.isPublisherOfPublication.latestForDiscovery	30cd8aef-e018-40d1-b05e-19af778995bd
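
Illustrative note: the abstract reports a mean nested cross-validation accuracy for models trained on AGE presence/absence. The sketch below shows only the general shape of nested cross-validation on a binary presence/absence matrix. It is not the authors' pipeline: the data are synthetic, a simple k-nearest-neighbours classifier with Hamming distance stands in for the random forest, and every name (make_synthetic_isolates, nested_cv, k_grid, the fold counts) is a hypothetical choice made for illustration.

```python
import random
from statistics import mean

def hamming(a, b):
    """Number of AGEs whose presence/absence differs between two isolates."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training isolates closest to x (ties -> 0)."""
    order = sorted(range(len(train_X)), key=lambda i: hamming(train_X[i], x))
    nearest = order[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(fit_idx, eval_idx, X, y, k):
    """Fraction of eval isolates classified correctly from the fit isolates."""
    fit_X = [X[i] for i in fit_idx]
    fit_y = [y[i] for i in fit_idx]
    hits = sum(knn_predict(fit_X, fit_y, X[i], k) == y[i] for i in eval_idx)
    return hits / len(eval_idx)

def k_folds(indices, n_folds):
    """Split a list of indices into n_folds disjoint folds."""
    return [indices[f::n_folds] for f in range(n_folds)]

def nested_cv(X, y, outer=5, inner=3, k_grid=(1, 3, 5), seed=0):
    """Outer loop estimates accuracy; inner loop picks k on training data only."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    outer_scores = []
    for test_idx in k_folds(idx, outer):
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]

        def inner_score(k):
            scores = []
            for val_idx in k_folds(train_idx, inner):
                val = set(val_idx)
                fit_idx = [i for i in train_idx if i not in val]
                scores.append(accuracy(fit_idx, val_idx, X, y, k))
            return mean(scores)

        best_k = max(k_grid, key=inner_score)  # tuned without touching test_idx
        outer_scores.append(accuracy(train_idx, test_idx, X, y, best_k))
    return mean(outer_scores), outer_scores

def make_synthetic_isolates(n=40, n_ages=30, seed=1):
    """Synthetic data (assumption): the first half of the AGEs are weakly
    associated with the high-virulence label, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        p = 0.7 if label else 0.3
        row = [int(rng.random() < (p if j < n_ages // 2 else 0.5))
               for j in range(n_ages)]
        X.append(row)
        y.append(label)
    return X, y

X, y = make_synthetic_isolates()
mean_acc, fold_accs = nested_cv(X, y)
print(f"mean nested CV accuracy: {mean_acc:.2f}")
```

The key design point is that the held-out fold of the outer loop is never used when choosing the hyperparameter, so the reported mean is not biased by tuning.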

Collections

IdisBa - Instituto de Investigación Sanitaria Illes Balears (Baleares)
