Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12105/9628
dSreg: a Bayesian model to integrate changes in splicing and RNA-binding protein activity
Marti-Gomez, Carlos CNIC | Lara-Pezzi, Enrique CNIC | Sanchez-Cabo, Fatima CNIC
Bioinformatics. 2020; 36(7):2134-2141
MOTIVATION: Alternative splicing (AS) is an important mechanism in the generation of transcript diversity across mammals. AS patterns are dynamically regulated during development and in response to environmental changes. Defects or perturbations in its regulation may lead to cancer or neurological disorders, among other pathological conditions. The regulatory mechanisms controlling AS in a given biological context are typically inferred using a two-step framework: differential AS analysis followed by enrichment methods. These strategies require setting rather arbitrary thresholds and are prone to error propagation from one step of the analysis to the next.
RESULTS: To overcome these limitations, we propose dSreg, a Bayesian model that integrates RNA-seq data with data on regulatory features, e.g. binding sites of RNA-binding proteins. dSreg identifies the key underlying regulators controlling AS changes and quantifies their activity while simultaneously estimating the changes in exon inclusion rates. dSreg increased both the sensitivity and the specificity of the identified AS changes in simulated data, even at low read coverage. dSreg also showed improved performance over traditional enrichment methods, such as over-representation analysis and gene set enrichment analysis, when analyzing a collection of RNA-binding protein knock-down experiments from ENCODE. dSreg makes it possible to integrate the large number of readily available low-coverage RNA-seq datasets for AS analysis and allows more cost-effective RNA-seq experiments.
AVAILABILITY AND IMPLEMENTATION: dSreg was implemented in Python using Stan and is freely available to the community at https://bitbucket.org/cmartiga/dsreg.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
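
The threshold sensitivity of the two-step framework described under MOTIVATION can be illustrated with a minimal sketch of over-representation analysis. This is not dSreg and not code from its repository: it only illustrates the traditional approach that the abstract contrasts dSreg with. The exon identifiers, the inclusion changes, the set of bound exons and the function names `hypergeom_sf` and `ora_pvalue` are all hypothetical and chosen for illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are 'successes'."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def ora_pvalue(changed, with_site, all_exons):
    """One-sided over-representation p-value for a single regulator.

    changed   -- exons called differentially spliced in step 1
    with_site -- exons carrying a binding site of the regulator
    all_exons -- background set of tested exons
    """
    all_exons = set(all_exons)
    changed = set(changed) & all_exons
    with_site = set(with_site) & all_exons
    overlap = len(changed & with_site)
    return hypergeom_sf(overlap, len(all_exons), len(with_site), len(changed))


# Hypothetical toy data: exon id -> estimated change in inclusion rate.
delta_psi = {"e1": 0.30, "e2": 0.12, "e3": -0.25, "e4": 0.02,
             "e5": 0.09, "e6": -0.01, "e7": 0.18, "e8": 0.00}
# Hypothetical set of exons bound by one RNA-binding protein.
bound_by_rbp = {"e1", "e2", "e3", "e7"}

# Step 1 uses a hard cutoff; step 2 tests enrichment on whatever passed it.
for threshold in (0.1, 0.2):
    changed = {e for e, d in delta_psi.items() if abs(d) >= threshold}
    p = ora_pvalue(changed, bound_by_rbp, delta_psi)
    print(f"threshold={threshold}: {len(changed)} changed exons, p={p:.3f}")
```

On this toy input the enrichment p-value depends strongly on the cutoff applied in the first step, and uncertainty in the estimated inclusion changes is discarded before the second step. This is the kind of arbitrariness and error propagation that a joint model is intended to avoid.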