Efficient and scalable modelling of cotranscriptional RNA folding with deterministic and iterative RNA structure sampling
Courtney, E.; Choi, E.; Ward, M.; Lucks, J. B.
Abstract
RNA structure sampling is central to modelling RNA ensembles, yet stochastic sampling methods are non-exhaustive, scale poorly, and are biased towards low-free-energy structures, while current suboptimal folding approaches generate an unpredictable exponential number of structures. These limitations are particularly problematic for modelling cotranscriptional folding, where vectorial synthesis continuously reshapes the energy landscape during transcription, stabilising transient out-of-equilibrium structures. Here we introduce iterative sampling, a deterministic framework that enumerates unique RNA secondary structures in strict order of increasing free energy, enabling progressive and exhaustive exploration of the structure space up to an arbitrary stopping criterion. To implement this approach, we developed two scalable algorithms, iterative deepening and a persistent data structure approach, that incrementally traverse the expansion tree by evolving partial structures in place, avoiding redundant recomputation and fixed energy windows. Implemented in memerna, this approach achieves orders-of-magnitude speedups over existing tools (10x over ViennaRNA; 100x over RNAstructure). Integration within the sample-and-select framework (R2D2) improves structural diversity and identifies conformations with greater agreement with experimental data. Comprehensive sampling further enables direct comparison of equilibrium and cotranscriptionally restrained ensembles. Analysis of the resulting structural probability distributions uncovers kinetic traps and putative transcriptional pause sites, supporting an intuitive cotranscriptional folding mechanism in which local 3'-hairpin formation transiently stabilises upstream structure to delay large-scale rearrangement. Together, these results establish iterative sampling as a scalable and general framework for resolving out-of-equilibrium RNA cotranscriptional folding.
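To make the idea of enumerating structures in order of increasing free energy concrete, the following is a minimal illustrative sketch only. It is not the memerna implementation and uses neither iterative deepening nor persistent data structures. It assumes a toy Nussinov-style energy model (the `PAIR_ENERGY` values and `MIN_LOOP` are hypothetical, not nearest-neighbour parameters) and a plain best-first search over partial structures, in which each partial structure is ranked by the lowest energy any of its completions can reach. All function names are invented for this example.

```python
import heapq
from itertools import count

# Hypothetical toy energies per base pair (NOT a real nearest-neighbour model).
PAIR_ENERGY = {("G", "C"): -3, ("C", "G"): -3, ("A", "U"): -2,
               ("U", "A"): -2, ("G", "U"): -1, ("U", "G"): -1}
MIN_LOOP = 3  # minimum number of unpaired bases enclosed by a pair


def min_energy_table(seq):
    """E[i][j] = lowest toy energy of the half-open interval seq[i:j]."""
    n = len(seq)
    E = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n + 1):
            best = E[i + 1][j]  # base i left unpaired
            for k in range(i + MIN_LOOP + 1, j):  # base i paired with k
                e = PAIR_ENERGY.get((seq[i], seq[k]))
                if e is not None:
                    best = min(best, e + E[i + 1][k] + E[k + 1][j])
            E[i][j] = best
    return E


def to_dot_bracket(n, pairs):
    chars = ["."] * n
    for i, k in pairs:
        chars[i], chars[k] = "(", ")"
    return "".join(chars)


def iter_structures(seq):
    """Lazily yield (energy, dot_bracket) in non-decreasing energy order.

    Each heap entry is a partial structure: fixed pairs plus pending
    intervals. Its key is energy_so_far + sum of E over pending intervals,
    i.e. the best energy reachable from it, so complete structures are
    popped in sorted order and each structure is produced exactly once.
    """
    n = len(seq)
    E = min_energy_table(seq)
    tie = count()  # unique tiebreaker so tuples never compare further
    pending0 = ((0, n),) if n else ()
    heap = [(E[0][n], next(tie), 0, (), pending0)]
    while heap:
        bound, _, energy, pairs, pending = heapq.heappop(heap)
        if not pending:
            yield energy, to_dot_bracket(n, pairs)
            continue
        (i, j), rest = pending[0], pending[1:]
        rest_bound = bound - E[i][j]
        # Choice 1: base i stays unpaired.
        nxt = (((i + 1, j),) if i + 1 < j else ()) + rest
        heapq.heappush(heap, (rest_bound + E[i + 1][j], next(tie),
                              energy, pairs, nxt))
        # Choice 2: base i pairs with some k, splitting the interval.
        for k in range(i + MIN_LOOP + 1, j):
            e = PAIR_ENERGY.get((seq[i], seq[k]))
            if e is None:
                continue
            nxt = ((i + 1, k),) + (((k + 1, j),) if k + 1 < j else ()) + rest
            heapq.heappush(heap, (rest_bound + e + E[i + 1][k] + E[k + 1][j],
                                  next(tie), energy + e, pairs + ((i, k),),
                                  nxt))
```

Because `iter_structures` is a generator, a caller can stop at any criterion, for example `itertools.islice(iter_structures(seq), 100)` for the 100 lowest-energy structures, without fixing an energy window in advance. The heap in this sketch can grow large for long sequences; it is meant only to illustrate ordered, duplicate-free enumeration, not the scalability described in the abstract.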