Accurate ab initio gene prediction in eukaryotes with Tiberius in multiple clades
Gabriel, L.; Bruna, T.; Kaur, A.; Krishnan, A.; Ortmann, F.; Salamov, A.; Talbot, S.; Becker, F.; Krieg, R.; Wheat, C. W.; Grigoriev, I. V.; Stanke, M.; Hoff, K. J.
Abstract
Eukaryotic genome annotation is currently bottlenecked by limitations in the generality, scalability and accuracy of computational methods. Deep learning approaches have recently achieved large improvements in ab initio gene prediction accuracy. We extend the deep learning-based ab initio gene predictor Tiberius beyond mammals by training lineage-specific models for Mesangiospermae, Fungi, Vertebrata, Insecta, Chlorophyta and Bacillariophyta. Across a benchmark of 33 species, Tiberius consistently achieves higher accuracy than the other evaluated ab initio methods, Helixer and ANNEVO, while also having the fastest runtimes overall. Compared with BRAKER3, which incorporates RNA-Seq and protein evidence, Tiberius approaches state-of-the-art accuracy in Mesangiospermae, Fungi, Bacillariophyta and Chlorophyta, while being on average 80 times faster when using a GPU.
Availability and implementation: https://github.com/Gaius-Augustus/Tiberius