Computational Biology and Bioinformatics

Biology has to a large extent become an information science. The quantity of biological data now being generated, for example by high-throughput sequencing techniques, means that the data can only be processed meaningfully by computer.

The processing of the data and the subsequent search for patterns (sequence and structure) in DNA (genome), RNA and proteins help us to identify functional genomic components. To do this, however, programs need to be efficient so that run times can be minimized.

The group has a general focus on animal genomics, including non-coding RNAs (ncRNAs), structure and interactions, CRISPR, and analysis of high-throughput sequencing data. ncRNAs are rapidly becoming a central focus of genomic biology, and given that only ~1% of the (~3 billion base) mammalian genome encodes proteins, the potential for the genome to host many ncRNAs is large.

In our group we develop new computational methods (computational biology) and set up pipelines for genome annotation (bioinformatics). We relate our findings to diseases and other phenotypes. We work with animal models for human disease, and we study bacteria used in industrial contexts as cell factories, with the aim of understanding production yield.

The group hosts the Center for non-coding RNA in Technology and Health, which takes a new approach to disease studies: it searches for ncRNAs and structured RNAs as disease components and biomarkers by developing in silico search tools for ncRNA analysis, complemented by experimental analysis and further functional studies. The disease focus is on inflammatory diseases and diabetes, using human and animal material.

Recent selected publications:

  • CRISPRroots: on- and off-target assessment of RNA-seq data in CRISPR-Cas9 edited cells. Corsi GI, Gadekar VP, Gorodkin J*, Seemann SE*. Nucleic Acids Res. 2021.
  • A non-enzymatic, isothermal strand displacement and amplification assay for rapid detection of SARS-CoV-2 RNA. Mohammadniaei M, Zhang M, Ashley J, Christensen UB, Friis-Hansen LJ, Gregersen R, Lisby JG, Benfield TL, Nielsen FE, Henning Rasmussen J, Pedersen EB, Olinger ACR, Kolding LT, Naseri M, Zheng T, Wang W, Gorodkin J, Sun Y. Nature Communications 2021, 12(1):5089.
  • Enhancing CRISPR-Cas9 gRNA efficiency prediction by data integration and deep learning. Xiang X, Corsi GI, Anthon C, Qu K, Pan X, Liang X, Han P, Dong Z, Liu L, Zhong J, Ma T, Wang J, Zhang X, Jiang H, Xu F, Liu X, Xu X, Wang J, Yang H, Bolund L, Church GM, Lin L, Gorodkin J*, Luo Y*. Nature Communications 2021, 12(1):3238.
  • Human pathways in animal models: possibilities and limitations. Doncheva NT*, Palasca O, Yarani R, Litman T, Anthon C, Groenen MAM, Stadler PF, Pociot F, Jensen LJ*, Gorodkin J*. Nucleic Acids Res. 2021.
  • BSGatlas: a unified Bacillus subtilis genome and transcriptome annotation atlas with enhanced information access. Geissler AS, Anthon C, Alkan F, Gonzalez-Tortuero E, Poulsen LD, Kallehauge TB, Breuner A, Seemann SE, Vinther J, Gorodkin J. Microb Genom. 2021 Feb 4.
  • CRISPR-Cas9 off-targeting assessment with nucleic acid duplex energy parameters. Alkan F, Wenzel A, Anthon C, Havgaard JH, Gorodkin J. Genome Biol. 2018 Oct 26;19(1):177.
  • The identification and functional annotation of RNA structures conserved in vertebrates. Seemann SE, Mirza AH, Hansen C, Bang-Berthelsen CH, Garde C, Christensen-Dalsgaard M, Torarinsson E, Yao Z, Workman CT, Pociot F, Nielsen H, Tommerup N, Ruzzo WL, Gorodkin J. Genome Res. 2017 Aug;27(8):1371-1383.
  • RIsearch2: suffix array-based large-scale prediction of RNA–RNA interactions and siRNA off-targets. Alkan F, Wenzel A, Palasca O, Kerpedjiev P, Rudebeck A, Stadler PF, Hofacker IL, Gorodkin J. Nucleic Acids Res. 2017 May 5;45(8):e60.
  • RAIN: RNA–protein Association and Interaction Networks. Junge A, Refsgaard JC, Garde C, Pan X, Santos A, Alkan F, Anthon C, von Mering C, Workman CT, Jensen LJ, Gorodkin J. Database (Oxford). 2017 Jan 10;2017.