Fast pairwise structural RNA alignments by pruning of the dynamical programming matrix

Research output: Contribution to journalJournal articleResearchpeer-review

Standard

Fast pairwise structural RNA alignments by pruning of the dynamical programming matrix. / Havgaard, Jakob Hull; Torarinsson, Elfar; Gorodkin, Jan.

In: PLoS Computational Biology, Vol. 3, No. 10, 2007, p. 1896-1908.

Research output: Contribution to journalJournal articleResearchpeer-review

Harvard

Havgaard, JH, Torarinsson, E & Gorodkin, J 2007, 'Fast pairwise structural RNA alignments by pruning of the dynamical programming matrix', PLoS Computational Biology, vol. 3, no. 10, pp. 1896-1908. https://doi.org/10.1371/journal.pcbi.0030193

APA

Havgaard, J. H., Torarinsson, E., & Gorodkin, J. (2007). Fast pairwise structural RNA alignments by pruning of the dynamical programming matrix. PLoS Computational Biology, 3(10), 1896-1908. https://doi.org/10.1371/journal.pcbi.0030193

Vancouver

Havgaard JH, Torarinsson E, Gorodkin J. Fast pairwise structural RNA alignments by pruning of the dynamical programming matrix. PLoS Computational Biology. 2007;3(10):1896-1908. https://doi.org/10.1371/journal.pcbi.0030193

Author

Havgaard, Jakob Hull ; Torarinsson, Elfar ; Gorodkin, Jan. / Fast pairwise structural RNA alignments by pruning of the dynamical programming matrix. In: PLoS Computational Biology. 2007 ; Vol. 3, No. 10. pp. 1896-1908.

Bibtex

@article{869354c0a1c311ddb6ae000ea68e967b,
title = "Fast pairwise structural RNA alignments by pruning of the dynamical programming matrix",
abstract = "It has become clear that noncoding RNAs (ncRNA) play important roles in cells, and emerging studies indicate that there might be a large number of unknown ncRNAs in mammalian genomes. There exist computational methods that can be used to search for ncRNAs by comparing sequences from different genomes. One main problem with these methods is their computational complexity, and heuristics are therefore employed. Two heuristics are currently very popular: pre-folding and pre-aligning. However, these heuristics are not ideal, as pre-aligning is dependent on sequence similarity that may not be present and pre-folding ignores the comparative information. Here, pruning of the dynamical programming matrix is presented as an alternative novel heuristic constraint. All subalignments that do not exceed a length-dependent minimum score are discarded as the matrix is filled out, thus giving the advantage of providing the constraints dynamically. This has been included in a new implementation of the FOLDALIGN algorithm for pairwise local or global structural alignment of RNA sequences. It is shown that time and memory requirements are dramatically lowered while overall performance is maintained. Furthermore, a new divide and conquer method is introduced to limit the memory requirement during global alignment and backtrack of local alignment. All branch points in the computed RNA structure are found and used to divide the structure into smaller unbranched segments. Each segment is then realigned and backtracked in a normal fashion. Finally, the FOLDALIGN algorithm has also been updated with a better memory implementation and an improved energy model. With these improvements in the algorithm, the FOLDALIGN software package provides the molecular biologist with an efficient and user-friendly tool for searching for new ncRNAs. The software package is available for download at http://foldalign.ku.dk.",
author = "Havgaard, {Jakob Hull} and Elfar Torarinsson and Jan Gorodkin",
year = "2007",
doi = "10.1371/journal.pcbi.0030193",
language = "English",
volume = "3",
pages = "1896--1908",
journal = "P L o S Computational Biology (Online)",
issn = "1553-734X",
publisher = "Public Library of Science",
number = "10",

}

RIS

TY - JOUR

T1 - Fast pairwise structural RNA alignments by pruning of the dynamical programming matrix

AU - Havgaard, Jakob Hull

AU - Torarinsson, Elfar

AU - Gorodkin, Jan

PY - 2007

Y1 - 2007

N2 - It has become clear that noncoding RNAs (ncRNA) play important roles in cells, and emerging studies indicate that there might be a large number of unknown ncRNAs in mammalian genomes. There exist computational methods that can be used to search for ncRNAs by comparing sequences from different genomes. One main problem with these methods is their computational complexity, and heuristics are therefore employed. Two heuristics are currently very popular: pre-folding and pre-aligning. However, these heuristics are not ideal, as pre-aligning is dependent on sequence similarity that may not be present and pre-folding ignores the comparative information. Here, pruning of the dynamical programming matrix is presented as an alternative novel heuristic constraint. All subalignments that do not exceed a length-dependent minimum score are discarded as the matrix is filled out, thus giving the advantage of providing the constraints dynamically. This has been included in a new implementation of the FOLDALIGN algorithm for pairwise local or global structural alignment of RNA sequences. It is shown that time and memory requirements are dramatically lowered while overall performance is maintained. Furthermore, a new divide and conquer method is introduced to limit the memory requirement during global alignment and backtrack of local alignment. All branch points in the computed RNA structure are found and used to divide the structure into smaller unbranched segments. Each segment is then realigned and backtracked in a normal fashion. Finally, the FOLDALIGN algorithm has also been updated with a better memory implementation and an improved energy model. With these improvements in the algorithm, the FOLDALIGN software package provides the molecular biologist with an efficient and user-friendly tool for searching for new ncRNAs. The software package is available for download at http://foldalign.ku.dk.

AB - It has become clear that noncoding RNAs (ncRNA) play important roles in cells, and emerging studies indicate that there might be a large number of unknown ncRNAs in mammalian genomes. There exist computational methods that can be used to search for ncRNAs by comparing sequences from different genomes. One main problem with these methods is their computational complexity, and heuristics are therefore employed. Two heuristics are currently very popular: pre-folding and pre-aligning. However, these heuristics are not ideal, as pre-aligning is dependent on sequence similarity that may not be present and pre-folding ignores the comparative information. Here, pruning of the dynamical programming matrix is presented as an alternative novel heuristic constraint. All subalignments that do not exceed a length-dependent minimum score are discarded as the matrix is filled out, thus giving the advantage of providing the constraints dynamically. This has been included in a new implementation of the FOLDALIGN algorithm for pairwise local or global structural alignment of RNA sequences. It is shown that time and memory requirements are dramatically lowered while overall performance is maintained. Furthermore, a new divide and conquer method is introduced to limit the memory requirement during global alignment and backtrack of local alignment. All branch points in the computed RNA structure are found and used to divide the structure into smaller unbranched segments. Each segment is then realigned and backtracked in a normal fashion. Finally, the FOLDALIGN algorithm has also been updated with a better memory implementation and an improved energy model. With these improvements in the algorithm, the FOLDALIGN software package provides the molecular biologist with an efficient and user-friendly tool for searching for new ncRNAs. The software package is available for download at http://foldalign.ku.dk.

U2 - 10.1371/journal.pcbi.0030193

DO - 10.1371/journal.pcbi.0030193

M3 - Journal article

C2 - 17937495

VL - 3

SP - 1896

EP - 1908

JO - P L o S Computational Biology (Online)

JF - P L o S Computational Biology (Online)

SN - 1553-734X

IS - 10

ER -

ID: 8096718