Bayesian prediction of bacterial growth temperature range based on genome sequences.

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Bayesian prediction of bacterial growth temperature range based on genome sequences. / Jensen, Dan B.; Vesth, Tammi C.; Hallin, Peter F.; Pedersen, Anders G.; Ussery, David W.

In: BMC Genomics, Vol. 13, Suppl. 7, 2012.

Harvard

Jensen, DB, Vesth, TC, Hallin, PF, Pedersen, AG & Ussery, DW 2012, 'Bayesian prediction of bacterial growth temperature range based on genome sequences', BMC Genomics, vol. 13, suppl. 7. https://doi.org/10.1186/1471-2164-13-s7-s3

APA

Jensen, D. B., Vesth, T. C., Hallin, P. F., Pedersen, A. G., & Ussery, D. W. (2012). Bayesian prediction of bacterial growth temperature range based on genome sequences. BMC Genomics, 13(Suppl. 7). https://doi.org/10.1186/1471-2164-13-s7-s3

Vancouver

Jensen DB, Vesth TC, Hallin PF, Pedersen AG, Ussery DW. Bayesian prediction of bacterial growth temperature range based on genome sequences. BMC Genomics. 2012;13 Suppl 7. https://doi.org/10.1186/1471-2164-13-s7-s3

Author

Jensen, Dan B. ; Vesth, Tammi C. ; Hallin, Peter F. ; Pedersen, Anders G. ; Ussery, David W. / Bayesian prediction of bacterial growth temperature range based on genome sequences. In: BMC Genomics. 2012 ; Vol. 13, Suppl. 7.

Bibtex

@article{7607c71783834894ad0bf81b2bf2c168,
title = "Bayesian prediction of bacterial growth temperature range based on genome sequences",
abstract = "The preferred habitat of a given bacterium can provide a hint of which types of enzymes of potential industrial interest it might produce. These might include enzymes that are stable and active at very high or very low temperatures. Being able to accurately predict this based on a genomic sequence would thus allow for an efficient and targeted search for production organisms, reducing the need for culturing experiments. This study found a total of 40 protein families useful for distinguishing between three thermophilicity classes (thermophiles, mesophiles and psychrophiles). The predictive performance of these protein families was compared to that of 87 basic sequence features (relative use of amino acids and codons, genomic and 16S rDNA AT content and genome size). When using na{\"i}ve Bayesian inference, it was possible to correctly predict the optimal temperature range with a Matthews correlation coefficient of up to 0.68. The best predictive performance was always achieved by including protein families as well as structural features, compared to either of these alone. A dedicated computer program was created to perform these predictions. This study shows that protein families associated with specific thermophilicity classes can provide effective input data for thermophilicity prediction, and that the na{\"i}ve Bayesian approach is effective for such a task. The program created for this study is able to efficiently distinguish between thermophilic, mesophilic and psychrophilic adapted bacterial genomes.",
author = "Jensen, {Dan B.} and Vesth, {Tammi C.} and Hallin, {Peter F.} and Pedersen, {Anders G.} and Ussery, {David W.}",
year = "2012",
doi = "10.1186/1471-2164-13-s7-s3",
language = "English",
volume = "13",
number = "Suppl 7",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central Ltd.",

}

RIS

TY - JOUR

T1 - Bayesian prediction of bacterial growth temperature range based on genome sequences.

AU - Jensen, Dan B.

AU - Vesth, Tammi C.

AU - Hallin, Peter F.

AU - Pedersen, Anders G.

AU - Ussery, David W.

PY - 2012

Y1 - 2012

N2 - The preferred habitat of a given bacterium can provide a hint of which types of enzymes of potential industrial interest it might produce. These might include enzymes that are stable and active at very high or very low temperatures. Being able to accurately predict this based on a genomic sequence would thus allow for an efficient and targeted search for production organisms, reducing the need for culturing experiments. This study found a total of 40 protein families useful for distinguishing between three thermophilicity classes (thermophiles, mesophiles and psychrophiles). The predictive performance of these protein families was compared to that of 87 basic sequence features (relative use of amino acids and codons, genomic and 16S rDNA AT content and genome size). When using naïve Bayesian inference, it was possible to correctly predict the optimal temperature range with a Matthews correlation coefficient of up to 0.68. The best predictive performance was always achieved by including protein families as well as structural features, compared to either of these alone. A dedicated computer program was created to perform these predictions. This study shows that protein families associated with specific thermophilicity classes can provide effective input data for thermophilicity prediction, and that the naïve Bayesian approach is effective for such a task. The program created for this study is able to efficiently distinguish between thermophilic, mesophilic and psychrophilic adapted bacterial genomes.

AB - The preferred habitat of a given bacterium can provide a hint of which types of enzymes of potential industrial interest it might produce. These might include enzymes that are stable and active at very high or very low temperatures. Being able to accurately predict this based on a genomic sequence would thus allow for an efficient and targeted search for production organisms, reducing the need for culturing experiments. This study found a total of 40 protein families useful for distinguishing between three thermophilicity classes (thermophiles, mesophiles and psychrophiles). The predictive performance of these protein families was compared to that of 87 basic sequence features (relative use of amino acids and codons, genomic and 16S rDNA AT content and genome size). When using naïve Bayesian inference, it was possible to correctly predict the optimal temperature range with a Matthews correlation coefficient of up to 0.68. The best predictive performance was always achieved by including protein families as well as structural features, compared to either of these alone. A dedicated computer program was created to perform these predictions. This study shows that protein families associated with specific thermophilicity classes can provide effective input data for thermophilicity prediction, and that the naïve Bayesian approach is effective for such a task. The program created for this study is able to efficiently distinguish between thermophilic, mesophilic and psychrophilic adapted bacterial genomes.

UR - http://www.scopus.com/inward/record.url?scp=84878771879&partnerID=8YFLogxK

U2 - 10.1186/1471-2164-13-s7-s3

DO - 10.1186/1471-2164-13-s7-s3

M3 - Journal article

C2 - 23282160

AN - SCOPUS:84878771879

VL - 13

IS - Suppl 7

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

ER -
