From phenotype to genotype: a Bayesian solution

Research output: Contribution to journal, Journal article, Research, peer-review

The study of biological systems commonly depends on inferring the state of a 'hidden' variable, such as an underlying genotype, from that of an 'observed' variable, such as an expressed phenotype. However, this cannot be achieved using traditional quantitative methods when more than one genetic mechanism exists for a single observable phenotype. Using a novel latent class Bayesian model, it is possible to infer the prevalence of different genetic elements in a population given a sample of phenotypes. As an exemplar, data comprising phenotypic resistance to six antimicrobials obtained from passive surveillance of Salmonella Typhimurium DT104 are analysed to infer the prevalence of individual resistance genes, as well as the prevalence of a genomic island known as SGI1 and its variants. Three competing models are fitted to the data and distinguished between using posterior predictive p-values to assess their ability to predict the observed number of unique phenotypes. The results suggest that several SGI1 variants circulate in a few fixed forms through the population from which our data were derived. The methods presented could be applied to other types of phenotypic data, and represent a useful and generic mechanism of inferring the genetic population structure of organisms.
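The model-checking step described in the abstract, comparing the observed number of unique phenotypes with the number predicted by a fitted model, can be illustrated with a small sketch. This is not the paper's latent class model: it assumes a much simpler stand-in in which each of the six resistances occurs independently, with a uniform Beta(1, 1) prior on each prevalence. The function names and the toy isolates are hypothetical and exist only to make the example run.

```python
import random


def count_unique(phenotypes):
    """Number of distinct resistance profiles in a sample."""
    return len(set(phenotypes))


def posterior_predictive_pvalue(observed, n_draws=2000, seed=0):
    """Share of replicate datasets with at least as many unique
    phenotypes as the observed data.

    Assumed stand-in model (NOT the paper's latent class model):
    each resistance is an independent Bernoulli trait with a
    Beta(1, 1) prior on its prevalence.
    """
    rng = random.Random(seed)
    n = len(observed)
    n_traits = len(observed[0])
    # Number of resistant isolates for each antimicrobial.
    resistant = [sum(row[j] for row in observed) for j in range(n_traits)]
    t_obs = count_unique(observed)

    at_least = 0
    for _ in range(n_draws):
        # One draw from the conjugate Beta posterior of each prevalence.
        p = [rng.betavariate(1 + k, 1 + n - k) for k in resistant]
        # Simulate a replicate sample of the same size as the data.
        replicate = [
            tuple(int(rng.random() < pj) for pj in p) for _ in range(n)
        ]
        if count_unique(replicate) >= t_obs:
            at_least += 1
    return at_least / n_draws


# Hypothetical toy data: 40 isolates occurring in only two fixed profiles.
toy_isolates = [(1, 1, 1, 1, 1, 0)] * 25 + [(0, 0, 0, 0, 0, 0)] * 15
p_value = posterior_predictive_pvalue(toy_isolates)
print(p_value)
```

A p-value close to 0 or 1 means the model predicts too few or too many distinct phenotypes compared with the data. In the toy data the isolates occur in only two profiles, so a model with independent resistances would be expected to over-predict diversity.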

Original language: English
Journal: Proceedings. Biological sciences / The Royal Society
Volume: 278
Issue number: 1710
Pages (from-to): 1434-40
Number of pages: 7
ISSN: 0962-8452
Publication status: Published - 7 May 2011
Externally published: Yes

Research areas

  • Anti-Bacterial Agents; Bayes Theorem; Drug Resistance, Multiple, Bacterial; Genes, Bacterial; Genetic Heterogeneity; Genetics, Population; Genomic Islands; Genotype; Humans; Markov Chains; Models, Biological; Monte Carlo Method; Phenotype; Salmonella Infections; Salmonella typhimurium

ID: 137015385