Information Theory and Multivariate Techniques for Analyzing DNA Sequence Data: An Example from Tomato Genes

Authors

  • Bal K Joshi North Carolina State University, Raleigh, North Carolina
  • Dilip R Panthee North Carolina State University, Raleigh, North Carolina

DOI:

https://doi.org/10.3126/njb.v1i1.3867

Keywords:

Diversity analysis, DNA sequences, principal component analysis, tomato genes

Abstract

DNA and amino acid sequences are strings of alphabetic symbols with no underlying metric. Information theory offers one solution to this sequence metric problem. The way DNA sequence complexity is reflected in phenotype stability might be useful for crop improvement. The Shannon-Weaver index (Shannon entropy, H') and the mutual information (MI) index were estimated from the DNA sequences of 22 tomato genes belonging to two gene families, namely disease resistance and fruit quality. The main objective was to use information theory and multivariate techniques to understand diversity among genes and to relate sequence complexity to phenotypes. The normalized H' value ranged from 0.429 to 0.461. The highest diversity was observed in the gene Crtr-B (beta carotene hydroxylase). Two principal components, which together accounted for 36.65% of the variation, placed these genes into four groups. Grouping of these genes by both principal component and cluster analyses clearly showed phenotype-level similarity within clusters. Sequence similarity among genes was observed within a family. Diversity assessment of genes using information theory should help relate sequence complexity to gene stability, for example the stability of resistance genes.
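The two indices named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the base-2 logarithm, the normalization by the logarithm of the alphabet size, and the use of a positional lag for MI are all assumptions made for illustration, so the output should not be expected to reproduce the normalized H' values reported above.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (in bits) of the symbol frequencies in seq."""
    n = len(seq)
    if n == 0:
        return 0.0
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def normalized_entropy(seq, alphabet_size=4):
    """Entropy divided by its maximum, log2(alphabet_size).

    Assumption: a four-letter nucleotide alphabet and this particular
    normalization; the paper may normalize differently.
    """
    return shannon_entropy(seq) / math.log2(alphabet_size)


def mutual_information(seq, lag=1):
    """Mutual information (in bits) between symbols `lag` positions apart.

    Assumption: MI is taken between positions within one sequence.
    """
    pairs = list(zip(seq, seq[lag:]))
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((left[a] / n) * (right[b] / n)))
    return mi


if __name__ == "__main__":
    example = "ATGGCGTACGTTAGC"  # hypothetical sequence, not from the paper
    print(normalized_entropy(example))
    print(mutual_information(example, lag=1))
```

Under these assumptions, a sequence with equal nucleotide frequencies has a normalized entropy of 1, and a sequence made of a single repeated nucleotide has an entropy of 0.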

Nepal Journal of Biotechnology, 2011, Vol. 1, No. 1 pp.1-9

How to Cite

Joshi, B. K., & Panthee, D. R. (2010). Information Theory and Multivariate Techniques for Analyzing DNA Sequence Data: An Example from Tomato Genes. Nepal Journal of Biotechnology, 1(1), 1–8. https://doi.org/10.3126/njb.v1i1.3867

Section

Original Research Articles