Articles

Spectrogram Analysis of Genomes

Sussillo, David; Kundaje, Anshul; Anastassiou, Dimitris

We performed frequency-domain analysis of the genomes of various organisms using tricolor spectrograms and identified several types of distinct visual patterns that characterize specific DNA regions. We relate these patterns and their frequency characteristics to the sequence characteristics of the DNA. In some cases, the spectrogram patterns could be related to the structure of the corresponding protein region by using public databases such as GenBank. Some patterns are explained by the biological nature of the corresponding regions, which relate to chromosome structure and protein coding, while others have as yet unknown biological significance. We found biologically meaningful patterns on scales ranging from millions of base pairs down to a few hundred base pairs. Chromosome-wide patterns include periodicities ranging from 2 to 300. The color of the spectrogram depends on the nucleotide content at specific frequencies and can therefore be used as a local indicator of CG content and other measures of relative base content. Several smaller-scale patterns were found to represent different types of domains made up of various tandem repeats.
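The idea of a tricolor spectrogram can be illustrated with a minimal sketch. It assumes the usual construction: each base gets a binary indicator sequence, the four indicators are mixed linearly into three color channels, and a sliding-window DFT magnitude is taken per channel. The `WEIGHTS` matrix, the function names, and the default `window` and `step` values are illustrative assumptions, not the values used by the authors.

```python
import numpy as np

BASES = "ATCG"

# Hypothetical colour weights (rows: R, G, B; columns: A, T, C, G).
# These are placeholders for illustration, not the authors' coefficients.
WEIGHTS = np.array([
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
])


def indicator_sequences(seq):
    """Return a (4, L) array of 0/1 indicators, one row per base in BASES."""
    codes = np.frombuffer(seq.upper().encode("ascii"), dtype=np.uint8)
    return np.stack([(codes == ord(b)).astype(float) for b in BASES])


def tricolor_spectrogram(seq, window=60, step=10):
    """Return an (n_freqs, n_windows, 3) RGB array with values in [0, 1]."""
    if len(seq) < window:
        raise ValueError("sequence is shorter than the analysis window")
    channels = WEIGHTS @ indicator_sequences(seq)            # (3, L)
    starts = range(0, channels.shape[1] - window + 1, step)
    frames = np.stack([channels[:, s:s + window] for s in starts], axis=1)
    spec = np.abs(np.fft.rfft(frames, axis=-1))              # (3, n_windows, n_freqs)
    spec[..., 0] = 0.0                                       # drop the DC term
    peak = spec.max()
    if peak > 0:
        spec = spec / peak                                   # normalise to [0, 1]
    return np.transpose(spec, (2, 1, 0))                     # frequency on the first axis


if __name__ == "__main__":
    image = tricolor_spectrogram("ATG" * 100, window=60, step=10)
    brightness = image.sum(axis=2)
    print("shape:", image.shape)
    print("brightest frequency bin in first window:", brightness[:, 0].argmax())
```

In this sketch, a sequence with a perfect period of 3 concentrates its energy in bin `window / 3`, which is the kind of periodicity commonly associated with protein-coding regions. The hue at each pixel reflects which bases contribute at that frequency, under the assumed weights.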

Files

  • 1687-6180-2004-790248.pdf (application/pdf, 8.63 MB)
  • 1687-6180-2004-790248.xml (application/xml, 3.63 KB)
  • 00a053988f2a786d6bb8b08f8292020c.zip (application/zip, 8.54 MB)

Also Published In

Title: EURASIP Journal on Advances in Signal Processing
DOI: https://doi.org/10.1155/S1110865704310048

More About This Work

Academic Units: Electrical Engineering
Publisher: Springer
Published Here: September 8, 2014