A gene is a protein-coding region of the genome.
The genome is the entirety of an organism's hereditary information.
Question??
The human genome contains 3164.7 million chemical nucleotide bases (A, C, T, and G).
The average gene consists of 3000 bases, but sizes vary greatly, with the largest known human gene containing 2.4 million bases.
The total number of genes is estimated at 30,000 to 35,000.
Less than 2% of the genome is used in protein coding.
At least 50% of the genome consists of repetitive sequences that do not code for proteins.
Gene to protein
DNA --(transcription)--> mRNA --(translation)--> Protein
DNA: CCTGAGCCAACTATTGATGAA
mRNA: CCUGAGCCAACUAUUGAUGAA
Protein: PEPTIDE
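The DNA, mRNA, and peptide strings above can be reproduced with a short sketch. It assumes the DNA string is the coding strand (so transcription reduces to replacing T with U) and uses the standard genetic code with one-letter amino acid symbols. The function names `transcribe` and `translate` are illustrative, not taken from the slides.

```python
# Standard genetic code, listed in the conventional U, C, A, G codon order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def transcribe(dna):
    """Coding-strand DNA -> mRNA (assumption: simply replace T with U)."""
    return dna.upper().replace("T", "U")

def translate(mrna):
    """Translate mRNA codon by codon, stopping at the first stop codon ('*')."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

mrna = transcribe("CCTGAGCCAACTATTGATGAA")
print(mrna)             # CCUGAGCCAACUAUUGAUGAA
print(translate(mrna))  # PEPTIDE
```

Each group of three bases (a codon) maps to one amino acid, which is why the reading frame matters: shifting the start by one base changes every codon.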
Protein-coding regions of DNA have been found to have a peak at frequency 2π/3 in their Fourier spectra. This is called the period-3 property.
The period-3 property is related to the different statistical distributions of codons between protein-coding and noncoding DNA sections.
The period-3 property can be used as a basis for identifying the coding and non-coding regions in a DNA sequence.
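One way to measure the period-3 property is sketched below. It assumes the common binary-indicator representation (one 0/1 sequence per base) and sums the squared magnitudes of each indicator's Fourier transform at frequency 2π/3. The slides do not specify the exact measure, so `period3_power` is a hypothetical helper.

```python
import cmath

def period3_power(seq):
    """Spectral power of a DNA string at frequency 2*pi/3.

    Assumption: each base gets a binary indicator sequence, and the
    squared magnitudes of the four transforms are added together.
    """
    total = 0.0
    for base in "ACGT":
        # Fourier coefficient of this base's indicator sequence at 2*pi/3.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * n / 3)
            for n, symbol in enumerate(seq.upper())
            if symbol == base
        )
        total += abs(coeff) ** 2
    return total

# Synthetic strings, not real genomic data: a period-3 repeat versus a
# period-4 repeat of the same length.
print(period3_power("ATG" * 40))
print(period3_power("ACGT" * 30))
```

The period-3 repeat gives a large value, while the period-4 repeat gives a value near zero. Real coding regions show a weaker but still detectable peak.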
Identification of protein coding regions
Prediction of the proper reading frame
Compared with traditional methods, signal processing methods are much quicker and, in some cases, can be even more accurate.
By mapping the chemical bases of DNA to a set of numbers, we obtain an effective “DNA signal”.
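As an illustration of such a mapping, the sketch below uses binary indicator sequences, one per base. This is only one possible number set and is an assumption here. The slides do not say which mapping they use, and `indicator_sequences` is a hypothetical name.

```python
def indicator_sequences(seq):
    """Map a DNA string to four 0/1 sequences, one per base (assumed mapping)."""
    seq = seq.upper()
    return {base: [1 if symbol == base else 0 for symbol in seq] for base in "ACGT"}

signals = indicator_sequences("CCTGAG")
print(signals["C"])  # [1, 1, 0, 0, 0, 0]
print(signals["G"])  # [0, 0, 0, 1, 0, 1]
```

Once the sequence is numeric, standard tools such as the discrete Fourier transform can be applied to each indicator sequence.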
A properly defined Fourier transform is a powerful predictor of both the existence and the reading frame of protein coding regions in DNA sequences.
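To locate candidate coding regions, the power at 2π/3 can be evaluated in a window that slides along the sequence. The sketch below is a minimal version under stated assumptions: binary indicator sequences, a rectangular window, and arbitrary `window` and `step` values chosen for the demonstration. It covers only the existence of a region, not the reading frame, and `period3_profile` is a hypothetical helper.

```python
import cmath

def period3_power(window_seq):
    """Sum over bases of |Fourier coefficient at 2*pi/3|^2 (indicator mapping assumed)."""
    total = 0.0
    for base in "ACGT":
        coeff = sum(
            cmath.exp(-2j * cmath.pi * n / 3)
            for n, symbol in enumerate(window_seq)
            if symbol == base
        )
        total += abs(coeff) ** 2
    return total

def period3_profile(seq, window=60, step=12):
    """Return (start, power) pairs for windows sliding along the sequence."""
    seq = seq.upper()
    return [
        (start, period3_power(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]

# Synthetic example, not real genomic data: a period-3 stretch flanked by
# period-4 stretches that carry no power at 2*pi/3.
sequence = "ACGT" * 30 + "ATG" * 40 + "ACGT" * 30
profile = period3_profile(sequence)
best_start, best_power = max(profile, key=lambda item: item[1])
print(best_start, best_power)
```

The profile peaks for windows that lie inside the period-3 stretch (positions 120 to 239 in this synthetic sequence). On real data the window length trades positional resolution against noise.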
Their respective color mapping schemes can help in visually identifying protein coding regions.
Challenges and Future Work
• Genomic signal processing opens a new signal processing frontier
• Sequence analysis: a DNA sequence is a symbolic (categorical) signal, so classical signal processing methods are not directly applicable
• The increasingly high dimensionality of genetic data sets and the complexity involved call for fast, high-throughput implementations of genomic signal processing algorithms
• Future work: spectral analysis of DNA sequences and clustering of microarray data; modifying classical signal processing methods and developing new ones