DNA DNA (deoxyribonucleic acid) and RNA (ribonucleic acid) are composed of linear chains of...

• DNA (deoxyribonucleic acid) and RNA (ribonucleic acid) are composed of linear chains of monomeric units of nucleotides

• A nucleotide has three parts: a sugar, a phosphate, and a base

• Four bases

• Two strands are complementary
• Base pairing: A-T; G-C
• Pyrimidine and purine form complementary H bonds
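The pairing rules above can be written as a small lookup table. This is only a minimal sketch; the names `PAIR`, `complement`, and `reverse_complement` are illustrative and do not come from the slides.

```python
# Watson-Crick pairing as stated on the slide: A-T and G-C.
PAIR = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}

def complement(strand):
    """Return the base-by-base complement of a DNA strand (same left-to-right order)."""
    return ''.join(PAIR[base] for base in strand)

def reverse_complement(strand):
    """Return the complementary strand read in its own 5'->3' direction."""
    return complement(strand)[::-1]

print(complement('ACTTAGG'))          # TGAATCC
print(reverse_complement('ACTTAGG'))  # CCTAAGT
```

The sketch assumes the input contains only the uppercase letters A, C, G, T; any other character raises a `KeyError`.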

Secondary Structure of DNA

• Genome
– The entire DNA of a cell is the genome
– Individual units coding for proteins or RNA are genes

– A gene starts with ATG, ends with one or two stop codons

– Called ORF (Open Reading Frame)

• Biological information
– Contained in the genome
– Encoded in nucleotide sequences of DNA or RNA
– Partitioned into discrete units, genes
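The ORF idea (start at ATG, read in-frame up to a stop codon) can be sketched as a simple scan. This is an illustration, not a real gene finder: it assumes the standard stop codons TAA, TAG, and TGA, looks at one strand only, and the names `STOP_CODONS`, `find_orfs`, and `min_codons` are made up for this example.

```python
STOP_CODONS = {'TAA', 'TAG', 'TGA'}  # assumption: standard genetic code

def find_orfs(dna, min_codons=1):
    """Return (start, end) index pairs of ORFs found on the given strand.

    An ORF here starts at ATG and runs in-frame to the first stop codon;
    `end` is the index just past the stop codon. Results are grouped by
    reading frame (0, 1, 2), not sorted by position.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == 'ATG':
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # close it and keep scanning
    return orfs

print(find_orfs('CCATGAAATGATAGCC'))  # [(2, 11)], i.e. ATGAAATGA
```

A fuller search would also scan the reverse complement strand and apply a much larger `min_codons` threshold to filter out chance matches.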

Genome

Genome Databases

• Completed genomes
– FTP site: ftp://ftp.ncbi.nlm.nih.gov/genomes/
– http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/allorg.html
– http://www.ebi.ac.uk/genomes/mot/index.html
– http:/pir.goergetown.edu/pirwww/search/genome.html

• Organism-specific databases
– http://www.unledu/stc-95/ResTools/biotools/biotools10.html
– http://www.fp.mcs.anl.gov/~gaasterland/genomes.html
– http://www.hgmp.mrc.ac.uk/GenomeWeb/genome-db.html
– http://www.bioinformatik.de/cgi-bin/browse/Catalog/Databases/Genome_Proejcts

Human Genome

• Human Genome Project

– Conceived in 1984, begun in 1990; working draft published in 2001, finished sequence announced in 2003, ahead of the original 2005 schedule

• What did the sequence reveal?
– 3 Bbp (3 billion base pairs)
– 24 distinct chromosomes: 22 autosomes plus two sex chromosomes (X, Y)
– Longest 250 Mbp, shortest 55 Mbp
– Mitochondrial genome: a circular DNA molecule of 16,569 bp (16.569 kbp)
– ~10^13 cells

• How much is 3 Bbp?
– A typical 11-pt font prints 60 nucleotides in a line of about 10 cm (~4 in)
– In this format, 3 Bbp would stretch about 5,000 km (~3,100 mi)
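The print-out estimate can be checked with a few lines of arithmetic. The sketch below takes 60 nucleotides per line, a line length of about 10 cm, and 3 billion base pairs as given; the function name `printed_length_km` is illustrative. With those figures the text comes to 5,000 km, which is roughly 3,100 mi.

```python
# Figures assumed from the slide: 60 nucleotides per ~10 cm line, 3 billion bp.
GENOME_BP = 3_000_000_000
BASES_PER_LINE = 60
LINE_LENGTH_CM = 10.0

def printed_length_km(n_bases, bases_per_line=BASES_PER_LINE, line_cm=LINE_LENGTH_CM):
    """Length in km of n_bases printed end to end as one continuous line."""
    lines = n_bases / bases_per_line
    return lines * line_cm / 100_000  # 100,000 cm per km

km = printed_length_km(GENOME_BP)
miles = km / 1.609344  # kilometres per mile
print(f'{km:,.0f} km, about {miles:,.0f} mi')
```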

Other Species

Organism                        Genome size (Mbp)   # of genes
Epstein-Barr virus                           0.17           80
E. coli                                       4.6        4,406
Yeast (S. cerevisiae)                        12.5        6,172
Nematode worm (C. elegans)                  100.3       19,099
Thale cress (A. thaliana)                   115.4       25,498
Fruit fly (D. melanogaster)                 128.3       13,601
Human (H. sapiens)                        3,223.0       20,500
Fugu (Takifugu rubripes)                    390.0       30,000
Wheat                                    16,000.0       30,000
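One way to read the table is as gene density, the number of genes per Mbp. The sketch below copies the table values into a list; the names `GENOMES` and `genes_per_mbp` are illustrative, and the numbers are only as accurate as the table itself.

```python
# (organism, genome size in Mbp, number of genes), copied from the table
GENOMES = [
    ('Epstein-Barr virus', 0.17, 80),
    ('E. coli', 4.6, 4406),
    ('Yeast', 12.5, 6172),
    ('Nematode worm', 100.3, 19099),
    ('Thale cress', 115.4, 25498),
    ('Fruit fly', 128.3, 13601),
    ('Human', 3223.0, 20500),
    ('Fugu', 390.0, 30000),
    ('Wheat', 16000.0, 30000),
]

def genes_per_mbp(size_mbp, n_genes):
    """Gene density: number of genes per million base pairs."""
    return n_genes / size_mbp

for name, size, genes in GENOMES:
    print(f'{name:20s} {genes_per_mbp(size, genes):8.1f} genes/Mbp')
```

Genome size and gene count clearly do not scale together: E. coli comes out near 958 genes per Mbp, human at fewer than 10.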

• In double strands
– # of A = # of T; # of G = # of C
– Erwin Chargaff’s 1st Parity Rule, 1951
• In a single strand?
– # of A ≈ # of T; # of G ≈ # of C (approximately, for sufficiently long sequences)
– Erwin Chargaff’s 2nd Parity Rule
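The first parity rule follows directly from base pairing, which a short count makes visible. This is a sketch only: `base_counts` and `duplex_counts` are illustrative names, and the short test strand is invented. The second rule is a statistical observation about long natural sequences, so a nine-base example cannot be expected to show it.

```python
from collections import Counter

def base_counts(strand):
    """Count A, C, G, T in one strand; other characters are ignored."""
    counts = Counter(strand.upper())
    return {base: counts[base] for base in 'ACGT'}

def duplex_counts(strand):
    """Counts over both strands of a duplex, given one strand."""
    pair = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}
    one = base_counts(strand)
    # Each base on the given strand contributes its partner on the other strand.
    return {base: one[base] + one[pair[base]] for base in 'ACGT'}

strand = 'ACTTAGGCA'
print(base_counts(strand))    # single strand: A and T counts may differ
print(duplex_counts(strand))  # both strands: A == T and G == C exactly
```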

Monomer counts in DNA

• Download the Yeast Chromosome 1 sequence from www.cs.uml.edu/~kim/100/yeast01.txt to your C:\100

• Open a Command Prompt from Applications (NOT JES)

• cd C:\100
• python
• In Python
– Name the DNA file
– Read all lines and put them into a single string, 'dna'
• What does lines[0] have?
• What is happening here?

Parsing DNA Data Files

>>> fp = open('yeast01.txt')
>>> lines = fp.readlines()
>>> lines[0]

• Line by line processing is difficult
• Each line ends with '\n'
• How to concatenate all the lines into a LONG string by removing '\n'?

• Why lines[1:], not lines[0:]?

Parsing DNA Data Files

>>> dna = ''.join(lines[1:])
>>> dna[0:100]
>>> dna = dna.replace('\n', '')
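The interactive steps can be collected into one helper. This is a sketch under the assumption that the file is FASTA-style, with a description line first and sequence lines after it. The function name `read_dna` and the tiny demo file are invented so the example runs without yeast01.txt.

```python
import os
import tempfile

def read_dna(path):
    """Read a FASTA-style file: skip the first (header) line, join the rest."""
    with open(path) as fp:
        lines = fp.readlines()
    # lines[0] is assumed to be the description line, not sequence data.
    return ''.join(line.strip() for line in lines[1:])

# Tiny stand-in file (hypothetical content) in place of yeast01.txt.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('>demo sequence\nACTT\nAGG\n')

dna = read_dna(tmp.name)
os.remove(tmp.name)
print(dna)  # ACTTAGG
```

Using `lines[1:]` keeps the letters of the header text out of the sequence, which matters once bases are counted.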

Base-Pair Distribution in a DNA String

• Write a Python function, baseFreq(dna)
– To count the number of 'A', 'T', 'C', 'G' in the concatenated dna string

• How about the distribution of pairs of bases (bimers)?
– ACTTAGG
– AC, CT, TT, TA, AG, GG
• How about trimers, tetramers, pentamers, hexamers, …?
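Bimers, trimers, and longer words are all the same sliding-window count with a different window length. A minimal sketch, with `kmer_counts` as an illustrative name:

```python
from collections import Counter

def kmer_counts(dna, k):
    """Count every overlapping substring of length k in dna."""
    return Counter(dna[i:i + k] for i in range(len(dna) - k + 1))

print(list(kmer_counts('ACTTAGG', 2)))  # ['AC', 'CT', 'TT', 'TA', 'AG', 'GG']
print(kmer_counts('ACTTAGG', 3))        # trimers: same function, k=3
```

A string of length n has n - k + 1 overlapping k-mers, so the 7-base example yields 6 bimers and 5 trimers.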

DNA Base Counting

def baseFreq(dna):
    # Return the fractions of A, C, T, G (in that order) among all characters of dna
    count = [0.0, 0.0, 0.0, 0.0]
    num = len(dna)              # every character counts toward the total
    if num == 0:                # avoid dividing by zero on an empty string
        return count
    for base in dna:
        if base == 'A': count[0] = count[0] + 1
        elif base == 'C': count[1] = count[1] + 1
        elif base == 'T': count[2] = count[2] + 1
        elif base == 'G': count[3] = count[3] + 1
    for i in range(0, 4):
        count[i] = count[i] / num
    return count

Base Counting (in Notepad)

def baseFreq(dna):
    # Return the fractions of A, C, T, G (in that order) among all characters of dna
    count = [0.0, 0.0, 0.0, 0.0]    # one slot per base
    num = len(dna)
    if num == 0:                    # avoid dividing by zero on an empty string
        return count
    for base in dna:
        if base == 'A': count[0] = count[0] + 1
        elif base == 'C': count[1] = count[1] + 1
        elif base == 'T': count[2] = count[2] + 1
        elif base == 'G': count[3] = count[3] + 1
    for i in range(0, 4):
        count[i] = count[i] / num
    return count

##### main() function #############
dataFile = input('Enter a DNA file name\n')
fp = open(dataFile)
lines = fp.readlines()
fp.close()
dnaStr = ''.join(lines[1:])         # skip the header line, lines[0]
dnaStr = dnaStr.replace('\n', '')

freq = baseFreq(dnaStr)
print(freq)
