SIMCODE: A program for simulating point mutations in genomic DNA


Several disadvantages have also been pointed out. First, the representation (Fig 2) is not as 'legible' (a term coined by J D Rawn⁶) as the Lineweaver-Burk plot. Furthermore, the data handling is a little more tedious, but this is hardly a problem for the new generation of students who own pocket calculators. More serious objections hold that the procedure is not very sensitive to deviations from Michaelis-Menten behaviour⁹,¹⁰ and is subject to considerable distortion by systematic errors.¹¹,¹² My own experience has shown, however, that in most cases the Km and V values obtained by students from slopes fitted by eye to 15 s delayed curves (or discrete experimental points) are far worse than those calculated by the method described here.

It may be concluded that the use of the integrated Michaelis-Menten equation (ignored by almost all current biochemistry textbooks¹⁻⁶) is a feasible and instructive way to overcome the problems of determining kinetic parameters in laboratory practicals. It can complement the classical procedures covered in theory and problem classes. Even though initial velocity methods are admittedly preferred by enzymologists in everyday work,¹⁰ the effort students spend on understanding the integrated Michaelis-Menten equation is by no means wasted, because this equation is frequently used in more advanced studies to estimate initial velocities and so obtain more accurate Km and V values.¹⁰,¹³

Acknowledgements

The authors wish to thank Drs J L Garcia-Martinez and J G Peretó for carefully reviewing the manuscript.

References

1 Lehninger, A L (1975) 'Biochemistry', second edition, Worth Publishers, New York

2 Lehninger, A L (1982) 'Principles of Biochemistry', Worth Publishers, New York

3 Metzler, D (1977) 'Biochemistry', Academic Press, New York

4 Stryer, L (1981) 'Biochemistry', second edition, W H Freeman, San Francisco

5 Zubay, G (1983) 'Biochemistry', Addison-Wesley, Reading, Massachusetts

6 Rawn, J D (1983) 'Biochemistry', Harper & Row, New York

7 Wood, W B, Wilson, J H, Benbow, R M and Hood, L E (1981) 'Biochemistry. A Problems Approach', second edition, Benjamin/Cummings, Menlo Park

8 Walker, A C and Schmidt, C L A (1944) Arch Biochem 5, 445-467

9 Cleland, W W (1970) 'Steady state kinetics' in 'The Enzymes', edited by P D Boyer, Vol II, pp 1-65, Academic Press, London

10 Atkins, G L and Nimmo, I A (1980) Analyt Biochem 104, 1-9

11 Fleisher, G A (1953) J Am Chem Soc 75, 4487-4490

12 Newman, P F J, Atkins, G L and Nimmo, I A (1974) Biochem J 143, 779-781

13 Glick, N, Landman, A D and Roufogalis, B D (1979) Trends Biochem Sci 4, N 82-83

SIMCODE: A Program for Simulating Point Mutations in Genomic DNA

RAFAEL FRANCO and ENRIQUE I CANELA

Department of Biochemistry, Faculty of Chemistry, University of Barcelona, Diagonal 647, 08028 Barcelona, Spain

Introduction

In Spain, Biochemistry is taught in the Faculties of Chemistry, Pharmacy, Biology and Medicine. In general, advanced students of Biochemistry in these Faculties are familiar with DNA structure and the genetic code. In theory, the students also know what types of genome mutation can occur and how they affect translation into proteins. However, certain practical aspects, such as the concept of genetic variability, are less well known to them. Mutations that lead to mutants having traits almost indistinguishable from those of the progenitors are responsible for this variability. Mutation frequency varies with the type of organism (eukaryote or prokaryote). In Escherichia coli the frequency of point mutations has been established as 10⁻¹⁰ per nucleotide per cell division.¹

Simulation is used in this article to teach the consequences of point mutations on a given protein with respect to genetic variability and protein polymorphism. Furthermore, genome changes leading to termination codons, or to proteins which produce non-viable organisms, are demonstrated by comparing amino acids as breakers or formers of the α-helix or the β-sheet.

Description of the Program

The program is written in BASIC and runs on a Tektronix 4051 computer. It is available on request from the authors.

Once the student has selected a polypeptide, the program starts by asking for the number of amino acids in the chain, up to 18. The student then enters the sequence, amino acid by amino acid, in the standard three-letter form;² if an error is made, the program stops. When the protein sequence is complete, the computer codes it and constructs the corresponding messenger RNA (mRNA) sequence. This is done by assigning a codon to each amino acid residue.³ Where an amino acid has multiple codons, one is selected at random. At this point the computer is ready to simulate point mutations. To simplify the program, mutations are performed directly on the mRNA sequence; furthermore, initiation and termination codons are not included in this mRNA sequence. Simulation is done by three consecutive random selections. First, the codon where the mutation will take place is selected. Then the base position in the codon (ie first, second or third) is chosen. Finally, this specific base is replaced by another selected at random from the four possible (ie A, U, C or G).
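The coding and mutation steps described above can be sketched in a few lines. The original program is written in BASIC and its listing is not reproduced here, so what follows is only an illustrative Python sketch: the names encode, translate and point_mutation are hypothetical, the random seed is arbitrary, and the replacement base is drawn from the three bases that differ from the current one, which is one possible reading of 'replaced by another'.

```python
import random
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons in UCAG order; '*' marks a termination codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE = dict(zip(
    "ARNDCQEGHILKMFPSTWYV*",
    ("ALA ARG ASN ASP CYS GLN GLU GLY HIS ILE LEU LYS MET PHE PRO "
     "SER THR TRP TYR VAL END").split()))
CODON_TO_AA = {"".join(c): THREE[a]
               for c, a in zip(product(BASES, repeat=3), AMINO)}
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def encode(peptide, rng):
    """Build an mRNA (list of codons), choosing one codon at random per residue."""
    return [rng.choice(AA_TO_CODONS[aa]) for aa in peptide]


def translate(codons):
    """Translate a list of codons back into three-letter residue names."""
    return [CODON_TO_AA[c] for c in codons]


def point_mutation(codons, rng):
    """Three consecutive random selections: codon, base position, new base."""
    index = rng.randrange(len(codons))      # which codon
    pos = rng.randrange(3)                  # first, second or third base
    old_base = codons[index][pos]
    # Assumption: the new base always differs from the old one.
    new_base = rng.choice([b for b in BASES if b != old_base])
    mutated = list(codons)
    mutated[index] = codons[index][:pos] + new_base + codons[index][pos + 1:]
    return mutated, index + 1, pos + 1, old_base, new_base


rng = random.Random(1985)  # arbitrary seed, for repeatable runs only
peptide = ("CYS LEU GLY VAL ASP HIS ASP GLU ASN THR ARG SER "
           "LEU PHE LYS ILE VAL GLY").split()
mrna = encode(peptide, rng)
mutated, codon_no, base_no, old, new = point_mutation(mrna, rng)
print(f"CODON {codon_no} BASE {base_no} {old} HAS BEEN CHANGED TO {new}")
print("-".join(translate(mutated)))
```

Building the codon table from a single 64-character string keeps the sketch short; a mutated codon that translates to END corresponds to a termination codon.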

BIOCHEMICAL EDUCATION 13(2) 1985

Once mutation has occurred, the computer constructs the new polypeptide sequence and displays it on the screen. It also displays information on the number of the codon where the mutation has occurred and the changes in the mRNA and the polypeptide sequences (see Fig 1).

    MUTATION [number illegible]
    ===========
    CYS-LEU-GLY-VAL-ASP-HIS-ASP-GLU-ASN-THR-ARG-SER-LEU-PHE-LYS-VAL-VAL-GLY
    CODON 16 BASE 1 A HAS BEEN CHANGED TO G
    ILE HAS BEEN CHANGED TO VAL
    IS IT THE STRUCTURE SIMILAR ? YES
    CORRECT
    DO YOU WISH TO CONTINUE ?
    YES OR NOT

Figure 1. Copy of the screen output for a point mutation that leads to a different polypeptide chain

On the whole, there are two distinct possibilities. The mutation may lead to the same protein. In this case 'Identical protein' appears on the screen and the computer asks: 'Do you wish to continue?' (Fig 2). If the mutation leads to a codon which specifies a new amino acid, the student must decide whether the new protein has a structure similar to the previous one. In this case the machine displays: 'Ile has changed to Val. Is it the structure similar?' (Fig 1). After the answer is given, the computer replies 'CORRECT' for a right answer and 'INCORRECT' for a wrong answer. Similarities between amino acids are tested as described by Chou and Fasman.⁴ Each amino acid has a number (Pα) reflecting its power to form or break the α-helix, and another number (Pβ) which is an index of its power to form or break the β-sheet. Amino acids are compared by means of these indices. Two amino acids are assumed to be similar if their Pα and Pβ indices differ by no more than 60%. In this case it is also assumed that the protein structures are analogous.
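The similarity test can be illustrated with a short sketch. The text does not say how the percentage difference is computed, so the Python code below assumes that it is taken relative to the larger of the two values and that both indices must pass the test; the function names are hypothetical, and the index pairs are invented for illustration and are not values from Chou and Fasman.

```python
def relative_difference(x, y):
    """Difference between two positive indices as a fraction of the larger one.

    Assumption: the source does not define the reference value.
    """
    return abs(x - y) / max(x, y)


def similar(old, new, threshold=0.60):
    """old and new are (P_alpha, P_beta) pairs for the two amino acids.

    They count as similar if neither index differs by more than threshold.
    """
    return all(relative_difference(a, b) <= threshold
               for a, b in zip(old, new))


def check_answer(old, new, student_says_similar):
    """Reply as the program does: CORRECT or INCORRECT."""
    right = student_says_similar == similar(old, new)
    return "CORRECT" if right else "INCORRECT"


# Illustrative index pairs only; NOT taken from Chou and Fasman.
residue_x = (1.00, 1.50)
residue_y = (1.10, 1.60)   # both indices close to those of residue_x
residue_z = (0.30, 1.50)   # alpha-helix index differs by 70%

print(check_answer(residue_x, residue_y, True))
print(check_answer(residue_x, residue_z, True))
```

With these invented numbers the first answer is judged correct and the second incorrect, because the α-helix index of the third residue lies outside the 60% margin.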

    MUTATION [number illegible]
    ===========
    CYS-LEU-GLY-VAL-ASP-HIS-ASP-GLU-ASN-THR-ARG-SER-LEU-PHE-LYS-ILE-VAL-GLY
    CODON 11 BASE 3 U HAS BEEN CHANGED TO C
    ARG HAS BEEN CHANGED TO ARG
    IDENTICAL PROTEIN
    DO YOU WISH TO CONTINUE?
    YES OR NOT

Figure 2. Copy of the screen output for a point mutation that leads to the same polypeptide chain, ie a silent mutation

The program runs continuously, simulating mutation after mutation, until the student decides to stop. At this point the score, ie the number of correct and incorrect answers, appears on the screen. The user can also compare the last protein sequence appearing on the screen with the one originally entered. Although, as stated above, the structures of these two polypeptides can be considered similar, some amino acid changes lead to variability. With the SIMCODE program, testing various polypeptide chains 18 amino acids long, amino acid replacements at any position average 22% after 10 consecutive point mutations and 35% after 20 consecutive point mutations. The student can calculate the number of duplications necessary to reach this level of variability from the frequency of mutation, the number of bases in the DNA fragment and the number of mutations at the end of the program. The number of mutations is given by the program (see Figs 1 and 2).
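A possible form of this calculation is sketched below in Python. The formula is not given in the text; the sketch assumes that the expected number of mutations per duplication is the mutation frequency multiplied by the number of bases, so that the number of duplications is the number of mutations divided by that product. The function name divisions_needed is hypothetical, and the frequency is the E. coli figure quoted in the Introduction.

```python
def divisions_needed(mutations, frequency, n_bases):
    """Duplications needed to accumulate a given number of point mutations.

    Assumption: mutations accumulate at frequency * n_bases per duplication.
    """
    expected_per_division = frequency * n_bases
    return mutations / expected_per_division


frequency = 1e-10        # per nucleotide per cell division (E. coli)
n_bases = 18 * 3         # an 18-residue polypeptide is coded by 54 bases

for mutations in (10, 20):
    n = divisions_needed(mutations, frequency, n_bases)
    print(f"{mutations} mutations: about {n:.2e} duplications")
```

Under these assumptions the 10 and 20 mutations discussed above correspond to a number of duplications of the order of 10⁹.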

Comments

The program described in this article is suitable for advanced students of biochemistry. The student must know the structure of DNA and mRNA and the mechanism by which the genetic code is translated into protein.

The program is useful for teaching the consequences of point mutations in DNA. Since it is interactive, it also allows students to test their knowledge of amino acid and protein structure. In this sense, students must be familiar with Chou and Fasman's classification⁴ before running the program (see above).

The program also has other possibilities. With data handling facilitated by the computer, the student can compile statistics on the types of DNA mutation (transversions or transitions) or of protein modification (silent mutations or non-viable organisms). A criterion for the latter is to consider that a mutation leading to a termination codon, or to an amino acid that alters protein structure, produces a non-viable organism. When this occurs, the program does not stop; mutations continue from the last mRNA sequence that led to a viable organism. This is logical since, obviously, a non-viable organism is unable to reproduce.
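As an illustration of such statistics, the Python sketch below classifies each base change as a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion (purine to pyrimidine, or the reverse) and tallies the result. The function name and the list of substitutions are hypothetical; in practice the student would record the changes reported on the screen.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def substitution_type(old_base, new_base):
    """Classify a single base change in mRNA."""
    if old_base == new_base:
        raise ValueError("the base has not changed")
    pair = {old_base, new_base}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Hypothetical record of (old base, new base) pairs noted by a student.
recorded = [("A", "G"), ("U", "C"), ("A", "C"), ("G", "U")]

tally = Counter(substitution_type(old, new) for old, new in recorded)
for kind, count in tally.items():
    print(kind, count)
```

The same tallying approach could be applied to the protein-level outcomes (silent, similar or non-viable) once each mutation has been classified.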

In our experience, students are surprised to discover the large number of silent mutations produced with the SIMCODE program. This leads to the consideration that nature is mostly conservative. However, comparison between the proteins entered by the students and those that appear at the end of the computation indicates non-negligible neutral variability, ie nature produces variability within equality. With the SIMCODE program, mutations lead either to a non-viable organism or to a viable organism with variability in DNA and/or in protein sequence but showing the same characteristics as the progenitor. Obviously, neither the SIMCODE program nor simulation techniques are, at this point, capable of deciding whether such a mutation or set of mutations is beneficial to the organism. At this level, only nature can 'select'.

References

1 Kourilsky, P and Gachelin, G (1984) Mundo Cientifico (Spanish version of La Recherche) 38, 718-727

2 IUPAC and IUPAC-IUB (1975) Biochemistry 14, 449-462

3 Jukes, T H (1978) Advances in Enzymology 47, 375-432

4 Chou, P Y and Fasman, G D (1978) Advances in Enzymology 47, 45-148
