MultiQTL
Comprehensive interactive package for genetic mapping
of quantitative traits,
allowing for multiple-trait, multiple-environment,
multiple-family, and multiple-interval QTL analysis
Institute of Evolution
Haifa University, Haifa, Israel
Tel: 972-4-8240449, Fax: 972-4-8246554
http://www.multiqtl.com
Table of Contents
Part 1: Loading and Editing Real Data
    Data input format
    Starting to load the data
    Project creation stage
    Optional addition and removal of data files
    Optional data revision
Part 2: Data Simulation (step 1 - step 6)
Part 3: Model Creation and Calculation stages
    Model Creation stage
    General model options
    About model
    Calculation panel description
    Save and Open Project options
Part 4: Single-QTL model, one and two traits
    Displaying the results
    Main window for analysis and LOD graph
    Models with “extended” parameters
    Selective genotyping model
Part 5: Two-linked QTL model
    Submodel and Estimate options
    Compare option
    Bootstrap and Distribution options
Part 6: Multiple-trait model
    Multiple trait example for single QTL model
    Multiple trait example for two-linked QTL model
Part 7: Multiple chromosome analysis
    Creation of multi-chromosome set
    Fitting multilocus interval model
    Multiple Simulation options
Part 8: Multiple environments file
    Results for model with/without trait normalization
    Submodel option
    Submodel compare option
Part 9: Multiple families
    Computation results
    Submodel compare option
    Significance options
    Multichromosome set
    Format transition “multiple-environment” “multiple-trait” formats
Part 10: Selective genotyping
    Simulated data
    Computation results
    Multichromosome set
Part 11: Summarizing the results
    Total Significance option
    Creation of the result table with specified threshold significance
    Computation of the total significance by Benjamini-Hochberg (1995) method
    Creation of the report table
    Creation panel with LOD score graphs
Our algorithms are based on up-to-date theoretical papers from the whole QTL mapping field but some are unique to us.
1-12
References
Korol A., Preygel I., Preygel S. 1994, Recombination Variability and Evolution. Chapman & Hall, London.
Korol A., Ronin Y., Kirzhner V. 1995, Interval mapping of quantitative trait loci employing correlated trait complexes. Genetics 140: 1137-1147.
Ronin Y., Kirzhner V., Korol A. 1995, Linkage between loci of quantitative traits and multi-trait analysis with a single marker. Theor. Appl. Genet. 90: 776-786.
Ronin Y., Korol A., Fahima T., Kirzhner V., Nevo E. 1996, Censored estimation of linkage between PCR-generated markers and a target gene based on stepwise bulked analysis. Biometrics 52: 1428-1439.
Korol A., Ronin Y., Kirzhner V. 1996a, Linkage between loci of quantitative traits and marker loci. Resolution power of three statistical approaches in single marker analysis. Biometrics 52: 426-441.
Korol A., Ronin Y., Tadmor Y., Bar-Zur A., Kirzhner V., Nevo E. 1996b, Estimating variance effect of QTL: an important prospect to increase the resolution power of interval mapping. Genet. Res. 67: 187-194.
Weller J., Song J., Ronin Y., Korol A. 1997, Experimental designs and solutions to multiple trait comparisons. Animal Biotechnology 8: 107-122.
Korol A., Ronin Y., Hayes P., Nevo E. 1998a, Multi-interval mapping of correlated trait complexes: simulation analysis and evidence from barley. Heredity 80: 273-284.
Korol A., Ronin Y., Nevo E. 1998b, Approximated analysis of QTL-environment interaction with no limits on the number of environments. Genetics 148: 2015-2028.
Ronin Y., Korol A., Weller J. 1998, Selective genotyping to detect quantitative trait loci affecting multiple traits. Theor. Appl. Genet. 97: 1169-1178.
Ronin Y., Korol A., Nevo E. 1999, Single- and multiple-trait analysis of linked QTLs: some asymptotic analytical approximation. Genetics 151: 387-396.
Peng J., Korol A., Fahima T., Röder M., Ronin Y., Li Y-C., Nevo E. 2000, Molecular genetic maps in wild emmer wheat, Triticum dicoccoides: genome-wide coverage, massive negative interference, and putative quasi-linkage. Genome Res. 10: 1509-1531.
Korol A., Ronin Y., Itzcovich A., Peng J., Nevo E. 2001, Enhanced efficiency of QTL mapping analysis based on multivariate complexes of quantitative traits. Genetics 157: 1789-1803.
1-13
* References (continued)
Peng J., Ronin Y., Fahima T., Röder M., Li Y., Nevo E., Korol A. 2003, Domestication quantitative trait loci in Triticum dicoccoides, the progenitor of wheat. Proc. Natl. Acad. Sci. USA 100: 2489-2495.
Mester D.I., Ronin Y.I., Hu Y., Peng J., Nevo E., Korol A.B. 2003, Efficient multipoint mapping: Making use of dominant repulsion-phase markers. Theor. Appl. Genet. 107: 1102-1112.
Mester D.I., Ronin Y.I., Minkov D., Nevo E., Korol A.B. 2003, Constructing large scale genetic maps using evolutionary strategy algorithm. Genetics 165: 2269-2282.
Hanotte O., Ronin Y., Agaba M., Nilsson P., Gelhaus A., Horstmann R., Sugimoto Y., Kemp S., Gibson J., Korol A., Soller M., Teale A. 2003, Mapping of quantitative trait loci (QTL) controlling resistance to trypanosomosis in an experimental cross of trypanotolerant West African N’Dama cattle (Bos taurus) and trypanosusceptible East African Boran cattle (Bos indicus). Proc. Natl. Acad. Sci. USA 100: 7443-7448.
Ronin Y., Korol A., Shtemberg M., Nevo E., Soller M. 2003, High resolution mapping of quantitative trait loci by selective recombinant genotyping. Genetics 164: 1657-1666.
Mester D., Korol A., Nevo E. 2004, Fast and high precision algorithms for optimization in large scale genomic problems. Computational Biology and Chemistry 28: 281-290.
Yagil C., Sapojnikov M., Wechsler A., Korol A.B., Yagil Y. 2006, Genetic dissection of proteinuria in the Sabra rat. Physiological Genomics 25: 121-133.
Atzmon G., Ronin Y., Korol A., Yonash N., Cheng H., Hillel J. 2006, QTLs associated with growth traits and abdominal fat weight and their interactions with gender and hatch in commercial meat-type chickens. Animal Genetics 37: 352-358.
Korol A., Frenkel Z., Cohen L., Lipkin E., Soller M. 2007, Fractioned DNA pooling: a new cost effective strategy for fine mapping of quantitative trait loci. Genetics 176: 2611-2623.
Korol A., Mester D., Frenkel Z., Ronin Y.I. 2009, Methods for genetic analysis in the Triticeae. Chapt. 6 in: C. Feuillet and G. J. Muehlbauer (eds), Genetics and Genomics of the Triticeae. Springer, pp. 163-199.
Korol A., Frenkel Z., Orion O., Ronin Y. 2012, Some ways to improve QTL mapping accuracy. Animal Genetics 43 (Suppl. 1): 36-44.
* References to other papers cited in any of the Parts 1-12 are provided at the end of the document (i.e., at the end of Part 12)
1-14
Part 1: Loading and Editing Real Data
Table of Contents
Format input of real data  16
    Chromosome File Format  17
    Trait File Format  20
Starting to load the data
    Definition and explanations  21
    New Input  22
        Regular data  23
        Environments data  25
        Family data  26
        Selective Genotype data  30
        Error correction  31
        General data formats (*.txt, *.dbf, *.xls, *.raw)  35
    Old Input
        Step 1 Enter a name for the problem  38
        Step 2 Select the type of population  38
        Step 3 Enter the number of environments and genotypes  39
        Step 4 Enter the symbols of the markers  40
        Step 5 Select the files of data  41
Project Creation Stage  44
    Errors in chromosome values  45
    Errors in trait values  46
    Errors in chromosome: recombination rate >0.5  47
    New project creation  49
Graphical display of information file  50
Option of changing marker order in the chromosome  53
Option of adding data files  54
Option of adding marker files  56
Option of deleting data files from Problem folder  58
Extracting data and file information from Project  59
Data revision option  61
    Chromosome revision  62
    Trait revision  68
1-16
Format input of real data
To conduct QTL analysis with MultiQTL, you should first prepare linkage maps (the order of markers along each chromosome). This can be done with various packages, such as MapMaker (Lander et al. 1987) or JoinMap (Stam 1993). A new package, MultiPoint, for efficient multipoint mapping based on Evolutionary Strategy algorithms (Mester et al. 2003a,b, 2004) is being developed by our team (to be released in 2005).
To analyze real QTL data, the data files must be prepared as plain text (ANSI) with the extensions indicated below. Word or Excel files must be converted to this format.
Four main types of data files are used in the current version of MultiQTL: chromosome files with extension *.chr (one file per chromosome), trait files with extension *.trt (one file per trait) or *.tra (one file for all traits), and marker files with extension *.mrk (for separate marker loci added later, after the main body of data has been entered). No blank (space) is allowed in the file names. If you prepare these files with Windows Notepad, take care to save them with the proper extension: when saving, choose “all files” from the “save as type” options of Notepad.
The format described below is mandatory for the old input (v. 2.4 or earlier). Version 2.5 adds a new input that is much easier to use and accepts additional formats (described later as New Input).
For detailed descriptions of the MultiQTL formats see the following pages.
1-17
The Chromosome File Format
Each chromosome file has the extension *.chr and is arranged as a set of rows. Each row starts with a unique marker name followed by a series of symbols representing the scores of this marker for all genotypes.
The number of genotypes in each row must be the same. Spaces between the genotype symbols are optional, but the first symbol must be separated from the marker name by a space. We assume that the order of markers in each chromosome is known (e.g., from a previous marker ordering with MapMaker, JoinMap, or other software), so the rows in the chromosome data file should follow this order. If the marker rows are not ordered, you can input your data file with an arbitrary order of markers and provide an additional file *.ord that defines the order of markers in the chromosome. For more details, see page 53, this part.
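The row layout described above can be checked before loading with a small script. The following is only a sketch in Python, not part of MultiQTL; the function name parse_chr is hypothetical, and it assumes one marker per row and single-character genotype symbols.

```python
def parse_chr(text):
    """Parse *.chr text into {marker_name: [symbol, ...]}, keeping row order.

    Assumptions (not taken from MultiQTL itself): one marker per row,
    single-character genotype symbols, optional spaces between symbols.
    """
    markers = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        # Spaces between symbols are optional, so glue the rest together
        # and split it into single characters.
        scores = list("".join(parts[1:]))
        if name in markers:
            raise ValueError(f"duplicate marker name: {name}")
        markers[name] = scores
    counts = {len(s) for s in markers.values()}
    if len(counts) > 1:
        raise ValueError(f"unequal numbers of genotypes: {sorted(counts)}")
    return markers


sample = (
    "mar1a 1 3 23 112 1 3 2 1 1\n"
    "mar2a 102311210211\n"
    "mar3a 2 22 3 11 2 1 1 2 1 1\n"
)
chrom = parse_chr(sample)
print({name: len(scores) for name, scores in chrom.items()})
```

The sample rows are those of Example 1 (Chrom1.chr); each marker yields 12 genotype scores whether or not the symbols are separated by spaces.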
As mentioned above, we are going to provide a new, highly efficient and user-friendly package for multipoint mapping, MultiPoint (Mester et al. 2003a,b, 2004) (expected release June-July 2005).
1-18
Any symbol can represent any genotype. The defaults are:
Missing genotype: 0
Backcross genotypes: 1 - Aa, 2 - aa
Dihaploid genotypes: 1 - AA, 2 - aa
RIL_selfing genotypes: 1 - AA, 2 - aa
RIL_sib_mating population genotypes: 1 - AA, 2 - aa
F2, F3, F4 population genotypes:
for a codominant marker locus: 1 - AA, 2 - aa, 3 - Aa
otherwise:
when maternal allele is dominant: 4 - A
when paternal allele is dominant: 5 - a
The Chromosome File Format (continued)
The defaults can be changed to any other symbols in step 4 of loading data section of the old input (see next example).
Note that with such designations, the sign of the QTL substitution effect d represents the difference X(AA)-X(aa), or X(1)-X(2), or X(A)-X(B), where AA and aa (or 1 and 2, or A and B) are the two homozygotes at the marker locus (clearly, you may use other designations, but sign(d) is always in accordance with the indicated direction). Likewise, h is calculated as deviation from the mid-parent value, i.e., X(Aa)-0.5[X(AA)+X(aa)], or X(3)-0.5[X(1)+X(2)], or X(H)-0.5[X(A)+X(B)].
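The sign convention for d and h can be illustrated numerically. The sketch below uses plain means of marker classes only to show the direction of the differences; it is not how MultiQTL estimates QTL effects, and the function name and sample values are made up for illustration.

```python
from statistics import mean


def marker_class_effects(genotypes, phenotypes, codes=("1", "2", "3")):
    """Return (d, h) from marker-class means.

    d = X(AA) - X(aa); h = X(Aa) - 0.5 * [X(AA) + X(aa)].
    codes gives the symbols for AA, aa and Aa (the F2 defaults here).
    Illustration of the sign convention only, not a QTL estimate.
    """
    code_AA, code_aa, code_Aa = codes
    groups = {c: [] for c in codes}
    for g, x in zip(genotypes, phenotypes):
        if g in groups and x is not None:  # skip missing scores and values
            groups[g].append(x)
    mean_AA = mean(groups[code_AA])
    mean_aa = mean(groups[code_aa])
    d = mean_AA - mean_aa
    h = None
    if groups[code_Aa]:  # heterozygotes exist only in F2-type populations
        h = mean(groups[code_Aa]) - 0.5 * (mean_AA + mean_aa)
    return d, h


# Hypothetical scores and trait values for six F2 individuals.
d, h = marker_class_effects(list("112233"), [10, 12, 6, 8, 10, 10])
print(d, h)
```

With these values X(AA) = 11, X(aa) = 7 and X(Aa) = 10, so d is positive (AA exceeds aa) and h is the excess of the heterozygote over the mid-parent value 9.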
Example 2: ChromNew.chr
(F2: a - AA genotype, b - aa genotype, h - Aa genotype, missing symbol -)
BMAG579
h a b h b b h b
b h h - h b h a
b h b b a b b a
b a b h a a a h
a b b
hE35M58la
h d b d b b h b
b h b - h b h b
b h b b a b b d
a a b h a a b h
a b d h h b h d
a b - b
The Chromosome File Format (continued)
Example 1: Chrom1.chr (Population F2, default)
mar1a 1 3 23 112 1 3 2 1 1
mar2a 102311210211
mar3a 2 22 3 11 2 1 1 2 1 1
1-19
For the single-environment case, the number of numeric values (phenotypes) for each trait must be the same as the number of genotypes for each marker in the chromosome files. For the currently available QTL-E models (multiple-environments case), the number of phenotypes for each trait equals the number of genotypes (for each marker locus) multiplied by the number of environments. Trait values are listed in turn: values for the first environment, then for the second environment, and so on.
1-20
The Trait File format
The trait data should be arranged in groups. Each group represents a trait and consists of a unique trait name and a line of numeric values or a symbol for a missing value. By default, the missing value symbol is $.
Single trait data files can also be created. The data may be organized in one group without a trait name. The name of the file with extension *.trt may represent the trait name:
Bur.trt:
54.5 68.5 87.5 88.5 $ 88.5 82.5 78.5 88.5 80.5 95.5 75.5
Yg2.trt:
4.28 4.10 6.45 2.25 11.25 12.55 $ 8.90 10.14 8.83 13.45 9.14
You can also write all trait groups together in a *.tra file:
Alltr.tra:
Bur 54.5 68.5 87.5 88.5 $ 88.5 82.5 78.5 88.5 80.5 95.5 75.5
Yg2 4.28 4.10 6.45 2.25 11.25 12.55 $ 8.90 10.14 8.83 13.45 9.14
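A *.tra file laid out as above can be read and checked against the expected number of phenotypes with a few lines of Python. This is a sketch under stated assumptions: each trait occupies one line with the trait name first, and the names parse_tra and wrong_length are hypothetical, not MultiQTL functions.

```python
def parse_tra(text, missing="$"):
    """Parse *.tra text into {trait_name: [float or None, ...]}.

    Assumes one trait per line, trait name first, values separated by
    spaces; the missing-value symbol is stored as None.
    """
    traits = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        name, raw = parts[0], parts[1:]
        traits[name] = [None if v == missing else float(v) for v in raw]
    return traits


def wrong_length(traits, n_genotypes, n_environments=1):
    """Names of traits whose value count is not genotypes x environments."""
    expected = n_genotypes * n_environments
    return [name for name, vals in traits.items() if len(vals) != expected]


alltr = (
    "Bur 54.5 68.5 87.5 88.5 $ 88.5 82.5 78.5 88.5 80.5 95.5 75.5\n"
    "Yg2 4.28 4.10 6.45 2.25 11.25 12.55 $ 8.90 10.14 8.83 13.45 9.14\n"
)
traits = parse_tra(alltr)
print(wrong_length(traits, n_genotypes=12))
```

Both traits in Alltr.tra hold 12 values, so the check passes for 12 genotypes in one environment; the same file would also fit, for instance, 6 genotypes scored in 2 environments.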
Starting to load the data
1-21
Definition and explanations
A “Problem” is a folder that includes information for one mapping problem only, defined by its population, genotype and phenotype numbers, and its chromosome and trait files. In this folder, an information file that reflects the problem structure is created alongside the working files.
To create a problem, select File > New > Problem from the main menu.
<Old Input> was used in versions up to v. 2.4. Version 2.5 includes, in addition, an improved <New Input> that can read MapMaker (*.raw) files and the formats *.dbf, *.txt and Excel (*.xls).
1-22
Starting to load the data: New Input
Opening the <Population type> window lets you select the population type from the list below. In fact, thanks to the new section “Multiple families”, you can also analyze data generated using the Advanced backcrossing design (see Part 9, page 2).
You can choose the format of input data by opening the window of formats. But we first consider an example of our old format (*.chr,*.trt,*.tra). By pressing button <Import Data> you can input data from any folder.
You can get Help for this option by pressing F1.
1-23
Starting to load the data: New Input (continued)
Regular data
After you press <Import Data>, an <Open> window appears. Choose the folder with your data files, select the files for input, and press <Open>.
If the entered information is correct (without errors), a list of the entered files appears in the right part of the window. Any of these files can be deleted: choose it in the Files section, press the <Delete> key, and confirm with <OK>. You can continue the input by adding new files to the list of selected files.
1-24
Starting to load the data: New Input
Regular data (continued)
By pressing the <+> button near the symbol of a chromosome, you can display its markers. Colors and the letters “C”, “R”, and “D” indicate marker dominance.
You can change the order of markers in the chromosome. To do so, select the marker with the left mouse button. A double click with the right mouse button opens the menu with <Move Up> and <Move Dn> for moving the marker.
Pressing the <OK> button finishes the input. A standard <Save As> window appears for choosing the folder for the entered information and service files. Now we can move to the Project Creation Stage (see page 1-44).
1-25
Starting to load the data: New Input (continued)
Environments data
In this case, you should first set the check box <Environments> to state <On> and then press the button <Import Data> to input your data.
You can change the names of the environments. To do so, press <+> near the list of environments.
A list of default environment names will appear. To change a name, mark the environment with the left mouse button and then click again on the chosen name. After the corresponding frame appears, you can change the name.
1-26
Starting to load the data: New Input (continued)
Family data
You should first set the check box <Multiple Families> to state <On>, and then press the <Import Data> button to input your data. There are some requirements that we should mention here:
1. The data on families are entered consecutively: all files for the first family, then all files for the second family, etc.
2. The chromosome and trait file names should be the same across families.
3. Marker names within chromosomes are shared among families.
4. In the chromosomes of some families a marker may be in state “missing data” across the whole family, but all trait values should be present across families.
During the input of each family, you are asked whether to merge the files. Merging is conducted for chromosome files (according to marker names) and trait files. A marker that is missing in a family (not polymorphic or not scored within that family) is replaced in this family by the symbol “missing marker”. All files are presented on the right side of the panel.
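The merging of chromosome data by marker name can be pictured with the following Python sketch. It illustrates the idea only and is not MultiQTL's implementation; the function name merge_families and the use of "0" as the missing symbol (the documented default) are assumptions of this example.

```python
def merge_families(families, missing="0"):
    """Merge one chromosome across families by marker name.

    families: list of non-empty {marker_name: [score, ...]} dicts, one per
    family, in input order. A marker absent from a family is filled with
    the missing symbol for every individual of that family.
    The merged marker order here is first-seen order; the true order
    would come from the data files or an *.ord file.
    """
    names = []
    for fam in families:
        for marker in fam:
            if marker not in names:
                names.append(marker)
    merged = {marker: [] for marker in names}
    for fam in families:
        n_individuals = len(next(iter(fam.values())))
        for marker in names:
            merged[marker].extend(fam.get(marker, [missing] * n_individuals))
    return merged


fam1 = {"mar1": list("12"), "mar2": list("21")}
fam2 = {"mar1": list("112")}  # mar2 not scored in the second family
merged = merge_families([fam1, fam2])
print(merged)
```

Here mar2 receives three missing symbols for the three individuals of the second family, so every marker ends up with the same total number of genotypes.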
1-27
Starting to load the data: New Input
Family data (continued)
Different markers can be missing in different families, but the marker order must be the same across families. During the input process you may use a special file *.ord that defines the order of the markers in each chromosome. Each line of this file begins with the chromosome name, followed by the names of the mapped markers in their true order. The marker names are separated by a space or tab. To enter the *.ord file, choose it in the Input window and press the <Import Data> button.
Two example lines are shown below:
Chr1 mar1 mar4 mar5 mar2 mar10
Chr2 mar3 mar4 mar1 mar6 mar7 mar8
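A file in this layout is easy to read and apply programmatically. The sketch below is a Python illustration with hypothetical function names (parse_ord, reorder); it assumes exactly the layout described above, one chromosome per line with names separated by spaces or tabs.

```python
def parse_ord(text):
    """Parse *.ord text into {chromosome_name: [marker names in order]}."""
    order = {}
    for line in text.splitlines():
        parts = line.split()  # splits on spaces and tabs alike
        if parts:
            order[parts[0]] = parts[1:]
    return order


def reorder(markers, marker_order):
    """Return the marker rows rearranged to follow marker_order.

    markers: {marker_name: scores}. Markers not listed in marker_order
    are left out; a listed marker with no data raises KeyError.
    """
    absent = [m for m in marker_order if m not in markers]
    if absent:
        raise KeyError(f"markers without data: {absent}")
    return {m: markers[m] for m in marker_order}


order = parse_ord("Chr1 mar1 mar4 mar5 mar2 mar10\nChr2\tmar3\tmar4\tmar1\n")
rows = {"mar2": list("12"), "mar1": list("21")}  # arbitrary input order
print(list(reorder(rows, ["mar1", "mar2"])))
```

Python dictionaries keep insertion order, so the reordered result can be written back row by row in the mapped order.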
1-28
Starting to load the data: New Input. Family data (continued)
Similar to the case of Multiple-Environment data (page 25), you can change the family name. After all data are entered, you may want to open all chromosomes (by pressing <+>) to check the order and names of the markers once more. If the names of some markers are not the same among families, extra markers will appear. To find out in which family this happened, open each family by pressing the corresponding <+>, and then find within the family the chromosome where the error occurred. In the example below, the error is in the 2nd chromosome of Fam2: instead of “marker” we had “mark”. Delete this file by choosing it in the Files section (where the families are ordered) and pressing the button <Delete>. To fix the marker names, input this file again (after switching the check box <Always show the data table> to state <On>).
1-29
Changing a marker’s name
After the input, a table with information on the treated chromosome appears, and the marker name can be changed there. To do so, select the column with the wrong name and write the correct name in the field Column name. Pressing <OK> results in input of the chromosome with the changed names.
Starting to load the data: New Input. Family data (continued)
1-30
Starting to load the data: New Input (continued)
Selective Genotyping data
In this case you should set the check box <Selective Genotyping> to state <On> and then press the button <Import Data>.
Now you should indicate the trait that was the basis of selective genotyping. Mark this trait with a double click; a red frame then appears near the trait name. If no trait is marked, a text explaining how to make the choice appears after you press the button <OK>.
1-31
Starting to load the data: New Input (continued)
Error correction
If there are errors in the information files, a corresponding message will appear. You do not need to remember these errors; just press the <OK> button.
As a result, a window with the details of the errors will appear. By pressing <OK> in this window, you will obtain the error messages one after another. For instance:
1-32
To correct the indicated error, write the error number obtained in the message in the Column number field. The marker name then appears in the Column name field, and this column becomes the first one in the table. In the field near the button <Find>, write the erroneous symbol «4» and press <Find>. Both the corresponding line and the erroneous symbol become highlighted. You can then correct (change) the highlighted value, e.g., to «2», and press the <Enter> key. After <OK> is pressed again, a message about the next error appears (if errors still remain).
Starting to load the data: New Input
Error correction (continued)
1-33
It may happen that the number of entered genotype scores differs among markers.
To fix such an error, again write the column number «11» in the indicated field, enter a backspace in the field near the button <Find>, and press <Find>. The line «439» with the missing symbol(s) will appear. The error is corrected by entering the missing symbol(s) and pressing <OK>. After entering each symbol, do not forget to press the <Enter> key.
Starting to load the data: New Input
Error correction (continued)
1-34
If the codes in the chromosome file do not correspond to the chosen standards, a special message appears.
[Figure: code table showing “Coding by default” and “Codes occurred in this chromosome” for the classes Missing, AA, aa, Aa, A dom, a dom]
First, mark all adjacent loci by choosing the first one and then the last one while holding the Shift key (all loci become highlighted in black). Now you can match the codes found in the file to their graphical display. To change a code, mark the targeted code value with the left mouse button. A second click on the marked code produces a frame in which you can change the code as needed.
Once the codes are corrected, they are remembered and become the default codes for all other chromosomes.
1-35
Starting to load the data: New Input (continued)
Other formats
If you are using data prepared in one of the usual formats, first select the needed format and then press the <Import Data> button to enter the file in this format. Data from different formats can also be entered when you prepare a job file for one mapping project. We show an example with data sets imported first from the *.raw (MapMaker) format, and then from the *.dbf (database) format.
1-36
After entering the data we get an information table. Markers and traits may be displayed in one table. To assign the markers to their chromosomes, select the first four markers and write “chrom1” in the field “Chromosome”. Then select the remaining markers and call this group “chrom2”. The same table is obtained after import from the *.dbf format. In this table we may denote the groups “chrom3” and “chrom4”.
Starting to load the data: New Input. Other formats (continued)
Such a subdivision of a file into chromosomes during the input process is possible only with a small number of markers (especially unordered). In case of many markers, it will be better first to divide the files according to chromosomes, and input the chromosomes separately. Each chromosome will be named according to its data file name. If the markers were already ordered in initial data file, you may enter all markers as one “super-chromosome” and then subdivide this array into separate chromosomes (section <Data revision options>, p.61).
1-37
After input from the *.txt and *.xls formats, a transposed table may be obtained, with the marker names occupying the first column. In that case, press the button <TRANSPOSE TABLE>.
Now the marker names appear in the first line. To move the names to the title line, indicate the current line with the names, then set the check box <Column Names> to <On> and press the button <Take>.
Starting to load the data: New Input. Other formats (continued)
Starting to load the data: Old Input
1-38
Step 1: Enter a name for the Problem
Name the Problem you are going to create. By default, the folder is created in the current directory. To change this, define another Location, and then choose a name for the Problem.
If a folder with this name already exists in the current directory, this message appears after you push the <Next> button. Answering “Yes” destroys all information in this folder. After answering “No”, you must choose a new folder name.
Step 2: Select the type of population and the mapping function
Currently, you may use the following types of populations: backcross, dihaploid, RIL_selfing, RIL_sib_mating, F2, F2_F3, and F2_F4 (F2 markers with phenotypes represented by means of F3 or F4 families, respectively). Mark <QTL-E> radio button in the QTL-environment interaction case.
1-39
Starting to load the data: Old Input (continued)
Step 3: Enter the number of environments, genotypes, and phenotypes
By default, there is one environment, and the number of phenotypes equals the number of genotypes. In the case of multi-environment analysis, the number of phenotypes equals the number of genotypes multiplied by the number of environments (we assume that the same genotypes are assayed in all environments). After the number of genotypes is entered, the number of phenotypes appears automatically.
In the case of data from a multiple-environment design, this message appears after you push the <Next> button.
By answering “Yes”, you can change the names of the environments (by default: env1, env2, etc.) using this table.
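The relation between genotypes, environments and phenotypes can be made concrete with a short Python sketch. It assumes the layout stated earlier for trait files (all values for the first environment, then all values for the second, and so on); the function name is hypothetical and the default names env1, env2, ... follow the defaults mentioned above.

```python
def split_by_environment(values, n_genotypes, env_names=None):
    """Split one trait's values into per-environment blocks.

    Assumes the values are ordered environment by environment, so that
    len(values) == n_genotypes * number_of_environments.
    """
    if n_genotypes <= 0:
        raise ValueError("n_genotypes must be positive")
    if len(values) % n_genotypes != 0:
        raise ValueError("number of phenotypes is not a multiple of the number of genotypes")
    n_env = len(values) // n_genotypes
    if env_names is None:
        env_names = [f"env{i + 1}" for i in range(n_env)]
    if len(env_names) != n_env:
        raise ValueError(f"expected {n_env} environment names")
    return {
        name: values[i * n_genotypes:(i + 1) * n_genotypes]
        for i, name in enumerate(env_names)
    }


blocks = split_by_environment([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], n_genotypes=3)
print(blocks)
```

Six phenotypes with three genotypes give two environments, each holding one value per genotype in the same genotype order.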
1-40
Starting to load the data: Old Input (continued)
Step 4: Enter the symbols of the markers that are used in the data files
By default:
Backcross - 1,2
F2 - 1,2,3,4,5
[Screenshots: Backcross (by default); F2 population (example)]
Notes:
• missing marker symbol - one character only or a digit >5 (by default 0)
• missing trait symbol - one character only, not a digit! (by default $)
• if your codes differ from the default, you should enter your codes. Then, by pressing button <Next>, you will move to the next step
1-41
Starting to load the data: Old Input (continued)
Step 5: Select the data files (chromosomes and traits) to use in the problem
First select the drive, then select the folder, and, finally, the files. The icons of the files correspond to their type.
Remark: Only one *.tra file may be selected for the Problem!
Click on the <Include> button to add the files to the problem.
At this stage, the program checks that the numbers of genotypes and phenotypes for each marker and trait equal the numbers entered earlier (at step 3). Only files that fit this condition may be included. Error messages give the names of the files not included in the problem (i.e., those that violate this condition).
1-42
Starting to load the data: Old Input (continued)
Step 5 (continued): Error messages
For chromosome file: A marker name is incorrect if it begins with a digit or if there is no space between the marker name and its values. The number of genotypes may be incorrect if the name of the next marker is missing. Examples of marker name errors:
35EM47jw 1 1 2 3 1 2 2 3 2 1
EM47jw1 1 2 3 1 2 2 3 2 1
For traits file: Example of an error in the trait values: 0.1500.090 (a space is missing). In trait files with extension *.tra additional errors may occur, e.g., no space between the name of the trait and its values. Example: Wstrs0.17
All errors are saved in the file errors.txt of the Problem folder.
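Errors of the kinds listed for Step 5 can be caught before loading with a simple pre-check. The Python sketch below is an illustration only; the function names and message texts are hypothetical and do not reproduce MultiQTL's own messages. It assumes single-character marker scores and the default missing trait symbol $.

```python
def check_marker_row(line, n_genotypes):
    """Return a list of problems found in one chromosome-file row."""
    parts = line.split()
    if not parts:
        return ["empty row"]
    problems = []
    if parts[0][0].isdigit():
        problems.append("marker name begins with a digit")
    # A missing space after the name glues the first score to the name,
    # which shows up as one score too few.
    n_scores = len("".join(parts[1:]))
    if n_scores != n_genotypes:
        problems.append(f"expected {n_genotypes} scores, found {n_scores}")
    return problems


def bad_trait_tokens(tokens, missing="$"):
    """Return the tokens that are neither numbers nor the missing symbol."""
    bad = []
    for tok in tokens:
        if tok == missing:
            continue
        try:
            float(tok)
        except ValueError:
            bad.append(tok)  # e.g. two values run together
    return bad


print(check_marker_row("35EM47jw 1 1 2 3 1 2 2 3 2 1", 10))
print(check_marker_row("EM47jw1 1 2 3 1 2 2 3 2 1", 10))
print(bad_trait_tokens(["0.1500.090", "0.17", "$"]))
```

The second row illustrates why a missing space is reported as a wrong number of genotypes: the first score is read as part of the marker name, so only nine scores remain.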
1-43
Starting to load the data: Old Input (continued)
Step 5 (continued):
When some of the included files have the correct number of objects, this message box appears:
By answering “Yes” and pressing the <Finish> button, we move to the next stage of data input. By answering “No” and pressing <Cancel>, we exit from the program to correct errors (in accordance with the messages on the detected errors concerning the number of objects, as on the previous page). If there are no files with the correct number of objects, we get the message:
Press <Cancel> and <Exit> for error correction.
1-44
Project Creation Stage
When creating a project, the <Select Data Files> window appears on the screen. A Project is a file with extension *.job that holds all the data and definitions of a problem, including the methods and the results of the calculations that were conducted. To create the project, some or all files from a directory may be chosen. Several Projects may be created in one Problem folder.
At this stage, the correctness of marker and trait values is verified; if there is an error, windows for correction appear on the screen.
The process of error correction described in the following pages relates only to <Old Input>. For <New Input>, developed for v. 2.5, checking and correcting the data is much easier and is conducted directly during data loading, as explained above (see page 1-31).
Project Creation Stage (continued)
Errors in the chromosome files
Please note that the program checks for correspondence between marker state designations for each of the dominant markers.
All markers from the entire chromosome are shown in the window.
Click on the button near the error to display the error correction window.
Correct the errors and press the <Next file> button. The next errors will appear.
1-45
1-46
Project Creation Stage (continued)
Errors in the trait values. Error message
Errors in the trait file: the missing symbol does not correspond to the symbol defined in the Problem folder, or a trait value is not a number.
Correct these errors and press <Save and exit>. If the <Cancel> button is clicked or some errors were not corrected, the window with the question appears. If no more errors are found, project creation is completed.
If <OK> is pressed in the current panel, error control is performed once again and the corresponding messages appear in the window <Input Data Report>.
Project Creation Stage (continued)
Errors in chromosome: recombination rate >0.5
In the chromosome file, a special error message may appear because markers are in repulsion phase. This can be solved by changing the phase (but see the next page).
Click <Yes> to see the upper window. Click the <Apply> button to see the lower window; the phase is changed for markers 3 and 4. Click the <Save and Exit> button to save the chromosome under a special new name, atz8mn%1.
1-47
Part 1: Loading and Editing Real Data
This data control is relevant for both Old Input and New Input.
In this case, changing the phase for one interval does not improve the situation. Thus, the program suggests dividing the chromosome into three parts.
Chromosomes with recombination rates >0.5 cannot be included in the project.
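The recombination-rate check described here can be illustrated with a minimal sketch. It assumes a population with only two marker states per locus (for example, backcross-type data) and a hypothetical coding of the states as "0" and "2" with "-" for a missing score; the function names are illustrative and are not part of MultiQTL.

```python
def recombination_fraction(marker_a, marker_b, missing="-"):
    """Share of individuals whose two marker scores differ (missing scores are skipped)."""
    pairs = [(a, b) for a, b in zip(marker_a, marker_b)
             if a != missing and b != missing]
    if not pairs:
        raise ValueError("no individuals scored for both markers")
    recombinants = sum(1 for a, b in pairs if a != b)
    return recombinants / len(pairs)


def flip_phase(marker, state_1="0", state_2="2"):
    """Swap the two marker states, i.e. change the phase of one marker."""
    swap = {state_1: state_2, state_2: state_1}
    return "".join(swap.get(score, score) for score in marker)


# Hypothetical scores of two adjacent markers for ten individuals.
m3 = "0022002200"
m4 = "2200220022"  # scored in the opposite phase to m3

r = recombination_fraction(m3, m4)                    # rate above 0.5
r_fixed = recombination_fraction(m3, flip_phase(m4))  # rate after the phase change
```

When the rate for an interval stays above 0.5 even after the phase change, the sketch offers no remedy; that is the situation in which the program suggests dividing the chromosome.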
Project Creation Stage (continued)
Errors in chromosome: recombination rate >0.5 (continued)
1-48
Important note: For Families data, the program only reports that a certain interval in a certain family shows a recombination rate >0.5. For this family, you can enter a separate data set and a transformed chromosome, and then use the transformed information to input the data sets for all families.
1-49
Then choose the Problem folder
and its information file.
Project Creation Stage (continued)
Creating a new project
It is possible to create several Projects in a single Problem folder. To do so, choose menu item FileNewProject.
A <Select Data Files> window for project creation appears.
1-50
Graphical display of information file
After the Load Data operation is finished, all relevant files are automatically copied to the folder specified in Step 1. In this folder, an information file with extension *.inf is created that holds all definitions and file names used in the Problem. This information file can be examined using the Edit → Setting option of the main menu. First choose the Problem folder, and then a file with extension *.inf.
Information file with Problem name with one environment
1-51
In these cases you may change the environment (or family) names. To do so, click <Change Environment Names> or <Change Families names>, then change the names in the window that appears.
Graphical display of information file (continued)
For a Problem with multiple environments, the information file window looks like:
For a Problem with multiple families, the information file window looks like:
1-52
Graphical display of information file (continued)
For a Problem with Selective genotyping, the information file window looks like:
The name of the trait that was the basis for selecting individuals from the tails of the trait distribution for selective genotyping.
For the trait file with extension *.tra this frame shows the name of the file and names of traits in that file.
Option Change marker order
It is possible to revise (change) the order of the markers in any chromosome. To do so, it is necessary to create a special file with extension *.ord in the Problem folder for every chromosome to be changed. For example for chromosome chrom2 with 12 markers we create the file:
1-53
chrom2Ord.ord
3 6 2 7 8 9 11 1 5 12 4 10
Then we select <Edit → Change Marker Order> and choose the corresponding chromosome and order file from the folder.
The new chromosome should be saved with a new or an old name.
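Before creating an order file, it can help to confirm that the list is a valid reordering. The sketch below assumes the file holds whitespace-separated marker numbers and that each number is the old position of the marker placed at that point of the new order; this interpretation and the function names are assumptions, not a description of how MultiQTL reads the file.

```python
def parse_marker_order(text, n_markers):
    """Parse marker numbers and check that they are a permutation of 1..n_markers."""
    order = [int(token) for token in text.split()]
    if sorted(order) != list(range(1, n_markers + 1)):
        raise ValueError("order must list every marker from 1 to %d exactly once" % n_markers)
    return order


def reorder(markers, order):
    """Return the markers rearranged so that position k holds old marker order[k]."""
    return [markers[old_position - 1] for old_position in order]


# The order from the chrom2 example (12 markers).
order = parse_marker_order("3 6 2 7 8 9 11 1 5 12 4 10", 12)
names = ["m%d" % i for i in range(1, 13)]  # hypothetical marker names
new_names = reorder(names, order)
```

A duplicated or missing marker number raises an error instead of silently producing a chromosome with a lost marker.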
This option is used with <Old Input>. With <New Input>, you can change the order of the markers in the chromosome during input (see page 1-24).
Option Add data files
It is possible to add data files to an existing Problem folder. This is done using the option <Edit → Add to Problem → Add Data Files> from the main menu. Then choose the Problem folder and its information file in the window <Select Problem to add data files>.
1-54
Now choose a folder and the files in it that will be added to the chosen Problem folder in the <Select data files to add into Problem> window.
Option Add data files (continued)
The number of genotypes for a chosen chromosome file is compared to the number of genotypes in the Problem folder. The correspondence of marker symbols to those in the Problem folder is checked.
Only *.trt files may be used for adding trait files. The number of phenotypes in the new file is compared to the number in the Problem folder. One of the traits of a *.tra file may be overwritten by adding a *.trt file with the same trait name (in this example, the trait “front”).
Possible error messages appear.
1-55
If the names of added files coincide with the names of existing files, a message offering to overwrite the existing file appears.
1-56
Option Add marker files
Example of a file named mona4h_5.mrk containing marker E35M55:
E35M55 444024244422244244444244444202424424444444444
Sometimes you may need to add a new marker to a chromosome that already exists in Problem folder. The marker addition procedure will be shown on the next page. First, we place the marker file in the Problem folder. The marker file format is the same as the chromosome file format, but it includes one marker only. The file’s extension should be *.mrk.
During input of a marker file, the user must know the chromosome name and the interval number for its marker. It is helpful if the marker file name contains this information. For example, we use a name like mona3h_6 to indicate that the marker in this file will be added to interval 6 of chromosome mona3h.
The option for adding a marker file, <Edit → Add to Problem → Add Marker Files>, works similarly to that for adding a data file. First, choose the Problem folder, then find the folder containing the marker files and select them.
1-57
Option Add marker files (continued)
The number of genotypes from the file with the additional marker as well as marker scores are compared against the corresponding values in the Problem folder. The interval number defining the position of the new marker must be less than the total number of markers in the chromosome. Otherwise, an error message will appear.
In the window <Inserting Markers>, choose the marker file name, the chromosome name in the Problem’s chromosome folder, and the input interval number. Then click <OK>.
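The naming convention and the interval check can be sketched as follows. The helper names are hypothetical, the lower bound of 1 for the interval number is an assumption, and the program itself takes the chromosome and interval from the <Inserting Markers> window rather than from the file name.

```python
def parse_marker_file_name(file_name):
    """Split a name such as 'mona3h_6.mrk' into the chromosome name and interval number."""
    stem = file_name[:-4] if file_name.lower().endswith(".mrk") else file_name
    chromosome, separator, interval = stem.rpartition("_")
    if not separator or not chromosome or not interval.isdigit():
        raise ValueError("expected a name of the form <chromosome>_<interval>")
    return chromosome, int(interval)


def interval_is_valid(interval, n_markers):
    """The interval number must be less than the number of markers in the chromosome."""
    return 1 <= interval < n_markers


chromosome, interval = parse_marker_file_name("mona3h_6.mrk")
ok = interval_is_valid(interval, n_markers=12)  # 12 markers is a made-up value
```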
Deleting data files from the Problem folder
It is possible to delete data files from an existing Problem folder. This is done by selecting <Edit → Delete> from the main menu.
Then select the Problem folder and its information file.
In the window <Select files to delete>, select the files to be deleted. After clicking <OK>, those files will be deleted.
1-58
1-59
Extracting data and information file from Project
It is also possible to extract data and problem information from the Project file.
To do that, the Project must be opened using the <File → Open Project> option.
When we perform the <File → Extract> option, we enter the full path of the folder where the extracted information will be placed. The extracted information file keeps the name of the initial *.inf file.
A Project file with extension *.job contains problem information, fitted models, used methods and obtained results. It is not restricted to the Problem folder and can be placed in any other folder.
1-60
Extracting data and file information from Project (continued)
Example of information file
with extracted data
In the case of multiple Families, the data for each family are placed in a separate sub-folder of the chosen folder (named Fam1, Fam2 …).
1-61
Data revision option
This option is used to change chromosome or trait data. We can delete (or add) a marker in the chromosome or select part of the chromosome. Besides such chromosome changes, we can transform the trait data. This option is available for open Projects only. To revise the chromosome data, select <Data revision → Chromosome> from the menu or tool bar.
To revise the trait data, choose <Data revision → Trait> from the menu or tool bar.
This option is not provided for data with multiple Families, multiple Environments, or Selective Genotyping.
1-62
In the inside panel, we see the marker names and segregation ratios, together with chi-square values for deviation from the expected ratios; the significance of deviations is marked by *, **, *** (for p < 5%, 1%, and 0.1%, respectively).
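A minimal sketch of such a segregation test is shown below. It assumes a marker with two classes and an expected 1:1 ratio, so the chi-square statistic has one degree of freedom; the critical values 3.841, 6.635 and 10.828 are the standard ones for that case. The function names are hypothetical and the program's own computation may differ in detail.

```python
# Standard chi-square critical values for 1 degree of freedom.
CRITICAL_DF1 = [(10.828, "***"), (6.635, "**"), (3.841, "*")]


def segregation_chi_square(observed, expected_ratio):
    """Chi-square statistic for observed class counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    chi2 = 0.0
    for count, part in zip(observed, expected_ratio):
        expected = total * part / ratio_sum
        chi2 += (count - expected) ** 2 / expected
    return chi2


def stars_df1(chi2):
    """Significance marks for a statistic with 1 degree of freedom (two classes)."""
    for critical, stars in CRITICAL_DF1:
        if chi2 >= critical:
            return stars
    return ""


chi2 = segregation_chi_square([60, 40], (1, 1))  # 100 individuals, 1:1 expected
mark = stars_df1(chi2)
```

For markers with three classes (for example, a co-dominant marker in F2 with a 1:2:1 ratio) the statistic is computed the same way, but the critical values for two degrees of freedom would be needed.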
Data revision option (continued)
Chromosome revision
Select the chromosome. Blue and red denote dominant markers linked in repulsion phase, whereas green denotes co-dominant markers.
Note that our new MultiPoint software (under development) allows building high quality multilocus maps based on such information (for the method see Mester et al. 2003 a,b).
Click the <Rec.distances> button. The table of pairwise marker distances appears.
1-63
To delete a marker, choose the <Select marker to delete> option and select the desired marker (for example, marker #4).
Data revision option
Deleting a marker
The new chromosome is shown in the <Changed chromosome report> window.
To continue deleting markers, choose the <Select marker to delete> option again and select the desired marker in the <Changed chromosome report> window.
Chromosome revision (continued)
1-64
To add a marker, choose the <Select interval to add> option. The panel for marker selection appears. After clicking the <OK> button of this panel, the new marker (3a) is added to the specified interval.
Data revision option (continued)
Adding a marker
Any change made to the map can be cancelled by clicking the <Undo> button.
Chromosome revision (continued)
Data revision option (continued)
Selecting part of a chromosome
Choose the <Select part of chromosome> option. The <Select part of chromosome> window appears.
Click the upper (lower) radio button of this panel to choose the first (last) marker of the chromosome part in the <Chromosome information> window. Press the <OK> button to obtain the new version of the chromosome.
1-65
Chromosome revision (continued)
Data revision option (continued)
Selecting part of a chromosome
1-66
Chromosome revision (continued)
We will now use the described option to subdivide a large set of markers into parts corresponding to separate chromosomes (see p. 32). Each such part can be selected by indicating its flanking markers. To do so, we go to the <Chromosome revision> option and use the function <Select part of chromosome>. After selecting a part, we should save it under a new name (for more details see next page).
You can see here an example with two selected parts, chrom1_new (markers 1-36) and chrom2_new (markers 38-74). Their LOD-score graphs for a multiple-trait model are shown in the figures.
1-67
Data revision option (continued)
Saving your changes
Each of the above changes creates a new chromosome, which may be saved. To do so, press the <Apply> button in the <Chromosome information> window or close the <Changed chromosome report> window. The window <Status of the new chromosome> appears. The <Exit without saving> option is available only if the <Changed chromosome report> window was closed.
Chromosome revision (continued)
Choose the first option from the Status window to add the new chromosome to the current Project and save it in the Problem folder. Choose the second option to add the new chromosome to the current Project without saving. Choose the third option to replace the old chromosome by the new one (the old chromosome is deleted), adding the new chromosome to the current Project and saving it in the Problem folder.
The name for the new chromosome must be given.
1-68
You can also choose the width of trait grouping in the histograms.
Data revision option (continued)
Trait revision
You can see the effect of a chosen interval/marker on the trait distribution in the alternative QTL groups.
To use this option, select the names of the traits to display or edit.
Select the chromosome and interval to examine.
1-69
You can transform the scale of the trait distribution if needed. The decision can be based on the numerical values of parameters characterizing the trait distribution, e.g., asymmetry and kurtosis. You can control the transformation based on the parameters of the resulting distribution.
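How asymmetry and kurtosis can guide the choice of a transformation is sketched below. The moment-based formulas are the usual population estimates of skewness and excess kurtosis; the logarithm is only one possible transformation and the trait values are made up for illustration. Which estimators and transformations MultiQTL uses is not stated here, so treat this as an assumption.

```python
import math


def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both are 0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Made-up, right-skewed trait scores.
trait = [1.0, 1.2, 1.5, 2.0, 2.8, 4.0, 6.5, 12.0]

before = skewness_and_excess_kurtosis(trait)
after = skewness_and_excess_kurtosis([math.log(v) for v in trait])
```

Comparing `before` and `after` shows whether the transformed scores are closer to a symmetric distribution than the original ones.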
Data revision option (continued)
Trait revision (continued)
You can either replace the scores of the original trait with the transformed ones or add the new trait for QTL analysis. A window for the new trait name appears.
1-70
Outliers (extreme data points) can be displayed using the “Tail Cutting” option. To display these points, click on the red tails.
Data revision option (continued)
Trait revision (continued)
1-71
To see information about a certain genotype click on the corresponding point on the screen.
If the Two traits option is chosen, the diagram shows, for the alternative QTL groups, the “0.95-areas” that contain 95% of the genotypes from those groups.
Data revision option (continued)
Trait revision (continued)
2-1
Part 2
Data Simulation
Table of Contents
Introduction 2
Step 1: Input type and size of the data 3
Step 2: Input global simulation parameters 4
Step 3: Input marker number and chromosome length for regular and multi-environment data 5
Input marker number and chromosome length for families data 7
Step 4: Setting dominant markers 9
Step 5: Setting the QTL parameters for regular and multi-environment data 10
Setting the QTL parameters for families data 12
Step 6: Setting the epistasis values 13
Step 7: Simulation of general parameters 14
2-2
Input of real data (chromosomes & traits) was shown in Part 1.
In this part we show how to generate “artificial” (simulated) data.
To do that, choose <File → Simulation> from the main menu.
Part 2 : Data simulation
Introduction
2-3
Step 1: Input type and size of the data
Regular: only the sample size (number of individuals) should be entered.
Multi-environment (‘Environments’): the number of environments and the number of individuals in each environment (assumed equal) should be entered.
Families: the number of families and the number of genotypes in each family should be entered. If you switch the option <All equal> to <On>, you can set equal family sizes.
2-4
Input number of: chromosomes, traits (at most 15)
Click <OK> to go to step 3.
Select the mapping function (Haldane or Kosambi) and the population type:
Step 2: Input global simulation parameters
We can simulate “missing marker scores” by setting the percentage of lost marker scores (0 means no missing marker data). In simulations of multiple families, only data without missing scores are possible.
In simulation of data with multiple environments or multiple families, the number of traits (in the current version) is limited to two. By default, the families are named fam1, fam2 …, and the environments env1, env2 …
2-5
Each red line corresponds to one chromosome.
Step 3: Input marker number and chromosome length
(for regular and multi-environment data)
To select all of the chromosomes, press the <All> radio button. Input the number of markers and the chromosome length. All chromosomes will have the same length and number of markers. However, the chromosomes already selected will not be affected by pressing the <All> radio button.
Select a chromosome, input the number of markers and the chromosome length, and press the <Set markers> button.
A description of each operation is given at the bottom of the window.
2-6
Step 3 (for regular and multi-environment data) (continued):
In this example, the first and third chromosomes have 9 markers and length 80 cM, whereas the second chromosome has 11 markers and length 110 cM defined previously.
You can move specific markers on any selected chromosome by pressing the left mouse button and dragging them. The new marker positions appear on the screen. You can delete specific markers by pressing the right mouse button.
After finishing marking the chromosomes, press the <Next> button to go to the next step.
2-7
Step 3: Input marker number and chromosome length (for families)
In case of multiple families, these parameters should be set up for each chromosome separately. After entering the number of markers, we get a window for setting the chromosome length for each family. By using option <Equal> we can set equal lengths of this chromosome across families. Pressing <OK> means switching on the function <Set markers>.
A marked chromosome and the <Families selection> window then appear. If the chromosome lengths are equal among families, the lower radio button is called <Into all family>, whereas for unequal lengths it is called <Proportionally to the length>.
2-8
Step 3 for families data (continued)
Now for each family you can move or delete any marker of the considered chromosome.
If we move the marker within the chromosome, then the interval length will change equally among families, in the case of equal lengths of the chromosome among families, and proportionally to the chromosome length, in the other case. By marking Fam1, or Fam2, etc., we will affect the interval lengths only in the chosen families.
By choosing one of Fam1, Fam2 … and right-clicking a marker, we can delete this marker in the marked family. If <Proportionally to the length> was chosen, the system prevents deletion (to exclude the situation of a marker that is empty across the whole data set). When deletion is permitted, the system asks for confirmation; once it is given, the deleted marker is highlighted in gray.
After defining the markers in all chromosomes, press <Next> to move to the next step of the simulation. For an F2 population, this step is defining dominant markers (if needed). For family data, this is done in the same way as for regular or multi-environment data.
If the population structure allows for both homo- and heterozygous marker states (e.g., F2), you can set dominant markers on each chromosome without selecting it.
Step 4: Setting dominant markers (for all types of data)
First, set a red-blue marker by pressing the left mouse button. Then select either the maternal (red) or paternal (blue) allele of the dominant marker by clicking the red or blue part.
After finishing marking the chromosomes, press the <Next> button to go to the next step. Press <Previous> to go to the previous step.
2-9
2-10
You can set the QTL location within the interval by moving the ‘location’ indicator and entering the QTL substitution effect (d) and the dominant (heterozygous) effect (h) for F2 population [for RIL or dihaploid, effect (d) only].
Step 5: Setting the QTL parameters
(for regular and multi-environment data)
To set a QTL at any interval of any chromosome, left-click this interval. The <Setting QTL’s parameters> window appears on the screen.
You must input effect values for each trait. If the <Equal> option is selected, all traits will have the same substitution and heterozygous effects.
2-11
Step 5 Setting the QTL parameters (for regular
and multi-environment data) (continued)
Click the <OK> button to set the QTL on the chromosome.
In the case of several environments (see step 1 of the simulation dialog) only one or two traits are possible in the current version.
Substitution and dominance effects are entered for every environment.
Every QTL is symbolized by a green triangle. Left-click it to open the <QTL’s set> window with the QTL parameters, to review or modify them. To delete a QTL, right-click its symbol.
Press <Next> to go to the next step; press <Previous> to go to the previous step.
2-12
Step 5: Setting the QTL parameters (for family data)
The effects are entered for each family separately, but if you mark the radio button <Equal>, the effect is set for all families, as in the simulation mode for multiple environments. Still, the relative position of the QTL within the chosen interval can be adjusted separately for each family. To do so, choose the family in the <Families selection> window, and then adjust the QTL position using the “location” indicator of the <Setting QTL’s> window.
You can also make the adjustment (if needed) for all families simultaneously by using the choice shown in the figure.
2-13
Step 6: Setting the epistasis values (for all types of data)
If you have set more than one QTL on the same chromosome, at this step you can select interacting QTL pairs and set the epistasis value(s). To do so, click the two QTL symbols in turn. Values for each epistatic component are entered for each trait in the <Setting epistatic effects> window. If the <Equal> option is selected, all traits will have the same epistasis values. Click the <OK> button to set the epistasis values for the QTL pair. The <Setting epistatic effects> window will close and the interacting QTL pair will be shown by a dotted arc.
Press <Next> to go to the next step. Press <Previous> to go to the previous step.
2-14
Step 7: Simulation of general parameters (for all types of data)
You should also set general simulation parameters:
- mean value of the trait
- standard deviation for each trait
- residual correlation for each trait pair
The <Setting parameter values> window is used to do this. If the <Equal> option is selected, all the traits will have the same parameter values. The residual correlation matrix should be positive definite (i.e., all its eigenvalues should be positive; a positive determinant is necessary but not sufficient). This property is not automatically satisfied by an arbitrary matrix with coefficients between -1 and +1. To guarantee it, use the <Correlation fitting> tool.
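Whether a chosen residual correlation matrix is positive definite can be checked before it is entered. The sketch below uses a plain Cholesky factorization, which succeeds exactly when a symmetric matrix is positive definite; it is an independent illustration with hypothetical names and made-up matrices, not the algorithm behind the <Correlation fitting> tool.

```python
import math


def is_positive_definite(matrix, tol=1e-12):
    """Return True if the symmetric matrix admits a Cholesky factorization."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= tol:
                    return False  # a non-positive pivot means not positive definite
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


# Every coefficient lies between -1 and +1 in both matrices.
consistent = [[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.3],
              [0.2, 0.3, 1.0]]
inconsistent = [[1.0, 0.9, -0.9],
                [0.9, 1.0, 0.9],
                [-0.9, 0.9, 1.0]]

ok = is_positive_definite(consistent)
bad = is_positive_definite(inconsistent)
```

The second matrix is rejected because traits 1 and 2, and traits 2 and 3, are strongly positively correlated while traits 1 and 3 are strongly negatively correlated, which no set of three variables can satisfy.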
Click the <OK> button to set global parameter values. The <Setting parameter values> window will be closed and the simulation for the project will begin. Once complete, we receive chromosomes and traits in the format of real data.
3-1
Part 3
Model Creation and Calculation stages
Table of contents
Model Creation stage 2
Page 1: Model Parameters 3
Two-trait two lists option: Example 5
Multiple-trait option: Example 6
Initial Submodel Default option 7
Trait normalization option 7
Selective Genotyping option 7
Page 2: Extended Parameters 8
Finish 10
About model 11
Calculation panel description 14
Results of calculation 15
Save and Open Project options 18
3-2
Model Creation stage
Input of real and simulated data to the system was shown in Parts 1 and 2. In this part we show how to create various models to analyze these data.
Once Project Creation or Data Simulation is finished, the window for setting the model appears.
If a Project is open, we can obtain this window using <Model → Create> from the main menu.
The Model Creation window consists of three panels.
Part 3: Model Creation and Calculation stages
3-3
Page 1: Model Parameters
Name of the window - the type of population created in the Problem folder or during the process of data simulation.
Name of Page1: Model Parameters.
It is necessary to:
- enter a unique name for each model;
- select the mapping function (Haldane or Kosambi);
- select the number of QTLs per chromosome to fit (“Single QTL” or “Two-linked QTL” model).
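The two mapping functions named in the list convert a map distance into a recombination fraction. The sketch below uses their standard textbook forms, with the distance given in centiMorgans; the function names are illustrative.

```python
import math


def haldane(distance_cm):
    """Haldane mapping function: assumes no crossover interference."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def kosambi(distance_cm):
    """Kosambi mapping function: allows for crossover interference."""
    return 0.5 * math.tanh(2.0 * distance_cm / 100.0)


r_haldane = haldane(10.0)  # recombination fraction for a 10 cM interval
r_kosambi = kosambi(10.0)
```

Both functions give values close to the distance itself for short intervals and approach 0.5 for long ones; for the same distance, the Kosambi value is the larger of the two.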
Global number of traits is displayed in the area <Number of selected traits>
3-4
Page 1: Model Parameters (continued)
You should select traits in the area <Names of traits>. The number of selected traits is displayed in the <Number of selected traits> area.
Select an option in the area <Number of traits in the model>
Four options are possible:
one-trait - calculations will be performed separately for each of the selected traits
two-trait - calculations will be performed for each pair of the selected traits (Korol et al. 1995)
multiple trait - a multiple trait model will be fitted for the selected traits (no more than 40 traits can be included) (Korol et al. 2001)
two-trait from two lists - the traits to form pairs of traits are selected from two lists (see next page)
3-5
Page 1: Model Parameters (continued)
Two-trait two lists option: example
After you select this option, two lists with the names of the traits appear. The traits that form the pairs are selected from the two lists: each trait chosen in the first list is combined with each trait chosen in the second list. The traits selected in the two lists must be different; if any coincide, an error message appears.
3-6
Page 1: Model Parameters (continued)
Multiple-trait option: example
Out of 50, 34 traits were selected, and the number of objects common to all of the 23 selected traits is 63.
Note: The multiple-trait option is not available for the Selective genotyping, Multiple-environment, and Multiple-family models. If the multiple-trait option was selected, it is possible to see the table of the number of objects for each trait. To do so, click the <Table of objects number> button.
In this example, the global number of objects was 152, and the global number of traits was 50.
3-7
Initial Submodel Default option
By default, it is assumed:
- no variance and covariance effects;
- no epistasis, for a model with two-linked QTLs.
To change the default, check the box <Specifying default form of initial model>. After clicking the <Add Model> button, a <Submodel> window appears that allows nullifying the effects chosen by default. The form of this window may change slightly depending on the selected models and type of data. For more detail about Submodel options see Parts 4, 5, 8 and 9.
Page 1: Model Parameters (continued)
Trait normalization option
This option can be applied only to data with several environments (in QTL-Environment analysis). If the option <Trait normalization> is selected, trait normalization (scaling) is performed (for more detail see Part 8). Computation results may be presented in normalized (scaled) and non-normalized forms.
Selective genotyping option
This option is used in simulation of Selective genotyping data. See Part 10 for details.
3-8
Click the tab <Extended Parameters> to move to page 2.
Page 2: Extended Parameters
Model name will be displayed. It is necessary to select:
- the calculation method: marker or interval analysis (by default interval analysis);
- marker restoration option (more details on the next page);
- number of starting points (by default 1).
In complicated cases (e.g., a two-QTL model across multiple environments with a small sample size), the convergence of the optimization procedure for the log-likelihood function may strongly depend on the “starting” point (initial values of the genetic parameters). You may want to check the uniqueness of the obtained solution by choosing several starting points, thereby reducing the risk of missing the global maximum of the likelihood function.
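Why several starting points help can be seen on a toy example. The sketch below maximizes a made-up one-parameter function with two peaks using a simple hill-climbing search; both the function and the optimizer are stand-ins chosen for illustration and are not the likelihood or the optimization procedure used by MultiQTL.

```python
def hill_climb(f, start, step=0.1, min_step=1e-6):
    """Move uphill in fixed steps; halve the step when neither neighbor is better."""
    x = start
    while step > min_step:
        if f(x + step) > f(x):
            x += step
        elif f(x - step) > f(x):
            x -= step
        else:
            step /= 2.0
    return x


def toy_log_likelihood(x):
    """Made-up surface with a local peak near x = -0.96 and the global peak near x = 1.04."""
    return -(x * x - 1.0) ** 2 + 0.3 * x


starts = [-1.5, -0.5, 0.5, 1.5]
solutions = [hill_climb(toy_log_likelihood, s) for s in starts]
best = max(solutions, key=toy_log_likelihood)
```

The runs that start at negative values stop at the lower peak, so a single unlucky starting point would report the wrong maximum; taking the best of several runs recovers the higher one.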
3-9
Page 2: Extended Parameters (continued)
Marker Restoration option
This option is relevant in cases of real data or simulated data with nonzero percent of lost marker scores and/or missing information caused by marker dominance. If the option <Marker Restoration> is selected (default), missing markers will be restored.
If missing markers are not restored (when <Marker restoration> is off) then the number of objects may vary among marker intervals. In this case the option <LOD normalization> may be employed, although one should be careful in using it (see part 4 for details)
The option <Ignore Marker Loss> may be relevant in case of simulated data with nonzero loss of marker scores.
To reduce the effect of missing information, we calculate the probabilities of being a heterozygote and dominant homozygote for each dominant marker phenotype or missing marker phenotype, based on scores of the neighbor markers. These probabilities help us to calculate the likelihood function (e.g., Jiang & Zeng 1997; Jansen & deJong 1999; Peng et al. 2003).
3-10
By specifying the parameters of a model and clicking the button <Add Model> on the <Model Parameters> page, we add the model to the Project. You may use the <Model Scrolling> buttons to display all chosen parameters of the next or previous model. Click <Delete Model> to delete the chosen model (with all computation results related to it). Click <OK> to finish the Model Creation stage.
Finish
Upon finishing the Model Creation stage, the <Calculation Panel> window appears.
This window may also be accessed using the <Model Open> menu.
Click <About Model> to get a full description of the selected model (see next page). For a detailed description of the <Calculation Panel> see page 14.
3-11
About model
By clicking <About Environment> we can display the number and names of the environments (where there are several). By clicking <About simulation> we see the <Setting the chromosomes> window (for simulated data only).
A description of the selected model is shown in the <About model> window.
In case of Families data the corresponding information can be obtained by pressing <About families> button.
3-12
About model (continued)
Click a chromosome name to get the number and lengths (in cM) of intervals.
Clicking the <Param> button opens the <Setting parameters Value> window.
<QTL-s set> window appears when clicking a QTL’s triangle.
About simulation for regular and multi-environment data
3-13
About model (continued)
About simulation for families data
In such cases, after the chromosome is selected, a window <Families selection> appears. Here, for each family we can get information about interval lengths in each chromosome.
Likewise, after clicking the QTL’s green triangle, in order to get the information of the QTL relative position in the target interval in each of the families, we should use the same window <Families selection>.
3-14
Calculation Panel description
On this panel all chromosomes of the Problem and all selected traits are displayed. Traits are displayed according to the selected model: one trait, trait pairs, or several traits.
In order to compute a model for one square (e.g., chromosome - trait pair combination), click this square and then the <Compute> button. To analyze all combinations, click <All> then <Compute>. You may also choose any chromosome or trait separately and then click the <Compute> button.
Scrolling is available for chromosomes and traits.
For Families data you should also set the number of steps for calculating the QTL position within the intervals (10 by default). For more detail see the section on Families.
We may choose any one model from all created models in the <Models> menu.
3-15
Results of calculation
The calculation graphs are shown in compressed form.
The maximum LOD value of each graph is also shown.
While computing, the <Progress line> is updated in the window.
3-16
Results of calculation (continued)
Switch the check box <Global Bootstrap or Permutation test> to <On>; the window will change. We can now conduct Bootstrap analysis or a Permutation Test for several or all “cells” simultaneously. To do so, mark the cells of interest (highlighted in blue). The number of runs can be changed (1000 by default). You can first conduct the Permutation Test and then the Bootstrap (recommended order).
After the Permutation Test is conducted, the significance levels and the chosen numbers of permutation runs are shown in the cells.
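The idea behind a permutation test can be sketched in a few lines. The example below shuffles trait scores among individuals and uses the difference between the trait means of two marker classes as the test statistic; this statistic is a simple stand-in, since the program works with LOD scores, and the data and names are made up for illustration.

```python
import random


def mean_difference(trait, genotypes):
    """Absolute difference between the trait means of marker classes 'A' and 'B'."""
    a = [t for t, g in zip(trait, genotypes) if g == "A"]
    b = [t for t, g in zip(trait, genotypes) if g == "B"]
    return abs(sum(a) / len(a) - sum(b) / len(b))


def permutation_p_value(trait, genotypes, runs=1000, seed=1):
    """Share of shuffled data sets whose statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = mean_difference(trait, genotypes)
    shuffled = list(trait)
    hits = 0
    for _ in range(runs):
        rng.shuffle(shuffled)  # breaks any real link between trait and marker
        if mean_difference(shuffled, genotypes) >= observed:
            hits += 1
    return (hits + 1) / (runs + 1)


# Made-up data: 10 individuals per marker class, with a clear trait difference.
genotypes = ["A"] * 10 + ["B"] * 10
trait = [10.0 + 0.1 * i for i in range(10)] + [12.0 + 0.1 * i for i in range(10)]

p_value = permutation_p_value(trait, genotypes, runs=1000)
```

With 1000 runs the smallest attainable value is 1/1001, which is why the number of runs limits the significance level that can be reported.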
3-17
Results of calculation (continued)
Let us now run the Bootstrap function for the same set of cells. We should again mark these cells and press the <Bootstrap> button. As a result, the selected cells show that this function has been carried out, together with the number of runs.
Later, for any model from the <Calculation Panel>, we can switch the check box to <On> and press <All> to get full information about the conducted tests. More detailed information about the tests is provided later.
3-18
Save and Open the Project
We have created a Project and performed our first calculation on it. To save all the data, models, and calculated results, select <File → Save Project> or <File → Save Project As>, or click the corresponding icon on the toolbar. A window for selecting the Project name will appear. Enter the file name and click <Save>. The extension *.job will be added to the file name.
To load a Project, choose <File → Open Project> from the main menu or click the green icon. The <Select JOB name to open> window appears. Select the desired Project by double-clicking the file name, or by selecting it and then clicking the <Open> button.
4-1
Part 4
Single-QTL model, one and two traits
Table of contents
Introduction  3
Description of the first Project (allMarker.job)  4
Displaying the results
  Main Window for the Analysis and LOD graph  6
  Occurred and Estimated minitables
    Interval analysis, one trait (Model mTr1int)  7
    Interval analysis, two traits (Model mTr2int)  10
  Estimate option and Estimation table
    Interval analysis, one trait (Model mTr1Int)  8
    Marker analysis, one trait (Model mTr1mark)  9
    Interval analysis, two traits (Model mTr2int)  11
  Defining and Fitting Submodel option
    Interval analysis, one trait (Model mTr1Int)  12
    Interval analysis, two traits (Model mTr2int)  14
4-2
Table of contents (continued)
Main Window for the Analysis and LOD graph
  Scanning option  15
  Comparing hypotheses H1:H0 option  17
  Compare Submodels option  18
  Bootstrap analysis option  19
  Distribution option  21
  About submodel option  22
Models with extended parameters (file “lossMarker.job”)
  Project with missing marker scores  23
  LOD Normalization option  24
  Ignore Marker Loss option  25
  Marker Restoration option  26
4-3
Introduction
In this part, we use simulated data to show how to perform the analysis. Two example data sets are used. Both are F2 populations scored in one environment, with 200 genotypes (and phenotypes) and all markers on one chromosome.
The first example (file “allMarker.job”) includes two traits and has no missing markers. We use it to show the overall results of an analysis.
The second example (file “lossMarker.job”) includes one trait with some proportion of missing markers. We use it to show the model options <Marker restoration>, <LOD normalization>, and <Ignore marker loss>. See also Part 3, page 9.
Description of the first project (allMarker.job)
This project includes 4 models: two models with one trait; and two models with two traits (interval and marker analysis in both cases).
[Figure labels: Interval; Marker]
4-4
Press any thumbnail graph to display the results of fitting the model for the chosen chromosome-trait combination. After the graph button is pressed, the LOD value is highlighted in red.
The information on each model can be displayed by clicking the <About Model> button on the <Calculation Panel>.
4-5
Description of the first project (continued)
We can display all the parameters of the simulated data by clicking the <About Simulation> button on the <About Model> panel.
For details see Part 3, page 11.
Main Window for the Analysis and LOD graph
The X axis of the LOD graph is in centiMorgans and carries marker bars. Placing the cursor on a vertical marker bar displays the marker name. For F2, F3, or F4 populations, red and blue denote dominant marker loci in repulsion phase, whereas green denotes codominant markers. For backcross (or doubled haploid, or RIL) populations, all markers are colored red.
4-6
Displaying the results
First we select the model mTr1Int. For example, we can look at the trait2 graph (LOD=4.26).
The Results menu corresponds to the new window with the LOD score graph.
For simulated data, the green triangle indicates the QTL. You can click it to see the QTL’s input parameters (user-specified) and the results of simulation (the Occurred table).
4-7
Click any place in the interval space to see all the parameter values for this interval (the Estimate table for this interval). The selected interval is highlighted in gray.
Occurred and Estimated minitables
Interval analysis, one trait (Model mTr1Int)
Any graph can be printed by choosing <Options → Print> in the Results menu. The <Occurred> and <Estimated> tables, however, are not intended to be printed.
Note that the sign of d represents the difference X(A)-X(B), or X(1)-X(2), where A and B (or 1 and 2) are the two homozygotes at the marker loci (you may use other designations, but the sign of d always follows the indicated direction). Likewise, h is calculated as the deviation from the mid-parent value, i.e., X(H)-0.5[X(A)+X(B)], or X(3)-0.5[X(1)+X(2)].
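As a minimal illustration of this sign convention, the following Python sketch computes d and h from three genotype-group means. The function name and the numeric values are hypothetical and are not part of MultiQTL.

```python
def additive_and_dominance(mean_a, mean_b, mean_h):
    """Return (d, h) from the trait means of genotype groups A, B and H."""
    d = mean_a - mean_b                      # d = X(A) - X(B)
    h = mean_h - 0.5 * (mean_a + mean_b)     # h = X(H) - 0.5[X(A) + X(B)]
    return d, h


# Hypothetical group means for homozygotes A, B and the heterozygote H.
d, h = additive_and_dominance(mean_a=12.0, mean_b=8.0, mean_h=11.0)
print(d, h)  # 4.0 1.0

# Swapping the designations of the two homozygotes flips the sign of d only.
d_swapped, h_swapped = additive_and_dominance(mean_a=8.0, mean_b=12.0, mean_h=11.0)
print(d_swapped, h_swapped)  # -4.0 1.0
```

The second call shows why the direction of the comparison matters when reading the tables: relabeling the homozygotes changes the sign of d but leaves h unchanged.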
Choosing the <Estimates> option of the Results menu opens the Estimation table with the estimated parameter values for each interval. The interval with the global maximum LOD is highlighted in red, whereas intervals with a local maximum LOD are highlighted in blue.
4-8
The table can be opened in Excel by choosing <Options → Open in EXCEL> in the Results menu. The <Print> option is also available for this table, but printing from Excel is more effective. P.E.V. [P.E.V.(ad)] is the percentage of explained variance [explained additive variance] of the trait.
Displaying the results: Estimates option
“Coordinate right end” in this table is the distance (cM) from the beginning of the chromosome to the end of the interval. “L” is the distance from the beginning of the chromosome to the point of maximum LOD in the interval.
Interval analysis, one trait (Model mTr1Int)
4-9
Lod Graph, Estimation table
Marker analysis, one trait (Model mTr1mark)
For the marker analysis case we see <Estimation table> with the estimated parameter values at each marker locus.
We may click any marker to see the estimates of all the parameters for this marker.
4-10
LOD Graph, Occurred and Estimated minitables
Interval analysis, two traits (Model mTr2int)
For the two-trait analysis, we see <Estimate> and <Occurred> tables with the estimated and occurred parameter values for each of the two traits (Korol et al. 1995).
4-11
Estimation table (two traits) Model mTr2int
The Estimation table shown here corresponds to the two-trait analysis case. The estimated parameter values are shown for each of the two traits.
4-12
Defining and Fitting Submodel option
Model mTr1int
The first calculation was performed (by default) under the assumption of equal residual variance in the QTL groups (i.e., “no variance effect”, for more details on variance effect model see Korol et al. 1996b).
We may analyze the data allowing for a variance effect or under other assumptions about the model parameters. This can be done by defining the corresponding submodels, which are created with the <Submodel → Add> option of the Results menu.
On the <Submodel> panel, all possible conditions for one trait model and F2 population are displayed.
Combinations of these conditions are also possible.
We see LOD graphs for 4 submodels in this window. The name of each submodel reflects the way it was computed. All options of the Results menu are applied to the selected submodel, which is selected by a left click. Up to 5 submodels can be created and displayed as graphs simultaneously.
4-13
Model mTr1int
Defining and Fitting Submodel option (continued)
The selected submodel may be deleted with the <Submodel → Delete> option.
Model mTr2int
Many more submodels may be created for a two-trait model.
For each selected submodel, an <Estimated> table can be displayed. In this table, the color of the rectangle denotes the current submodel.
4-14
Defining and Fitting Submodel option (continued)
4-15
Scanning option
The displayed LOD graph (for model mTr1int, trait1) is very coarse because of linear approximation: the maxLOD values of the intervals are connected to obtain the graph.
The <Scanning> option of the Results menu can be used to show more detail (it also employs linear approximation, but with many more points per interval). The <Scanning Parameters Setup> panel is used for that.
The resolution provided by the default scanning parameters is usually sufficient.
4-16
Scanning option (continued)
The result of scanning is shown in the current window. Note that the scanning option can improve the solution in some complicated situations. It complements the restricted maximum likelihood (ML) solutions, which represent the best point for each interval, by providing additional (intermediate) points for each interval. Because global multiparametric optimization of the ML function is challenging (the landscape in the parameter space can be very complicated in certain situations), such scanning may help the optimization.
The scanning option is not available for models with marker analysis. Choosing the <Unscanning> option restores the original graph.
4-17
Comparing Hypotheses H1 H0 option
To compare hypotheses H1 (there is a QTL in the chromosome) and H0 (no effect of the chromosome on the trait) you can use the permutation test (Churchill & Doerge 1994).
Choose the <Significance → Compare → Hypotheses H1:H0> option of the Results menu. This test is an easy tool to check for significance. For example, we can examine model mTr1Int, trait2, submodel h=0 (yellow). Press the <Start> button to start the test. All the preset options can be changed. A permutation test can be continued: the <Number of Runs> parameter defines the number of runs to perform in the current session, whereas the <Permutation> window displays the total number of permutations performed so far.
Definitions: <Critical LOD value> is the maximum LOD value of the current submodel. <Overcome> is the number of permutation runs in which the computed result (max LOD) was above the critical LOD value. <Threshold LOD values> are the values exceeded in 5%, 1%, or 0.1% of the permutation runs, respectively.
Press the <Reset> button to clear previous permutation result. After closing the permutation panel we see the significance value.
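The quantities reported by the permutation test can be illustrated with a short Python sketch. It is not MultiQTL's algorithm: interval mapping is replaced here by a simple single-marker regression LOD, the markers are simulated as unlinked, and all function names, the 0/1/2 genotype coding, and the data are assumptions made for illustration.

```python
import numpy as np


def marker_lod(trait, genotypes):
    """Single-marker LOD: (n/2) * log10(RSS0 / RSS1), group means as the model."""
    n = trait.size
    rss0 = np.sum((trait - trait.mean()) ** 2)
    rss1 = sum(
        np.sum((trait[genotypes == g] - trait[genotypes == g].mean()) ** 2)
        for g in np.unique(genotypes)
    )
    return 0.5 * n * np.log10(rss0 / rss1)


def max_lod(trait, markers):
    """Maximum LOD over all markers (columns) of one chromosome."""
    return max(marker_lod(trait, markers[:, j]) for j in range(markers.shape[1]))


def permutation_test(trait, markers, n_runs=1000, seed=0):
    """Permute trait values against the marker data and summarize the null max LODs."""
    rng = np.random.default_rng(seed)
    critical = max_lod(trait, markers)                 # "Critical LOD value"
    null = np.array([max_lod(rng.permutation(trait), markers) for _ in range(n_runs)])
    overcome = int(np.sum(null > critical))            # "Overcome"
    # LOD values exceeded in 5%, 1% and 0.1% of the permutation runs.
    thresholds = {a: float(np.quantile(null, 1.0 - a)) for a in (0.05, 0.01, 0.001)}
    return critical, overcome, overcome / n_runs, thresholds


# Simulated F2-like data: 200 individuals, 10 markers coded 0/1/2 (hypothetical coding).
rng = np.random.default_rng(1)
markers = rng.integers(0, 3, size=(200, 10))
trait = 0.8 * markers[:, 3] + rng.normal(size=200)

critical, overcome, p_value, thresholds = permutation_test(trait, markers, n_runs=500)
print(f"critical LOD = {critical:.2f}, overcome = {overcome}, p = {p_value:.3f}")
print(thresholds)
```

In this sketch the estimated significance is simply the share of permutation runs counted in `overcome`. With only a few hundred runs the 0.1% threshold is poorly determined, which is one reason to continue a permutation test with additional runs.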
4-18
Compare Submodels option
After fitting any two submodels, you can test whether they differ statistically. Choose the <Significance → Compare → Submodels> option to compare all relevant submodels to the selected one (Ronin et al. 1999; Peng et al. 2003; Hanotte et al. 2003).
For model mTr1int, the default submodel (see page 14), highlighted in red, can be compared with three submodels. The submodel defined by the maximal number of parameters is marked by a larger rectangle. In the <Comparison> window, select the submodel to compare to and click <OK>.
The <Comparison Test> window appears. Press the <Start> button to run the test. All the initial parameters may be changed. <Critical LOD increment> is the difference between the maximum LOD values of the compared submodels. A simulation method is used to compare the submodels (e.g., Peng et al. 2003).
4-19
Bootstrap analysis option
Bootstrap analysis estimates the standard deviations of the main parameters by repeated re-sampling of the data (with replacement). Non-parametric selective and non-selective bootstrapping (Lebreton & Visscher 1998) is provided in MultiQTL.
Choose the <Significance → Bootstrap> option of the Results menu. The <Bootstrap Test> window appears.
<Number of samples> may be changed. Press the <Start> button to start. If the permutation test has not been performed for the current submodel, <Threshold LOD values> are not created.
4-20
A histogram in the case of marker analysis looks as shown.
Bootstrap analysis option (continued)
If threshold LOD values are defined (in the permutation test), three new rows appear in the table. In this case, the Power value is reported for the 0.05, 0.01, and 0.001 significance levels, respectively. The table of the <Bootstrap Test> window can be opened in Excel by clicking the <Open in EXCEL> button.
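A plain non-parametric bootstrap of this kind can be sketched in Python as follows. The sketch uses a single marker rather than interval mapping, does not cover selective bootstrapping, and assumes that power is the share of bootstrap samples whose LOD exceeds a given threshold; the function names, the 0/1/2 coding, and the threshold of 2.0 are hypothetical.

```python
import numpy as np


def lod_and_effect(trait, geno):
    """Single-marker LOD and effect d = mean(code 0) - mean(code 2)."""
    n = trait.size
    rss0 = np.sum((trait - trait.mean()) ** 2)
    rss1 = sum(
        np.sum((trait[geno == g] - trait[geno == g].mean()) ** 2)
        for g in np.unique(geno)
    )
    lod = 0.5 * n * np.log10(rss0 / rss1)
    d = trait[geno == 0].mean() - trait[geno == 2].mean()
    return lod, d


def bootstrap(trait, geno, threshold_lod, n_samples=1000, seed=0):
    """Resample individuals with replacement; return SD of d and detection power."""
    rng = np.random.default_rng(seed)
    n = trait.size
    lods = np.empty(n_samples)
    effects = np.empty(n_samples)
    for i in range(n_samples):
        idx = rng.integers(0, n, size=n)   # one bootstrap sample of individuals
        lods[i], effects[i] = lod_and_effect(trait[idx], geno[idx])
    sd_d = effects.std(ddof=1)             # bootstrap standard deviation of d
    power = np.mean(lods > threshold_lod)  # share of samples above the threshold
    return sd_d, power


# Simulated data: 200 individuals, one marker coded 0/1/2 (hypothetical coding).
rng = np.random.default_rng(2)
geno = rng.integers(0, 3, size=200)
trait = 0.5 * geno + rng.normal(size=200)

sd_d, power = bootstrap(trait, geno, threshold_lod=2.0, n_samples=500)
print(f"SD of d = {sd_d:.3f}, power = {power:.3f}")
```

In practice the threshold would come from a permutation test run beforehand, which is why the Power rows appear only when threshold LOD values have been defined.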
4-21
Distribution option
Select a submodel and choose the <Distribution> option of the Results menu to see the distribution of the trait in the alternative marker or interval groups.
Trait2 of model mTr1int is shown in the current window. Interval number 4 is chosen. The number of steps in the histogram may be changed.
<Distribution> panel of two-trait model is shown.
4-22
About submodel option
Choose <About submodel> option. This option allows displaying the submodel information. Examples of displaying submodel names for two-trait model and one-trait model are shown below.
The rectangle color corresponds to the submodel graph color.
To mark the best submodel, use the radio button <Mark this submodel>. This submodel will be shown in the <Calculation Panel>.
4-23
Models with extended parameters
Project with missing marker scores
Real data often have missing marker scores. In a special example (file lossMarker.job), we simulated data with 20% of missing markers. A simple model <missMark> was created for this Project. The LOD graph in the scanning form and its <Estimation table> are shown. In the <Estimation table>, the number of genotypes (nObj) differs between intervals because of the missing data. Note that the largest LOD values did not appear in the interval carrying the simulated QTL. The number of simulated phenotypes in the example was 200.
LOD Normalization option
LOD value for every interval depends on the number of genotypes (nObj) for this interval. Because this number varies among intervals (due to missing data), the LOD graph (see previous page) may display a biased (non-objective) picture.
We can create a new model <normLod> with “corrected” nObj by scaling on the max nObj (see part 3, page 9). The resulting graph is shown below.
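The exact correction is described in Part 3 and is not reproduced here. As a rough sketch only, the Python snippet below assumes that the LOD of each interval is rescaled in proportion to the ratio of the largest nObj to that interval's nObj. This proportional rule, the function name, and the numbers are assumptions for illustration, not MultiQTL's documented algorithm.

```python
import numpy as np


def normalize_lod(lod, n_obj):
    """Rescale each interval's LOD to the largest nObj (assumed proportional rule)."""
    lod = np.asarray(lod, dtype=float)
    n_obj = np.asarray(n_obj, dtype=float)
    return lod * n_obj.max() / n_obj


lod = [2.1, 3.4, 3.0]       # hypothetical LOD per interval
n_obj = [200, 160, 120]     # hypothetical number of genotypes per interval
print(normalize_lod(lod, n_obj))
```

Under this assumption, an interval with fewer scored genotypes has its LOD raised, so intervals are compared as if each had the maximal number of genotypes.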
4-24
Models with extended parameters (continued)
4-25
Ignore Marker Loss option
Using simulated data with missing marker scores, you can evaluate the impact of losing information (due to missing data) on mapping quality, compared to the “no missing data” case. For the latter case, calculate the results using the <Ignore Marker Loss> option.
The model <MLossIgnor> was created using this option. The result for this model is shown.
Models with extended parameters (continued)
4-26
Marker Restoration option
The <Marker Restoration> option can be used for “virtual” restoration of missing marker scores based on the information on linked markers (see Part 3, page 9). This option is available for real and simulated data. By comparing the LOD graph below to the graph shown on the previous page (the same data set but with no missing marker scores), you can see how efficient this function was in the considered example.
Models with extended parameters (continued)
5-1
Part 5
Two-linked QTL model
Table of contents
Introduction  2
Parameters of simulated example (Qtl2Tr1_2.job)  3
Comparing the models  4
Computation results  7
Submodel option  10
About submodel option  13
Estimate option  14
Data analysis options
  Hypotheses H2:H0  16
  Hypotheses H2:H1  17
  Compare Submodels  18
  Bootstrap option  19
Distribution option  21
5-2
Introduction
To show the analytical options of the two-linked QTL model, we give an example on simulated data with two QTLs.
Our example includes one chromosome with two interacting QTLs. The simulated data represent an F2 population phenotyped for two traits in one environment. The number of genotypes (and phenotypes) was 200. See the next page for details.
The name of the created project is “Qtl2Tr1_2.job”. Open it and press the <About Model> button. All parameter values will be shown. To see the parameter values used in the simulation, press the <About Simulation> button.
5-3
Parameters of simulated example
The simulated chromosome carries two QTLs, in the 3rd and 7th intervals. Parameters of these QTLs are shown in the <Setting QTL’s parameters> windows.
You can open the <Setting epistatic effect> window by left clicking on the point at the top of the dotted arc. The values of epistatic effects are shown in this window.
5-4
Comparing the models
The created project includes 4 models: single-QTL analysis and two-linked QTL analysis, each for one trait and for two traits.
Interval mapping analysis was used in all four models. It is also possible to use marker analysis and all the techniques for restoration of missing marker scores.
5-5
In this example, only one QTL (in the 7th interval) is clearly seen in the single-QTL analysis, for both one and two traits.
[Figure panels: One-trait analysis; Two-trait analysis]
Comparing the models (continued)
5-6
In the two-linked QTL analysis, we clearly see that the maximum LOD is reached for the pair of intervals 3 and 7, i.e., at the positions where the QTL effects were simulated.
[Figure panels: One-trait analysis; Two-trait analysis]
Comparing the models (continued)
5-7
Computation results
For the two-linked QTL model, the results are shown in two windows:
- a small window with the names of the submodels;
- a larger window with results representing the LOD values for each pair of intervals of the chosen chromosome.
The global LOD maximum is highlighted in red; local maxima are marked in blue.
The three-dimensional graph may be rotated by pressing the <Rotate> button.
If you analyze simulated data, you may click the green QTL symbol to see the QTL’s input parameters and simulation results. <Specified> and <Occurred> results will be shown.
5-8
Computation results (continued)
Click any cell to see its <Estimate table>.
The table contains all parameter values.
5-9
Computation results (continued)
Computations in case of marker analysis
5-10
Submodel option
All submodel options available for single QTL analysis can now be used with two QTLs (see Korol et al. 1998a; Ronin et al. 1999; and references therein).
From the previous page we could see that the estimates of the main parameters may differ quite strongly from the simulated parameter values. This happens because we simulated a rather high level of epistasis, whereas the first (default) submodel was computed with zero epistasis values. Now let us create a new submodel allowing for epistasis. To do so, select the <Submodel → Add> menu option.
Set all epistasis=0 radio buttons to <OFF>.
5-11
The new submodel (green) is shown. Its <Occurred> and <Estimated> values are much more similar.
In the small window the two submodels are represented.
The ToolTip (prompt) shows the full name of the submodel.
Submodel option (continued)
5-12
Submodel option (continued)
Up to 5 submodels can be created and displayed in windows simultaneously. All analytical options can be performed for the selected submodel. Select a submodel in the small window; its full window becomes active.
The selected submodel may be deleted by choosing the <Submodel → Delete> option.
A submodel window may be closed without deleting the submodel. In this case, the colored rectangle of this submodel disappears from the submodel name in the small window. If the submodel is selected again, its window reappears and becomes active. The <Order> button arranges the windows according to the submodel numbers in the small window.
5-13
About submodel option
Choose the <About submodel> menu option. This option shows submodel information. Examples of submodels for a one-trait model and a two-trait model are shown.
To mark the best submodel, click the radio button <Mark this submodel>.
5-14
Estimate option
The Estimate option gives you a great deal of information.
The interval pair with the global maximum LOD is highlighted in red, whereas intervals with local LOD maxima are highlighted in blue.
As in single-QTL analysis and in all further models, P.E.V. [P.E.V.(ad)] is the percentage of explained variance [explained additive variance] of the trait relative to its phenotypic variation.
“Coordinate right end1” and “Coordinate right end2” in the <Estimate> table are the distances from the beginning of the chromosome to the end of the first and second intervals, respectively. “L1” and “L2” are the distances from the beginning of the chromosome to the point of maximum LOD in the first and second intervals, respectively.
5-15
Estimate option (continued)
The form of the table in the case of marker analysis. For this case the table differs slightly from that of the interval analysis.
The table can be opened automatically in an Excel spreadsheet with the <Options → Open in EXCEL> menu option.
5-16
Data analysis options
We can compare hypotheses H2 (two linked QTLs), H1 (a single QTL in the chromosome), and H0 (no QTL in the chromosome) by using the <Significance → Compare> options.
Comparing hypotheses H2:H0
For comparing hypotheses H2 vs. H0, we use the permutation test. It is identical to the test of H1 vs. H0 in the single-QTL case.
5-17
Data analysis options (continued)
Comparing hypotheses H2:H1
For testing H2 vs. H1, we use a simulation method (parametric bootstrapping; Walling et al. 1998; Korol et al. 1998a; Ronin et al. 1999; Peng et al. 2003).
Both tests run much slower than in single-QTL models because the computations go over all pairs of intervals.
The results of the tests can be seen in the <About submodel> table.
5-18
Data analysis options (continued)
Compare Submodels
The <Compare Submodels> option is identical to the same option in the single-QTL case.
5-19
Bootstrap option
Choose the <Significance → Bootstrap> menu option.
This test for the two-linked QTL model is identical to the one for the single-QTL case, but instead of a histogram, color intensity is used to show the distribution of outcomes among the pairs of intervals. The numerical value of the intensity (“histogram bin”) is displayed by left-clicking a square.
Data analysis options (continued)
5-20
Bootstrap option (continued)
The form of the table in case of marker analysis.
Data analysis options (continued)
5-21
Distribution option
Choose the <Distribution> menu option.
This option shows how the two intervals of QTL location affect the trait distribution. It is identical to the distribution option of the single-QTL case, except that two intervals of QTL location must be defined.
The putative QTL location and the number of intervals on the histograms can be changed.
6-1
Part 6 Multiple-trait model
Table of contents
Introduction  2
Single-QTL model (MyExTr6_4chr.job and Tr6_Chr4F2.job)
  Parameters of simulation example  3
  Computation results  4
  Submodels option  6
  About submodel option  7
  Significance options
    Permutation test (H1:H0)  8
    Bootstrap  9
    Traits contribution  10
Two-linked QTL model
  Parameters of simulation data example  13
  Computation results  14
  Submodels option  16
  About submodel option  18
  Significance options
    Compare Submodels option  19
    Permutation test (H1:H0) and (H2:H1)  20
    Traits contribution  21
    Bootstrap  22
6-2
Introduction
The details of the method adopted here for multiple-trait analysis are described in Korol et al. (2001) (see also Korol et al. 1995; Zeng & Jiang 1995; Ronin et al. 1999; and references therein). Note that in our algorithms the multivariate QTL mapping is based on ML analysis with interval-specific transformations of the trait space (Korol et al. 2001).
In the simplest case of two non-correlated traits, the advantage of joint analysis of two traits is in the increase of the “multivariate effect” according to $d^2 = (d_x/\sigma_x)^2 + (d_y/\sigma_y)^2$, where $d_x$ and $d_y$ are the substitution effects of the QTL for traits x and y, and $\sigma_x$ and $\sigma_y$ are the corresponding standard deviations within the QTL groups (residual standard deviations). In the case of correlated traits, the potential gains from joint mapping analysis, compared to single-trait analysis, are due to: (i) the pleiotropic effects of the QTL on x and y; (ii) residual correlation between x and y (within the QTL groups) caused by non-genetic effects and segregation of unlinked QTLs; and (iii) the combined effect of both factors (i) and (ii) (Korol et al. 2001). Joint analysis has proved an efficient tool, under certain conditions, for increasing QTL detection power, mapping resolution, and accuracy of the estimated parameters. It may also increase the power of discriminating among various hypotheses concerning the trait genetic architecture, such as linkage versus pleiotropy.
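The formula for the multivariate effect of non-correlated traits can be checked numerically with a few lines of Python. The function name and the effect and standard deviation values are hypothetical.

```python
import math


def multivariate_effect(effects, residual_sds):
    """Return d = sqrt(sum((d_i / sigma_i)^2)) for non-correlated traits."""
    return math.sqrt(sum((d / s) ** 2 for d, s in zip(effects, residual_sds)))


# Hypothetical values: each trait alone has a standardized effect of 0.5.
d_joint = multivariate_effect(effects=[1.0, 2.0], residual_sds=[2.0, 4.0])
print(round(d_joint, 3))  # 0.707
```

Two traits that each show a standardized effect of 0.5 give a joint effect of about 0.71, which illustrates the gain from joint analysis even without correlation between the traits.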
In order to demonstrate the options of the Multiple trait QTL analysis and the performance of the employed algorithms, we provide two examples of simulated data with several traits.
6-3
Parameters of simulation example
Single-QTL model
The file with the first example is named “MyExtr6_4chr.job”. The simulated data represent a backcross population with 4 chromosomes, with the phenotype (6 traits) scored in one environment. The number of genotypes (and phenotypes) is 200.
Parameters of QTL3 of chromosome chr2 and QTL4 of chromosome chr3 and the general parameters are displayed. Open the project and press the <About Model> button. All data parameters will be shown. To see the simulation parameters, press the <About Simulation> button.
6-4
Computation results
We shall show the performance of the multi-trait single-QTL model on this example, for chromosomes 2 and 3. A model with all six traits was created. The scanning-type graph of the second chromosome is displayed.
Estimated and simulated characteristics are shown.
6-5
Computation results (continued)
The Estimates table can be displayed with the <Estimate> menu option. The interval with the global maximum LOD is highlighted in red.
The estimates of the genetic parameters for each trait are shown, including the pleiotropic effects eff.(d) of the putative QTL and the percentage of explained phenotypic variation (P.E.V.).
Single-QTL model (continued)
Important comment: Not every trait combination can be treated by multitrait QTL analysis. The analysis is impossible if the residual variance-covariance matrix is degenerate (i.e., its determinant is zero). This condition is therefore tested during the analysis, and if it holds, the trait(s) causing it are detected and reported. In the example shown here, the traits that cause this condition are displayed in the “cells”. As you can see, the set of such traits may vary among the chromosomes. To continue the analysis, build a new multiple-trait model that does not include any of these problem traits and repeat the analysis.
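How MultiQTL detects the offending traits is not described here. As one possible illustration, the Python sketch below flags every trait whose residuals are a linear combination of the traits before it, which is a situation that makes the variance-covariance matrix singular. The function name, the tolerance, and the data are assumptions.

```python
import numpy as np


def redundant_traits(residuals, tol=1e-8):
    """Indices of traits that are linear combinations of the traits before them."""
    centered = residuals - residuals.mean(axis=0)
    kept, flagged = [], []
    for j in range(centered.shape[1]):
        trial = centered[:, kept + [j]]
        if np.linalg.matrix_rank(trial, tol=tol) > len(kept):
            kept.append(j)       # trait j adds independent information
        else:
            flagged.append(j)    # trait j makes the covariance matrix singular
    return flagged


rng = np.random.default_rng(3)
residuals = rng.normal(size=(200, 4))
# Hypothetical fifth trait that is an exact sum of the first two.
residuals = np.column_stack([residuals, residuals[:, 0] + residuals[:, 1]])

print(redundant_traits(residuals))  # [4]
```

Dropping the flagged column leaves a trait set with a non-singular covariance matrix, which corresponds to building a new multiple-trait model without the reported traits. Which trait of a dependent group gets flagged depends on the column order in this sketch.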
6-6
Submodel option
In this case the submodel contains only a part of all traits. Use the <Submodel → Add> menu option, select the desired traits in the <Submodel> window, and press the <OK> button.
A submodel with trait1, trait3, trait4, and trait5 was created in this example. Later we will consider how to employ the special MultiQTL options for the analysis of multi-trait complexes in order to select a “reasonable” subset of traits to remain in the final model.
The <Submodel → Delete> option may be used to remove a submodel.
Single-QTL model (continued)
6-7
About submodel option
The <About submodel> menu option displays the submodel selected in the graphs window. The color of the rectangle in the upper-left corner of the <About submodel> panel matches the color of the corresponding graph. Three submodels are shown in this example.
Single-QTL model (continued)
It is noteworthy that in multitrait analysis the comparison between submodels that differ in the number of traits is conducted using the function Trait contribution, whereas Compare Submodels option serves for comparing two-QTL vs single-QTL models or two-QTL models with and without epistasis.
6-8
Significance options: H1:H0
The <Distribution> menu option does not exist for the multi-trait model. The <Significance → Compare → H1:H0> option was chosen for analyzing the results on chromosome 2, submodel 3. It is identical to the same option in other single-QTL models.
Single-QTL model (continued)
6-9
The <Significance → Bootstrap> option is identical to the same option in other single-QTL models. Note that the parameters of all traits are displayed in the bootstrap table.
Single-QTL model (continued)
Significance options: Bootstrap
6-10
Significance options: Trait contribution
<Significance → Compare → Trait Contribution> is a special option for the multitrait model. Choose this option for the selected submodel. The trait significance panel appears. Press the <Start> button. Two tests are then conducted simultaneously for each of the traits, based on permutations of the trait values relative to the set of the remaining traits and the marker set (Korol et al. 2001): (a) a test of the trait “contribution” to the LOD, and (b) a test of the putative QTL effects, additive and/or heterozygous.
Single-QTL model (continued)
The need for these tests derives from the fact that joint analysis of a multitrait complex is not automatically more informative, for every trait combination, than the analysis of a sub-complex or even of some single traits (Korol et al. 1995, 2001). An improvement is expected when the interval (QTL) affects several traits and/or when some of the traits are correlated (see page 6-2). If too many traits are included, some may be non-informative or depend on different regions of the same chromosome, thereby increasing the uncertainty of the QTL location compared to simple single-QTL analysis. Our advice is to take complexes of functionally related and/or correlated traits for joint analysis.
6-11
We obtain the significance of each trait for the current submodel. The significance is defined for LOD, the substitution effect {d}, and the heterozygous effect {h} (for an F2 or F3 population). To demonstrate how to employ this function in order to “optimize” the trait complex, we will use the example in file Tr6_Chr4F2.job. The LOD graph for the 6-trait complex and the results of the trait contribution test are shown.
The next steps are to remove trait #6 from the set, re-analyze the reduced set, then remove (if necessary) the next non-significant trait, and so on. For this example, after step 3 no further reduction of irrelevant (“parasitic”) traits is possible. The results are shown on the graphs below. Please note that if you are going to try multiple-trait analysis with MIM (see Part 7), the same trait set must be employed for all chromosomes. Thus, for MIM you should use either the initial set or a reduced set, with the same traits for all chromosomes.
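The stepwise reduction of the trait set can be written as a simple loop. The Python sketch below is only a schematic of that procedure: `contribution_p_value` is a hypothetical callback standing in for the permutation-based trait contribution test, the 0.05 cut-off is an assumption, and the p-values are invented rather than taken from Tr6_Chr4F2.job.

```python
def prune_traits(traits, contribution_p_value, alpha=0.05):
    """Drop the least significant trait and re-test until all remaining ones pass."""
    remaining = list(traits)
    while len(remaining) > 1:
        # Re-analyze the current set: one p-value per trait, given the others.
        p_values = {t: contribution_p_value(t, remaining) for t in remaining}
        worst = max(p_values, key=p_values.get)
        if p_values[worst] <= alpha:
            break                 # every remaining trait contributes significantly
        remaining.remove(worst)
    return remaining


# Hypothetical fixed p-values standing in for the permutation-based test.
fake_p = {"A": 0.30, "B": 0.01, "C": 0.001, "D": 0.40, "E": 0.002, "F": 0.75}
print(prune_traits(fake_p, lambda trait, current_set: fake_p[trait]))
# ['B', 'C', 'E']
```

In a real analysis the p-values change after every removal because the test is repeated on the reduced set, which is why the callback receives the current trait set as its second argument.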
Single-QTL model (continued)
Significance options: Trait contribution (continued)
6-12
For the simplified 3-trait model (highlighted in dark yellow) the QTL detection power was P=99.9% (at significance 0.001) and the accuracy (standard deviation) of the estimated QTL position was S_L=7.1 cM (the first bootstrap graph), whereas for the initial 6-trait model we obtained P=99.7% and S_L=7.8 cM, respectively.
For the traits with the strongest QTL effects on the chromosome considered, the results were: for trait 3, P=84.4% and S_L=12.2 cM; for trait 5, P=92.2% and S_L=13.4 cM (the second graph).
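As an illustration of how two figures of this kind can be derived from bootstrap output, the following Python sketch computes a detection power and the standard deviation of the estimated position. The function name, the (max LOD, position) tuple layout, and the choice to compute the standard deviation only over samples in which the QTL was detected are assumptions for this example; the manual does not state how MultiQTL computes P and S_L.

```python
import statistics


def summarize_bootstrap(replicates, threshold_lod):
    """Return (power in %, SD of position in cM) for bootstrap replicates.

    replicates    -- list of (max_lod, position_cM) pairs, one per sample
    threshold_lod -- LOD threshold for the chosen significance level
                     (hypothetical input, supplied by the user)
    """
    detected = [pos for lod, pos in replicates if lod >= threshold_lod]
    power = 100.0 * len(detected) / len(replicates)
    # The sample standard deviation needs at least two detected positions.
    sd = statistics.stdev(detected) if len(detected) > 1 else float("nan")
    return power, sd


power, sd = summarize_bootstrap(
    [(5.0, 10.0), (1.0, 50.0), (4.0, 12.0), (6.0, 14.0)], threshold_lod=3.0)
print(power, sd)
```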
6-13
Consider, for example, chromosome #1. We shall show how to use the options of multitrait two-linked QTL analysis. The simulated data represent a backcross population with one chromosome, 10 traits, and one environment. The number of genotypes (and phenotypes) was 200.
Parameters of the simulated data example: the parameters of QTL 1 and QTL 2 and the epistatic parameters are displayed.
Two-linked QTL model
6-14
Computation results
One model with all six traits was created.
Simulation (left click on single QTL) and computation (left click on any square) results are shown.
6-15
A global estimates table can be obtained with the <Estimate> menu option. The interval with the global maximum LOD is highlighted in red, whereas intervals with a local maximum LOD are highlighted in blue.
We should note that in multiple-trait analysis the two-QTL model is computationally challenging, and convergence to the solution is not guaranteed for an arbitrary trait complex and marker data set.
6-16
Submodel option
Use the <SubmodelAdd> menu option, select the desired traits in the <Submodel> window, and press the <OK> button.
In this case, the submodel contains only a subset of the traits, with the possibility of including epistasis (by default, no epistasis). For more details about submodel analysis combined with trait contribution analysis, please see the examples for single-QTL analysis (p.6.11).
In addition to the basic model with no epistasis, three submodels were created in this example:
- epistasis fitted for all 6 traits;
- traits 2,3,4,5 selected, no epistasis;
- traits 2,3,4,5 selected, with epistasis.
6-17
Three-dimensional graphs of the initial model (red) and all three submodels are displayed. The small window shows the corresponding radio buttons for the four submodels.
The ToolTip (prompt) shows the full name of the submodel.
6-18
About submodel option
The general information about each submodel may be displayed by the <About Submodel> menu option.
6-19
The <SignificanceCompareSubmodel> option is available in this model. Only submodels with identical sets of traits, with and without epistasis, may be compared.
In this example, two submodels with traits #1-6 are compared.
Significance options: Compare Submodels option
6-20
Significance options: H2H0, H2H1
The <SignificanceCompare H2H0> and <SignificanceCompare H2H1> options are identical to the corresponding options in other two-linked QTL models.
6-21
Traits contribution
This option is identical to the corresponding option for the single-QTL model (see page 11).
6-22
Bootstrap
This option is identical to the corresponding two-linked QTL option for the one- or two-trait model (see part 5, page 19). Note that in the bootstrap table the parameters of all traits are displayed.
7-1
Part 7
Multiple chromosome analysis
Table of contents
Introduction, p. 2
Simulated example (MultiSet.job), p. 3
Windows for creation of a multi-chromosome set, p. 5
  Selection of traits for single- or two-trait models, p. 6
  Selection of traits for multitrait models, p. 6
Creation of multi-chromosome set, p. 7
Multiple interval mapping (MIM), p. 8
Results of the Multiple interval mapping (MIM), p. 11
  Multi-chromosome set for two-trait analysis, p. 11
  Multi-chromosome set for multiple-trait model, p. 14
Multiple Simulation options, p. 16
7-2
In the previous parts, we demonstrated how to conduct QTL analysis for each chromosome separately. Here we show how to use the MultiQTL tools for whole-genome analysis. The examples include Monte-Carlo simulation (for simulating multiple samples) and Multiple Interval Mapping (MIM) for a single (real) data set (Kao et al. 1999). It should be stressed that combining MIM with other “multiple” approaches, i.e., multiple-trait (part 6), multiple-environment (part 8), and multiple-family (part 9), allows high-quality mapping, including a further increase in QTL detection power and in the accuracy of the estimated QTL position. Our own experience indicates that even with a rather modest sample size (n=100-200) one may reach fine mapping (with the standard deviation of the QTL position being 2-3 cM or even less than 1 cM).
There are three possibilities to start multiple interval analysis:
1. From <MultiSet> menu option of Data Analysis window
2. From icon
3. From the <MultiSet> option of the main menu
Introduction
7-3
Simulated example
To demonstrate multiple interval mapping (MIM) analysis, we use a project based on the simulated data Tr6-4chrF2.job. The simulated data represent an F2 population with six traits, one environment, and four chromosomes. The number of genotypes (and phenotypes) is 500. In the current version 2.5 of MultiQTL (unlike version 2.4), MIM analysis is also possible for multiple-trait, multiple-environment, and multiple-family models. To demonstrate single-trait analysis we have prepared the tr1Q1 and tr1Q2 models; for two-trait analysis, the tr2Q1 and tr2Q2 models; and for multiple-trait analysis, the mltrQ1Tr4_1 and mltrQ2Tr4_1 models.
QTL parameters on the first chromosome: parameters of QTL 1 and QTL 2.
7-4
Parameters of QTL3 on chromosome chr2 and QTL4 on chromosome chr3, and the general parameters, are displayed.
7-5
Windows for creating a multi-chromosome set
After you open the <MultiSet> option, the <Multichromosome set> window appears.
The window is either empty or lists the names of multiple chromosome sets created before. To open an old set, left-click its name and press the <Open Set> button. You can also delete a chosen set with the <Delete Set> option or copy it under a new name with the <Copy Set> option. To create a new multi-chromosome set, click <Add New Set>. The <Selection of traits> window will appear; it differs according to the type of model fitted at the single-chromosome stage of the analysis (see next page).
<Multichromosome set> window
7-6
In these windows you should enter a name for the multiset, choose the number of traits according to the model (one, two, or multiple) and select trait names for it. Press <OK> button.
<Selection of traits> window for single- or two-trait models
For multitrait models, we can choose a model from the set of models with different numbers of traits that were created and computed for each chromosome. Our example includes one model with six traits and two models with five traits. We enter the multiset name and choose the model class: Multitrait. To find out which traits were entered into the set, mark the set with the left mouse button. Then press the <Select> button to confirm the choice.
<Selection of traits> window for multitrait models
7-7
Creation of multi-chromosome set
We can add a chromosome to the set. For that, select a chromosome name and a model name. Choose a submodel from the <Submodel name> window and click <Add to Set> button to add chromosomes to the set. To delete any chromosome from the set, choose it by left click on its name and press the <Delete from Set> button.
For the example, we consider a set twoTrait (tr5_2) with two traits.
As a result, we will obtain a MIM set with four chromosomes with different models and submodels. Click <Open Set> button for further work.
7-8
Multiple interval mapping (MIM)
A new window appears with graphs of all selected chromosomes and a menu with multi-chromosome analysis options. The menu becomes visible only after activation of the window. To perform multiple QTL analysis, select <MIM> option.
Our ‘Multiple Interval Mapping’ algorithm reduces the background (non-controlled) variation by taking into account QTL effects from other chromosomes (Jansen & Stam 1994; Zeng 1994; Kao et al. 1999). In our software, it consists of sequential chromosome analysis for QTL presence while subtracting the QTL effects of other chromosomes. Since the positions and effects of these QTLs are unknown in advance, the algorithm is iterative.
7-9
First, the most powerful QTL is found and its effect is subtracted. Then the next most powerful QTL is searched for and its effect is subtracted, and so on. This procedure is repeated until no more QTLs are found on the remaining chromosomes. Assuming no interaction, the QTL effects are then re-evaluated by fitting the QTLs from other chromosomes in the order of their power. This procedure is applied repeatedly until the difference between the parameters of each QTL on two consecutive iterations is less than a preselected value.
Here we need to explain the meaning of the parameters appearing in the window. “Level” means the level of significance for an individual chromosome effect that you require for the chromosome to be included in the MIM model, whereas “Precision” means the standard error of this Level. You may use the default value (0.05) or increase or decrease this level. For example, you may increase it to 0.1 in order to allow less significant effects to be temporarily included in the model (in the hope that reducing the residual variation due to other included QTLs will make some small effects “significant”). Then, after the model is built, you may return to the initial 0.05 or even set a more stringent level. This is similar to the procedures of forward and backward stepwise regression. “Min. Permut” and “Max. Permut” mean the minimum and maximum number of permutations conducted at each round of the MIM process.
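The iterative subtraction of QTL effects from other chromosomes can be illustrated with a deliberately simplified Python sketch. It is not the MultiQTL algorithm: it scans markers rather than intervals, assumes backcross genotypes coded 0/1, visits chromosomes in a fixed order rather than in order of QTL power, and omits the permutation tests. The names scan_chromosome and backfit, the data layout, and the tol and max_iter parameters are assumptions made for the example.

```python
def scan_chromosome(trait, markers):
    """Return (best marker index, effect) for one chromosome.

    The effect is the difference between the trait means of the two
    genotype classes (coded 1 and 0) at the marker.
    """
    best = (None, 0.0)
    for idx, genotypes in enumerate(markers):
        ones = [t for t, g in zip(trait, genotypes) if g == 1]
        zeros = [t for t, g in zip(trait, genotypes) if g == 0]
        if not ones or not zeros:
            continue  # marker is not polymorphic in this sample
        effect = sum(ones) / len(ones) - sum(zeros) / len(zeros)
        if best[0] is None or abs(effect) > abs(best[1]):
            best = (idx, effect)
    return best


def backfit(trait, chromosomes, tol=1e-6, max_iter=100):
    """Re-estimate each chromosome's effect after subtracting the others.

    chromosomes -- dict: chromosome name -> list of markers, each marker
                   being a list of 0/1 genotypes (one per individual)
    """
    fitted = {name: (None, 0.0) for name in chromosomes}
    for _ in range(max_iter):
        largest_change = 0.0
        for name, markers in chromosomes.items():
            adjusted = list(trait)
            # Subtract the currently fitted effects of all other chromosomes.
            for other, (idx, effect) in fitted.items():
                if other == name or idx is None:
                    continue
                genotypes = chromosomes[other][idx]
                adjusted = [a - effect * g for a, g in zip(adjusted, genotypes)]
            new_idx, new_effect = scan_chromosome(adjusted, markers)
            largest_change = max(largest_change,
                                 abs(new_effect - fitted[name][1]))
            fitted[name] = (new_idx, new_effect)
        if largest_change < tol:  # estimates stable between two iterations
            break
    return fitted
```

The stopping rule mirrors the description above: iterations end when the change in every fitted effect between two consecutive rounds falls below a preselected value.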
7-10
The yellow arrow icon shows the currently treated chromosome; the blue sign marks a chromosome without detected QTL, and a red-green “molecule” marks a detected QTL.
At any stage, if after the minimum number of permutations (Min. Permut.) the chromosome being tested does not meet the significance conditions defined by the Level, the MIM algorithm moves on to test the next chromosome. If the chromosome was not dropped after the minimum number of permutations, testing continues until one of the following events occurs: (a) a non-significant level is reached before the number of permutations reaches Max. Permut., and the chromosome is marked as non-significant; or (b) the p-value (significance level) becomes lower than Level - 2 × Precision before (or just when) the number of permutation runs reaches Max. Permut., and the chromosome is marked as significant for the current iteration. After a few, or even a few dozen, iterations the process converges. You may, however, want to continue the iterations after setting more demanding permutation numbers.
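One possible reading of this stopping rule is sketched below in Python. It is an interpretation, not the MultiQTL code: the function name, the default values of precision, min_permut, and max_permut, the use of "p-value above Level" as the non-significance condition, and the "undecided" outcome when neither condition is met by Max. Permut. are all assumptions made for the example.

```python
def sequential_permutation_test(observed_lod, permuted_lods, level=0.05,
                                precision=0.01, min_permut=100,
                                max_permut=1000):
    """Classify a chromosome from a stream of permutation LOD values.

    Returns (status, p_value, permutations_used).
    """
    exceed = 0
    n = 0
    p_value = None
    for lod in permuted_lods:
        n += 1
        if lod >= observed_lod:
            exceed += 1
        p_value = exceed / n
        if n >= min_permut:
            if p_value > level:
                # Assumed reading of "non-significant level reached".
                return "non-significant", p_value, n
            if p_value < level - 2 * precision:
                return "significant", p_value, n
        if n >= max_permut:
            break
    # Assumption: the manual does not say what happens in this case.
    return "undecided", p_value, n
```

With these settings, a chromosome whose observed LOD is never reached by the permuted values is declared significant as soon as the minimum number of permutations has been run.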
Click <Start> button of <Multiple interval mapping> window.
Press the <Close> button after the process finishes.
7-11
All LOD graphs appear after the calculations. The blue triangle indicates the position of the QTL after MIM analysis. The <after> button is ON. To see the LOD graph of each chromosome before MIM analysis, i.e. based on single-chromosome analysis, click the <before> button. Click the chromosome name to see a large graph in the <before> or <after> state. Use the <view> menu option to switch the <before>/<after> buttons for all chromosomes simultaneously.
Results of the Multiple interval mapping (MIM)
P.E.V. [P.E.V.(ad)] is the percentage of explained variance [explained additive variance] of the trait
Multi-chromosome set for two-trait analysis
7-12
On the graph obtained by MIM analysis you can see both green (only for simulated data) and blue (QTL fitted using MIM analysis) symbols. Click the blue QTL symbol to see the table of “Detected” values.
To compare the results from <before> and <after> graphs, open both of them. You can open a graph of a model obtained after MIM and further tune this model by creating its derivative submodels and compare these with the obtained MIM model.
7-13
A graph for a “before”-MIM model
Submodel for the “after” MIM model and its general characteristics
A graph for an “after”-MIM model
7-14
By pressing the <General PEV> or <General PEV(ad)> button we obtain the PEVs for each of the traits.
Multi-chromosome set for multiple-trait model
7-15
We can compare the results for each chromosome (chr. 2 in graphs below) obtained by single-chromosome analysis and by MIM (“before” and “after” variants).
You can work further with the results of MIM by building new submodels, e.g., based on “trait contribution” analysis in the case of combined multitrait-MIM analysis (see the green submodel above).
7-16
Multiple Simulation option
This mode is available for simulated data only! To perform multiple simulations, select <MultiSimulation> from the menu.
The <Multiple simulation> window appears. It has two options. The <Calculating Empirical Threshold> option allows you to evaluate the QTL detection power in multiple simulation analysis. Choosing this option opens the <Calculation Emp> window. Press the <Start/Continue> button of this window to start the process. New diagrams appear in the <Multichromosomes Set> window:
one for a multiple set with one or two traits, and one for a multiple set with a multitrait set.
7-17
After closing the <Calculation Empirical Threshold> window we can begin the multiple simulation process. This process may also be performed without previously calculating the threshold. Press the <Multi-simulation> button and then the <Start/Continue> button in the <Multisimulation Test> window.
The diagrams in the <Multichromosome Set> window change continuously during the calculations. Close the <Multisimulation Test> window in order to see the results of the multiple simulation process.
Parameter <Samples> of this window can be changed.
Now close the <Multisimulation Test> window.
7-18
Click a chromosome name to see the results.
Click the <Open in EXCEL> button in the table of the result to open it in EXCEL.
Choose <Experimental levels> of the test statistics threshold to see the corresponding detection power for each chromosome.
If the <Computed> radio button is <On>, the deviation from the mean values is shown in all rows. If <Occurred> is selected, deviations from the occurred values are shown.
8-1
Part 8 Multiple environments
Table of contents
Introduction, p. 3
Single-QTL analysis for one trait model
  Parameters of simulated data (multiEnv.job), p. 4
  Results for model with trait normalization, p. 5
  Results for model without trait normalization, p. 7
  Submodel option, p. 8
  About submodel option, p. 13
  Submodel compare option, p. 14
  Distribution option, p. 16
  Significance options, p. 17
8-2
Table of contents (continued)
Single-QTL and Two_linked QTL analysis for two-trait model
  Parameters of simulated data (multiEnv3chrom.job), p. 18
  Results of computation, p. 19
  Estimate option, p. 20
  Submodel option, p. 21
  Submodel compare option, p. 22
Two_linked QTL analysis for two-trait model
  Results of computation, p. 23
  Submodel option, p. 24
  Significance options, p. 25
Multichromosome set, p. 26
  Multiple interval mapping (MIM), p. 27
  Multisimulation option, p. 28
Format transition: “multiple-environment” “multiple-trait” formats, p. 29
8-3
Introduction
In the current version of the package we consider only the case when the same mapping population is phenotyped in several environments (for description of the analytical model for such a situation see Jansen et al. 1995). The marker genotypes are the same for all environments, but the phenotypes are different for each environment; thus the number of phenotypes is equal to that of genotypes multiplied by the number of environments. For this type of data all models in the package are available, excluding multi-trait models that will be implemented in further versions. We will also implement the approximate model based on environment “bioindication” principle, with no limitation on the number of environments (Korol et al. 1998). Still, the current version allows two-trait analysis with multiple environments, as well as all the aforementioned options combined with MIM. The number of environments in the current version may be relatively large, hence the number of parameters may also be large. In order to reduce the number of parameters, it is possible to create a model with trait centering / normalization for each environment (see page 7 of Part 3). To show how to analyze data in this case, we give an example of simulated F2 data for one trait scored in six environments. The number of genotypes is 100; phenotype number is 600; file name multiEnv.job.
For data scored in multiple environments we provide a technical “normalization” function. It transforms the data so that the residual variance is the same in all environments. The Estimates tables can display the results in either normalized or non-normalized form, according to the user's choice.
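A rough Python illustration of per-environment centering and scaling is given below. It is a sketch under stated assumptions, not the transformation used by MultiQTL: it standardizes each environment by its total sample standard deviation, whereas the manual speaks of equalizing the residual variance, which would require the fitted QTL model. The function name and the dictionary layout are hypothetical.

```python
import statistics


def normalize_by_environment(phenotypes):
    """Center and scale trait scores separately within each environment.

    phenotypes -- dict: environment name -> list of trait scores
    Assumption: the total within-environment standard deviation is used
    as a simple proxy for the residual standard deviation.
    """
    normalized = {}
    for env, values in phenotypes.items():
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        normalized[env] = [(v - mean) / sd for v in values]
    return normalized


scores = {"env1": [1.0, 2.0, 3.0], "env2": [10.0, 20.0, 30.0]}
print(normalize_by_environment(scores))
```

After this transformation the two environments in the usage example have identical centered and scaled scores, even though their original scales differ tenfold.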
8-4
Single-QTL analysis for one-trait model
In order to see the general simulation parameters press <Param> button and panel <Setting parameter value> will appear. To see the QTL parameters, press on the green QTL symbol and the panel <Setting QTL’s parameters> will appear.
Two models were created: m without trait normalization and m_tn with trait normalization.
Parameters of simulation data
8-5
Results for model with trait normalization
Left-click in any interval to get the estimates. Then choose whether to see the results in the normalized or non-normalized mode.
Estimates for normalization regime
LOD graph is shown in the scanning form.
Estimates for non-normalization regime
8-6
<Estimate> option
If the <Trait normalization> check box is ON, we see the estimates for the model with normalization.
Otherwise we see the estimates for the model without normalization.
8-7
Results for model without trait normalization
Estimated values for any interval can be obtained by a left mouse button click.
A table of all estimated values is obtained by choosing <Estimate> from the menu options.
8-8
Submodel option
In multiple-environment analysis, the submodel options include two parts:
- You may take into account some assumptions about the problem parameters for selected environments (in the same way as in the case of one environment). Then only the conditions <no variance effect> and <no covariance effect> are taken into account.
- You may want to set some problem parameters equal for some or all environments.
By default, no special conditions except <no variance effect> and <no covariance effect> are taken into account. The first graph on next page (shown in red) represents the default submodel. Select <SubmodelAdd> menu option, panel <Submodel> appears.
When the panel appears, the status of the <Constraints within selected environments> radio button is ON.
For the first example we choose constraint h=d/2 (dominant model) for all environments; we press <Apply>, and then <OK> buttons.
8-9
For the second example we choose the condition h=d/2 for environments 2, 3, 5, and 6. It is necessary to press the <Apply> button after each choice. We then turn to the lower part of the submodel panel by pressing the <Equating parameters in the selected environments> radio button. It is not necessary to choose anything in the upper panel.
For the second example, we first click the <dom.ef(h)> radio button in the lower part of the panel, then choose environments 2, 3, 4, and 5, and then press the <Apply> and <OK> buttons.
We obtain three graphs:
- the red graph is the default submodel;
- the green graph is for our first example;
- the blue graph is for our second example.
8-10
We shall now show how to obtain a new submodel from an existing one. We select the “green” submodel and choose the <SubmodelAdd> option.
A panel appears, and in its upper part we see the constraint selected for this submodel. In the lower part of the panel, we choose the <ad.ef(d)> radio button and environments 1,2,3, and then press the <Apply> and <OK> buttons.
A new graph (yellow) is displayed.
8-11
You may want to delete one (or more) of the reviewed submodels. For example, let us delete the “green” submodel using the <SubmodelDelete> menu option. We can also take one of the already defined models and put additional conditions on its parameters, thereby creating a new submodel. We choose the default submodel (red). In the lower part of the <Submodel> panel we choose the <ad.ef(d)> radio button and environments 1,2,3.
Choose this new graph and the <SubmodelAdd> option for that.
Now we press the <Apply> button, and then choose the <dom.ef(h)> radio button and the same environments. We press the <Apply> and <OK> buttons. A new graph (yellow) is displayed. The ToolTip (prompt) shows the full name of this submodel.
8-12
In the lower part of the <Submodel> panel we see the condition defined on the previous page. By pressing the <dom.ef(h)> radio button, we see the selected environments (1,2,3), since this condition is also satisfied for this submodel. We may also delete a condition by pressing the <Remove> button.
Press the <dom.ef(h)> radio button, then the <Remove> and <OK> buttons. This creates a new submodel for which only one parameter, <ad.ef(d)>, is equal across environments 1-3.
8-13
About submodel option
Choose the <About submodel> menu option.
Rectangle’s color corresponds to the submodel graph’s color.
8-14
Submodel compare option
Choose the <SignificanceCompareSubmodel> menu option. Two submodels may be compared if the parameter set of one of them is a subset of the parameter set of the other. Therefore, all submodels may be compared to the default model. In this example, only submodels 4 and 5 may be compared with one another.
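The nesting condition can be expressed as a simple subset check, as in the following Python sketch. The function name and the parameter labels in the usage example are hypothetical and serve only to illustrate the rule.

```python
def comparable(params_a, params_b):
    """Return True if one parameter set is a subset of the other."""
    a, b = set(params_a), set(params_b)
    return a <= b or b <= a


# Hypothetical free-parameter labels for three submodels.
default_model = {"d1", "d2", "d3", "h1", "h2", "h3"}
equal_d = {"d", "h1", "h2", "h3"}
no_dominance = {"d1", "d2", "d3"}

print(comparable(default_model, no_dominance))
print(comparable(equal_d, no_dominance))
```

In the usage example the first pair is nested and therefore comparable, while the second pair is not, because neither label set contains the other.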
8-15
For the “yellow” submodel, the ‘Comparison’ window allows testing it against two other submodels. In the figure, the main (red) submodel was selected; click <OK> to compare the two submodels. The <Comparison test> panel appears.
All the initial parameters here may be changed. Press <Start> and the test will run.
8-16
Distribution option
Options of the <Distribution> menu are similar to those in one- and two-trait models. Note that in this case, trait distribution may be displayed for each selected environment.
For each environment, we may choose any interval to display the distribution in putative QTL groups.
8-17
<Significance Compare H1H0> and <Significance Bootstrap> are identical to those in single QTL models in single-environment problems.
Significance options
Now, in the bootstrap table, parameters for all environments are displayed. By moving the scroll bar you can see the parameters for each of the environments.
8-18
To show the features of two-trait analysis and to demonstrate the multiChrom modes (MIM and multiSimulation), we consider a simulated example, multiEnv3chrom. The simulated data are backcross data with 3 chromosomes and 2 traits scored in 6 environments, affected by 3 QTLs; the number of genotypes was 200, hence the number of phenotypes was 1200.
Single-QTL and Two_linked QTL analysis for two-trait model
Parameters of simulation data
8-19
We show below the results of computation of two-trait model q1t2 with one QTL for chromosome 3. The figures display the results for two traits across all four environments.
Single-QTL analysis for two-trait model
Results of computation
8-20
Estimate option
8-21
In case of two-trait analysis the panel for creating submodels and <About submodel> panel for displaying the structure of the submodels are different from those of single-trait models (the graphs on this page are for chromosome 3).
Submodel option
8-22
In this example, the submodels can be created to test various hypotheses. Thus, for testing QTL-E interaction hypothesis, we can set equal the QTL effects (di) in the 3 environments. In comparison of such models, the criterion employed in MultiQTL is difference in likelihoods rather than Critical LOD increment of the compared submodels (although we have not changed the corresponding text in the Comparison Test panel).
Significance options are similar to those of other models.
Submodel compare option
From the comparison of the LODs in the graphs on the previous page, we can see that the difference in maxLOD values for these models is about 2. We should note that for comparisons of different submodels in Multiple-Environment analysis, the criterion is the difference in likelihoods (and not in LODs), which may differ from the difference in LODs.
8-23
Two_linked QTL analysis for two-trait model
Results of computation
Consider the results of computation of the two-trait model q2t2 with two linked QTLs on chromosome 1. The results for the two-trait model across all environments are shown.
8-24
Submodel option
In this case, both the panel <Submodel> and window <About submodel> display parameters of two QTLs.
8-25
The significance options H2H0 and Bootstrap practically do not differ from the corresponding options in other sections of the package. The <Significance Compare H2 H1> option is identical to that for two-linked QTL models in single-environment problems.
Significance options
The results for the two-trait model q2t2 (for the two-QTL model, we had maxLOD=12.84)
The results for the single-trait model q2t1 (trait2) (for the two-QTL model, we had maxLOD=5.33)
8-26
Multichromosome set
Let us open the MultiSet option, create a set for the two-trait model, and set the two-QTL model for chromosome 1 and the single-QTL model for the other two chromosomes (for details of the simulated data set see page 18). After opening the set, select the MIM option.
8-27
Using MIM analysis of the foregoing data, two effects were detected, on chromosomes 2 and 3. By pressing corresponding buttons we can get corresponding values of P.E.V. and P.E.V.(ad). In this case these values are equal because of absence of epistasis in two-QTL model.
As in general case, we can open for each chromosome windows for “after” MIM analysis, and create and compare submodels for each of these.
Multiple interval mapping (MIM)
8-28
Multisimulation option
The Results table for all environments and all traits appears after closing the <Multisimulation Window> and then pressing the button with the name of the chosen chromosome.
8-29
Creating a project with “single-environment” traits using “multiple-environment” project
Some options for single-environment analysis are limited when you are working within a multiple-environment project; e.g., you may want to consider the trait scores across environments as a multiple-trait complex. For a transition to single-environment analysis you can apply the <Extract> option of the main menu (see Part1, page 59). By choosing this option, you will get a new window:
First, if we choose the <Environments> option, our data will appear in the multiple-environment format. Now, by selecting the <Traits> option, we can get for each trait a multiple-trait complex with the number of traits equal to the number of environments. The names of these new traits are formed from the old name with the suffixes *_e1, *_e2, *_e3, …
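The naming rule for the extracted traits can be illustrated with a short Python sketch. The function name and the list-of-lists data layout are assumptions for the example; only the *_e1, *_e2, … suffix pattern comes from the text above.

```python
def environments_to_traits(trait_name, scores_by_env):
    """Turn one trait scored in several environments into several traits.

    scores_by_env -- list with one entry per environment, each entry being
                     the list of scores of the same genotypes
    Returns a dict mapping new trait names (<name>_e1, <name>_e2, ...)
    to their score lists.
    """
    return {
        f"{trait_name}_e{i}": list(scores)
        for i, scores in enumerate(scores_by_env, start=1)
    }


print(environments_to_traits("yield", [[1.0, 2.0], [3.0, 4.0]]))
```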
We should now set a name for the new folder where the extracted data will be saved, and add a new folder name in the current folder (shown in the window) or in another folder. The new folder will host a new single-environment project that can be created using the <New Project> option of the main menu (see Part1, page 59).
Format transition: “multiple-environment” “multiple-trait” formats
8-30
Under some circumstances the opposite transformation may also be of interest: creating a “multiple-environment” project from an existing standard project. Again, we can use the <Extract> option of the main menu. The corresponding windows now look a bit different:
When the check box is set to <On>, the window changes: a list of all traits of the project appears, together with a cell for setting the number of environments for the new multiple-environment problem.
We can select from the list the “traits” that represent scores of one trait across several environments, enter the trait name, and define the first trait of the new multiple-environment problem by pressing the <Add new trait> button. Then the next group of scores, for the second trait, should be selected, and so on. Clearly, at each such step the number of selected scores should be the same and equal to the number of environments. By default, the names of the environments are “env1, env2, …”. Pressing <OK> finishes the process of defining trait-environment combinations, and the system then prompts for the folder name. Pressing <OK> again concludes the operation. A new multiple-environment project can now be created using the <New Project> option of the main menu.
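The grouping step, including the requirement that every group contain exactly as many scores as there are environments, can be sketched in Python as follows. The function name, the argument layout, and the trait names in the usage example are hypothetical; only the default environment names env1, env2, … come from the text above.

```python
def traits_to_environments(selection, n_env):
    """Map groups of existing traits onto environments of new traits.

    selection -- dict: new trait name -> list of existing trait names,
                 one per environment, in environment order
    n_env     -- number of environments of the new problem
    """
    layout = {}
    for new_trait, old_traits in selection.items():
        if len(old_traits) != n_env:
            raise ValueError(
                f"{new_trait}: expected {n_env} scores, got {len(old_traits)}")
        layout[new_trait] = {
            f"env{i}": old for i, old in enumerate(old_traits, start=1)
        }
    return layout


print(traits_to_environments({"yield": ["yield_a", "yield_b"]}, n_env=2))
```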
Creating a “multiple-environment” project using a project with “multiple-trait” data
9-1
Part 9 Multiple families
Table of contents
Introduction .... 2
Parameters of simulated data (family3chrom.job) .... 4
Computation results .... 6
Submodel option .... 8
Submodel compare option .... 10
Distribution option .... 11
Significance options .... 12
Multichromosome set .... 15
MIM .... 16
Multisimulation .... 17
9-2
Introduction
Part 9: Multiple families
This section allows joint analysis of multiple families (populations) of the same structure. Such analysis was not available in previous versions of MultiQTL. It is noteworthy that the “multiple families” model can also be applied to “multiple-environment” data sets. Namely, it allows conducting QTL-E analysis when the data include different genotypes scored in different environments, unlike our other multiple-environment model (Part 8), where the same set of genotypes is scored in multiple environments. The “multiple families” model and algorithms implemented in the current version can be applied for mapping analysis of data that fit a few additional assumptions:
- The same population type is assumed across families (this restriction will be relaxed in future versions of the package).
- For each marker locus, at least one family must be polymorphic (this allows using the algorithms of the multiple-families section for QTL analysis in advanced backcross design, e.g., Luo et al. 2002).
- Missing marker scores are automatically recovered based on information on polymorphic neighbor markers.
- Each trait should be scored in all families.
- Data input is conducted separately for each family (see Part 1).
There are also some limitations in the algorithm presented in the current version:
- Only single-trait and two-trait models are available.
- Selective genotyping model is not yet available.
- Each chromosomal interval is scanned over the user-defined number of steps, hence computations are relatively slow for two-QTL F2 models.
9-3
Introduction (continued)
The main options of the Multiple families analysis will be demonstrated using simulated data. As with real data, the number of individuals, chromosome lengths, QTL relative position within the interval, and QTL effect can be family-specific. As indicated above, this section of the package can also be applied to analyze data sets from multiple environments when the mapping population is comprised of sub-populations (or families), each represented by its specific genotypes. Likewise, families derived from advanced backcross design (which has become popular because it allows combining a breeding scheme with QTL mapping) can also be analyzed based on the multiple-family mapping. Due to the specificity of multilocus marker haplotypes sampled in each family/environment, linkage phase between marker loci (as well as between marker haplotypes and the QTL) may also be family-specific.
Despite this last difference and the corresponding differences in the mapping model and algorithms, the logic of the analysis and the main questions and tests are quite similar, allowing a less detailed description compared to that of Part 8. Still, possible variation in linkage phases between the markers and QTLs needs special consideration that differs from that in Part 8. Namely, if in the previous Multiple-environment analysis one found nearly equal but oppositely directed QTL effects in two environments, this could be interpreted as QTL-E interaction (due to the identity of the set of genotypes scored in the two environments). This situation cannot be interpreted as QTL-E interaction in the new model with family/environment-specific genotypes (subpopulations). In this case one can only infer phase variation among the subpopulations.
9-4
Parameters of simulation data
The illustrative examples are based on data simulated in the family3chrom.job file. The number of individuals, chromosome lengths, and QTL positions/effects can be family-specific. Details about the parameters can be obtained in the corresponding windows (see Part 3, page 13).
Windows <Setting QTL’s parameters> show how the values of QTL effects and positions are displayed. The QTL location in interval 3 for family 2 and in interval 7 for family 1 can be seen after selecting the corresponding family in the window <Families Selection>.
9-5
Parameters of simulation data (continued)
Here the mean values of the traits are set to differ across families, whereas the values for Stand.dev and Correlation are equal across families.
9-6
Computation results
Four models with one and two traits with single- and two-QTL were created. To conduct the computations the user should choose the number of points in each interval to fit the model (by default 10).
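The role of the “number of points in each interval” can be illustrated with a small grid-scan sketch. It assumes that the points are evenly spaced and include both flanking markers, which the text does not state; `scan_positions`, `best_position` and the `lod` callable are hypothetical names, and the toy LOD profile is invented purely for illustration.

```python
def scan_positions(left, right, n_points=10):
    """Evenly spaced scan positions between the flanking markers (inclusive)."""
    if n_points < 2:
        raise ValueError("at least two points are needed")
    step = (right - left) / (n_points - 1)
    return [left + i * step for i in range(n_points)]


def best_position(left, right, lod, n_points=10):
    """Return the scan position with the highest value of lod(position)."""
    return max(scan_positions(left, right, n_points), key=lod)


# Toy LOD profile peaking at 4 cM inside an interval of 10 cM.
peak = best_position(0.0, 10.0, lambda x: -(x - 4.0) ** 2, n_points=11)
print(peak)
```

More points per interval give a finer scan but a proportionally longer computation, which is why two-QTL F2 models are relatively slow.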
Analysis of two-trait model with single QTL is shown in the figures. In the <Estimated> table we can see the lengths of the marked interval for each of the 4 families whereas table <Occurred> shows the QTL position relative to the left flank of the interval. By moving the Scroll bar you can see the parameter estimates for each family.
9-7
As usual, table Estimates displays the intervals with max LOD values (the local ones and the global), the sample size, and coordinates of the intervals’ flanks for each family and the location of max LOD (the local ones and the global) in each family. By moving the Scroll bar you can see the parameter estimates for each family.
Computation result (continued)
9-8
Submodel option
The <Scanning> option is not provided for Family data because the basic computation is conducted by scanning along the interval. The <Submodel> option is identical to the corresponding option of the Multi-Environment analysis (see Part 8). Two examples of submodels are shown below.
9-9
Important comment: In creating new submodels with equal additive (d) effects, we take into account only the absolute size of the effects, because the linkage phase between the markers and the QTL (i.e., coupling or repulsion) may vary among families. A similar comment applies when heterozygous (h) effects are supposed equal between (some or all) families. It should be noted that if both d and h are supposed to be equal between families, we take care that the sign relationship between d and h is maintained within each family. An example for F2 is provided below:
Submodel option (continued)
We create a submodel with (d) and (h) equal in all families (i.e. the absolute values are equal).
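The idea of “equal in absolute value, sign free per family” can be pictured with a toy calculation. This is not the estimator used by the package (which refits the constrained model); the sketch simply replaces family-specific effects by their mean absolute value while keeping each family's sign. The function name and the numbers are hypothetical.

```python
import math


def equalize_effect_magnitudes(effects):
    """Give all families a common effect magnitude, keeping each family's sign."""
    common = sum(abs(e) for e in effects) / len(effects)
    return [math.copysign(common, e) for e in effects]


# Hypothetical additive effects in three families; family 2 is in the opposite phase.
d_equal = equalize_effect_magnitudes([0.5, -0.75, 0.25])
print(d_equal)
```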
Please compare the effects of the new submodel (green) with those of the initial model (red) for the interval with max LOD.
9-10
Submodel compare option
Choose <SignificanceCompareSubmodel> menu option.
This option is identical to the corresponding option of the Multi-Environment analysis.
9-11
Distribution option
Options of the <Distribution> menu are similar to those in one- and two-trait models.
Note that in this case, trait distribution may be displayed for each selected family.
9-12
<SignificanceCompareH1H0> is identical to that in single QTL models with either single or multiple environments. But due to specific aspects of the algorithm, the computation is slower here. We demonstrate this option on the second trait using single-trait model and markers of chromosome 2.
Significance options
9-13
Significance options (continued)
<Significance Bootstrap> option is identical to that in other models. However, in the bootstrap table parameters for all families are displayed. By moving the Scroll bar you can see the parameter estimates for each family. The displayed example is the same as was used in the demonstration on the previous pages.
9-14
<SignificanceCompareH2H1> options are identical to those in two-linked QTL models for single-environment problems. An example comparing two-QTL and single-QTL models for the first trait on chromosome 1 is shown.
Significance options (continued)
9-15
Multichromosome set
Using the MultiSet option, we create an example of a set with a single-trait model for trait 2. For one of the chromosomes (chrom1) a two-QTL model was selected, and for the other two chromosomes a single-QTL model was selected. After opening the set, we choose MIM.
9-16
Multichromosome set (continued)
By using MIM function we revealed three effects, on chromosomes 1, 2, and 3. By clicking on the QTL symbol (blue triangle), the table of results can be obtained.
Multiple interval mapping (MIM)
9-17
Multisimulation option
As in the general case of simulated data, the Multisimulation option is available. A table with results for all families (environments) and traits appears if you press the button with the name of the selected chromosome after closing the <Multisimulation Window>.
Multichromosome set (continued)
10-1
Part 10 Selective genotyping
Table of contents
Introduction .... 2
Data Simulation .... 3
Computation results
Results for four models on simulation data .... 5
Real data .... 6
Real data. One-trait model .... 7
Real data. Two-trait model .... 8
Estimate option .... 9
Multichromosome set .... 10
Multisimulation option .... 10
10-2
Introduction
Part 10: Selective genotyping
Selective genotyping (SG) is a cost-efficient approach to QTL mapping that utilizes the fact that the relative amount of information about QTL-marker association is much higher in the individuals from the tails of the trait distribution than in those from its middle part (Lander & Botstein, 1989). Consequently, in the SG design only a part of the objects, those with minimum and maximum values of the target trait, are genotyped for marker loci. However, in order to get unbiased estimates of the QTL effect, all phenotypes, non-genotyped and genotyped, are included in the analysis (for the advantages of SG see: Lander & Botstein, 1989; Ronin et al. 1998, 2003; and refs. therein). To demonstrate the SG functions of the package, simulated data will be employed. Simulation of Selective Genotyping data sets can be achieved by creating special models with single or multiple traits (with only one of the multiple traits being the selected trait). Clearly, the weight of the left and right tails can be different in simulated as well as real data. To analyze the data, models with single and two QTL per chromosome can be employed.
10-3
Data Simulation
To conduct the simulation and build the analytical model, select the <Selective Genotyping> option. From the <Selected trait> menu, choose the target trait; the tails of its distribution will define the objects for selective genotyping. A panel named <Selective genotyping> appears. Click in this panel to set the selective genotyping parameters.
To create the data for SG examples we should return to the corresponding section of Part 3 (see page 3-7). The simulated example included 2000 F2 individuals. Four data sets with different sizes of tails selected for genotyping were simulated:
- ms10: 10% of individuals for SG;
- ms20: 20% of individuals for SG;
- ms40: 40% of individuals for SG;
- ms100: all individuals were genotyped.
10-4
Data Simulation (continued)
When panel <Selective Genotyping> opens, the option <Rough Tuning> is on. Select the left part of the trait distribution (min. values) by moving the left slider. Then select the right part of trait distribution (max. values) by moving the right slider.
The percent of selected objects and the number of objects will appear in the corresponding areas. Note that in the current version, for calculating the interval lengths, we simulate and employ marker scores for all individuals. For fitting the SG mapping model only genotypes from the defined tails are used.
Fine-tuning the number of selected objects is possible. To do this, click <Fine Tuning> radio button and move the left or right slider. In this way we can create all four data sets.
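What the two sliders do can be sketched as a simple tail selection on the sorted trait values. This is only an illustration under stated assumptions: the function `select_tails` is hypothetical, the tail sizes are obtained by rounding the requested fractions, and ties between equal trait values are broken by input order.

```python
def select_tails(trait_values, left_fraction, right_fraction):
    """Return (left_tail, right_tail) as lists of indices of individuals.

    left_tail holds the individuals with the smallest trait values,
    right_tail those with the largest; everyone else stays non-genotyped.
    """
    n = len(trait_values)
    order = sorted(range(n), key=lambda i: trait_values[i])
    n_left = round(n * left_fraction)
    n_right = round(n * right_fraction)
    if n_left + n_right > n:
        raise ValueError("the two tails overlap")
    return order[:n_left], order[n - n_right:]


# Hypothetical trait scores for ten individuals, 20% selected from each tail.
values = [5, 1, 9, 3, 7, 2, 8, 4, 6, 10]
left_tail, right_tail = select_tails(values, 0.2, 0.2)
print(left_tail, right_tail)
```

The two fractions are independent, so unequal tails (for example 10% on the left and 30% on the right) are handled in the same way.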
To create a simulation project, it is necessary to choose the threshold phenotypic values of the chosen trait.
10-5
Computation Results
Results for the simulation data sets: We show below the results for all 4 sets, which may give an idea of how the simulation tools can help you in designing experiments. Note the very close results in the ms100 and ms40 models despite the 2.5-fold smaller genotyped sample in ms40.
Calculation result for ms100 model.
Calculation results for ms10 model. Calculation result for ms20 model.
Calculation results for ms40 model.
Scanning option was used in both cases. Estimated parameter values for the interval with maximum LOD value are displayed.
10-6
Computation Results (continued) Real data
To illustrate real data analysis, we employ data on an F2 population with 1000 phenotypes (four traits, trait1…trait4), 400 of which were genotyped for markers of 2 chromosomes. Individuals for genotyping were selected from the tails of the phenotypic distribution of trait1. In the input data the selected (genotyped) individuals are placed first, followed by the remainder (for more details on input of <Selective Genotyping> data see Part 1, page 30).
Two models were created for this set, with single- and two-trait analysis. Note that although any two-trait combination can be used for two-trait analysis, we strongly recommend using only pairs that include the selected trait. For that, you may find helpful the option “Two(2lists)” as shown in the illustration (note that the option <Selective Genotyping> is in state <On>; the selected trait is indicated). See also Part 3, page 5.
10-7
Computation Results (continued)
Real data. Single-trait model: It is noteworthy that if SG was based on tails for a trait x, then single-trait analysis for any other trait y = yi correlated with x may result in biased estimates of QTL effects for y. This point can be illustrated by the results for Trait2 and Trait3.
Figure panels: Trait1, Trait2, Trait3, Trait4.
10-8
Computation Results (continued)
Real data. Two-trait model: The aforementioned bias, i.e. the possibility of false positive detection for traits correlated with the selected trait, can be corrected by using two-trait analyses for (x, yi) (see Ronin et al. 1998).
Figure panels: Trait1-Trait2, Trait1-Trait3, Trait1-Trait4.
The obtained <Estimates> for the pairs indicate a relatively high Cor.Coef of Trait2 and Trait3 with Trait1 (~0.7) and a small Cor.Coef for Trait4 with Trait1. Accordingly, the estimated QTL effects for Trait2 and Trait3 vanish in the two-trait model compared to the single-trait model. Clearly, this does not mean that the effects detected for the correlated trait should always be zero.
10-9
Estimate option
Computation Results (continued)
The <Estimate> table carries the information about the number of phenotypes and genotypes, as well as the name of the selected trait.
Significance options are identical to those in the usual analysis (with non-SG data) and, therefore, are not described again here.
10-10
Multichromosome set
MIM analysis is not implemented for SG data if the non-genotyped phenotypes are included in the analysis together with the genotyped ones to reduce bias. But for simulated data, it may be instructive to employ the MultiSimulation option to get a better understanding of the effects obtained in a single simulation run. As an example, we consider the previous example of an F2 population with 1000 phenotypes, out of which 40% were selected from the tails of the Trait1 distribution for genotyping. The simulated additive QTL effect was 0.5 for Trait1 and 0 for Trait2. The results of two-trait analysis for a single simulated set look like:
We create a single chromosome MultiSet and employ the MultiSimulation option:
One can see that the average ad.ef(d) for Trait2 (equal to 0.0637) is clearly smaller than the value obtained for a single simulated experiment (0.155) and approaches the simulated zero effect.
11-1
Part 11
Summarizing the results
Table of contents
Introduction .... 2
Simulated example (TotalSign1.job) .... 3
Creating second project for this example (TotalSign2.job) .... 4
Total Significance option .... 5
Significance value on the Calculation panel .... 6
Model groups .... 7
Creating the result table with specified threshold significance .... 8
Computation of the total significance by Benjamini & Hochberg method .... 11
Creating the report table .... 13
Creating a summary panel with LOD score graphs .... 18
Appending the second project .... 20
References (to Parts 1-11) .... 23
11-2
In the previous parts, it was shown how to perform data analysis for all models. Now we will see how to summarize the obtained results. This part includes three aspects: (a) Combining the results obtained in different job files; (b) Calculation of the experiment-wise significance; and (c) Generating output summary information (tables and figures). A real data set (problem) usually includes many chromosomes and traits. Consequently, many computing intensive analyses should be conducted and some of the calculations (especially for two-linked models) take a lot of time. Therefore, to expedite the analysis, it may be helpful to create several projects (job files) for one problem and to perform calculations for different projects on several computers (in the current version only two projects can be merged in one step, but you can apply this option sequentially). Finally, we want to combine all the calculations and to summarize the results for the whole problem. In particular, we need to take into account the fact that the calculated trait-chromosome significances should be corrected for multiple comparisons to get experiment-wise significance. All these steps will be shown on simulated examples.
Part 11: Summarizing the results
Introduction
11-3
Simulated example
Chromosomes and QTLs are shown on the <Setting the Chromosomes> panel. Trait information is displayed on the <Setting parameter values> panel.
We simulated data for a project with name “TotalSign1.job”.
These data correspond to a backcross population genotyped for markers of six chromosomes and phenotyped for five traits in one environment. The number of genotypes (and phenotypes) is 200.
11-4
Creating second project for this example
In our example, we defined and computed four models:
- one-trait with single- and two-linked QTL,
- two- and multiple-trait with single-QTL.
In order to show how to work with several projects, we created a second project for our example. For that, using the <File Extract> option of the main menu, a folder “Total” was created with chromosomes, traits and information files.
By <FileNew Project> option, we created in this folder the second project, file “TotalSign2.job”, with two chromosomes and three traits. In the <Select Data Files> panel we selected “chrom4”, “chrom5”, “trait3”, “trait4”, and “trait5”. For more details about these functions see Part 1, pages 44 and 59.
In the second project, three models were defined and computed: one-, two- and multiple-trait analysis with two-linked QTL.
11-5
Total Significance option
For every chromosome-trait (or chromosome-[trait complex]) combination in both projects, data analysis was performed by creating submodels and testing significance. We should now obtain genome-wise (or experiment-wise) significance.
On the <Calculation panel> significances are displayed only for the chromosome-trait pairs that have already been calculated.
First, we choose the model “m1” (one-trait & single- QTL).
Let us open the first project “TotalSign1.job” by <FileOpen Project> main menu option.
Then, we choose <Total Significance> menu option.
11-6
Significance value(s) on the Calculation panel
We define the term cell for a chromosome-trait or chromosome-[trait complex] combination. There are three options for displaying the significances, as explained below:
- A significance value is displayed. This means that for this cell only one submodel was tested for significance.
- A colored rectangle in the cell refers to the color of the LOD graph of the submodel with the best significance.
- The term “Multiple match” means that significance was computed for several submodels and the best one has not been chosen.
By clicking the left mouse button on this cell, we get a window with several graphs. We can choose the best submodel (e.g., the one obtained after removing the non-significant parameters) and close the window. A colored rectangle will then appear in this cell.
Model groups
Clearly, to correct for multiple comparisons, one cannot consider “heterogeneous” models simultaneously; e.g., it is impossible to combine single-trait and multiple-trait models within one such summary test. Thus, comparison should be performed within the same model group. The simplest division of the models into such groups is to consider one-, two-, and multiple-trait models separately. But two one-trait models based on marker and interval analyses must also be classified into two different groups. A created group should include models in which all initial parameters are the same, with only single- and two-linked QTL models and their submodels included in one group. For example, three groups of models are presented in our first job file “TotalSign1.job”:
- The one-trait model group includes the m1 and m2 models for single- and two-linked QTL, respectively.
- The two-trait model group includes one single-QTL model, m1tr2.
- The multiple-trait group includes one model, m1mlt, with single QTL.
11-7
These groups are displayed on the <Total significance> panel.
11-8
Press <Threshold sign.choice > button and set maximal limit of significance value in the <Declared threshold> panel. Press <OK> button of this panel.
Labels in the cells of the <Calculation Panel>, in which significance is better than the maximal limit, will be highlighted in red.
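The highlighting rule amounts to a simple filter over the cells. The sketch below is hypothetical (the dictionary layout, the function name and the p-values are invented for illustration) and treats “better than the maximal limit” as “strictly smaller than the declared threshold”.

```python
def cells_below_threshold(significances, threshold):
    """Return the cells whose significance is below the declared threshold."""
    return sorted(cell for cell, p in significances.items() if p < threshold)


# Hypothetical significances for three chromosome-trait cells.
significances = {
    ("chrom1", "trait1"): 0.001,
    ("chrom1", "trait2"): 0.040,
    ("chrom2", "trait1"): 0.002,
}
highlighted = cells_below_threshold(significances, threshold=0.003)
print(highlighted)
```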
In <Total significance> panel, we choose model group “One trait” with m1 and m2 models of the first project. Then examine all significance values on <Calculation Panel> for all models of this group. For the cell labeled “Multiple match” choose submodel with the best significance. Estimate maximal limit of significance value for this model group.
Creating the Result table with specified threshold significance
Press <Display chosen cases> button in order to receive the “total significance” table. It is possible that different models are represented for one cell (e.g., single-QTL and two-linked QTL). In such a case, a new panel for model choice appears (see next page).
11-9
You can choose one of the two models or exclude both. To choose, click the left mouse button on one of the two <Choice> radio buttons, and then press the <Include one> button. To exclude, press the <Exclude both> button. It is necessary to perform this operation for every cell for which both single- and two-QTL models have been fitted previously.
Creating the Result table with specified threshold significance (continued)
Finally, we receive the table with all selected significances. By default the information in this table is sorted by the chromosome number. It is possible to sort it by the trait number. For that, it is necessary to change the choice in the window <Sorting>.
11-10
By using this table we can:
- Calculate experiment-wise significance based on the method of Benjamini & Hochberg (1995) for controlling the False Discovery Rate (FDR).
- Obtain a combined estimates table for all significant results.
- Assemble the LOD score graphs for all significant results.
For that, it is necessary to choose all or some of the rows in the table and press <OK>.
After that one of the resulting options of the <Total significance> window may be chosen.
Creating the Result table with specified threshold significance (continued)
11-11
Computing total significance by Benjamini & Hochberg (1995) method
Computation is performed if corresponding option of <Total significance> window was chosen. It is necessary to define FDR parameter.
In case of choosing all rows of the significance table, this table will be shown again, but only rows whose significance fits the FDR criterion will be included in this table. In case of multitrait groups, computing of global significance is not performed.
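For readers unfamiliar with the FDR criterion, the standard Benjamini & Hochberg step-up procedure can be sketched as follows. This is a textbook version written for illustration; the package's own implementation may differ in details, and the function name and the example p-values are hypothetical. With m tests and sorted p-values p(1) ≤ … ≤ p(m), the procedure finds the largest rank k with p(k) ≤ k·FDR/m and retains the k smallest p-values.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return the indices of the tests retained by the BH step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank that passes its own threshold.
        if p_values[idx] <= rank * fdr / m:
            k_max = rank
    return sorted(order[:k_max])


# Hypothetical significances of eight chromosome-trait cells.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
kept = benjamini_hochberg(p, fdr=0.05)
print(kept)
```

Because the rule is step-up, a p-value that fails its own threshold is still retained when a larger p-value passes its threshold, as in `benjamini_hochberg([0.03, 0.04], fdr=0.05)`, which retains both tests.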
11-12
For example, “chrom1”, “chrom3” and “chrom5” were chosen and their global significance was computed with FDR criterion = 0.01. Results are represented in the “choice 1” window. The same operation was performed for “chrom2”, “chrom4” and “chrom6”, and the computed results are represented in the “choice 2” window. In cases where the “Choice all” option was marked, the computed results are represented in the same (main) window.
In case of choosing all rows of the significance table, this table will be shown with rows whose significance fits the FDR criterion.
Computing total significance by Benjamini & Hochberg method (continued)
11-13
Creating the Report table
You can get global report table for any significance table when pressing <Report table> button, even without performing global significance computation. We receive the report table for the chosen rows. Every row of this table contains estimates for one cell (trait-chromosome combination). Estimated values may be received from bootstrap test, if it was performed during the mapping analysis. In this case, the report contains an estimate of “QTL detection power” and the parameter estimates accompanied with standard deviations/errors.
For that, it is necessary to choose the significance level (0.05, 0.01 or 0.001) that will define the threshold value of the test statistic under H0 (calculated earlier based on the permutation test). If the bootstrap test was not performed during the previous analysis, the parameter estimates are received from the main computation.
If two-linked QTL analysis was performed for some cell, the corresponding line in the table will include estimated values for the two QTLs and “Epistasis” values (if the corresponding model with epistasis was created and fitted earlier). See example in the next page.
11-14
Creating the Report table (continued)
In our example, this report is provided for one-trait group, for a backcross population. Bootstrap test was not performed for pairs chrom1_trait5, chrom4_trait4, chrom5_trait3, and chrom6_trait1. Two-linked QTL analysis was performed for pairs chrom4_trait1, chrom4_trait2, and chrom5_trait2. All computations, except for chrom4_trait1 and chrom4_trait2 pairs, were performed for the general submodel.
In order to see all results, use scrolling option. Press <Open in EXCEL> button to transform the report table to Excel format.
11-15
Creating the Report table (continued)
This report refers to the two-trait group for a backcross population. All estimated values were displayed for each trait.
11-16
Creating the Report table (continued)
This report refers to the multi-trait group of a backcross population. All estimated values were displayed for each trait.
11-17
Creating the Report table (continued)
A multi-environment example is displayed in this table (multEnvSign.job file). This report refers to one-trait group of a backcross population. All estimated values were displayed for each environment.
Submodel with equal effects (d) for env2, env4, and env6 is used for chrom1_trait1 pair. For chrom2_trait1 pair, the submodel was fitted with trait mean (m) equal among env1, env2, env3; QTL effects were equal across env1-env5 and variance effects were equal among env3, env4, and env6.
11-18
Creating a summary panel with LOD score graphs
It is possible to get a panel with LOD score graphs for any significance table when choosing all or some rows and pressing <LOD score graphs> button of the <Total significance> window. Use <Save to PCX format> menu option to get an output file with *.pcx extension. It is necessary to enter the name of this file. The graphical file is automatically placed into the folder with the project file. Up to 12 graphs can be placed in one page. If the number of chosen rows is >12, additional pages are created.
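The paging rule (at most 12 graphs per page, with extra pages created as needed) corresponds to simple chunking. The sketch below is illustrative only; `paginate` is a hypothetical helper, not a MultiQTL function.

```python
def paginate(items, per_page=12):
    """Split the chosen rows into consecutive pages of at most per_page items."""
    return [items[i:i + per_page] for i in range(0, len(items), per_page)]


# Hypothetical example: 30 chosen rows give three pages of 12, 12 and 6 graphs.
pages = paginate(list(range(30)))
print([len(page) for page in pages])
```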
11-19
Creating a panel with LOD score graphs (continued)
Use option menu <ViewNext> or <ViewPrevious> for page turning.
11-20
Appending the second project
Now we will show how to add the results contained in the second project to this table.
We have to open our second project “TotalSign2.job” by choosing <FileOpen Project> main menu option. In order to save the best selected submodels in the first project, answer “YES” to the message that appears on the screen:
If the results of first project were received from bootstrap test, it is necessary to choose and to remember the significance limit level for each model group. A message with corresponding question will appear.
11-21
In the second project we also choose model group “One trait” in the <Total Significance> panel and corresponding model “m1Qtl2” in <Calculation Panel>.
Threshold significance of this model group (0.003) was already defined. Therefore, we could see red messages in the cells of the <Calculation Panel>, in which significances are less than the allowed maximal limit. Note that the limit value for second project of current model group has to be equal to that for the first project.
Press <Display chosen cases> button to receive the combined significance table for both projects.
Appending the second project (continued)
11-22
Different significances may be received in the same cell for two projects. It is necessary to choose the best submodel or exclude both using the special panel.
Finally, we receive the table with all selected significances of both projects.
Appending the second project (continued)
Table of contents
Changes and additions to Total significance option
• General option
• FDR analysis after fitting MIM model
2
3
4
5
7
9
9
13
16
25
28
The main additions to version 2.6 as compared to 2.5
Analyzing epistasis between non-homologous chromosomes
Separated storing of the results files
Extending Re-sampling options by Jackknife analysis
Dealing with traits that are proportions: Logit transformation
Global permutation test for “after fitting MIM” model
Allowing for half-sib family structure
Tutorial book of MultiQTL package, part 12
Description of the additions and changes included to version 2.6 as compared to 2.5
part 12
Building submodels with variance effect for multiple-environment and multiple-family data
12-1
• Epistasis analysis for both linked and unlinked QTLs for single-, two-, and multiple-trait models, for single- and multiple-environment models, interval and marker analysis
• Jackknife analysis, in addition to bootstrap re-sampling analysis
• Global permutation test after fitting MIM model for a selected set
• FDR analysis after fitting MIM model (for genome-wise analysis)
• Extension of FDR analysis for multiple-trait models to facilitate selection of variants with non-overlapping traits for joint FDR analysis
• Allowing for half-sib family structure (via extended data input)
• Logit-transformation for QTL analysis of proportions
• Separated storing of the results of epistasis analysis for non-linked QTLs and MIM results (to prevent too large job files)
The main additions to version 2.6 as compared to 2.5
part 12
12-2
Separated storage of the results
All results of the analysis of epistasis between QTLs on non-homologous chromosomes are stored in a separate
file in the same folder where the main job file of the problem was saved. The file for these epistasis results
has the same name as the main job file, but a different extension, *.twc. Its existence is noted in the main file
and is checked each time the main file is opened. However, the *.twc file is opened only when the user
selects the corresponding option of the main menu.
In the new version of the package, all information about multi-chromosome sets (multiChromosome set)
treated using Multiple Interval Mapping (MIM) is excluded from the main file *.job. This information is stored
in a special file with the same name but a different extension (*.mim). The program reads this file only when
the user starts working with multi-chromosome sets or employs the option <Total significance> <After
MIM>. When the *.job file is opened, the system checks whether the folder containing it
also includes a file with the same name and the extension *.mim. If this file is absent for a problem where
MIM analysis has already been conducted for some sets, a corresponding warning message will appear.
Such a situation may result from moving or copying the job file to a new folder without also
moving or copying the *.mim file.
The described sub-division of the results into three sections, stored in the main file (*.job) and two
additional files (*.twc and *.mim) was done to reduce the size of the files.
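The check for companion files described above can be sketched outside the package as follows. This is only an illustration under our own assumptions: the function name missing_companions is hypothetical and is not part of MultiQTL, and only the naming rule (same folder, same name, extension *.twc or *.mim) is taken from the text.

```python
from pathlib import Path

# Extensions named in the text: *.twc holds epistasis results for
# non-homologous chromosomes, *.mim holds multi-chromosome (MIM) set results.
COMPANION_EXTENSIONS = (".twc", ".mim")


def missing_companions(job_file):
    """Return the names of companion files that are absent next to a *.job file.

    Hypothetical helper for illustration; MultiQTL performs its own check.
    """
    job_path = Path(job_file)
    missing = []
    for ext in COMPANION_EXTENSIONS:
        candidate = job_path.with_suffix(ext)  # same folder, same name
        if not candidate.exists():
            missing.append(candidate.name)
    return missing
```

A missing file matters only if the corresponding analysis (epistasis between chromosomes, or MIM) was actually conducted for the problem. Running such a check before moving or copying a job file helps to keep the three files together.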
IMPORTANT:
Comment A. It should be stressed that this new form of storing the results makes it impossible to read
the new results (created using version 2.6 of MultiQTL) with previous versions of the package (2.4 and 2.5).
The old results remain readable by the new (2.6) version.
Comment B. If you have already used a previous version of MultiQTL, you may want to install the new
version in a separate folder, in order to have both versions in parallel, at least for some transitory period.
Extending Resampling options by Jackknife analysis
Previous versions included only one of the two main resampling options, Bootstrap analysis. Here we add a
new option, Jackknife analysis. The choice between these two options is made at the stage of defining the
model. The window <Extended Parameters> of version 2.6 includes a new box <Resampling
method>. By default, the <Bootstrap> method is applied. If the user selects the Jackknife method, the size of
the sampled part of the mapping population should
be indicated in % (e.g., 60%).
On the next steps, all operations with this model
that employ Re-sampling options, are based on
the choice made here. This includes single-, two-
, and multi-chromosome sets. Global re-sampling
analysis will also be based on this choice. We
strongly recommend not using subsamples close
to 90-100%.
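The difference between the two resampling methods can be sketched as follows. This is a minimal illustration, assuming that the jackknife variant draws the indicated percentage of individuals without replacement while the bootstrap draws a full-size sample with replacement; the function names are hypothetical and the internal implementation of MultiQTL may differ.

```python
import random


def bootstrap_sample(indices, rng):
    """Draw len(indices) individuals with replacement."""
    return [rng.choice(indices) for _ in indices]


def jackknife_sample(indices, percent, rng):
    """Draw `percent` % of the individuals without replacement (assumed scheme)."""
    if not 0 < percent < 100:
        raise ValueError("percent must be strictly between 0 and 100")
    size = max(1, round(len(indices) * percent / 100))
    return rng.sample(indices, size)


rng = random.Random(1)
population = list(range(200))               # indices of 200 individuals
boot = bootstrap_sample(population, rng)    # 200 draws, repeats allowed
jack = jackknife_sample(population, 60, rng)  # 120 distinct individuals
```

Under this scheme, subsamples close to 100% are almost identical to the full sample and to each other, so the variation among resampling runs would say little about the precision of the estimates.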
The chosen method will be indicated in the title of the window for Resampling analysis, and if Jackknife was
selected, the sampled part of the total sample size will also be indicated.
(additions to Part 3 - Extended Parameters, Part 4, Part 5 - Bootstrap analysis option)
Dealing with traits that are proportions: Logit transformation
(addition to Part 1 - Data revision option)
In QTL analysis of proportions, special measures are needed if the trait values are close to zero or one, due
to high deviation of the distribution from normality. Our package includes a special section for normalization of trait
distribution by transformations preserving trait range. In version 2.6 we included a new option for traits that are
proportions: the logit transformation, the most suitable for such traits. In this case, the trait values P are replaced by
P' = log(P/(1-P)). In version 2.6, the window <Data check and transformation> includes the box <Logit
transformation>.
After choosing the trait for this transformation, the user will be asked for a confirmation.
Then, the system determines the minimum and maximum values of the trait. If either of these values lies outside the
permitted interval (0 to 1, or 0 to 100) a warning message is displayed. Otherwise (0 ≤ Pmin and Pmax ≤ 1),
Pmin and Pmax are displayed together with censoring values c1 and c2 (to exclude too extreme values of the
transformed trait): trait values with P < c1 and P > 1-c2 are replaced by c1 and 1-c2, respectively. By default
c1=c2=0.01, but user can change these values.
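The censoring and transformation steps can be sketched in a few lines. This is an illustration only: the function name logit_transform and the percent flag are our assumptions, the natural logarithm is assumed for "log", and rescaling of percentages (0 to 100) to proportions is assumed rather than documented.

```python
import math


def logit_transform(values, c1=0.01, c2=0.01, percent=False):
    """Censor proportions to [c1, 1 - c2] and return log(P / (1 - P)).

    Hypothetical sketch; `percent=True` assumes the trait is given on a
    0-100 scale and is rescaled to 0-1 before the transformation.
    """
    scale = 100.0 if percent else 1.0
    transformed = []
    for v in values:
        p = v / scale
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"value {v} is outside the permitted interval")
        p = min(max(p, c1), 1.0 - c2)  # replace too extreme values
        transformed.append(math.log(p / (1.0 - p)))
    return transformed
```

Without censoring, P = 0 or P = 1 would give an infinite transformed value; with the default c1 = c2 = 0.01 the transformed trait stays within about ±4.6.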
Global permutation test for “after fitting MIM” model (addition to Part 7)
It should be noted that during model fitting in the MIM process, small-scale significance testing based on permutations
is conducted for stepwise selection of chromosomes to be included in or removed from the model. To get more
precise (real) estimates of significance, a permutation test should then be conducted after MIM analysis. To
facilitate this process, a new option called <GlobalPerm. test> is included in the MIM window in version
2.6 of the MultiQTL package.
Before starting this analysis, we should select the
chromosomes targeted for permutation analysis.
This is initiated by pressing button <Choice for
permutation test>. The windows of the selected
chromosomes will be colored (in blue).
To cancel the selection, the same button should be
pressed again. After selection is finished, the
number of permutations should be set in the
corresponding window.
Button <Start> of the <Global permutation test>
window is then used to begin the permutation test.
Button <Stop> interrupts the process once the
current chromosome has been treated. After the process
has stopped or finished, the color of the treated
chromosomes is restored and the <Global
permutation test> window is closed.
The results of significance tests and permutation number for any chromosome can be seen by opening its window.
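The principle of a permutation test (Churchill and Doerge 1994) can be sketched as follows. This is a simplified stand-in and not the MultiQTL procedure: the test statistic here is the absolute difference between the trait means of two genotype classes at one marker, whereas the package works with the statistics of the fitted QTL models. All function names are hypothetical.

```python
import random


def mean_difference(trait, genotype):
    """Absolute difference between trait means of the two genotype classes."""
    a = [t for t, g in zip(trait, genotype) if g == 0]
    b = [t for t, g in zip(trait, genotype) if g == 1]
    return abs(sum(a) / len(a) - sum(b) / len(b))


def permutation_p_value(trait, genotype, n_perm=1000, seed=0):
    """Share of permuted data sets whose statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = mean_difference(trait, genotype)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the trait-genotype association
        if mean_difference(shuffled, genotype) >= observed:
            hits += 1
    # The observed data set is counted as one of the permutations.
    return (hits + 1) / (n_perm + 1)
```

With this estimator, the smallest attainable significance is 1/(n_perm + 1), so the number of permutations limits how small an estimated significance can be.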
Important note:
In the new version of the package, all information about multi-chromosome sets (multiChromosome set)
treated using MIM tools is excluded from the main file *.job. This information is stored in a special file with
the same name but a different extension (*.mim). For more details see section “Separated storage of the
results”.
Changes and additions to Total significance option
(additions to Part 11)
Changes have been made in the <Total significance> section to fix a program bug that was causing inflated
estimates of significance values for two- and multi-trait models. An extension of this section was included to
allow estimating significance for genome-wise analysis based on MIM analysis. Thus, the <Total
significance> option is now subdivided into two: <General> (the old one) and <After MIM> (a new one).
General option
Multiple trait models: In case of several multiple-trait models, two windows will appear, the old
one <Total significance> and a new one <Selection from the model(s)> for selecting
multiple-trait models consisting of different sets of traits (see the example based on the sim10trPerm_2 file).
This example includes 5 models. After choosing one of these (mlt1_4), the names of all models whose traits are all
different from the traits of the chosen model will be listed in the window <Next choice>. We can now choose one of these
listed models, which brings up a new list of models with trait sets different from the traits of the selected two, etc.
If no such models remain, the window will be empty. If all models share traits from the beginning, a corresponding
message will appear and, consequently, only one model can be chosen to proceed with the analysis.
After pressing button <OK>, the genome-wise significance based on controlling FDR will be calculated for the
selected multi-trait models. If you do not proceed and just close the window, Total significance analysis will not be
conducted for the selected models.
Two-trait models: All the models that were scored for significance appear in the window <Total significance>. Single-trait
and the selected multi-trait models can be analyzed as usual (including genome-wise significance testing,
obtaining report tables and figures). When the user selects two-trait models, a message appears suggesting that
trait pairs be chosen with no traits shared between the pairs.
Then a new button <OK> and a new check button <Chose different trait pairs>, in state <On>,
appear in the window <Calculation panel>.
Assume the pair trait1/trait2 is chosen. The
chromosomes of this pair that were analyzed for
significance will be marked in blue. If, by using <Ctrl>,
an additional pair is chosen that includes one of these
two traits (e.g., the pair trait1/trait4), a message will be
displayed that this choice is impossible.
Clearly, for this example, only
the pair trait3/trait4 can be added
to the previously selected
trait1/trait2.
By pressing button <OK>, the selected pairs for the current model will be included in the estimation of the
Total significance.
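The rule that selected trait pairs (or multiple-trait models) must not share traits can be sketched as a simple filter. The function compatible_choices and the list of candidate pairs below are hypothetical and serve only to illustrate the rule; they do not reproduce the program's internal logic.

```python
def compatible_choices(candidates, selected):
    """Return names of candidates sharing no trait with any selected model."""
    used = set()
    for name in selected:
        used |= set(candidates[name])
    return [
        name
        for name, traits in candidates.items()
        if name not in selected and used.isdisjoint(traits)
    ]


# Hypothetical candidate pairs for a problem with four traits.
pairs = {
    "trait1/trait2": ("trait1", "trait2"),
    "trait1/trait4": ("trait1", "trait4"),
    "trait2/trait3": ("trait2", "trait3"),
    "trait3/trait4": ("trait3", "trait4"),
}
print(compatible_choices(pairs, ["trait1/trait2"]))  # ['trait3/trait4']
```

Applying the filter repeatedly, each time with the enlarged list of selected models, mimics the successive lists shown to the user; an empty result means that no further model can be added.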
In this example, we considered all two-trait combinations for the single-QTL model. Two-QTL models that were
fitted and tested individually for significance should also be considered. Namely, before conducting the Total
significance test, both single- and two-linked QTL models should be selected for the chosen non-overlapping
trait pairs. This will allow us to choose, for each pair of traits, the best of the two fitted models. By choosing
Threshold sign (e.g., 0.05) and pressing <Display chosen case>, we will obtain a window for choosing the
best model for each of the selected trait pairs. We select the linked-QTL model for Chrom2, and the single-QTL model
for Chrom3. After that, we get the window <Group two_ trait> only for the selected pairs of traits.
In the presented example, we selected 3 multitrait models.
In the latest version (2.6) the Report document has changed. It now includes PEV and PEVad for each trait, in
line with all other report documents of the system.
FDR analysis after fitting MIM model
Multi_chromosome sets for MIM analysis can be created for single traits, for trait pairs, and for multi-trait
complexes. For the same trait, a Multi-set may include different models for different chromosomes (single- and
two-QTL). No more than one MIM set can be selected for one trait. Similarly, for two-trait and multiple-trait
sets, the sets should be selected in such a way that within one round of Total significance analysis any trait
enters the selected sets no more than once. Thus, after choosing the option <Total significance> <After
MIM>, the program displays all the single-trait sets with the same trait (if such sets exist at all) and allows the
user to make the choice.
To assist in the selection of MIM sets, for each set the user can see the trait(s), chromosome, employed model,
and the calculated single-chromosome significance. Out of the suggested alternatives, only one can be
selected, followed by confirmation (using the <OK> button). If the user closes the window without making a choice,
only traits represented by a single model will remain in the final documents.
If Multi_chromosome sets include two-trait problems, all such sets are suggested to the user for choosing sets with non-
overlapping traits. In this case, each set is represented by its name and accompanied by trait names. For each
set, its chromosomes, models (single- and two-QTL), and significance for each chromosome are displayed.
The traits cannot be repeatedly chosen. Thus, if we select, e.g., set tr2 with traits trait4 trait5, the window
<Next choose> will show only sets with non-overlapping traits relative to the already selected set(s). This
process can be repeated. When selection is finished, we should press button <OK>. If, instead, you close the
window using Cancel, none of the two-trait sets will be included in the analysis of significance.
For multi-trait combinations the program automatically detects and displays the sets with non-overlapping traits.
The user can choose any set or a group of sets with different traits by using the information in the window <Next
choose>. As before, the selection of sets should be confirmed by pressing button <OK>, whereas to reject the
chosen sets you just close the window.
All selected sets will be represented by their models
in the <Total significance> window. Each of these
can be analyzed for significance and characterized
by a report table and graphs in the same way as
done for single-chromosome analysis. Clearly, the
results may be rather different between the single-
chromosome and MIM analysis.
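For readers who wish to verify the genome-wise significance by hand, the step-up procedure of Benjamini and Hochberg (1995) for controlling FDR can be sketched as follows. This is a textbook illustration of the method, not the code of the package; the function name and the level q = 0.05 are our assumptions.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Reject all hypotheses ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected
```

The input would be the single-chromosome (or MIM) significances of the selected trait-chromosome combinations. Because the cutoff depends on the total number of tests m, each trait should enter the analysis only once, in line with the selection rules described above.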
Analyzing epistasis between non-homologous chromosomes
This function is implemented for all types of population implemented so far in the package, as well as for multiple-environment
analysis, but is not yet available for multiple-family analysis. It makes sense to move to this analysis
after the single-chromosome analysis is already done. In the current version 2.6, the analysis can be conducted
only with single-QTL chromosomes for single-trait, two-trait and multiple-trait models. As an example we use the
problem for an F2 population with 5 chromosomes and 4 traits, represented in file F2TwoChrom.job. Four models were previously
computed here: m1 - single-trait, m1_2 - the same but with two-linked QTLs (hence excluded from this
example), m2 - two-trait, and mlt - multi-trait. To call this function, we
use the option <TwoChromSet> of the main menu.
A window for chromosome pairs will
appear with lists of all models for all
chromosomes.
Using the window <trait/pair traits> you can choose any trait or trait pair for the analysis, whereas for the
multiTraits model this window is not available because ALL TRAITS of the multiple-trait model are then selected.
Let us choose model m1, trait3, and chromosomes 2 to 5. By pressing <OK> we will obtain a window similar to our
standard window, but this time it is built to represent all possible pairs of chromosomes (for the selected trait or
trait combination) rather than the usual pattern of chromosomes versus traits.
In this panel we can define a
request for computing a certain
pair of chromosomes, or
several pairs, or all pairs. For
that, we should choose, as
usual, the desired
combination(s) or <All> and
then press <Compute>.
This example shows a choice of all
chromosomes from one list (2-5);
naturally, the diagonal cells remain empty,
and the results are symmetric relative to
the diagonal.
Another option is to choose two sets of chromosomes, so that
epistasis will be analyzed between chromosomes of these
lists but not within lists. For that, we press button <Second
list> and obtain the second list. Now we should choose
chromosomes to define the targeted pairs. The system
reports when a chromosome is selected in both lists (this is
considered a wrong choice). By pressing <OK> we obtain
the panel for the chosen combinations for the two lists. To
return to the single-list variant, we should press again the
button <Second list>. Note that the window <Computation
for chromosome pairs > includes a small window
<Variants>. This window lists the sets of chromosomes for
the current model and trait (or trait pairs). By choosing one of
the created variants, we can see which chromosomes it
includes (from one or two lists of chromosomes). By pressing
<OK> we can see the results for this set. By pressing
<Delete>, we can remove this set with all its results.
In the process of choosing the pairs of chromosomes, you
may be interested to see the results of single-chromosome
analysis. This is provided by pressing <View of the single
chromosome results>. Repeated pressing of this button
removes the single-chromosome window.
The panel with the two-chromosome results can be
printed or saved in EXCEL file.
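The two ways of defining chromosome pairs described above (all pairs within one list, or pairs between two lists only) can be sketched as follows. The function chromosome_pairs is hypothetical; it only illustrates which pairs result from each choice and how a chromosome present in both lists is treated as a wrong choice.

```python
from itertools import combinations, product


def chromosome_pairs(first, second=None):
    """List the chromosome pairs to be analyzed for epistasis."""
    if second is None:
        # Single list: every unordered pair within the list.
        return list(combinations(first, 2))
    shared = set(first) & set(second)
    if shared:
        raise ValueError(f"chromosomes selected in both lists: {sorted(shared)}")
    # Two lists: pairs between the lists only, never within a list.
    return list(product(first, second))


print(len(chromosome_pairs([2, 3, 4, 5])))  # 6 pairs for chromosomes 2-5
print(chromosome_pairs([1, 2], [3, 4]))     # [(1, 3), (1, 4), (2, 3), (2, 4)]
```

For a single list of n chromosomes, n(n-1)/2 pairs are computed, which corresponds to one triangle of the symmetric results panel with its empty diagonal.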
The window includes a table
of results for all pairs of
intervals of the analyzed pair
of chromosomes. In
particular, the estimates of
the effects can be displayed
for each pair of intervals.
By selecting a result from the computation panel (e.g., for the pair chrom1-chrom2), we obtain a window similar to
the window for two-linked QTL model from our standard analysis for one chromosome. It is accompanied with a
menu that allows using all functions described earlier for linked-QTL analysis.
Please, note that the main (default) model in this analysis assumes
epistasis (and includes epistatic parameters) between the analyzed
chromosomes.
Let us select option <Estimates> of the menu. We obtain then a window very similar to
that of the two-linked QTL model for a single chromosome.
The window of Permutation test is analogous to that of the single-chromosome analysis, whereas the window for
re-sampling (bootstrap and jackknife) function slightly differs from that of single-chromosome analysis.
Testing H2:H1 is not presented among functions of
option <Significance>. This is based on the assumption
that pairs of chromosomes to be tested for epistasis
include only chromosomes with significant individual
effect on the trait(s), hence both members of the tested
pair should affect the trait(s).
Like in single-chromosome analysis, options <Global Resampling> and <Global permutation test> are
available in the two-chromosome analysis.
For multiple-trait complexes, the two-chromosome analysis does not include the function Traits
Contribution that has a more exploratory character.
Here we demonstrate how to create sub-models and
compare models. The example is for pair chrom4 and
chrom5.
The default model (with epistasis) is highlighted in red.
We can see that epistatic parameters e1-e4 are small
(relative to d1, d2, h1 and h2). Is epistasis between
chrom4 and chrom5 significant? We create a sub-model
assuming no epistasis (green).
Now we can compare the models.
Thus, all main functions employed for epistasis within chromosome (for linked QTLs) are available for epistasis
analysis for QTLs from non-homologous chromosomes.
Closing the <Computation of the pair chromosome> window returns the standard panel of single-chromosome
analysis. Using options <Save> or <Save As>, or closing the problem, results in saving all results of the
analysis of chromosome pairs. In the next round of analysis with the saved job file, we can see all the variants
and results of the analysis conducted using function <TwoChromSet>. This information can be seen in the
window <Computation of the pair chromosome>.
Important note:
In the new version of the package, the information resulting from analysis of epistasis between non-homologous
chromosomes is stored in a special file with the same name but a different extension (*.twc).
For more details see section “Separated storage of the results”.
Data input for half-sibs population
The new version 2.6 allows analyzing data of the half-sib mapping design that is widely used in dairy cattle genetics. The
usual way to treat such data is via the “backcross” model. In such a case, the estimated effect is half of the true additive
effect. It is noteworthy that in MultiQTL the output substitution effect is d=(XQQ-Xqq)=2D (which is twice
the additive effect). Thus, in the case of the half-sib design, our d represents exactly the estimate of the additive effect.
Another specific aspect of using MultiQTL for the half-sib mapping design concerns the data input. For input, the data
should be prepared in EXCEL format. In the input table (file *.xls), the first line contains the names of the markers
and quantitative traits. The second line gives the names of the sire’s alleles. The remaining lines of the table
contain the progeny alleles. Thus, the number of lines is equal to the progeny size plus 2. High flexibility is allowed
for allele designation: alleles can be numbers, characters, or mixed (among marker loci). For example: 200 190, A B,
200 A. The two alleles of a diploid progeny can be separated, e.g., by a space, comma, or slash, or presented
without any division symbol, e.g., AB.
Data input is conducted as follows:
We start with calling the data
table. For that, while creating a
new project, we choose in
Population Type the option half
sibs. After pressing button Import
Data, the data format xls is
selected and then, in the tree of
folders and files, we choose the
required file. This selection will
result in the data
table appearing on the screen. The next
steps should transform this data
set to “backcross” type data.
The data table looks as follows (in the figure, the controls referred to below are labeled 1 to 8):
- We should mark the line of marker names (1), mark in box <Selected String> the line Column Names (2),
and then, by pressing button <Take> (3), we enter marker names (and trait names, if present in the same
file) into MultiQTL.
- Now we click on the line with the parental alleles (4), then in box <Selected String> click on Sire’s
genotype (5), and then press button <Take> to enter the names of the sire’s alleles.
- Choosing each marker by the button corresponding to the <Name of marker> (6) and then button <Take>,
we transform the information in the column to the internal format of MultiQTL. However, this input will be
correct only for markers with a space as the division symbol. If no division symbol is present, then, before
pressing button <Take>, we should mark <No delimiter> (7). If the division symbol is a comma or slash,
then before pressing button <Take>, we should enter the corresponding division symbol as <Allele
delimiter> (8). After finishing the input of each marker, we should confirm it by pressing <OK>.
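The handling of the division symbol in the last step can be sketched as follows. This is an illustration under our own assumptions, not the import routine of MultiQTL: the function split_alleles is hypothetical, and for the case without a division symbol we assume that each allele is written as a single character (as in the example AB).

```python
def split_alleles(cell, delimiter=" ", no_delimiter=False):
    """Split one genotype cell into its two alleles."""
    cell = cell.strip()
    if no_delimiter:
        # Assumption: without a delimiter each allele is a single character.
        if len(cell) != 2:
            raise ValueError(f"cannot split {cell!r} without a delimiter")
        return cell[0], cell[1]
    parts = [p.strip() for p in cell.split(delimiter)]
    parts = [p for p in parts if p]  # ignore repeated delimiters
    if len(parts) != 2:
        raise ValueError(f"expected two alleles in {cell!r}")
    return parts[0], parts[1]
```

For example, split_alleles("200 190") uses the default space, split_alleles("200/A", "/") corresponds to entering a slash as the allele delimiter, and split_alleles("AB", no_delimiter=True) corresponds to marking the no-delimiter option.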
Building submodels with variance effect for multiple-environment and multiple-family data (to parts 8-9)
In the previous version, there was no possibility to create models and submodels with variance effect for multiple-environment
(and multiple-family) data. In the current version, it is possible to create models with variance
effect for single-trait analysis. Likewise, starting from a model with no variance effect, a variance effect can be defined for selected or all
environments by using the corresponding sub-model options.
During creation of a model, you can define a special model (that will differ from
the default one) by switching off the indicated function.
Then, upon adding a new model, a new window <Submodel> opens
that allows setting the parameters of the special model. The same
window is used for defining sub-models if the created multiple-environment
model was without variance effect (by default). For
example, we may have data for two environments with a
single-trait model without variance effect.
We can use option <Submodel Add> to
create several submodels with variance effect.
For that, in the panel <Submodel>, the check button <no variance effect> should be switched to state
<Off>. Then choose both environments in the list <Environments> and press buttons <Apply>
and <Ok> to obtain the new solution (green graph).
You can also choose only one of the environments from the
list. Each of these two choices then gives a
separate graph: yellow (variance effect allowed for the
second but not the first environment) and blue (variance
effect allowed for the first but not the second
environment). Using the <About submodel> option, you
can see the description of the submodels.
References
Our algorithms are based on theoretical papers of the QTL mapping community and on our own publications. A list of our relevant publications is provided in Part 1, pages 12-13. Here we provide references to all other papers cited in any of the Parts 1-12.
Benjamini Y. and Hochberg Y. 1995, Controlling the false discovery rate: A practical and
powerful approach to multiple testing. J. Roy. Stat. Soc., Ser. B. 57: 289-300.
Crow J.F. 1990, Mapping functions. Genetics 125: 669-671.
Darvasi A. and Soller M. 1992. Selective genotyping for determination of linkage between a
marker locus and a quantitative trait locus. Theor. Appl. Genet. 85: 353–359.
Churchill G.A. and Doerge R.W. 1994, Empirical threshold values for quantitative trait
mapping. Genetics 138: 963-971.
Jansen J., de Jong A.G., van Ooijen J.W. 2001, Constructing dense genetic linkage maps.
Theor. Appl. Genet. 102: 1113–1122.
Jansen R.C., Van Ooijen J.M., Stam P., Lister C., and Dean C. 1995, Genotype-by-environment
interaction in genetic mapping of multiple quantitative trait loci. Theor. Appl. Genet. 91: 33–37.
Jansen R.C. and Stam P. 1994, High resolution of quantitative traits into multiple loci via
interval mapping. Genetics 136: 1447-1455.
Jiang, C. and Zeng Z.-B. 1995, Multiple trait analysis and genetic mapping for quantitative trait
loci. Genetics 140: 1111–1127.
Jiang C. and Zeng Z.-B. 1997, Mapping quantitative trait loci with dominant and missing
markers in various crosses from two inbred lines. Genetica 101: 47–58.
Kao C.-H., Zeng Z.-B., and Teasdale R.D. 1999, Multiple interval mapping for quantitative
trait loci. Genetics 152: 1203–1216.
Lander E.S., Green P., Abrahamson J., et al. 1987, MapMaker: An interactive computer
package for constructing genetic linkage maps of experimental and natural populations.
Genomics 1: 174-181.
Lebreton C.M. and Visscher P.M. 1998, Empirical nonparametric bootstrap strategies in
quantitative trait loci mapping: Conditioning on the genetic model. Genetics 148: 525–535.
Luo Z.W., Wu C.-I., and Kearsey M.J. 2002, Precision and high-resolution mapping of quantitative
trait loci by use of recurrent selection, backcross and intercross schemes. Genetics 161: 915-929.
Stam P. 1993, Construction of integrated genetic linkage maps by means of a new computer
package: JoinMap. The Plant Journal 3: 739-744.
Walling G.A., Visscher P.M., and Haley C.S. 1998, A comparison of bootstrap methods to
construct confidence intervals in QTL mapping. Genet. Res. 71: 171-180.
Zeng Z-B. 1994, Precise mapping of quantitative trait loci. Genetics 136: 1457–1468.
Zeng Z-B., Kao C.-H., and Basten C.J. 1999, Estimating the genetic architecture of quantitative
traits. Genet. Res. 74: 279-289.
Good luck!
We will be glad to receive your comments, criticisms and suggestions.