Genetic Evaluation
Performance data
EPD / EBV
Pedigree information
A: relationships based on pedigree data
Henderson’s rules to create the inverse of the relationship matrix
BLUP (iteration-on-data methods)
Single-Step Genetic Evaluation
Performance data
BLUP
GE-EPD / GEBV
Pedigree information
SNP
A: relationships based on pedigree data
G: relationships based on genomic data
H: blended relationships
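Single-step approach to genomic evaluation
• Traditional genetic evaluation
$$
\begin{bmatrix}
\mathbf{X}'\mathbf{X} & \mathbf{X}'\mathbf{Z} \\
\mathbf{Z}'\mathbf{X} & \mathbf{Z}'\mathbf{Z} + \alpha\mathbf{A}^{-1}
\end{bmatrix}
\begin{bmatrix} \mathbf{b} \\ \mathbf{u} \end{bmatrix}
=
\begin{bmatrix} \mathbf{X}'\mathbf{y} \\ \mathbf{Z}'\mathbf{y} \end{bmatrix}
$$
• Single-step genomic evaluation
$$
\begin{bmatrix}
\mathbf{X}'\mathbf{X} & \mathbf{X}'\mathbf{Z} \\
\mathbf{Z}'\mathbf{X} & \mathbf{Z}'\mathbf{Z} + \alpha\mathbf{H}^{-1}
\end{bmatrix}
\begin{bmatrix} \mathbf{b} \\ \mathbf{u} \end{bmatrix}
=
\begin{bmatrix} \mathbf{X}'\mathbf{y} \\ \mathbf{Z}'\mathbf{y} \end{bmatrix}
$$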
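As a rough illustration of the two systems above, the sketch below builds and solves the mixed model equations with NumPy for a toy data set. The function name `solve_mme`, the toy data, and the interpretation of α as the ratio of residual to additive genetic variance are assumptions made for this example; the slides do not define α. Passing the inverse of H instead of the inverse of A gives the single-step version of the same system.

```python
import numpy as np

def solve_mme(X, Z, y, rel_inv, alpha):
    """Build and solve the mixed model equations for b (fixed) and u (animal).

    rel_inv is A^-1 for a traditional evaluation or H^-1 for single-step.
    alpha is assumed here to be residual variance / additive variance.
    """
    lhs = np.block([
        [X.T @ X, X.T @ Z],
        [Z.T @ X, Z.T @ Z + alpha * rel_inv],
    ])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    solution = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return solution[:n_fixed], solution[n_fixed:]

# Hypothetical toy data: 3 unrelated animals, one record each, overall mean only.
X = np.ones((3, 1))
Z = np.eye(3)
y = np.array([10.0, 12.0, 14.0])
A_inv = np.eye(3)   # unrelated, non-inbred animals
alpha = 2.0         # assumed variance ratio

b, u = solve_mme(X, Z, y, A_inv, alpha)
```

A dense solve like this is only practical for very small problems; it is meant to show the structure of the equations, not how the BLUPF90 programs solve them.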
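Single-step genomic evaluation
$$
\begin{bmatrix}
\mathbf{X}'\mathbf{X} & \mathbf{X}'\mathbf{Z} \\
\mathbf{Z}'\mathbf{X} & \mathbf{Z}'\mathbf{Z} + \alpha\mathbf{H}^{-1}
\end{bmatrix}
\begin{bmatrix} \mathbf{b} \\ \mathbf{u} \end{bmatrix}
=
\begin{bmatrix} \mathbf{X}'\mathbf{y} \\ \mathbf{Z}'\mathbf{y} \end{bmatrix}
$$
• Inverses
$$
\mathbf{H}^{-1} = \mathbf{A}^{-1} +
\begin{bmatrix}
\mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1}
\end{bmatrix}
$$
$\mathbf{A}$ = numerator relationship matrix
$\mathbf{A}_{22}$ = pedigree relationships between genotyped animals
$\mathbf{G}$ = genomic relationship matrix
Aguilar et al., 2010; Christensen & Lund, 2010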
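The inverse of H can be sketched in a few lines of NumPy: start from the inverse of A and add the difference between the inverses of G and A22 at the rows and columns of the genotyped animals. The function name `h_inverse`, the toy pedigree, and the values in `G` are hypothetical and chosen only for illustration; this is not the code used by the BLUPF90 programs.

```python
import numpy as np

def h_inverse(A_inv, A22, G, genotyped):
    """H^-1 = A^-1 + (G^-1 - A22^-1) added to the genotyped block.

    genotyped lists the row/column positions of genotyped animals in A, so
    the animals do not need to be ordered with genotyped animals last.
    """
    H_inv = A_inv.copy()
    idx = np.asarray(genotyped)
    H_inv[np.ix_(idx, idx)] += np.linalg.inv(G) - np.linalg.inv(A22)
    return H_inv

# Toy pedigree: animals 0 and 1 are unrelated parents, animal 2 is their progeny.
A = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
A_inv = np.linalg.inv(A)

genotyped = [1, 2]                      # animals 1 and 2 have genotypes
A22 = A[np.ix_(genotyped, genotyped)]
G = np.array([[1.0, 0.6],               # hypothetical genomic relationships
              [0.6, 1.0]])

H_inv = h_inverse(A_inv, A22, G, genotyped)

# If G equals A22, the extra block vanishes and H^-1 reduces to A^-1.
H_inv_no_genomic_info = h_inverse(A_inv, A22, A22, genotyped)
```

Inverting `H_inv` in this toy case returns a matrix whose genotyped block equals `G`, which is a convenient sanity check on the construction.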
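Extra matrices required for single-step
• Inverses
$$
\mathbf{H}^{-1} = \mathbf{A}^{-1} +
\begin{bmatrix}
\mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1}
\end{bmatrix}
$$
$\mathbf{A}_{22}$ = pedigree relationships between genotyped animals
$\mathbf{G}$ = genomic relationship matrix
PREGSF90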
Genomic Relationship Matrix - G
• G = ZZ’/k
– Z = matrix of SNP markers
– Dimension of Z = n × p
– n animals
– p markers
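A minimal sketch of G = ZZ’/k follows. The slide does not say how Z is centered or what k is, so this example assumes a common VanRaden-style choice: Z holds the 0/1/2 genotype codes centered by twice the allele frequency, and k = 2 Σ p(1 − p). The function name and the genotype matrix `M` are hypothetical, and missing genotypes are not handled.

```python
import numpy as np

def genomic_relationship(M):
    """Return G = ZZ'/k for an n x p matrix of 0/1/2 genotype codes.

    Assumptions (not stated in the slides): Z = M - 2p with p the allele
    frequency computed from the genotypes, and k = 2 * sum(p * (1 - p)).
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0            # allele frequency per marker
    Z = M - 2.0 * p                     # centered marker matrix, n x p
    k = 2.0 * np.sum(p * (1.0 - p))     # scaling constant
    return Z @ Z.T / k

# Hypothetical genotypes: n = 3 animals (rows), p = 4 markers (columns).
M = np.array([[0, 1, 2, 1],
              [1, 1, 0, 2],
              [2, 0, 1, 1]])
G = genomic_relationship(M)
```

Because the columns of Z are centered with frequencies estimated from the same animals, each row of this G sums to zero, so G is singular; that is one reason practical software applies further adjustments before inversion.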
Data file with SNP markers
Genotype codes
0 – Homozygous
1 – Heterozygous
2 – Homozygous
5 – No call (missing)
How BLUPF90 performs Single-Step Genomic
BLUPF90 programs
Genomic module
PREGSF90
Genomic module
RENUMF90
blupf90, (ai)remlf90, Gibbsxf90, etc.
Single Step in BLUPF90 package: RENUMF90
Data, Pedigree, Marker data
RENUMF90
Output files: renf90.dat, renaddx.ped
Parameter file
Add a keyword to the “animal effect”:
SNP_FILE marker_file_name
How BLUPF90 performs Single-Step Genomic
BLUPF90 programs
Genomic module
• Performs quality control
• Creates extra matrices
– Genomic relationship
– Pedigree relationship for genotyped animals
OPTION SNP_file marker.file
PreGSf90
• Interface program to the genomic module, used to process the genomic information for the BLUPF90 family of programs
• Efficient methods
– Creation of the genomic relationship matrix and the relationship matrix based on pedigree
– Inverse of the relationship matrices
• Former program to perform quality control of SNP information
Input files for genomic BLUPF90
• Same parameter file as for all BLUPF90 programs
– But with “OPTION SNP_file marker_file_name”
– Indicates that the genomic subroutines should be run
• Pedigree file
• Marker information (SNP file)
• Cross-reference file for renumbered IDs
– Links genotype files with the codes in the pedigree, etc.
– Generated by renumf90
SNP map file
• OPTION chrinfo <file>
• For some genomic analyses (GWAS) or QC
• Format:
– SNP number
• Index number of the SNP in the map sorted by chromosome and position
– Chromosome number
– Position
– SNP name (optional)
• The first row corresponds to the first SNP column in the genotype file!
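The map format described above can be read with a short parser such as the sketch below. The function names, the sample rows, and the use of integer chromosome numbers are assumptions for illustration; the only check applied is that the map has one row per SNP column of a genotype string, which follows from the rule that row order must match column order.

```python
def read_snp_map(lines):
    """Parse map rows: SNP number, chromosome number, position, optional name."""
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        records.append({
            "snp": int(fields[0]),
            "chromosome": int(fields[1]),
            "position": int(fields[2]),
            "name": fields[3] if len(fields) > 3 else None,
        })
    return records

def map_matches_genotypes(records, genotype_string):
    """True when the map has exactly one row per SNP column."""
    return len(records) == len(genotype_string)

# Hypothetical map rows and a hypothetical 3-SNP genotype string.
map_lines = ["1 1 10500 snpA", "2 1 20900 snpB", "3 2 5000"]
records = read_snp_map(map_lines)
ok = map_matches_genotypes(records, "012")
```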
Parameter files
RENUMF90: renum.par
BLUPF90: renf90.par
Pedigree file from RENUMF90
• 1 - animal number
• 2 - parent 1 number or UPG
• 3 - parent 2 number or UPG
• 4 - 3 minus number of known parents
• 5 - known or estimated year of birth
• 6 - number of known parents; if the animal is genotyped, 10 + number of known parents
• 7 - number of records
• 8 - number of progenies as parent 1
• 9 - number of progenies as parent 2
• 10 - original animal ID
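The column layout above can be decoded as in the following sketch, which reads one whitespace-separated pedigree line and uses column 6 to tell whether the animal is genotyped. The class name, function name, and sample line are hypothetical; only the column meanings come from the list above.

```python
from dataclasses import dataclass

@dataclass
class PedRecord:
    animal: int
    parent1: int
    parent2: int
    known_parents: int
    genotyped: bool
    original_id: str

def parse_ped_line(line):
    """Decode one line of the renumbered pedigree file (10 columns)."""
    f = line.split()
    code = int(f[5])                 # column 6
    genotyped = code >= 10           # 10 + number of known parents if genotyped
    known_parents = code - 10 if genotyped else code
    return PedRecord(
        animal=int(f[0]),
        parent1=int(f[1]),
        parent2=int(f[2]),
        known_parents=known_parents,
        genotyped=genotyped,
        original_id=f[9],            # column 10
    )

# Hypothetical line: animal 3, both parents known, genotyped.
line = "3 1 2 1 2005 12 0 0 0 ANIM_C"
rec = parse_ped_line(line)
```

For this line, column 4 holds 1 (3 minus 2 known parents) and column 6 holds 12 (10 + 2), so the record is flagged as genotyped with two known parents.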
SNP file & Cross Reference ID
SNP file
First col: identification, could be alphanumeric
Second col: SNP markers {codes: 0, 1, 2, and 5 for missing}
Cross Reference ID
Pedigree file (from RENUMF90)
Original ID
Renumbered ID
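A line of the SNP file, as described above, can be split into the animal identification and a vector of genotype codes. The sketch below assumes the two columns are separated by whitespace and that the genotypes form one contiguous string of digits; the function name and sample line are hypothetical.

```python
import numpy as np

VALID_CODES = (0, 1, 2, 5)   # 5 = missing

def parse_snp_line(line):
    """Return (animal_id, codes) for one line of the SNP file."""
    animal_id, genotype_string = line.split()
    codes = np.array([int(c) for c in genotype_string], dtype=np.int8)
    if not np.isin(codes, VALID_CODES).all():
        raise ValueError(f"unexpected genotype code for {animal_id}")
    return animal_id, codes

# Hypothetical line with four SNP, the last one missing.
animal_id, codes = parse_snp_line("ANIM_A 0125")
```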
Quality control: excluded by default
• MAF
– SNP with MAF < 0.05
• Call rate
– SNP with call rate < 0.90
– Individuals with call rate < 0.90
• Monomorphic
– Exclude monomorphic SNP. ONLY when MAF <> 0
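The default MAF and call-rate filters listed above can be sketched as follows, using the 0/1/2 genotype codes with 5 for missing. The function name and the toy genotype matrix are hypothetical, and the sketch does not try to reproduce the order in which preGSf90 applies its checks; it only flags SNP and individuals against the two thresholds.

```python
import numpy as np

MISSING = 5

def qc_flags(M, min_maf=0.05, min_call_rate=0.90):
    """Flag SNP (columns) and individuals (rows) that fail MAF or call rate."""
    M = np.asarray(M)
    called = M != MISSING
    snp_call_rate = called.mean(axis=0)
    animal_call_rate = called.mean(axis=1)

    # Allele frequency from called genotypes only; a SNP with no calls gets 0.
    allele_count = np.where(called, M, 0).sum(axis=0)
    n_called = np.maximum(called.sum(axis=0), 1)
    p = allele_count / (2.0 * n_called)
    maf = np.minimum(p, 1.0 - p)

    drop_snp = (snp_call_rate < min_call_rate) | (maf < min_maf)
    drop_animal = animal_call_rate < min_call_rate
    return drop_snp, drop_animal, maf

# Hypothetical genotypes: 4 animals x 4 SNP.
M = np.array([[0, 1, 2, 0],
              [1, 1, 5, 0],
              [2, 0, 5, 0],
              [1, 2, 1, 0]])
drop_snp, drop_animal, maf = qc_flags(M)
```

In this toy matrix the third SNP fails on call rate (only 2 of 4 genotypes called) and the fourth is monomorphic, so its MAF of 0 falls below the threshold; the second and third animals fail on call rate.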
Quality control: excluded by default (cont.)
• Parent-progeny conflicts (SNP & individuals)
– Exclusion -> opposite homozygous
– For SNP: > 10% parent-progeny exclusions out of the total number of pairs evaluated
– For individuals: > 1% parent-progeny exclusions out of the total number of SNP
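A parent-progeny exclusion, as described above, is a SNP where parent and progeny are opposite homozygotes. The sketch below counts exclusions for one hypothetical parent-progeny pair and compares the rate with the 1% threshold for individuals, using the total number of SNP as the denominator as stated in the slide. The function name and genotype vectors are made up for the example.

```python
import numpy as np

def opposite_homozygotes(parent, progeny):
    """Boolean vector: True where parent and progeny are opposite homozygotes."""
    parent = np.asarray(parent)
    progeny = np.asarray(progeny)
    return ((parent == 0) & (progeny == 2)) | ((parent == 2) & (progeny == 0))

# Hypothetical genotypes for one pair over 6 SNP (5 = missing).
parent = np.array([0, 2, 1, 2, 5, 0])
progeny = np.array([2, 0, 1, 2, 0, 1])

conflicts = opposite_homozygotes(parent, progeny)
rate = conflicts.sum() / len(parent)   # exclusions over total number of SNP
flagged = rate > 0.01                  # default threshold for individuals
```

Missing genotypes never count as exclusions here because code 5 matches neither homozygote.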
Control default values
• For MAF
– OPTION minfreq x
• Call rate
– OPTION callrate x
– OPTION callrateAnim x
• Mendelian conflicts
– OPTION exclusion_threshold x
– OPTION exclusion_threshold_snp x
Parent-progeny conflicts
• The presence of these conflicts results in a negative (non-positive-definite) H matrix!
• Problems in the estimation of variance components by REML, programs that do not converge, etc.
• Solution:
– Report all conflicts, with counts for each individual as parent or progeny, to trace the conflicts
– Remove the progeny genotype
• Maybe not the best option
• But it results in a positive-definite H matrix!
Genomic Matrix Options
• OPTION whichfreq x
– 0: read from file freqdata or other specified file
– 1: 0.5
– 2: current, calculated from genotypes (default)
• OPTION FreqFile file
– Reads allele frequencies from a file
• OPTION maxsnps x
– Sets the maximum length of the string for reading marker data from file => BovineHD chip
Saving ‘clean’ files
• SNP excluded by QC are set as missing (i.e. code = 5)
• Excluded individuals are treated as unrelated in G and A22
– For individual i: G[i,:] = 0; G[:,i] = 0; G[i,i] = 1; same for A22, so G - A22 will cancel out
• OPTION saveCleanSNPs
• Saves clean genotype data, with excluded SNP and individuals removed
– For example, for a SNP_file gt
– Clean files will be:
• gt_clean
• gt_clean_XrefID
– Removed SNP and individuals will be output in files:
• gt_SNPs_removed
• gt_Animals_removed
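The treatment of excluded individuals described above (row and column set to 0, diagonal set to 1 in both G and A22) can be checked numerically. In the sketch below, with hypothetical matrices and a hypothetical helper `neutralize`, the row and column of the excluded individual are zero both in G − A22 and in the difference of the inverses, so that individual receives no genomic contribution.

```python
import numpy as np

def neutralize(M, i):
    """Copy of M with individual i made unrelated to everyone: M[i,i] = 1."""
    out = np.array(M, dtype=float)
    out[i, :] = 0.0
    out[:, i] = 0.0
    out[i, i] = 1.0
    return out

# Hypothetical relationship matrices for 3 genotyped animals.
G = np.array([[1.05, 0.40, 0.10],
              [0.40, 0.98, 0.20],
              [0.10, 0.20, 1.10]])
A22 = np.array([[1.00, 0.50, 0.00],
                [0.50, 1.00, 0.25],
                [0.00, 0.25, 1.00]])

excluded = 1
G_x = neutralize(G, excluded)
A22_x = neutralize(A22, excluded)

diff = G_x - A22_x
diff_inv = np.linalg.inv(G_x) - np.linalg.inv(A22_x)
```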
No quality control
• ONLY use:
– If QC was performed in a previous run
• preGSf90
– and the “clean” genotype file is used
• OPTION no_quality_control
PreGSf90 wiki