Mining CIMMYT germplasm data to inform breeding targets for CC adaptation

Zakaria KEHEL, Jose CROSSA, Thomas PAYNE and Matthew REYNOLDS

Rabat, Morocco, 24-27 June 2014

THEME – 1 Mining CIMMYT germplasm data to inform breeding targets for CC adaptation

Page 2: CIMMYT Wheat Germplasm Bank

Collection | Wild | Landrace | Breeding materials | Genetic stocks | Cultivars | Unknown or Other | TOTAL
Bread Wheat | 213 | 32,428 | 41,995 | 8,150 | 6,278 | 331 | 89,395
Durum Wheat | 25 | 5,578 | 14,262 | 1,089 | 1,156 | 58 | 22,356
Triticale | 0 | 0 | 16,964 | 3,402 | 345 | 9 | 20,720
Barley | 0 | 669 | 13,898 | 200 | 1,755 | 11 | 16,533
Species & other | 6,541 | 1,658 | 155 | 820 | 30 | 15 | 9,219
Rye | 36 | 109 | 132 | 168 | 219 | 13 | 677
Total | 6,816 | 40,442 | 91,057 | 13,829 | 9,783 | 437 | 158,713

TOTAL (excl. barley): 142,180

Page 3: WGB: Opportunities, Challenges and Gaps

● Pedigrees, for GWAS or GS precision

● Phenotypes, so expensive (Curation)

● Core reference sets (SeeD, GCP, WGB, FIGS)

● GRIN Global and GeneSys

● Actions as a “global system”

● Little overlap with USDA and ICARDA

The phenotypic values, representing over 11.2M data points, are held in CIMMYT’s IWIS database.

Their value exceeds USD 100M if the trials that produced the assembled data were to be repeated today.

Page 4: WGB: Opportunities, Challenges and Gaps

● Species accessions

Too many!

Yet, extent of in situ diversity?

Generate new diversity with existing accessions?

● Frustration of limited access to new, improved germplasm (this might also extend to collecting landraces).

● Most exchange is bank-to-bank

● “my institution/government owns the germplasm”

Page 5: Data control of wheat nurseries

Data quality control (single field analysis)

Identification of outliers

Verification (field books)

Data storage

Database with metadata available
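The outlier identification step of the single-field quality control could be sketched as follows. This is a minimal illustration and not the procedure used for the CIMMYT nurseries: it assumes a robust z-score built from the median and the median absolute deviation (MAD), a cutoff of 3.5, and made-up plot yields in t/ha. Flagged values would then be checked against the field books.

```python
from statistics import median

def flag_outliers(values, cutoff=3.5):
    """Return indices of values whose robust z-score exceeds the cutoff.

    Robust z = 0.6745 * (x - median) / MAD, where MAD is the median
    absolute deviation. The cutoff of 3.5 is a common rule of thumb,
    assumed here for illustration only.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread: this rule cannot flag anything
    return [i for i, v in enumerate(values)
            if abs(0.6745 * (v - med) / mad) > cutoff]

# Hypothetical plot yields (t/ha) from one trial; 13.0 is the suspicious entry.
plot_yields = [4.1, 3.8, 4.4, 4.0, 13.0, 3.9, 4.2, 4.3]
suspect = flag_outliers(plot_yields)
print(suspect)
```

A median-based rule is used here because a single extreme plot would inflate an ordinary mean and standard deviation and could hide itself.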

Page 6: Data verified by trait and by nursery

Page 7: FIGS without roots, or imbalanced passport, characterization & evaluation data

LOC_NO | COUNTRY | LOCDESCRIP | INSTITUTEN
10601 | MAURITIUS | REDUIT | Agricultural Research and Exte
19011 | ALGERIA | ITGC-DAHMOWNE | ITGC
19012 | ALGERIA | EL HARRACH | ITGC
19121 | EGYPT | SERS EL-LIYAN | Agr. Res. Center
20701 | LEBANON | BEKA'A VALLEY | Agric. Res. Inst.
21221 | TURKEY | AGRICULTURE FACULTY | University of Trakya
22243 | INDIA | NAGAON EXP. STA. | DWR
24059 | CHINA | AN DA ALKALI SALINE SOIL INST. | Heilongjiang Academy
27121 | THAILAND | NONGKAI RICE EXP. STN. | Rice Research Inst.
41303 | UNITEDSTATES | ALABAMA AMU | Alabama A & M Univ.
42109 | MEXICO | MEXICALI | CIMMYT
42138 | MEXICO | CIANO - FULL IRRIGATION | CIMMYT
65001 | GREECE | KENTZIKO THERMI | NA
65004 | GREECE | CEREAL INSTITUTE (EPANOMI) | NAGREF-DW Dept.
65009 | GREECE | SCHOOL OF AGRICULTURE | YPSILON SA
65124 | ITALY | S.S. DI GRANICOLTURA PER LA SICILIA | NA
65127 | ITALY | PIETRANERA | Univ. di Palermo
65451 | SPAIN | LA CABANA | CIFA-Alameda del Obispo

LOC_NO | Point:COUNTRY | Polyg:COUNTRY | LOCDESCRIP | INSTITUTE
12308 | KENYA | Ethiopia | ENDEBESS | Kenya Seed Company Ltd.
19013 | ALGERIA | Morocco | AIN EL HADJAR | ITGC
19126 | EGYPT | India | KHATTARA | Agr. Res. Center
20011 | AFGHANISTAN | Kazakhstan | TAKHAR-TALOQAN | CIMMYT
20330 | IRAN | Russia | BIRJAND AGRIC. RES. STN. | SPII
21115 | SYRIA | Turkey | AL RQA | Ministry of Agriculture
21117 | SYRIA | Turkey | HRAN | Ministry of Agriculture
21121 | SYRIA | Iraq | HIMO | Ministry of Agriculture
21222 | TURKEY | Syria | AGRICULTURAL FACULTY | University of Dicle
22241 | INDIA | Bangladesh | NEPZ, UBKV | DWR
24022 | CHINA | Taiwan | KIMMEN A.E.S. | Qinghai Academy
29501 | TAJIKISTAN | Afghanistan | TAJIK A.R.I. | Kazakh Scientific Res. Inst.
42124 | MEXICO | United States | VALLE DE MEXICALI | Agric. Int. de Mexico
47702 | DOMINICANREP | Haiti | QUINIGUA | C. de Des. Agrop.
61403 | HUNGARY | Serbia | SZEGED C.R.I. | Cereal Res. Inst.
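A check that would produce a list like the second table can be sketched as below. It assumes that Point:COUNTRY is the country recorded for the location and that Polyg:COUNTRY is the country whose map polygon contains the recorded coordinates; this reading of the two column names is an assumption. The first three rows are copied from the table, the last one is a hypothetical consistent record.

```python
# Rows as (LOC_NO, Point:COUNTRY, Polyg:COUNTRY).
records = [
    (12308, "KENYA", "Ethiopia"),
    (19013, "ALGERIA", "Morocco"),
    (21222, "TURKEY", "Syria"),
    # Hypothetical consistent record, added only to show a passing case.
    (99999, "MEXICO", "Mexico"),
]

def normalise(name):
    """Compare country names case-insensitively and without spaces."""
    return "".join(name.split()).upper()

def mismatched(rows):
    """Return the LOC_NO values whose two country fields disagree."""
    return [loc for loc, point, polygon in rows
            if normalise(point) != normalise(polygon)]

print(mismatched(records))
```

Stripping spaces before comparing matters because the passport field stores names such as UNITEDSTATES without a space.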

Page 8: Data control: a continuing process

• The same location with different management systems has only one planting and harvest date

• Full irrigation or irrigated locations with “NO” irrigation in the corresponding field value

• Same location, IRR YLD less than RF YLD

• 13 Ton/Ha in RF location (Mexico Obregon) as an example

Other control methods with time

• Outliers across locations and years

• Validating dates using earlier years or neighboring locations

• RF versus IRR

• …
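Several of the checks listed above are simple rules over a location record, and could be sketched as below. The field names and the 10 t/ha ceiling for rainfed yield are hypothetical choices made for this illustration, not values taken from the CIMMYT database.

```python
def check_location(loc):
    """Return a list of rule violations for one location record.

    `loc` is a dict with hypothetical field names:
      management: "IRR" or "RF"
      irrigated:  value of the irrigation field, "YES" or "NO"
      irr_yield, rf_yield: mean yields (t/ha) under each system, or None
    """
    problems = []
    if loc["management"] == "IRR" and loc["irrigated"] == "NO":
        problems.append("irrigated location recorded with NO irrigation")
    irr, rf = loc.get("irr_yield"), loc.get("rf_yield")
    if irr is not None and rf is not None and irr < rf:
        problems.append("irrigated yield below rainfed yield")
    # Assumed plausibility ceiling for rainfed yield, for illustration only.
    if rf is not None and rf > 10.0:
        problems.append("implausibly high rainfed yield")
    return problems

# Hypothetical record mirroring the 13 t/ha rainfed example.
record = {"management": "RF", "irrigated": "NO",
          "irr_yield": None, "rf_yield": 13.0}
print(check_location(record))
```

Returning a list of messages rather than a single flag lets one record fail several rules at once, which helps when the record goes back to the collaborator for verification.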

Page 9: MET Analysis

MET data (All, RF, IRR)

GxE analysis: variance components, G corr, BLUPs, stability

GxE with covariables: patterns of GxE (spatially changing relationships)

Identification of covariables (factors, variates)

Metadata stored in the DB

Page 10: Change in yield variability in Wheat Nursery

[Figure, panel 1: mean, max and min with linear trends; y-axis from 0 to 16. Fitted lines as printed on the chart: y = 0.0176x + 4.3539 (R² = 0.1952); y = 0.113x + 8.0045 (R² = 0.5792); y = 0.0013x + 1.0054 (R² = 0.0005)]

[Figure, panel 2: Vg and Vgxe with linear trends; y-axis from 0 to 100. Fitted lines as printed on the chart: y = -0.3232x + 15.653 (R² = 0.4153); y = 0.308x + 82.476 (R² = 0.3959)]
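Trend lines of the form y = ax + b with an R² value, as reported for this slide, come from an ordinary least-squares fit. The sketch below shows that calculation on a made-up series; the cycle index and yields are hypothetical and do not reproduce the chart.

```python
def linear_trend(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-predictor fit, R^2 equals the squared correlation.
    r2 = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r2

# Hypothetical series: nursery cycle index and mean yield (t/ha).
cycles = [1, 2, 3, 4, 5]
mean_yield = [4.0, 4.5, 4.2, 4.9, 5.0]
slope, intercept, r2 = linear_trend(cycles, mean_yield)
print(f"y = {slope:.4f}x + {intercept:.4f}, R2 = {r2:.4f}")
```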

Page 11: Climate/stage driving variability

[Figure, panel 1: tmin_veg, tmin_rep, tmin_gf and tmin_seas for nurseries ESWYT30, ESWYT29, ESWYT28, ESWYT27, ESWYT26, ESWYT25, ESWYT24, ESWYT23, ESWYT22 and ESWYT20; y-axis from 0 to 0.7]

[Figure, panel 2: tmax_veg, tmax_rep, tmax_gf and tmax_seas for the same nurseries; y-axis from 0 to 0.45]

Page 12: How to use all these outputs? Genotypic sensitivities

tavg_gf tavg_rep tavg_seas tavg_veg tmax_gf tmax_rep tmax_seas tmax_veg tmin_gf tmin_rep tmin_seas tmin_veg
5398434 5390631 5398434 5390631 5398434 5390631 5398434 5390631 5398434 5551629 5551629 5551629
5534459 5551747 5551629 5551629 5534459 5551747 5390631 5552140 5534459 5551747 5398434 5390631
2430154 5551765 5390631 5551765 2430154 2430154 2430154 5551765 2430154 5390631 5398450 5552189
5398450 5551629 5398450 5552140 5398450 5398434 5398450 5551629 5534344 5551765 5390631 5552010
5398424 2430154 2430154 5552010 5398424 5551765 5551629 5398450 5398424 5534344 2430154 5552193
5551747 5534459 5551747 5398450 5390631 5534459 5551747 5398434 5398450 5534459 5534312 5534312

FDgf FDrep FDveg prec_gf prec_veg R10mmCL R10mmgf R10mmrep R10mmveg R5mmCL R5mmrep R5mmveg
5398530 5398530 5535500 5534326 5534326 5534312 5398530 5534312 5534312 5534312 5534312 5534312
5534312 2673706 5534475 5551704 5551704 5535534 5534312 5535534 5535534 5535534 5535534 5535534
5534459 4893489 5398136 5551798 5551798 5552327 5534459 5552327 5552327 5552327 5552327 5552327
5551690 5398471 2673706 5398160 5535534 5398530 5551690 5398530 5398530 5398530 5398530 5398530
5552193 5535415 5535514 5534335 5534335 5535415 5552193 5535415 5535415 5535415 5535415 5535415
2430154 5535428 5398471 5534339 5398160 5390809 2430154 5390809 5390809 5390809 5390809 5390809
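One way to read "genotypic sensitivities" is as the slope obtained by regressing each genotype's yield on a climate covariate across environments, then ranking genotypes by that slope. The sketch below follows that reading, which is an assumption and may differ from the analysis behind the slide. The genotype names G1 to G3 and all values are made up.

```python
def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def sensitivities(covariate, yields_by_genotype):
    """Map each genotype to its yield change per unit of the covariate."""
    return {g: slope(covariate, ys) for g, ys in yields_by_genotype.items()}

def rank_by_sensitivity(sens):
    """Genotypes ordered from most negative to most positive slope."""
    return sorted(sens, key=sens.get)

# Hypothetical data: mean Tmax during grain filling (deg C) in four
# environments and the yield (t/ha) of three made-up genotypes.
tmax_gf = [26.0, 28.0, 30.0, 32.0]
yields = {
    "G1": [6.0, 5.5, 5.0, 4.5],  # loses 0.25 t/ha per degree
    "G2": [5.0, 4.9, 4.8, 4.7],  # loses 0.05 t/ha per degree
    "G3": [6.5, 5.7, 4.9, 4.1],  # loses 0.40 t/ha per degree
}
sens = sensitivities(tmax_gf, yields)
print(rank_by_sensitivity(sens))
```

Repeating the ranking for each covariate and keeping the top few genotypes would give one column of identifiers per covariate.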

Page 13: Again, genotype sensitivities to climate are useful!

[Figure: one value per entry of the African maize nursery; y-axis from -0.25 to 0.25]

Post Silk Tmax (3% of variability of Grain WT in African Maize nursery)

To build stress populations, or to use pedigrees to identify useful parents

Page 14: Attempt to dissect GxE

Basic Model: YLD = Line + Location + Location x Year + Error

Full Model: YLD = Line + Location + Location x Year + Climate + Genetic Markers + Genetic Markers x Climate + Genetic Markers x Location + Error
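The Basic Model can be illustrated with a small fixed-effects sketch. It rests on strong assumptions that are not from the slides: the data are balanced (every line appears once in every location-year), the Location and Location x Year terms are merged into one environment effect per location-year pair, and effects are estimated as deviations of marginal means from the grand mean. The climate and marker terms of the Full Model are not covered, and all values are made up.

```python
from collections import defaultdict

def fit_basic_model(observations):
    """Fit YLD = mu + Line + Environment on balanced data.

    `observations` is a list of (line, location, year, yld) tuples in which
    every line appears once in every location-year. Location and
    Location x Year are absorbed into one environment effect per
    (location, year) pair. Effects are deviations of marginal means from
    the grand mean, which is the least-squares solution only when the
    data are balanced.
    """
    grand = sum(o[3] for o in observations) / len(observations)
    by_line, by_env = defaultdict(list), defaultdict(list)
    for line, loc, year, yld in observations:
        by_line[line].append(yld)
        by_env[(loc, year)].append(yld)
    line_eff = {k: sum(v) / len(v) - grand for k, v in by_line.items()}
    env_eff = {k: sum(v) / len(v) - grand for k, v in by_env.items()}
    return grand, line_eff, env_eff

def predict(model, line, loc, year):
    """Predicted yield: grand mean plus line and environment effects."""
    grand, line_eff, env_eff = model
    return grand + line_eff[line] + env_eff[(loc, year)]

# Hypothetical balanced trial: 2 lines x 2 location-years, yield in t/ha.
data = [
    ("A", "LOC1", 2010, 5.0), ("B", "LOC1", 2010, 4.0),
    ("A", "LOC2", 2010, 3.0), ("B", "LOC2", 2010, 2.0),
]
model = fit_basic_model(data)
print(predict(model, "A", "LOC1", 2010))
```

With unbalanced nursery data, marginal means are no longer the least-squares estimates and a mixed-model package would be the usual choice.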

Page 15: Predicting all genotypes in single location (4 Years)

[Figure: Basic Model and Full Model compared for LOC1, LOC2, LOC3, LOC4 and LOC5; y-axis from 0.2 to 0.45]

Page 16: Best matching clusters for all ESWYT locations

Page 17: Yield prediction on Elite nursery worldwide

Basic Model: Line + Loc x Year

Full Model: Line + Loc x Year + Line x Loc x Year + W + Line x W + G + G x Loc x Year + G x W

[Figure: series M1 and M5 over x-axis values 1 to 14; y-axis from 0.1 to 0.4]

Page 18: Yield prediction on Elite nursery SEA

[Figure: series M1 and M5 over x-axis values 1 to 7; y-axis from -0.15 to 0.3]

Page 19: Can we predict some genotypes in all locations?

 | 25 - 75 | 50 - 50
RF | 82.2 | 88.9
IRR | 46.7 | 57.8

Page 20: Genetic structure of 32 years of ESWYT

[Figure: annotation "Latest years"]

Page 21: Modeling Maize African nurseries (EIHYB)

[Figure, two panels labelled Model and CV: Linear Regression, SVM Regression, Random Forest and PLS Regression, each with ALL, ENV and GENO; series Drought, Optimal and LowN; y-axis from 0 to 0.9]
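The difference between a "Model" result (fit and evaluation on the same data) and a "CV" result (evaluation on held-out data) can be sketched as below. This is a generic illustration under stated assumptions: a one-covariate linear regression stands in for the four methods named on the slide, the accuracy measure is assumed to be a Pearson correlation, the data are synthetic, and 5 folds are an arbitrary choice.

```python
import random

def fit_line(xs, ys):
    """Least-squares intercept and slope of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return sab / (saa * sbb) ** 0.5

def cv_predictions(xs, ys, k=5, seed=0):
    """Out-of-fold predictions from k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    preds = [None] * len(xs)
    for fold in range(k):
        test = idx[fold::k]  # every k-th shuffled index forms one fold
        train = [i for i in idx if i not in test]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            preds[i] = a + b * xs[i]
    return preds

# Synthetic data: yield depends weakly on one covariate, plus noise.
rng = random.Random(1)
x = [rng.uniform(0, 10) for _ in range(40)]
y = [3.0 + 0.2 * v + rng.gauss(0, 1) for v in x]

a, b = fit_line(x, y)
fit_r = pearson([a + b * v for v in x], y)  # "Model": same data for fit and test
cv_r = pearson(cv_predictions(x, y), y)     # "CV": held-out data only
print(round(fit_r, 3), round(cv_r, 3))
```

Only the cross-validated figure says anything about predicting records the model has not seen, which is the situation when a line is sent to a new environment.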

Page 22: Maize landraces in Latin America

Page 23: Modeling TS and genetic structure with long-term climate

[Figure, two panels labelled Training and CV: LR, RF, SVM and KNN; series TS=C, TS=C+G(PCs), TS=C+Pop, TS=PCs, PC1=C and PC2=C; y-axis from 0 to 1]

Page 24: The Wheat Atlas Website

Environmental data

Cycle SOW_julian Emergence_Julian HARVEST_julian FOLIAR_DISEASE_DEVELOPMENT IRRIGATED LODGING

2005 11/19/2005 4/21/2006 TRACES YES SLIGHT

Varieties tested

PBW343
CHAM 6
KLEIN CHAMACO
HIDDAB
CHAKWAL 86
DHARWAR DRY
MILAN/KAUZ//PASTOR
FLORKWA-1/DHARWAR DRY
PASTOR/BAV92
CNDO/R143//ENTE/MEXI_2/3/AEGILOPS SQUARROSA (TAUS)/4/WEAVER/5/PASTOR
VEBOW/IRENA
PASTOR/DHARWAR DRY
PASTOR//MILAN/KAUZ
BJY/COC//PRL/BOW/3/FRTL
RL6043/4*NAC//PASTOR
BERKUT
SERI*3//RL6010/4*YR/3/PASTOR/4/BAV92
SOROCA
PARUS/PASTOR
ASTREB
PASTOR//HXL7573/2*BAU
PASTOR//HXL7573/2*BAU
PASTOR/3/BJY/COC//PRL/BOW
SOKOLL
SOKOLL
SRMA/TUI//PASTOR
ALTAR 84/AE.SQUARROSA (224)//2*CUPE/3/BAV92
SKAUZ/PASTOR/3/CROC_1/AE.SQUARROSA (224)//OPATA
CNO79//PF70354/MUS/3/PASTOR/4/BAV92
CNO79//PF70354/MUS/3/PASTOR/4/BAV92
MILAN/KAUZ//PRINIA/3/BAV92
MILAN/KAUZ//DHARWAR DRY/3/BAV92
MILAN/KAUZ/3/URES/JUN//KAUZ/4/CROC_1/AE.SQUARROSA (224)//OPATA
KABY/BAV92/3/CROC_1/AE.SQUARROSA (224)//OPATA
PASTOR/FLORKWA-1//BAV92
BOW//BUC/BUL/3/KAUZ/4/BAV92/5/MILAN/KAUZ
PASTOR//MILAN/KAUZ/3/VEE/PJN//2*TUI
PASTOR//MILAN/KAUZ/3/BAV92
BJY/COC//PRL/BOW/3/MILAN/KAUZ/4/BAV92
RL6043/4*NAC//PASTOR/3/BAV92
RL6043/4*NAC//PASTOR/3/BAV92
KAUZ/BAV92/3/BJY/COC//PRL/BOW
ATTILA/PASTOR
FRAME/BUCHIN
SLVS/PASTOR
PASTOR*2/BAV92
SKAUZ/BAV92//PASTOR
ATTILA/BAV92//PASTOR
TEMPORALERA M 87*2/KONK

Traits

Grain yield
Days to heading
Plant height
Agronomic score

Can feed the phenology table presented earlier

Page 25: The CIMMYT IWIS web-page

http://apps.cimmyt.org/wpgd/index.htm

Table of genotype by location values + mean, max, min and SD of genotypes and locations

Establish a win-win relationship with collaborators: they send data, we provide analysis and reports
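The genotype by location table with its mean, max, min and SD summaries could be computed along the lines below. This is a sketch only, not the code behind the IWIS pages: the data layout, the function name `summarise` and all values are hypothetical, and the SD is the sample standard deviation, which needs at least two values per row or column.

```python
from statistics import mean, stdev

def summarise(table):
    """Summaries for the rows and columns of a genotype-by-location table.

    `table` maps genotype -> {location: value}. Returns two dicts, keyed by
    genotype and by location, each holding mean, max, min and sample SD.
    """
    def stats(values):
        return {"mean": mean(values), "max": max(values),
                "min": min(values), "sd": stdev(values)}

    locations = sorted({loc for row in table.values() for loc in row})
    by_genotype = {g: stats(list(row.values())) for g, row in table.items()}
    by_location = {loc: stats([row[loc] for row in table.values() if loc in row])
                   for loc in locations}
    return by_genotype, by_location

# Hypothetical yields (t/ha); genotype and location names are placeholders.
yields = {
    "GEN1": {"LOC1": 5.0, "LOC2": 3.0},
    "GEN2": {"LOC1": 4.0, "LOC2": 2.0},
}
by_gen, by_loc = summarise(yields)
print(by_gen["GEN1"]["mean"], by_loc["LOC2"]["max"])
```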

Page 26: IWIN-DAP: An Excel Add-In to analyze CIMMYT data

Page 27: Conclusions

● Curation is important

● Very helpful: completing information at the genebank and creating stress populations accelerate germplasm exchange

● Pipelines for prediction and genomic selection: pedigrees and markers

● Data management and sharing; analytical and visualization tools

● Collaborations