Supplementary Materials of Article

A long-read transcriptome assembly of cotton (Gossypium hirsutum) and intraspecific SNP discovery

Hamid Ashrafi 1*, Amanda M. Hulse-Kemp 2, Fei Wang 2, Joshua Udall 3, Don Jones 4, Marta Matvienko 5, Keithanne Mockaitis 6, David M. Stelly 2, Allen Van Deynze 1

1- University of California-Davis, Department of Plant Sciences and Seed Biotechnology Center, One Shields Ave, Davis CA 95616
2- Texas A&M University, Department of Soil and Crop Sciences, College Station, TX 77843
3- Brigham Young University, Plant and Wildlife Science Department, Provo, UT 84062
4- Cotton Incorporated, Cary, NC 27513
5- University of California-Davis, Genome Center, One Shields Ave, Davis CA 95616. Current Address: CLC bio, Cambridge, Massachusetts, United States of America
6- Department of Biology, Indiana University, 915 E. Third St., Bloomington IN 47405

Fig. S1. Number of sequences with length X
x-axis: Length of Sequences; y-axis: Number of Sequences
N50 = 1100; Max = 13,697; Min = 101

Fig. S2. Data Distribution
Categories: No BLAST, No BLAST hit, No Mapping, No Annotation, Annotated, Total; y-axis: Number of Sequences

Fig. S3. Distribution of GO terms derived from each database
x-axis: Database (UniProtKB, TAIR, GR_protein, FB, MGI, SGN, ZFIN, RGD, WB); y-axis: Number of GO Terms

Fig. S4. Distribution of Number of Sequences with GO Terms
x-axis: Number of GOs (0 to 25, and >25); y-axis: Number of Sequences

Fig. S5. Percent of Annotated Sequences Relative to Sequence Length
x-axis: Sequence Length; y-axis: Percent Annotated (0 to 100)

Fig. S6
x-axis: Sequences; y-axis: Evidence Codes (IEA, RCA, IDA, ISS, ND, IMP, ISM, IEP, TAS, IPI, IGI, IBA, NAS, IC, ISO, ISA)

Fig. S7. Number of BLAST Hits to Available GenBank Species Sequences
x-axis: Number of hits; y-axis: Species
Species: Vitis vinifera, Glycine max, Populus trichocarpa, Arabidopsis thaliana, Cucumis sativus, Prunus persica, Ricinus communis, Solanum lycopersicum, Fragaria vesca, Medicago truncatula, Oryza sativa, Arabidopsis lyrata, Zea mays, Sorghum bicolor, Capsella rubella, Brachypodium distachyon, Lotus japonicus, Hordeum vulgare, Gossypium hirsutum, Aegilops tauschii, Picea sitchensis, Triticum urartu, Nicotiana tabacum, Solanum tuberosum, Malus x, Selaginella moellendorffii, Physcomitrella patens, Thellungiella halophila, unknown, others

Fig. S8. Number of BLAST Top-Hits to Sequences Available in GenBank
x-axis: Number of Hits; y-axis: Species
Species: Vitis vinifera, Populus trichocarpa, Glycine max, Fragaria vesca, Medicago truncatula, Arabidopsis thaliana, Oryza sativa, Lotus japonicus, Sorghum bicolor, Jatropha curcas, Hevea brasiliensis, Nicotiana tabacum, Gossypium raimondii, Citrus sinensis, Brachypodium distachyon

Fig. S9

Fig. S10

Fig. S11

Chr 1

Fig. S12-a. Alignment of TM-1 454 and EST sequences to the genome of G. raimondii. a-d) Alignment of G. raimondii genes, CDS, gene-based expression, and translated-region expression, respectively, to its own genome sequence. e, f) Left- and right-hand sides of breakpoints when TM-1 reads were mapped to the G. raimondii genome. g) Large InDels. h) Structural variants.

Chr 2

Fig. S12-b

Chr 3

Fig. S12-c

Chr 4

Fig. S12-d

Chr 5

Fig. S12-e

Chr 6

Fig. S12-f

Chr 7

Fig. S12-g

Chr 8

Fig. S12-h

Chr 9

Fig. S12-i

Chr 10

Fig. S12-j

Chr 11

Fig. S12-k

Chr 12

Fig. S12-l

Chr 13

Fig. S12-m