Relaxed selection in erythropoietic gene hemogen among high-latitude Antarctic notothenioids

by Carmen M. Elenberger

B.A. in Anthropology, University of Florida

A thesis submitted to

The Faculty of

the College of Science of

Northeastern University

in partial fulfillment of the requirements

for the degree of Master of Science

December 12, 2018

Thesis directed by

H. William Detrich

Professor of Biochemistry and Marine Biology

Copyright 2018

Carmen Elenberger

Acknowledgements

First and foremost, I would like to thank my advisor, Dr. H. William Detrich, for his guidance

and his support over the past four years. He challenged me to broaden my horizons and gave me

the opportunity to travel to the ends of the earth in order to do so. I would also like to thank Dr.

Thomas Desvignes, as well as Laura Goetz and Sierra Smith, for their assistance in conducting

field work for this project. I would like to extend further thanks to Dr. Jacob Daane for

permitting me to use his unpublished data to expand my analyses. Many thanks to Biology Open

for allowing me to reproduce their figure with permission [1].

I would like to thank my committee members, Dr. A. Randall Hughes and Dr. Steve

Vollmer, for their interest in my research and their advice in analyzing and framing the results of

my research. I would also like to thank my labmate, Dr. Michael Peters, and our lab manager,

Sandra Parker, for their advice, assistance, and encouragement over the years. Additionally, I

would like to thank the faculty and staff of the Marine Science Center, as well as the funding

sources for this research. Special thanks to the staff of Palmer Station and the crew of the

Laurence M. Gould for a productive and memorable field season. Finally, I would like to thank

my friends and family for their unwavering support and encouragement, now and always.

Abstract of Thesis

Antarctic icefish (Channichthyidae) are the only vertebrate taxon with an erythrocyte-null

phenotype, and present an interesting model for studying the evolution and regulation of

erythropoiesis. The gene hemogen has been identified to encode a protein which plays a role in

regulating erythropoietic processes in vertebrates. hemogen may have been impacted

by the loss of globin expression. I investigated possible relaxed selection at the hemogen locus

by looking for evolutionary change to the regulatory elements or segments encoding the

Hemogen protein, and assessed the evolutionary processes that drove hemogen variation among

Antarctic notothenioids. While regulatory mechanisms remain intact, icefish show a significant

90bp indel in exon 3 of hemogen that would disrupt conserved modules in the Hemogen protein

that are critical for erythropoiesis. Despite this, hemogen still remains expressed at low levels in

adult icefish and possesses a novel splice variant that encodes a truncated protein possibly

serving as a dominant negative for wild-type Hemogen. I conclude that while hemogen has

undergone relaxed selection and accumulated mutations that would impact erythropoietic

function in non-Antarctic fish, the observed mutations may be tolerated due to erythrocyte and

hematocrit modifications in notothenioid blood phenotypes. hemogen may have a decreased—

but still important—role to play in icefish, possibly functioning as a dominant negative for

hemogen’s role in erythropoiesis.

Table of Contents

Acknowledgements

Abstract of Thesis

Table of Contents

List of Tables

List of Figures

List of Abbreviations

Introduction

Methods

Results

Discussion

Tables and Figures

References

List of Tables

1 Primers used in PCR and qRT-PCR reactions to amplify hemogen gDNA and cDNA in

Antarctic notothenioids

2 Species sequenced and included in study of Antarctic notothenioid hemogen

3 Codon usage bias for hemogen (total coding sequence) among Antarctic notothenioids

4 Mean pairwise dN/dS for within-family comparisons of Antarctic notothenioid families

5 Mean pairwise dN/dS for between-family comparisons of Antarctic notothenioid families

6 Results of codon-based site tests conducted in CodeML on the Antarctic radiation

List of Figures

1 Zebrafish Si:dkey-25o16.2 and human Hemogen are orthologous and encode related

proteins that differ in size

2 Icefish transcript variants for hemogen and their putative effects on translation illustrated

in representative species Champsocephalus gunnari

3 Maximum likelihood tree used to test for positive selection on the branch leading to the

Antarctic notothenioid clade

4 Maximum likelihood tree used in site-tests for positive/pervasive selection among

Antarctic notothenioids

5 RELAX tree shows relaxed selection on the branches containing Bathydraconidae and

Channichthyidae, demonstrating a trend of relaxed selection in hemogen on the way to

the erythrocyte-null phenotype

6 Gene structure and size remain conserved among red-blooded and white-blooded

notothenioids, including regulatory regions conserved among teleost fish

7 Conservation of conserved non-coding elements CNE1 and CNE2 in Antarctic

notothenioids relative to Gasterosteus aculeatus and Danio rerio

8 hemogen exon 3 deletions in representative species from Channichthyidae relative to a

red-blooded notothenioid, and their predicted effects on transcription and translation

9 Variant forms of hemogen “exon 3” deletion mapped onto the Channichthyidae species

tree

10 hemogen indels in Antarctic notothenioids mapped onto a maximum parsimony tree

11 Pairwise dN/dS comparisons plotting total dN/dS of whole Hemogen-encoding sequence

with the dN/dS values for the N-terminus and C-terminus of notothenioid Hemogen,

within families Nototheniidae (A & B) and Channichthyidae (C & D).

12 Pairwise dN/dS trends between families Nototheniidae and Channichthyidae, plotting

whole-Hemogen dN/dS vs the N-terminus (A) or C-terminus (B).

13 qPCR quantification of hemogen transcript variants in representative icefish species C.

aceratus and C. gunnari, comparing adult head kidney hemogen expression with N.

coriiceps adult head kidney for both hemgn-L and hemgn-s splice variants

14 Changes to the bipartite nuclear localization signal in icefish (Champsocephalus gunnari)

relative to red-blooded notothens (Notothenia coriiceps).

List of Abbreviations

aa amino acid

bp base pair

CAI Codon Adaptation Index

cDNA complementary DNA

CNE conserved non-coding element

dN nonsynonymous mutation rate

DNA deoxyribonucleic acid

dN/dS ratio of nonsynonymous to synonymous mutation rates

dS synonymous mutation rate

EDAG erythroid differentiation-associated gene

GATA1 GATA-binding protein 1

gDNA genomic deoxyribonucleic acid

HoxB4 homeobox B4

KLF4 Krueppel-like Factor 4

-lnL negative log likelihood

MMCT Middle Miocene Climate Transition

MRCA most recent common ancestor

Mya million years ago

Myb MYB Proto-Oncogene, Transcription Factor

NLS nuclear localization signal

p300 histone acetyltransferase p300

PCR polymerase chain reaction

qPCR quantitative polymerase chain reaction

RNA ribonucleic acid

Sox9 transcription factor SOX-9

UTR untranslated region

INTRODUCTION

Cold-driven evolution of the Antarctic notothenioid lineage began roughly 46 Mya [2]

concurrent with the emergence of the Drake Passage (55-41 Ma) [3] and the initial formation of

the Antarctic Circumpolar Current [4]. The development of antifreeze glycoproteins [5, 6]

permitted colonization and persistence in the Southern Ocean [7] and set the stage for further

diversification during successive cooling periods and accompanying geological events. The

radiation of the high latitude Antarctic notothenioids (Cryonotothenioidea) occurred during a

period of diversification driven by intensified cooling of the Southern Ocean during the Middle

Miocene Climate Transition (MMCT) [7, 8], with species diversification beginning ~14 Mya and

accelerating ~11 Mya during the Late Miocene [7, 9-11]. Cooling during the MMCT led to

contemporary Antarctic conditions (-2℃ to +2℃) and resulted in the scouring of continental

shelves by ice [12, 13]. This opened ecological niches for potential colonization by removing

more temperate adapted competitors [14] and leading to rapid morphological and ecological

diversification [15]. Current day Antarctic notothenioids comprise 77% of Antarctic teleost

diversity and constitute a marine species flock [16] derived via adaptive radiation [17-19]. High

levels of morphological diversity and intense speciation make Antarctic notothenioids a useful

evolutionary model for studying cold adaptation.

Antarctic notothenioids possess a number of remarkable changes to erythropoiesis and

the oxygen-transport system at large that resulted in the evolution of the only known vertebrate

clade devoid of erythrocytes—the family Channichthyidae, characterized by a “white-blooded”

phenotype [20]. It has been hypothesized that the high oxygen concentration in polar seawater

could lead to potential relaxed selection on erythrocytes and other oxygen-binding pigments, as

hypoxic stress becomes less of a relevant factor with oxygen in such high abundance [21].

Evidence for such relaxed selection can be seen in changes to blood content: a study of “red-

blooded” Antarctic species from McMurdo Sound showed decreased numbers of erythrocytes,

lowered hematocrit, and lowered hemoglobin concentrations when compared with temperate fish

[22]. General trends throughout the radiation show that the more derived the family, the fewer

erythrocytes present in circulating blood and the lower the hemoglobin content [21]. Both red-

blooded and white-blooded notothenioid fish show reduced hematocrit, which is potentially an

adaptive feature to contend with the increased viscosity of blood under low temperatures [23,

24]. Hemoglobin multiplicity is reduced among notothenioids relative to temperate fish [25-27]

and cold anemia responses became genetically assimilated [28-30]. At some point notothenioid

dependence on hemoglobin for respiration became so reduced that even red-blooded fish could

continue to absorb and utilize oxygen effectively in the presence of carbon monoxide [31],

suggesting that the stage had been set for hemoglobin loss well before hemoglobin disappeared.

Channichthyidae are characterized by loss of the vertebrate oxygen-transport molecule,

the α2β2 hemoglobin tetramer carried within erythrocytes. This loss occurred in the most recent

common ancestor (MRCA) of all icefish via large genomic lesions within the respective loci [25,

32-36]. Furthermore, there have been multiple, independent losses of myoglobin during

diversification [36]. Icefish possess few erythroblasts, and their blood contains mostly leukocytes

and plasma [35]. The evolution of the white-blooded phenotype is unique among vertebrates and

has far reaching consequences for the cardiovascular system and key globin partners. As a result

of hemoglobin loss, we would anticipate changes to the genetic machinery involved in red blood

cell production and maintenance, as selective constraints on this may relax in the absence of key

globin partners. It is also possible that this began somewhere within the red-blooded families, as

oxygen transport molecules became less necessary for survival. Relaxed selection in the

regulatory regions of globin has been detected among dragonfish, prior to the emergence of a

white-blooded phenotype [37].

The gene hemogen has been identified as an interesting candidate for further study in

notothenioid fish, given evidence based on subtraction libraries that expression may be impaired

or entirely absent in icefish. The hemogen gene encodes the transcription factor Hemogen (Fig

1), which acts as a regulator in hematopoietic development by stimulating the differentiation of

hematopoietic cells into both the erythroid and megakaryocytic lineages [38-43]. In teleost fish,

Hemogen is encoded by four exons and contains domains similar to those predicted in the human

ortholog: a coiled-coil domain, a bipartite nuclear localization signal, a series of tandem repeats, and an

acidic domain (Fig 1) [1, 38]. It is promoted via two conserved non-coding elements, one

proximal and one distal, both critical for promoting primitive erythropoiesis (Fig 1) [1].

Hemogen also plays a role in cell apoptosis [39] and has been implicated in the regulation of

tumor cells in acute myeloid leukemia [44]. Other possible roles include spermatogenesis [45],

sex-determination [46], and osteoblast recruitment and bone calcification [47-49]. Research

shows that Hemogen’s role in hematopoiesis takes place via interactions with a number of key

proteins involved in erythropoiesis and development, including GATA1 and p300. GATA1 is

critical for erythroid differentiation [50-52] and functions in both primitive and definitive

hematopoiesis [53]. Nonsense mutations in GATA1 lead to a “bloodless” phenotype [54].

GATA1 recruitment is crucial for hemogen function and downregulation of hemogen expression

inhibits GATA1 activity [40, 43], while GATA1 recruits hemogen to the beta-globin locus [55].

p300 is crucial for cell differentiation [56, 57] and inhibition of p300 binding to Hemogen causes

decreased production of erythroid cells. Hemogen facilitates the interaction between GATA1 and

p300, making it a critical part of the erythroid differentiation process [55].

Decreased hemogen expression in white-blooded fish may indicate functional loss. Given

the decreasing importance of red blood cells to the notothenioid lineage, selective constraints on

known regulators of erythrocyte production may have relaxed prior to complete globin loss.

Hemogen interacts with Beta-globin and regulates erythroid production, raising the possibility

that erythropoietic features may be aberrant in icefish. However, hemogen demonstrates

pleiotropy, as described in the previous paragraph, and lists of potential partners implicate it in a

number of important cellular processes beyond erythropoiesis. Therefore, at least some features

must remain conserved in order to carry out non-erythropoietic roles.

In this thesis, I characterize hemogen genes in both red-blooded and white-blooded

Antarctic notothenioids and compare them with sub-Antarctic perciform outgroups to establish

hemogen’s history within this clade. I hypothesize that the hemogen locus is undergoing relaxed

selection among the icefish, and that relaxation of selective constraints began prior to the

emergence of Channichthyidae. I investigated partial conservation of the hemogen gene,

hypothesizing that pleiotropy would protect against total pseudogenization of hemogen. Features

under relaxed selection would be implicated in erythropoietic function and could be considered

targets for further study of hemogen in erythropoiesis. I hypothesize some level of differential

expression between white-blooded and red-blooded fish; if not complete loss of expression, then

loss in certain tissues or of certain key isoforms in Channichthyidae.

My results show a strong trend towards relaxed selection in high-latitude Antarctic

notothenioids relative to Sub-Antarctic relatives, with icefish showing intensified relaxation.

Confirmation of relaxed selection among red-blooded fish supports the theory that the decreased

dependence on erythrocytes in notothenioid fish also correlates with larger-scale changes in the

erythropoietic paradigm on the genomic level. Three out of four key functional domains show

some form of degradation, either via nonsynonymous mutation or through the

transcriptional/translational impacts of indels on icefish hemogen. Three key evolutionary events

took place in the MRCA of all extant icefish: the degradation of the bipartite NLS, a 30aa loss in

a proline-rich region of tandem repeats, and the development of a novel splice form, hemgn-s,

which excludes all functional domains encoded by exon 3 and 4 and theoretically results in a

frameshifted and truncated hemogen protein. However, key promoter regions remain conserved

in icefish, and while expression is down-regulated in adult tissues relative to red-blooded

species, hemogen is still expressed in adult tissues of some icefish. This suggests that while the

decreased importance of erythropoietic functions may have significantly relaxed pressure on

hemogen and resulted in mutations impacting domains critical for erythropoietic-function, it is

not necessarily non-functional and may still be playing a decreased but critical role in other

processes.

METHODS

Sample collection & sequencing of notothenioid hemogen gDNA

The primary source of genomic material came from tissues obtained by the Detrich Lab

during the 2012, 2014 and 2016 winter fishing cruises conducted by the Research Vessel

Laurence M. Gould near Palmer Station, Antarctica. Tissues were flash-frozen in liquid nitrogen

and then stored at -80℃. I generated sequences from 1-5 individual fish per species.

Molecular methods for gDNA extraction from tissues are as specified in the Quick-gDNA

miniprep kit (Zymo Research, D3024). Full notothenioid hemogen—from start codon to the 3’

UTR—was amplified by PCR from gDNA samples using 1 µM primers (Table 1) designed from

previously obtained Notothenia coriiceps sequences. The amplification protocol was as

follows—35 cycles of 98°C for 10 s, 59°C for 10 s, 72°C for 1 min. PCR products were cloned

into the pGEM T-easy vector (Promega, A1360), and recombinant clones were transformed into

DH5α competent cells (New England Biolabs, C2987H). Recombinant plasmids were identified

using blue/white screening, purified via the Wizard Plus Miniprep DNA Purification System

(Promega, A7500), and sequenced by GeneWiz. I obtained full genomic sequences for 18

notothenioid species (Table 2, Figure S1).

Cloning and sequencing of notothenioid cDNAs

I isolated total RNA from flash-frozen tissues of adult N. coriiceps and C. aceratus using

the RNEasy Mini Kit (Qiagen, 74104). Several potential hemogen transcripts had been

previously identified by other Detrich Lab members (Figure 2). To expand upon these results,

RNA samples were prepared from ten tissues: liver, spleen, head kidney, trunk kidney, white

muscle, pectoral red muscle, testes, brain, heart ventricle and gill. Total cDNA was produced

from the mRNA using M-MuLV reverse transcriptase and an oligo(dT)23 primer according to the

protocol outlined in the Protoscript II First Strand cDNA Synthesis kit (NEB, E6560S). cDNA

was amplified via PCR using the same primers as gDNA PCR (Table 1) according to the

following protocol: 35 cycles of 98°C for 10 s, 59°C for 10 s, and 72°C for 45 s. cDNA was then

cloned into pGEM T-easy vector and subsequently transformed and purified as outlined for

gDNA sequences.

Construction of genomic, coding and protein alignments for gene characterization,

phylogenies and evolutionary analysis

Nucleic acid alignments were constructed using MUSCLE [58] as implemented in

MEGA7 [59], with a gap opening penalty of 15 and gap extension penalty of 6.66. Alignments

were subsequently inspected and adjusted by eye in BioEdit [60]. Construction of gene trees and

evolutionary analysis relied primarily on three alignments: a gDNA alignment with all exons +

introns; a coding alignment based on cDNA sequences, transcriptome data and concatenated

exome data; and a protein alignment, translated from the coding sequences in MEGA7 [59].

The cDNA sequences that I generated were supplemented with hemogen cDNAs from

transcriptomic analyses of Pseudochaenichthys georgianus [unpublished data from Detrich lab]

and Parachaenichthys charcoti [unpublished data from Detrich lab], and aligned with my

genomic sequences to generate coding sequences for other notothenioids. Additionally, cDNA

and transcriptome sequences also served as a basis for alignment and quality control for

sequences obtained via an exome-capture analysis [unpublished] conducted by Dr. Jacob Daane

of the Detrich lab. A full list of all species included and the sequence sources can be found in

Table 2. In total 43 species representing Antarctic notothenioids from all high-latitude families

(Artedidraconidae, Bathydraconidae, Channichthyidae, Harpagiferidae, Nototheniidae) as well

as 7 Sub-Antarctic outgroups were included in evolutionary analyses.

Analysis of positive, pervasive and relaxed selection on Antarctic notothenioid hemogen

All trees were constructed in RAxML [61, 62] using nucleotide substitution model

GTRGAMMA to conduct an initial tree search of 20 trees and select the best tree from this pool.

No outgroups were specified. Branch tests were conducted using the CodeML module included

in PAML 4.0 [63, 64].
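
The RAxML tree search described in this section can be written out as a command line. The snippet below is a minimal sketch, assuming the standard raxmlHPC executable. The alignment file name, run name and random seed are hypothetical, since the exact invocation is not reported here.

```python
def raxml_search_command(alignment, run_name, n_searches=20, seed=12345,
                         model="GTRGAMMA", binary="raxmlHPC"):
    """Build an argv list for an ML search from several starting trees.

    File names, run name and seed are placeholders (assumptions); only the
    GTRGAMMA model and the pool of 20 searches come from the text.
    """
    return [
        binary,
        "-s", alignment,         # input alignment
        "-n", run_name,          # suffix used for RAxML output files
        "-m", model,             # nucleotide substitution model
        "-p", str(seed),         # random seed for the starting trees
        "-#", str(n_searches),   # number of independent searches
    ]


cmd = raxml_search_command("hemogen_cds.phy", "hemogen_ml")
print(" ".join(cmd))
```

No outgroup option is passed, matching the statement that no outgroups were specified.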

Two tests for positive selection were run. The first was conducted using branch-site

models [65, 66] on a subset of coding sequences (SFigure 2) to search for possible positive

adaptation in the Antarctic clade relative to several notothenioid outgroups. The branch leading

to the representative Antarctic notothenioids was specified a priori as the foreground branch

(Figure 3). The null model set NSsites = 2, fix_omega = 1, and omega = 1. This assumes two

categories of sites (purifying and neutral selection) and looks for a difference in proportions of

sites undergoing neutral selection on the foreground branch relative to the background. The

positive/alternative model set NSsites = 2, fix_omega = 0, and omega = 1, which allows for three

categories of sites (purifying, neutral, and positive selection) and looks to identify sites

undergoing positive selection on the foreground relative to the background branch. If the

alternative model is accepted over the null, this indicates a site has undergone episodic positive

selection (changed once, then retained in the clade).

The second test relied on codon-substitution site models [67, 68] to detect pervasive

positive selection among Antarctic notothenioids using the coding sequences (SFigure 2). This

would identify any possible sites which changed repeatedly throughout diversification of the

clade, possibly as a result of differing adaptive challenges related to the modification of the

hematic system. Models M0, M1a, M2a, M3, M7, and M8 were run by setting NSsites = 0 1 2 3

7 8 (respectively), fix_omega = 0, and omega = 1. Model M8a set NSsites = 8 but set fix_omega

= 1 and omega = 1. The submitted gene tree for the site tests can be found in Figure 4.
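
The CodeML runs described above can be summarized as control-file fragments. This is a minimal sketch: the NSsites, fix_omega and omega values are those stated in the text, while the model values (2 for the branch-site runs, 0 for the site models) are assumptions based on standard CodeML usage, and the run names are hypothetical labels.

```python
# NSsites, fix_omega and omega follow the text; "model" is an assumption
# (2 = branch-site foreground/background, 0 = one ratio set for all branches).
CODEML_RUNS = {
    "branch_site_null": {"model": 2, "NSsites": "2", "fix_omega": 1, "omega": 1},
    "branch_site_alt": {"model": 2, "NSsites": "2", "fix_omega": 0, "omega": 1},
    "site_models": {"model": 0, "NSsites": "0 1 2 3 7 8", "fix_omega": 0, "omega": 1},
    "site_M8a": {"model": 0, "NSsites": "8", "fix_omega": 1, "omega": 1},
}


def control_fragment(run_name):
    """Render the settings of one run as 'key = value' lines."""
    settings = CODEML_RUNS[run_name]
    width = max(len(key) for key in settings)
    return "\n".join(f"{key:>{width}} = {value}" for key, value in settings.items())


for name in CODEML_RUNS:
    print(f"* {name}")
    print(control_fragment(name))
```

Each null and alternative pair differs only in fix_omega, which is what makes the paired runs comparable by a likelihood ratio test.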

A test for relaxed selection in the branch leading to Channichthyidae was conducted using

RELAX [69] as part of the HyPhy suite of hypothesis testing software [70]. RELAX conducts a

comparative test of whether an a priori specified branch or subset of branches has undergone

relaxed or diversifying selection relative to the rest of the tree. This makes it useful for

identifying trends and/or shifts in the stringency of natural selection on a given gene, provided

one has an idea of where that should occur. The branches ending in Channichthyidae and

Bathydraconidae were selected as the test branches, with all others used as reference branches.

The reference tree used was the putative species tree of Daane [unpublished]. The test was run

on the Datamonkey server [71, 72].

Bioinformatic comparison of notothenioid hemogen promoters and coding domains

Regulatory regions from Eleginops maclovinus, N. coriiceps and Chaenocephalus

aceratus were sequenced based on the annotations for the N. coriiceps genome (NCBI

Accession: PRJNA66471, ID: 66471) [73]. gDNA sequences were aligned to N. coriiceps and C.

aceratus scaffolds via BLAST in Geneious (v. 10.0.5) [74] to determine whether notothenioid

species possess conserved synteny around the hemogen locus as observed in other vertebrate

species [1]. Scaffold sequences were confirmed by sequencing from the upstream (anp32b) and

downstream (TRMO) genes towards hemogen. Promoter alignments for hemogen were obtained

using the whole genome alignments for D. rerio and Gasterosteus aculeatus (ENSEMBL v94)

[75]. Transcription factor binding sites were predicted with ConTra v2 using a similarity matrix

of 0.75 [76]. Protein domains were identified based on annotations from human [38] and

zebrafish hemogen [1].

Parsimony gene tree and deletion mapping

A hemogen gene tree was built from coding sequences using the maximum parsimony method

[77] in MEGA7. Gaps were treated as partial deletions with the site coverage cutoff set to 90%. This

allowed for the inclusion of sites where a majority of species possessed sequence data but one

species (or genus) possessed a phylogenetically informative indel. The tree included 1st, 2nd, and

3rd codon positions and was computed using the Subtree-Pruning-Regrafting method, beginning

with 10 trees and retaining 100 trees. Following 1000 bootstrap iterations the best tree was

selected based on comparison with known species phylogenies. The phylogeny was edited to

include indel information using ggtree in R [78] and the Interactive Tree of Life (iTOL) v3 [79].

I ran the tree topology in CodeML [63, 64] using the M0 model (model = 0, NSsites = 0) [67] to

obtain the number of nucleotide substitutions per codon (dN+dS) as well as dN, dS, and dN/dS

for the whole tree.

An icefish species tree was constructed based on the species tree built from the exome

data of Daane [unpublished] with modifications derived from available Channichthyidae

phylogenies [80, 81].

Pairwise dN/dS comparisons

Pairwise dN/dS values were generated using a subset of the coding alignment (SFigure 2)

and were run in PAML4 using yn00 [64]. yn00 calculates rates based on the method outlined in

Nielsen & Yang 2000 [82] and allows for codon usage bias as well as transition-transversion rate

differences. To assess codon usage bias in notothenioid hemogen, I used DnaSP v5 [83, 84] to

measure codon usage bias via the codon adaptation index (CAI/CBI) [85, 86]. Values for CAI

are shown in Table 3; the values fall within a range of 0.3-0.4 for all species, which represents

moderate codon usage bias (low bias < 0.3 and high > 0.5).

All forty-three high-latitude notothenioid species were included in this analysis (Table 2).

I examined two kinds of evolutionary relationships: within-family comparisons (ex: icefish vs

icefish) and between-family pairwise comparisons (ex: Channichthyidae vs Nototheniidae). For

each kind of comparison I ran the data with three partitions: the total protein coding sequence,

the coding sequence for the N-terminus only (1-79 aa, which represents the end of the bipartite

NLS), and the coding sequence for the C-terminus only (80 aa—end). This allowed for a more

nuanced analysis of the selective forces at work on different parts of the gene as well as within

different clades and is derived from work done parsing geographic effects on cichlids and

positive selection in notothenioids [87, 88].
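
The three partitions can be illustrated with a small helper that cuts a coding sequence at the codon boundary after residue 79. This is a sketch under stated assumptions: the sequence is in frame and ungapped, and the toy sequence is hypothetical. For a gapped alignment the boundary would first have to be mapped to an alignment column.

```python
NLS_END_AA = 79  # last residue of the N-terminal partition, per the text


def split_coding_sequence(cds, boundary_aa=NLS_END_AA):
    """Split an in-frame, ungapped coding sequence at a codon boundary.

    Returns (N-terminal codons 1..boundary_aa, remaining C-terminal codons).
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    cut = boundary_aa * 3
    return cds[:cut], cds[cut:]


toy_cds = "ATG" + "GCT" * 99  # hypothetical 100-codon sequence, not real data
n_term, c_term = split_coding_sequence(toy_cds)
print(len(n_term) // 3, len(c_term) // 3)
```

The total-sequence partition is simply the unsplit input, so the three yn00 runs share one coordinate system.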

qPCR

Previous qualitative PCR I conducted on C. aceratus cDNA established general

presence/absence of hemogen expression in several adult tissues—liver, head kidney, trunk

kidney, spleen, and brain—and isolated the predominant isoforms of hemogen expression in

Channichthyidae (Figure 2). qPCR experiments used cDNA samples (outlined in the preceding

section) taken from tissues of adult icefish (Champsocephalus gunnari and C. aceratus). The

experiment was designed to verify and quantify hemogen expression in multiple icefish species

and compare isoform expression in adult tissues. Target transcripts were amplified from cDNA

using 1 µM primers (Table 1). Targets were amplified in triplicate. Expression was normalized

to beta-actin as the endogenous control for the ΔΔCt method [89]. Standard curves were generated

to assess primer efficiencies. qPCR was performed on a QuantStudio3 ThermoCycler with

QuantStudio Design and Analysis Software.
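The normalization described above can be sketched as follows. This is a minimal illustration of the standard ΔCt and ΔΔCt arithmetic, assuming roughly 100% primer efficiency (a doubling per cycle); the Ct values are invented for the example and are not measurements from this study.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean Ct of the target minus mean Ct of the endogenous control."""
    return mean(target_cts) - mean(reference_cts)


def fold_change(sample_delta_ct, calibrator_delta_ct):
    """Relative expression as 2^(-ddCt), assuming ~100% primer efficiency."""
    delta_delta_ct = sample_delta_ct - calibrator_delta_ct
    return 2 ** (-delta_delta_ct)


# Hypothetical triplicate Ct values (not data from this study).
sample_dct = delta_ct([30.0, 30.5, 29.5], [20.0, 20.0, 20.0])
relative_expression = fold_change(sample_dct, calibrator_delta_ct=8.0)
print(sample_dct, relative_expression)
```

A higher ΔCt means lower expression relative to the control, which is why results with large replicate variance can be easier to read as ΔCt than as fold change.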

RESULTS

Branch test does not detect positive selection in Antarctic notothenioids

The test attempted to determine whether adaptive positive selection has occurred in

Antarctic notothenioid Hemogen in comparison to Sub-Antarctic relatives/perciform outgroups

(Figure 3) using the branch-site method. The alternative model testing for positive selection

returned –lnL = -3135.34323 with np = 20. The null model (no positive selection) gave –lnL = -

3135.34325 with np = 19. The likelihood ratio test yielded a value of 0.00039 with df = 1, with p

= 0.9842. The test for positive selection was not significant, and the evolutionary change

observed in Antarctic notothenioids relative to other teleosts is not likely to be adaptive change

driven by positive selection.
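The p-value reported above follows from the likelihood ratio statistic and a chi-square distribution with one degree of freedom. The sketch below shows that arithmetic using only the Python standard library. It takes the reported statistic (0.00039) as input rather than recomputing it, because the log-likelihoods are printed to limited precision; the function names are hypothetical.

```python
import math


def lrt_statistic(lnl_alternative, lnl_null):
    """Likelihood ratio statistic: twice the difference in log-likelihoods."""
    return 2.0 * (lnl_alternative - lnl_null)


def chi2_sf_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if statistic < 0:
        raise ValueError("the statistic must be non-negative")
    # For df = 1, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


p_value = chi2_sf_df1(0.00039)  # statistic as reported in the text
print(round(p_value, 4))
```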

Site test results do not detect pervasive adaptive change within the high-latitude Antarctic

notothenioid radiation

Model M0 gives fundamental statistics about the base composition of the tree and

measures mutational rates over all sequences. The tree used for site tests was generated from

hemogen coding sequences via the maximum likelihood method (Figure 4). dN summed over the

entire tree = 0.3485, while dS = 0.2974, giving omega dN/dS = 1.17184. While these mutational

rates are low, this ratio would be consistent with relaxed purifying selection on hemogen,

although it cannot definitively differentiate between relaxed and positive selection.

Three site-test comparisons for positive selection (M1a-M2a, M7-M8, M8a-M8) were significant

with p < 0.05 (Table 6). All of these models measure positive selection, with M8a-M8 being the

most robust and reliable. That all returned significant p-values would be indicative of pervasive

positive selection at specific sites throughout the diversification of Antarctic notothenioids.

However, the data violate a critical assumption of the site test of positive selection: dS summed

over all branches is < 0.5, which indicates insufficient sequence divergence among the species

tested. As a result of this low sequence divergence, the codon-based test is insufficiently robust

and cannot reliably measure changes to selective pressure at different sites. No pervasive positive

selection can be inferred from these results.
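The two checks used in this section, the tree-wide ratio and the divergence requirement, reduce to simple arithmetic. The sketch below reproduces them from the dN and dS tree lengths and the 0.5 threshold quoted above; the function names are hypothetical.

```python
MIN_TREE_DS = 0.5  # divergence threshold quoted in the text


def omega(dn, ds):
    """Ratio of nonsynonymous to synonymous substitution rates (dN/dS)."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    return dn / ds


def divergence_sufficient(tree_ds, minimum=MIN_TREE_DS):
    """True when dS summed over the tree exceeds the required minimum."""
    return tree_ds > minimum


tree_omega = omega(0.3485, 0.2974)             # tree lengths reported for M0
site_tests_reliable = divergence_sufficient(0.2974)
print(round(tree_omega, 3), site_tests_reliable)
```

The ratio computed from the rounded tree lengths differs from the reported 1.17184 only in the fifth decimal place, which is consistent with rounding of the inputs.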

Relaxed selection in Channichthyidae relative to other Antarctic notothenioid families

RELAX confirmed a trend towards relaxed selection in the branches leading to

Channichthyidae and Bathydraconidae relative to Artedidraconidae, Nototheniidae, and E.

maclovinus (Figure 5). The test for selection relaxation (K = 0.25) was significant (p = 0.002, LR

= 9.77).

Key promoters remain conserved in Antarctic notothenioids

Both conserved non-coding regulatory elements, CNE1 and CNE2, identified by Peters et al 2018 [1]

in previous work on human and teleost hemogen were found in E.

maclovinus, N. coriiceps and C. aceratus (Figure 6). While the intergenic region between the CNEs

is reduced in C. aceratus relative to D. rerio (Figures 1/6), both elements remain intact in all

three species examined. Preliminary comparisons of key transcription factor binding sites show

no significant departures or losses in C. aceratus compared with N. coriiceps or E. maclovinus.

This includes putative binding sites for important co-factors like p300, GATA1, Sox9 and

HoxB4 (Figure 7).

Gene size and structure remain largely conserved in icefish relative to red-blooded fish

but show a large genetic lesion in exon 3

I used two species as representatives for assessing differences in the gene based on

erythroid presence/absence: the red-blooded notothen N. coriiceps and the white-blooded icefish

C. aceratus. Notothenioid hemogen is composed of 4 exons and 3 introns, similar to the

previously described D. rerio [1]. hemogen is approximately the same size in both species:

1762bp in N. coriiceps, and 1701bp in C. aceratus as measured from the start codon to the stop

codon. I observed no size change in exons 1, 2 & 4, and only small indels in each intron (1-12bp)

(Figure 6). However, C. aceratus showed a significant deletion in exon 3—the loss of 89bp,

which occurs within the tandem-repeat region (Figure 6/SFigure 3). This prompted further

exploration of this region among icefish to: 1) determine its prevalence throughout the clade, and

2) assess its potential significance on transcription and translation.

Indels in exon 3 serve as the primary source of sequence divergence and evolutionary

change among Antarctic notothenioids

I surveyed 14 of the 16 extant icefish species to determine if the deletion observed in C.

aceratus is a species-specific feature or evidence of an evolutionary event in the

Channichthyidae MRCA. All species examined showed evidence of an evolutionary loss at this

locus, but in three different variants: an 89 bp deletion, a 90 bp deletion, and a 99 bp deletion

(Figure 8). The deletions are not distributed evenly throughout the radiation and do not neatly

correspond with the known phylogeny for icefish speciation (Figure 9). By far the most prevalent

deletion was the 90bp loss, in a majority of species from the most ancestral (genus

Champsocephalus) to the most recently derived (genus Chionodraco) (Figure 9). The 99bp/33aa

loss emerged more recently and is contained within the clade consisting of the genera

Chionobathyscus, Cryodraco, Chaenodraco, and Chionodraco (Figure 9). The 89 bp deletion

was only present in species that did not form a monophyletic group. In addition, some species

possess multiple deletion alleles—within the derived clade where the 99 bp deletion first

emerged, 4 of 6 species were found to carry alleles for both the 90 bp and 99 bp deletion.
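Whether each deletion variant preserves the reading frame follows from its length modulo three. The sketch below applies that rule to the three variants described above. It assumes each deletion lies entirely within coding sequence and is not offset by a second, compensating indel; the function name is hypothetical.

```python
def classify_deletion(length_bp):
    """Return (frame effect, residues lost) for a deletion within coding sequence."""
    if length_bp <= 0:
        raise ValueError("deletion length must be positive")
    if length_bp % 3 == 0:
        return ("in-frame", length_bp // 3)
    # A length that is not a multiple of 3 shifts the downstream reading frame.
    return ("frameshift", None)


deletion_effects = {length: classify_deletion(length) for length in (89, 90, 99)}
for length, effect in deletion_effects.items():
    print(length, effect)
```

Under these assumptions the 90 bp and 99 bp variants remove 30 and 33 codons respectively, while the 89 bp variant shifts the frame.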

Analysis of the coding sequence (all exons) showed high sequence conservation

regardless of the breadth of speciation among Antarctic notothenioids. The majority of

evolutionary change centers on repeated insertion or deletion events, which are notable both for

their frequency and for a tendency to recur independently in different species or clades

within the same, often overlapping, region of the gene. CodeML model M0 showed low

mutation rates even when E. maclovinus was included in the analysis: a tree length for dN of

0.3156, and tree length for dS of 0.2284.

The majority of indels are concentrated in two specific regions within the hemogen

protein: at the beginning of the region encoding the C-terminus and within the segment encoding

the proline-rich region of the C-terminus—both of which are contained within exon 3. Within the

species surveyed, I identified 24 unique indels in the coding regions of high-latitude Antarctic

notothenioids: 5 insertions, 19 deletions (Figure 10). Of these 24 events, 23 occur within the

segment encoding the C-terminus, and only one occurs in the segment encoding the N-terminus,

in a single species (SFigure 3). In addition to disproportionate prevalence of deletions over

insertions, there’s considerable variability among deletions and their occurrence within the tree

when compared with insertions. Insertion size is 1-2 amino acids, exclusively, and almost solely

contained within the Nototheniidae (with one exception among icefish). Deletions range from 1-

33 amino acids. These indels are also the most phylogenetically distinct features of notothenioid

hemogen; a full 22 of 24 indels are parsimony informative, while only 2 are homoplasic.

dN/dS shows disparate selection pressure in hemogen segments encoding the N-terminus vs

the C-terminus

Pairwise comparisons show similar patterns: neutral mutation rate (as measured by dS) is

low/non-existent in all comparisons. A significant number of pairwise comparisons were

excluded from measurements of dN/dS because of a lack of synonymous mutations between the

two sequences, resulting in N/A values (Table 4/5). Both within and between family

comparisons show a preference for accumulating non-synonymous mutations, primarily within

the segments encoding the C-terminus (Figure 11/12). The segments encoding the N-terminus

are under purifying selection within families, and most between family comparisons also display

this trend (Table 4). A trend away from purifying selection, possibly relaxed to the point of

coming under drift, is shown in between-family comparisons with Channichthyidae (Table 5),

with all comparisons showing dN/dS > 1. The N-terminus remains largely conserved in

Channichthyidae (Figure 11), but given low neutral change, any nonsynonymous changes to the

coding sequence will result in high dN/dS.
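The handling of comparisons without synonymous change can be sketched as below: a pair with dS = 0 has no defined ratio and is excluded (reported as N/A), and the remaining pairs are tallied by whether dN/dS falls above or below 1. The dN and dS values in the example are invented for illustration and are not taken from Tables 4 or 5.

```python
def pairwise_omega(dn, ds):
    """Return dN/dS, or None (reported as N/A) when there is no synonymous change."""
    if ds == 0:
        return None
    return dn / ds


def summarize(comparisons):
    """Count excluded pairs and pairs with dN/dS above or below 1."""
    omegas = [pairwise_omega(dn, ds) for dn, ds in comparisons]
    defined = [value for value in omegas if value is not None]
    return {
        "excluded": len(omegas) - len(defined),
        "above_one": sum(1 for value in defined if value > 1),
        "below_one": sum(1 for value in defined if value < 1),
    }


# Hypothetical (dN, dS) pairs, not values from this study.
example_pairs = [(0.012, 0.0), (0.020, 0.010), (0.004, 0.016)]
summary = summarize(example_pairs)
print(summary)
```

The second pair illustrates the point made above: when dS is very small, even a modest dN produces a ratio well above 1.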

Discovery of novel splice variation in Channichthyidae that excludes key domains from

translation

Two transcripts have been detected in surveyed icefish. The first transcript is hemgn-L,

which includes all exons (Figure 2). The tissue survey for C. aceratus uncovered a novel splice

variant not previously detected in zebrafish, N. coriiceps, or P. charcoti [unpublished

transcriptome]. This transcript (hemgn-s) (Figure 2) splices from the end of exon 2 beyond the

“icefish deletion” in exon 3. Curiously, hemgn-s splices into another frameshift, resulting in the

same truncation observed in hemgn-L from C. aceratus—a missense mutation and premature

stop preventing translation of the acidic domain and exon 4 (Figure 2). This transcript has been

detected in other icefish which do not show a frameshift in hemgn-L; in each, it yields a similarly

truncated protein (Figure 2). As it does not include the beginning of exon 3, this transcript would

also exclude the bipartite nuclear localization signal from any translated protein. If translated, it

would produce a protein of approximately 78AA with only one functional domain intact: the

coiled-coil domain. The evolution of the hemgn-s splice form corresponds with an amino acid

change at the end of exon 2 that results in the gain of a basic residue (SFigure 3, position 52 in

alignment). Surveying the coding and protein sequences (SFigure 2/3) shows this change is

unique to the emergence of Channichthyidae.

qPCR shows low levels of hemogen expression in adult tissues of icefish

Primers were designed to capture two types of hemogen expression in icefish. One is the

hemgn-s variant, and the other is all “whole” hemgn. In icefish, this captures only the hemgn-L

variant, while in N. coriiceps this captures all transcripts (excluding a potential hemgn-s

transcript). Given the high standard deviations for icefish hemogen expression, I chose to analyze

my results as ΔCt rather than fold change. I observed some hemogen expression in adult icefish,

primarily in a non-frameshifted species, C. gunnari, with little to no expression in C. aceratus.

Expression of “whole” hemogen is significantly reduced in icefish relative to red-blooded fish

(Figure 13). The hemgn-s transcript is not expressed in N. coriiceps, but does appear to be

expressed at relatively equivalent levels to hemgn-L in C. gunnari (Figure 13). Sequencing the

qPCR product confirms that hemgn-s was detected by our primers in C. aceratus, albeit at a

nearly undetectable and unreliably quantifiable level. Sequencing the N. coriiceps product for

hemgn-s qPCR showed an off-target fragment was amplified using the hemgn-s primers. N.

coriiceps does not legitimately express the hemgn-s splice variant.

Conservation of functional domains & degradation of nuclear localization domain in

Channichthyidae

Yang et al [38] identified several structural features of the mouse ortholog: a coiled-coil

domain and bipartite nuclear localization signal in the N-terminus, and an acidic domain in the

C-terminus. I find these features conserved in red-blooded notothenioids, but partially lost or

degraded in Channichthyidae. The coiled-coil domain is located at 25-39aa; the bipartite NLS at

57-74aa; and the acidic domain at 240-261aa (SFigure 3). Exons 1, 2, and part of 3 encode the

N-terminus. Exon 3 encodes the bulk of the C-terminus, and consists of the bipartite NLS as well

as a series of tandem repeats [1]. The icefish deletion appears to eliminate at least one tandem

repeat from icefish relative to red-blooded notothens (Figure 6, 8) in those species with an in-

frame deletion (Figure 8). The out-of-frame deletion causes a missense mutation impacting the

tandem repeat structure further (Figure 8).

The bipartite nuclear localization signal has undergone degradation in all observed icefish

species (Figure 14), indicating that this feature likely evolved in the MRCA of Channichthyidae.

It’s been previously established that excising the NLS from hemogen prevents nuclear

localization [90], and simply mutating the initial amino acid of a bipartite NLS has been shown

to be sufficient to inhibit nuclear localization in some genes [91]. Furthermore, the bipartite NLS

is excised from the hemgn-s splice form. The degradation of the NLS coupled with the evolution

of an isoform encoding a protein which excludes the NLS outright suggests this feature may be a

victim of relaxed selection in icefish.

DISCUSSION

hemogen is under relaxed selection in Channichthyidae and potentially notothenioid fish at

large

Relaxed selection plays an important role in evolution by permitting novel traits and

functions to evolve— evolutionary innovation— and sometimes permitting expanded phenotypic

plasticity [92, 93]. Alternatively, relaxed selection can lead to loss-of-function and possibly

pseudogenization [94]. RELAX does show that changes observed in Channichthyidae relative to

other Antarctic notothenioids are indicative of a trend towards relaxed selection on these

branches. This method is beneficial for detecting subtle selective pressure but comes with a

serious drawback of being unable to identify specific sites under relaxed selection, or

substantiate more than a trend between a foreground and background set of branches—which

requires some previously existing insight on where relaxed selection is most likely to occur [69].

However, codon-based site tests are needed to confirm or refute whether changes at the level of

individual amino acids can be attributed to relaxed selection rather than positive selection.

I conducted the branch-site test, which examined N. coriiceps and C. aceratus vs other

teleosts, to specifically identify changes shared among Antarctic notothenioids which were not

present in any of the teleost outgroups. Such changes might be indicative of adaptive changes

associated with the hematic challenges of polar living, and would be good candidates for positive

selection. Distinguishing between relaxed and positive selection can be difficult, as both can

show a similar signal of increased ω when a gene should be under purifying selection [95]. The

ideal scenario would be 1) to confirm relaxed selection, or 2) exclude the possibility that

observed amino-acid changes might be due to positive selection. The branch-site test did not

yield any sites under possible positive selection, let alone any that were considered significant

under the most-robust Bayes Empirical Bayes analysis implemented in CodeML [96]. This

would seem to support the conclusion that evolutionary change among Antarctic notothenioids is

due to relaxed selection rather than positive selection. However, this test cannot assess changes

observed within Antarctic notothenioids, but not shared in common amongst all of them—that is,

evolutionary change that emerged during speciation but not shared in kind amongst the whole

radiation.

When trying to assess change within the Antarctic clade, several factors make it difficult

to directly test hypotheses of positive selection vs genetic drift on hemogen. Evolutionary

hypothesis testing using dN/dS (ω) relies on codon-based models for best practices, where the

numbers of silent and replacement changes are used to “decide” if there’s been an excess of

nonsynonymous change relative to synonymous change—and thus, make inferences about

selective pressure. Large dN/dS estimates for extremely short branch lengths from model M0

indicate that any estimate of selective pressure should be considered unreliable as proof positive

of positive selection, and that fundamental assumptions of the test may be violated. The current

dataset fails to meet the sequence diversity requirement (dS over branches > 0.5) for any site test

of selection to accurately test hypotheses about positive selection [97-101].

I had intended to use similar methods to explicitly test for relaxed selection, which rely

on similar models and assumptions in order to assess selective pressure [102]. Further

exploration of these models and their practical applications showed they would be no more

robust than the tests for positive selection had been—due to either low levels of sequence

divergence among my dataset, or lack of a comparison to assess potentially accelerated mutation

rates and thus establish relaxed selection [103-106].

This illustrates an interesting problem in evolutionary biology: if the selective changes

are subtle, or in recently diverged species, it can be difficult to pinpoint the selective cause

behind the mutational effects. Among adaptive radiations, if the gene of interest is 1) recently

under positive or relaxed selection, or 2) only under slight positive or relaxed selection, or 3) not

a gene responsible for a strong phenotypic effect of adaptive change, potentially driving

speciation—then it is unlikely that many current methods will be able to detect that selective

pressure. Given the low mutational rate of change, there’s no evidence to suggest that

notothenioid preference for non-synonymous change in this region is associated with positive

selection or novel interactions at this time.

Potentially the best evidence for relaxed selection is post hoc—with repeated mutations

that impair proper functioning of key domains, one could argue that this could only be due to

relaxed selection. Frameshift mutations tend to have strongly adverse effects and have been

linked to many cancers and diseases, but have occurred independently at least twice within the

icefish clade. Frequent indels occurring within exon 3 are consistent with the hypothesis that

genes associated with erythropoiesis underwent a change in the intensity of purifying selection as

a result of Southern Ocean colonization [21]. To substantiate this, it is necessary to outline what

the functional impacts of these indels might be and how they would tie into larger trends of

evolutionary change among notothenioids, as I will do in the following sections.

The most conclusive evidence for relaxed selection would be establishing that icefish

hemogen has undergone pseudogenization, complete loss-of-function, or relaxation of selectional

constraints leading to deterioration of some functional domains. This could be accomplished

either through observation (only a fragment remains, ex: icefish alpha- & beta-globin) or via

prediction of pseudogenization via comparative genomics [107]. While it is undeniable that

hemogen has been impacted in icefish, I argue that it does not reach the level of complete

pseudogenization, and that the functional picture is more complicated than simply: is hemogen

nonfunctional in white-blooded fish?

Observed mutations in hemogen are not due to gene duplication or chromosomal

rearrangement

If hemogen is under relaxed selection in notothenioids, it becomes important to determine

whether or not the observed changes in the hemogen sequence represent changes to the true

hemogen orthologue of zebrafish and other vertebrates, and rule out some alternative possibilities

for the evolutionary processes behind them. Two evolutionary processes could be in play: gene

duplication in the icefish MRCA, which would result in a hemogen paralog that could be

deteriorating; or chromosomal rearrangement/localized chromosomal breakage, resulting in the

partial loss of part of the hemogen gene. Either could result in a signal of relaxed selection, but

the evolutionary conclusions surrounding hemogen and its current role would differ depending

on which—if either—process is in effect.

Gene duplications play a critical role in the evolutionary process, either as a contributor to

phenotypic plasticity via evolution of new roles through neofunctionalization and

subfunctionalization [108, 109], or by development of pseudogenes [94]. I ruled out ancestral

gene duplication related to the whole-genome duplication event in teleosts [110-112], as analysis

conducted with zebrafish ortholog Si:dkey-25o16.2 showed that most teleost genomes retain only

a single hemogen ortholog [1]. However, a duplication event somewhere within the Antarctic

radiation, or prior to the diversification of Channichthyidae, could be possible [113, 114]. If the

hemogen first identified in C. aceratus were a paralog, or the product of a

duplication event in the icefish MRCA, then subfunctionalization of non-erythropoietic

processes could explain the C-terminus deterioration in said duplicate [115, 116] as well as some

of the reduced expression of icefish hemogen [117]. Thus, it was critical for me to make sure I

was analyzing true ortholog hemogen in the notothenioids, and not a duplicate which has either

subfunctionalized, neofunctionalized, or deteriorated into fragmentation and pseudogenization.

Alternatively, the large indels in Channichthyidae could be attributed to chromosomal

breakage or rearrangement [118]. Studies with human ortholog EDAG identify it as a putative

oncogene located at a chromosomal region (9q22) linked to leukemia-associated chromosomal

breakpoints [119, 120]. If the hemogen locus has been disrupted by chromosomal breakage, the

gene seen in icefish might be a fragment rather than a whole (but mutated) gene. Given the high

sequence conservation of the hemogen gene, significant disruption around the locus seemed

unlikely.

My experimental evidence already suggested a low possibility that either gene

duplication or chromosomal rearrangement had taken place, given the high sequence

conservation between red-blooded and white-blooded notothenioids (> 90%), and the lack of

double bands following gel electrophoresis of PCR. To be entirely sure, I ruled out both

possibilities by assessing the conservation of synteny surrounding hemogen in Antarctic

notothenioids and by using available genomes to survey for similar sequences which might

represent hemogen paralogs. I was able to check my C. aceratus sequence against the available

C. aceratus [unpublished results] and N. coriiceps genomes (NCBI RefSeq NC_015653.1) [121]

to confirm its chromosomal location and assess possible sequencing errors or duplicate genes.

The sequence I obtained, with the observed 89 bp deletion, was present on the genome scaffold,

and BLAST failed to obtain multiple hits for the whole-gene (exon + introns), coding sequence

only (all exons), or partial coding (individual exons) sequences.

Thanks to my work sequencing the promoter region of hemogen, I was also able to assess

the conservation of synteny among three representative species: C. aceratus (white-blooded), N.

coriiceps (red-blooded), and E. maclovinus (Sub-Antarctic outgroup). Previous work by the

Detrich lab established the conservation of synteny between zebrafish and other vertebrates [1]

with anp32b upstream of hemogen, and TRMO downstream. I sequenced towards hemogen from

both of those genes, and sequenced upstream and downstream from hemogen, in order to rule out

a small, localized chromosomal rearrangement that might impact hemogen. The Eleginops

sequence confirmed this localized synteny existed prior to the diversification of Antarctic

notothenioids, and expanding my search to other teleosts, lobe-finned, and cartilaginous fish

substantiated conservation of this ~15 kb region despite > 400 Ma of evolutionary distance [122-

125].

I therefore conclude that the sequence I obtained is not a hemogen paralog in C. aceratus,

and that it is unlikely that any other icefish sequences represent paralogs rather than true

orthologs. I also conclude that the indels observed in both red-blooded and white-blooded fish

are likely further evidence of relaxed selection acting upon hemogen, as opposed to larger-scale

chromosomal factors that could cause significant deletion or genomic rearrangement.

Mutation is preferentially accumulated in areas known to be conserved and vital for

erythropoiesis

The majority of evolutionary change across the Antarctic radiation occurs within the C-

terminal region encoded in exon 3, and in particular within the proline-rich region composed of

tandem repeats. This includes nonsynonymous amino acid changes, as illustrated in the pairwise

dN/dS comparisons, but primarily occurs as indels. The C-terminus is critical for binding of

p300, and the recruitment and binding to p300 is critical for Hemogen to promote erythroid

differentiation in the human ortholog [55].

These indels also overlap or occur adjacent to a conserved C-terminal area identified as

essential for erythropoiesis in zebrafish [1]. Previous studies used CRISPR/Cas9 zebrafish

mutants to induce indels in zebrafish hemogen—a frameshift mutant deleting 5aa, and an in-

frame mutant deleting 12 aa and part of a conserved acidic motif (EEED). Analysis of

hemoglobin concentration and numbers of circulating blood cells show indels in this region lead

to reduced erythrocyte levels and hypochromatic blood in embryos of in-frame mutants, and that

the proportion of anemic individuals increases even in the heterozygous condition in both mutant

strains [1]. Frameshifted mutants did not appear to be translationally successful, whereas the

non-frameshifted mutants did produce a slightly-smaller Hemogen protein [1]. Indels in this

region also impacted development via notochord and trunk defects in both frameshifted and in-frame

indel mutant zebrafish, and increased cellular apoptosis was identified within frameshifted

mutants throughout the embryo. Adult fish were statistically divergent in size from wild-type,

especially within homozygous in-frame mutant strains [1].

The implications of this research for interpreting the prevalence of indels suggest several

possibilities about relaxed selective pressures on erythropoiesis and hematopoietic traits.

Zebrafish mutations targeting the C-terminus of Hemogen reduced erythrocyte levels in adults

and decreased expression of Embryonic beta-globin regardless of in-frame or frameshift

condition. Therefore, deletions around or within this area of the C-terminus should impair

erythropoiesis in notothenioids even in the absence of frameshifts. The permissibility of

hemogen indels in red-blooded fish supports the conclusion that Hemogen cannot be absolutely

essential for erythropoiesis. Importantly, given that even heterozygous mutants show impaired

erythropoiesis, “one good copy” of the gene would not be sufficient to prevent such an indel

from having an effect on erythrocyte production and concentration.

Given that red-blooded notothenioids demonstrate reduced hematocrit and lowered

erythrocyte levels, the observed in-frame indels in red-blooded fish may represent a response to

the necessary adaptation to polar conditions. At this time it is not possible to distinguish between

the chicken-and-egg of this situation: do the indels in hemogen represent a part of the adaptive

changes needed to lower hematocrit/decrease blood viscosity, or did they occur from relaxed

selection on erythrocyte-regulators as the importance of hemoglobin decreased throughout the

radiation? Given the lack of strong phylogenetic pattern among the red-blooded indel

distribution, I cannot determine between these two possibilities at this time. It is also unclear

specifically how the mutations in red-blooded notothens impact binding with p300, or if they

impair or inhibit access to a TAD. Similarly, it is unknown whether or not the nonsynonymous

mutations have a strong functional impact (or any functional impact) on binding or protein

structure in red-blooded notothenioids.

The adaptive value of frameshifts among icefish is another matter. The large size of the

deletion (89-99 bp) and the fixation of a frameshift variant are most likely the result of relaxed

selection, given that they should have significant adverse effects on erythropoiesis. Furthermore,

during diversification of Channichthyidae two independent frameshift mutations occurred in this

C-terminal region, resulting in truncated proteins that eliminate the acidic domain and the

segment encoded by exon 4. If erythropoiesis is no longer required, then there is no longer any

reason to selectively maintain the domains responsible for that function, and frameshift

mutations may be tolerated to the point of fixation. Thus, the large indels may be indicative of

ongoing subfunctionalization in the hemogen ortholog, paring down the gene to only those

functional domains most essential to non-erythropoietic roles for hemogen.

A novel splice variant missing the majority of functional domains required for

erythropoiesis might be a natural dominant negative

Regardless of which isoform an icefish may be expressing at any given time, at least two

functional domains show evidence of relaxed selection. All icefish display a degraded bipartite

NLS and some loss of the proline-rich/tandem repeat region implicated in the GATA1/p300/EDAG

complex [1, 55]. All icefish can theoretically express the hemgn-s isoform that excludes

everything but the coiled-coil domain from its truncated protein product. This elimination of

functional domains from a key isoform could be a dominant negative mutation that would

interfere with the functions of the Hemogen protein encoded by hemgn-L. This could be

accomplished either by binding to key partners (such as GATA1 or p300) or potentially through

oligomerization of Hemogen itself. Dominant negatives have an important role to play in

inhibiting wild-type protein function, and have particularly been implicated in disease and promotion

of cancer formation. Additionally, dominant negatives in erythropoietic genes have been known

to lead to anemic conditions [126, 127] and development of a dominant negative in hemogen

could promote the erythrocyte-null condition. Dominant negatives have also been shown to

increase favorable outcomes in acute myeloid leukemia at other loci [128], suggesting that

dominant negative mutations in other putative oncogenes associated with leukemia (such as

hemogen) might also be beneficial in decreasing proliferation of this particular type of tumor.

Key transcription factor binding domains for hematopoiesis- and/or erythropoiesis-

promoting genes appear to be conserved in icefish despite the erythrocyte-null phenotype,

including GATA1 [129], KLF4 [130], and Myb [57, 131]. With the promoter region and

necessary CNEs for primitive erythropoiesis still potentially intact, hemogen could still be

recruited to these processes in icefish. If hemgn-s is a dominant negative, it might interact with

some complexes in such a way as to prohibit their functions to promote erythropoiesis, while still

permitting non-erythropoietic function to occur.

Several caveats to this theory need to be stated. At this time, it is unclear how expression

of hemgn-s is regulated and what promoters might be behind it, or if it is controlled by a

promoter region further upstream that has yet to be discovered. It is unclear how it is

differentially expressed relative to wild-type hemogen, or if promoting expression of wild-type

hemogen means that hemgn-s is automatically expressed as well. However, the implications for

hemgn-s if it could be conclusively proven to encode a dominant negative—or, alternatively, if

other forms of icefish hemogen could function as a dominant negative in some processes—could

be significant for non-erythropoietic research as well, given that the C-terminus is also implicated

in cell apoptosis and developmental defects and delays in zebrafish [1].

Large-scale deletions and degradation of functional domains occur concurrent with loss of

alpha- & beta-globin expression

The majority of functional change in Channichthyidae appears to be traceable to events

that occurred within the MRCA of icefish, prior to the diversification of the clade. The major

deletion in exon 3, the degradation of the bipartite NLS, and the evolution of a novel splice

variant are all shared among extant icefish, and the most parsimonious explanation is that these

characters developed prior to diversification and speciation.

Additionally, the icefish-specific deletions provide insight into the evolutionary history of

the icefish clade that implicates globin loss in relaxed selection on hemogen. Based on analysis of

hemogen allelic variation, the most likely evolutionary scenario for deletions and deteriorations

is that the 90bp/30aa deletion occurred in the MRCA of icefish, possibly as a consequence of

relaxed constraints following the loss of hemoglobin expression. From there two independent

mutations occurred: 1) a 1 bp insertion leading to the “frameshift” allele observed in C. aceratus

and Neopagetopsis ionah, which became fixed in some species, and 2) a secondary 9 bp deletion,

which has fixed in some but not all of the most recently speciated/most derived lineages.

Distribution of these deletion variants is consistent with incomplete lineage sorting, which is

common in adaptive radiations due to the rapid speciation [132], but also consistent with

possible introgression. Past introgression events have been detected in some icefish species

[133]. The evolution of these traits suggests relaxed selection permitting further deterioration

following the removal of a functional constraint: i.e., the need to produce erythrocytes.

Table 1. Primers used in PCR and qRT-PCR reactions to amplify hemogen gDNA and cDNA in

Antarctic notothenioids

PRIMER NAME SEQUENCE PURPOSE SPECIES

Ncor130for 5'-TGGAGGAGACATTTCAACA-3' gDNA, cDNA Antarctic notothenioids

NcHemRev2 5'-ACTAACAGGATGCACACTAACC-3' gDNA, cDNA Antarctic notothenioids

QP_CA500SpliceF2 5'-GACTAACCAGTGGGTTTAAGCC-3' qPCR C. aceratus, C. gunnari

NcHemRev1 5'-TTGTGGAGGAGGTGTCGAG-3' qPCR Antarctic notothenioids

hemAllqPCRFor 5'-AGAATGGAGGAGACATTTCAACA-3' qPCR C. aceratus, C. gunnari, N. coriiceps

hemAllqPCRRev1 5'-TTCCTCAGAAGATCCCTGTC-3' qPCR C. aceratus, C. gunnari

hemAllqPCRRev2B 5'-CTTGTCTTCTGCTTCAGCTT-3' qPCR N. coriiceps

RTBactF 5'-CAGATCATGTTCGAGACCTTCAAC-3' qPCR C. aceratus, C. gunnari, N. coriiceps

RTBactR 5'-TCACCRGARTCCATGACGATA-3' qPCR C. aceratus, C. gunnari, N. coriiceps
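
The RTBactR primer in Table 1 contains the IUPAC degenerate code R (A or G) at two positions, so it represents a small pool of concrete oligonucleotides rather than a single sequence. The following is a minimal sketch of how that pool can be enumerated; the function name `expand_degenerate` and the lookup table are illustrative assumptions and are not part of the original methods.

```python
from itertools import product

# Subset of the IUPAC nucleotide ambiguity codes (standard definitions).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer.

    Hypothetical helper written for illustration only.
    """
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# RTBactR as listed in Table 1 (5'->3'); it contains two R positions.
RTBACT_R = "TCACCRGARTCCATGACGATA"

variants = expand_degenerate(RTBACT_R)
for seq in variants:
    print(seq)
```

Each R position doubles the number of encoded sequences, so two R positions give a pool of four oligonucleotides.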

Table 2. Species sequenced and included in study of Antarctic notothenioid hemogen

ORGANISM FAMILY LOCALE ORGANISM FAMILY LOCALE

Champsocephalus esox* Channichthyidae Sub-Antarctic Harpagifer antarcticus Harpagiferidae Antarctic

Champsocephalus gunnari* Channichthyidae Antarctic Notothenia rossii Nototheniidae Antarctic

Neopagetopsis ionah* Channichthyidae Antarctic Notothenia coriiceps* Nototheniidae Antarctic

Pagetopsis macropterus* Channichthyidae Antarctic Notothenia angustata Nototheniidae Sub-Antarctic

Pseudochaenichthys georgianus*† Channichthyidae Antarctic Gobionotothen gibberifrons Nototheniidae Antarctic

Dacodraco hunteri Channichthyidae Antarctic Pleuragramma antarctica Nototheniidae Antarctic

Channichthys rhinoceratus* Channichthyidae Antarctic Trematomus hansoni* Nototheniidae Antarctic

Chaenocephalus aceratus*† Channichthyidae Antarctic Trematomus bernacchii* Nototheniidae Antarctic

Chionobathyscus dewitti* Channichthyidae Antarctic Trematomus eulepidotus Nototheniidae Antarctic

Cryodraco antarcticus* Channichthyidae Antarctic Trematomus borchgrevinki Nototheniidae Antarctic

Chaenodraco wilsoni* Channichthyidae Antarctic Trematomus newnesi* Nototheniidae Antarctic

Chionodraco myersi* Channichthyidae Antarctic Trematomus scotti Nototheniidae Antarctic

Chionodraco hamatus* Channichthyidae Antarctic Patagonotothen guntheri Nototheniidae Antarctic

Chionodraco rastrospinosus* Channichthyidae Antarctic Patagonotothen cornucola Nototheniidae Antarctic

Parachaenichthys charcoti† Bathydraconidae Antarctic Lepidonotothen nudifrons Nototheniidae Antarctic

Gerlachea australis Bathydraconidae Antarctic Dissostichus mawsoni Nototheniidae Antarctic

Bathydraco marri Bathydraconidae Antarctic Dissostichus eleginoides Nototheniidae Antarctic

Akarotaxis nudiceps Bathydraconidae Antarctic Aethotaxis mitopteryx Nototheniidae Antarctic

Vomeridens infuscipinnis Bathydraconidae Antarctic Eleginops maclovinus* Eleginopsidae Sub-Antarctic

Racovitzia glacialis Bathydraconidae Antarctic Pseudaphritis urvilli Pseudaphritidae Eastern Australia

Pogonophryne barsukovi Artedidraconidae Antarctic Cottoperca gobio Bovichtidae Sub-Antarctic

Pogonophryne scotti Artedidraconidae Antarctic Bovichtus diacanthus Bovichtidae Sub-Antarctic

Dolloidraco longedorsalis Artedidraconidae Antarctic Percophis brasiliensis Percophidae South America

Histiodraco velifer Artedidraconidae Antarctic Etheostoma nigrum Percidae North America

Artedidraco skottsbergi Artedidraconidae Antarctic Gasterosteus aculeatus Gasterosteidae Northern Hemisphere

* sequenced gDNA † transcriptome available

Table 3. Codon usage bias for hemogen (total coding sequence) among Antarctic notothenioids

SPECIES CAI SPECIES CAI

Eleginops maclovinus 0.34 Pogonophryne scotti 0.329

Aethotaxis mitopteryx 0.352 Gerlachea australis 0.334

Dissostichus mawsoni 0.37 Parachaenichthys charcoti 0.332

Dissostichus eleginoides 0.345 Racovitzia glacialis 0.346

Lepidonotothen nudifrons 0.357 Vomeridens infuscipinnis 0.343

Patagonotothen cornucola 0.378 Akarotaxis nudiceps 0.349

Patagonotothen guntheri 0.377 Bathydraco marri 0.337

Trematomus bernacchii 0.356 Chaenocephalus aceratus 0.309

Trematomus borchgrevinki 0.363 Dacodraco hunteri 0.342

Trematomus eulepidotus 0.36 Neopagetopsis ionah 0.338

Trematomus hansoni 0.376 Cryodraco antarcticus 0.316

Trematomus newnesi 0.35 Chionodraco hamatus 0.332

Trematomus scotti 0.368 Chionodraco rastrospinosus 0.329

Gobionotothen gibberifrons 0.324 Champsocephalus esox 0.343

Notothenia angustata 0.359 Chionodraco myersi 0.304

Notothenia rossii 0.367 Pagetopsis macropterus 0.344

Notothenia coriiceps 0.358 Champsocephalus gunnari 0.337

Harpagifer antarcticus 0.339 Chaenodraco wilsoni 0.309

Artedidraco skottsbergi 0.332 Channichthys rhinoceratus 0.337

Histiodraco velifer 0.333 Chionobathyscus dewitti 0.313

Dolloidraco longedorsalis 0.327 Pseudochaenichthys georgianus 0.327

Pogonophryne barsukovi 0.329
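
As a descriptive aid, the CAI values in Table 3 can be averaged by family. The sketch below copies the values from Table 3 and groups them using the family assignments in Table 2; the single-species families (Eleginopsidae, Harpagiferidae) are left out. It is only a summary of the tabulated numbers under those assumptions, not a statistical test and not the analysis performed in this thesis.

```python
from statistics import mean

# CAI values copied from Table 3, grouped by the families given in Table 2.
CAI_BY_FAMILY = {
    "Nototheniidae": [
        0.352, 0.370, 0.345, 0.357,  # A. mitopteryx, D. mawsoni, D. eleginoides, L. nudifrons
        0.378, 0.377, 0.356, 0.363,  # P. cornucola, P. guntheri, T. bernacchii, T. borchgrevinki
        0.360, 0.376, 0.350, 0.368,  # T. eulepidotus, T. hansoni, T. newnesi, T. scotti
        0.324, 0.359, 0.367, 0.358,  # G. gibberifrons, N. angustata, N. rossii, N. coriiceps
    ],
    "Artedidraconidae": [
        0.332, 0.333, 0.327, 0.329, 0.329,  # A. skottsbergi, H. velifer, D. longedorsalis, P. barsukovi, P. scotti
    ],
    "Bathydraconidae": [
        0.334, 0.332, 0.346,  # G. australis, P. charcoti, R. glacialis
        0.343, 0.349, 0.337,  # V. infuscipinnis, A. nudiceps, B. marri
    ],
    "Channichthyidae": [
        0.309, 0.342, 0.338, 0.316, 0.332,  # C. aceratus, D. hunteri, N. ionah, C. antarcticus, C. hamatus
        0.329, 0.343, 0.304, 0.344, 0.337,  # C. rastrospinosus, C. esox, C. myersi, P. macropterus, C. gunnari
        0.309, 0.337, 0.313, 0.327,         # C. wilsoni, C. rhinoceratus, C. dewitti, P. georgianus
    ],
}

family_means = {family: mean(values) for family, values in CAI_BY_FAMILY.items()}
for family, value in family_means.items():
    print(f"{family}: n={len(CAI_BY_FAMILY[family])}, mean CAI={value:.3f}")
```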

Table 4. Mean pairwise dN/dS for within-family comparisons of Antarctic notothenioid families

FAMILY TOTAL N-TERMINUS C-TERMINUS

Artedidraconidae 1.381 N/A N/A

Bathydraconidae 0.807 0.880 0.866

Nototheniidae 1.320 0.716 1.191

Channichthyidae 1.077 0.435 1.190

Table 5. Mean pairwise dN/dS for between-family comparisons of Antarctic notothenioid

families

FAMILY PAIR TOTAL N-TERMINUS C-TERMINUS

Harpagiferidae-Nototheniidae 5.889 0.656 5.793

Harpagiferidae-Artedidraconidae 5.889 0.394 N/A

Harpagiferidae-Bathydraconidae 1.980 1.172 2.225

Harpagiferidae-Channichthyidae 4.737 2.171 4.723

Bathydraconidae-Nototheniidae 3.010 0.839 2.991

Bathydraconidae-Artedidraconidae 1.754 0.557 1.184

Bathydraconidae-Channichthyidae 2.225 1.844 1.653

Artedidraconidae-Nototheniidae 3.158 0.516 3.708

Artedidraconidae-Channichthyidae 2.601 1.772 1.963

Channichthyidae-Nototheniidae 2.865 1.895 2.597

Table 6. Results of codon-based site tests conducted in CodeML on the Antarctic radiation

TEST ΔLRT DF P-VALUE

M0-M3 9.08037 4 p = 0.0591

M1a-M2a 8.3575 2 p = 0.0153

M7-M8 10.1192 2 p = 0.0063

M8-M8a 8.07556 1 p = 0.0045
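
The p-values in Table 6 follow from comparing each likelihood ratio statistic to a chi-square distribution with the listed degrees of freedom. The sketch below reproduces that step using only the Python standard library; the helper `chi2_sf` is an illustrative assumption (it handles only df = 1 and even df, which covers Table 6) and is not the code used for the original analysis.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable.

    Closed forms keep this dependency-free:
      df == 1 : erfc(sqrt(x / 2))
      even df : exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
    Other df values are not supported in this sketch.
    """
    if x < 0:
        raise ValueError("test statistic must be non-negative")
    half = x / 2.0
    if df == 1:
        return math.erfc(math.sqrt(half))
    if df > 0 and df % 2 == 0:
        series = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * series
    raise ValueError("only df == 1 or even df are supported here")


# (comparison, LRT statistic, degrees of freedom) as listed in Table 6.
TABLE_6 = [
    ("M0-M3", 9.08037, 4),
    ("M1a-M2a", 8.3575, 2),
    ("M7-M8", 10.1192, 2),
    ("M8-M8a", 8.07556, 1),
]

p_values = {name: chi2_sf(stat, df) for name, stat, df in TABLE_6}
for name, p in p_values.items():
    print(f"{name}: p = {p:.4f}")
```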

Figure 1. Zebrafish Si:dkey-25o16.2 and human Hemogen are orthologous and encode

related proteins that differ in size. (A) Structure of the zebrafish Hemogen-like gene, Si:dkey-

25o16.2. Two conserved noncoding elements (C1 and C2, black boxes) were identified in a 2 kb

segment proximal to the start codon (see Results, Figs 4-6). Coding exons, white boxes;

noncoding exons, gray boxes. Numbers indicate length in bp. (B) Synteny of loci for zebrafish

Si:dkey-25o16.2 on chromosome 1 and Hemogen on human chromosome 9 (region q22).

Transcriptional orientations indicated by arrows. (C) Alternative splicing of zebrafish Hemogen-

like transcripts showing sequenced regions. Introns are shown as chevrons. Transcripts 1 and 2

differ by retention of 12 bp of intron (red). (D) Modular structures of zebrafish and human

Hemogen proteins each encoded by four exons (numbered boxes). Locations of truncating

mutations found in some human cancers (Forbes et al., 2017) are indicated by asterisks.

Predicted regions and motifs: green, coiled coil; blue, nuclear localization signal; red, four

residues introduced by alternative splicing; yellow, tandem peptide repeats; brown, acidic repeat

with transactivation domain (TAD) motif; gray, no prediction. (E) Three-dimensional ab initio

models of Hemogens. The ribbon diagram of the zebrafish protein, color-coded as in panel D, is

superimposed on the gray, space-filling model for the human protein. (Reproduced with

permission from Biology Open)

Figure 2. Icefish transcript variants for hemogen and their putative effects on translation

illustrated in representative species Champsocephalus gunnari. While several potential

transcripts were possible, only two transcripts have been confirmed for hemogen in icefish. The

first transcript, hemgn-L, consists of the complete transcription of all exons. The novel transcript

hemgn-s splices from the end of exon 2, to downstream of the 90bp deletion region; additionally,

it splices into a frameshift similar to that observed in C. aceratus, and thus, the end of exon 3 and

exon 4 would not be translated. This hemgn-s transcript would exclude the bipartite NLS, the

proline-rich tandem repeat domain and the acidic region. It has been detected in all icefish

surveyed regardless of their exon 3 deletion allele. Additionally, all transcripts detected in

surveyed icefish (C. aceratus, C. gunnari, P. georgianus and C. rastrospinosus) show that a

potential splice variant feature at the end of exon 2 (+/- 4AA) present in teleost fish only splices

with the +4AA (blue) in icefish. (C) Illustration of functional domains which would be included

in the Hemogen protein encoded by hemgn-s. If translated, only the coiled-coil domain (green)

would be present in the Hemogen protein; the splice form would exclude the functional domains

encoded within exon three: the bipartite nuclear localization signal, the tandem repeats, and the

acidic region. The light-grey regions on the protein illustration represent areas with no predicted

functional domain.

Figure 3. Maximum likelihood tree used to test for positive selection on the branch leading

to the Antarctic notothenioid clade. Tree was constructed based on the coding sequences

derived from each species. The background branch includes all non-Antarctic outgroups ranging

from as far as the three-spine stickleback (G. aculeatus) to close Sub-Antarctic relative E.

maclovinus. The foreground branch is marked in red and includes two representative species: one

red-blooded (N. coriiceps) and one white-blooded (C. gunnari).

Figure 4. Maximum likelihood tree used in site-tests for positive/pervasive selection among

Antarctic notothenioids. Tree was constructed in RAxML (see Methods) and used for

calculating changes to site dN/dS for all models, in order to detect pervasive selection among the

high-latitude Antarctic notothenioids. No such selection was detected.

Figure 5. RELAX tree shows relaxed selection on the branches containing Bathydraconidae

and Channichthyidae, demonstrating a trend of relaxed selection in hemogen on the way to

the erythrocyte-null phenotype. Test for selection relaxation (K = 0.25) was significant (p =

0.002, LR = 9.77). The test branch is indicated in light teal and represents the K value (K = 0.25)

for this branch relative to the background (K = 1, not colored).
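
To make the K parameter concrete: in the RELAX framework each dN/dS rate class on the test branches is modeled as the corresponding reference class raised to the power K, so K < 1 pulls every class toward neutrality (dN/dS = 1). The sketch below applies K = 0.25 from Figure 5 to a set of reference dN/dS classes that are hypothetical and chosen purely for illustration, and recomputes the p-value for the reported likelihood ratio assuming a chi-square distribution with one degree of freedom.

```python
import math

K = 0.25   # relaxation parameter reported in Figure 5
LR = 9.77  # likelihood ratio reported in Figure 5


def relax_transform(omega_reference: float, k: float) -> float:
    """Test-branch dN/dS for one rate class: omega_test = omega_reference ** k."""
    return omega_reference ** k


# Hypothetical reference rate classes (purifying, neutral, diversifying).
# These are NOT estimates from this study.
hypothetical_reference_omegas = [0.1, 1.0, 5.0]
test_omegas = [relax_transform(w, K) for w in hypothetical_reference_omegas]

for ref, test in zip(hypothetical_reference_omegas, test_omegas):
    print(f"reference omega = {ref:.2f} -> test omega = {test:.3f}")

# Chi-square upper tail with 1 degree of freedom: erfc(sqrt(LR / 2)).
p_value = math.erfc(math.sqrt(LR / 2.0))
print(f"p = {p_value:.4f}")
```

With K < 1, classes below 1 rise and classes above 1 fall, which is the signature of relaxed rather than intensified selection.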

Figure 6. Gene structure and size remain conserved among red-blooded and white-blooded

notothenioids, including regulatory regions conserved among teleost fish. Structure

of the coding region in both Notothenia coriiceps (A) and Chaenocephalus aceratus (B) matches

that observed in other teleosts as well as in other vertebrates and shows strong size conservation for

exons 1, 2 and 4, as well as all introns. Two conserved non-coding elements described in Peters

et al 2018 for D. rerio are also still present in both species. The hemogen regulatory structure

shown (C) is derived from C. aceratus intergenic sequencing, but the general spacing approximates

that of the regulatory regions of both N. coriiceps and the Sub-Antarctic relative Eleginops maclovinus.

While the 5’ intergenic region of notothenioid hemogen is decreased in size relative to D. rerio

(Peters et al 2018), both CNEs are intact and show no significant genetic lesions relative to other

vertebrate species. (D) shows the partitioning of functional domains among exons in a

representative red-blooded species, Trematomus scotti. There are four primary domains: the

coiled-coil domain (exon 2), the bipartite nuclear localization signal (exon 3), a proline-rich

region composed of a variable number of tandem repeats (exon 3) and an acidic domain (exon

3). (E) shows these functional domains in a white-blooded fish, Champsocephalus gunnari.

Preliminary analysis of tandem repeats in icefish shows that the large deletion in exon 3 may

have resulted in the loss of at least one repeat.

Figure 7. Conservation of conserved non-coding elements CNE1 and CNE2 in Antarctic notothenioids relative to Gasterosteus

aculeatus and Danio rerio. Transcription factor binding sites were predicted using Contra V2 (Broos et al 2011) based on previously

identified key binding factors in both human hemogen and D. rerio. Relative to Antarctic notothenioids and representative teleost

species, C. aceratus does not show significant deterioration of either CNE1 (A) or CNE2 (B) and still possesses many putative binding

sites for key co-factors like p300, GATA1, Sox9, etc. Binding sites are colored accordingly: Foxl2 (orange), GFI1 (light blue), KLF4

(bright green), HNF1 (pink), HOXB4 (light brown), MYB (cyan/lavender), P300 (grey), Sox9 (red), GATA1 (dark pink/dark orange.)

Figure 8. hemogen exon 3 deletions in representative species from Channichthyidae relative

to a red-blooded notothenioid, and their predicted effects on transcription and translation.

The genetic lesion representing the most significant mutation to icefish hemogen takes three key

forms: a 90bp deletion, a 99bp deletion, and an 89bp deletion. (A) shows a representative red-

blooded notothenioid, Trematomus scotti, which does not possess any lesions in exon 3. (B)

shows representative C. gunnari, which possesses the 90bp form of the deletion. Putative

translation shows this would result in a 30AA deletion but does not produce a frameshift, and the

rest of the gene should be translated normally. Similarly, (C) shows that the 99bp deletion of

C. rastrospinosus would leave the protein somewhat further reduced but still translated in frame. However,

the 89bp deletion first observed in C. aceratus (D) would result in a frameshift leading

to a premature stop, and a subsequently truncated protein. The transcript possessing all exons is

illustrated here in red; grey regions indicate sequence that would be excluded from translation as

a result of the premature stop codon. The Hemogen protein structure is illustrated underneath

each transcript, in order to show how some features would not be wholly translated/translated at

all due to the exon 3 deletions in icefish. Functional domains are colored as follows: coiled-coil,

green; nuclear localization signal, blue; tandem repeats, yellow; acidic domain, red; light grey,

no predicted domain.
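
The different outcomes of the three exon 3 deletion forms follow from whether the deletion length is a multiple of three. A minimal sketch of that arithmetic is given below; the function name `classify_deletion` is an illustrative assumption.

```python
def classify_deletion(length_bp: int) -> str:
    """Classify a coding-sequence deletion by its effect on the reading frame."""
    if length_bp <= 0:
        raise ValueError("deletion length must be positive")
    if length_bp % 3 == 0:
        # Whole codons are removed, so downstream codons stay in frame.
        return f"in-frame, removes {length_bp // 3} codons"
    # A remainder of 1 or 2 bp shifts every downstream codon.
    return f"frameshift (remainder {length_bp % 3} bp)"


# The three deletion forms described in Figure 8.
results = {size: classify_deletion(size) for size in (90, 99, 89)}
for size, outcome in results.items():
    print(f"{size} bp deletion: {outcome}")
```

The 90 bp and 99 bp forms remove 30 and 33 codons respectively and preserve the reading frame, while the 89 bp form leaves a remainder and shifts the frame for all downstream sequence.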

Figure 9. Variant forms of hemogen “exon 3” deletion mapped onto the Channichthyidae

species tree. The deletion forms are not distributed evenly throughout the tree and follow a loose

evolutionary pattern at best. The most common form of the deletion is the 90bp loss (green), which

may be the ancestral form of this lesion, as it is present throughout the entire tree. Second most

common is the 99bp deletion (blue), which evolved within the more recently speciated lineages

(< 4Mya) and does not appear to be fully fixed in most species, as it frequently appears as a

second allele alongside the 90bp deletion. The 89bp deletion, resulting in a frameshift, follows

no clear evolutionary pattern. This mutation appears to be fixed in species such as C. aceratus,

and surveying multiple individuals failed to detect any non-frameshifted hemogen variants in

these species. Tree topology was recreated using the mitochondrial tree of Near et al 2006 and

the putative species tree built by Dr. Jacob Daane [unpublished results].

Figure 10. hemogen indels in Antarctic notothenioids mapped onto a maximum parsimony

tree. Maximum parsimony tree was constructed in MEGA7 based on putative protein sequences

(see Methods). All indels marked appear in the coding sequence and would have an impact on

the protein composition. Purple triangles indicate a deletion event, while red triangles indicate an

insertion. Indels marked with a colored star are homoplastic and appear to have occurred

independently several times throughout Antarctic notothenioid diversification. However, the

indel marked with a green star is an artifact introduced by the parsimony construction process;

due to the high sequence similarity among icefish, the parsimony method cannot adequately

model correct species relationships in this clade.

Figure 11. Pairwise dN/dS comparisons plotting total dN/dS of whole Hemogen-encoding

sequence with the dN/dS values for the N-terminus and C-terminus of notothenioid

Hemogen, within families Nototheniidae (A & B) and Channichthyidae (C & D). This

demonstrates the relative changes in selective pressure of the N-terminus vs the C-terminus,

relative to the overall selective pressure on the whole Hemogen protein. Values were obtained by

conducting pairwise-comparisons in the module yn00 of PAML 4 (see Methods). Each dot

represents a unique species pair and the calculated dN/dS values for the N-terminus, C-terminus,

and total coding sequence for that specific species comparison.

Figure 12. Pairwise dN/dS trends between families Nototheniidae and Channichthyidae,

plotting whole-Hemogen dN/dS vs the N-terminus (A) or C-terminus (B). This demonstrates

the relative changes in selective pressure of the N-terminus vs the C-terminus, relative to the

overall selective pressure on the whole Hemogen protein. Values were obtained by conducting

pairwise-comparisons in the module yn00 of PAML 4 (see Methods). Each dot represents a

unique species pair and the calculated dN/dS values for the N-terminus, C-terminus, and total

coding sequence for that specific species comparison.

Figure 13. qPCR quantification of hemogen transcript variants in representative icefish

species C. aceratus and C. gunnari, comparing adult head kidney hemogen expression with

N. coriiceps adult head kidney for both hemgn-L and hemgn-s splice variants. Primers were

designed to capture two types of hemogen expression in icefish. One is the hemgn-s variant and

the other is referred to as “whole” hemogen. In icefish, this captures only the hemgn-L variant,

while in N. coriiceps this captures all transcripts (excluding a potential hemgn-s transcript).

When normalized to beta-actin expression, we see some hemogen expression in adult icefish,

particularly in the non-frameshifted species C. gunnari, but no amplification of hemgn-L in the

species with the frameshift indel, C. aceratus. In C. gunnari, hemgn-s is expressed at relatively

similar levels to hemgn-L, whereas in C. aceratus the variant hemgn-s is expressed at such low

levels that it is nearly undetectable. While it appears that N. coriiceps may also express hemgn-s

at low levels, sequencing of qPCR product shows this is off-target binding and not legitimate

amplification of the targeted splice variant. N. coriiceps does not express hemgn-s.
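
The caption states that expression was normalized to beta-actin but does not name the calculation used. One common approach is the delta-Ct method, sketched below under the assumption of roughly 100% amplification efficiency. The Ct values are hypothetical placeholders for illustration and are not measurements from this study.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Relative quantity 2^-(dCt), where dCt = Ct(target) - Ct(reference).

    Assumes the amount of product doubles each cycle for both amplicons.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values, NOT data from this thesis.
example_ct = {"hemgn-L": 27.0, "hemgn-s": 28.0, "beta-actin": 20.0}

relative = {
    transcript: relative_expression(ct, example_ct["beta-actin"])
    for transcript, ct in example_ct.items()
    if transcript != "beta-actin"
}
for transcript, value in relative.items():
    print(f"{transcript}: {value:.5f} relative to beta-actin")
```

Under this scheme a one-cycle difference in Ct corresponds to a two-fold difference in normalized expression.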

Figure 14. Changes to the bipartite nuclear localization signal in icefish (Champsocephalus

gunnari) relative to red-blooded notothens (Notothenia coriiceps). The bipartite nuclear

localization signal consists of two clusters of positively charged amino acids (typically lysine

and arginine) separated by a spacer sequence. In C. gunnari and other icefish, the first portion of

the bipartite signal has been degraded at the first (R -> M) and fourth (R -> S) positions. This

would result in a decreased positive charge and negatively impact the nuclear localization

process.

REFERENCES

1. Peters, M.J., et al., Divergent Hemogen genes of teleosts and mammals share conserved

roles in erythropoiesis: analysis using transgenic and mutant zebrafish. Biol Open, 2018.

7(8).

2. Near, T.J., et al., Identification of the notothenioid sister lineage illuminates the

biogeographic history of an Antarctic adaptive radiation. BMC Evol Biol, 2015.

15(109): p. 1-14.

3. Scher, H.D. and E.E. Martin, Timing and Climatic Consequences of the Opening of

Drake Passage. Science, 2006. 312: p. 428-430.

4. Barker, P.F., et al., Onset and role of the Antarctic Circumpolar Current. Deep Sea

Research Part II: Topical Studies in Oceanography, 2007. 54(21-22): p. 2388-2398.

5. Chen, L., A. Devries, and C.H. Cheng, Evolution of antifreeze glycoprotein gene from a

trypsinogen gene in Antarctic notothenioid fish. Proc Natl Acad Sci U S A, 1997. 94: p.

3811-3816.

6. Cheng, C.H. and L. Chen, Evolution of an antifreeze glycoprotein: a blood protein that

keeps Antarctic fish from freezing arose from a digestive enzyme. Nature, 1999. 401: p.

443-444.

7. Near, T.J., et al., Ancient climate change, antifreeze, and the evolutionary diversification

of Antarctic fishes. Proc Natl Acad Sci U S A, 2012. 109(9): p. 3434-3439.

8. Shevenell, A.E., J.P. Kennett, and D.W. Lea, Middle Miocene Southern Ocean Cooling

and Antarctic Cryosphere Expansion. Science, 2004. 305: p. 1766-1770.

9. Clarke, A., D.K. Barnes, and D.A. Hodgson, How isolated is Antarctica? Trends Ecol

Evol, 2005. 20(1): p. 1-3.

10. Clarke, A. and I.A. Johnston, Evolution and adaptive radiation of Antarctic fishes.

Trends Ecol Evol, 1996. 11(5): p. 212-218.

11. Dornburg, A., et al., Cradles and museums of Antarctic teleost biodiversity. Nat Ecol

Evol, 2017. 1(9): p. 1379-1384.

12. Tripati, A.K., C.D. Roberts, and R.A. Eagle, Coupling of CO2 and ice sheet stability over

major climate transitions of the last 20 million years. Science, 2009. 326(5958): p. 1394-

7.

13. Pollard, D. and R.M. DeConto, Modelling West Antarctic ice sheet growth and collapse

through the past five million years. Nature, 2009. 458(7236): p. 329-32.

14. Thatje, S., et al., Life hung by a thread: endurance of Antarctic fauna in glacial periods.

Ecology, 2008. 89(3): p. 682-692.

15. Cheng, C.H. and H.W. Detrich, 3rd, Molecular ecophysiology of Antarctic notothenioid

fishes. Philos Trans R Soc Lond B Biol Sci, 2007. 362(1488): p. 2215-32.

16. Johns, G.C. and J.C. Avise, Tests for ancient species flocks based on molecular

phylogenetic appraisals of Sebastes rockfishes and other marine fishes. Evolution, 1998.

52(4): p. 1135-1146.

17. Eastman, J.T., The nature of the diversity of Antarctic fishes. Polar Biology, 2005. 28(2):

p. 93-107.

18. Schluter, D., The Ecology of Adaptive Radiation. 2000, Oxford: OUP.

19. Rutschmann, S., et al., Parallel ecological diversification in Antarctic notothenioid fishes

as evidence for adaptive radiation. Mol Ecol, 2011. 20(22): p. 4707-21.

Page 68: hemogen among high-latitude Antarctic notothenioids by ...m044c681x/fulltext.pdfblooded and white-blooded notothenioid fish show reduced hematocrit, which is potentially an adaptive

68

20. Ruud, J.T., Vertebrates without erythrocytes and blood pigment. Nature, 1954. 173: p.

848-850.

21. Eastman, J.T., Antarctic fish biology: evolution in a unique environment. 1993: Academic

Press.

22. Wells, R.M.G., et al., Comparative study of the erythrocytes and haemoglobins in

nototheniid fishes from Antarctica. Journal of Fish Biology, 1980. 17(5): p. 517-527.

23. Wells, R.M.G., J.A. Macdonald, and G. di Prisco, Thin-blooded Antarctic fishes- a

rheological comparison of the haemoglobin-free icefishes Chionodraco kathleenae and

Cryodraco antarcticus with a red-blooded nototheniid, Pagothenia bernacchii. Journal of

Fish Biology, 1990. 36(4): p. 595-609.

24. Macdonald, J.A. and R.M.G. Wells, Viscosity of Body Fluids From Antarctic

Notothenioid Fish, in Biology of Antarctic Fish, G. Di Prisco, B. Maresca, and B. Tota,

Editors. 1991, Springer-Verlag: Berlin. p. 163-178.

25. Cocca, E., et al., Genomic remnants of alpha-globin genes in the hemoglobinless

antarctic icefishes. Proc Natl Acad Sci U S A, 1995. 92: p. 1817-1821.

26. di Prisco, G., Molecular Adaptations of Antarctic Fish Hemoglobins. 1998: p. 339-353.

27. di Prisco, G., et al., Biogeography and adaptation of Notothenioid fish: hemoglobin

function and globin-gene evolution. Gene, 2007. 398(1-2): p. 143-55.

28. Xu, Q., et al., Adaptive evolution of hepcidin genes in antarctic notothenioid fishes. Mol

Biol Evol, 2008. 25(6): p. 1099-112.

29. Beers, J.M. and N. Jayasundara, Antarctic notothenioid fish: what are the future

consequences of 'losses' and 'gains' acquired during long-term evolution at cold and

stable temperatures? J Exp Biol, 2015. 218(Pt 12): p. 1834-1845.

30. Beers, J.M., K.A. Borley, and B.D. Sidell, Relationship among circulating hemoglobin,

nitric oxide synthase activities and angiogenic poise in red- and white-blooded Antarctic

notothenioid fishes. Comp Biochem Physiol A Mol Integr Physiol, 2010. 156(4): p. 422-

9.

31. di Prisco, G., J.A. MacDonald, and M. Brunori, Antarctic fishes survive exposure to

carbon monoxide. Experientia, 1992. 48(5): p. 473-475.

32. Cocca, E., et al., Do the hemoglobinless icefishes have globin genes? Comp Biochem

Physiol, 1997. 118A(4): p. 1027-1030.

33. Zhao, Y., et al., The Major Adult -Globin Gene of Antarctic Teleosts and Its Remnants in

the Hemoglobinless Icefishes: CALIBRATION OF THE MUTATIONAL CLOCK FOR

NUCLEAR GENES. Journal of Biological Chemistry, 1998. 273(24): p. 14745-14752.

34. Near, T.J., S.K. Parker, and H.W. Detrich, 3rd, A genomic fossil reveals key steps in

hemoglobin loss by the antarctic icefishes. Mol Biol Evol, 2006. 23(11): p. 2008-16.

35. Barber, D.L., The blood cells of the Antarctic icefish Chaenocephalus aceratus

Lönnberg: light and electron microscopic observations. Journal of Fish Biology, 1981.

19(1): p. 11-28.

36. Sidell, B.D. and K.M. O'Brien, When bad things happen to good fish: the loss of

hemoglobin and myoglobin expression in Antarctic icefishes. J Exp Biol, 2006. 209(Pt

10): p. 1791-802.

37. Lau, Y.T., et al., Evolution and function of the globin intergenic regulatory regions of the

antarctic dragonfishes (Notothenioidei: Bathydraconidae). Mol Biol Evol, 2012. 29(3):

p. 1071-80.

Page 69: hemogen among high-latitude Antarctic notothenioids by ...m044c681x/fulltext.pdfblooded and white-blooded notothenioid fish show reduced hematocrit, which is potentially an adaptive

69

38. Yang, L.V., et al., Hemogen is a novel nuclear factor specifically expressed in mouse

hematopoietic development and its human homologue EDAG maps to chromosome 9q22,

a region containing breakpoints of hematological neoplasms. Mechanisms of

Development, 2001. 104: p. 105-111.

39. Li, C.Y., et al., EDAG regulates the proliferation and differentiation of hematopoietic

cells and resists cell apoptosis through the adtivation of nuclear factor-ĸB. Cell Death

and Differentiation, 2004. 11: p. 1299-1308.

40. Li, C.-Y., et al., Suppression of EDAG gene expression by phorbol 12-myristate 13-

acetate is mediated through down-regulation of GATA-1. Biochimica et Biophysica

Actta, 2008. 2008(1779): p. 606-615.

41. Li, C.-Y., et al., Overexpression of a hematopoietic transcriptional regulator EDAG

induces myelopoiesis and suppresses lymphopoiesis in transgenic mice. Leukemia, 2007.

21: p. 2277-2286.

42. Jiang, J., et al., Hemgn is a direct transcriptional target of HOXB4 and induces

expansion of murine myeloid progenitor cells. Blood, 2010. 116(5): p. 711-719.

43. Ding, Y.L., et al., Over-expression of EDAG in the myeloid cell line 32D: induction of

GATA-1 expression and erythroid/megakaryocytic phenotype. J Cell Biochem, 2010.

110(4): p. 866-74.

44. An, L.-L., et al., High expression of EDAG and its significance in AML. Leukemia, 2005.

19: p. 1499-1502.

45. Yang, L.V., et al., Alternative promoters and polyadenylation regulate tissue-specific

expression of Hemogen isoforms during hematopoiesis and spermatogenesis. Dev Dyn,

2003. 228(4): p. 606-16.

46. Nakata, T., et al., Chicken hemogen homolog is involved in the chicken-specific sex-

determining mechanism. PNAS, 2013. 110(9): p. 3417-3422.

47. Kruger, A., et al., RP59, a marker for osteoblast recruitment, is also detected in primitive

mesenchymal cells, erythroid cells, and megakaryocytes. Dev Dyn, 2002. 223(3): p. 414-

8.

48. Wurtz, T., et al., A new protein expressed in bone marrow cells and osteoblasts with

implication in osteoblast recruitment. Exp Cell Res, 2001. 263(2): p. 236-42.

49. Shao, J., et al., Sequencing and bioinformatics analysis of the differentially expressed

genes in herniated discs with or without calcification. Int J Mol Med, 2017. 39(1): p. 81-

90.

50. Iwasaki, H., et al., GATA-1 Converts Lymphoid and Myelomonocytic Progenitors into the

Megakaryocyte/Erythrocyte Lineages. Immunity, 2003. 19: p. 451-462.

51. Pevny, L., et al., Development of hematopoietic cells lacking transcription factor GATA-

1. Development, 1995. 121: p. 163-172.

52. Galloway, J.L., et al., Loss of gata1 but not gata2 converts erythropoiesis to myelopoiesis

in zebrafish embryos. Dev Cell, 2005. 8(1): p. 109-16.

53. Belele, C.L., et al., Differential requirement for Gata1 DNA binding and transactivation

between primitive and definitive stages of hematopoiesis in zebrafish. Blood, 2009.

114(25): p. 5162-72.

54. Lyons, S.E., et al., A nonsense mutation in zebrafish gata1 causes the bloodless

phenotype in vlad tepes. Proc Natl Acad Sci U S A, 2002. 99(8): p. 5454-9.

55. Zheng, W.W., et al., EDAG positively regulates erythroid differentiation and modifies

GATA1 acetylation through recruiting p300. Stem Cells, 2014. 32(8): p. 2278-89.

Page 70: hemogen among high-latitude Antarctic notothenioids by ...m044c681x/fulltext.pdfblooded and white-blooded notothenioid fish show reduced hematocrit, which is potentially an adaptive

70

56. Blobel, G.A., CREB-binding protein and p300: molecular integrators of hematopoietic

transcription. Blood, 2000. 95(3): p. 745-755.

57. Sandberg, M.L., et al., c-Myb and p300 regulate hematopoietic stem cell proliferation

and differentiation. Dev Cell, 2005. 8(2): p. 153-66.

58. Edgar, R., MUSCLE: multiple sequence alignment with high accuracy and high

throughput. Nucleic Acids Res, 2004. 32(5): p. 1792-1797.

59. Kumar, S., G. Stecher, and K. Tamura, MEGA7: Molecular Evolutionary Genetics

Analysis Version 7.0 for Bigger Datasets. Mol Biol Evol, 2016. 33(7): p. 1870-4.

60. Hall, T.A., BioEdit: a user-friendly biological sequence alignment editor and analysis

program for Windows 95/98/NT. Nucleic Acids Symposium Series, 1999. 41: p. 95-98.

61. Stamatakis, A., RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with

thousands of taxa and mixed models. Bioinformatics, 2006. 22(21): p. 2688-90.

62. Stamatakis, A., RAxML Version 8: A tool for Phylogenetic Analysis and Post-Analysis of

Large Phylogenies. Bioinformatics, 2014.

63. Yang, Z., PAML: a program package for phylogenetic analysis by maximum likelihood.

Comput Appl Biosci, 1997. 13(5): p. 555-556.

64. Yang, Z., PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol, 2007.

24(8): p. 1586-91.

65. Zhang, J., R. Nielsen, and Z. Yang, Evaluation of an improved branch-site likelihood

method for detecting positive selection at the molecular level. Mol Biol Evol, 2005.

22(12): p. 2472-9.

66. Burri, R., et al., Adaptive divergence of ancient gene duplicates in the avian MHC class II

beta. Mol Biol Evol, 2010. 27(10): p. 2360-74.

67. Yang, Z. and R. Nielsen, Codon-Substitution Models for Detecting Molecular Adaptation

at Indiviual Sites Along Specific Lineages. Mol Biol Evol, 2002. 19(6): p. 908-917.

68. Yang, Z. and W.J. Swanson, Codon-Substiution Models to Detect Adaptive Evolution that

Account for Heterogeneous Selective Pressures Among Site Clases. Mol Biol Evol, 2002.

19(1): p. 49-57.

69. Wertheim, J.O., et al., RELAX: detecting relaxed selection in a phylogenetic framework.

Mol Biol Evol, 2015. 32(3): p. 820-32.

70. Pond, S.L.K. and S.V. Muse, HyPhy: Hypothesis Testing Using Phylogenies, in

Statistical methods in molecular evolution. 2005, Springer: New York, NY. p. 125-181.

71. Delport, W., et al., Datamonkey 2010: a suite of phylogenetic analysis tools for

evolutionary biology. Bioinformatics, 2010. 26(19): p. 2455-7.

72. Weaver, S., et al., Datamonkey 2.0: a modern web application for characterizing

selective and other evolutionary processes. Mol Biol Evol, 2018.

73. Shin, S.C., et al., The genome sequence of the Antarctic bullhead notothen reveals

evolutionary adaptations to a cold environment. Genome Biol, 2014. 15(468).

74. Kearse, M., et al., Geneious Basic: an integrated and extendable desktop software

platform for the organization and analysis of sequence data. Bioinformatics, 2012.

28(12): p. 1647-9.

75. Zerbino, D.R., et al., Ensembl 2018. Nucleic Acids Res, 2018. 46(D1): p. D754-D761.

76. Broos, S., et al., ConTra v2: a tool to identify transcription factor binding sites across

species, update 2011. Nucleic Acids Res, 2011. 39(Web Server issue): p. W74-8.

77. Nei, M. and S. Kumar, Molecular Evolution and Phylogenetics. 2000, Oxford: Oxford

University Press.

Page 71: hemogen among high-latitude Antarctic notothenioids by ...m044c681x/fulltext.pdfblooded and white-blooded notothenioid fish show reduced hematocrit, which is potentially an adaptive

71

78. Yu, G., et al., ggtree: anrpackage for visualization and annotation of phylogenetic trees

with their covariates and other associated data. Methods in Ecology and Evolution,

2017. 8(1): p. 28-36.

79. Letunic, I. and P. Bork, Interactive tree of life (iTOL) v3: an online tool for the display

and annotation of phylogenetic and other trees. Nucleic Acids Res, 2016. 44(W1): p.

W242-5.

80. Near, T.J. and C.H. Cheng, Phylogenetics of notothenioid fishes (Teleostei:

Acanthomorpha): inferences from mitochondrial and nuclear gene sequences. Mol

Phylogenet Evol, 2008. 47(2): p. 832-40.

81. Near, T.J., J.J. Pesavento, and C.-H.C. Cheng, Mitochondrial DNA, morphology, and the

phylogenetic relationships of Antarctic icefishes (Notothenioidei: Channichthyidae).

Molecular Phylogenetics and Evolution, 2003. 28(1): p. 87-98.

82. Yang, Z. and R. Nielsen, Estimating Synonymous and Nonsynonymous Substitution Rates

Under Realistic Evolutionary Models. Mol Biol Evol, 2000. 17(1): p. 32-43.

83. Librado, P. and J. Rozas, DnaSP v5: a software for comprehensive analysis of DNA

polymorphism data. Bioinformatics, 2009. 25(11): p. 1451-2.

84. Rozas, J., et al., DnaSP, DNA polymorphism analyses by the coalescent and other

methods. Bioinformatics, 2003. 19(18): p. 2496-2497.

85. Sharp, P.M. and W.-H. Li, The codon adaptation index- a measure of directional

synonymous codon usage bias, and its potential applications. Nucleic Acids Res, 1987.

15(3): p. 1281-1295.

86. Morton, B.R., Codon Use and the Rate of Divergence of Land Plant Chloroplast Genes.

Mol Biol Evol, 1994. 11(2): p. 231-238.

87. Sugawara, T., Y. Terai, and N. Okada, Natural Selection of the Rhodopsin Gene During

the Adaptive Radiation of East African Great Lakes Cichlid Fishes. Mol Biol Evol, 2002.

19(10): p. 1807-1811.

88. Ota, T., et al., Positive Darwinian Selection Operating on the Immunoglobulin Heavy

Chain of Antarctic Fishes. Journal of Experimental Zoology (Mol Dev Evol), 2003.

295B: p. 45-58.

89. Schmittgen, T.D. and K.J. Livak, Analyzing real-time PCR data by the comparative CT

method. Nature Protocols, 2008. 3(6): p. 1101-1108.

90. Gao, P., Functional Study of Hemogen Knockout Mouse Model. Theses and Dissertations

(ETD), 2013. Paper 92.

91. Boulikas, T., Putative Nuclear Localization Signals (NLS) in Protein Transcription

Factors. 1994, 1994. 55: p. 32-58.

92. Lahti, D.C., et al., Relaxed selection in the wild. Trends Ecol Evol, 2009. 24(9): p. 487-

96.

93. Hunt, B.G., et al., Relaxed selection is a precursor to the evolution of phenotypic

plasticity. PNAS, 2011. 108(38): p. 15936-15941.

94. Go, Y., et al., Lineage-specific loss of function of bitter taste receptor genes in humans

and nonhuman primates. Genetics, 2005. 170(1): p. 313-26.

95. Murrell, B., et al., Detecting individual sites subject to episodic diversifying selection.

PLoS Genet, 2012. 8(7): p. e1002764.

96. Yang, Z., W.S. Wong, and R. Nielsen, Bayes empirical bayes inference of amino acid

sites under positive selection. Mol Biol Evol, 2005. 22(4): p. 1107-18.

Page 72: hemogen among high-latitude Antarctic notothenioids by ...m044c681x/fulltext.pdfblooded and white-blooded notothenioid fish show reduced hematocrit, which is potentially an adaptive

72

97. Anisimova, M., J.P. Bielawski, and Z. Yang, Accuracy and power of the likelihood ratio

test in detecting adaptive molecular evolution. Mol Biol Evol, 2001. 18(8): p. 1585-1592.

98. Yang, Z., Adaptive molecular evolution, in Handbook of statistical genetics, D.J.

Balding, M. Bishop, and C. Cannings, Editors. 2001, Wiley: New York. p. 327-350.

99. Anisimova, M., J.P. Bielawski, and Z. Yang, Accuracy and power of Bayes prediction of

amino acid sites under positive selection. Mol Biol Evol, 2002. 19(6): p. 950-958.

100. Yang, Z. and J.P. Bielawski, Statistical methods for detecting molecular adaptation.

TREE, 2000. 15(12): p. 496-503.

101. Yang, Z., Inference of selection from multiple species aignments. Current Opinion in

Genetics and Development, 2002. 12: p. 688-694.

102. Bielawski, J.P. and Z. Yang, Maximum likelihood methods for detecting adaptive

evolution after gene duplication, in Genome Evolution, A. Meyer and Y. Van de Peer,

Editors. 2003, Kluwer Academic Publishers: Netherlands. p. 201-212.

103. Zhao, H., et al., Rhodopsin molecular evolution in mammals inhabiting low light

environments. PLoS One, 2009. 4(12): p. e8326.

104. Veilleux, C.C., E.E. Louis, Jr., and D.A. Bolnick, Nocturnal light environments influence

color vision and signatures of selection on the OPN1SW opsin gene in nocturnal lemurs.

Mol Biol Evol, 2013. 30(6): p. 1420-37.

105. Markova, S., J.B. Searle, and P. Kotlik, Relaxed functional constraints on triplicate

alpha-globin gene in the bank vole suggest a different evolutionary history from other

rodents. Heredity (Edinb), 2014. 113(1): p. 64-73.

106. Feng, P., et al., Massive losses of taste receptor genes in toothed and baleen whales.

Genome Biol Evol, 2014. 6(6): p. 1254-65.

107. Dainat, J., et al., GLADX: an automated approach to analyze the lineage-specific loss

and pseudogenization of genes. PLoS One, 2012. 7(6): p. e38792.

108. Ohno, S., Evolution by gene duplication. 1970, New York: Allen & Unwin.

109. Lynch, M. and J.S. Conery, The Evolutionary Fate and Consequences of Duplicate

Genes. Science, 2000. 290(5494): p. 1151-1155.

110. Christoffels, A., et al., Fugu genome analysis provides evidence for a whole-genome

duplication early during the evolution of ray-finned fishes. Mol Biol Evol, 2004. 21(6): p.

1146-51.

111. Hoegg, S., et al., Phylogenetic timing of the fish-specific genome duplication correlates

with the diversification of teleost fish. J Mol Evol, 2004. 59(2): p. 190-203.

112. Postlethwait, J.H., et al., Zebrafish comparative genomics and the origins of vertebrate

chromosomes. Genome Res, 2000. 10(1): p. 1890-1902.

113. Brunet, F.G., et al., Gene loss and evolutionary rates following whole-genome

duplication in teleost fishes. Mol Biol Evol, 2006. 23(9): p. 1808-16.

114. Glasauer, S.M.K. and S.C.F. Neuhauss, Whole-genome duplication in teleost fishes and

its evolutionary consequences. Mol Genet Genomics, 2014. 289: p. 1045-1060.

115. Rastogi, S. and D.A. Liberles, Subfunctionalization of duplicated genes as a transition

state to neofunctionalization. BMC Evol Biol, 2005. 5: p. 28.

116. Amoutzias, G.D., et al., Posttranslational regulation impacts the fate of duplicated genes.

Proc Natl Acad Sci U S A, 2010. 107(7): p. 2967-71.

117. Qian, W., et al., Maintenance of duplicate genes and their functional redundancy by

reduced expression. Trends in Genetics, 2010. 26(10).

Page 73: hemogen among high-latitude Antarctic notothenioids by ...m044c681x/fulltext.pdfblooded and white-blooded notothenioid fish show reduced hematocrit, which is potentially an adaptive

73

118. Amores, A., et al., Cold Fusion: Massive Karyotype Evolution in the Antarctic Bullhead

Notothen Notothenia coriiceps. G3 (Bethesda), 2017. 7(7): p. 2195-2207.

119. Chen, D.L., et al., EDAG-1 promotes proliferation and invasion of human thyroid cancer

cells by activating MAPK/Erk and AKT signal pathways. Cancer Biol Ther, 2016. 17(4):

p. 414-21.

120. Lü, J.W.-X.X.S.-Y.W.Y.J.C.-Y.L.W.-M.C.X.-M.Y., Overexpression of EDAG-1 in

NIH3T3 cells leads to malignant transformation. heng wu hua xue yu sheng wu wu li xue

bao Acta biochimica et biophysica Sinica, 2002. 34(1): p. 95-98.

121. O'Leary, N.A., et al., Reference sequence (RefSeq) database at NCBI: current status,

taxonomic expansion, and functional annotation. Nucleic Acids Res, 2016. 44(D1): p.

D733-45.

122. Jones, F.C., et al., The genomic basis of adaptive evolution in threespine sticklebacks.

Nature, 2012. 484(7392): p. 55-61.

123. Betancur, R.R., et al., The tree of life and a new classification of bony fishes. PLoS Curr,

2013. 5.

124. Read, T.D., et al., Draft sequencing and assembly of the genome of the world's largest

fish, the whale shark: Rhincodon typus Smith 1828. BMC Genomics, 2017. 18(1): p. 532.

125. Venkatesh, B., et al., Elephant shark genome provides unique insights into gnathostome

evolution. Nature, 2014. 505(7482): p. 174-9.

126. Arnaud, L., et al., A dominant mutation in the gene encoding the erythroid transcription

factor KLF1 causes a congenital dyserythropoietic anemia. Am J Hum Genet, 2010.

87(5): p. 721-7.

127. Devlin, E.E., et al., A transgenic mouse model demonstrates a dominant negative effect of

a point mutation in the RPS19 gene associated with Diamond-Blackfan anemia. Blood,

2010. 116(15): p. 2826-35.

128. Paz-Priel, I. and A.D. Friedman, C/EBPα Dysregulation in AML and ALL. Crit Rev

Oncog, 2011. 16(1-2): p. 93-102.

129. Yang, L.V., et al., The GATA site-dependent hemogen promoter is transcriptionally

regulated by GATA1 in hematopoietic and leukemia cells. Leukemia, 2006. 20(3): p. 417-

25.

130. Gardiner, M.R., et al., A global role for zebrafish klf4 in embryonic erythropoiesis. Mech

Dev, 2007. 124(9-10): p. 762-74.

131. Soza-Ried, C., et al., Essential role of c-myb in definitive hematopoiesis is evolutionarily

conserved. PNAS, 2010. 107(40): p. 17304-17308.

132. Takahashi, K., et al., Phylogenetic relationships and ancient incomplete lineage sorting

among cichlid fishes in Lake Tanganyika as revealed by analysis of the insertion of

retroposons. Mol Biol Evol, 2001. 18(11): p. 2056-2066.

133. Marino, I.A., et al., Evidence for past and present hybridization in three Antarctic icefish

species provides new perspectives on an evolutionary radiation. Mol Ecol, 2013. 22(20):

p. 5148-61.