TECHNICAL NOTE

Identification of 104 rapidly-evolving nuclear protein-coding markers for amplification across scaled reptiles using genomic resources

Daniel M. Portik • Perry L. Wood Jr. • Jesse L. Grismer • Edward L. Stanley • Todd R. Jackman

Received: 30 November 2010 / Accepted: 31 May 2011 / Published online: 17 June 2011

© Springer Science+Business Media B.V. 2011

Abstract As the fields of molecular systematics and phylogeography advance, it is necessary to incorporate multiple loci in both population-level and species-level inference. Here, we present primer sets for 104 intronless orthologous exons designed for amplification in all squamates. When comparing the Anolis genome to the Gallus genome, all the markers have less than 67.2% DNA sequence identity, which is the percent identity of the first third of the commonly used nuclear marker RAG-1. The rate of evolution in these markers is therefore greater than that of commonly used nuclear markers, and we demonstrate their usefulness for both phylogeographic and phylogenetic studies.

Keywords Nuclear markers · Squamates · Primers · Marker development · Intronless exons

A recent trend in both phylogenetics and phylogeography is the development of new analytical techniques that require the use of multiple nuclear loci (Brito and Edwards 2009; Edwards 2009; Heled and Drummond 2008; Hey 2010; Hey and Nielsen 2004; Pritchard et al. 2000); however, accurate estimates are only possible with a large number of loci (Leaché and Rannala 2010). As the results of these analyses have important implications for species delimitation, identification of management units, and conservation policy, it is critical that a library of nuclear markers is readily available to researchers. Anonymous nuclear loci are not time- or cost-effective for large-scale studies, and typically exhibit low levels of variation and few informative sites (Jennings and Edwards 2005; Lee and Edwards 2008). The importance of developing nuclear markers with targeted levels of informative sites has long been recognized (Graybeal 1994), and an attempt to develop many informative nuclear protein-coding loci (NPCL) has been made (Townsend et al. 2008). However, because these markers were developed based on distant genomes (pufferfish and human), they are not ideal for use within any one particular vertebrate group. Here, we describe a method for developing nuclear markers from existing genomic resources and present 104 rapidly-evolving NPCL designed primarily for squamates using the Anolis and Gallus genomes.

Selection of candidate genes was the result of a multi-step filtering process involving three databases. First, a database of intronless genes was obtained from the SEGE Database (Sakharkar and Kangueane 2004), which contains amino acid sequences of all intronless human genes. Genes

Electronic supplementary material The online version of this article (doi:10.1007/s12686-011-9460-1) contains supplementary material, which is available to authorized users.

D. M. Portik · P. L. Wood Jr. · J. L. Grismer · E. L. Stanley · T. R. Jackman
Department of Biology, Villanova University, 800 Lancaster Avenue, Villanova, PA 19085-1699, USA

D. M. Portik (✉)
Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, 3101 Valley Life Sciences Building, Berkeley, CA 94720-3160, USA
e-mail: [email protected]

Present Address: J. L. Grismer
Natural History Museum and Biodiversity Research Center and Department of Ecology and Evolutionary Biology, University of Kansas, 1345 Jayhawk Blvd, Dyche Hall, Lawrence, KS 66045, USA

Present Address: E. L. Stanley
Department of Herpetology, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024, USA

Conservation Genet Resour (2012) 4:1–10

DOI 10.1007/s12686-011-9460-1

Table 1 A list of 70 genes, primer sequences, percent amino acid identity (A.A. %ID) between Anolis and Gallus, and results from initial screens

Gene | Primer | Sequence 5' to 3' | A.A. %ID | Expected base pairs | Results (T, V, A, L, C, B; column positions not preserved)
ABL1 | F1 | MCTGCCTCGRAAGCGGGCMA | 61.8 | 783 | AF
 | R1 | GGCTGRCGGGTCTTCCGCAG | | |
AKAP5 | F1 | YTGTTTTAAGAARAGGAAGAAGTCCTGTGA | 50.4 | 1361 | MB
 | R1 | RMGCCATYTCGTTSACCAGCTGCTCRAT | | |
ALMS1 | F1 | AAAAGRATYCAGAAAAAAATGAA | 47.0 | F1-R1: 1199; F2-R2: 488; F3-R3: 706; F4-R3: 484; F6-R6: 1468 | MB AF
 | F2 | MCCWCTTACTAGRTCYCAGTCTGAMAACT | | |
 | F3 | RCYTGGAAYATGAGCCAGT | | |
 | F4 | CCWCTTACTAGRTCYCAGTCTGA | | |
 | F6 | AGAGGRRTYCCWGATGTTTC | | |
 | R1 | KTCAGACTGRGAYCTAGTAAGWGG | | |
 | R2 | RGCTGCCTGCCCCAGAACTAG | | |
 | R3 | CTGCYTKCCCCAGAACWAG | | |
 | R6 | ACTGGCTCATRTTCCARG | | |
ALPK2 | F1 | AGCAGGAWGAGTGCCAWAASG | 45.5 | F1-R1: 623; F2-R2: 898 | MB MB
 | F2 | MYTGAATGCACAAAGTGACAGT | | |
 | R1 | WRACATTTTCTTTCACTTCCAGAC | | |
 | R2 | CTACSTTWTGGCACTCWTCCTG | | |
ANKRD50 | F1 | RGCCTGGGGWGGGCATGARGAYA | 54.7 | 425 | SB
 | R1 | GGCCCATGGAGGCYGCAGCCAG | | |
APOB | F1 | KCCACGTATCCCACAYWCWGTRACTGCTCC | 65.3 | F1-R1: 1250; F2-R2: 1567; F3-R3: 627 | AF
 | F2 | WCTGTGGGAASARRCAGGSCTGAAGAA | | |
 | F3 | YCCACTGACAMTGSAGARTGAAATGAATGC | | |
 | R1 | TCTTCAGSCCTGYYTSTTCCCACAG | | |
 | R2 | RTCCATGTRMCCTYCCAAAGATCC | | |
 | R3 | AGTCCTGCMACATYCASATTAACTCTGTG | | |
APOB Exon 28 | F1 | YTGCGGGAGGAATAYTTTGA | 59.0 | 544 | SB
 | R1 | YAYRTCTATTCTRAGCTCTCCTTSRCGAA | | |
ARHGAP21 | F1 | AATTGGCCAGTGAGTTTTTAGMAGCTGATGTCA | 76.4 | F1-R1: 909; F2-R2: 887 | SB AF AF
 | F2 | CTTGGGGATCTGGAAAGGAT | | |
 | R1 | TTTTGGGCTCTCCASTTTCACCCGTMTTA | | |
 | R2 | CARTCTGCGGCTGTCYAACTT | | |
ASXL1 | F1 | WGCCAGCARCCCTYTGGTGR | 58.2 | 1223 | AF
 | R1 | ACGCCCCGCAGCCYTTRCACAT | | |

Table 1 continued

Gene | Primer | Sequence 5' to 3' | A.A. %ID | Expected base pairs | Results (T, V, A, L, C, B; column positions not preserved)
BBS10 | F1 | SGTGCATGGRCTTAATGAGCARC | 51.4 | F1-R1: 563; F1-R2: 846; F2-R3: 506; F3-R4: 917 F3-R5: | MB AF MB AF
 | F2 | ATGGSGCMTTCAAMATGCT | | |
 | F3 | ATGSMMTTAATGAGCARCAYGC | | |
 | R1 | AMAGATAATAATTTAAYAGGATTTCAAAG | | |
 | R2 | AGCCTYTTAATRCCAATGAT | | |
 | R3 | TTTCAAAGAWWCCWCCTACC | | |
 | R5 | GTRGCCKYTTAAYRCCAATGAT | | |
BCORL1 | F1 | KGGGGTCTCCTCCACMCTCRCCTTSCC | 56.6 | 847 | AF
 | R1 | GCGGGGCTCSCCCTGRTCAATR | | |
BLC9L | F1 | RTGGATCCAKCCATGTTTGCTGGG | 60.3 | 669 | MB
 | R1 | GGCGTCTGCGGGGACTTGAG | | |
BRCA1 | F1 | YGAAGAYAAAATATTTGGGAGAACA | 47.1 | F1-R1: 587; F2-R2: 590; F3-R3: 925; F4-R4: 549 | AF MB AF
 | F2 | GAAGAYAAAATATTTGGGARAACAT | | |
 | F3 | AGGCGGAGCAGGAGGCTTCA | | |
 | F4 | GCCAGAAATGCCGTCCACGC | | |
 | R1 | RCTGAAGCCTCCTGCTCCGCC | | |
 | R2 | AGYARCTGAAGCCTCCTGCT | | |
 | R3 | GCGTGGACGGCATTTCTGGC | | |
 | R4 | GCTGGGTCCAGCAAGCCCTC | | |
C10ORF71 | F1 | CCYAAWCGCAGAGGTGGTTCTTGT | 53.4 | F1-R4: 766; F3-R5: 710 | SB MB AF
 | F3 | AGYATTGGGAGTCTSCTSGATGACA | | |
 | R4 | YATATTTGCCASTTTCTGGRTCATAAAATGTC | | |
 | R5 | CTGCCTTTYCTTGCTGGYT | | |
CASC5 | F1 | MAATAAYGACATGGAAATAATAAAG | 42.1 | F1-R1: 1600; F2-R2: 2492 | MB AF
 | F2 | CATMTTYTCAGAKGAAAATGAAATGGATATGA | | |
 | R1 | RTTTGGTAGYTTAGGTGGAAAGATACC | | |
 | R2 | GCYTWMTTATTTCCATGTCRTTATT | | |
CGNL1 | F1 | WGTCAGCATYMGWGTCCAAGGRATMGATGG | 58.1 | 916 | SB
 | R1 | ACCAGTGTCAATATCACGACC | | |
CKAP5 EXON1 | F1 | MAYYMACCCAGMTGWAAGGRCTTCTGC | 60.4 | 638 | AF
 | R1 | TGCCAGTYTGTTCAGCCCAGG | | |
DMN | F1 | TCMAAGGACCCCAAGCATCA | 55.3 | F2-R2: 550; F1-R1: 1600 | MB MB
 | F2 | MATTGTTGCWGAAGTCAGTCTGTCYC | | |
 | R1 | RTTGGTWACATCCATYTGAAAAG | | |
 | R2 | KACTTYTTCTGTKGTTTGGATTTCTTYAG | | |

Table 1 continued

Gene | Primer | Sequence 5' to 3' | A.A. %ID | Expected base pairs | Results (T, V, A, L, C, B; column positions not preserved)
EXPH5 | F1* | AATAAACTKGCAGCTATGTACAAAACAAGTC | 49.8 | F1-R1: 1008; F2-R2: 567; F3-R3: 1498 | SB MB AF SB MB AF
 | F2 | TGGATGAGCAGCTGGCMCAGG | | |
 | F3 | RCGGTCTTCAGAAGGKAGCT | | |
 | R1* | AAYCGCCCTTCTGTGAGTGACCTCT | | |
 | R2 | TCCGCTTGGTTTGGTGRCWA | | |
 | R3 | RCTKCTMCTKGRRGAACCAGTRATGCTG | | |
FASTKD5 | F1 | WCCCACCCACYCYRTGCTGAAT | 58.6 | 1096 | AF
 | R1 | KGTGTGAGGCARRATCATGTGGTKT | | |
GEMIN4 | F1 | CATCCTGCACGGCGGSTTCCTG | 60.9 | 865 | SB
 | R1 | YTCGGCYTCCTTCACYTTCT | | |
GOLGB1 | F1 | RGARAGAGAACAGCTKCAGAAGAAG | 52.0 | F1-R1: 911; F2-R2: 713 | AF AF
 | F2 | YTGCTGCTGGARCTSMAAGARGCTCA | | |
 | R1 | GYTCCTCYTTCAGTGACTTRCTC | | |
 | R2 | TTTTKMAGGCTYTGGATTTSYTCCKTCAG | | |
GP5 | F1 | SCCTGGCSGAGCTGCAGGARC | 46.0 | | SB AF
 | R1 | GCAGGGARACGTTCTGCAGYTTST | | |
GPRIN3 | F1 | YCTGCCAAGCTYTCMTTGGTCAC | | F1-R1: 953; F2-R2: 1370; F3-R3: 1258; F983-R2156: 1182 | MB AF AF AF
 | F2 | TTCAGAGACACAGGYACAATGACAGT | | |
 | F3 | YATCYTTGCTGCCTTCYTAAA | | |
 | F983 | YACCAGYCCMAGYATCYTTGCTGCCT | | |
 | R1 | YCTTTTARGAAGGCAGCAARGA | | |
 | R2 | CTCCCTTTCAGTTTCTTATTTGARGAYGTATCT | | |
 | R2165 | GGTTCTGGATAGCGATYCCCAGGGA | | |
 | R3 | RTTRTGYTGCCTCCCTTTCA | | |
HPS6 | F1 | MYCGGGTSGTGGCCTTCCARC | 65.4 | F1-R1: 1057 | SB AF
 | R1 | CAGGCCTCCTCGAACACCATCTCMACC | | |
KIAA1107 | F1 | RTGCCATGATTTYCTTGGWAGAAGCAG | 58.3 | 1032 | SB
 | R1 | WCGTATTCAGTAGRTGCTGGTTSACT | | |
KIAA1217 | F1 | WYGGAGAAYATTGCTTTCATG | 72.0 | F1-R1: 568; F2-R2: 642 | SB SB SB
 | F2 | KTGYTTGGACAAAAARCCWGTCAT | | |
 | R1 | RCGTATTGCTTGRGTWAGAGC | | |
 | R2 | RATTTCAAAYCTTTTWGCCTCYTTATGT | | |
KIAA1549 | F1 | RTAACAAGYAATGAGGCAKTCTTAA | 53.7 | 831 | SB SB SB SB
 | R1 | RTATGRTCTMGTGAAAGGCACTG | | |

Table 1 continued

Gene | Primer | Sequence 5' to 3' | A.A. %ID | Expected base pairs | Results (T, V, A, L, C, B; column positions not preserved)
KIAA2018 | F1 | RCCCATCCYTACCTATGCAGCCATTA | 61.3 | F1-R1: 671; F2-R2: 935 | SB SB AF SB SB AF
 | F2 | AGCTACTCTGCTGAGGCACTTATTGG | | |
 | R1 | YTGCCCAGCCATTTGTGATATGCTYTGA | | |
 | R2 | CGCTGAGTARCTGAAGAATTTGGCTGACGA | | |
KIAA2026 | F1 | TCCCAATGGACCTTCAGCCAC | 64.5 | 1206 | AF
 | R1 | ACATGGCTGCCAGCTCCAAG | | |
KIF24 | F1* | SAAACGTRTCTCCMAAACGCATCC | 49.8 | 518 | SB AF SB SB AF
 | R1* | WGGCTGCTGRAAYTGCTGGTG | | |
LMTK2 | F1 | WCCAGASAAGAGASCAACRGCAGAWGARGTGC | 54.4 | F1-R1: 407; F2-R2: 1656 | AF
 | F2 | TTGGCCATGARATGGAGGAAGT | | |
 | R1 | YTTCCTGMCCACTTYCATCYTGWGCTCC | | |
 | R2 | ACATCTAAAGAATCYAATGA | | |
MAP1A | F1 | RGACACAGTSAACAGCATSCCTTCCTC | 50.6 | F1-R1: 440; F2-R2: 1180 | AF AF SB
 | F2 | KACCCCCACCAGCGCMGGRC | | |
 | R1 | KGCGCTGGTGGGGGTMGG | | |
 | R2 | YCYTCTGGAAACCAYACTTTCT | | |
MC1R | F1 | GACATGCTGGTGAGYRTCAG | | F1-R1: 551; F232-R879: 648 | SB
 | F232 | CCRATGTACTACTTCATCTGCTG | | |
 | R1 | GGGCAGRTGABGATGAGG | | |
 | R879 | GATGATGAGGATGAGGWAGAGG | | |
MLLPLUS | F1 | CGCACCGTSAAGGTSACYCTGACTCC | 62.0 | F1-R1: 1104; F2-R1: 706 | SB
 | F2 | MACAGGSAAGAAGCGWGGGAAGMGGT | | |
 | R1 | TGTCYTTGCCCCGGTTGCTR | | |
MXRA5 | F1 | YATTTTGGCAAARGTCCGTGGGAARA | 46.2 | F1-R1: 2317; F2-R2: 1020 | AF SB SB AF SB AF
 | F2 | KGCTGAGCCTKCCTGGGTGA | | |
 | R1 | WTGTGCTGCATATGCTGTWATCTCWGGT | | |
 | R2 | YCTMCGGCCYTCTGCAACATTK | | |
MYST3 | F1 | RCAGAACATGGAGACTAGCCC | 50.3 | F1-R1: 1309 | MB
 | R1 | YCCCATCATYCCCATYTGCATCTGC | | |
N4BP2 | F1 | SAACAAACWATGGGRCAGAGRGTSAAAARA | 54.5 | 833 | SB
 | R1 | TCACTTTCTTCCACAWATGTRCTTTT | | |
NAIP | F1 | AAGGAGAGGCCGGCAGTGGA | 55.2 | 809 | AF AF
 | R1 | TGGCCAAAGACGTTGGGCTGTG | | |
NPAT | F1 | YAATGCTGTTTCCAGCATCAC | 52.7 | F1-R1: 455 | MB
 | R1 | GWTCTRGGAGGAGTCCGYAATKGCTG | | |

Table 1 continued

Gene | Primer | Sequence 5' to 3' | A.A. %ID | Expected base pairs | Results (T, V, A, L, C, B; column positions not preserved)
OMG | F1 | RYTGCCMCACGGACTTCA | 66.2 | F1-R1 F2-R2 | AF MB
 | F2 | CCAGGAAYATGGTGGAAAGRG | | |
 | R1 | YKGCRGCCGCTTCGTAA | | |
 | R2 | YTGGARTAGTTTGTGGTRATR | | |
PDZD2 | F1 | YTKGAGTCAGATGAWGARCAAATTGARA | 47.0 | F1-R1: 1535; F2-R2: 704 | MB AF MB SB
 | F2 | GGAACAARCCAGGRCCAAAG | | |
 | R1 | GTGCGCTGAATTTCCTGACC | | |
 | R2 | CTCAATGAAWSTGCGCTGAATT | | |
PHF3 | F1 | TGCCAGTGGATGACATYCTTCARAGCC | 55.0 | F1-R1: 1292; F1-R2: 1142 | SB AF MB
 | R1 | CCAAGGRTCACTRTGRCGCCTY | | |
 | R2 | KACAGGKGGCCACGGCATCAW | | |
PIGW | F1 | WGTGGAYTTCCCGCAGTWTCCRCGRCG | 61.9 | 675 | MB
 | R1 | AGGTTSGCCATSCGGCGGGA | | |
PKDREJ | F1360 | GTAGTTTCAVCAGGGTGCAAAGGGTATCTTGT | 57.3 | F1360-R2480: 1155; F1900-R2540: 691; GECKOF1-GECKOR1: 604 | SB SB AF
 | F1900 | ATTATATTATATGGTTTGACCTATGGCTACACAAC | | |
 | GECKOF1 | MAWTTTCCATGGTGGTGSAGATATGT | | |
 | GECKOR1 | TCAGTGGCACAAAGACATTGC | | |
 | R2480 | TTTCAGTATCTTTDGCCCTTATTTGCCTCATTC | | |
 | R2540 | GAHGGCAGTGGCTTTTACTAATCACAA | | |
PRDM2 | F1 | CAYCAGCGSMGGGTYCACGAGCG | 56.4 | 729 | AF
 | R1 | TCGAAGWRCCGGRCTGTGCTG | | |
PRLR | F2 | ARYGAAGACCAGCAACTGATGC | 60.3 | 770 | SB SB
 | R4 | GGCAAGGCCTCCAYTTTG | | |
PseudoZ Exon 15 | F1 | SCAGCCMCTGGAYTTCTCAGG | 52.7 | 1022 | SB
 | R1 | KGGAKCCAAASAACGAKGAGATG | | |
PseudoZ Exon 21 | F1 | YTGATMAARGGCTCCGTGGATGAAG | 54.4 | 1115 | AF SB
 | R1 | RTTCAGGTCSATYTGGATCATATCTTTC | | |
RAI4 | F1 | WCAGAKATYTCAGARAATGGCTCTGATC | 65.5 | 437 | AF
 | R1 | YTTKTCARYTTGTCCTGYAGCCTTTTRAT | | |
RIF1 | F1 | CCACGATACCAYACCMGRAG | 50.0 | F1-R1: 619; F2-R2: 898 | MB
 | F2 | MTGYCAGCACAARAGAAGCA | | |
 | R1 | YTGCTTCTYTTGTGCTGRCA | | |
 | R2 | GGAWGCTGAAGGAGACCA | | |
RIKEN | F2 | STTTTATTCRWTGCCATCYTTATCCTTAA | 53.1 | 702 | MB SB
 | R1 | YKCCKTTCTAGTTCTCCAGTTATTACAG | | |

Table 1 continued

Gene | Primer | Sequence 5' to 3' | A.A. %ID | Expected base pairs | Results (T, V, A, L, C, B; column positions not preserved)
RLF | F1 | SAAACACCTYCGMAGGGCTCATCC | 56.9 | 981 | AF
 | R1 | CTCCCCTCTCKTGCTTCYAGTY | | |
RMI1 | F1 | GAGTGGAAACCTGGCTATCATCTACAT | 56.8 | 941 | MB
 | R1 | AAACAAATCGTCATCTAATGGAAAGTC | | |
RNASEL | F1 | GGGAAGGAGGAAGCCCTGAGGTT | 63.0 | F1-R1: 501; F2-R1: 734; F3-R2: 725; F4-R2: 519 | MB
 | F2 | WTGGWTGGACRCCSCTTCAYAGTG | | |
 | F3 | GACRCCSCTTCAYAGTGCTG | | |
 | F4 | TCATGGARGCTGCTTGGTAT | | |
 | R1 | YCTTGCWCCYTTTTCACACARC | | |
 | R2 | CTTGCWCCYTTTTYACACA | | |
SEC16A | F1 | WCAGAACCAAGARGTKYTGCCMAGYGAGCC | 50.6 | 1137 | AF
 | R1 | GGCTGRGCCAAGYTRTARCTYTGRTTMGGCTG | | |
SNX19 | F1 | RCCTGCAACGTGCTGYTGCC | 53.8 | 916 | AF
 | R1 | TGTGCTCYCGGGCWGTGATGGT | | |
SOX10 | R1 | ATGTGCTACTTGCATAAATAAGG | 47.8 | 1796 | MB MB
 | F1 | YATWGGCCTTCTAGATGAGGA | | |
SPEN | F1 | YAGCGCMAAGATCAGYCAGATCCC | 61.4 | F1-R1: 773; F1-R2: 1003 | MB
 | R1 | GSGTGACGCTGTGCGGGGGCAT | | |
 | R2 | SACGTCGGRCTGSACGGGGGC | | |
TLR3 | F1 | TGATTGCACYTGTGAMAGYATWGCTTGGTTTG | 66.3 | 510 | MB
 | R1 | WATAATCTTCCTGCTCCTTYTTATGC | | |
TLR4 | F1 | RAGAGTGCTYCGKATYACCAAGA | 59.6 | 1327 | AF
 | R1 | KCGGRACAGKCCCAGYYTYTGCC | | |
TLR5 | F1 | TGGCTRAATGAAACCAATGTMACYYTAGCTGG | 62.0 | 530 | AF
 | R1 | ACACACCASCCATCYTTGAGAAACTGC | | |
TLR7 | F1 | WGGCCCAGGGRCAYRSARAGGGA | 66.0 | 453 | AF
 | R1 | KTGCCACTTKYAATRTACTTGTTKGT | | |
TRANK1 | F1 | SAAGTTCATTGYWGGCTTGAASTGTGAGG | 64.4 | 1135 | SB
 | R1 | WACAGTWCGYTCAGCCTCTCCTGA | | |
UBN1 | F2 | CCTCCCTSGAAGCMGTCTCTAAGGAACT | 67.2 | 966 | MB SB SB
 | R2 | YMACAGCWGGCTTYAGGGAGGAGGTCAG | | |
USPL1 | F1 | WTGGCTTGARTGTGATGAYT | 48.2 | 1276 | MB
 | R1 | YTTYTCCTTTTTAGCWTTAAG | | |
WDR81 | F1 | WTGGGGTYGTSCAGCTCTTTGACCAG | 61.3 | 1109 | SB MB AF SB AF
 | R1 | CTGGGCCACRAAGCAGTCTGTGTASAGGTAGAA | | |

containing over 1,500 base pairs (bp) were selected. Second, we obtained exons consisting of over 1,500 bp from the Gallus gallus database by screening the three databases (ILD, ULD, and EID) at the University of Toledo (Shepelev and Fedorov 2006). Last, we used BioMart (Smedley et al. 2009) to obtain all Anolis exons consisting of over 1,500 bp. We focused on intronless genes because they allow a greater chance for specific lengths of target sequence to be obtained, and conservation of a reading frame allows better alignments. Human intronless genes were searched for in the Gallus genome using HomoloGene (NCBI), keeping only genes with less than 70% amino acid identity. As a precaution, the Human-Gallus genes were compared to the Anolis genome using the Gallus version of the corresponding human ortholog to check for the presence of paralogs in the Anolis genome. The Anolis and Gallus exons obtained were first compared to the Anolis or Gallus genomes using a discontiguous megablast in Geneious v5.3.6 (Drummond et al. 2011) to discover and remove paralogs and to remove genes with DNA identities of over 75%. The tBlastN function was then used for whole exon comparisons to screen the Anolis-Gallus genes for our ultimate cutoff of 67.2% DNA identity (i.e., evolving faster than the first third of RAG-1). Across all databases, only orthologous genes over 1,500 bp with less than 67.2% DNA identity between Anolis and Gallus were selected. Although use of the three databases may appear redundant, there were cases in which useful genes were revealed through the use of one database that were absent from the other two databases.
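The length and identity cutoffs described above can be illustrated with a minimal Python sketch. It assumes the Anolis and Gallus exon sequences have already been aligned to equal length; the function names `percent_identity` and `passes_filter` are hypothetical, and skipping gapped columns is an assumption made here. The original screening was performed with BLAST searches in Geneious, not with this code.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns that carry identical bases.

    Columns with a gap ('-') in either sequence are skipped (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def passes_filter(aligned_anolis: str, aligned_gallus: str,
                  min_length: int = 1500, max_identity: float = 67.2) -> bool:
    """Keep exons longer than min_length bp with identity below max_identity."""
    ungapped_length = len(aligned_anolis.replace("-", ""))
    identity = percent_identity(aligned_anolis, aligned_gallus)
    return ungapped_length > min_length and identity < max_identity


if __name__ == "__main__":
    # Toy sequences only; real exons would exceed 1,500 bp.
    print(percent_identity("ACGT-A", "ACGAAA"))
    print(passes_filter("ACGT", "AGCA", min_length=3))
```

Both comparisons are strict, mirroring the wording "over 1,500 bp" and "less than 67.2%". A paralog check, which the text treats as a separate step, is not covered by this sketch.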

Primers were designed from exon alignments between Gallus and Anolis using Primer3 (Rozen and Skaletsky 2000). Amplifications occurred in 25 μL volume reactions initiated at 95°C for 2 min followed by 35 cycles of 95°C for 35 s, 50°C for 35 s and 72°C for 1 min 35 s (with extension increasing 4 s per cycle) as in Portik et al. (2011).
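The effect of the incrementing extension step on the cycling program can be worked out with a short Python sketch. It assumes that the first cycle uses the stated 1 min 35 s (95 s) extension and that each later cycle adds 4 s; ramp times between temperatures and any final extension or hold step are ignored because the text does not describe them. The function names are hypothetical.

```python
def extension_times(cycles: int = 35, base_extension_s: int = 95,
                    increment_s: int = 4) -> list:
    """Extension time in seconds for each cycle; cycle 1 uses the base time."""
    return [base_extension_s + increment_s * cycle for cycle in range(cycles)]


def total_program_time_s(cycles: int = 35, initial_denaturation_s: int = 120,
                         denaturation_s: int = 35, annealing_s: int = 35,
                         base_extension_s: int = 95,
                         increment_s: int = 4) -> int:
    """Total hold time in seconds, ignoring ramping between temperatures."""
    fixed = cycles * (denaturation_s + annealing_s)
    extension = sum(extension_times(cycles, base_extension_s, increment_s))
    return initial_denaturation_s + fixed + extension


if __name__ == "__main__":
    times = extension_times()
    print(times[0], times[-1])
    print(total_program_time_s() / 60)
```

Under these assumptions the final cycle has a 231 s extension, and the summed hold time is 8,275 s (about 138 min).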

Initial sorting of exons over 1,500 bases produced over 500 exons from the three databases. After the filtering process, 104 genes were selected for primer design in squamates. From these 104 genes, primer sets were developed for 170 gene fragments ranging in size from 407 to 2,492 base pairs, with the average fragment containing 897 base pairs. We present results from 70 genes tested in at least one squamate group in Table 1, and 34 untested genes are presented in Online Resource 1. Resulting genetic diversity indices are presented in Table 2.

We have developed 104 rapidly-evolving orthologous NPCL useful for both interspecific and intraspecific studies within squamates. We have tested a subset of these markers using skinks, varanids, gekkonids, cordylids, and agamids. Several markers have proven useful for diagnosing intraspecific populations in skinks (EXPH5, KIF24, Table 2;
Portik et al. 2010; Portik et al. 2011) as well as delimiting

Table 1 continued

Gene | Primer | Sequence 5' to 3' | A.A. %ID | Expected base pairs | Results (T, V, A, L, C, B; column positions not preserved)
XIRP1 | F1 | YGGWGATGTCARAACAGCCAAGTGG | 63.5 | F1-R1: 697; F2-R2: 1247; F3-R3: 1116 | SB
 | F2 | RAACAGCCAAGTGGCTCTTTGAAACKCAACCYA | | |
 | F3 | YGAGAAGGGAGATCTGGACTAYCTGAAG | | |
 | R1 | RAACCTTTTGCCYCCAACATCTCC | | |
 | R2 | YRGGCTGGTTTTCAAAAAGCCAGGTRGAT | | |
 | R3 | AAATCCCCTTTGGAGACATTAACGTTARACTTT | | |
ZHX3 | F1 | YCGGAARTGGTTYAGCGAYAGGA | 60.6 | 856 | AF
 | R1 | SCGACTRTCMCCAAACCAGCG | | |
ZNF451 | F1 | WCGTTGTCGTAATKCTGGCCC | 61.3 | 569 | MB
 | R1 | YCCTCCATGRAAYCGGCTCATRTGCA | | |

In cases of multiple primer sets, specific combinations of primer pairs and expected product size are given. Results are abbreviated as follows: SB represent single-band PCR products of expected size, MB multiple bands, AF failure to amplify. Blank not tested. The following taxonomic abbreviations are used: T, Trachylepis (Scincidae); V, Varanus (Varanidae); A, Acanthosaura (Agamidae); L, Leiolepis (Agamidae); C, Cordylus, Chamaesaura, Platysaurus, Pseudocordylus (Cordylidae); B, Bavayia (Gekkonidae). Primers with asterisks published in Portik et al. (2010)
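Many primers in Table 1 contain degenerate bases written with the standard IUPAC ambiguity codes (for example R = A or G, Y = C or T). As a minimal sketch, the Python below counts how many distinct oligonucleotides a degenerate primer represents and enumerates them; the helper names `degeneracy` and `expand` are hypothetical, and the example uses the ABL1 F1 primer as read from Table 1.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """Enumerate every non-degenerate sequence a primer can represent."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    abl1_f1 = "MCTGCCTCGRAAGCGGGCMA"  # ABL1 F1, from Table 1
    print(degeneracy(abl1_f1))
    for oligo in expand(abl1_f1):
        print(oligo)
```

The ABL1 F1 primer carries three twofold-degenerate positions (M, R, M), so it represents 2 × 2 × 2 = 8 sequences. Enumerating with `expand` is only practical for primers of low degeneracy, since the list grows multiplicatively with each ambiguous position.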

species boundaries in cordylid lizards (KIF24, PRLR, Table 2; Stanley et al. 2011). Several markers have amplified broadly across five diverse squamate families (Agamidae, Cordylidae, Gekkonidae, Scincidae, Varanidae) and have potential for resolving higher-level squamate relationships (Table 1).

Our identified NPCL have great potential for squamate conservation efforts, as they can be used at a variety of levels. A critical step in the protection of evolutionary lineages is their initial identification, which is often completed using molecular evidence. The NPCL in this study can be used individually or in a multilocus framework to accomplish this task and allow evolutionary lineages to be targeted for conservation efforts or protective status. Alternatively, within particular species, multiple NPCL can be used to determine the genetic cohesiveness of populations in a metapopulation system and allow researchers to define management units. These management units can be assessed for genetic variation, and informed decisions can be made to protect the overall genetic diversity within a species.

Acknowledgments We would like to thank Nicole Rocha, Andrew Feiter, Arianna Kuhn, Maria Tempera, Lauren Adderly, and Stuart Love Nielsen for contributions in laboratory work. We thank Aaron Bauer for providing many tissue samples used in this study. Funding for this project was provided by a National Science Foundation grant (DEB 0515909) and by the Department of Biology at Villanova University.

References

Brito P, Edwards SV (2009) Multilocus phylogeography and phylogenetics using sequence-based markers. Genetica 135:439–455
Drummond AJ, Ashton B, Buxton S, Cheung M, Cooper A, Duran C, Field M, Heled J, Kearse M, Markowitz S, Moir R, Stones-Havas S, Sturrock S, Thierer T, Wilson A (2011) Geneious v5.4. Available from http://www.geneious.com/
Edwards SV (2009) Is a new and general theory of molecular systematics emerging? Evolution 63:1–19
Graybeal A (1994) Evaluating the phylogenetic utility of genes: a search for genes informative about deep divergences among vertebrates. Syst Biol 43:174–193
Heled J, Drummond AJ (2008) Bayesian inference of population size history from multiple loci. BMC Evol Biol 8:289
Hey J (2010) Isolation with migration models for more than two populations. Mol Biol Evol 27:905–920
Hey J, Nielsen R (2004) Multilocus methods for estimating population sizes, migration rates and divergence time, with applications to the divergence of Drosophila pseudoobscura and D. persimilis. Genetics 167:747–760
Jennings WB, Edwards SV (2005) Speciational history of Australian grass finches (Poephila) inferred from thirty gene trees. Evolution 59:2033–2047
Leaché AD, Rannala B (2010) The accuracy of species tree estimation under simulation: a comparison of methods. Syst Biol. doi:10.1093/sysbio/syq073
Lee JY, Edwards SV (2008) Divergence across Australia’s Carpentarian barrier: statistical phylogeography of the red-backed fairy wren (Malurus melanocephalus). Evolution 62:3117–3134
Portik DM, Bauer AM, Jackman TR (2010) The phylogenetic affinities of Trachylepis sulcata nigra and the intraspecific evolution of coastal melanism in the western rock skink. Afr Zool 45:147–159
Portik DM, Bauer AM, Jackman TR (2011) Bridging the gap: western rock skinks (Trachylepis sulcata) have a short history in South Africa. Mol Ecol 20:1744–1758
Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155:945–959
Rozen S, Skaletsky J (2000) Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S, Misener S (eds) Bioinformatics methods and protocols: methods in molecular biology. Humana Press, Totowa, pp 365–386
Sakharkar MK, Kangueane P (2004) Genome SEGE: a database for intronless genes in eukaryotic genomes. BMC Bioinform 5:67
Shepelev V, Fedorov A (2006) Advances in the exon-intron database. Brief Bioinform 7:178–185
Smedley D, Haider S, Ballester B, Holland R, London D, Thorisson G, Kasprzyk A (2009) BioMart—biological queries made easy. BMC Genomics 10:22

Table 2 Nucleotide diversity of markers for sequenced squamate species

Gene Acanthosaura (6 sp.) Cordylus (14 sp.) Leiolepis (5 sp.) Trachylepis (2 sp.) Varanus (8 sp.)

Intra Inter Intra Inter Intra Inter Intra Inter Intra Inter

RAG-1 0.0014–0.0145 0.1461 0.0019–0.0089 0.0112 0.0024–0.0130 0.0298

C10orf71 0.0015–0.0058 0.0067

EXPH5 0.0013–0.0098 0.0217

KIAA1217 0.0043–0.0056 0.0311

KIAA1549 0.0092–0.0786 0.0406 0.0016–0.0092 0.0228

KIAA2018 0.0023–0.0072 0.0105 0.0015–0.0058 0.0067

KIF24 0.0095–0.0220 0.0317 0.0063–0.0191 0.0235

MXRA5 0.0034–0.0064 0.0179 0.0023–0.0133 0.0302

PKDREJ 0.0016–0.0056 0.0115 0.0012–0.0184 0.0714

PRLR 0.0016–0.0501 0.0547 0.0015–0.0189 0.0220

Diversity calculations were conducted at the interspecific (all sequenced species within the genus) and intraspecific (within each species, presented as a range) levels
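The text does not state how the diversity values in Table 2 were computed. As an illustration only, the sketch below uses the common definition of nucleotide diversity (π) as the average proportion of differing sites over all pairs of aligned sequences, without any substitution-model correction. The function names are hypothetical, and excluding sites with gaps or ambiguity codes from each pairwise comparison is an assumption.

```python
from itertools import combinations


def pairwise_difference(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Only columns where both sequences carry A, C, G or T are compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def nucleotide_diversity(sequences: list) -> float:
    """Average pairwise difference per site across all sequence pairs."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    pairs = list(combinations(sequences, 2))
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT"]  # toy data, not from the study
    print(nucleotide_diversity(toy_alignment))
```

Applied to sequences from one species, this gives an intraspecific value; applied to one sequence per species within a genus, it gives an interspecific value in the sense used by Table 2.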

Stanley EL, Bauer AM, Jackman TR, Branch WR, Mouton PLFN (2011) Between a rock and a hard polytomy: rapid radiation in the rupicolous girdled lizards (Squamata: Cordylidae). Mol Phylogen Evol 58:53–70
Townsend TM, Alegre RE, Kelley ST, Wiens JJ, Reeder TW (2008) Rapid development of multiple nuclear loci for phylogenetic analysis using genomic resources: an example from squamate reptiles. Mol Phylogen Evol 47:129–142
