Department of Geological Sciences | Indiana University (c) 2012, P. David Polly

G563 Quantitative Paleontology

(Sepkoski, 1993, Paleobiology, 19: 43-51)

Empirical patterns

Temporal Diversity

John Phillips, 1860. Life on the Earth: Its Origin and Succession

The first Phanerozoic Diversity Curve

Tabulation of number of fossil species in British strata from John Phillips, 1860.

Analysis of diversity can be used to calculate

1. diversity curve showing rise and fall in standing diversity

2. rates of origination (speciation or origins of higher taxa)

3. rates of extinction (used to estimate background rates and mass extinctions)

Diversity terminology

Taxonomic diversity. The number of different taxa, usually species, but often higher taxa, at a given time and place. Diversity counts differ based on temporal bin and geographic extent (e.g., local diversity vs. global diversity). This concept is distinct from morphological diversity.

Taxonomic richness. The number of unique taxa in a sample, synonym of taxonomic diversity as defined here. Other measures of diversity could take into account relative abundances of taxa.

Standing diversity. The taxonomic diversity at a particular time.

Diversity curve. Graph showing standing diversity through time. Usually time is divided into bins and the curve is interpolated between bins.

Pull of the Recent. Bias in diversity when the modern world is compared to the geological past. Diversity can, in principle, be sampled in full in the modern world, but only in part in the past because of non-preservation of many taxa. Generally speaking, the fossil record gets less complete deeper into the past, so diversity patterns may be biased near the “recent”.

(Sepkoski, 1993, Paleobiology, 19: 43-51)

Calculation of diversity curve is not as simple as it looks

Sampling of the past is not continuous

The full complement of taxa from one time is seldom preserved at a single site (or “collection” in PaleoDB terms).

Fossil sites (or “collections”) must therefore be binned together to calculate standing diversity, which causes the following issues:

1. time averaging (binned sites not precisely the same age)

2. local diversity versus broader diversity

Bins can be named geological intervals (e.g., Eocene) or equal intervals of time (e.g., 5 my)

Foote, 2000. Paleobiology

Bins, Boundaries, and Counting Taxa

There are four types of taxa associated with any bin:

1. taxa that extend into the bin from earlier times but become extinct (lower boundary crossers)

2. taxa that originate in the bin and persist to later times (upper boundary crossers)

3. taxa that originate before the bin and persist to later times (both boundary crossers)

4. taxa that originate and become extinct within the bin (singletons)

Diversity can be estimated as number of taxa within a bin (time averaging inflates this estimate) or as number of taxa crossing a boundary (singletons make this an underestimate)
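
The four categories can be sketched in Mathematica as follows. This is only an illustration: the ranges, the bin boundaries, and the names ranges, base, top, and classify are all hypothetical. Ages are assumed to be in millions of years (larger = older), and a taxon whose FAD or LAD falls exactly on a boundary is treated here as not crossing it, which is an arbitrary choice.

```mathematica
(* Hypothetical ranges: {name, FAD, LAD}, ages in Ma (larger = older) *)
ranges = {{"A", 60, 52}, {"B", 54, 40}, {"C", 60, 40}, {"D", 53, 51}};

(* One bin, from its base (older boundary) to its top (younger boundary) *)
base = 55; top = 50;

(* Assign one of the four categories; assumes the range overlaps the bin *)
classify[{name_, fad_, lad_}] := Which[
   fad > base && lad < top, "both boundaries",
   fad > base, "lower boundary only",
   lad < top, "upper boundary only",
   True, "singleton"];

categories = classify /@ ranges
(* {"lower boundary only", "upper boundary only", "both boundaries", "singleton"} *)
```

Counting the three boundary-crossing categories gives the boundary-crosser estimate for this bin; counting all four gives the within-bin estimate.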

[Figure: inferred range of a taxon across an interval (bin) with its lower and upper boundaries. Origination is marked by the first appearance datum (FAD) and extinction by the last appearance datum (LAD).]

Estimating diversity curves

1. Find the FAD and LAD for each taxon

2. Divide time into bins, either by named geological period or equal intervals of absolute time (the latter is preferred)

3. Starting at the oldest end of the scale, count the number of originations and extinctions within each interval

4. Calculate the diversity using standard method (diversity within bins) or boundary crosser method
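
Steps 2 and 3 might look something like this in Mathematica. It is a sketch under stated assumptions, not the method from the handout: the ranges are invented, ages are in millions of years (larger = older), and the names ranges, bins, inBin, nOrig, nExt, and nSing are illustrative. An age is assigned to a bin when top < age <= base, which is one of several possible tie rules.

```mathematica
(* Hypothetical ranges: {name, FAD, LAD}, ages in Ma (larger = older) *)
ranges = {{"A", 60, 52}, {"B", 54, 41}, {"C", 60, 43},
          {"D", 53, 51}, {"E", 48, 42}};

(* Equal-length bins, oldest first, each given as {base, top} *)
binWidth = 5;
bins = Table[{b, b - binWidth}, {b, 60, 45, -binWidth}];
(* {{60, 55}, {55, 50}, {50, 45}, {45, 40}} *)

(* An age belongs to a bin if top < age <= base *)
inBin[age_, {base_, top_}] := top < age <= base;

(* Originations: FADs falling in each bin *)
nOrig = Table[Count[ranges, {_, fad_, _} /; inBin[fad, bin]], {bin, bins}]
(* {2, 2, 1, 0} *)

(* Extinctions: LADs falling in each bin *)
nExt = Table[Count[ranges, {_, _, lad_} /; inBin[lad, bin]], {bin, bins}]
(* {0, 2, 0, 3} *)

(* Singletons: FAD and LAD in the same bin *)
nSing = Table[
  Count[ranges, {_, fad_, lad_} /; inBin[fad, bin] && inBin[lad, bin]],
  {bin, bins}]
(* {0, 1, 0, 0} *)
```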

Using SQL to find the FAD and LAD

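SELECT Genus, MIN(interval_midpoint) AS LAD, MAX(interval_midpoint) AS FAD FROM Occurrences GROUP BY Genus;

Assuming interval_midpoint is an age in millions of years before present, the largest value for a genus is its oldest occurrence (FAD) and the smallest is its youngest (LAD).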

Standard Method for Diversity Curves

Starting with oldest bin, calculate diversity for each bin as follows:

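$d_t = d_{t-1} + N_{\mathrm{orig},t} - N_{\mathrm{ext},t-1}$

$d_t$: diversity at time interval $t$

$d_{t-1}$: diversity at the time interval immediately before $t$

$N_{\mathrm{orig},t}$: number of originations during time interval $t$

$N_{\mathrm{ext},t-1}$: number of extinctions during the time interval immediately before $t$

Explanation: a taxon that becomes extinct during interval $t-1$ was still present for part of that interval, so it counts toward $d_{t-1}$ and is removed only when moving on to interval $t$. Taxa that originate during interval $t$ are present in it, so they are added immediately.

Alternative method: calculate $d$ without including singletons:

$d_t = d_{t-1} + N_{\mathrm{orig},t} - N_{\mathrm{ext},t-1} - N_{s,t}$

$N_{s,t}$: number of singletons in time interval $t$

Here $d_{t-1}$ is the standard diversity of the preceding interval, so the alternative value is the standard $d_t$ minus the singletons of interval $t$.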
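
A small worked example of the standard method, assuming hypothetical counts for four bins listed oldest first. The variable names (nOrig, nExt, nSing, prevExt, d, dNoSingletons) are illustrative, and diversity is assumed to be zero before the oldest bin.

```mathematica
(* Hypothetical counts per bin, oldest bin first *)
nOrig = {2, 2, 1, 0};   (* originations *)
nExt  = {0, 2, 0, 3};   (* extinctions *)
nSing = {0, 1, 0, 0};   (* singletons *)

(* Extinctions of the preceding interval; none precede the oldest bin *)
prevExt = Prepend[Most[nExt], 0];

(* d_t = d_(t-1) + Norig_t - Next_(t-1), as a running total *)
d = Accumulate[nOrig - prevExt]
(* {2, 4, 3, 3} *)

(* Alternative: standard diversity minus the singletons of each bin *)
dNoSingletons = d - nSing
(* {2, 3, 3, 3} *)
```

ListPlot[d, Joined -> True] would then draw the curve, with bin number on the horizontal axis.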

(c) Wolfram Research

Mathematica

Uses for Mathematica

Calculations, simple or complicated

Mathematical functions

Statistical analysis

Graphics in Mathematica

Plots of data

Plots of functions

Specialized objects

Three-dimensional plots

A typical Mathematica notebook

Kernels and Notebooks

Mathematica has two components, the kernel and the notebook

The kernel is the invisible part of the program that does all the calculations

The notebook is the main user interface; its purpose is to allow you to perform analyses and to save them for re-use or for later reference

You can work with many notebooks at once. They share information between them because they interface with the same kernel

For advanced work you can work with two kernels, which allows you to run two sets of calculations in different notebooks at the same time

Notebooks and cells

Notebooks are organized into cells

Default cells are for calculations, with input entered by you followed by output created by the kernel

Cells must be executed to obtain output: Shift + Enter to execute

Cells may be executed more than once, and the input can be changed between executions

Brackets in the right margin show cell boundaries and distinguish between input and output

Uses for brackets:

1. monitor calculations (bracket is highlighted while the kernel is executing)

2. select entire cell for deletion

3. hide output by double clicking

Formatting notebooks

Notebooks can be formatted like a word processor document

Individual cells can be formatted as titles, text, section headings, or input (input is the default)

Use Format | Style menu to format individual cells

Use Format | Stylesheet menu to format the whole notebook

Notebook formatted with default stylesheet

Notebook formatted with pastel color stylesheet

Functions

Functions are key to Mathematica: functions receive information or data, process it, and return a result

Functions are called by their name, usually composed of complete English words describing what the function does, with no spaces and first letters capitalized

Function names are followed by square brackets, in which one or more arguments is entered:

FunctionName[argument]

For example, the ListPlot[] function takes a matrix of x,y values as its argument:

ListPlot[{{1,2},{3,4}}]

Mathematica’s help files give descriptions and examples of every function

Options for functions

Many functions have options that are entered as arguments

Options usually have the format OptionName -> Value

Find options with Options[FunctionName] or in Documentation Center

ListPlot with no options

ListPlot with three options
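
For instance, the two plots might be produced as below. The data and the particular three options are chosen only for illustration; they are not necessarily the ones shown on the original slide.

```mathematica
(* Hypothetical x,y data *)
pts = {{1, 2}, {3, 4}, {5, 3}};

(* No options: default point plot *)
plain = ListPlot[pts]

(* Three options, each written as OptionName -> Value *)
styled = ListPlot[pts, Joined -> True, PlotStyle -> Red,
   AxesLabel -> {"x", "y"}]

(* List every option ListPlot accepts, with its default value *)
Options[ListPlot]
```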

Getting help

Mathematica help files can be browsed or searched from the Documentation Center of the Help menu

Function names are always made up of complete words, with no spaces and with the first letter of each word capitalized

Search for functions you hope exist: “Histogram”, “LinearRegression”, “PrincipalComponents”, “GenomeData”

Note Function Browser and Mathematica Book help buttons at top left of the Documentation Center

Variables

Variables are also key to Mathematica, allowing you to store information

Variables do not have brackets or options

You create variables, giving them a name and putting something into them

Here a variable called data is used to store a number, a sequence of numbers, the natural log of a sequence of numbers, and data imported from an Excel file. A variable called mygraph is used to store a graphic

You can retrieve what is inside a variable by executing it (the graph is displayed again by executing mygraph)
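
A sketch of the kind of input described above, with made-up values. The Excel import is left out because it needs a file on disk.

```mathematica
data = 5                           (* a single number *)
data = {1, 2, 3, 4, 5}             (* a sequence of numbers *)
data = Log[{1., 2., 3., 4., 5.}]   (* natural log of each number *)

mygraph = ListPlot[data]           (* a graphic stored in a variable *)
mygraph                            (* executing the name displays it again *)
```

Each new assignment to data replaces what was stored there before.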

Parts of variables

When a variable has more than one item stored, you can get specific parts using double square brackets after the variable name

data returns all the items in data

data[[1]] returns only the first item in data

data[[1;;3]] returns items 1 to 3

For more examples look at the Documentation Center under the function Part[] and under the tutorial GettingPiecesOfLists
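
For example, with a hypothetical five-item list:

```mathematica
data = {10, 20, 30, 40, 50};

data             (* {10, 20, 30, 40, 50} *)
data[[1]]        (* 10 *)
data[[1 ;; 3]]   (* {10, 20, 30} *)
data[[-1]]       (* 50: negative positions count from the end *)
```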

Lists, Matrices, and other Multidimensional data

You will often work with “lists”, which is Mathematica’s term for any group of several items

Some lists have only one element (scalar), some have a long row of elements (vector), some have columns and rows of data (matrix or array)

You can get columns, rows, or elements from the list using the double square bracket system

See Documentation Center under:

1. ListsOverview

2. HandlingArraysOfData
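
A short illustration with a hypothetical matrix of two rows and three columns:

```mathematica
m = {{1, 2, 3}, {4, 5, 6}};

m[[2]]          (* second row: {4, 5, 6} *)
m[[All, 1]]     (* first column: {1, 4} *)
m[[2, 3]]       (* element in row 2, column 3: 6 *)
Dimensions[m]   (* {2, 3} *)
```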

Special formatting tags

You can control the display of output in many ways by putting special tags at the end of a line of input

semicolon (;) prevents output from being displayed

//N forces numbers to be displayed in decimal form

//MatrixForm displays tables of data in rows and columns
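
The three tags in use, with values chosen only for illustration:

```mathematica
x = 1/3;     (* semicolon: the value is stored but nothing is displayed *)
x            (* 1/3, kept as an exact fraction *)
x // N       (* 0.333333 *)

(* Display a nested list as rows and columns *)
{{1, 2}, {3, 4}} // MatrixForm
```

MatrixForm changes only how the table is displayed, so it is usually better not to store the formatted result in a variable you plan to calculate with.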

Importing and exporting data

Mathematica has an extensive range of file types that can be imported and exported: text files, Excel files, Word files, PDFs, Illustrator, JPEG, etc.

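Import[FilePath]

Export[FilePath, data, "type"]

Export takes the data to be written as its second argument. The file type can be given as an optional third argument; if it is left out, it is inferred from the file extension.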

Note the helpful file path chooser found on the Insert menu
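
A round trip through a CSV file might look like this. The table contents and the file name ranges_example.csv are hypothetical, and the file is written to the system's temporary directory so that nothing of yours is overwritten.

```mathematica
(* Hypothetical table with a header row *)
table = {{"Genus", "FAD", "LAD"}, {"Felis", 12, 0}};

(* Build a path in the temporary directory *)
file = FileNameJoin[{$TemporaryDirectory, "ranges_example.csv"}];

Export[file, table];       (* file type inferred from the .csv extension *)
imported = Import[file]    (* read the file back in as a list of rows *)
```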

Simple graphics

ListPlot[]

Plot[]

Histogram[]

BarChart[]
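
One call to each of the four functions, using made-up numbers:

```mathematica
counts = {3, 5, 8, 6, 2};   (* hypothetical counts per bin *)

p1 = ListPlot[counts]                               (* points from a list of data *)
p2 = Plot[Sin[x], {x, 0, 2 Pi}]                     (* curve of a function *)
p3 = Histogram[{1.2, 1.9, 2.1, 2.4, 3.3, 3.8}]      (* frequency of values *)
p4 = BarChart[counts]                               (* one bar per value *)
```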

Loops: programming structure for repeating things

Use Table[], Map[], or Do[] to carry out repeated tasks

Table[ lines to be repeated , {iterator}]

where the lines to be repeated consist of other Mathematica functions or lists of functions separated by semicolons

iterator is a special construction that creates a temporary counting variable and specifies number of times to repeat

Simple: {10} (repeats 10 times)

With variable: {x,10} (repeats while incrementing x from 1 to 10 in steps of 1)

Full: {x,1,10,1} (repeats while incrementing x from 1 to 10 in steps of 1)

Full: {x,10,2,-2} (repeats while incrementing x backward from 10 to 2 in steps of 2)
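
A few illustrative calls showing the iterator forms, plus Map[] and Do[]. The expressions being repeated are arbitrary examples.

```mathematica
Table[1, {3}]                    (* {1, 1, 1} *)
Table[x^2, {x, 5}]               (* {1, 4, 9, 16, 25} *)
Table[x, {x, 1, 10, 3}]          (* {1, 4, 7, 10} *)
Table[x, {x, 10, 2, -2}]         (* {10, 8, 6, 4, 2} *)

(* Several statements separated by semicolons; the last one is collected *)
Table[y = 2 x; y + 1, {x, 3}]    (* {3, 5, 7} *)

(* Map applies a function to every element of a list *)
Map[Sqrt, {1, 4, 9}]             (* {1, 2, 3} *)

(* Do repeats without collecting results *)
total = 0;
Do[total = total + i, {i, 10}];
total                            (* 55 *)
```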

Conditional statements

Is equal? ==

Is unequal? !=

Greater than? >

Less than? <

And &&

Or ||

If[ statement is true, then this, or else this ]

myage = 65.5;

If[ myage > 50, Print["my age is older"], Print["my age is not older"] ]

If[ myage > 55 && myage < 65, Print["my age is in the bin"], Print["my age is outside the bin"] ]
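
The same test can be applied to a whole list of ages. A minimal sketch, assuming a hypothetical list called ages and the same bin limits of 55 and 65:

```mathematica
ages = {48.6, 56.0, 61.7, 65.5};   (* hypothetical ages *)

(* Label each age according to whether it falls inside the bin *)
flags = Table[If[age > 55 && age < 65, "inside", "outside"], {age, ages}]
(* {"outside", "inside", "inside", "outside"} *)

(* Keep only the ages that fall inside the bin *)
inside = Select[ages, 55 < # < 65 &]
(* {56., 61.7} *)
```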

Working with Strings

Strings are sequences of characters, as opposed to numbers. You can manipulate strings in Mathematica as well as numbers. For example:

mytext = "Species";

You can combine strings by joining them with the StringJoin[] function or <> (which does the same thing):

You can create a list of labels using Table[] and ToString[], the latter of which converts numbers to strings so they can be joined to other strings:
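
Both operations might look like this; the label text is only an example.

```mathematica
mytext = "Species";

(* Two equivalent ways of joining strings *)
StringJoin[mytext, " ", "A"]   (* "Species A" *)
mytext <> " " <> "A"           (* "Species A" *)

(* A list of numbered labels *)
labels = Table[mytext <> " " <> ToString[i], {i, 3}]
(* {"Species 1", "Species 2", "Species 3"} *)
```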

mySQL and Mathematica

Functions for interacting with SQL server are found in the DatabaseLink package.

First, load the package:

<<DatabaseLink`

Next, set up a database link and give it a name (“conn” in this example):

conn = OpenSQLConnection[]

This function will open a Connection Tool window, where you can create a new connection to your mySQL database. Click “New” and follow the steps to create the connection:

1. Give it a short name, e.g. ‘mySQL’.

2. Specify System Level.

3. Choose “MySQL(Connector/J)” from the database type menu.

4. Specify “localhost” or “127.0.0.1”, port 3306, username “root”, and the password for your mySQL (probably “root” by default). Test it to make sure it works.

5. Store the password if you wish.

Close the connection again...

CloseSQLConnection[conn]

Using your connection again

Once you have created a connection, you can use it again without going through the configuration steps:

This opens the database (aka, “Catalog”) named “Felidae” using the stored connection named “mySQL”.

Put semicolons at the end of the line to suppress the output:

Always close the connection when you are finished with it.
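
The input shown on the original slide is not preserved here. A plausible sketch, assuming a stored connection named "mySQL" and a database named "Felidae" as described above, and assuming the "Catalog" option of OpenSQLConnection[] is used to pick the database:

```mathematica
Needs["DatabaseLink`"];   (* load the package *)

(* Reopen the stored connection; the trailing semicolon suppresses output *)
conn = OpenSQLConnection["mySQL", "Catalog" -> "Felidae"];

(* ... run your queries here ... *)

(* Close the connection when finished *)
CloseSQLConnection[conn];
```

This runs only on a machine where the stored connection and the database actually exist.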

Doing something with your connection

There’s no use opening a database connection unless you do something with it. We’ll use the connection to load data into Mathematica using an SQL query. The SQLExecute[] function allows you to send an SQL statement to the database and get back the results. Store the results in a variable so you can use them:

The entire occurrences table should now be stored in the variable data. We can check the first line to make sure:
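
The slide's own input is not preserved here. A sketch of what it may have looked like, assuming the stored connection "mySQL", the database "Felidae", and a table named Occurrences (all taken from the descriptions in this lecture, not verified against your own setup):

```mathematica
Needs["DatabaseLink`"];
conn = OpenSQLConnection["mySQL", "Catalog" -> "Felidae"];

(* Send an SQL statement and store the rows that come back *)
data = SQLExecute[conn, "SELECT * FROM Occurrences"];

(* Check the first row *)
data[[1]]

CloseSQLConnection[conn];
```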

Getting labels or specialized tables

Note that by default each line of data comes back as a subgroup defined by curly brackets. The subgroups are useful when several variables are found in each line, but annoying when you just want a list of names. Flatten[] gets rid of them.
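
For example, a query that returns a single column comes back in the nested form below. The genus names are placeholders standing in for real query results.

```mathematica
(* One-column result: each row is its own sublist *)
rows = {{"Felis"}, {"Panthera"}, {"Smilodon"}};

(* Remove the inner curly brackets to get a plain list of names *)
names = Flatten[rows]
(* {"Felis", "Panthera", "Smilodon"} *)
```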

Assignment for next week

1. Get acquainted with Mathematica by going through the “Quick Overview” and “Learn with Guided Examples” that are linked from the Welcome Screen. (If your Welcome Screen isn’t showing, you can reopen it from the Help menu.)

2. Describe briefly what the Mathematica code in the handout does.

3. Using Mathematica to interface with mySQL, calculate a diversity curve for your data using the Standard Method for calculating diversity. Divide the total geological time span represented by your data into 15 bins for this exercise. Turn in a table similar to 8.2 in the Foote and Miller handout, with columns reporting the age of the base of each interval, the number of originations, extinctions, and singletons, and dt with all data. Also turn in a graph of the curve, preferably created in Mathematica (though you could use Excel, if you must).

Reading for next week

Foote, M. 2000. Origination and extinction components of taxonomic diversity: general problems. Paleobiology, 26: 74-102.