OBIS

Page 1: OBIS

OBIS

Page 2: OBIS

Current situation

• Working on new IT platform
  • Present technology 8 years old
• Data ingestion going fine
  • Including data quality
    • Position, time
    • Taxonomy
• Web site well visited

Page 3: OBIS

Number of records (M)

[Figure: number of records in OBIS, in millions (y-axis 0 to 18), over time from Apr-01 to Jul-09.]

Page 4: OBIS

Number of datasets

[Figure: number of datasets in OBIS (y-axis 0 to 600), over time from Apr-01 to Jul-09.]

Page 5: OBIS

Average dataset size (K)

[Figure: average dataset size in K (y-axis 0 to 180), over time from Apr-01 to Jul-09.]

Page 6: OBIS

Web statistics

Page 7: OBIS

Data statistics

[Figure: data statistics (y-axis 0 to 120000), over time from Mar-06 to Jul-09.]

Page 8: OBIS

Analysis of content

• First preliminary analyses
• Has to take into account huge bias
  • Geography
    • Mostly coastal
    • Mostly northern hemisphere
  • Taxonomy
  • Presence-only
  • ‘Safety in numbers’

Page 9: OBIS

Number of records

For known species most important to your project, what major discoveries have been made about their range or distribution? What is least known with regard to their distribution that you would like to know?

Page 10: OBIS

Number of species

Page 11: OBIS

Hurlbert’s index (ES(50))
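
For readers unfamiliar with the index, the following is a minimal sketch of the standard Hurlbert (1971) expected-species formula, ES(n) = Σ [1 − C(N − Nᵢ, n) / C(N, n)], with n = 50. It assumes each occurrence record is counted as one individual, which the slides do not state. The function name `hurlbert_es` and its arguments are illustrative and not taken from OBIS.

```python
from math import exp, lgamma


def hurlbert_es(counts, n=50):
    """Expected number of species in a random subsample of n records.

    counts: number of records per species (assumption: one record = one individual).
    Returns None when there are fewer than n records in total.
    """
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total < n:
        return None  # the index is undefined for samples smaller than n

    def log_choose(a, b):
        # log of the binomial coefficient C(a, b), via log-gamma to avoid overflow
        return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)

    log_denominator = log_choose(total, n)
    expected = 0.0
    for c in counts:
        if total - c < n:
            # every subsample of size n must contain this species
            expected += 1.0
        else:
            expected += 1.0 - exp(log_choose(total - c, n) - log_denominator)
    return expected


# Two species with 99 and 1 records: the rare one appears in half of all subsamples
print(hurlbert_es([99, 1]))
```

Because the subsample size is fixed, the index is less sensitive to sampling effort than a raw species count, which matters given the geographic bias noted earlier in the deck.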

Page 12: OBIS

Large marine ecosystems

Page 13: OBIS

‘Age’ of record – trends study

[Figure: number of records on a logarithmic y-axis (1 to 1,000,000) against year (1600 to 2100).]

Page 14: OBIS

Latitudinal gradient ES(50)

[Figure: ES(50) (y-axis 0 to 60) against latitude (-90 to 90).]

Page 15: OBIS

Marine fish to be discovered

Mora et al. (2007). The completeness of taxonomic inventories for describing the global diversity and distribution of marine fishes. Proc. R. Soc. B, published online.

[Figure: percentage completeness, scale 1 to 100.]

Page 16: OBIS

How good is the data?

• Data are from many sources
  • Inconsistencies become apparent
    • Differences in names used
    • Mistakes in transformations
      • Decimalising lat/lon
  • Needs quality control
• Data collection driven by priorities
  • Sampling bias; resolution
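
As an illustration of the "decimalising lat/lon" step, here is a minimal sketch of converting degrees, minutes and seconds to decimal degrees. The function name and signature are hypothetical. One plausible transformation mistake, given here only as an example, is reading 51°30′ as 51.30 rather than 51.5.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range [0, 60)")
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    # South and West are negative by the usual convention
    if hemisphere.upper() in ("S", "W"):
        value = -value
    return value


print(dms_to_decimal(51, 30, 0, "N"))  # 51 degrees 30 minutes North
print(dms_to_decimal(2, 45, 0, "W"))   # 2 degrees 45 minutes West
```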

Page 17: OBIS

Quality control

• Check formal record structure
• Check date/time
• Check position
  • In the ocean?
  • In dataset bounding box?
• Check taxonomy
  • Problem: no reference list
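
The structure, date and position checks listed above could be sketched as follows. This is not the OBIS implementation: the field names, the `YYYY-MM-DD` date format and the `(west, south, east, north)` bounding-box layout are all assumptions made for the example. The "in the ocean?" test is left out because it needs a land/sea mask, and bounding boxes that cross the antimeridian are not handled.

```python
from datetime import datetime

# Hypothetical field names; real records may use different ones
REQUIRED_FIELDS = ("scientific_name", "latitude", "longitude", "date")


def check_record(record, bbox):
    """Return a list of problems found in one record (empty list = passes).

    bbox is (west, south, east, north) of the dataset, in decimal degrees.
    """
    # 1. Formal record structure: all required fields present and non-empty
    problems = [f"missing {f}" for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
    if problems:
        return problems

    # 2. Date/time: must parse and must not lie in the future
    try:
        when = datetime.strptime(record["date"], "%Y-%m-%d")
        if when > datetime.now():
            problems.append("date in the future")
    except ValueError:
        problems.append("unparseable date")

    # 3. Position: numeric, within valid ranges, inside the dataset bounding box
    try:
        lat = float(record["latitude"])
        lon = float(record["longitude"])
    except (TypeError, ValueError):
        problems.append("non-numeric position")
        return problems
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        problems.append("position outside valid range")
    else:
        west, south, east, north = bbox
        if not (south <= lat <= north and west <= lon <= east):
            problems.append("position outside dataset bounding box")
    return problems


record = {"scientific_name": "Gadus morhua", "latitude": 60.0,
          "longitude": 3.0, "date": "2008-06-15"}
print(check_record(record, bbox=(-10, 40, 10, 55)))
```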

Page 18: OBIS

New species are discovered

Data from http://marinespecies.org

Page 19: OBIS

Problems with taxonomic names

• Misspellings
• Mixed with other information
  • Gadus sp.; Gadus sp. A; Gadus sp. a…
  • Gadus morhua?; Gadus cfr morhua; Gadus aff. morhua…
  • Gadus morhua juv.; Gadus morhua juvenile; Gadus morhua juveniles…
• Mixed with ecological/sampling information
• Also variation in classification and author string
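
A minimal sketch of how the "mixed with other information" cases above might be stripped to a bare name. The patterns cover only the Gadus examples on this slide; they are an assumption for illustration, not the cleaning rules actually used at OBIS. Misspellings and synonyms cannot be resolved this way and need matching against a reference list.

```python
import re

# Open-nomenclature qualifiers such as "cfr", "cf." and "aff."
_QUALIFIER = re.compile(r"\b(?:cfr|cf|aff)\b\.?\s*", re.IGNORECASE)
# Trailing life-stage remarks such as "juv.", "juvenile", "juveniles"
_LIFE_STAGE = re.compile(r"\s+(?:juveniles?|juv)\.?\s*$", re.IGNORECASE)
# Trailing "sp." or "spp.", optionally followed by a short label such as "A"
_UNDETERMINED = re.compile(r"\s+spp?\.?(?:\s+\w{1,2})?\s*$", re.IGNORECASE)


def clean_name(raw):
    """Strip qualifiers, life-stage remarks and 'sp.' suffixes from a name string."""
    name = raw.replace("?", "").strip()
    name = _LIFE_STAGE.sub("", name)
    name = _QUALIFIER.sub("", name)
    name = _UNDETERMINED.sub("", name)
    return " ".join(name.split())  # collapse any leftover whitespace


for raw in ["Gadus sp. A", "Gadus cfr morhua", "Gadus aff. morhua",
            "Gadus morhua?", "Gadus morhua juveniles"]:
    print(f"{raw!r} -> {clean_name(raw)!r}")
```

Note that stripping "cfr" or "aff." discards the identifier's stated uncertainty, so a real pipeline would probably keep the original string alongside the cleaned one.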

Page 20: OBIS

Examples of variation

• Callorhinchus callorynchus
• Cirrhinus or Cirrhina
  • Cirrhinus cirrhosa or C. cirrhosus
  • Cirrhina cirrhosa or C. cirrhosus
• Microsoft helping a bit:
  • Calinectes ornatus Ordway, 1863
    Calinectes ornatus Ordway, 1864
    …
    Calinectes ornatus Ordway, 1891

Page 21: OBIS

Number of ‘species’ in OBIS

• 147K unique ‘scientific names’
• 132K ‘clean names’
  • Approx 10% reduced (from 147K)
• 80K match with WoRMS
  • 11K known synonyms or misspellings
  • Non-matches assumed valid
• 121K ‘valid names’
  • Approx 20% reduced
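
The arithmetic behind these figures can be retraced as a short sketch. It assumes the 11K synonyms or misspellings are a subset of the 80K WoRMS matches and that both percentages are relative to the 147K unique names. With the rounded counts on the slide, the second reduction comes out near 18%, which the slide reports as approximately 20%.

```python
# Rounded counts as given on the slide
unique_names = 147_000
clean_names = 132_000
worms_matches = 80_000
synonyms_or_misspellings = 11_000  # assumption: a subset of the WoRMS matches

# Names with no WoRMS match are assumed valid
non_matches = clean_names - worms_matches
valid_names = clean_names - synonyms_or_misspellings


def reduction(before, after):
    """Percentage reduction from before to after."""
    return 100.0 * (before - after) / before


print(f"non-matches assumed valid: {non_matches}")
print(f"valid names: {valid_names}")
print(f"cleaning reduces names by {reduction(unique_names, clean_names):.1f}%")
print(f"cleaning plus WoRMS matching reduces names by {reduction(unique_names, valid_names):.1f}%")
```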

Page 22: OBIS

Reduction of ES(50) per 5-degree square

[Figure: both axes run from 0 to 50 in steps of 5; an additional logarithmic scale runs from 1 to 10000.]
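
Grouping records into 5-degree squares, as used for this comparison, might look like the sketch below. The row/column indexing scheme is an assumption made for the example; the slides do not say how squares are numbered.

```python
import math
from collections import defaultdict


def five_degree_square(lat, lon, size=5.0):
    """Return (row, col) of the square containing a position.

    Rows count northwards from 90 S, columns eastwards from 180 W.
    Positions exactly on the northern or eastern edge go into the last square.
    """
    rows = int(180 / size)
    cols = int(360 / size)
    row = min(int(math.floor((lat + 90.0) / size)), rows - 1)
    col = min(int(math.floor((lon + 180.0) / size)), cols - 1)
    return row, col


# Group (name, lat, lon) records by square; the records here are made up
records = [("Gadus morhua", 52.3, 4.9), ("Gadus morhua", 53.9, 3.1),
           ("Callorhinchus callorynchus", -41.0, -62.5)]
names_per_square = defaultdict(list)
for name, lat, lon in records:
    names_per_square[five_degree_square(lat, lon)].append(name)
print(dict(names_per_square))
```

A diversity index such as ES(50) can then be computed once per square from the names collected there, first with the raw names and again with the cleaned names.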

Page 23: OBIS

Same for fish

[Figure: both axes run from 0 to 50 in steps of 5; a logarithmic scale runs from 1 to 1000 and a further scale from 0 to 0.6 in steps of 0.05.]

Page 24: OBIS

General patterns indistinguishable

[Figure: panels labelled All and Fish, each shown for Dirty and Clean names.]

Page 25: OBIS

Completeness

[Figure: two plots with y-axis 0 to 90000; the first spans the years 1750 to 2050, the second 1950 to 2010.]

Page 26: OBIS

How to get OBIS data?

• Web site
• DiGIR provider
• OGC-compliant web services
  • Exist on experimental basis
• Google Base
• Ask us!
  • Custom data extraction

Page 27: OBIS
Page 28: OBIS
Page 29: OBIS
Page 30: OBIS

Data from field projects

• Not always easy to ‘trace’
  • Not well documented what is CoML data, and which field project it belongs to
  • Needs mechanism to better document
    • Part of the metadata?
• Exercise was done at iOBIS
  • Spreadsheet will be made available
  • Please check
  • In general, good agreement with our understanding and information from annual reports

Page 31: OBIS

Field projects

Acronym    # datasets       # records        # datasets   # records
           (as per group)   (as per group)   (in OBIS)    (in OBIS)
CeDaAMar   10               9018             1            9017
CAML       64               625000           58           1091837
ArcOD      18               70197            18           70061
CoMargE    2                3000+            16           67067
POST       1                ?                1            114161
CReefs     8                203929           8            203929
ICoMM      22               5511             1            5511
Mar-ECO    1                ?                1            2744
NaGISA     1                13138            1            23552
GoMA       4                203130           4            33740
CenSeam    77               14233            1            19488
TOPP       0                0                0            0
ChEss      1                2593             1            2593
CMarZ      1                114              1            114
HMAP       10               319111           1            260182
FMAP       2                144346           2            144346

Page 32: OBIS

How to get data into OBIS?

• Dialogue ongoing with all major providers
  • All field projects
  • Regional OBIS Nodes (RONs)
  • FishBase, OBIS SEAMAP…
• iOBIS needs time to ingest data
  • Quality control…
  • Data cycle
• Lag in data availability ~3 months
  • Depending on quality of the data