Text mining Montpellier

Page 1: Text mining Montpellier
Page 2: Text mining Montpellier

Using language to detect potential change in ecosystem services in the light of ecological surprises!

Juan Carlos Rocha & Robin Wikström

Page 3: Text mining Montpellier

Ecosystem services are the benefits humans receive from nature (MEA 2005)

Foley et al. 2005. Science

Page 4: Text mining Montpellier

[Figure: number of scientific papers per year, 1980 to 2015. x-axis: Year; y-axis: Scientific Papers. Source: ISI Web of Knowledge]

Ecosystem services is a relatively recent field of study. Yet assessing which ecosystem services are likely to be affected by ecological surprises is one of the greatest challenges of current ecological research.

Page 5: Text mining Montpellier

The problem

• Knowledge bias: we know a lot about what is easy to study

• There is a lack of high-quality datasets and time series for assessing change in ecosystem services

• It is difficult to experiment with ecosystems, especially for large-scale, possibly irreversible phenomena with great potential to affect human well-being

Page 6: Text mining Montpellier

Latent Dirichlet Allocation (LDA)

The main goal is to create algorithms for discovering the main themes that pervade large, unstructured collections of documents.

The core idea is that documents consist of random mixtures of latent topics, where each topic can be represented as a distribution over words.

The goal is to automatically discover these “hidden” topics, without any prior knowledge about the text and its content.
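The generative idea behind LDA can be illustrated with a minimal sketch. It is not the model fitted in this talk: the vocabulary, the number of topics and the Dirichlet parameters below are hypothetical, chosen only to show how a document arises as a random mixture of topics, each topic being a distribution over words.

```python
import random

def sample_dirichlet(alpha, size, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) with `size` components."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topics, vocabulary, doc_alpha, length, rng):
    """Generate one document as a random mixture of latent topics."""
    # Per-document topic proportions (the "random mixture").
    theta = sample_dirichlet(doc_alpha, len(topics), rng)
    words = []
    for _ in range(length):
        # Choose a topic for this word position, then a word from that topic.
        k = rng.choices(range(len(topics)), weights=theta)[0]
        words.append(rng.choices(vocabulary, weights=topics[k])[0])
    return theta, words

rng = random.Random(42)
# Hypothetical toy vocabulary; a real corpus has thousands of terms.
vocabulary = ["soil", "carbon", "fish", "marine", "forest", "timber"]
# Each topic is a probability distribution over the vocabulary.
topics = [sample_dirichlet(0.5, len(vocabulary), rng) for _ in range(3)]
theta, words = generate_document(topics, vocabulary, doc_alpha=0.5, length=20, rng=rng)
print(theta)
print(" ".join(words))
```

Fitting LDA runs this story in reverse: given only the words, it estimates the topic proportions of each document and the word distribution of each topic.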

Page 7: Text mining Montpellier

Model selection & number of topics

• We tested the variational expectation-maximization (VEM) algorithm, correlated topic models (CTM) & Gibbs sampling.

• We tested 5 models with different numbers of topics (20 to 100)

• The VEM algorithm with 80 topics fit the MEA training dataset best

[Figure: comparison of models VEM1 to VEM5 in four panels: alpha (axis 0.010 to 0.016), logLik (axis -1280000 to -1200000), Perplexity (axis 500 to 800) and Entropy (axis 0.1 to 0.5)]
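Two of the criteria in the figure can be sketched as follows. The exact computation used for the slide is not given, so this assumes the common definitions: perplexity as the exponential of the negative log-likelihood per token, and entropy as the mean Shannon entropy of the per-document topic proportions. The token count, the topic numbers and the log-likelihoods are hypothetical placeholders, not the results shown above.

```python
import math

def perplexity(log_likelihood, n_tokens):
    """Perplexity = exp(-log-likelihood / number of tokens); lower is better."""
    return math.exp(-log_likelihood / n_tokens)

def mean_topic_entropy(doc_topic_rows):
    """Mean Shannon entropy (natural log) of per-document topic proportions."""
    entropies = []
    for row in doc_topic_rows:
        entropies.append(-sum(p * math.log(p) for p in row if p > 0))
    return sum(entropies) / len(entropies)

# Hypothetical values for illustration only (not the numbers from the talk).
n_tokens = 190_000
log_likelihoods = {20: -1_275_000.0, 40: -1_250_000.0, 60: -1_230_000.0,
                   80: -1_205_000.0, 100: -1_215_000.0}
perplexities = {k: perplexity(ll, n_tokens) for k, ll in log_likelihoods.items()}
# Pick the number of topics with the lowest perplexity.
best_k = min(perplexities, key=perplexities.get)
print(best_k, round(perplexities[best_k], 1))
```

In practice perplexity is usually evaluated on held-out documents, because the in-sample likelihood tends to keep improving as topics are added.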

Page 8: Text mining Montpellier

Topics detection

Millennium Ecosystem Assessment

Topics = 80

[Figure: heatmap of the training documents (rows: Millennium Ecosystem Assessment chapters and journal articles on ecosystem services) against the 80 topics (columns), with color key and histogram (value axis 0.2 to 0.6)]

Page 9: Text mining Montpellier

Topics matching

Ecosystem Services                  Topics  Words

Ecosystem Processes
  Soil formation                    12      species, fallows, soil, biomass, grazed, fallow
  Primary production                19      soil, carbon, species, freshwater, carbon
  Nutrient cycling                  4       nutrient, organic, fertility, soil, ocean, cycling
  Water cycling                     39      discharge, amazon, inundation, basin, slope
  Biodiversity                      6       species, plant, richness, biodiversity, services

Provisioning services
  Freshwater                        75      freshwater, renewable, supply, freshwater, river
  Food crops                        5       nutrition, countries, livestock, africa, health
  Livestock                         11      grazing, steppe, soil, ground, biomass, pools
  Fisheries                         27      marine, fisheries, fish, coastal, system, ocean
  Wild animals and plants products  -       -
  Timber                            2       forest, countries, carbon, plantations, trees, fao
  Wood fuel                         14      timber, products, tao, forest, fuelwood, cotton
  Feed, fuel and fiber crops        14      timber, products, tao, forest, fuelwood, cotton
  Hydropower                        -       -

Regulating services
  Air quality regulation            29      atmospheric, emissions, carbon, warming
  Climate regulation                29      atmospheric, emissions, carbon, warming
  Water purification                73      waste, chemicals, organic, health, exposure
  Regulation of soil erosion        72      cycle, soil, spatial, erosion, groundwater
  Pest and disease                  1       disease, health, infectious, malaria, transmission
  Pollination                       45      pollination, provision, benefits, farmers, landscape
  Natural hazards                   67      fire, events, floods, drivers, wetlands, coastal

Cultural services                   41      landscapes, tourism, traditional, knowledge, heritage
  Recreation                        -       -
  Aesthetic values                  -       -
  Knowledge and educational values  -       -
  Spiritual and religious           -       -
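The slides do not say how topics were matched to services, so the following is only one plausible way to automate the step, sketched under stated assumptions: each service gets a hand-picked keyword list (hypothetical here), and the matching topic is the one whose top words overlap most with it. The topic words are taken from the table above; the function name `best_topic` is made up for this example.

```python
def best_topic(service_keywords, topic_top_words):
    """Return (topic_id, overlap) for the topic sharing the most words.

    Returns (None, 0) when no topic shares any word with the keywords.
    """
    best_id, best_overlap = None, 0
    keywords = set(service_keywords)
    for topic_id, words in topic_top_words.items():
        overlap = len(keywords & set(words))
        if overlap > best_overlap:
            best_id, best_overlap = topic_id, overlap
    return best_id, best_overlap

# Top words of three topics, copied from the matching table.
topic_top_words = {
    27: ["marine", "fisheries", "fish", "coastal", "system", "ocean"],
    29: ["atmospheric", "emissions", "carbon", "warming"],
    45: ["pollination", "provision", "benefits", "farmers", "landscape"],
}

# Hypothetical keyword lists; the authors' actual criteria are not stated.
service_keywords = {
    "Fisheries": ["fish", "fisheries", "catch"],
    "Climate regulation": ["carbon", "emissions", "climate"],
    "Hydropower": ["hydropower", "dam"],
}

for service, keywords in service_keywords.items():
    print(service, best_topic(keywords, topic_top_words))
```

A service with no overlapping topic comes back as `(None, 0)`, which corresponds to the empty rows in the table.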

Page 10: Text mining Montpellier

Topics detection

Corpus = 812 papers

Topics = 80

[Figure: two heatmaps of regime shifts (rows) against ecosystem services (columns), labeled "Human Readers" and "Computer reading", each with color key and histogram]

Columns (ecosystem services): Soil formation, Primary production, Nutrient cycling, Water cycling, Biodiversity, Freshwater, Food crops, Livestock, Fisheries, Timber, Wood fuel, Climate regulation, Water regulation, Regulation of soil erosion, Pest and disease regulation, Pollination, Natural hazard regulation, Cultural

Rows (regime shifts): West Antarctic Ice Sheet collapse, Tundra to forest, Thermohaline circulation, Steppe to tundra, Soil structure, Soil salinization, Sea grass collapse, Salt marshes, River channel change, Peatlands, Monsoon weakening, Marine foodwebs, Marine eutrophication, Mangroves collapse, Kelps transitions, Hypoxia, Greenland, Forest to savannas, Floating plants, Fisheries collapse, Eutrophication, Encroachment, Dry land degradation, Coral transitions, Bivalves collapse, Arctic Sea Ice

Page 11: Text mining Montpellier

Topics detection

Corpus = 812 papers

Topics = 80

[Figure: two heatmaps of regime shifts (rows) against ecosystem services (columns), titled "False positives" and "False negatives"]
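How such error maps could be derived from the two readings is sketched below. This assumes the human coding is a 0/1 matrix and the computer reading is a score that becomes a detection once it reaches a cutoff; the cutoff of 0.5 and the toy matrices are hypothetical, not values from the talk.

```python
def error_matrices(human, computer, threshold=0.5):
    """Compare a 0/1 human coding with computer scores, cell by cell.

    Returns two 0/1 matrices: false positives (computer detects a service
    the human readers did not code) and false negatives (the reverse).
    """
    false_pos, false_neg = [], []
    for human_row, computer_row in zip(human, computer):
        fp_row, fn_row = [], []
        for h, score in zip(human_row, computer_row):
            detected = score >= threshold
            fp_row.append(int(detected and not h))
            fn_row.append(int(bool(h) and not detected))
        false_pos.append(fp_row)
        false_neg.append(fn_row)
    return false_pos, false_neg

# Toy example: 2 regime shifts (rows) by 3 ecosystem services (columns).
human = [[1, 0, 1],
         [0, 0, 1]]
computer = [[0.9, 0.7, 0.1],
            [0.2, 0.4, 0.8]]
false_pos, false_neg = error_matrices(human, computer)
print(false_pos)
print(false_neg)
```

Raising the cutoff trades false positives for false negatives, so any comparison of the two maps depends on that choice.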

Page 12: Text mining Montpellier

Zooming problem

Ecosystem services categories make sense for MEA authors, but not necessarily for other scientists, readers or… computers.

Page 13: Text mining Montpellier

Concluding remarks

• It’s hard to make students do homework, but even harder to make your computer do it automatically for you…

• Broad categories such as supporting & cultural services are hard to identify; we did much better with regulating services.

• Our results open the possibility of using text mining to monitor trends in ecosystem services at larger scales and in real time.

Errors

[Figure: heatmap of errors, regime shifts (rows) against ecosystem services (columns), with color key and histogram (value axis 0 to 1)]

Rows, with the number shown in parentheses on the slide: West Antarctica Ice Sheet collapse (50), Tundra to forest (17), Thermohaline circulation (26), Steppe to tundra (10), Soil structure (2), Soil salinization (7), Sea grass collapse (6), Salt marshes (17), River channel change (22), Peatlands (21), Monsoon weakening (9), Marine foodwebs (50), Marine Eutrophication (8), Mangroves collapse (6), Kelps transitions (11), Hypoxia (7), Greenland (20), Forest to savannas (39), Floating plants (4), Fisheries collapse (75), Eutrophication (20), Encroachment (12), Dry land degradation (26), Coral transitions (33), Bivalves collapse (13), Arctic Sea Ice (19)

Page 14: Text mining Montpellier

Questions?

e-mail: [email protected] | twitter: @juanrocha

slides: http://criticaltransitions.wordpress.com/ | data: www.regimeshifts.org

Page 15: Text mining Montpellier

Subscribe to our newsletter: www.stockholmresilience.su.se/subscribe

Thank you!