Azerbaijan2.2.3 sofi and text scanning


A look at data mining for use in futures and foresight research.


State of the Future Index and Text Mining

Second Semester

Class 3

The Azerbaijan State Economic University March 4, 2011

Ted Gordon, Senior Fellow and Jerome Glenn, Executive Director, The Millennium Project

Alan Porter, Director of R&D, Search Technology, Inc. & Co-Director, Technology Policy & Assessment Center Georgia Tech

Assignments From Last Session

• Create a Futures Wheel on the technology you selected

• Read Chapter on State of the Future Index
• Read Text Scanning
• Visit www.gccsr.org (collective intelligence example)
• SOFI Assignment
• Peace Scenarios Assignment: What are the conditions for Peace and what are previous agreements

Today’s Topics

• The concept of SOFI
– The role of Trend Impact Analysis
– The role of Real Time Delphi

• Text Mining

The Class Project (Session 1)

• Produce an actual (not just a demonstration) State of the Future Index for Azerbaijan. This requires:
– Understanding data availability
– Choosing variables for the SOFI and collecting data
– Identifying future events that could change the course of the variables
– Forecasting the variables
– Computing the SOFI
– Finding policy implications

State of the Future Index

• A synthesis of variables to help answer the question “Is the outlook for the future improving?”

• A tool for
– Policy analysis
– Improving discussion about the future
– Education
– National comparisons

The Concept

• Developed by the Millennium Project in 2002
• A dynamic method for studying whether the future seems to be improving or not and for testing the effects of policy on the future outlook

• Combines 20-30 variables important to the world or the country

Examples of Indexes

• Human Development Index• Dow Jones Industrial Average• Index of Leading Indicators

The State of the Future Index (SOFI)

• Constructed of weighted variables such as life expectancy and employment.

• Retrospective 20 years; prospective 10 years
• Judgments about variables, weights, expectations, and future developments
• For the world index, experts were recommended by the Millennium Project’s Nodes.
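The idea of an index built from weighted variables can be sketched in Python as follows. This is only an illustration: scaling each variable between a judged worst and best value, the variable names, the weights, and every number are assumptions made for the example, not the Millennium Project's actual procedure or data.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    name: str
    weight: float  # judged importance of the variable (hypothetical)
    worst: float   # judged worst plausible value (hypothetical)
    best: float    # judged best plausible value (hypothetical)


def normalize(value, worst, best):
    """Scale a value to 0..1, where 0 is the worst case and 1 the best.

    Works both when higher is better (best > worst) and when lower
    is better (best < worst, e.g. unemployment).
    """
    score = (value - worst) / (best - worst)
    return max(0.0, min(1.0, score))


def sofi(variables, values):
    """Weighted average of the normalized variables for one year."""
    total_weight = sum(v.weight for v in variables)
    weighted = sum(
        v.weight * normalize(values[v.name], v.worst, v.best)
        for v in variables
    )
    return weighted / total_weight


# Made-up inputs for two years of a two-variable index.
variables = [
    Variable("life_expectancy", weight=3.0, worst=50.0, best=85.0),
    Variable("unemployment_pct", weight=2.0, worst=20.0, best=2.0),
]
values_2000 = {"life_expectancy": 67.5, "unemployment_pct": 11.0}
values_2010 = {"life_expectancy": 71.0, "unemployment_pct": 8.0}

index_2000 = sofi(variables, values_2000)
index_2010 = sofi(variables, values_2010)
print(index_2000, index_2010)
```

Repeating the calculation for each historical and forecast year gives a single curve; a rising curve suggests the outlook is improving under the chosen variables and weights.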

Indexes: Caveats

• Indexes may lead to oversimplification and loss of detail

• An index may hide cultural differences
• Data behind the index must be preserved and transparent
• Apparent precision may be mistaken for accuracy

Three Types of SOFI

• Global (comprised of global variables)

• National comparison (comprised of a standard set of variables, national data)

• National focus (comprised of variables of nation’s choosing, national data)

The Global SOFI Variables

R&D expenditures 

Corruption 

Improved water sources 

Percent unemployment 

School Enrollment, secondary 

Renewable Energy  

Energy consumption per GDP

Internet Users

Carbon Dioxide emissions

People killed or injured in terrorist attacks

GDP per capita 

Physicians per 1000 population 

Literacy rate

Population growth rate 

Forest Lands 

People voting in elections 

Life expectancy at birth 

Food availability 

Infant mortality 

Number of refugees

Homicides, intentional 

Debt Service (% of GNI)

Seats held by women in parliaments 

Prevalence of HIV 

People Living in Extreme Poverty

Global Temperature Anomalies

Nuclear Proliferation

Number of infectious diseases

Number of armed conflicts  

People in Countries that are Free

Example of SOFI (State of the Future Index) variables:
• Infant mortality
• Food availability
• GNP per capita
• Access to fresh water
• CO2 emissions
• Literacy
• Wars
• AIDS deaths
• Terrorist attacks
• Debt ratio
• Unemployment
• Calories per capita
• Health care
• Forest lands
• Rich poor gap
• …

Comparison of SOFIs

[Figure: Global SOFI, with series labeled 2001, 2002, 2003, 2004, and 2005; x-axis 1980 to 2020, y-axis 0.5 to 1.2]

Time Series

[Figure: two charts of % of 12-17 year olds who ever smoked (0 to 100) against year (1970 to 2010); labels: Deterministic, Non-deterministic]

Unprecedented events make the difference

Trend Impact Analysis

• Used to assess consequences of future developments on the course of the extrapolations of the variables

• Developments expressed in probabilistic terms
• Impacts in terms of percentage change in the end point of a variable
• Monte Carlo solution
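The Monte Carlo step can be sketched in Python as below. It is a simplified illustration under stated assumptions: each development either occurs or not with its judged probability, impacts are applied multiplicatively to the extrapolated end point, and the probabilities, impacts, and baseline value are invented for the example. A full analysis would also model how each impact builds up over time.

```python
import random
import statistics


def tia_endpoints(baseline_end, developments, runs=10_000, seed=0):
    """Simulate the end point of one variable under uncertain developments.

    developments: list of (probability, percent_impact) pairs, where
    percent_impact is the judged change in the end point if the
    development occurs (hypothetical values in this sketch).
    """
    rng = random.Random(seed)
    endpoints = []
    for _ in range(runs):
        value = baseline_end
        for probability, percent_impact in developments:
            # Each development occurs independently in this run.
            if rng.random() < probability:
                value *= 1.0 + percent_impact / 100.0
        endpoints.append(value)
    return endpoints


# Made-up example: a literacy extrapolation ending at 80 percent.
developments = [
    (0.65, 3.0),   # e.g. cheap laptops become widely available
    (0.30, 2.0),   # e.g. many new teachers enter the field
    (0.30, -4.0),  # e.g. a large pandemic
]
endpoints = tia_endpoints(80.0, developments)

# Lower quartile, median, and upper quartile of the simulated end points.
lq, med, uq = statistics.quantiles(endpoints, n=4)
print(lq, med, uq, min(endpoints), max(endpoints))
```

The quartiles and the extremes of the simulated end points correspond to the kind of band (lower quartile, median, upper quartile, best, worst) that a TIA chart draws around the baseline extrapolation.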

Examples of Developments
• A nuclear accident such as Three Mile Island (causes many nuclear nations to de-nuclearize) (10%)
• A very good, fast $150 laptop computer becomes available everywhere (65%)
• Advent of a “teachers without borders” movement (50,000 new teachers in the field) (30%)
• A pandemic of the scale of HIV/AIDS (30%)
• At least 10 countries introduce effective policies designed to increase birth rates to avoid population implosion (75%)
• Automation and robotics increase productivity 25% in enough countries to make “jobless” economic growth (50%)
• Availability of a cheap effective anti-aging therapy (35%)
• Bad weather (storms, hurricanes, floods) causes widespread crop failures in at least one year (25%)

Example of TIA Results: Literacy

[Figure: literacy as percent of population (50 to 95) against year (1980 to 2020); series: Base, UQ, MED, LQ, Best, Worst]

SOFI in IFs

• The IFs model
– Is a multi-equation model that links economic, demographic, and social measures
– Covers 182 countries
– Uses both system dynamics relationships (flows and stocks) and econometric relationships
– Includes “cohort-component systems for population; markets for production, exchange, and consumption; and social accounting matrices for financial flows”
– SOFI computations have been integrated into the model

Dashboard Experiments: Getting Better

Dashboard Experiments: Getting Worse


Global SOFI

Collecting Judgments for SOFI

• Judgments include:
– What variables?
– What weights?
– What best and worst expectations?
– What developments can affect the variables?
– What are their probabilities and impacts?

• Techniques
– Delphi
– Real Time Delphi
– Others

RTD Features
• The choice of invited participants is careful and deliberate
• The questionnaire is presented on line
• There are no sequential questionnaires as in conventional Delphi
• Group averages and reasons for answers of prior participants are shown
• When the respondent comes back to the study in a minute or a day, the original input form with their answers is presented
• By then others may have contributed judgments, the averages and reasons may have changed
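The features listed above can be sketched as a small Python data structure. This is a hypothetical illustration only: the class and method names are invented here and do not describe the software behind realtimedelphi.net.

```python
from statistics import mean


class RealTimeDelphiQuestion:
    """One question in a hypothetical Real Time Delphi study."""

    def __init__(self, text):
        self.text = text
        # participant -> (estimate, reason); only the latest answer is
        # kept, so there are no sequential rounds.
        self.answers = {}

    def submit(self, participant, estimate, reason=""):
        """Record a new answer or overwrite the participant's earlier one."""
        self.answers[participant] = (estimate, reason)

    def group_view(self):
        """Current group average and the reasons given so far."""
        if not self.answers:
            return None, []
        estimates = [estimate for estimate, _ in self.answers.values()]
        reasons = [reason for _, reason in self.answers.values() if reason]
        return mean(estimates), reasons

    def my_form(self, participant):
        """The participant's own earlier input, shown when they return."""
        return self.answers.get(participant)


question = RealTimeDelphiQuestion("Probability of the development (%)")
question.submit("participant_a", 40, "Historical precedent is weak")
question.submit("participant_b", 60)
first_average, first_reasons = question.group_view()

# Participant A returns later, sees the group view, and revises.
question.submit("participant_a", 50, "Persuaded by the group")
second_average, second_reasons = question.group_view()
print(first_average, second_average)
```

Because each participant has exactly one current answer, the group average always reflects the latest judgments, which is what lets the study run without fixed rounds.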

An Azerbaijan Student RTD

• Collects judgments about variables in SOFI
• URL: www.realtimedelphi.net
• Sign on
• Use code “azer” (no quotes, lower case)

• You may answer the questions

Developments

Variables (ordered by importance):

No. Development | Prob | Import | Ready
28. Massive financial crisis triggers a world depression as large as in the 1930's | 42.39 | 7.56 | 7.14
53. New technology displaces carbon fuels as cheapest energy source | 30.70 | 7.33 | 6.43
45. Gulf Cooperation Council moves countries in the region toward EU-like arrangements agreements | 46.43 | 7.22 | 7.04
26. Extremist political religious groups change the current direction of the region | 47.25 | 7.21 | 7.42
48. Aging population doubles government social costs | 65.42 | 7.16 | 6.44
44. Water scarcity problems are essentially solved (for example, through low cost desalination) | 27.69 | 7.08 | 6.77
37. Extremists detonate nuclear devices, dirty bombs, or other weapons of mass destruction | 24.97 | 7.00 | 5.92
39. Stability achieved in Iraq | 45.38 | 7.00 | 7.60
32. Renewable energy sources, like wind and solar, provide 50% of the world's power | 26.30 | 6.84 | 6.70
29. Most glaciers melt twice as fast as in the decade 2000 - 2009 | 51.19 | 6.67 | 6.21
41. OPEC's ability to control oil production dramatically dissipates | 41.66 | 6.65 | 7.29
40. Cyber warfare is more difficult to detect and triples in damages from 2010 levels. | 63.62 | 6.54 | 6.45
51. Economic growth spike in other global regions limits ability to attract talent | 73.46 | 6.44 | 6.79
42. Middle East oil producing countries successfully diversify beyond energy production (50% of income from sources other than oil) | 34.57 | 6.32 | 6.17

Resources
• Real Time Delphi on line: realtimedelphi.net, code “azer”
• Millennium Project reports <www.stateofthefuture.org>
• Tutorials for Real Time Delphi and SOFI <http://www.mpcollab.org/learning/course/view.php?id=3>
• International Futures (IFs) http://www.ifs.du.edu/

Why Tech Mining?
• Welcome to the age of too much information.
• We need to treat text as data to gain intelligence.
• Mine “ST&I” [Science, Technology & Innovation] information resources to answer technology management questions = Tech Mining.
• Enable Open Innovation

© 2009 Search Technology, Inc.

Tech Mining leverages text and data mining to analyze information. Since tech mining allows us to use computers to “read” the information, we can digest far more information than we could traditionally absorb.

What does Tech Mining give us?

Text and data mining techniques are good at addressing:
• WHO?
• WHAT?
• WHEN?
• WHERE?

Additional questions usually require more human insight:
• HOW?
• WHY?

Types of Questions

Technical Information

• Science, Technology & Innovation (“ST&I”) Databases (e.g., Web of Science; CSCD, Thomson Innovation)

• Internet Sources (e.g., Googling)

• Technical Expertise

Contextual Information

• Business, competition, customer, policy, popular content Databases (e.g., Thomson One)

• Internet Sources (e.g., blogs, website profiling)

• Business Expertise

Six information types

VantagePoint Import Filters and Tools

On-line Data Sources
Cambridge Scientific Abstracts; Delphion; Dialog; EBSCOHost; Ei Engineering Village; Factiva; ISI Web Of Knowledge; Lexis Nexis; Micropatent; Ovid; Patbase; Questel-Orbit; SilverPlatter; STN; Thomson Innovation

Custom Data
Comma/tab delimited tables; Microsoft Excel and Access; SmartCharts; XML

Databases
Aerospace; Art Abstracts; Biobase; Biological Abstracts; Biological Sciences; Biosis; Biotechno; Business & Industry; CAPlus (AnaVist export); Cassis; CBNB; Claims; Computer & Info Systems; Corrosion; Current Contents; Derwent Biotech Abstracts; Derwent Innovations Index; Derwent World Patent Index; Ei Compendex; EMBase; EnCompass Literature; EnCompass Patents; Energy; EnergySciTech; Engineering Materials Abstr; Envr Sci & Pollution Mgmt; ERIC; EuroPat; FamPat; Focust; Food Sci & Tech; Foodline Market; Foodline Science; Forege; Frosti; FSTA; Gale PROMT; GeoRef; Global Reporter; IFIPAT; IFIUDB; INPADOC; INSPEC; IPA; ISD; ITRD; JAPIO; JICST; Kosmet; LGST; MATBUS; Medline; METADEX; Mgmt and Org Studies; Micropatent Materials; Mobility; NSF Awards; NTIS; Pascal; Patent Citation Index; PCT; PCTPAT; Phin; Pira; Pluspat; PROMT; PsycINFO; PubMed; Rapra; Recent Refs; Reference Manager; Science Citation Index; SciSearch; Scopus; Tech Research; ToxFile; Transport; USApps; USPat; Waternet; WaterResAbs; Web of Science; WeldaSearch; Wisdomain

Record/Field Tools
• Combine duplicate records
• Remove duplicate records
• Create “frankenrecords” (merge records from dissimilar sources)
• Classify records
• Merge fields
• Clean up fields
• Apply thesauri

How to do Tech Mining: 8-steps

1. Spell out the questions and how to answer them
2. Get suitable data
3. Search (iterate)
4. Import into text mining software (e.g., VantagePoint)
– http://www.thevantagepoint.com/
5. Clean the data
6. Analyze & interpret
7. Represent the information well – communicate!
8. Standardize and semi-automate where possible
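Steps 5 and 6 can be illustrated with a short Python sketch that removes duplicate records and then counts field values to answer "who, when, where" questions. The record layout, field names, and sample records are hypothetical and made up for the example; this is not how VantagePoint itself works internally.

```python
from collections import Counter


def normalize(text):
    """Collapse whitespace and ignore case so near-identical strings match."""
    return " ".join(text.split()).casefold()


def remove_duplicates(records):
    """Keep the first record for each (normalized title, year) pair."""
    seen = set()
    unique = []
    for record in records:
        key = (normalize(record["title"]), record["year"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def top_values(records, field, n=3):
    """Count how many records mention each value of a field."""
    counts = Counter()
    for record in records:
        value = record[field]
        # A field may hold one value (year) or a list (authors).
        counts.update(value if isinstance(value, list) else [value])
    return counts.most_common(n)


# Made-up bibliographic records; the second is a duplicate of the first
# that differs only in capitalization and spacing.
records = [
    {"title": "Synthesis of compound X", "authors": ["Author A", "Author B"],
     "affiliation": "University 1", "year": 2008},
    {"title": "synthesis of  compound X", "authors": ["Author A", "Author B"],
     "affiliation": "University 1", "year": 2008},
    {"title": "Thermodynamic properties of Y", "authors": ["Author A"],
     "affiliation": "Institute 2", "year": 2009},
]

unique = remove_duplicates(records)
who = top_values(unique, "authors")
when = top_values(unique, "year")
where = top_values(unique, "affiliation")
print(len(unique), who, when, where)
```

Ranked counts like these are the raw material for a research profile; interpreting them (the "how" and "why") still requires human judgment.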

Doing the Tech Mining Process

Sample Applications
• Developments for use in TIA, a step in SOFI
• R&D Portfolio Management
• Research Evaluation
• Research Profiling
• Tracking R&D over time
• Research Network Analyses
• Monitoring Research Knowledge Flows
• Geo-mapping
• ST&I Indicators

Case Examples

• Ceramics in Engines
• Overcoming Management Resistance
• Jumping Domains
• “Discovering” new technology

A Success Story: Ceramics Save Tax Dollars
• TARDEC (Tank-Automotive Research, Development & Engineering Center)
• Task in 1996: Reassess a “loser technology” – ceramics for automotive engine applications
• R&D Profile: Amount of activity declines
• But -- Uncovers clues of significant maturation

Case Examples

Ceramics Engine Publications (85-96)

Technology Maturity & Keyword Diffusion

[Figure: Azerbaijan Research, 2005-09 on Global Map of Science, SCI-SSCI 2007. Map regions: Cognitive Sci.; Agri Sci; Biomed Sci; Chemistry; Physics; Engineering; Env Sci & Tech; Mtls Sci; Infectious Diseases; Psychology; Social Studies; Clinical Med; Computer Sci.; Business & MGT; Geosciences; Ecol Sci; Econ. Polit. & Geography; Health & Social Issues]

Case Examples

Research Profile: Azerbaijan 2005-09 by Disciplines (top 5)

• Chemistry [475]
– Author Affiliations (top 3): Natl Acad Sci Azerbaijan [119]; Baku State Univ [95]; Azerbaijan Acad Sci [48]
– Key Terms (top 5): synthesis [72]; thermodynamic properties [27]; Density [24]; Water [23]; methanol [21]
– Authors (top 3): Abdulagatov, I M [25]; Magerramov, A M [19]; Chyragov, F M [18]
– Year 2008-09: 48% of 475

• Materials Sci [382]
– Author Affiliations (top 3): Azerbaijan Acad Sci [95]; Baku State Univ [66]; Azerbaijan Natl Acad Sci [64]
– Key Terms (top 5): effect [29]; TlInS2 [19]; Incommensurate phase [17]; CRYSTALS [17]; SINGLE-CRYSTALS [14]
– Authors (top 3): Suleymanov, R A [16]; Altindal, S [14]; Tagiev, O B [13]; Mammadov, T S [13]
– Year 2008-09: 51% of 382

• Engineering [333]
– Author Affiliations (top 3): Natl Acad Sci Azerbaijan [83]; Baku State Univ [74]; Azerbaijan Acad Sci [38]
– Key Terms (top 5): methanol [14]; Initial stresses [11]; sufficient conditions [10]; thermodynamic properties [10]; approximation [10]; boundedness [10]
– Authors (top 3): Akbarov, S D [22]; Guliyev, V S [16]; Khanmamedov, A K [9]; Abdulagatov, I M [9]; Nasibov, S M [9]
– Year 2008-09: 50% of 333

• Physics [231]
– Author Affiliations (top 3): Azerbaijan Acad Sci [58]; Baku State Univ [47]; Azerbaijan Natl Acad Sci [35]
– Key Terms (top 5): MODEL [22]; PHYSICS [12]; SCATTERING [10]; VARIABILITY [10]; SYSTEMS [9]
– Authors (top 3): Shahverdiev, E M [13]; Shore, K A [13]; Aliev, T M [12]; Sultansoy, S [12]
– Year 2008-09: 51% of 231

• Biomed Sci [105]
– Author Affiliations (top 3): Baku State Univ [27]; Azerbaijan Med Univ [9]; Azerbaijan Acad Sci [7]
– Key Terms (top 5): EFFICIENCY [10]; sturgeons [8]; diencephalon [7]; CYTOARCHITECTONIC ANALYSIS [7]; Azerbaijan [7]; EXPRESSION [7]; organization [7]
– Authors (top 3): Zeynalov, R [9]; Musayev, I [9]; Rustamov, E K [8]; Dadasheva, N [8]
– Year 2008-09: 39% of 105

Case Examples

Research Profile: Baku State University – 5 Researchers

• Magerramov, A M [19]
– Subject Category (top 5 items): Chemistry, Applied [8]; Chemistry, Organic [7]; Chemistry, Physical [3]; Materials Science, Multidisciplinary [2]; Optics [2]
– Key Terms (top 5 items): reaction [9]; synthesis [9]; DERIVATIVES [4]; reactions [3]; formation [3]
– Authors (top 5 items): Magerramov, A M [19]; Allakhverdiev, M A [9]; Mamedov, I G [5]; Bairamov, M R [5]; Farzaliev, V M [4]; Rzaeva, I A [4]
– Year 2008-09: 68% of 19

• Chyragov, F M [18]
– Subject Category (top 5 items): Chemistry, Analytical [14]; Chemistry, Inorganic & Nuclear [4]
– Key Terms (top 5 items): complexation [15]; photometric determination [11]; stability constants [8]
– Authors (top 5 items): Chyragov, F M [18]; Gadzhieva, S R [12]; Makhmudov, K T [6]; Alieva, R A [5]; Guseinov, F E [4]
– Year 2008-09: 17% of 18

• Allakhverdiev, M A [14]
– Subject Category (top 5 items): Chemistry, Applied [8]; Chemistry, Organic [5]; Chemistry, Physical [3]; Energy & Fuels [2]; Engineering, Chemical [2]; Engineering, Petroleum [2]
– Key Terms (top 5 items): synthesis [8]; reaction [6]; antioxidant activity [3]; cumene [3]
– Authors (top 5 items): Allakhverdiev, M A [14]; Magerramov, A M [9]; Farzaliev, V M [6]; Rzaeva, I A [6]; Guseinova, A T [4]
– Year 2008-09: 79% of 14

• Gadzhieva, S R [14]
– Subject Category (top 5 items): Chemistry, Analytical [9]; Chemistry, Inorganic & Nuclear [5]
– Key Terms (top 5 items): complexation [11]; photometric determination [8]; stability constants [6]
– Authors (top 5 items): Gadzhieva, S R [14]; Chyragov, F M [12]; Makhmudov, K T [6]; Guseinov, F E [4]; Pashaev, F G [3]
– Year 2008-09: 29% of 14

• Babanly, M B [12]
– Subject Category (top 5 items): Materials Science, Multidisciplinary [8]; Electrochemistry [2]; Chemistry, Inorganic & Nuclear [2]
– Key Terms (top 5 items): X-ray diffraction [7]; thermodynamic properties [6]; differential thermal analysis [6]; standard entropies [5]
– Authors (top 5 items): Babanly, M B [12]; Babanly, N B [2]; Sadygov, F M [2]; Shykhyev, Y M [2]; Imamalieva, S Z [2]; Yusibov, Y A [2]
– Year 2008-09: 100% of 12

Case Examples

Co-authoring among top Baku State Researchers

Assignments
• Lecture 4, March 11: Delphi, Cross Impact and Trend Impact Analysis; read appropriate chapters in FRMv3
• Go to Real Time Delphi on line
– www.realtimedelphi.net; Code: azer
– Register to participate.
– Provide answers to given questions
– Generate list of variables and developments to consider
• Peace Scenarios Assignment: Based on the conditions for Peace and previous agreements, generate several “causal paths” by which peace, in retrospect, may have been reached.
