The 2016 Watson Analytics Global Competition Examining the Relationship between the U.S. Economy and Temperature Change Tera Black Christopher Hutwelker Jeffrey Peck Dr. Michael Gendron Faculty Sponsor Central Connecticut State University New Britain, Connecticut April 6, 2016

The 2016 Watson Analytics Global Competition

Examining the Relationship between the U.S. Economy and Temperature Change

Tera Black

Christopher Hutwelker

Jeffrey Peck

Dr. Michael Gendron

Faculty Sponsor

Central Connecticut State University

New Britain, Connecticut

April 6, 2016

U.S. Economy & Temperature Change 2

Contents

Table of Figures
Introduction
Literature Review
Methodology
    Hypothesis
    Data Sources and Cleanup
    IBM Data Set Scores
Results
    Predictions
    Social Media
    Temperature Anomalies
    Gross Domestic Product Growth
    Industry Sectors Breakdown
Limitations
Future Research
Discussion
References
Data Sources

Table of Figures

Figure 1 Predictive Tool
Figure 2 Social Media Dashboard Pane
Figure 3 Temperature Anomalies Dashboard Pane
Figure 4 GDP Growth Dashboard Pane
Figure 5 Top Industry Predictor Dashboard Pane
Figure 6 Related Industries Dashboard Pane
Figure 7 Related Industries Dashboard Pane
Figure 8 Percentage of Agricultural Gross Output by the Total Private Industries Gross Output

Introduction

The World Economic Forum recently said, “Climate change is the most severe global

economic risk of 2016” (Hulac, 2016). To determine the effects climate change has had on the

U.S. economy over the past decade, Watson’s technology will compare the relationship between

climate change and the economy. Prior research on this topic is mostly limited to predicting the

future rather than analyzing past years to determine whether any relationship exists. Numerous

studies use the relationship between one specific indicator of climate change and the economy to

make predictions for the future. However, few studies compare the effect an indicator of

climate change has had in past years as a basis for predicting future years.

To determine if there is a relationship between the U.S. economy and climate change we

will use two measurements. The U.S. economy is measured using the annual Gross Domestic

Product (GDP) metric. The climate change indicator used was annual temperature anomalies

since temperature is “one of the most obvious signals of climate change” (National Oceanic and

Atmospheric Administration, 2016). The purpose of this study is to determine if a relationship

exists between anomalous temperature change and the U.S. GDP.

The specific goal of this study is to analyze the effect temperature anomalies have on the

gross output of the agricultural industry. Starting in 1960, the relationship between the

agricultural industry’s gross output as a percentage of total gross output of private industries and

the temperature anomalies will be determined. With the ability to use Watson, the potential

benefits of determining the relationship are significant. Being able to predict the agricultural

gross output using temperature anomalies stands to improve the decision making of

policymakers. Providing agricultural workers an improved means of assessing and forecasting

future years will strengthen the industry and improve the economy. This potential relationship

will also be applicable in other countries providing them the same economy strengthening tools.

Literature Review

A study by Burke et al. (2015) analyzed the global non-linear effect of

temperature on economic production. Economic productivity is best defined as “the efficiency

with which societies transform labor, capital, energy, and other natural resources into new goods

or services”. Further, this study states that as the effects of climate change increase, future

economic productivity will decrease. This study examines how a specific indicator affects the

global economy without considering potential impacts of other indicators.

The University of Maryland produced a report in 2007 on the economic impact of climate

change and the potential cost of inaction against climate change. Included in the report were the

costs and benefits for each region of the United States (e.g., New England, Midwest, and

Southeast). This report also addresses the differences among researchers regarding how climate

change affects the economy. Most researchers believe climate change will negatively affect the

economy, yet other researchers believe climate change will improve the economy. The

improvement is related to new industries that will be created as a result (e.g., green energy

resources, among others). The report by the University of Maryland dismissed this theory by

stating “although there may be temporary benefits from a changing climate, the costs of climate

change rapidly exceed benefits and place major strain on public sector budget, personal income

and job security” (University of Maryland, 2007). The report concludes by stating that each area

of the U.S. will be affected differently, but will have a cumulative negative effect on the U.S.

economy. While the University of Maryland used data from previous years to find its results, the

correlation over the past thirty years was not determined.

Methodology

Within its constraints, the project centers on the use of IBM Watson Analytics. This tool allows

for easy analysis, visualization, and prediction across one or many data sets. Alongside Watson,

other applications such as Microsoft Excel were used to prepare data for upload to IBM. Together

these tools provide the means of testing our hypothesis.

Hypothesis

Gradual and rapid climate changes are something currently affecting the world around us.

This research will evaluate if climate change has a direct relationship with Gross Domestic

Product in the United States and if those effects can be predicted using IBM Watson.

Data Sources and Cleanup

The constructed data set includes a section of climate information consisting of world

averages and the annual minimum, maximum, and average temperatures with their associated

anomalies for 1960 to 2014. The average annual precipitation and the Palmer Drought Severity

Indices were also exported for analytical use. This data is available from the National Oceanic

and Atmospheric Administration (NOAA) ‘Climate at a Glance’ section. All of the GDP-related

data was exported from the World Bank website, whose data is limited to the date range of 1960

to present; this is the rationale for the date range selected for the weather data.

The world weather data was obtained from the climate data set on the National Aeronautics and

Space Administration (NASA) website. NASA records the surface temperatures for the world and displays

the changes between 1880 and the present day in Celsius. This information can be

downloaded in pre-averaged CSV files, which were converted to Fahrenheit to be consistent with

the rest of the weather data used within the project.
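
The paper does not describe how the Celsius-to-Fahrenheit conversion was carried out. As a minimal sketch, assuming the NASA values are temperature changes (anomalies) rather than absolute readings, the conversion uses only the 9/5 scale factor, because the 32-degree offset cancels out when two temperatures are subtracted. The function names and sample values below are hypothetical and are not taken from the NASA files.

```python
def celsius_to_fahrenheit(temp_c: float) -> float:
    """Convert an absolute temperature from Celsius to Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def celsius_anomaly_to_fahrenheit(anomaly_c: float) -> float:
    """Convert a temperature difference (anomaly); no 32-degree offset applies."""
    return anomaly_c * 9.0 / 5.0


# Hypothetical anomaly values for illustration only (not real NASA data).
sample_anomalies_c = {1960: -0.03, 1984: 0.16, 2012: 0.65}
sample_anomalies_f = {
    year: round(celsius_anomaly_to_fahrenheit(value), 3)
    for year, value in sample_anomalies_c.items()
}
```

Applying the absolute-temperature formula to an anomaly would add 32 degrees to every value, so it is worth confirming which kind of quantity a downloaded column holds before converting it.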

Another source of data used was the Bureau of Economic Analysis, which is an agency of

the United States Department of Commerce that provides economic statistics regarding GDP of

the United States. The interactive data location offers many data sets that can be downloaded, but

the specific data set used is the GDP-by-Industry data. This data set covers only 1997 to 2014,

so any visualizations that use it are limited to the matching date range of the weather and GDP

data. Because all of the data in the project is annual, little conversion was necessary other

than matching leading zeros in the monetary values.

The last source of data came from Watson’s Social Media tool, which allows for an

automatic search of keywords over a variety of social media platforms. This presented the

perfect opportunity to search both of the topics within this project: Gross Domestic Product and

climate change. Once all of the associated terms have been decided upon and entered within the

application, Watson does the legwork and outputs a comprehensive data set, which can be used

in the other Watson tools.

Some formatting/cleanup was needed to provide data in a useable manner for Watson.

The weather data was exported in sections (average, min, max and drought) and compiled into

one data source. The World Bank data was downloaded with all countries formatted in rows.

This information was transposed into columns and reduced to data for the United States only.

Any fields within this data set that were not recorded or not being used were initially left

blank. Because blanks affect some of Watson’s exclusion conditions as well as the quality score,

they were filled with zeros and excluded wherever applicable in the

visualizations. The two independently cleaned data sets were then merged into one spreadsheet

that could be uploaded to the IBM Watson web interface.
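
The cleanup steps described above were performed in a spreadsheet. The following is only a sketch of the same idea in Python, assuming a World Bank style layout with one row per country and one column per year. The helper names, column labels, and values are hypothetical.

```python
def transpose_country_row(header, row):
    """Turn one wide row (country, then one value per year) into {year: value}."""
    years = [int(label) for label in header[1:]]
    # Blank cells are filled with 0.0, mirroring the zero-fill described above.
    return {
        year: float(cell) if cell.strip() else 0.0
        for year, cell in zip(years, row[1:])
    }


def merge_by_year(weather, gdp):
    """Join two {year: value} mappings on the years present in both."""
    return [
        {"year": year, "anomaly_f": weather[year], "gdp_growth_pct": gdp[year]}
        for year in sorted(weather.keys() & gdp.keys())
    ]


# Hypothetical miniature of the wide layout: one row per country.
header = ["Country", "1960", "1961", "1962"]
rows = [
    ["Canada", "3.1", "", "7.1"],
    ["United States", "", "2.3", "6.1"],
]
us_row = next(row for row in rows if row[0] == "United States")
us_gdp = transpose_country_row(header, us_row)

# Hypothetical weather values keyed by year.
weather = {1961: 0.12, 1962: -0.08, 1963: 0.30}
merged = merge_by_year(weather, us_gdp)
```

A zero-filled cell is indistinguishable from a true zero once merged, which is why such rows need to be excluded from any visualization or average that uses them.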

IBM Data Set Scores

During the cleanup process, one of the main measures used to assess the quality of the data

was the quality score metric within the Watson web application. Per the IBM Watson Analytics

documentation (IBM, 2015), keeping blank rows to a minimum, removing summary data,

eliminating column headings, and similar steps allowed for a higher rating. All of the data sets

used in this project received a score of 80 or above and a rating of “High Quality”.

Results

Watson’s ability to analyze data, make predictions, and draw on external social

media-based sources makes it a logical application not only for finding trends, but also for

building usable dashboards. The application presents visualizations of multiple data sources in

the dashboard area in a cohesive manner. The solution presented in this project is a dashboard

that consists of seven panes, with related visualizations included within each.

Predictions

Figure 1 Predictive Tool

One of the most innovative tools that IBM Watson offers is its predictive tool. This

allows the selection of variables within your data set and creates a spiral graph that incorporates

relevant predictors. You have the ability to narrow your prediction down by easily adding or

subtracting fields allowing for combinations that are more predictive, or easier to understand.

Within Figure 1, the initial targets selected within the data are all GDP industry totals, specific

agriculture fields, the associated percentages, and temperature anomalies. This will give you a

quick glance at related fields within the datasets such as agriculture, forestry, fishing and hunting

gross outputs. Together these are an excellent predictor of total GDP in private industries at

91.3%. While this does not show that temperature anomalies have a direct causal effect on

agriculture-based GDP, it does indicate that there is valid reason to

continue research.

Social Media

Figure 2 Social Media Dashboard Pane

Watson’s Social Media tool was used to select a focus for this study, as it allowed us to analyze

the popularity of the topic “climate change” by individual countries. As shown in Figure 2, the

relationship between the popularity of climate change as a topic and GDP is evident. The U.S. is the

leader in discussing climate change in relation to GDP. The next four countries with mentions

include England, Canada, Australia and England. This led us to focus on the U.S. and how

climate change has affected that economy over the past years.

To determine the potential influence of temperature in relation to GDP growth, the annual

average temperature anomaly is compared to the annual GDP growth. The average temperature

anomaly is defined by the National Oceanic and Atmospheric Administration (NOAA) as “a

departure from a reference value or long-term average.” Additionally, a positive anomaly

indicates that the annual temperature was higher than the reference value, while a negative

anomaly indicates that the temperature was lower (National Oceanic and Atmospheric

Administration, 2016).
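
The definition above can be illustrated with a short calculation. This is a sketch only: the baseline period and all temperature values are hypothetical, and the anomalies used in this study were downloaded from NOAA already computed rather than derived this way.

```python
from statistics import mean


def anomalies(temps_by_year, baseline_years):
    """Return each year's departure from the mean of the baseline years."""
    reference = mean(temps_by_year[year] for year in baseline_years)
    return {
        year: round(temp - reference, 2)
        for year, temp in temps_by_year.items()
    }


# Hypothetical annual average temperatures in degrees Fahrenheit.
temps = {2000: 52.0, 2001: 53.0, 2002: 54.0, 2003: 55.5}

# Assumed baseline: the first three years, giving a reference value of 53.0.
result = anomalies(temps, baseline_years=[2000, 2001, 2002])
```

In this toy data the year 2003 has a positive anomaly (warmer than the reference value) and the year 2000 has a negative one (cooler than the reference value).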

Temperature Anomalies

Figure 3 Temperature Anomalies Dashboard Pane

A clear pattern emerges when the minimum and maximum temperature

anomalies are visualized by decade, as shown in Figure 3. From the 1960s to the present, the anomalies are

increasing, resulting in wider fluctuations in temperature each year. The significance of the

1980s cannot be overlooked. The temperature anomalies became a positive value, which

coincides with the start of global warming. The changing temperature anomalies provide an

opportunity to compare how one specific indicator, climate change, has affected the economy.

Gross Domestic Product Growth

Gross Domestic Product (GDP) is the “standard measure of the value of final goods and

services produced by a country during a period minus the value of imports” (Organization for

Economic Co-operation and Development, 2016). Additionally, “GDP is one of the most

comprehensive and closely watched economic statistics” (Bureau of Economic Analysis, 2015).

Using GDP and temperature anomalies allows us to compare the effect that temperature

fluctuations have had on the U.S. economy.

Figure 4 GDP Growth Dashboard Pane

A pattern manifests when comparing the yearly GDP growth percentage with the

temperature anomaly average, seen in Figure 4. Several outliers due to external factors appear,

such as the oil crisis of 1979 to 1982 and the Great Recession of 2008 to 2009. The first

observation is that GDP growth is highest when the change in the temperature anomaly is

minor compared to the previous year. The average change in temperature anomaly is 0.45

degrees Fahrenheit, while the average GDP growth is 3.04 percent. The highest GDP

growth was in 1984, when the temperature anomaly was 0.04 degrees Fahrenheit. Additionally,

2012 featured the highest anomaly and resulted in 2.211 percent GDP growth, below the

3.04 percent average. The relationship between the annual temperature anomaly and the U.S.

economy is not clearly definable in this visual.
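
The paper does not state how the average change in temperature anomaly was calculated. One plausible reading, shown here purely as an assumption, is the mean of the absolute year-over-year differences. The values below are hypothetical and are not the study's data.

```python
from statistics import mean


def yearly_changes(anomaly_by_year):
    """Absolute change in anomaly from each year to the next, keyed by the later year."""
    years = sorted(anomaly_by_year)
    return {
        curr: round(abs(anomaly_by_year[curr] - anomaly_by_year[prev]), 2)
        for prev, curr in zip(years, years[1:])
    }


# Hypothetical inputs: anomaly in degrees Fahrenheit, GDP growth in percent.
anomaly_f = {2009: 0.4, 2010: 0.9, 2011: 1.1, 2012: 3.2}
gdp_growth_pct = {2009: -2.8, 2010: 2.5, 2011: 1.6, 2012: 2.3}

changes = yearly_changes(anomaly_f)
avg_change = round(mean(changes.values()), 2)
avg_growth = round(mean(gdp_growth_pct.values()), 2)

# Years whose anomaly change is below average, paired with that year's growth.
calm_years = {
    year: gdp_growth_pct[year]
    for year, change in changes.items()
    if change < avg_change
}
```

Comparing growth in the below-average-change years against the overall average is one simple way to check the first observation above against the full data set.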

Industry Sectors Breakdown

Figure 5 breaks down the U.S. economy into three industry sectors to show how varying

annual temperatures affect each sector. The U.S. Census Bureau defines three categories of

private industry sectors: goods-producing (manufacturing), services, and information and

communications technology industries. Temperature change affects each sector differently,

so the breakdown allows further analysis of how the economy is affected.

Figure 5 Top Industry Predictor Dashboard Pane

At first, it does not appear that changing temperature anomalies from 1997 to 2014

significantly impact each sector. However, due to limitations, properly formatted data prior to

1997 was not available. Being able to calculate data from 1960 would provide additional data to

better analyze each sector’s individual impact. As seen in previous figures, the annual

temperature has varied by increasing and decreasing since 1960. Without additional data,

analyzing the relationship between GDP and temperature would not be conclusive here.

Figure 6 Related Industries Dashboard Pane

The U.S. Census Bureau further delineates the U.S. economy into 13 distinct industries.

The three industry sectors mentioned previously are composed of these 13 industries. Figure 6

allows for the selection of each of the 13 industries individually to analyze industry-wide GDP

and average temperature anomalies. After using Watson to compare how each industry is

influenced by temperature anomalies, the agricultural and recreation industries stood out. Both

industries have similar patterns and as a result, temperature change affects them similarly, as

displayed in Figure 7.

Figure 7 Related Industries Dashboard Pane

The agriculture industry appears to have adapted in the U.S. to the effects of climate

change when compared to other industries. However, the National Climate Assessment report

states “increased innovation will be needed to ensure the rate of adaptation of agriculture and the

associated socioeconomic system can keep pace with climate change over the next 25 years”

(U.S. Global Change Research Programs, 2014). Similarly, the recreation industry is easily

adaptable to changes in temperatures and will continue to adapt to gradual climate change.

Figure 8 Percentage of Agricultural Gross Output by the Total Private Industries Gross Output

Looking more closely at the agricultural industry, the relationship between agriculture’s

gross output and temperature anomalies was compared. Gross output is a more comprehensive

metric than GDP and is defined as “a measure of an industry’s sales or receipts, which can

include sales to final users in the economy (GDP) or sales to other industries (intermediate

inputs)” (U.S. Department of Commerce, 2014). Starting in 1960, a pattern is established: the

gross output of the agricultural industry, as a percentage of the total private industries’ gross

output, is highest when the temperature anomaly is either negative or approximately 0.00. In

1997, the temperature anomaly remained positive, resulting in the lowest gross output by the

agricultural industry. The pattern indicates how temperature change affects the agricultural

industry, and provides valid evidence to expand the research on this relationship.
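
The comparison in Figure 8 was made visually in Watson. As a hedged sketch of how the same relationship could be quantified, the code below computes agriculture's share of total private gross output and a Pearson correlation against the temperature anomaly. The use of a Pearson coefficient is an assumption of this sketch, not the study's method, and every number is hypothetical.

```python
from math import sqrt


def share_pct(part, total):
    """Express one industry's gross output as a percentage of the total."""
    return 100.0 * part / total


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical gross output figures (same currency units) and anomalies (F).
ag_output = [50.0, 46.0, 40.0, 30.0]
private_output = [1000.0, 1000.0, 1000.0, 1000.0]
anomaly_f = [-0.5, 0.0, 0.5, 1.5]

shares = [share_pct(ag, total) for ag, total in zip(ag_output, private_output)]
r = pearson(anomaly_f, shares)
```

A coefficient near -1 would be consistent with the share falling as the anomaly rises, although correlation on annual totals cannot by itself separate the effect of temperature from other long-term trends.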

Limitations

When comparing GDP and climate change, a series of limitations should be

taken into consideration. The majority of the data regarding GDP is

broken out by year and region of the world. This allows for forecasting over several decades, but

not all aspects of the GDP have been recorded since 1960 (when the data sets began). Due to the

time constraints of this Watson-based research project the decision was made to focus on the

annual trending for the U.S. only.

Climate data was obtained from the National Oceanic and Atmospheric Administration

(NOAA) website/FTP location. This data can be downloaded two ways: annually, averaged for

the whole U.S. or monthly, broken out by weather station. The monthly option contains all

weather recordings by location throughout the U.S. on a daily basis. These files are large and

would take a significant amount of time to compile into one data source. Within the constraints

of this project, there was not adequate time to process this data into a useable format.

Future Research

Considering the available data, there are a few directions future research can take. The

main areas would be either expanding to a worldwide scale or expanding the weather and region

data in the U.S. The NOAA resource has enough data to break out the U.S. by region and state, as

well as the ability to get more granular. This could expand on learning patterns within the main

agricultural areas in the U.S. Additional new or existing data sets could be structured in a more

comprehensive manner. Changing the data sets from the main points being laid out in columns to

a crosstab style would allow for further comparison between data points. This may be dependent

on future Watson development or additional time to format and clean up the data.

Discussion

The intention of this research is to evaluate if climate change has a direct relationship

with Gross Domestic Product (GDP) in the U.S. and if those effects can be predicted using IBM

Watson. The results of this study show that while there is a pattern or trend, a definitive

relationship cannot be established at this time. Due to the limitations discussed previously, along

with the visualized trends, it also cannot be said at this time that the relationship does not exist.

Based on the trends demonstrated during this research it is recommended that additional research

be conducted either by expanding to a worldwide scale or by evaluating the data on a more

granular level.

This study also examined the relationship between the agricultural industry and

temperature anomalies. The goal was to determine the feasibility of using temperature anomalies

as a metric to predict the agricultural industry’s gross output. We found there to be an inverse

relationship between the gross output and temperature anomalies. This may be due to a variety of

extenuating factors, but a relationship between gross output and temperature anomalies is

apparent. Further research would facilitate a way to increase the confidence level in our

assumptions and lead to use of predictive analysis tools within the agricultural industry. Potential

benefits of such tools would allow for prediction and improvements of yields, reductions in risk

and loss, possible avoidance of climate related hazards, and overall better management of

agricultural activities.

References

Bureau of Economic Analysis. (2015). Measuring the Economy. U.S. Department of Commerce.

Burke, M., Hsiang, S. M., & Miguel, E. (2015). Global non-linear effect of temperature on

economic production. Nature, 527, 235-239.

Environmental Protection Agency. (2014). Climate Change Indicators in the United States. U.S.

EPA.

Hulac, B. (2016). Top Economic Risk of 2016 is Global Warming. ClimateWire.

IBM. (2015). Data Loading and Data Quality. Retrieved from IBM Watson Analytics:

https://community.watsonanalytics.com/introduction-to-data-loading-and-data-quality/

National Oceanic and Atmospheric Administration. (2016, March 11). Global Surface

Temperature Anomalies. Retrieved from National Centers for Environmental

Information: https://www.ncdc.noaa.gov/monitoring-references/faq/anomalies.php

National Oceanic and Atmospheric Administration. (2016, March 12). Global Temperature

Anomalies - Graphing Tool. Retrieved from NOAA Climate.gov:

https://www.climate.gov/maps-data/dataset/global-temperature-anomalies-graphing-tool

Organization for Economic Co-operation and Development. (2016, March 12). Domestic

product. Retrieved from Organization for Economic Co-operation and Development:

https://data.oecd.org/gdp/gross-domestic-product-gdp.htm

U.S. Department of Commerce. (2014, April 22). Frequently Asked Questions. Retrieved from

Bureau of Economic Analysis: http://www.bea.gov/faq/index.cfm?faq_id=1034

U.S. Global Change Research Programs. (2014). National Climate Assessment. Washington,

D.C.: U.S. Global Change Research Programs.

University of Maryland. (2007). US Economic Impacts of Climate Change and the Cost of

Inaction. College Park, Maryland: Center for Integrative Environmental Research.

Data Sources

Bureau of Economic Analysis- Industry Data:

http://www.bea.gov/iTable/iTable.cfm?ReqID=51&step=1#reqid=51&step=51&isuri=1&5101=1&5114=a&5113=pgoodgo,pservgo,ictgo&5112=1&5111=1997&5102=15

National Centers for Environmental Information- Climate at a Glance:

http://www.ncdc.noaa.gov/cag/

The World Bank- World Development Indicators:

http://data.worldbank.org/topic/climate-change