Data Visualization Analysis of a Highway Loss Data Institute report By Drew Knoblauch CDS 301 December 12, 2014




Objective

• Evaluate an existing vehicle information report and make critical assessments of the data visualizations.

• Examined factors may include (but not be limited to) scope of data, clarity of message, level of detail, audience, font, color, white space, consistency, annotation, titles and labels.

• Provide alternative visualizations as necessary, with well-reasoned explanations for changes.


Description of Data

The Highway Loss Data Institute vehicle information bulletin examined in this report:

• Highway Loss Data Institute. 2012. Spring 2011 Tornado Losses. Loss Bulletin Vol. 29, No. 6. Arlington, VA.

The complete bulletin is contained in Appendix B.

The raw data is contained in Appendix C.


Visualization Analysis

The Highway Loss Data Institute bulletin subject to analysis herein discusses U.S. automobile insurance losses due to tornadoes. The time periods evaluated are Spring 2007-10 and Spring 2011.


Visualization Analysis

Procedure

Each data visualization from the bulletin will be examined through the following procedure:

• Description: The data visualization will be shown and described.

• Analysis: A review will be conducted discussing multiple factors.

• Modification: Potential changes will be discussed, if necessary.

• Modified visualization: Changes will be shown, if necessary and possible.


Description 1

Map 1 shows March-June weather-related insurance losses for 2007-10 and 2011. These losses are in dollars and shown by state.


Analysis 1

Initial Impression

• Map 1 is the first visualization in this report and as such needs to give an introductory and overall sense of the data to follow. This map does that. Additionally, clear differences both between states and over time can be seen.

Scope

• The scope is appropriate. The entire United States is shown but the visualization clearly highlights that certain smaller sections of the country will benefit from further examination.

Clarity

• The clarity could be improved. Several of the larger of the 50 paired bar charts appear to extend into other charts, whereas some of the very small bar charts are so small that it is unclear they are charts at all. Also, while the large bars clearly demonstrate large losses, extending the bars across state boundaries creates some visual noise.


Analysis 1, cont’d

Level of Detail

• The level of detail may be too fine. The difference in losses over time for the entire United States is shown. A relative value to compare each state might have been simpler to show here but the legend indicates the bar charts are the actual loss values. While the audience for this HLDI report is sophisticated, it might be too much to ask them to “do the math” for a state while also comparing it to another state’s “math”.

Color

• The choice of a lighter color as the background was good. But while the bar chart colors of gold and silver contrast each other adequately, they almost seem to clash with the underlying yellow.

Title

• The text refers to the losses as being automobile claims, but adding “automobile” to the map title might be valuable.


Modification 1

Clarity

• Instead of bar charts in each state, convert the difference between time frame 1 and time frame 2 into a single value. This gives 50 values. These may be percentage changes or actual dollars if that level of detail is determined to be necessary. Then, depending on how the values deviate, assign a color legend that shows greater differences with greater intensity of that color.

Legend/Title

• While the legend was not mentioned in the analysis, the first modification requires a second modification, one to the legend. By simplifying the visualization, a value can be added to the legend or title indicating the national values. Not including the “national” value on the initial visualization was appropriate, though, given its complexity.
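The conversion described in this modification can be sketched in a few lines. This is only an illustration under stated assumptions: the state names, dollar values, class boundaries and shade names below are all hypothetical and are not taken from the bulletin.

```python
# Hypothetical per-state losses in dollars: (2007-10 value, 2011 value).
# These are illustrative numbers, not values from the bulletin.
losses = {
    "State A": (100.0, 300.0),
    "State B": (1.0, 3.0),
    "State C": (50.0, 55.0),
    "State D": (80.0, 60.0),
}


def percent_change(before, after):
    """Percent change from the 2007-10 value to the 2011 value."""
    return (after - before) / before * 100.0


def intensity_class(value, breaks):
    """Return the index of the first break that the value does not exceed."""
    for i, upper in enumerate(breaks):
        if value <= upper:
            return i
    return len(breaks)


breaks = [0.0, 50.0, 100.0]  # assumed class boundaries, in percent
shades = ["lightest", "light", "medium", "darkest"]  # assumed legend

# One value per state, then one color class per value.
changes = {s: percent_change(b, a) for s, (b, a) in losses.items()}
classes = {s: shades[intensity_class(c, breaks)] for s, c in changes.items()}
```

With these made-up inputs, State A and State B fall into the same class even though their dollar amounts differ by a factor of 100. That is the trade-off of a percent-change map and is why the bullet leaves open the option of mapping actual dollars instead.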


Modified Visualization 1

Map 1: Change in March-June weather-related insurance losses from 2007-10 to 2011

National = 46


Description 2

Table 1 shows March-June weather-related losses for 2007-10 and 2011. These losses are shown as claim frequency, claim severity and overall losses by state.

(Only the first section of Table 1 is shown, to save space. The full table includes all states.)


Analysis 2

Initial Impression

• Table 1 is the underlying data for the initial map, as well as some additional detail. A table is a smart choice for the complex data represented.

Clarity

• The clarity is fair. With 500 data elements (5 categories, 2 time frames, 50 states), determining which states have extreme values in any given category is not simple, but is viable.

White Space

• While this table is not overly cluttered, one option would have been to follow the Wall Street Journal (WSJ) Guide to Information Graphics rule regarding repetition of units in a table: “It is only necessary to display the units, such as a dollar sign or percentage sign, once with the first entry.” The elimination of dollar signs might create a better white space balance.


Modification 2

White Space

• Dollar signs were eliminated according to the WSJ recommendation to see if a better white space balance was created. That might be the case if there were more dollar signs, but the change in white space balance is not significant here. (In this instance the modified visualization is shown to demonstrate that the modification is not necessarily a substantial improvement)

Title (column header)

• The first column in the table is “Exposure”. This is a term of art within the automotive insurance industry, but it is not defined in the bulletin. While the audience for this report may be familiar with the term, a footnote was added to both provide the definition and clarify the unit of measure.
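The WSJ rule applied in the White Space modification above (show the unit only with the first entry) could be automated roughly as follows. This is a minimal sketch: the function name and sample values are hypothetical, and whole-dollar amounts are assumed.

```python
def format_money_column(values):
    """Format a column of dollar amounts with the dollar sign on the first entry only.

    All entries are right-aligned to a common width so the digits line up.
    """
    formatted = []
    for i, v in enumerate(values):
        text = f"{v:,.0f}"  # thousands separators, no decimals
        formatted.append("$" + text if i == 0 else text)
    width = max(len(t) for t in formatted)
    return [t.rjust(width) for t in formatted]


# Hypothetical column values, not taken from Table 1.
column = format_money_column([1250, 87, 10432])
for cell in column:
    print(cell)
```

Right-aligning after the sign is dropped keeps the digits in a clean column, which is where any white-space gain would come from.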


Modified Visualization 2

1 Exposure is the length of time a vehicle is insured under a given coverage type and is measured in insured vehicle years.


Description 3

Figure 1 shows March-June weather-related insurance losses from 2004-11. These losses are shown as claim frequency, claim severity and overall losses.


Analysis 3

Initial Impression

• Figure 1 effectively demonstrates that 2011 March-June weather losses were significantly greater than those of the previous 7 years.

Clarity

• The clarity is solid. The message is clear: the major component of the difference was not that each claim was larger but that the number of claims was more than twice the average.

Color

• The color choices work. There is sufficient contrast and no distraction.
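The frequency-versus-severity message can be made concrete with a small worked example. The definitions used here are an assumption, following common insurance usage rather than anything stated in this report: claim frequency is claims per insured vehicle year, claim severity is the average payment per claim, and overall losses are the average payment per insured vehicle year. All numbers are hypothetical.

```python
def loss_measures(exposure, claims, payments):
    """Return (claim frequency, claim severity, overall losses).

    Assumed definitions (not taken from the bulletin):
      frequency = claims per insured vehicle year
      severity  = average payment per claim
      overall   = average payment per insured vehicle year
    """
    frequency = claims / exposure
    severity = payments / claims
    overall = payments / exposure
    return frequency, severity, overall


# Hypothetical inputs: exposure in insured vehicle years, payments in dollars.
base = loss_measures(exposure=1_000_000, claims=10_000, payments=30_000_000)
y2011 = loss_measures(exposure=1_000_000, claims=22_000, payments=72_600_000)

# Ratio of the later period to the baseline for each measure.
freq_ratio = y2011[0] / base[0]
sev_ratio = y2011[1] / base[1]
overall_ratio = y2011[2] / base[2]
```

Under these definitions overall losses equal frequency times severity, so the overall ratio is the product of the other two. In this made-up case most of the increase comes from frequency more than doubling, while severity rises by only about a tenth.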


Modification 3

Modifications to this visualization are not necessary.


Description 4

Map 2 shows the magnitude of 2011 tornadoes in the United States, by county.


Analysis 4

Initial Impression

• Map 2 shows the presence and severity of tornadoes in the U.S. While initially it seemed that this visualization was simply intended to show causation for the losses shown in Map 1, it is actually also a general presentation of information that will lead to a specific presentation of information.

Scope

• As with Map 1, the scope is appropriate. The entire country is shown but the visualization clearly points towards the smaller sections of the country that will be the subject of further examination.

Clarity

• The message of this map is simple and clear: “here is where the tornadoes were, generally.”


Modification 4

Modifications to this visualization are not necessary.


Description 5

Map 3 shows the magnitude of April 25-28, 2011 tornadoes by county in selected states.


Analysis 5

Initial Impression

• Map 3 is a detail of Map 2 and shows the presence and severity of tornadoes in a specific section of the country.

Scope

• As with prior maps, the scope is appropriate. The promise of a smaller section of the country to be the subject of further examination is delivered upon.

Clarity

• The message of this map is simple and clear: “here is where the tornadoes were, specifically.”


Modification 5

Modifications to this visualization are not necessary.


Description 6

Table 2 shows March-June weather-related losses for 2007-10 and 2011 by selected states and counties. These losses are shown as claim frequency, claim severity and overall losses. Major events are also noted.


Analysis 6

Initial Impression

• Table 2 takes a detailed look at the data from Table 1. An additional column is added for Major Events, and that is clearly why these counties were selected. The information structure, as with Table 1, suits a table best.

Table Structure

• The need for Table 2 is questionable, or at the very least, the structure is questionable. Providing the county level data where Major Events occurred is valuable and as previously stated, the loss data is complex enough to necessitate a table. However, since essentially all rows of the table have a major event, eliminating the Major Events column and simply adding “where Major Events occurred” to the title streamlines the visualization.

• Additionally the Major Event information could likely be added to Table 1, possibly by bolding the rows of states where Major Events occurred.


Modifications 6a and 6b

Text

• While not necessary for inclusion in a visualization, the text should include the definition (for the purposes of this bulletin) of “major weather event” as a magnitude 4 or 5 tornado.

Table Structure

• For Table 2, the Major Events column was eliminated and “with major weather event” was added to the title. (6a)

• For Table 1, rows of states with major weather events were bolded and noted. (6b)


Modified Visualization 6a


Modified Visualization 6b

*Bold rows indicate states with major weather events.


Description 7

Map 4 shows, by county, weather-related insurance losses for March-June 2011 and the magnitude of April 25-28 tornadoes.


Analysis 7

Initial Impression

• Map 4 shows a natural progression towards detailed information. However, why these particular counties were chosen is not immediately clear. The assumption is that these are the “worst” losses.

Scope

• The scope initially appears appropriate. A subsection of the United States is shown, delivering on the promise of earlier visualizations. However, two sets of data are shown together without revealing a pattern. This can lead to issues with clarity of message.


Analysis 7, cont’d

Color

• The choice of color scale from blue to red is effective in demonstrating the magnitude of the tornadoes. However, a shade of blue that might appear to indicate a level of magnitude is used to represent the insurance losses. This presents issues of clarity as well as creating a more monochromatic palette than necessary.

Clarity

• The clarity could be improved. While the large bars clearly demonstrate large losses, extending the bars across county boundaries creates some visual noise. While the intense colors clearly indicate large magnitude tornadoes, assessing this information simultaneously with the loss information is difficult, particularly as no pattern seems obvious.


Modification 7

Clarity

• Instead of showing both magnitude and loss information in the same visualization, separate them into two maps. To show any correlation between magnitude and losses, juxtapose the maps side by side or top to bottom. If there is a clear, similar pattern in both, it should be visible.

• Instead of bars in each county, assign a color legend that shows greater values with greater intensity of that color.

Color

• By separating the maps, the similarity of the blues is less disconcerting. However, it would probably be best to assign a more contrasting color, such as green or yellow, to the bars in the loss value map.
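The side-by-side comparison suggested under Clarity relies on the eye to detect a shared pattern. A numeric check could complement it. The sketch below computes a Pearson correlation between per-county tornado magnitude and per-county losses. The county values are hypothetical, and Pearson correlation is one possible measure chosen here for illustration, not something used in the bulletin.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for six counties, not data from the bulletin.
magnitude = [0, 1, 2, 3, 4, 5]     # maximum tornado magnitude per county
county_loss = [5, 40, 10, 60, 20, 90]  # loss per insured vehicle year, dollars

r = pearson(magnitude, county_loss)
```

A value of r near 1 would support a visible shared pattern across the two maps, while a value near 0 would agree with the observation that no pattern seems obvious. Because magnitude is an ordinal scale, a rank-based measure might be a better fit in practice.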


Modified Visualization 7

No modified visualization is provided as the software used to generate the county level detail on this map was inaccessible at the time of this report.


Appendix A

Author feedback

The author of the report was contacted and asked to offer insight into the visualization decisions from the initial report as well as comment on any suggested modifications.

Comments were provided regarding Analysis 1 and Analysis 7.


Appendix A cont’d

Author feedback

Analysis 1 Comments

Using two bars to show the 2007-10 and 2011 weather losses provides more information to the reader than just a thematic map of the percent or absolute differences. The reader can compare 2007-10 average losses, 2011 losses, the percent differences, and the absolute differences. An increase from $100 to $300 is much different than an increase from $1 to $3. The bars for states with very low losses provide little information on the difference between 2007-10 and 2011, but still inform the reader of the low weather losses in these states. For that reason they were not deleted from the map. The colors of the bars were altered from my original colors of orange and blue, but maintain their contrast without being distracting.


Appendix A cont’d

Author feedback

Analysis 7 Comments

The report first examines the entire prime tornado season and then several of the large tornado outbreaks. April 25-28 was one of these large outbreaks. A broader map of the affected counties along with their maximum tornado magnitude was shown first, followed by the map discussed here. This map contains a closer view of the most affected counties and their losses. By placing the loss bars on top of the tornado magnitude information, the reader does not need to cross reference between two maps to examine how the two measures compare.


Appendix B: Original Report


Appendix C: Raw Data

Microsoft Excel 97-2003 Worksheet