Lecture 3 - Data Visualization Module 2

A. Data is the raw material used to create information.

B. Data collection gathers information measurements.

C. Knowledge is measured by the number of data collected.

A. It employs a lot of people

B. It produces data visuals

C. It gathers data measurements for further analysis

A. True

B. False

A. True

B. False

A. True

B. False

A. Satellite images

B. Monitoring networks

C. Sampling

A. Data from outer space

B. Data with coordinates

C. Data referring to environmental resources

5 Words about Climate Change by SCI103 Community – 2015/2016 (http://www.wordle.net/)

“Having the data is not enough. I have to show it in ways people both enjoy and understand.”

(Prof. Hans Rosling)

• Raw data does not provide much insight unless it is processed and presented.

• The way data is presented has a huge impact on providing meaningful analysis and interpretation.

• The human brain retains the information contained in visuals better than the information conveyed by written words or words spoken in a conversation.

Visual tools are crucial for effective data communication.

https://youtu.be/AdSZJzb-aX8

Some of the worst data visualizations are the ones you have to stare at for several minutes before you even comprehend what they’re trying to say.

Turn data into something more engaging. There should be a swift “aha!” moment within seconds after someone sees your data.

Goal of data visualization: communicate information clearly and effectively using graphics.

Graphics: visual images presented on a surface such as paper or a computer screen.

• Data visualization: effective techniques used to communicate data or information by encoding it as visual objects able to synthesize large amounts of data.

Visuals need to combine functionality and design to convey information intuitively.

HTTPS://YOUTU.BE/MKEXX7SDXAI

Charts and tables: synthesize and display data (qualitative and quantitative)

Tables

Source: Goulburn Murray Water

Charts

Conceptual diagrams and infographics – synthesis, visualisation and context

Source: Bureau of Meteorology

Conceptual diagram

http://images.wookmark.com/96600_post1_new.jpg

Infographics

Satellite images and maps - geographic context

Source: Bureau of Meteorology

Map

Satellite image

Photographs – provide unique information

Video clips – can help to tell a story by capturing motion, perspective and sound

Source: L Huzzey

Photograph

Video

• Graphic visual representation of complex data/information, quickly and clearly

• Include “traditional” data visuals such as charts and tables

• Right combination of design and content to be effective

https://doms.csu.edu.au/csu/thumbs/2aa657f5-7f7a-45a6-85b4-3610fd378eba/1/ad6a21bc-9c86-4640-b42a-dd470fa45d1a

http://images.sixrevisions.com/2009/05/09-03_coffee_drinks.jpg

http://graphs.net/wp-content/uploads/2012/11/Worlds-10-Popular-Books-Sold-in-Last-50-Years.jpg

"Excellence in statistical graphics consists of complex ideas communicated with clarity, precision and efficiency" (Prof. Edward Tufte)

• Graphical excellence is of extreme relevance when data is used to perform analytical tasks such as making comparisons or determining causality.

• “Bad” visuals might distort the data, making it harder to understand or compare; ineffective and poorly presented information can lead to misinformation.

• Understand the data you are trying to visualize, including its size and cardinality (the uniqueness of data values in a column)

• Determine what you are trying to visualize and what kind of information you want to communicate

• Know your audience and understand how it processes visual information

• Use a visual that conveys the information in the best and simplest form for your audience.
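
As a rough sketch of the first point above, the size and cardinality of a column can be checked before choosing a visual. The function name `describe_column` and the `site` column with its values are hypothetical, made up purely for illustration.

```python
from collections import Counter

def describe_column(values):
    """Return the size, cardinality and value counts of one data column."""
    counts = Counter(values)
    return {
        "size": len(values),         # number of data values in the column
        "cardinality": len(counts),  # number of distinct values
        "counts": dict(counts),      # how often each distinct value occurs
    }

# Hypothetical column of sampling-site labels (illustrative only)
site = ["river", "lake", "river", "dam", "lake", "river"]
summary = describe_column(site)
print(summary["size"], summary["cardinality"])
```

A column with low cardinality (a few distinct labels) tends to suit a bar chart, whereas a column where almost every value is unique usually needs grouping into intervals first.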

• Guideline 1: create the simplest graph that conveys the information you want to convey

• Guideline 2: consider the type of encoding object (points, lines, and bars) and attribute (point position, line length, color) used to create a plot

• Guideline 3: focus on visualizing patterns or on visualizing details, depending on the purpose of the plot

• Guideline 4: select meaningful axis ranges

• Guideline 5: data transformations and carefully chosen graph aspect ratios can be used to emphasize rates of change for time-series data

• Guideline 6: plot overlapping points in a way that density differences become apparent in scatter plots

• Guideline 7: use lines when connecting sequential data in time-series plots

• Guideline 8: aggregate larger datasets in meaningful ways

• Guideline 9: keep axis ranges as similar as possible to compare variables

• Guideline 10: select an appropriate color scheme based on the type of data
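
One possible reading of Guideline 8 is to reduce many daily readings to one value per month before plotting. The sketch below assumes dates are given as "YYYY-MM-DD" strings; the function name `monthly_means` and the rainfall numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def monthly_means(records):
    """Aggregate (date_string, value) pairs into one mean value per month."""
    by_month = defaultdict(list)
    for date, value in records:
        by_month[date[:7]].append(value)  # "YYYY-MM" prefix identifies the month
    return {month: mean(values) for month, values in sorted(by_month.items())}

# Hypothetical daily rainfall readings in mm (illustrative only)
daily = [("2005-09-01", 2.0), ("2005-09-02", 4.0),
         ("2005-10-01", 0.0), ("2005-10-02", 3.0)]
print(monthly_means(daily))
```

Whether a mean, a total or a maximum is the meaningful aggregate depends on the variable and on the question being asked.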

“… summarize and describe vast amounts of information in a compact, efficient and eye-catching way” (Ducklan & Martin, 2002)

Useful for data analysis, visualization & communication:

• present raw data,

• present results of fairly complex analyses,

• summarise information,

• expose unanticipated characteristics of data,

• suggest hypotheses which may be further investigated.

Bar charts

Histograms

Pie charts

Graphs

http://www.statmethods.net/graphs/images/pie2.jpg

http://www.statmethods.net/graphs/images/barplot3.jpg

• Display categorical data, with each category independent of the others

• Used to compare a variable across a number of different groups, showing the size of each group (the length of each bar is proportional to the value it represents)

• Two main types of bar charts: horizontal and vertical. Horizontal bars usually represent a single period of time whereas column bars may represent similar items at different times

• Can also be displayed as subdivided bar charts where different variables are represented in the same column

X-axis: discrete data

Y-axis: continuous data (count, value or percent)

Gaps between bars
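
The idea that bar length is proportional to the value can be sketched with a plain-text horizontal bar chart. This is only an illustration: the function name `text_bar_chart`, the 40-character width and the category counts are assumptions, not taken from the lecture.

```python
def text_bar_chart(data, width=40):
    """Return text lines whose bar lengths are proportional to the values."""
    largest = max(data.values())
    lines = []
    for category, value in data.items():
        bar = "#" * round(width * value / largest)  # longest bar fills the width
        lines.append(f"{category:<10} {bar} {value}")
    return lines

# Hypothetical counts per category (illustrative only)
counts = {"forest": 20, "grassland": 10, "wetland": 5}
for line in text_bar_chart(counts):
    print(line)
```

Because the bars start from zero, a category with half the count gets a bar half as long, which is what makes the comparison readable at a glance.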

• A type of bar chart: data is displayed using bars/columns, but the bars are placed next to each other

• Used to display frequency values, or the number of values that fall within the same category or interval (represented on the x-axis). In statistics, histograms are a graphical representation of the distribution of data

• To construct a histogram:

  • divide the entire range of values into a series of categories;

  • count how many values fall into each category;

  • draw a rectangle with height proportional to the count and width equal to the category size.
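
The three construction steps above can be sketched as follows. This is a minimal illustration assuming equal-width categories and values that are not all identical; the function name `histogram` and the sample values are hypothetical, and the rectangles are drawn as rows of text rather than as a real plot.

```python
def histogram(values, n_bins):
    """Split the range of values into n_bins equal intervals and count each."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins              # step 1: equal-width categories
    counts = [0] * n_bins
    for v in values:
        # step 2: find the interval for v; the maximum goes into the last one
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(n_bins + 1)]
    return edges, counts

# Hypothetical measurements (illustrative only)
values = [1, 2, 2, 3, 5, 6, 7, 9, 9, 11]
edges, counts = histogram(values, n_bins=5)

# step 3: one rectangle per interval, length proportional to the count
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:>5.1f} to {right:>5.1f} | {'#' * count}")
```

The choice of the number of categories is a judgment call: too few hides the shape of the distribution, too many leaves most intervals nearly empty.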

X-axis: interval data

Y-axis: continuous data (frequency values)

• Mainly used to compare proportions

• To construct a pie chart:

  • calculate the relative proportion of data in each category

  • the divisions of a pie (proportions or segments) add to a whole (100%)

• It is generally recommended to avoid pie charts for data analysis and visualization:

  • humans process differences in line length more easily than differences in surface area: it is more effective to use a bar chart (which takes advantage of line length to show comparison) rather than a pie chart (which uses surface area to show comparison)

  • pie charts usually encode only a handful of numbers and a table is usually a much more efficient way to present such information

Proportions are ratio data

Labels are nominal or ordinal data
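
The pie chart calculation described above can be sketched as below. The function name `pie_segments` and the category counts are hypothetical. The conversion to angles is an addition not stated on the slide, based on a full circle being 360 degrees.

```python
def pie_segments(counts):
    """Return each category's share of the whole as (percent, angle in degrees)."""
    total = sum(counts.values())
    return {
        category: (100 * value / total, 360 * value / total)
        for category, value in counts.items()
    }

# Hypothetical category counts (illustrative only)
segments = pie_segments({"A": 50, "B": 30, "C": 20})
for category, (percent, angle) in segments.items():
    print(f"{category}: {percent:.1f}% ({angle:.1f} degrees)")
```

With only three numbers to show, the same information fits in a three-row table, which is the point made above about tables often being more efficient.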

• A line showing the relationship between two or more variables (line graphs and scatterplots)

• Advantages:

  • Display of high information density, sometimes with no loss of data

  • Rapid assimilation of the overall result

  • Clear display of complex relationships among multivariate data

• Graph interpretation:

  • Height of the line (or series of lines)

  • Patterns (seasonal pattern, trend or a combination of both)

http://www.statmethods.net/graphs/images/linechart1.png

Line graph: shows a trend

X-axis: discrete and continuous data

Y-axis: discrete and continuous data

Scale should start at zero

Scatterplot: relationship between two variables by plotting their (x,y) positions

X-axis: continuous data

Y-axis: continuous data

• Columns and rows filled with data

• Used for summarising results and data comparison

Tables vs charts

Tables are usually a better option than a chart when only a few data points need to be displayed.

If exact numeric values are required, a table is best since it can be hard to represent the exact values in a chart axis.

In a thesis or research report, the detail and precision of tables may be more important since they are a repository of information.

(Swires-Hennessy, 2014)

Swires-Hennessy, E. (2014). How to Communicate your Message Effectively.

Hay, I. (2012). Communicating in geography and the environmental sciences.

Thomas, J. E., Saxby, T. A., Jones, A. B., Carruthers, T. J. B., Abal, E. G. & Dennison, W. C. (2006). Communicating science effectively: a practical handbook for integrating visual elements.

Kelly, D., Jasperse, J. & Westbrooke, I. (2005). Designing science graphs for data analysis and presentation: the bad, the good and the better.

Schwabish, J.A. (2014). An Economist’s Guide to Visualizing Data.

• Show the data

• Reveal content

• Avoid distorting data

• Present many numbers in a small space

• Encourage comparison of datasets

• Reveal data at several levels of detail

• Serve a clear purpose

• Be integrated with verbal and statistical descriptions of data

• Consistent style and format

• Concise and Comprehensible

(present only the information that is relevant and required to support the content – “Who?”, “What?”, “How?”)

• Independent

(someone who has not read the document associated with the graphic should be able to look at it and understand what it means)

• Referenced

http://abacus.bates.edu/~ganderso/biology/resources/writing/graphparts2003.gif

http://abacus.bates.edu/~ganderso/biology/resources/writing/population_variation_table_gif.gif

Axis labels should be legible, easy to find and easily associated with the axis/object depicted

Labels on the graph should be clearly offset from the data or outside the axes

When appropriate the units of measurement should be displayed

Start the y-axis at zero when graphing numbers

http://www.owlnet.rice.edu/~labgroup/pdf/excelplot.pdf

Chart axis

Range of axis scale allows the full range of data to be included.

Data points properly spaced

Tick marks placed at sufficiently frequent intervals for a reader to work out accurately the value of each data point

Time should be shown on the x-axis, progressing from left to right, and time intervals should be equal

http://www.owlnet.rice.edu/~labgroup/pdf/excelplot.pdf

Chart axis

Two or more datasets must be easily distinguished from one another

No more than 4 simultaneous symbols, values or lines, and each line or symbol should be sufficiently different from the others

Overlapping symbols or lines must be visually separable

Use vertical axes on the left and right sides of the graph to depict different scales when comparing datasets with different measurements

Chart type

O.P. Yakutina, T.V. Nechaeva, N.V. Smirnova, Consequences of snowmelt erosion: Soil fertility, productivity and quality of wheat on Greyzemic Phaeozem in the south of West Siberia, Agriculture, Ecosystems & Environment, Volume 200, 1 February 2015, Pages 88-93, ISSN 0167-8809, http://dx.doi.org/10.1016/j.agee.2014.10.021.

Table number (unique number for each table to be easily identified)

Table title (self-explanatory, above the table)

Column headings (explain meaning of the data, including units of measurement)

Table notes (supplementary information, below the table)

Table source (references)

Excel table with Wagga Wagga climate data (September 2005 to August 2006) (Bureau of Meteorology)

Formatting the table to make it effective

Highlighted rows show the days in September 2005 where the rainfall was greater than evaporation

• Figures and tables are numbered separately (e.g. Table 1, Table 2, Figure 1, Table 3, Figure 2)

• Figures and tables must always be captioned and properly referred to in the text. This means that a figure or table must be mentioned in the text before it appears

• Figure and table captions should be informative without being too long; if the data is sourced from elsewhere then this should be referenced in the caption.

Thomas et al., 2006

Figure captions go below the figure

Table captions go above the table

Graham S. Leonard, Carol Stewart, Thomas M. Wilson, Jonathan N. Procter, Bradley J. Scott, Harry J. Keys, Gill E. Jolly, Johnny B. Wardman, Shane J. Cronin, Sara K. McBride, Integrating multidisciplinary science, modelling and impact data into evolving, syn-event volcanic hazard mapping and communication: A case study from the 2012 Tongariro eruption crisis, New Zealand, Journal of Volcanology and Geothermal Research, Volume 286, 1 October 2014, Pages 208-232, ISSN 0377-0273, http://dx.doi.org/10.1016/j.jvolgeores.2014.08.018.

• A title is not always needed when the chart is used as a figure in the text, where it is included as part of the caption (a stand-alone chart in a presentation would normally have a title)

• Choose the right typeface style

• Avoid using different typefaces: use bold, italics, capitals, small caps or contrasting colours to create contrast and emphasis

• If possible, avoid putting values on charts (if numbers are needed use tables instead)

• Eliminate all redundant terms

• For tables, ensure that all values for the same variable have the same number of decimal places and ensure decimal points are aligned
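
The last point, about decimals in tables, can be sketched as below: every value in a column is formatted with the same number of decimal places and right-aligned so the decimal points line up. The function name `format_column` and the rainfall values are hypothetical.

```python
def format_column(values, decimals=1):
    """Format numbers with the same number of decimal places, right-aligned."""
    texts = [f"{v:.{decimals}f}" for v in values]
    width = max(len(t) for t in texts)       # widest entry sets the column width
    return [t.rjust(width) for t in texts]

# Hypothetical rainfall values in mm (illustrative only)
column = format_column([3.25, 12.0, 0.5, 104.1], decimals=2)
for cell in column:
    print(cell)
```

In a monospaced font the decimal points in the printed column fall in the same position on every row, which makes magnitudes easy to compare by eye.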

Use only 2D charts for 2D data (i.e. to plot two variables)

3D visual appearance distorts the data and prevents a clear interpretation

Use colour, but use it with caution. Try to design your chart without the use of colour

If it reproduces well in black and white, it can be reproduced in any medium

Black and white design: make patterns in columns as contrasting as possible

Don’t make shading too gradual

Use the right resolution for your graphics

Resolution is measured as the number of dots per inch (dpi) or the number of pixels (given as the width and height of the image or as the total number of pixels in the image).

More pixels > higher resolution > larger file size

Different figure formats (*.jpg, *.tif, *.gif, *.png, *.eps): resolution-dependent (image quality changes with compression) and resolution-independent (same quality even when we change size)
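
The link between dpi, pixel dimensions and total pixel count can be sketched as a small calculation. The function name `pixels_needed`, the 6 x 4 inch figure size and the two dpi values are assumptions chosen for illustration (72 and 300 dpi are commonly quoted for screen and print, but the lecture does not specify them).

```python
def pixels_needed(width_in, height_in, dpi):
    """Return (width_px, height_px, total_pixels) for a print size and dpi."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px

# Hypothetical figure of 6 x 4 inches at two resolutions
low_res = pixels_needed(6, 4, 72)
high_res = pixels_needed(6, 4, 300)
print(low_res)
print(high_res)
```

The higher dpi setting needs many times more pixels for the same printed size, which is why file size grows quickly with resolution.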

Multivariate data refers to data that is measured for more than 2 variables (bivariate data refers to 2 variables)

Multivariate charts:

Scatterplot matrices (continuous data)

Mosaic plots (categorical data)

Example multivariate data

Displays the relationship among two or more categorical variables

Used for representing frequency tables (i.e. the number of times a data value occurs)

Example: Mortality rates aboard the Titanic vary for males and females. Among females, 67% survived (coded as 1) and 33% died (coded as 0).

https://medschool.vanderbilt.edu/cqs/files/cqs/media/DrTsai2_0.pdf
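
The frequency table behind such a plot can be sketched as a count of each combination of categories, followed by the share of each outcome within a group. The function names and the eight records below are hypothetical and are not the Titanic data; the coding 1 = survived, 0 = died follows the example above.

```python
from collections import Counter

def frequency_table(records):
    """Count how often each (group, outcome) pair occurs."""
    return Counter(records)

def proportions_within_group(table, group):
    """Return each outcome's share of the records within one group."""
    total = sum(n for (g, _), n in table.items() if g == group)
    return {o: n / total for (g, o), n in table.items() if g == group}

# Hypothetical (sex, survived) records, coded 1 = survived, 0 = died
records = ([("female", 1)] * 3 + [("female", 0)] * 1
           + [("male", 1)] * 1 + [("male", 0)] * 3)
table = frequency_table(records)
female_shares = proportions_within_group(table, "female")
print(female_shares)
```

In a mosaic plot each tile's area is proportional to one of these cell counts, so the within-group shares appear as the relative heights of the tiles.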

• Multidimensional data refers to measurements of variables in more than 2 dimensions (2D normally refers to the Cartesian plot with x and y axes)

• Common multidimensional charts: 3D plots where time (t) and depth (z) are frequently chosen as the third dimension

Next lecture:

What is spatial data?

Spatial data collection (GPS)

Go through Module 2 in your Learning Modules

The information presented here is important for Assessment 2a) and 2b)