7/23/2019 Evans Analytics2e Ppt 03
Chapter 3
Visualizing and Exploring Data
Data Visualization
Data visualization is the process of displaying data (often in large quantities) in a meaningful fashion to provide insights that will support better decisions.
Data visualization improves decision-making, provides managers with better analysis capabilities that reduce reliance on IT professionals, and improves collaboration and information sharing.
Example 3.1: Tabular vs. Visual Data Analysis
Tabular data can be used to determine exactly how many units of a certain product were sold in a particular month, or to compare one month to another. For example, we see that sales of Product A dropped in February, specifically by 6.7% (computed as 1 - B3/B2).
Beyond such calculations, however, it is difficult to draw big-picture conclusions.
Example 3.1: Tabular vs. Visual Data Analysis
A visual chart provides the means to easily compare overall sales of different products (Product C sells the least, for example); identify trends (sales of Product D are increasing), other patterns (sales of Product C are relatively stable while sales of Product B fluctuate more over time), and exceptions (Product E's sales fell considerably in September).
Dashboards
A dashboard is a visual representation of a set of key business measures. It is derived from the analogy of an automobile's control panel, which displays speed, gasoline level, temperature, and so on.
Dashboards provide important summaries of key business information to help manage a business process or function.
Creating Charts in Microsoft Excel
Select the Insert tab.
Highlight the data.
Click on chart type, then subtype.
Use Chart Tools to customize.
Column and Bar Charts
Excel distinguishes between vertical and horizontal bar charts, calling the former column charts and the latter bar charts.
A clustered column chart compares values across categories using vertical rectangles;
a stacked column chart displays the contribution of each value to the total by stacking the rectangles;
a 100% stacked column chart compares the percentage that each value contributes to a total.
Column and bar charts are useful for comparing categorical or ordinal data, for illustrating differences between sets of values, and for showing proportions or percentages of a whole.
Example 3.2: Creating a Column Chart
Highlighted cells
Highlight the range C3:K6, which includes the headings and data for each category. Click on the Column Chart button and then on the first chart type in the list (a clustered column chart).
Example 3.2: Creating a Column Chart
To add a title, click on the first icon in the Chart Layouts group. Click on "Chart Title" in the chart and change it to "EEO Employment [...] "Men" and "Women" in a similar fashion.
Line Charts
Line charts provide a useful means for displaying data over time.
You may plot multiple data series in line charts; however, they can be difficult to interpret if the magnitude of the data values differs greatly. In that case, it would be advisable to create separate charts for each data series.
Example 3.3: A Line Chart for China Export Data
Pie Charts
A pie chart displays the relative proportion of each category to the total by partitioning a circle into pie-shaped areas.
Example 3.4: A Pie Chart for Census Data
Pie Charts
Data visualization professionals don't recommend using pie charts. In a pie chart, it is difficult to compare the relative sizes of areas; however, the bars in a column chart can easily be compared to determine relative ratios of the data.
If you do use pie charts, restrict them to small numbers of categories, always ensure that the numbers add to 100%, and use labels to display the group names and actual percentages. Avoid three-dimensional (3-D) pie charts, especially those that are rotated, and keep them simple.
Area Charts
An area chart combines the features of a pie chart with those of line charts.
Area charts present more information than pie or line charts alone but may clutter the observer's mind with too many details if too many data series are used; thus, they should be used with care.
Example 3.5: An Area Chart for Energy Consumption
Scatter Charts
Scatter charts show the relationship between two variables. To construct a scatter chart, we need observations that consist of pairs of variables.
Example 3.6: A Scatter Chart for
Bubble Charts
A bubble chart is a type of scatter chart in which the size of the data marker corresponds to the value of a third variable; consequently, it is a way to plot three variables in two dimensions.
Example 3.7: A Bubble Chart for Stock Comparisons
Stock chart
Surface chart
Doughnut chart
Geographic Data
Many applications of business analytics involve geographic data. Visualizing geographic data can highlight key data relationships, identify trends, and uncover business opportunities. In addition, it can often help to spot data errors and help end users understand solutions, thus increasing the likelihood of acceptance of decision models.
Companies like Nike use geographic data and information systems for visualizing where products are being distributed and how that relates to demographic and sales information. This information is vital to marketing strategies.
Geographic mapping capabilities were introduced in Excel 2000 but were not available in Excel 2002 and later versions. These capabilities are now available through Microsoft MapPoint 2010, which must be purchased separately.
Other Excel Data Visualization Tools
Data bars
Color scales
Icon sets
Sparklines
Camera tool
Example 3.8: Data Visualization through Conditional Formatting
Data bars display colored bars that are scaled to the magnitude of the data values (similar to a bar chart) but placed directly within the cells of a range.
Highlight the data in each column, click the Conditional Formatting button in the Styles group within the Home tab, select Data Bars, and choose the fill option and color.
Example 3.8: Data Visualization through Conditional Formatting
Color scales shade cells based on their numerical value using a color palette.
Color-coding of quantitative data is commonly called a heatmap.
Example 3.8: Data Visualization through Conditional Formatting
Icon sets provide similar information using various symbols such as arrows or stoplight colors.
Sparklines
Sparklines are graphics that summarize a row or column of data in a single cell.
Excel has three types of sparklines: line, column, and win/loss.
Line sparklines are clearly useful for time-series data.
Column sparklines are more appropriate for categorical data.
Win/loss sparklines are useful for data that move up or down over time.
Example 3.9: Examples of Sparklines
Generally you need to expand the row or column widths to display them effectively. Notice, however, that the lengths of the bars are not scaled properly to the data; for example, in the first one, Products D and E are roughly one-third the value of Product E yet the bars are not scaled correctly. So be careful when using them.
Excel Camera Tool
This tool allows you to create live pictures of various ranges from different worksheets that you can place on a single page, resize, and arrange easily.
They are simply linked pictures of the original ranges, and the advantage is that as any data are changed or updated, the camera shots are updated as well.
To use the camera tool, first add it to the Quick Access Toolbar (the set of buttons above the ribbon). From the File menu, choose Options and then Quick Access Toolbar. Choose Commands, and then Commands Not in the Ribbon. Select Camera and add it.
Data Queries: Tables, Sorting, and Filtering
Managers often need to sort and filter data.
Filtering means extracting a set of records having certain characteristics.
Excel provides a convenient way of formatting databases to facilitate analysis using sorting and filtering, called Tables.
Example 3.10: Creating an Excel Table
First, select the range of the data, including headers (a useful shortcut is to select the first cell in the upper left corner, then press Ctrl+Shift+down arrow, and then Ctrl+Shift+right arrow).
Next, click Table from the Tables group on the Insert tab and make sure that the box for My Table Has Headers is checked. (You may also just select a cell within the table and then click on Table from the Insert menu.)
The table range will now be formatted and will continue automatically when new data are entered.
If you click within a table, the Table Tools Design tab will appear in the ribbon, allowing you to do a variety of things, such as change the color scheme, remove duplicates, change the formatting, and so on.
Example 3.11: Table-Based Calculations
Suppose that in the Credit Risk Data table, we wish to calculate the total amount of savings in column C. We could, of course, simply use the function =SUM(C4:C428). However, with a table, we could use the formula =SUM(Table1[Savings]). The table name, Table1, can be found (and changed) in the Properties group of the Table Tools Design tab. Note that Savings is the name of the header in column C. One of the advantages of doing this is that if we add new records to the table, the calculation will be updated automatically.
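The advantage of referring to a column by its header rather than by a fixed cell range can be sketched outside Excel. The following is a minimal Python illustration, not Excel's implementation; the records, the helper name column_sum, and all values are hypothetical.

```python
# Hypothetical records standing in for rows of an Excel table.
table1 = [
    {"Loan Purpose": "Small Appliance", "Savings": 739},
    {"Loan Purpose": "Furniture", "Savings": 1230},
    {"Loan Purpose": "New Car", "Savings": 389},
]


def column_sum(table, header):
    """Sum a column by header name, in the spirit of =SUM(Table1[Savings])."""
    return sum(row[header] for row in table)


total_before = column_sum(table1, "Savings")

# Adding a record requires no change to the calculation itself,
# whereas a fixed range such as C4:C6 would have to be edited.
table1.append({"Loan Purpose": "Education", "Savings": 1000})
total_after = column_sum(table1, "Savings")

print(total_before, total_after)
```

Because the calculation names the column instead of a row range, it picks up new records without being rewritten.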
Sorting Data in Excel
The sort buttons in Excel can be found under the Data tab in the Sort & Filter group. Select a single cell in the column you want to sort on and click the "AZ down arrow" button to sort from smallest to largest or the "AZ up arrow" button to sort from largest to smallest. You may also click the Sort button to specify criteria for more advanced sorting capabilities.
Example 3.12: Sorting Data in the Purchase Orders Database
Suppose we wish to sort the data by supplier. Click on any cell in column A of the data (but not the header cell A3) and then the "AZ down" button in the Data tab. Excel will select the entire range of the data and sort by name of supplier in column A.
Pareto Analysis
An Italian economist, Vilfredo Pareto, observed in 1906 that a large proportion of the wealth in Italy was owned by a small proportion of the people.
Similarly, businesses often find that a large proportion of sales come from a small percentage of customers, a large percentage of quality defects stems from just a couple of sources, or a large percentage of inventory value corresponds to a small percentage of items.
A Pareto analysis involves sorting data and calculating cumulative proportions.
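The two steps of a Pareto analysis (sort, then accumulate proportions) can be sketched in a few lines of Python. The item names and values below are made up for illustration, and pareto_table is a hypothetical helper.

```python
# Hypothetical inventory values per item (illustrative only).
inventory = {
    "Frames": 5000,
    "Wheels": 3000,
    "Saddles": 1000,
    "Pedals": 600,
    "Bells": 400,
}


def pareto_table(values):
    """Sort items from largest to smallest value and accumulate proportions.

    Returns (item, cumulative share of items, cumulative share of value).
    """
    total = sum(values.values())
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    rows = []
    running = 0
    for position, (name, value) in enumerate(ranked, start=1):
        running += value
        rows.append((name, position / len(ranked), running / total))
    return rows


rows = pareto_table(inventory)
for name, pct_items, pct_value in rows:
    print(f"{name:8s} {pct_items:6.0%} of items -> {pct_value:6.0%} of value")
```

Reading down the table shows how quickly the cumulative share of value climbs relative to the cumulative share of items.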
Example 3.13: Applying the Pareto Principle
75% of the bicycle inventory value comes from 40% (9/24) of items.
Sort by
Filtering Data
For large data files, finding a particular subset of records that meet certain characteristics by sorting can be tedious.
Excel provides two filtering tools: AutoFilter for simple criteria, and Advanced Filter for more complex criteria.
Example 3.14: Filtering Records by Item Description
In the Purchase Orders database, suppose we are interested in extracting all records corresponding to the item Bolt-nut package.
Select any cell in the database.
Data > Sort & Filter > Filter
Click on the dropdown arrow in cell D3.
Select Bolt-nut package to filter out all other items.
Example 3.14: Filter Results
The filter tool does not extract the records; it simply hides the records that don't match the criteria. However, you can copy and paste the data to another Excel worksheet, Microsoft Word document, or a PowerPoint presentation.
To restore the original data file, click on the drop-down arrow again and then click Clear filter from "Item Description."
Example 3.15: Filtering Records by Item Cost
Suppose we wish to identify all records in the Purchase Orders database whose item cost is at least $200. First, click on the drop-down arrow in the Item Cost column and position the cursor over Numbers Filter. This displays a list of options. Select Greater Than Or Equal To . . . from the list.
Example 3.15: Filtering Records by Item Cost
The Custom AutoFilter dialog allows you to specify up to two specific criteria using "and" and "or" logic. Enter 200 in the box as shown; the tool will display all records having an item cost of $200 or more.
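The idea of filtering on one or two criteria joined by "and" or "or" can be sketched in Python. This is only an illustration of the logic, not of Excel's dialog; the function name custom_autofilter, the records, and the threshold values are hypothetical.

```python
import operator

# Hypothetical purchase-order records (illustrative values only).
orders = [
    {"Item Description": "Bolt-nut package", "Item Cost": 3.75},
    {"Item Description": "Control panel", "Item Cost": 255.00},
    {"Item Description": "Gasket", "Item Cost": 200.00},
    {"Item Description": "Airframe fastener", "Item Cost": 1100.00},
]


def custom_autofilter(records, field, crit1, crit2=None, logic="and"):
    """Keep records whose field meets one or two (comparison, value) criteria."""

    def meets(record, crit):
        compare, value = crit
        return compare(record[field], value)

    kept = []
    for record in records:
        ok = meets(record, crit1)
        if crit2 is not None:
            other = meets(record, crit2)
            ok = (ok and other) if logic == "and" else (ok or other)
        if ok:
            kept.append(record)
    return kept


# One criterion: item cost greater than or equal to a threshold.
at_least = custom_autofilter(orders, "Item Cost", (operator.ge, 200))

# Two criteria joined with "and": threshold <= cost < 1000.
in_band = custom_autofilter(
    orders, "Item Cost", (operator.ge, 200), (operator.lt, 1000), logic="and"
)

print(len(at_least), len(in_band))
```

Unlike the Excel filter, which hides non-matching rows in place, this sketch returns a new list and leaves the original records untouched.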
About the AutoFilter
AutoFilter creates filtering criteria based on the type of data being filtered. If you choose to filter on Order Date or Arrival Date, the AutoFilter tools will display a different Date Filters menu list for filtering that includes "tomorrow," "next week," "year to date," and so on.
AutoFilter can be used sequentially to "drill down" into the data. For example, after filtering the results by Bolt-nut package, we could then filter by order date and select all orders processed in September.
Statistical Methods for Summarizing Data
Statistics is both the science of uncertainty and the technology of extracting information from data.
A statistic is a summary measure of data.
Descriptive statistics are methods that describe and summarize data.
Microsoft Excel supports statistical analysis in two ways:
1. Statistical functions
2. Analysis Toolpak add-in
Frequency Distributions for Categorical Data
A frequency distribution is a table that shows the number of observations in each of several nonoverlapping groups.
Categorical variables naturally define the groups in a frequency distribution.
To construct a frequency distribution, we need only count the number of observations that appear in each category.
This can be done using the Excel COUNTIF function.
Example 3.16: Constructing a Frequency Distribution for Items in the Purchase Orders Database
List the item names in a column on the spreadsheet.
Use the function =COUNTIF($D$4:$D$97, cell_reference), where cell_reference is the cell containing the item name.
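The counting step can be mirrored in Python. The sketch below uses a hypothetical countif helper that does exact matching only (Excel's COUNTIF also supports wildcards and comparison criteria), and the item list is invented for illustration.

```python
from collections import Counter

# Hypothetical item column (stand-in for the item descriptions in column D).
items = [
    "Bolt-nut package",
    "O-Ring",
    "Bolt-nut package",
    "Gasket",
    "O-Ring",
    "Bolt-nut package",
]


def countif(values, criterion):
    """Count the entries equal to the criterion (exact match only)."""
    return sum(1 for value in values if value == criterion)


# One COUNTIF-style call per distinct item name gives the frequency distribution.
frequency = {name: countif(items, name) for name in sorted(set(items))}

# collections.Counter produces the same table in a single pass.
frequency_via_counter = dict(Counter(items))

for name, count in frequency.items():
    print(f"{name:18s} {count}")
```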
Example 3.16: Constructing a Frequency Distribution for Items in the Purchase Orders Database
Construct a column chart to visualize the frequencies.
Example 3.17: Constructing a Relative Frequency Distribution for Items in the Purchase Orders Database
First, sum the frequencies to find the total number (note that the sum of the frequencies must be the same as the total number of observations, n).
Then divide the frequency of each category by this value.
Frequency Distributions for Numerical Data
For numerical data that consist of a small number of discrete values, we may construct a frequency distribution similar to the way we did for categorical data; that is, we simply use COUNTIF to count the frequencies of each discrete value.
Example 3.18: Frequency and Relative Frequency Distribution for A/P Terms
In the Purchase Orders data, the A/P terms are all whole numbers: 15, 25, 30, and 45.
Excel Histogram Tool
A graphical depiction of a frequency distribution for numerical data in the form of a column chart is called a histogram.
Frequency distributions and histograms can be created using the Analysis Toolpak in Excel.
Click the Data Analysis tools button in the Analysis group under the Data tab in the Excel menu bar and select Histogram from the list.
Histogram Dialog
Specify the Input Range corresponding to the data. If you include the column header, then also check the Labels box so Excel knows that the range contains a label. The Bin Range defines the groups (Excel calls these "bins") used for the frequency distribution.
If you do not specify a Bin Range, Excel will automatically determine bin values for the frequency distribution and histogram, which often results in a rather poor choice.
If you have discrete values, set up a column of these values in your spreadsheet for the bin range and specify this range in the Bin Range field.
Example 3.19:
We will create a frequency distribution and histogram for the A/P Terms variable in the Purchase Orders database.
We defined the bin range below the data in cells H99:H103 as follows:
Month
15
25
30
45
Example 3.19:
Histogram tool results:
Histograms for Numerical Data
For numerical data that have many different discrete values with little repetition or are continuous, a frequency distribution requires that we define the groups by specifying
1. the number of groups,
2. the width of each group, and
3. the upper and lower limits of each group.
Choose between 5 and 15 groups, and the range of each should be equal.
Choose the lower limit of the first group (LL) as a whole number smaller than the minimum data value and the upper limit of the last group (UL) as a whole number larger than the maximum data value.
Example 3.20: Constructing a Frequency Distribution and Histogram for Cost per Order
The data range from a minimum of $68.75 to a maximum of $127,500; set the lower limit of the first group to $0 and the upper limit of the last group to $130,000.
If we select 5 groups, the width of each group, (upper limit - lower limit)/(number of groups), is ($130,000 - 0)/5 = $26,000.
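The group-width calculation and the resulting bin counts can be sketched in Python. The limits, group count, and data values below are illustrative, and the helper names group_limits and histogram_counts are hypothetical; each value is counted in the first group whose upper limit it does not exceed, which is assumed here to match how bin ranges are applied.

```python
def group_limits(lower, upper, groups):
    """Return the common group width and the upper limit of each group."""
    width = (upper - lower) / groups
    return width, [lower + width * i for i in range(1, groups + 1)]


def histogram_counts(data, upper_limits):
    """Count each value in the first group whose upper limit is >= the value.

    Values above the last upper limit are ignored in this sketch.
    """
    counts = [0] * len(upper_limits)
    for x in data:
        for i, limit in enumerate(upper_limits):
            if x <= limit:
                counts[i] += 1
                break
    return counts


# Illustrative limits: lower limit 0, upper limit 130,000, five groups.
width, limits = group_limits(0, 130_000, 5)

# Hypothetical cost-per-order values.
costs = [68.75, 1_500, 26_000, 30_000, 77_000, 127_500]
counts = histogram_counts(costs, limits)

print(width)
print(limits)
print(counts)
```

Changing the third argument of group_limits to 10 yields a ten-group version of the same distribution with narrower groups.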
Example 3.20: Constructing a Frequency Distribution and Histogram for Cost per Order
Ten-group histogram
Example 3.21: Computing Cumulative Relative Frequencies
Set the cumulative relative frequency of the first group equal to its relative frequency. Then add the relative frequency of the next group to the cumulative relative frequency.
For example, the cumulative relative frequency in cell D3 is computed as =D2+C3 = 0.000 + 0.447 = 0.447.
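The running-sum rule can be sketched in Python with itertools.accumulate. The group frequencies below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from itertools import accumulate

# Hypothetical group frequencies, in order of increasing group limits.
frequencies = [2, 5, 8, 4, 1]
n = sum(frequencies)

relative = [f / n for f in frequencies]

# The first cumulative value equals the first relative frequency;
# each later value adds the next group's relative frequency.
cumulative = list(accumulate(relative))

for rel, cum in zip(relative, cumulative):
    print(f"{rel:.3f}  {cum:.3f}")
```

The last cumulative relative frequency should always equal 1 (up to rounding).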
The kth percentile is a value at or below which at least k percent of the observations lie. The most common way to compute the kth percentile is to order the data values from smallest to largest and calculate the rank of the kth percentile using the formula:
Statistical software use different methods that often involve interpolating between ranks instead of rounding, thus producing different results. The Excel function [...]
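The rank formula itself is not legible in this copy of the slide. As a minimal sketch, the Python below assumes one commonly used rule, rank = nk/100 + 0.5 rounded to the nearest integer; that rule, the helper name percentile_by_rank, and the data are all assumptions made for illustration.

```python
import math


def percentile_by_rank(data, k):
    """Return the kth percentile using an assumed rank rule.

    Assumption: rank = n*k/100 + 0.5, rounded to the nearest integer
    (halves rounded up), then clamped to the range 1..n.
    """
    ordered = sorted(data)
    n = len(ordered)
    raw_rank = n * k / 100 + 0.5
    rank = math.floor(raw_rank + 0.5)  # round half up
    rank = max(1, min(n, rank))
    return ordered[rank - 1]  # ranks are 1-based


# Hypothetical data set of ten values.
values = [40, 10, 90, 30, 70, 20, 100, 60, 50, 80]

p25 = percentile_by_rank(values, 25)
p50 = percentile_by_rank(values, 50)
print(p25, p50)
```

A rank-based rule always returns an actual data value, whereas interpolating methods can return a value between two observations, which is one reason different tools report different percentiles.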
Compute the 90th percentile for Cost per Order in the Purchase Orders data.
Example 3.24: Excel Rank and Percentile Tool
Data > Data Analysis > Rank and Percentile
90.3rd percentile = $74,375 (same result as manually computing the 90th percentile)
The Excel value of the 90th percentile that was computed in Example 3.23 as $74,375 is the 90.3rd percentile value.
Quartiles
Quartiles break the data into four parts.
The 25th percentile is called the first quartile, Q1;
the 50th percentile is called the second quartile, Q2;
the 75th percentile is called the third quartile, Q3; and
the 100th percentile is the fourth quartile, Q4.
One-fourth of the data fall below the first quartile, one-half are below the second quartile, and three-fourths are below the third quartile.
Excel function QUA[...]
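Quartiles can also be computed with Python's standard library. The sketch below uses statistics.quantiles with the "inclusive" interpolation method, which is one of several conventions and may not match every tool exactly; the data are hypothetical.

```python
import statistics

# Hypothetical data set.
data = [10, 20, 30, 40, 50, 60, 70, 80, 90]

# n=4 splits the data into four parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# The fourth quartile (100th percentile) is simply the maximum.
q4 = max(data)

print(q1, q2, q3, q4)
```

The second quartile coincides with the median, which gives an easy consistency check.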
Example 3.25: Computing Quartiles in Excel
Compute the quartiles of the Cost per Order data.
First quartile: =QUA[...]
Third quartile: =QUA[...]
Fourth quartile: =QUA[...]
Cross-Tabulations
A cross-tabulation is a tabular method that displays the number of observations in a data set for different subcategories of two categorical variables.
A cross-tabulation table is often called a contingency table.
The subcategories of the variables must be mutually exclusive and exhaustive, meaning that each observation can be classified into only one subcategory, and, taken together over all subcategories, they must constitute the complete data set.
Example 3.26: Constructing a Cross-Tabulation
Sales Transactions database
Count the number (and compute the percentage) of books and DVDs ordered by region.
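Counting observations by two categorical variables can be sketched in Python with collections.Counter. The transactions below are invented for illustration and are not the Sales Transactions data.

```python
from collections import Counter

# Hypothetical (region, product) pairs, one per transaction.
transactions = [
    ("East", "Book"),
    ("East", "DVD"),
    ("West", "Book"),
    ("West", "Book"),
    ("North", "DVD"),
    ("East", "Book"),
]

pair_counts = Counter(transactions)
regions = sorted({region for region, _ in transactions})
products = sorted({product for _, product in transactions})

# Cross-tabulation of counts: one row per region, one column per product.
table = {r: {p: pair_counts[(r, p)] for p in products} for r in regions}

# Percentage of each region's orders by product.
percent = {
    r: {p: 100 * table[r][p] / sum(table[r].values()) for p in products}
    for r in regions
}

for r in regions:
    print(r, table[r], {p: round(v, 1) for p, v in percent[r].items()})
```

Each transaction lands in exactly one cell, so the cell counts add up to the number of observations, which reflects the mutually exclusive and exhaustive requirement.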
Cross-Tabulation Visualization: Chart of Regional Sales by Product
Exploring Data
Excel provides a powerful tool for distilling a complex data set into meaningful information: PivotTables.
PivotTables allow you to create custom summaries and charts of key information in the data.
PivotTables can be used to quickly create cross-tabulations and to drill down into a large set of data in numerous ways.
Constructing PivotTables
Click inside your database.
Insert > Tables > PivotTable
The wizard creates a blank PivotTable as shown.
PivotTable Field List
Select and drag the fields to one of the PivotTable areas:
Report Filter
Column Labels
Row Labels
Values
Initial PivotTable for
Changing Value Field Settings
Active Field > Analyze > Field Settings
Change the summarization method in the Value Field Settings dialog box.
Select Count.
Final PivotTable
Modifying PivotTables
Uncheck the boxes in the PivotTable Field List or drag the field names to different areas.
You may easily add multiple variables in the fields to create different views of the data.
Example: drag the Source field into the Row Labels area.
Example 3.28:
Dragging a field into the
PivotCharts
PivotCharts visualize data in PivotTables.
They can be created in a simple one-click fashion.
Select the PivotTable.
From the Analyze tab, click PivotChart.
Excel will display an Insert Chart dialog that allows you to choose the type of chart you wish to display.
Example 3.29: A PivotChart for Sales Data
By clicking on the drop-down buttons, you can easily change the data that are displayed by filtering the data. Also, by clicking on the chart and selecting the PivotChart Tools Design tab, you can switch the rows and columns to display an alternate view of the chart or change the chart type entirely.
Slicers
Excel 2010 introduced slicers, tools for drilling down to "slice" a PivotTable and display a subset of data.
To create a slicer for any of the columns in the database, click on the PivotTable and choose Insert Slicer from the Analyze tab in the PivotTable Tools ribbon.
Example 3.30
PivotTable Dashboards
The camera tool is useful for creating PivotTable-based dashboards.
If you create several different PivotTables and charts, you can easily use the camera tool to take pictures of them and consolidate them onto one worksheet.
In this fashion, you can still make changes to the PivotTables and they will automatically be reflected in the camera shots.
Camera-Based Dashboard Example