Financial Information Management
FIM: BUSINESS INTELLIGENCE
Stefano Grazioli
Critical Thinking
Easy Meter
BI and Analytics
Analytics is “the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions” (Davenport and Harris, Competing on Analytics)
“BI refers to the general ability to organize, access and analyze information in order to learn and understand the business.” (Gartner)
GIGO: data quality affects the quality of your decisions
Analysts cannot find what they need 50% of the time
10-25% of the records have inaccuracies or missing elements
Data is frequently misinterpreted
Data loss and theft
Most databases implement inconsistent definitions
Source: T. Redman, Data Driven, 2008
Find the Data Quality issues
Cust ID | Name          | Addr1                 | Addr2  | City            | State    | Zip   | Phone
0345    | Peter Parker  | 765 Spider Cove       |        | New York        | NY       | 10012 | 875-3253
0346    | Mr. Bigg      | Mr. Bigg’s Wigs, Inc. |        | Cville          | Virginia | 22901 | 434-567-3455
0467    | MJ Watson     | 753 45th St           | Apt 45 | New York        | New York | 10024 | 999-9999
0488    | Carl Zeithaml | 34 Sprigg Lane        |        | Charlottesville | VA       | 22904 | (434)-453-3556
0499    | Pete Parker   | 765 Spider Cove       |        | New York        | NY       | 10012 | #875-3253
0722    | Ben Grimm     | Broad and Main        |        | Staunton        | VA       | 24403 | null
0834    | Sue Storm     | 8564 Carver Dr.       |        | NYC             | NY       | null  | 212-450-3556
0853    | Peter Parker  | 2345 Benson Rd        |        | Los Angeles     | CA       | 90210 | #875-3253
StateID State
VA Virginia
NY New York
WY null
null null
Why is Data Bad?
No one gets up in the morning and says
“I’m going to make lots of errors today”
Source: T. Redman, Data Driven, 2008
Data Quality Benchmarks
Analysts cannot find what they need 50% of the time
10-25% of the records have inaccuracies or missing elements
Data frequently misinterpreted
Known data loss and theft
Most databases implement inconsistent definitions
Source: T. Redman, Data Driven, 2008
Approaches to Data Quality
1. Find and Fix
2. Prevent at the source
3. Do nothing
(3M)
Homework
Business Scenario: Google’s Daily CAGR
Daily CAGR for Google
You are an analyst at a brokerage firm.
Many of our customers invest for short amounts of time in Google. They sell their shares within a few weeks... I wonder: do they make any money out of it?
Daily CAGR for Google
A file with ~800 customers who bought and sold GOOG within the last two months.
Three steps (and two homework)
1. Clean data: phones, dates
2. Compute Daily CAGR = (final price / initial price)^(1/days) - 1
3. Report the average Daily CAGR across all customers.
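Steps 2 and 3 can be sketched in a few lines of VB.NET. This is a minimal illustration, not the homework solution: the module name, the function name DailyCagr, and the sample prices are all hypothetical, and it assumes the prices and holding period have already been read from the sheet.

```vb
Imports System
Imports System.Linq

Module DailyCagrSketch
    ' Daily CAGR = (final price / initial price)^(1/days) - 1
    ' Hypothetical helper; inputs are assumed to be already cleaned.
    Function DailyCagr(initialPrice As Double, finalPrice As Double, days As Integer) As Double
        If initialPrice <= 0 OrElse days <= 0 Then
            Throw New ArgumentException("initialPrice and days must be positive.")
        End If
        Return Math.Pow(finalPrice / initialPrice, 1.0 / days) - 1.0
    End Function

    Sub Main()
        ' Made-up trades: one gains 21% over 2 days, one is flat over 10 days.
        Dim results As Double() = {DailyCagr(100.0, 121.0, 2), DailyCagr(100.0, 100.0, 10)}
        ' Step 3: average Daily CAGR across all customers (about 0.05 here).
        Console.WriteLine(results.Average())
    End Sub
End Module
```

A gain of 21% over two days corresponds to a Daily CAGR of about 10%, because 1.1 x 1.1 = 1.21.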
Cleaning Phone Numbers
From: #2-345-3-48565
To: (234)-534-8565
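One possible way to do this transformation is to keep only the digits and, if exactly ten remain, rebuild the number as (xxx)-xxx-xxxx. The sketch below is an assumption about how it could be done, not the course solution; the names PhoneCleanerSketch and CleanPhone are hypothetical, and returning Nothing for an invalid number is a design choice made here so the caller can decide how to flag the cell.

```vb
Imports System
Imports System.Text

Module PhoneCleanerSketch
    ' Hypothetical helper: returns the phone formatted as (xxx)-xxx-xxxx,
    ' or Nothing when the input does not contain exactly 10 digits.
    Function CleanPhone(raw As String) As String
        Dim digits As New StringBuilder()
        For Each c As Char In raw
            ' Keep plain ASCII digits only; drop "#", "-", spaces, parentheses.
            If c >= "0"c AndAlso c <= "9"c Then digits.Append(c)
        Next
        If digits.Length <> 10 Then Return Nothing
        Dim d As String = digits.ToString()
        Return "(" & d.Substring(0, 3) & ")-" & d.Substring(3, 3) & "-" & d.Substring(6, 4)
    End Function

    Sub Main()
        Console.WriteLine(CleanPhone("#2-345-3-48565")) ' (234)-534-8565
        Console.WriteLine(CleanPhone("875-3253") Is Nothing) ' True: only 7 digits
    End Sub
End Module
```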
When the user presses a button labeled “start”, a file selection window pops up. The user selects a .csv file. The file is shown starting at “A1”. The start button becomes invisible. Three more buttons appear: “Clean phone numbers”, “Format Dates”, and “Compute Daily CAGR”.
UML Activity Diagram - Daily Compound Average Growth of a Security (part I)
[Clean ph.no] Select the next phone no. Count its digits.
    [Exactly 10 digits] Format as (xxx)-xxx-xxxx & clear highlight if any
    Otherwise: Highlight the cell in red
    Repeat until [No More Ph.No]
[Format Dates] Select the next column and/or date.
    [is a date] Format as mm/dd/yyyy & clear highlight if any
    Otherwise: Highlight the cell in yellow
    Repeat until [No More Dates in this column], then until [No more columns]
[Compute] Next homework
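The date branch of the diagram can be sketched as a small helper that decides whether a cell's text is a date and, if so, returns it as mm/dd/yyyy. This is only an illustration under assumptions: the names DateCleanerSketch and FormatAsUsDate are hypothetical, the input is assumed to be text in a US-style format, and highlighting the cell in yellow is left to the caller when the function returns Nothing.

```vb
Imports System
Imports System.Globalization

Module DateCleanerSketch
    ' Hypothetical helper: returns the date as mm/dd/yyyy,
    ' or Nothing when the text cannot be read as a date.
    Function FormatAsUsDate(raw As String) As String
        Dim parsed As Date
        Dim usCulture As CultureInfo = CultureInfo.GetCultureInfo("en-US")
        If Date.TryParse(raw, usCulture, DateTimeStyles.None, parsed) Then
            Return parsed.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture)
        End If
        Return Nothing
    End Function

    Sub Main()
        Console.WriteLine(FormatAsUsDate("1/5/2013")) ' 01/05/2013
        Console.WriteLine(FormatAsUsDate("not a date") Is Nothing) ' True
    End Sub
End Module
```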
Reading a File into EXCEL
' store the address of the current active sheet, i.e., the 'target'
Dim myActiveS As Excel.Worksheet = Application.ActiveSheet
' select a file
Dim myFile As String = Application.GetOpenFilename()
' get the data in a new temporary workbook
Application.Workbooks.OpenText(myFile, , , Excel.XlTextParsingType.xlDelimited, , , , , True)
' store the address of the temporary workbook
Dim myActiveWB As Excel.Workbook = Application.ActiveWorkbook
' copy the content from the temporary to the 'target' sheet
myActiveS.Range("A1:J1000").Value = Application.ActiveSheet.Range("A1:J1000").Value
' close the temp workbook
myActiveWB.Close()
Finding the last non-empty row
Dim lastRow As Integer
lastRow = _
    Cells(Rows.Count, 1).End(Excel.XlDirection.xlUp).Row
Suggestions
Video available
WINIT
What Is New In Technology?
Strings and Dates
Strings and Characters
Dim myString As String = "This is a sample string"
Dim myString2 As String = "s"
Dim myChar As Char = "s"c
Testing Numbers
Dim myString As String = "#2344-234-33-3"
Dim temp As String = ""
For Each x As Char In myString
    If IsNumeric(x) Then
        temp = temp + x
    End If
Next
Inserting and Removing
Dim myS As String = "This is a sample string"
myS = myS.Insert(4, "xyz")
myS = myS.Remove(4, 3) 'starting where, how many
myString = myS.Replace(" is", " was")
myS = myS.Substring(0, 9) + " another" + myS.Substring(10, 13) + "."
Finding
Dim myS As String = "This is a sample string"
Dim myPosition As Integer = 0
myPosition = myS.IndexOf("s")
Trimming and Padding
myLength = myString.Length
myNewString = myString.Trim()
myNewString = myString.TrimEnd()
myNewString = myString.TrimStart()
myNewString = myString.PadLeft(50) ' 50 = total length of the result
myNewString = myString.PadRight(20) ' 20 = total length of the result
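The argument to PadLeft and PadRight is the total length of the result, not the number of spaces added. A short sketch with a made-up string (the module and variable names are hypothetical) shows the difference:

```vb
Imports System

Module PaddingSketch
    Sub Main()
        Dim ticker As String = "GOOG" ' 4 characters
        Dim padded As String = ticker.PadLeft(10)
        Console.WriteLine("[" & padded & "]") ' [      GOOG] : 6 spaces added
        Console.WriteLine(padded.Length) ' 10
        ' Asking for a total length shorter than the string changes nothing.
        Console.WriteLine(ticker.PadLeft(2)) ' GOOG
    End Sub
End Module
```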
You do the talking
Name, major
Learning objectives
Things you like about the class
Things that can be improved
Strengths / Attitude towards the Tournament
Dates
Dim myDate As Date = "11/14/2002"
Year = myDate.Year
Month = myDate.Month
Day = myDate.Day
DOW = myDate.DayOfWeek
DOY = myDate.DayOfYear
...
MyDate
Year    Month   Day     Week    ...
2002    11      14      45
TimeSpan
Dim myDate1 As Date
Dim myDate2 As Date
Dim myTS As TimeSpan
myDate1 = Range("A1").Value
myDate2 = Range("A2").Value
myTS = myDate2 - myDate1
Range("A3").Value = myTS.Days
TimeSpan: from Date1 to Date2
A TimeSpan represents the elapsed time between two dates.
mySpan.Days gives you the number of whole days.
mySpan.TotalDays gives you the number of days plus a fraction of a day based on the hours, minutes, and seconds.
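The difference between Days and TotalDays shows up as soon as the two dates are not a whole number of days apart. The sketch below uses made-up buy and sell times; the module and variable names are hypothetical.

```vb
Imports System

Module TimeSpanSketch
    Sub Main()
        ' Hypothetical trade: bought in the morning, sold 2.5 days later.
        Dim bought As Date = #11/14/2002 9:30:00 AM#
        Dim sold As Date = #11/16/2002 9:30:00 PM#
        Dim held As TimeSpan = sold - bought
        Console.WriteLine(held.Days) ' 2 (whole days only)
        ' TotalDays is 2.5; the decimal separator printed depends on the current culture.
        Console.WriteLine(held.TotalDays)
    End Sub
End Module
```

For the Daily CAGR exponent, Days is the natural choice when the file holds dates without times, since both properties then agree.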