James Nowotarski
24 April 2008
IS 425 Enterprise Information
Spring 2008
Today’s Agenda
Topic                     Duration
Recap of 4/17             20 minutes
IS Competency Analysis    20 minutes
Data warehouse            40 minutes
*** Break                 15 minutes
Data mining               40 minutes
Analytics                 30 minutes
Current events            20 minutes
Wrap-up
3
Gartner 2008 CIO survey
2008 Business Expectations
To what extent will each of the following be a top priority for you in 2008?
Priority (ranking by year)                             2008  2007  2006
Improving business processes                              1     1     1
Attracting and retaining new customers                    2     3     3
Creating new products or services (innovation)            3    10     9
Expanding into new markets or geographies                 4     9    **
Reducing enterprise costs                                 5     2     2
Improving enterprise workforce effectiveness              6     4    **
Expanding current customer relationships                  7     *     *
Increasing the use of information/analytics               8     7     6
Targeting customers and markets more effectively          9     *     *
Acquiring new companies and capabilities (M&A, etc.)     10     *     *
* New question for 2008   ** New question for 2007
4
Gartner 2008 CIO survey
2008 CIO Technology Priorities
To what extent will each of the following technologies be a top five priority for you in 2008?
Technology (ranking by year)                     2008  2007  2006  2008 unweighted average budget change
Business intelligence                               1     1     1  11.20%
Enterprise applications (ERP, SCM, CRM, etc.)       2     2    **   8.02%
Servers & storage technologies                      3     5     9   8.45%
Legacy modernization, upgrade or replacement        4     3    10   5.79%
Security technologies                               5     6     2   8.53%
Technical infrastructure                            6     8    12   4.67%
Networking, voice and data                          7     4     8   6.83%
Collaboration technologies                          8    10     4   7.75%
Document management                                 9     9    **   7.91%
Service oriented (SOA, SOBA)                       10     7     6   6.71%
* New question for 2008   ** New question for 2007
5
Porter’s Value Chain Model
Figure 3.6: Porter's value chain model for a manufacturing firm. (Source: Reprinted with permission of the Free Press, a Division of Simon & Schuster Inc. from Competitive Advantage: Creating and Sustaining Superior Performance. Copyright © 1985 by Michael Porter.)
6
e-Business Application Architecture
[Figure: e-Business application architecture. Recoverable labels: Supply Chain Mgmt; Selling Chain Mgmt; Stakeholders; Business Partners, Suppliers, Distributors, Resellers; Customers, Resellers; Employees; HRMS/E-Procurement; Finance, Auditing, Mgmt Control; BI; EAI; CRM; ERP; Logistics, Production, Distribution; Marketing, Sales, Cust Svce]
Information Systems IS 425
DePaul University
7
What is ERP?
Anatomy of an Enterprise System
Source: Davenport, T. (1998). Putting the enterprise into the enterprise system. Harvard Business Review, (July/August), 131
ENTERPRISE SYSTEMS
Enterprise System Architecture
Anatomy of an Enterprise System
Source: Adam & Sammon
ERP Supported Functions
Financial          Hum Res          Ops & Log       Sales & Mktg
Accts receivable   Time accounting  Inventory       Orders
Asset account      Payroll          MRP             Pricing
Cash forecast      Personnel plan   Plant Mtce      Sales Mgt
Cost accounting    Travel expense   Prod planning   Sales plan
Exec Info Sys                       Project Mgmt
Financial consol                    Purchasing
General ledger                      Quality Mgmt
Profit analysis                     Shipping
Standard costing                    Vendor eval
Classification of IT portfolio
Operations Decisions Strategies
Finance
Accounting
Marketing
Human resources
Etc.
IBM (Cognos)
Information Builders
BI Platforms
SAS
JD Edwards
Peoplesoft
SAP
Oracle
Microsoft
Enterprise Systems
Custom
The Enterprise Need Through 2010: Balance Information, Processes and People
[Figure labels: Enabling Process Agility; Managing the End-to-End Process Cycle; Enabling Information Workers; Incorporating Information; Application Portfolio Management]
CRM Changes, 2007 to 2010
Technology
• SaaS to move to become 25% of all CRM, hottest in sales force automation, Web analytics, e-commerce and small call centers
• CRM BPM and CRM BPP rise in takeup, displacing custom-build and CRM suites
• Customer data integration and multichannel integration, increasing focus for investment as CRM moves outside individual business units
Market
• Growing at 11%, strongest market position since 2000, skills shortages
• SAP and Oracle = 50%, Salesforce.com, Microsoft growing fastest
• High levels of M&A, but also high levels of startups
Functional
• Build-your-own drops from 70% of all CRM implementations to <50%
• SFA, call centers, campaign management remain 65% of package projects
• Hot markets in community marketing, sales pricing management, analytics for Web, sales and call center, collaborative intelligence
ERP Changes, 2007 to 2010
Technology
• SOA has an impact on ERP implementations
Market
• Vendor consolidation — bifurcation to big vendors and little vendors
• Slowing of user upgrades with SOA benefit uncertainty
• Instance consolidation remains big (retiring of legacy products)
Functional
• Return of shared services
• Focus on reconnecting end-to-end processes with integration technologies and acquisitions of big suites
• Functionality delivered through components rather than big changes to core of applications
The ability to adapt business application portfolios to share information and create and source new business processes quickly and at a low cost will be a critical source of competitive advantage.
Enabling business users to augment current business application environments through composition of new business processes and/or alternate sourcing models will enable users to quickly and efficiently adapt to changes in business models.
Focus 1: Business Model and Process Agility
The Technology Path Through 2010
From: Managing the End-to-End Process Cycle
To: Enabling Business and Person-to-Process Innovation/Agility
Disruptive Impact: The Process of "Me"
Process of Me: process needs to be redefined as Business + People Processes
Instant Messaging, Alerts, Threaded Discussion (Collaboration and Personal Productivity)
Processes embrace the chaos of business and people within the business.
SOA
BPM
Middleware
BI/BAM
EDA
Collaboration
Personal Productivity
EIM
Segregate Your Processes: Commoditized vs. Differentiating
[Figure labels: Differentiating Applications; Infrastructure; Commoditized Applications; Middleware; Commoditized Business Processes; Differentiating Bus. Processes; Adaptable Technology; Adaptable IT Processes]
20
Competency Modules for MSIS
[Figure: MSIS curriculum map; recoverable labels follow]
Prerequisite Phase and Foundation Phase courses:
CSC 211 Programming in Java I
CSC 212 Programming in Java II
IT 215 Analysis & Design Techniques
ECT 310 Internet Application Development
ECT 425 Technical Fundamentals of Distributed Info Systems
CSC 451 Database Design
SE 430 Object-Oriented Modeling
IS 425 Enterprise Information
Competency modules (Level I, Level II, Level III):
Application Development
Database I
E-Business Systems
Data Mining & Analytics
Network Design
HCI Methods
Internet Application Development
Database II
Information Assurance & Security
Enterprise Systems Integration
IT Project Management I
Wireless & Mobile Applications
Knowledge Management
IT Planning & Strategies
Global Systems & Strategies
Legal & Social Issues
Advanced Internet Tech.
IT Architecture Design
Software Engineering
IT Project Management II
Capstone: IS 577
21
IT Outsourcing
Best jobs in America
1. Software engineer
2. College professor
3. Financial adviser
4. Human resources manager
5. Physician’s assistant
6. Market research analyst
7. Computer/IT analyst
8. Real estate appraiser
9. Pharmacist
10. Psychologist
Source: Kalwarski, T., Mosher, D., Paskin, J. & Rosato, D. (2006, May). 50 best jobs in America. Money. Retrieved September 8, 2006, from http://money.cnn.com/magazines/moneymag/bestjobs/
Data Warehouse Architecture
Client Client
Warehouse
Source Source Source
Query & Analysis
Integration
Metadata
Staging Area
Metadata Repository
Data Marts
Data Warehouse
Optimize for Production:
• Excellent response time
• Workflow-driven
• Vendor application development/support
Optimize for Reporting/Analysis:
• Data quality/accuracy
• Single version of truth across all systems
• Rapid retrieval of high volume
Sources
Basic DW: The Repositories
Operational Data Store
Analysis, Query, Reports, Data mining
The Schumacher Group, April 2008
Multi-Tiered Architecture
[Figure: multi-tiered data warehouse architecture. Data Sources (operational DBs, other sources) feed Extract, Transform, Load, Refresh into Data Storage (data warehouse, data marts, metadata, Monitor & Integrator); an OLAP Engine (OLAP server) serves the Front-End Tools (analysis, query, reports, data mining)]
27
Lightly summarized
28
Simple cumulative
29
Simple cumulative
Data Model For OLTP
• Operational systems, such as point-of-sale systems, store their data in OLTP (Online Transaction Processing) databases.
• Structurally, an OLTP database is no different from any other database.
• The only difference is the way in which the data is stored.
Data Model for OLTP
31
Simple cumulative
Data Model for Data Warehouse
Warehouse design: Multi-dimensional Data Base (MDDB)
• Multi-Dimensional Database
– Dimensions are used to index the array. Here Date, Product, and Store are the dimensions of the MDDB.
– “Facts” are stored in the array cells. Here the Sales for each store, each product, and each month are computed and stored in a cell of the MDDB.
[Figure: Sales cube with axes Product (milk, soda, eggs, soap), Store (A, B), and Date (J, F, M, A)]
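As a rough illustration of the MDDB idea above, the sketch below stores Sales facts in a three-dimensional array indexed by Date, Product, and Store. The dimension members come from the figure; the sales amounts and the helper names `set_sales` and `get_sales` are invented for illustration.

```python
# Dimension members taken from the slide; the sales amounts are made up.
DATES = ["J", "F", "M", "A"]
PRODUCTS = ["milk", "soda", "eggs", "soap"]
STORES = ["A", "B"]

# cube[d][p][s] holds the Sales fact for one (date, product, store) cell.
cube = [[[0 for _ in STORES] for _ in PRODUCTS] for _ in DATES]


def set_sales(date, product, store, amount):
    """Store a fact in the cell addressed by the three dimension values."""
    cube[DATES.index(date)][PRODUCTS.index(product)][STORES.index(store)] = amount


def get_sales(date, product, store):
    """Look up the fact stored in one cell of the array."""
    return cube[DATES.index(date)][PRODUCTS.index(product)][STORES.index(store)]


set_sales("J", "milk", "A", 120)
set_sales("J", "milk", "B", 95)
set_sales("F", "soda", "A", 60)

print(get_sales("J", "milk", "A"))  # 120
```

The point of the layout is that every cell is reached directly through its dimension values, with no search over transaction rows.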
Multidimensional Data
• Sales volume as a function of product, month, and region
[Figure: cube with axes Product, Region, and Month]
Dimensions: Product, Location, Time
Hierarchical summarization paths:
Product:  Industry > Category > Product
Location: Region > Country > City > Office
Time:     Year > Quarter > Month or Week > Day
A Sample Data Cube
[Figure: cube with axes Date (1Qtr, 2Qtr, 3Qtr, 4Qtr, sum), Product (TV, VCR, PC, sum), and Country (U.S.A, Canada, Mexico, sum). Highlighted cell: total annual sales of TV in U.S.A.]
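The "sum" margins of the sample cube can be sketched in a few lines. This is only an illustration under assumed data: the base facts and the `build_cube` helper are hypothetical, and a real OLAP engine would precompute or index these aggregates differently.

```python
from itertools import product as combos

# Hypothetical base facts: (quarter, product, country) -> sales.
facts = {
    ("1Qtr", "TV", "U.S.A"): 100,
    ("2Qtr", "TV", "U.S.A"): 120,
    ("3Qtr", "TV", "U.S.A"): 90,
    ("4Qtr", "TV", "U.S.A"): 150,
    ("1Qtr", "PC", "Canada"): 80,
    ("1Qtr", "VCR", "Mexico"): 40,
}


def build_cube(base_facts):
    """Add every base fact to its own cell and to each matching 'sum' cell."""
    cube = {}
    for (quarter, item, country), amount in base_facts.items():
        # Each dimension is either kept or rolled up to "sum": 2 * 2 * 2 = 8 cells.
        for key in combos((quarter, "sum"), (item, "sum"), (country, "sum")):
            cube[key] = cube.get(key, 0) + amount
    return cube


cube = build_cube(facts)
print(cube[("sum", "TV", "U.S.A")])  # total annual sales of TV in U.S.A: 460
print(cube[("sum", "sum", "sum")])   # grand total: 580
```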
Online Analytical Processing (OLAP)
• Slice and Dice ...
– Select dimensions
– Choose measures
– Filter by dimensions
• Drill Down ...
– Drill down hierarchies
– Drill through to details
• Present the Results
– Present as spreadsheet
– Display graphically
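The operations listed above can be sketched against a small in-memory fact list. The rows, field names, and the `query` helper are assumptions made for this example, not part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, product, country, sales).
ROWS = [
    (2007, "1Qtr", "TV", "U.S.A", 100),
    (2007, "2Qtr", "TV", "U.S.A", 120),
    (2007, "1Qtr", "PC", "U.S.A", 70),
    (2007, "1Qtr", "TV", "Canada", 50),
    (2007, "2Qtr", "PC", "Canada", 30),
]
FIELDS = ("year", "quarter", "product", "country")


def query(rows, group_by, **filters):
    """Sum sales per group_by key, keeping only rows that pass every filter."""
    totals = defaultdict(int)
    for row in rows:
        record = dict(zip(FIELDS, row[:4]))
        if all(record[dim] in allowed for dim, allowed in filters.items()):
            totals[tuple(record[dim] for dim in group_by)] += row[4]
    return dict(totals)


# Slice: fix one dimension (country) and view sales by product.
slice_usa = query(ROWS, ["product"], country={"U.S.A"})

# Dice: filter on several dimensions at once.
dice = query(ROWS, ["product", "country"], product={"TV"}, quarter={"1Qtr"})

# Drill down: move from the year level to the quarter level of the time hierarchy.
by_year = query(ROWS, ["year"])
by_quarter = query(ROWS, ["year", "quarter"])

print(slice_usa)   # {('TV',): 220, ('PC',): 70}
print(by_quarter)  # {(2007, '1Qtr'): 220, (2007, '2Qtr'): 150}
```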
STAR Schema for an OLAP
• OLAPs have a different mandate from OLTPs.
– OLAPs are designed to give an overview analysis of what happened. Hence the data storage (i.e. data modeling) has to be set up differently.
– The most common method used for OLAP design is called the star design.
• It is not always necessary to create a data warehouse for OLAP analysis.
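A minimal sketch of a star design, assuming a sales subject area like the one used in the earlier slides: one central fact table references three dimension tables. The table names, column names, and sample rows are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, month TEXT, year INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store   (store_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
-- The fact table sits at the center of the star and references each dimension.
CREATE TABLE fact_sales (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    sales      REAL
);
INSERT INTO dim_date    VALUES (1, 'Jan', 2008), (2, 'Feb', 2008);
INSERT INTO dim_product VALUES (1, 'milk', 'dairy'), (2, 'soda', 'beverage');
INSERT INTO dim_store   VALUES (1, 'A', 'Chicago'), (2, 'B', 'Evanston');
INSERT INTO fact_sales  VALUES (1, 1, 1, 120.0), (1, 1, 2, 95.0),
                               (2, 2, 1, 60.0), (2, 1, 1, 110.0);
""")

# An overview query: join the fact table to two dimensions and aggregate.
rows = conn.execute("""
    SELECT p.name, d.month, SUM(f.sales)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d ON d.date_id = f.date_id
    GROUP BY p.name, d.month
    ORDER BY p.name, d.month
""").fetchall()

print(rows)  # [('milk', 'Feb', 110.0), ('milk', 'Jan', 215.0), ('soda', 'Feb', 60.0)]
```

Keeping descriptive attributes in the dimension tables and only keys plus measures in the fact table is what lets summary queries like this stay short.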
Information Systems IS 425 Class Four
DePaul University
38
Terminology - A Working Definition
Data Mining is a “decision support” process in which we search for patterns of information in data.
A pattern is a conservative statement about a probability distribution.
– Webster: A pattern is (a) a natural or chance configuration, (b) a reliable sample of traits, acts, tendencies, or other observable characteristics of a person, group, or institution
What is data mining
• Data mining is the process by which analysts apply technology to historical data (mining) to determine statistically reliable relationships between variables.
• Generally, it is the procedure by which analysts apply the tools of mathematics and statistical testing to business-relevant, historical data in order to identify relationships, patterns, or affiliations among variables or sections of variables in that data, and so gain greater insight into the underpinnings of the business process (Kudyba & Hoptroff)
Why Do We Need Data Mining ?
Leverage the organization’s data assets
– Only a small portion (typically 5%-10%) of the collected data is ever analyzed
– Data that may never be analyzed continues to be collected, at great expense, out of fear that something that may prove important in the future would otherwise be missing
– Growth rates of data preclude the traditional “manually intensive” approach
Why Do We Need Data Mining?
As databases grow, supporting the decision support process with traditional query languages becomes infeasible
– Many queries of interest are difficult to state in a query language (query formulation problem)
– “find all cases of fraud”
– “find all individuals likely to buy a Ford Expedition”
– “find all documents that are similar to this customer’s problem”
QUERY
RESULT
42
The Law of Accelerating Returns is driving economic growth
• The portion of a product or service’s value comprised of information is approaching 100% asymptotically
• The cost of information at every level is deflating at ~50% per year
• This is a powerful deflationary force
– Completely different from the deflation in the 1929 Depression (collapse of consumer confidence & money supply)
Source: Ray Kurzweil, futurist & inventor
Why Data Mining
Credit ratings/targeted marketing:
– Given a database of 100,000 names, which persons are the least likely to default on their credit cards?
– Identify likely responders to sales promotions
Fraud detection
– Which types of transactions are likely to be fraudulent, given the demographics and transactional history of a particular customer?
Customer relationship management:
– Which of my customers are likely to be the most loyal, and which are most likely to leave for a competitor?
Data Mining helps extract such information
Examples of What People are Doing with Data Mining:
Fraud/Non-Compliance Anomaly detection
– Isolate the factors that lead to fraud, waste and abuse
– Target auditing and investigative efforts more effectively
Credit/Risk Scoring
Intrusion detection
Parts failure prediction
Recruiting/Attracting customers
Maximizing profitability (cross selling, identifying profitable customers)
Service Delivery and Customer Retention
– Build profiles of customers likely to use which services
Web Mining
Examples of What People are Doing with Data Mining:
Where does the data come from?
– Credit card transactions, loyalty cards, discount coupons, customer complaint calls, plus (public) lifestyle studies
Target marketing
– Find clusters of “model” customers who share the same characteristics: interest, income level, spending habits, etc.
– Determine customer purchasing patterns over time
Cross-market analysis
– Associations/correlations between product sales, and prediction based on such associations
Customer profiling
– What types of customers buy what products (clustering or classification)
Customer requirement analysis
– Identify the best products for different customers
– Predict what factors will attract new customers
Examples of What People are Doing with Data Mining:
Finance planning and asset evaluation
– cash flow analysis and prediction
– contingent claim analysis to evaluate assets
– cross-sectional and time series analysis (financial-ratio, trend analysis, etc.)
Resource planning
– summarize and compare the resources and spending
Competition
– monitor competitors and market directions
– group customers into classes and apply a class-based pricing procedure
– set pricing strategy in a highly competitive market
Why Now?
• Data is being produced
• Data is being warehoused
• The computing power is available
• The computing power is affordable
• The competitive pressures are strong
• Commercial products are available
Database Processing vs. Data Mining Processing
Database Processing
Query
– Well defined
– SQL
Data
– Operational data
Output
– Precise
– Subset of database
Data Mining Processing
Query
– Poorly defined
– No precise query language
Data
– Not operational data
– Usually summarized
Output
– Fuzzy
– Not a subset of database
Query Examples
Database
– Find all credit applicants with last name of Smith.
– Identify customers who have purchased more than $10,000 in the last month.
– Find all customers who have purchased beer.
Data Mining
– Find all credit applicants who are poor credit risks. (classification)
– Identify customers with similar buying habits. (clustering)
– Find all items which are frequently purchased with beer. (association rules)
– Describe attributes of customers likely to spend the most. (segmentation)
Data Mining Models and Tasks
Association Rules
• There has been a considerable amount of research in the area of Market Basket Analysis. Its appeal comes from the clarity and utility of its results, which are expressed in the form of association rules.
• Given
– A database of transactions
– Each transaction contains a set of items
• Find all rules X->Y that correlate the presence of one set of items X with another set of items Y
– Example: When a customer buys bread and butter, they buy milk 85% of the time
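The "85% of the time" figure in the example above is what is usually called the confidence of the rule X->Y. The sketch below computes support and confidence over a tiny, made-up transaction list; the data and function names are assumptions for illustration, and it does not search for rules the way a full market basket algorithm would.

```python
# Hypothetical transactions; each one is the set of items in a basket.
TRANSACTIONS = [
    {"bread", "butter", "milk"},
    {"bread", "butter", "milk", "eggs"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"soda", "chips"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(lhs, rhs, transactions):
    """Of the transactions containing lhs, the fraction that also contain rhs.

    Assumes lhs occurs in at least one transaction.
    """
    both = support(set(lhs) | set(rhs), transactions)
    return both / support(lhs, transactions)


rule_support = support({"bread", "butter", "milk"}, TRANSACTIONS)
rule_confidence = confidence({"bread", "butter"}, {"milk"}, TRANSACTIONS)

print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

With this data, bread and butter appear together in three of five baskets, and two of those three also contain milk, so the rule {bread, butter} -> {milk} has confidence of about 0.67.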
Example: Association analysis
[Figure: all 100 orders, with sets marked for orders with pretzels and orders with beer]
Market Basket Example
Is soda typically purchased with bananas? Does the brand of soda make a difference?
Where should detergents be placed in the store to maximize their sales?
Are window cleaning products purchased when detergents and orange juice are bought together?
How are the demographics of the neighborhood affecting what customers are buying?
Example: Segmentation (Target variable: Spending level)
All Customers: 10.6
Male: 20.2
Female: 12.7
Age 0-25: 5.8
Age 25-55: 15.6
Age > 55: 8.5
North: 9.6
South: 6.5
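One simple way to read a segmentation like the one above is as the average of the target variable within each segment. The sketch below assumes that reading; the customer records, the age bands' boundary handling, and the helper names are hypothetical, and real segmentation tools choose the splits automatically rather than taking them as given.

```python
from collections import defaultdict

# Hypothetical customers; "spending" plays the role of the target variable.
CUSTOMERS = [
    {"gender": "Male", "age": 30, "spending": 20.0},
    {"gender": "Male", "age": 60, "spending": 10.0},
    {"gender": "Female", "age": 22, "spending": 6.0},
    {"gender": "Female", "age": 40, "spending": 16.0},
]


def age_band(age):
    """Map an age to a band; the boundary handling here is an assumption."""
    if age < 25:
        return "0-25"
    if age <= 55:
        return "25-55"
    return "> 55"


def mean_spending_by(customers, segment_of):
    """Average spending per segment, where segment_of maps a customer to a label."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for customer in customers:
        label = segment_of(customer)
        sums[label] += customer["spending"]
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


overall = sum(c["spending"] for c in CUSTOMERS) / len(CUSTOMERS)
by_gender = mean_spending_by(CUSTOMERS, lambda c: c["gender"])
by_age = mean_spending_by(CUSTOMERS, lambda c: age_band(c["age"]))

print(overall)    # 13.0
print(by_gender)  # {'Male': 15.0, 'Female': 11.0}
print(by_age)     # {'25-55': 18.0, '> 55': 10.0, '0-25': 6.0}
```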
55
Topic Duration
Recap of 4/17 20 minutes
IS Competency Analysis 20 minutes
Data warehouse 40 minutes
*** Break 15 minutes
Data mining 40 minutes
Analytics 40 minutes
Current events 20 minutes
Wrap-up
Today’s Agenda
What is analytics
The extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions (Davenport)
Much of the attention focuses on “advanced” analytics, of which predictive analytics is a subset
56
Data Mining Models and Tasks
Examples of analytics applications
• What products their customers want
• What prices those customers will pay
• How many items each will buy in a lifetime
• What triggers will make people buy more
• Predict problems with demand and supply chains, to achieve low rates of inventory and high rates of perfect orders
58
Example: Marriott - Factor Analysis Identifies What Is Important
Importance of Attributes in Predicting Propensity for Guest Return:
Months Since Deep Clean
Age of Bed
Use of Fitness Center
Spending in Restaurant
Room Price
Speed of Check-In
Speed of Room Service
Premium Movie Channel
Please Rate the Importance of the Following Aspects of Your Stay:
Low High
Room Cleanliness
Comfort of Bed
Fitness Center
Restaurant
Room Prices
Check-In Experience
Room Service
TV Channels
1: Monitoring
2: Framework
3: Predictive
Example: Marriott’s revenue opportunity model
Computes actual revenues as a percentage of the optimal rates that could have been charged
That figure has grown from 83% to 91% as Marriott’s revenue-management analytics have taken root throughout the enterprise
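The ratio described above is straightforward to compute once actual and optimal revenue are known. The sketch below is only an illustration of that arithmetic; the function name and the revenue amounts are assumptions, and it says nothing about how Marriott estimates the optimal rates.

```python
def revenue_opportunity(actual_revenue, optimal_revenue):
    """Actual revenue as a percentage of the revenue at optimal rates."""
    if optimal_revenue <= 0:
        raise ValueError("optimal_revenue must be positive")
    return 100.0 * actual_revenue / optimal_revenue


# Hypothetical amounts chosen to reproduce the two percentages on the slide.
before = revenue_opportunity(actual_revenue=83.0, optimal_revenue=100.0)
after = revenue_opportunity(actual_revenue=91.0, optimal_revenue=100.0)

print(f"{before:.0f}% -> {after:.0f}%")  # 83% -> 91%
```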
60
7 common targets for analytical activity
61
Long, arduous journey
The UK Consumer Cards and Loans business within Barclays Bank, for example, spent five years executing its plan to apply analytics to the marketing of credit cards and other financial products.
The company had to make process changes in virtually every aspect of its consumer business: underwriting risk, setting credit limits, servicing accounts, controlling fraud, cross selling, and so on.
On the technical side, it had to integrate data on 10 million Barclaycard customers, improve the quality of the data, and build systems to step up data collection and analysis.
And it had to hire new people with top-drawer quantitative skills.
62
The Schumacher Group, April 2008
63
Trends in data mining and advanced analytics projects
• Need to be driven much more by the business units
• The most significant challenges driving changes in the data mining market are scalability and performance
• Terabyte-class databases have become more common today
• The growth of e-commerce has also driven the need for data-mining approaches that work with online Web businesses
• More focus on text mining, with almost 80% of data nowadays in unstructured textual format
64
65
For May 1
Readings on Systems Development
DL: Discussion on data mining/analytics
66
Extra slides
Underlying Technology: Hip and Hype
[Figure: hype cycle as of June 2007, plotting visibility against time through the phases Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, and Plateau of Productivity]
Years to mainstream adoption: less than 2 years; 2 to 5 years; 5 to 10 years; more than 10 years; obsolete before plateau
Technologies plotted:
XML-Enabled Database Management Systems
Linux as a Mission-Critical DBMS Platform
OSS DBMS for Non-Mission-Critical Applications
Real-Time Data Integration
Data Warehouse Appliances
Data Federation/EII
OSS DBMS for Mission-Critical Applications
Data Profiling
Comprehensive Data Integration Tool Suites
XQuery
Master Data Management
Data Service Architectures
Enterprise Information Management
SaaS Data Integration and Data Quality
Information-Centric Infrastructure
Open-Source Data Integration Tools
Entity Resolution and Analysis
Data Quality Dashboards
Metadata Ontology Management
Content Integration
Data Quality Tools
From "Hype Cycle for Data Management, 2007," 2 July 2007
68
Hierarchy of Impact of Information Technology Investments
[Figure: four levels of business value plotted against Time, with Dilution of Impact between levels (Dilution of IT Impacts). Other labels: Impact; Sought; Information Technology $; A; B; C]
Business Value Measures
Business-Unit Financial Business Value
• Revenue growth
• Return on assets
• Revenue per employee
Business-Unit Operational Business Value
• Time to bring a new product to market
• Sales from new products
• Product or service quality
Business-Unit IT Applications Business Value
• Time to implement a new application
• Cost to implement a new application
Firmwide IT Infrastructure Business Value
• Infrastructure availability
• Cost per transaction
• Cost per workstation
Source: “Leveraging the New Infrastructure”, Peter Weill & Marianne Broadbent, ©1998
69
Information Technology Portfolio and Business Value
Increased control
Better information
Better integration
Improved quality
• Shorter time to market
• Premium pricing
• Superior quality
Increased sales
Competitive advantage
Competitive necessity
Market positioning
Innovative services
• 50% fail
• Some spectacular successes
• 2-to-3 year lead
• Premium pricing
• Higher revenue per employee
Cut costs
Increased throughput
• 25-40% return
• Higher ROA
• Low risk
Business integration
Business flexibility
Reduced marginal cost of business unit’s IT
Reduced IT costs
Standardization
More Higher growth Less Higher ROA
Infrastructure
Transactional
Informational
Strategic
Source: “Leveraging the New Infrastructure”, Peter Weill & Marianne Broadbent, ©1998
70
IT Portfolio
Composition Will Be a New Source for Business Application Delivery
Buy, Build, Compose
[Figure labels: EAS; User Interface; BPM; Data Integration; ESB/EAI; User Experience; Business Process; Information Services; Business Application]
73
Holistic view
Technology
Process
People