Business Intelligence and Data Mining

Page 1: DATA Mining PPT

Business Intelligence and Data Mining

Page 2: DATA Mining PPT

Business Intelligence and Data Mining (BI &DM)

• Text Book:
– ‘Business Intelligence: A Managerial Approach’ by Efraim Turban, Ramesh Sharda, Dursun Delen and David King, 2/e, Pearson, 2012
• Reference Material:
– ‘Decision Support and Business Intelligence Systems’ by Efraim Turban, Ramesh Sharda and Dursun Delen, 9/e, Pearson, 2012

Page 3: DATA Mining PPT

Business Intelligence and Data Mining (BI &DM)

• Reference Material:
– ‘Business Intelligence Strategy: A Practical Guide for Achieving BI Excellence’ by John Boyer, Bill Frank, Brian Green and Tracy Harris, MC Press, 2010
– ‘Business Analytics for Managers’ by Gert H. N. Laursen and Jesper Thorlund, Wiley, 2010

Page 4: DATA Mining PPT

• Sessions Plan
– Introduction to Business Intelligence
– Decision Support Systems Concepts, Methodologies and Technologies
– Data Warehousing
– Business Performance Management
– Data Mining for Business Intelligence
– Text and Web Mining
– Business Intelligence: Implementation and Emerging Trends

Page 5: DATA Mining PPT

BI Implementation: Integration and Emerging Trends

Page 6: DATA Mining PPT

Learning Objectives

• Describe the major business intelligence (BI) implementation issues
• List some critical success factors of BI implementation
• Describe the importance and issues in integrating BI technologies and applications
• Understand the needs for connecting BI systems with other information systems
• Define on-demand BI and its advantages/limitations
• List and describe representative privacy, major legal and ethical issues of BI implementation

Page 7: DATA Mining PPT

Learning Objectives

• Understand Web 2.0 and its characteristics as related to BI and decision support
• Understand social networking concepts, selected applications, and their relationship to BI
• Describe how virtual world technologies can change the use of BI applications
• Describe the integration of social software in BI
• Know how Radio Frequency Identification (RFID) data analysis can help improve supply chain management (SCM) and other operations
• Describe how massive data acquisition techniques can enable reality mining

Page 8: DATA Mining PPT

Opening Vignette…

“BI Eastern Mountain Sports Increases Collaboration and Productivity”

• Company background
• Problem description
• Proposed solution
• Results
• Answer & discuss the case questions

Page 9: DATA Mining PPT

Opening Vignette

Collaborative Decision Making at Eastern Mountain Sports

Page 10: DATA Mining PPT

Implementing BI – An Overview

• Decisional Factors in BI Implementation
– Reporting and analysis tools
  • Features, functionality, flexibility, scalability
– Database
  • Scalability, performance, security
– ETL Tools
  • Accessibility, efficiency, usability
– Costs
  • Hardware/software, development/training
– Benefits
  • Tangibles/intangibles: time saving, improved decisions/operations/customer satisfaction/

Page 11: DATA Mining PPT

Implementing BI – An Overview

• Critical Success Factors for BI Implementation
a. Business driven methodology and project management
b. Clear vision and planning
c. Committed management support and sponsorship
d. Data management and quality issues
e. Mapping the solutions to the user requirements
f. Performance considerations of the BI system
g. Robust and extensible framework

Page 12: DATA Mining PPT

Managerial Issues Related to BI Implementation

1. System development and the need for integration
2. Cost–benefit issues and justification
3. Legal issues and privacy
4. BI and BPM today and tomorrow
5. Cost justification; intangible benefits
6. Documenting and securing support systems
7. Ethical issues
8. BI project failures

Page 13: DATA Mining PPT

BI and Integration Implementation

• Types of Integration
– Functional integration
  • different [physically separate] applications are provided/used as if they were a single system
– Physical integration
  • packaging the hardware, software, and communication features required to accomplish functional integration
– Primary focus in BI (and in this book) is functional-application integration

Page 14: DATA Mining PPT

BI and Integration Implementation

• Why integrate?
– To better implement a complete BI system
– To increase the capabilities of the BI applications
– To enable real-time decision support
– To enable more powerful applications
– To facilitate faster system development
– To enhance support activities such as blogs, wikis, RSS feeds, etc.

Page 15: DATA Mining PPT

BI and Integration Implementation

• Levels of BI Integration
– Functional integration can be within the same BI system or across different BI systems
  • Integration across different BI systems can be accomplished in a loosely coupled fashion: input/output passing, messaging (SOA)
  • Integration within a BI system is more cohesive, with several sub-systems constituting the whole
• Embedded Intelligent Systems
– Serving as the intelligent agents within BI
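
The loosely coupled, message-passing style of integration mentioned above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two "systems", the message fields, and the in-memory queue that stands in for a real message bus or SOA endpoint are all hypothetical.

```python
import json
import queue

# Hypothetical channel standing in for a message bus or service endpoint.
channel = queue.Queue()


def forecasting_system_publish(channel, region, forecast_units):
    """System A: publish its output as a self-describing JSON message."""
    message = {"type": "demand_forecast", "region": region, "units": forecast_units}
    channel.put(json.dumps(message))


def dashboard_system_consume(channel):
    """System B: consume messages without knowing System A's internals."""
    totals = {}
    while not channel.empty():
        message = json.loads(channel.get())
        if message["type"] == "demand_forecast":
            region = message["region"]
            totals[region] = totals.get(region, 0) + message["units"]
    return totals


forecasting_system_publish(channel, "north", 120)
forecasting_system_publish(channel, "south", 80)
forecasting_system_publish(channel, "north", 30)

summary = dashboard_system_consume(channel)
print(summary)
```

The only thing the two systems share is the message format, so either side could be replaced without touching the other. That is the sense in which the coupling is "loose".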

Page 16: DATA Mining PPT

Connecting BI Systems to Databases and Other Enterprise Systems

• Virtually every BI application requires database or data warehouse access

Multi-tiered Application Architecture

Page 17: DATA Mining PPT

Connecting BI Systems to Databases and Other Enterprise Systems

• Integrating BI applications and back-end systems
– Web scripting languages (e.g., PHP, JSP, ASP)
– Application integration servers (e.g., WebLogic)
– Enterprise application integration: integration of large systems (BI to ERP, SCM, CRM, KM, etc.)
• Integrating BI and ERP for DSS
– ERP captures and stores data
– BI converts data into information/knowledge
– Middleware?

Page 18: DATA Mining PPT

On-Demand BI

• The Limitations of Traditional BI
– Complex, time-consuming, expensive
• The On-Demand Alternative
– On-demand computing = Utility computing
– SaaS (Software as a service)
– Allows SMEs to utilize affordable BI
– On-demand function alternatives
  • Internally sharing licenses within a firm
  • Sharing licenses with many firms via an ASP

Page 19: DATA Mining PPT

Benefits of On-Demand BI

• Ability to handle fluctuating demand
– Flexible use of the BI technology pool
• Reduced investment/cost
– Hardware (servers and peripherals)
– Software (more features for less)
– Maintenance (centralized timely updates)
• Embodiment of recognized best practices
• Better flexibility and connectivity with other systems via SaaS infrastructure
• Better ROI

Page 20: DATA Mining PPT

The Limitations of On-Demand BI

• Integration of vendors’ software with company’s software may be difficult

• The vendor can go out of business, leaving the company without a service

• It is difficult or even impossible to modify hosted software for better fit with the users’ needs

• Upgrading may become a problem
• You may relinquish strategic data to strangers (lack of privacy/security of corporate data)

Page 21: DATA Mining PPT

Issues of Legality, Privacy and Ethics

• Legal issues
– Liability for actions taken on advice provided by BI
– Who is liable if the software's advice fails?
• Privacy
– Right to be left alone and the right to be free from unreasonable personal intrusions
– Collecting information about individuals
– The Web and information collection
– Mobile user privacy
– Homeland security and individual privacy

Page 22: DATA Mining PPT

Issues of Legality, Privacy and Ethics

• Ethics in Decision Making and Support
– Electronic surveillance
– Software piracy
– Use of proprietary databases
– Use of intellectual property such as knowledge
– Computer accessibility for workers with disabilities
– Accuracy of data, information, and knowledge
– Protection of the rights of users
• Use of corporate computers for non-work-related purposes (personal use of Internet while working)

Page 23: DATA Mining PPT

Issues of Legality, Privacy and Ethics

[Figure: A Model of Ethical Problem Formulation. Labeled elements: typical problem formulation (T.O.P. perspective); integration of moral intensity components; problem formulation expansion; conversation; stakeholder expansion; problem definition; “unfolding” to control expansion. S = Stakeholder]

Page 24: DATA Mining PPT

Emerging Topics in BI – An Overview

• Web 2.0 revolution as it relates to BI (Section 6.7)
• Online social networks (Section 6.8)
• Virtual worlds as related to BI (Section 6.9)
• Integration of social networking and BI (Section 6.10)
• RFID and BI (Section 6.11)
• Reality Mining (Section 6.12)

Page 25: DATA Mining PPT

Emerging Topics in BI – An Overview

The Future of BI

• Web 2.0 revolution as it relates to BI (Section 6.7)
• Online social networks (Section 6.8)
• Virtual worlds as related to BI (Section 6.9)
• Integration of social networking and BI (Section 6.10)
• RFID and BI (Section 6.11)
• Reality Mining (Section 6.12)

Page 26: DATA Mining PPT

Emerging Topics in BI – An Overview

• In 2009, collaborative decision making emerged as a new product category that combines social software with business intelligence platform capabilities.

• In 2010, 20 percent of organizations will have an industry-specific analytic application delivered via software as a service as a standard component of their business intelligence portfolio.

• By 2012, business units will control at least 40 percent of the total budget for BI.

• By 2012, one-third of analytic applications applied to business processes will be delivered through coarse-grained application mashups.

• Because of lack of information, processes, and tools, through 2012, more than 35 percent of the top 5,000 global companies will regularly fail to make insightful decisions about significant changes in their business and markets.

Page 27: DATA Mining PPT

The Web 2.0 Revolution

• Web 2.0: a popular term for describing advanced Web technologies and applications, including blogs, wikis, RSS, mashups, user-generated content, and social networks

• Objective: enhance creativity, information sharing, and collaboration

• Difference between Web 2.0 and Web 1.x: use of the Web for collaboration among Internet users and other users, content providers, and enterprises

Page 28: DATA Mining PPT

The Web 2.0 Revolution

• Web 2.0: an umbrella term for new technologies covering both content and how the Web works
• Web 2.0 has led to the evolution of Web-based virtual communities and their hosting services, such as social networking sites and video-sharing sites
• Companies that understand these new applications and technologies, and apply the capabilities early on, stand to greatly improve internal business processes and marketing

Page 29: DATA Mining PPT

The Web 2.0 Revolution

Characteristics of Web 2.0

• The ability to tap into the collective intelligence of users. The more users contribute, the better.
• Data is made available in new or never-intended ways. Web 2.0 data can be remixed or “mashed up”.
• Web 2.0 relies on user-generated and user-controlled content and data (enhanced collaboration).
• Lightweight programming techniques and tools let nearly anyone act as a Web site developer.
• The virtual elimination of software-upgrade cycles makes everything a perpetual beta or work-in-progress and allows rapid prototyping, using the Web as an application development platform.

Page 30: DATA Mining PPT

The Web 2.0 Revolution

Characteristics of Web 2.0

• Users can access and manage applications entirely through a browser.
• An architecture of participation and digital democracy encourages users to add value to the application as they use it.
• There is a major emphasis on social networks and computing.
• Information sharing and collaboration are greatly supported.
• This allows for rapid and continuous creation of new business models.

“dynamic content, rich user experience, metadata, scalability, open source, and freedom (net neutrality)”

Page 31: DATA Mining PPT

The Web 2.0 Revolution

• Ajax (Asynchronous JavaScript and XML)
– An enabling technology for Web 2.0, resulting in rich, interactive, fast-response, user-friendly GUIs
– Makes Web pages feel more responsive by exchanging small amounts of data with the server behind the scenes (eliminating the need to reload the complete Web page)
– Leads to improved Web page interactivity, loading speed, and usability
• Many companies and new business models have emerged based on Web 2.0

Page 32: DATA Mining PPT

Online Social Networking –Basics and Examples

• A social network is a place where people create their own space, or homepage, on which they write blogs; post pictures, videos, or music; share ideas; and link to other Web locations they find interesting.
– The mass adoption of social networking Web sites points to an evolution in human social interaction
• Social network sites are growing rapidly in size, with some having over 100 million members. Successful sites grow 40 to 50% a year in the first few years and 15 to 25% a year thereafter
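
To make the growth figures above concrete, here is a small compounding sketch. The starting membership of one million, the three-year "early" period, and the specific rates picked from the quoted ranges are all assumptions chosen for illustration, not figures from the slides.

```python
def project_members(initial, early_rate, later_rate, early_years, total_years):
    """Compound yearly growth: early_rate for the first early_years, later_rate after."""
    members = float(initial)
    history = [members]
    for year in range(1, total_years + 1):
        rate = early_rate if year <= early_years else later_rate
        members *= 1 + rate
        history.append(members)
    return history


# Assumed scenario: 1,000,000 members, 40% growth for 3 years, then 15% for 2 years.
history = project_members(1_000_000, 0.40, 0.15, early_years=3, total_years=5)
for year, members in enumerate(history):
    print(f"year {year}: {members:,.0f} members")
```

Under these assumptions the site would have roughly 3.6 times its starting membership after five years, which shows how quickly even the lower ends of the quoted ranges compound.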

Page 33: DATA Mining PPT

Online Social Networking –Social Network Analysis Software

• It is used to identify, represent, analyze, visualize, or simulate networks with
– Nodes: agents, organizations, or knowledge
– Edges: relationships identified from various types of input data (relational and non-relational)
• Various input and output file formats exist
• SNA software tools include
– Business-oriented social network tools such as InFlow and NetMiner
– Social Networks Visualizer, or SocNetV, which is a Linux-based open source package
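
As a rough illustration of the node-and-edge representation described above, the following sketch computes degree centrality, one of the simplest social network analysis measures. It is not how InFlow, NetMiner, or SocNetV work internally; the names and the edge list are made up for the example.

```python
from collections import defaultdict

# Hypothetical undirected relationships between five people (nodes).
edges = [
    ("Ana", "Ben"),
    ("Ana", "Chen"),
    ("Ana", "Dee"),
    ("Ben", "Chen"),
    ("Dee", "Eli"),
]


def degree_centrality(edges):
    """Fraction of the other nodes each node is directly connected to."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    others = len(neighbors) - 1
    if others <= 0:
        return {node: 0.0 for node in neighbors}
    return {node: len(adjacent) / others for node, adjacent in neighbors.items()}


centrality = degree_centrality(edges)
for node, score in sorted(centrality.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.2f}")
```

In this toy network Ana is connected to three of the four other people, so she has the highest score. Dedicated SNA tools offer many more measures (betweenness, closeness, clustering) plus visualization.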

Page 34: DATA Mining PPT

Mobile Social Networking

• Social networking where members converse and connect with one another using cell phones or other mobile devices

• MySpace and Facebook offer mobile services
• Mobile-only services: Brightkite and Fon11
• Basic types of mobile social networks
1. Partnership with mobile carriers (use of MySpace over AT&T network)
2. Without a partnership (“off deck”) (e.g., MocoSpace and Mobikade)
• Mobile Enterprise Networks
• Mobile Community Activities (e.g., Sonopia)

Page 35: DATA Mining PPT

Major Social Network Services

• Facebook: The Network Effect
– Launched in 2004 by Mark Zuckerberg (former Harvard student)
– It is the largest social network service in the world with over 500 million active users worldwide
– Initially intended for college and high school students to connect with other students at the same school
– In 2006 it opened its doors to anyone over 13, enabling Facebook to compete directly with MySpace.

Page 36: DATA Mining PPT

Major Social Network Services

• Orkut: Exploring the Very Nature of Social Networking Sites
– The brainchild of a Turkish Google programmer
– It was to be Google's homegrown answer to MySpace and Facebook
– Format is similar to others: a homepage where users can display every facet of their personal life they desire using various multimedia applications
– A major highlight of Orkut: the ability to create and control communities
– Also supports many languages

Page 37: DATA Mining PPT

Implications of Business and Enterprise Social Networks

• Business oriented social networks can go beyond “advertising and sales”

• Emerging enterprise social networking apps:
– Finding and Recruiting Workers
  • See Application Case 14.2 for a representative example
– Management Activities and Support
– Training
– Knowledge Management and Expert Location
  • e.g., innocentive.com; awareness.com; Caterpillar
– Enhancing Collaboration
– Using Blogs and Wikis Within the Enterprise

Page 38: DATA Mining PPT

Implications of Business and Enterprise Social Networks

• A survey shows that best-in-class companies use blogs and wikis for the following applications:
– Project collaboration and communication (63%)
– Process and procedure documentation (63%)
– FAQs (61%)
– E-learning and training (46%)
– Forums for new ideas (41%)
– Corporate-specific dynamic glossary and terminology (38%)
– Collaboration with customers (24%)

Page 39: DATA Mining PPT

Virtual Worlds

• Virtual worlds have existed for a long time in various forms — stereoscopes, Cinerama, simulators, computer games, …

• They are artificial worlds created by computer systems in which the user has the impression of being immersed

• Examples:
– Second Life (secondlife.com)
– Google Lively (lively.com)
– EverQuest (everquest.com)

Avatars?

Page 40: DATA Mining PPT

Second Life as a DSS

• Advantages:
– Easy access and low cost
– Experienced and dedicated designer/builders
– Tools and venues for communications-driven decision support (DecisionSupportWorld.com)
– A large, dedicated user base
– Impression management / creativity enhancement
– Time compression
– Easy data integration from real life using RSS feeds
– Encourages active participation and experiential learning

Page 41: DATA Mining PPT

Second Life as a DSS

• Disadvantages:
– Learning time and training costs
– Distractions are numerous
– Pranksters and spam are common
– Technology problems persist
– Chat is a very slow communication tool
– Resistance to use
– Addiction
– Participation in most of these virtual environments requires downloading of a "plug-in"

Page 42: DATA Mining PPT

Virtual Tradeshows

See iTradeFair.com

Page 43: DATA Mining PPT

Social Networks and BI: Collaborative Decision Making

• Collaborative decision making (CDM) combines social software and BI
– CDM is a category of decision-support system for non-routine, complex decisions that require iterative human interactions.

– Ad hoc tagging regarding value, relevance, credibility, and decision context can substantially enrich both the decision process and the content that contributes to the decisions.

– Tying BI to decisions and outcomes that can be measured will enable organizations to better demonstrate the business value of BI.

Page 44: DATA Mining PPT

How CDM Works

Page 45: DATA Mining PPT

RFID and BI

• Wal-Mart's RFID mandate in June 2003
• DoD, Target, Albertson's, Best Buy, …
• RFID is a generic technology that refers to the use of radio frequency waves to identify objects.

• RFID is a new member of the automatic identification technologies family, which also includes the ubiquitous barcodes and magnetic strips.

Page 46: DATA Mining PPT

How does RFID work?

• RFID system
– a tag (an electronic chip attached to the product to be identified)
– an interrogator (i.e., reader) with one or more antennae attached
– a computer (to manage the reader and store the data captured by the reader)
• Tags
– Active tags versus passive tags

Page 47: DATA Mining PPT

Data Representation for RFID

• RFID tags contain 96 bits of data in the form of serialized global trade item numbers (SGTIN) [see epcglobalinc.org]
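
The 96-bit layout can be sketched as follows. This is a minimal decoder written for illustration, assuming the SGTIN-96 field layout of the GS1 EPC Tag Data Standard (8-bit header, 3-bit filter, 3-bit partition, 44 bits shared by company prefix and item reference, 38-bit serial). The function name and the returned dictionary keys are hypothetical, and the partition table should be checked against the current standard before any real use.

```python
# Partition value -> (company prefix bits, company prefix digits,
#                     item reference bits, item reference digits)
PARTITIONS = {
    0: (40, 12, 4, 1),
    1: (37, 11, 7, 2),
    2: (34, 10, 10, 3),
    3: (30, 9, 14, 4),
    4: (27, 8, 17, 5),
    5: (24, 7, 20, 6),
    6: (20, 6, 24, 7),
}

SGTIN96_HEADER = 0x30


def decode_sgtin96(hex_epc):
    """Split a 24-hex-digit (96-bit) SGTIN-96 value into its fields."""
    if len(hex_epc) != 24:
        raise ValueError("SGTIN-96 needs exactly 24 hex digits (96 bits)")
    value = int(hex_epc, 16)
    if value >> 88 != SGTIN96_HEADER:
        raise ValueError("not an SGTIN-96 header")
    filter_value = (value >> 85) & 0b111
    partition = (value >> 82) & 0b111
    if partition not in PARTITIONS:
        raise ValueError("invalid partition value")
    cp_bits, cp_digits, ir_bits, ir_digits = PARTITIONS[partition]
    company_prefix = (value >> (38 + ir_bits)) & ((1 << cp_bits) - 1)
    item_reference = (value >> 38) & ((1 << ir_bits) - 1)
    serial = value & ((1 << 38) - 1)
    return {
        "filter": filter_value,
        "company_prefix": str(company_prefix).zfill(cp_digits),
        "item_reference": str(item_reference).zfill(ir_digits),
        "serial": serial,
    }


fields = decode_sgtin96("3074257BF7194E4000001A85")
print(fields)
```

The partition value tells the decoder where the boundary between the company prefix and the item reference falls, which is why the same 44 bits can accommodate company prefixes of different lengths.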

Page 48: DATA Mining PPT

RFID for Supply Chain BI

• RFID in Retail Systems
– Functions in a distribution center
  • receiving, put-away, picking, and shipping
– Sequence of operations at a receiving dock
1. unloading the contents of the trailer
2. verification of the receipt of goods against expected delivery (purchase order)
3. documentation of the discrepancy
4. application of labels to the pallets, cases, items
5. sorting of goods for put-away or cross-dock
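
Steps 2 and 3 of the receiving sequence above (verification against the purchase order and documentation of discrepancies) lend themselves to a short sketch. The SKU names, tag IDs, and data shapes below are hypothetical; a real system would take them from the reader middleware and the ERP.

```python
from collections import Counter


def verify_receipt(purchase_order, tag_reads):
    """Compare RFID reads at the dock door with the expected quantities.

    purchase_order: {sku: expected case count}
    tag_reads: list of (tag_id, sku); the same tag may be read several times.
    Returns {sku: received minus expected} for every SKU that does not match.
    """
    sku_by_tag = {}
    for tag_id, sku in tag_reads:
        sku_by_tag[tag_id] = sku  # de-duplicate repeated reads of one tag
    received = Counter(sku_by_tag.values())

    discrepancies = {}
    for sku in set(purchase_order) | set(received):
        difference = received.get(sku, 0) - purchase_order.get(sku, 0)
        if difference != 0:
            discrepancies[sku] = difference
    return discrepancies


purchase_order = {"TENT-2P": 3, "STOVE": 2}
tag_reads = [
    ("E1", "TENT-2P"),
    ("E2", "TENT-2P"),
    ("E2", "TENT-2P"),  # duplicate read of the same case
    ("E3", "STOVE"),
    ("E4", "STOVE"),
    ("E5", "LAMP"),     # not on the purchase order
]

discrepancies = verify_receipt(purchase_order, tag_reads)
print(discrepancies)
```

A negative number means a shortage and a positive number means an overage or an unexpected item. De-duplicating by tag ID matters because a reader typically reports the same tag many times as a pallet passes the antenna.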

Page 49: DATA Mining PPT

RFID for Supply Chain BI

• RFID in Retail Systems

Page 50: DATA Mining PPT

RFID Data Sample

• RFID in Retail Systems

Page 51: DATA Mining PPT

RFID for BI in Supply Chain

• Better SC visibility with RFID systems
– Timing/duration of movements between different locations, especially important for products with limited shelf life

– Better management of out-of-stock items (optimal restocking of store shelves)

– Help streamline the backroom operations: eliminate unnecessary case cycles, reorders

– Better analysis of movement timings for more effective and efficient logistics
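
The movement-timing analysis mentioned above boils down to ordering each case's reads by time and differencing consecutive timestamps. The sketch below assumes a simple (tag, location, timestamp) read format; the location names and dates are invented for the example.

```python
from datetime import datetime

# Hypothetical reads: (case tag, reader location, timestamp)
reads = [
    ("CASE-001", "store_backroom", "2024-03-02 14:30"),
    ("CASE-001", "distribution_center", "2024-03-01 08:00"),
    ("CASE-001", "sales_floor", "2024-03-04 09:30"),
]


def movement_durations(reads):
    """Hours elapsed between consecutive locations for each tagged case."""
    events_by_case = {}
    for tag, location, stamp in reads:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        events_by_case.setdefault(tag, []).append((when, location))

    durations = []
    for tag, events in events_by_case.items():
        events.sort()  # chronological order
        for (start, origin), (end, destination) in zip(events, events[1:]):
            hours = (end - start).total_seconds() / 3600
            durations.append((tag, origin, destination, hours))
    return durations


durations = movement_durations(reads)
for tag, origin, destination, hours in durations:
    print(f"{tag}: {origin} -> {destination} took {hours:.1f} h")
```

Aggregating these durations by product and route is what would show, for example, how long perishable goods sit in the backroom before reaching the sales floor.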

Page 52: DATA Mining PPT

RFID + Sensors for Better BI

• Knowing the location and health of goods (i.e., exception) during transportation

Page 53: DATA Mining PPT

Reality Mining

• Identifying aggregate patterns and trends in human activity (see sensenetworks.com by MIT & Columbia University)
• Many devices send location information
– Cars, buses, taxis, mobile phones, cameras, and personal navigation devices
– Using technologies such as GPS, WiFi, and cell tower triangulation

• Enables tracking of assets, finding nearby services, locating friends/family members, …
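
One simple way to turn raw location reports into an aggregate pattern is to count reports per map grid cell. The sketch below is only an illustration of that idea, not the method used by any particular product; the coordinates are made-up points and the 0.01-degree cell size is an arbitrary assumption.

```python
import math
from collections import Counter


def density_by_cell(pings, cell_size_deg=0.01):
    """Count location pings per grid cell of cell_size_deg degrees."""
    counts = Counter()
    for latitude, longitude in pings:
        cell = (
            math.floor(latitude / cell_size_deg),
            math.floor(longitude / cell_size_deg),
        )
        counts[cell] += 1
    return counts


# Hypothetical anonymous pings: (latitude, longitude)
pings = [
    (37.7749, -122.4194),
    (37.7751, -122.4183),
    (37.7790, -122.4150),
    (37.8044, -122.2712),
]

counts = density_by_cell(pings)
busiest_cell, busiest_count = counts.most_common(1)[0]
print(busiest_cell, busiest_count)
```

Because only counts per cell are kept, the output describes where activity concentrates rather than where any individual is, which is the aggregate view the slide refers to.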

Page 54: DATA Mining PPT

Reality Mining

• Citysense: finding people with similar interests

See www.sensenetworks.com/citysense.php for real-time animation of the content.

[Figure: A map of an area of San Francisco with density designation at places of interest]

Page 55: DATA Mining PPT

Q & A