Final Exam Review

Spring 2011

Exam format

About 75 questions

45% multiple choice and T/F

30% short fill-ins

25% short-paragraph explanations

What to study

50 questions from Exam 1 and 2

12-15 questions about presentation topics

10-13 questions will come from labs

Concepts from:

Market Basket: Data Mining

SCM: RFID (hardware) + XML (concept)

Fund Trading: DBs for Optimization

Pivot Chart: DBs for Discovery & Prediction

Wagemart: DBs for Decision Support

Market Basket Analysis

Support: Probability (P) that an item is in someone’s checkout basket

A,B,E A,B,F A,B A,B,F,G A,D,F

C,D C,D,G E,F,G E,F E,G

P(A) = 5/10 = 50%

P(AB) = 4/10 = 40%

P(C) = 2/10 = 20%

P(CD) = 2/10 = 20%
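The support figures above can be checked by counting baskets. The following is a minimal sketch in Python, not the lab's own tooling; the names `BASKETS` and `support` are made up for illustration.

```python
# The ten checkout baskets listed on the slide.
BASKETS = [
    {"A", "B", "E"}, {"A", "B", "F"}, {"A", "B"}, {"A", "B", "F", "G"}, {"A", "D", "F"},
    {"C", "D"}, {"C", "D", "G"}, {"E", "F", "G"}, {"E", "F"}, {"E", "G"},
]


def support(items, baskets=BASKETS):
    """Fraction of baskets that contain every item in `items`."""
    wanted = set(items)
    return sum(1 for basket in baskets if wanted <= basket) / len(baskets)


print(support("A"))   # 0.5
print(support("AB"))  # 0.4
print(support("C"))   # 0.2
print(support("CD"))  # 0.2
```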

Market Basket Analysis

Confidence X → Y = P(XY)/P(X): If item X is purchased, what is the probability that item Y is also purchased?

Confidence A → B = P(AB)/P(A) = 40%/50% = 80%

Confidence C → D = P(CD)/P(C) = 20%/20% = 100%

Given: P(A) = 5/10 = 50%

P(AB) = 4/10 = 40%

P(C) = 2/10 = 20%

P(CD) = 2/10 = 20%
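The confidence arithmetic can be reproduced directly from the "Given" values. This is a small sketch that keeps the probabilities as exact fractions; the dictionary `P` and the function name `confidence` are illustrative assumptions.

```python
from fractions import Fraction

# Supports listed under "Given" on the slide, kept as exact fractions.
P = {
    "A": Fraction(5, 10),
    "AB": Fraction(4, 10),
    "C": Fraction(2, 10),
    "CD": Fraction(2, 10),
}


def confidence(x, y):
    """Confidence X -> Y = P(XY) / P(X)."""
    return P[x + y] / P[x]


print(confidence("A", "B"))  # 4/5
print(confidence("C", "D"))  # 1
```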

Market Basket Analysis

Quality X → Y = Confidence X → Y * P(XY)

High quality association rules

Quality A → B = 80% * 40% = 32%

Quality C → D = 100% * 20% = 20%
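To see how the quality measure ranks rules, the sketch below scores every one-item → one-item rule over the ten example baskets and prints the top few. It is a brute-force illustration in Python (fine for seven items, not for millions); the function names are assumptions, not part of the lab.

```python
from itertools import permutations

# The ten checkout baskets from the support slide.
BASKETS = [
    {"A", "B", "E"}, {"A", "B", "F"}, {"A", "B"}, {"A", "B", "F", "G"}, {"A", "D", "F"},
    {"C", "D"}, {"C", "D", "G"}, {"E", "F", "G"}, {"E", "F"}, {"E", "G"},
]


def count(items):
    """Number of baskets that contain every item in `items`."""
    return sum(1 for basket in BASKETS if set(items) <= basket)


def quality(x, y):
    """Quality X -> Y = Confidence X -> Y * P(XY)."""
    both = count(x + y)
    if both == 0:
        return 0.0
    confidence = both / count(x)
    return confidence * both / len(BASKETS)


# Score every ordered pair of distinct items and list the best rules first.
items = sorted(set().union(*BASKETS))
rules = sorted(permutations(items, 2), key=lambda rule: quality(*rule), reverse=True)
for x, y in rules[:3]:
    print(f"{x} -> {y}: {quality(x, y):.0%}")
```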

Apriori Algorithm: Calculate high quality association rules given

billions of transactions

millions of items

Complex Association Rule

ADGMS → CLPT (50% quality, 80% confidence), i.e., 5 items (A, D, G, M, and S) imply with great confidence that 4 items (C, L, P, and T) are purchased.

Without the Apriori Algorithm, the calculation would take too long (millions of years).

Apriori Algorithm

How it works:

By setting a minimum support level, the algorithm can prune low-support pairs (2-itemsets) before computing 3-itemsets.

Then the surviving 3-itemsets are used to compute 4-itemsets. The algorithm is guaranteed to return all the itemsets above the minimum support level.

When you get to 5-, 6-, or 7-itemsets, the pruning reduces the number of possible sets from trillions to a few thousand or hundred, which can help humans discover very complex, high quality association rules.
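The level-by-level pruning described above can be sketched in a few lines of Python. This is a simplified, unoptimized illustration run on the ten example baskets, not a production implementation; the function name `apriori` and the 20% threshold are assumptions chosen for the demo.

```python
from itertools import combinations

# The ten checkout baskets from the support slide.
BASKETS = [
    {"A", "B", "E"}, {"A", "B", "F"}, {"A", "B"}, {"A", "B", "F", "G"}, {"A", "D", "F"},
    {"C", "D"}, {"C", "D", "G"}, {"E", "F", "G"}, {"E", "F"}, {"E", "G"},
]


def apriori(baskets, min_support):
    """Return {itemset: support} for every itemset at or above min_support."""
    total = len(baskets)

    def support(itemset):
        return sum(1 for basket in baskets if itemset <= basket) / total

    # Level 1: single items that meet the minimum support.
    items = sorted(set().union(*baskets))
    level = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}

    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        k += 1
        # Join: merge surviving (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: drop a candidate if any (k-1)-subset was already eliminated,
        # then keep only candidates that meet the minimum support themselves.
        level = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return frequent


result = apriori(BASKETS, min_support=0.2)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), f"{s:.0%}")
```

The pruning step is what keeps the candidate count small: a 3-itemset is only counted when all three of its pairs survived the previous level.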

Importance of Apriori Algorithm

A process

An innovation that takes terabytes of data and reduces it to meaningful rules

Raw Data → Relevant and Timely Information

A.I. Data Mining

Market Basket Analysis

Pivot Chart Lab

Great example of Online Analytical Processing (OLAP)

Slice & Dice data (Temp., Mood, Day, Weather)

Drill Down (look at only incorrect predictions)

Unlike Data Mining, the process is interactive: a person participates in the process

The process is ad hoc

The process is not pre-determined like the Apriori Algorithm

Significance of Pivot Chart Lab

Business Intelligence (like A.I.)

Use OLAP to find patterns

Encode patterns as IF statements to predict future cases.

The spreadsheet can automate the human decision making process on a large scale, faster than a human.

Such a system enables timely, accurate predictions without a human decision-maker (Business Intelligence System)
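The find-a-pattern, encode-it, check-the-misses loop might look like the Python sketch below. The records, the `outcome` column, and the rule itself are entirely hypothetical; only the field names (Temp., Mood, Day, Weather) come from the lab, and the lab itself used an Excel Pivot Chart rather than code.

```python
from collections import Counter

# Hypothetical records: the values and the "outcome" column are invented for illustration.
RECORDS = [
    {"temp": "hot", "mood": "good", "day": "Sat", "weather": "sunny", "outcome": "yes"},
    {"temp": "hot", "mood": "bad", "day": "Mon", "weather": "sunny", "outcome": "yes"},
    {"temp": "cold", "mood": "good", "day": "Sat", "weather": "rainy", "outcome": "no"},
    {"temp": "cold", "mood": "bad", "day": "Tue", "weather": "rainy", "outcome": "no"},
    {"temp": "hot", "mood": "good", "day": "Sun", "weather": "rainy", "outcome": "yes"},
]


def pivot(records, row, col):
    """Slice and dice: count records for each (row value, column value) pair."""
    return Counter((r[row], r[col]) for r in records)


def predict(record):
    """A pattern spotted in the pivot, encoded as an IF statement."""
    if record["weather"] == "sunny":
        return "yes"
    return "no"


def drill_down(records):
    """Keep only the records that the rule predicts incorrectly."""
    return [r for r in records if predict(r) != r["outcome"]]


print(pivot(RECORDS, "weather", "outcome"))
print(drill_down(RECORDS))
```

Looking at the misses from `drill_down` is how a person would decide which extra condition to add to the rule next.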

Excel Pivot Charts as a tool

First: A pattern is noticed

Second: Interactive analysis tools (Pivot Chart) help to confirm and pinpoint the pattern

Example: A marketer thinks that geography plays a role in sales; a Pivot chart shows that Southern stores do have better sales.

Database queries as tools

First: Data mining reveals numerous patterns (association rules)

Second: Human intelligence can derive the theory behind the pattern.

Example: The Apriori algorithm discovers a high quality association rule (Beer → Diapers). Later, marketers try to unravel the reason why.

The data analysis must come before the hypothesis because the data is too big for humans to analyze.

Fund Trading Lab

Decision Support Automation: Using a database to compute the optimal sequence of trades.

Too many combinations for a human to analyze

Another Example of Business intelligence

1. At first we use a graph and human intuition to make the trades

2. We do better if we use a query to calculate and sort all possible transactions

3. We use database tools to pick the best ones that don’t overlap
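Steps 2 and 3 can be sketched outside a database as well. The Python below is a simplified illustration for a single fund with made-up daily values: it lists and sorts every possible buy/sell pair, then uses dynamic programming (a swapped-in technique, not the lab's queries) to pick the non-overlapping trades with the best compounded growth. It assumes you may sell and buy again on the same day.

```python
from itertools import combinations

# Hypothetical daily values for one fund; the lab's real data is not reproduced here.
VALUES = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0]


def all_trades(values):
    """Step 2: every (buy_day, sell_day, gain) with buy before sell, best gain first."""
    trades = [
        (buy, sell, values[sell] / values[buy] - 1)
        for buy, sell in combinations(range(len(values)), 2)
    ]
    return sorted(trades, key=lambda t: t[2], reverse=True)


def best_sequence(values):
    """Step 3: non-overlapping trades that maximize compounded growth."""
    # best[d] = (growth multiple, list of trades) achievable using days 0..d.
    best = [(1.0, [])]
    for sell in range(1, len(values)):
        top = best[sell - 1]  # option: make no trade that ends on this day
        for buy in range(sell):
            growth = best[buy][0] * values[sell] / values[buy]
            if growth > top[0]:
                top = (growth, best[buy][1] + [(buy, sell)])
        best.append(top)
    return best[-1]


print(all_trades(VALUES)[:3])
growth, trades = best_sequence(VALUES)
print(trades, round(growth, 3))
```

On this sample the single best trade (buy day 0, sell day 5) is beaten by three shorter trades that avoid the dips, which is why sorting alone is not enough and the non-overlap step matters.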

Decision Support Systems: Wagemart vs. Fund Trading

Wagemart

start with tons of data: individual salaries, availability

reduce it to simple info: total cost, average rating

to help make a decision.

Fund Trading

start with less data: fund value for each day

compute every possible transaction: much more data

Queries are used to find the optimal transactions

Decision Support Systems: Wagemart vs. Fund Trading

Both systems model scenarios to compute the outcome of decisions

one is structured: one scenario to optimize

the other is unstructured: many different scenarios to consider

Fund Trading was more structured, i.e., you can only buy and sell; you just have to decide the optimal day and funds to buy/sell.

Wagemart was very unstructured: there were many different ways to cut costs.

Porter’s 5-forces

Do companies compete because it’s fun?

Maybe some…

They compete because of the threat of going out of business.

Profitability is the ultimate measure of success

Why?

What are the threats?

A new competitor

Will take away your sales and profits? Because they are better?

In business what does better really mean?

The five forces/threats

New entrants

Substitute products

Rivalry

Bargaining power of consumers

Bargaining power of suppliers

Example

Target forces their supplier to use XML-formatted shipment data and boxes tagged with RFID chips.

Apple refuses and wins

Target has to use Apple’s system to sell Apple’s products.

What force is this?

Example

Indirect: Brooke visits Google Shopping and Shopzilla to compare prices on a new camera. She’ll buy from the least expensive online retailer.

Direct: Bradley uses Lending Tree.com where banks try to underbid each other to get his business.

Example

Disney World implements a new ride tracking system that directs visitors to the rides with the shortest wait times.

Forces Universal Studios to invest in a similar system.

Example

Everyone at the gym is using their iPhone or Android phone to listen to music

MP3 players are now collecting dust

Example

Netflix emerges and puts 120 Blockbuster video stores out of business

Competitive strategies

To fight the forces

1. Do something totally new (innovation)

2. Be inexpensive (cost leadership)

3. Be big to increase power (growth)

Lock in your customers

Lock out your competition

4. Make mutually beneficial partnerships (alliance)

5. Be different but in a good way (differentiation)

Put up barriers to the competition

Example

Imagine if Blockbuster had decided to use Internet/mail delivery before Netflix.

But Blockbuster was NOT ______________

By the way, Netflix created a totally new process for renting videos.

How does an IS make this possible?

How is the IS better than the old-fashioned process?

E-commerce

It was an innovation at one point

Now it is necessary to stay in business

Example

Walmart’s efficient supply chain cuts cost.

RFID and XML play a role

Their size allows them to negotiate low prices with suppliers.

Large companies absolutely need information systems for good management

Walmart’s strategy is 2-fold.

How do Information Systems really help businesses to compete?

The labs provide many examples

RFID, XML: More accessible, timely information for improving the supply chain.

Market Basket: More relevant information for increasing sales/profits

How do Information Systems really help businesses to compete?

The labs provide many examples

Wagemart: More accurate information for modeling decisions

Pivot Chart & Fund Trading: Flexible information, manipulated in real time to solve problems (prediction & optimization)

The 11 information attributes are fair game

Flexibility and accessibility are different.

Putting something on the web makes it more accessible

Storing data electronically can make it more flexible

Putting electronic data in a robust, standardized format (XML) improves both.

Attribute Trade-offs

Simple vs. Complete

Secure vs. Accessible

Presentations

Don’t forget to review presentations

The websites will be linked on Tuesday

Textbook Reading

Low priority

Top Priority

Review past exams and look up correct answers (Text and Google)

Will post them on Tuesday

Skim lab materials and instructions on Blackboard

Create cheat sheet
