
MIS 6309 BUSINESS DATA WAREHOUSING

INSTRUCTOR: KEVIN R. CROOK

DATA WAREHOUSE DESIGN PROJECT

BookYourTicket.com

PRESENTED BY

PRADEEP YAMALA

pxy160860


BookYourShow.com Page 1 of 21

SUMMARY

BookYourShow.com is an online ticket booking website in the media and entertainment sector that offers showtimes, movie tickets, reviews, trailers, concert tickets, and listings for nearby events. It also features promotional offers and coupons. The website targets internet users who transact online, for example to buy tickets. It is also positioned as India’s largest entertainment ticketing website.

It all started in 1999, when three long-time friends went on holiday together in South Africa and the seed of a Big Tree was planted. A company was planned, from roots to fruits. Soon after the Eureka moment, C.E.O. Amar quit his job at Sony, Co-Founder Akbar took over Technology, and Co-Founder Anthony took over Finance.

With the cinema industry on a high, and multiplexes and large cinema chains starting to entertain Indian audiences around the country, Bigtree took over the rights to retail and service Vista, a New Zealand-based ticketing software, in India. During the dot-com bust of 2002, the company weathered the storm by offering technology solutions from a Customer Relationship Management point of view. The business flourished under the leadership of our three musketeers.

Network 18 invested in March 2007. In August of the same year, an internal contest was held to coin a name for the new company. A developer intern came up with the name BookYourShow.com and the rest, as they say, is history. Bigtree Entertainment Pvt. Ltd. launched India's first ticketing aggregator, BookYourShow, in August 2007; it is now one of the biggest ticketing portals in the country.


COMPETITIVE ANALYSIS OF THE COMPANY

Within a decade of its inception, BookYourShow posted a 40% compound annual growth rate (CAGR) in revenues and held over 90% market share in the online entertainment ticketing space. BookYourShow became the official ticketing partner for the Mumbai Indians, Kings XI Punjab, and Delhi Daredevils. Pune Warriors and Rajasthan Royals have since come on board this high-drama entertainment circus called the IPL. BookYourShow also became the exclusive ticketing partner for the Formula 1 Indian Grand Prix.

Records are meant to be broken. As it stands today, the highest number of tickets sold in a single month is 5,696,685, set in October 2014. BookYourShow was awarded 'The Hottest Company of the Year 2011-12' and 'The Company to watch out for' at the prestigious CNBC Young Turks Awards. The BookYourShow app has around 7.2 million downloads across Windows, Android, iOS, and BlackBerry.

Accel Partners invested USD 18 million (Rs. 100 crores) in BookYourShow. Bigtree Entertainment acquired the Chennai-based online ticketing company Ticket Green and the Bengaluru-based social media analytics firm Eventifier. BookYourShow was awarded ‘Best Omnichannel Customer Experience Brand’ at the OneDirect Quest Customer Experience (QuestCX) Awards.

ScaleArc announced that Bigtree Entertainment Pvt. Ltd. has selected ScaleArc for SQL Server to ensure availability and performance for its online entertainment ticketing business, BookYourShow. BookYourShow is the largest entertainment ticketing portal in India, with more than 400 million average page views a month (website and mobile application). To handle the onslaught of traffic and prevent downtime during the release of blockbuster movies, BookYourShow deployed ScaleArc to ensure seamless availability.

With India’s undying love for films, it is not surprising that film ticketing comprises almost half of BookYourShow’s business. Ticketing for sporting and other events contributes the next biggest share of the revenue pie, while the rest comes from advertising. Content, along with the ad platform, is aimed at making advertising a significant revenue source; BookYourShow expects it to contribute 10 per cent of overall revenue.

BookYourShow recently rolled out a redesigned version of its Android app that is more interactive and intuitive and improves the user experience. The booking flow has been reduced from about 14 steps to 7. In addition to the default English language option, users can also discover entertainment options on BookYourShow in Tamil, Telugu, Hindi, and Kannada.

SWOT Analysis

Strengths

• Vast network of event organizers and major cinema chains
• Simple and convenient to use
• Continuous innovation, such as a ticket booking application for BlackBerry mobile
• Constantly updated with forthcoming events and movies
• Around 90% market share in the online entertainment ticketing space

Weaknesses

• Mostly limited to urban areas, as people in India are still apprehensive about online payments

Opportunities

• Expand capabilities to cover more events and movies across various cities
• Acquire more partnerships with various business entities
• Co-organize events with various event organizers to increase physical brand presence

Threats

• Possibility of mismanagement due to lack of coordination with event organizers
• Improved functionalities offered by competing online ticketing portals
• Newly emerging competing online ticketing portals

The major competitors are Ticketfinder.com, paytm.com, and Tickets.com.


DATA WAREHOUSE ARCHITECTURE

A data warehouse is a repository (collection of resources that can be accessed to retrieve

information) of an organization’s electronically stored data, designed to facilitate reporting and

analysis.

An essential step in building a data warehouse is choosing a data warehousing architecture: Inmon, Kimball, or Standalone Data Mart. In practice, this is largely a choice between a top-down and a bottom-up design methodology. Kimball is the best fit for BookYourTicket.com.

Kimball is a proponent of an approach to data warehouse design described as bottom-up, in which dimensional data marts are first created to provide reporting and analytical capabilities for specific business areas such as “Sales” or “Production”. These data marts are eventually integrated to create a data warehouse using a bus architecture, which consists of conformed dimensions shared between all the data marts. The data warehouse therefore ends up segmented into several logically self-contained and consistent data marts, rather than one big and complex centralized model. Business value can be returned as soon as the first data marts are created, and the method lends itself well to an exploratory and iterative approach to building data warehouses, so no master plan is required upfront. Kimball’s method focuses on optimization and quick wins, two key aspects that BookYourShow.com needs to rapidly expand its customer base.
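As a rough illustration of the bus architecture, the sketch below writes a small bus matrix in Python and lists the conformed dimensions, meaning those referenced by more than one data mart. The mart and dimension names come from the fact table descriptions in this report; the helper name `conformed_dimensions` is hypothetical and only serves the example.

```python
from collections import defaultdict

# Data marts (fact tables) and the dimensions each one references,
# as listed in the fact table descriptions of this report.
BUS_MATRIX = {
    "movie_show_facts": {"movie", "theater", "special_screening", "show", "movie_show_junk"},
    "customer_booking_facts": {"admin", "discount", "ticket", "customer", "booking",
                               "show", "theater", "movie", "events"},
    "reviews_facts": {"feedback_reviews", "movie", "customer"},
    "customer_refund_facts": {"customer", "booking", "refund"},
}


def conformed_dimensions(bus_matrix):
    """Return {dimension: sorted list of marts} for dimensions used by 2+ marts."""
    usage = defaultdict(list)
    for mart, dimensions in bus_matrix.items():
        for dimension in dimensions:
            usage[dimension].append(mart)
    return {dim: sorted(marts) for dim, marts in usage.items() if len(marts) > 1}


CONFORMED = conformed_dimensions(BUS_MATRIX)
for dim in sorted(CONFORMED):
    print(f"{dim}: {', '.join(CONFORMED[dim])}")
```

Dimensions that appear in this result are the ones that have to be defined once and reused unchanged across marts for the bottom-up integration to work.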


Inmon is one of the leading proponents of the top-down approach to data warehouse design, in which the data warehouse is designed using a normalized enterprise data model and is defined as a centralized repository for the entire enterprise. Dimensional data marts containing the data needed for specific business processes or departments are created from the enterprise data warehouse only after the complete data warehouse has been created.

Kimball is preferred because business users can see results quickly, at the risk of creating duplicate data or having to redo part of a design because there was no master plan. With Inmon, by the time results start to appear, the business source data or the priorities may have changed, so some work may have to be redone anyway.


BUSINESS PROBLEMS SOLVED USING BUSINESS DATA WAREHOUSE

Over the past few years, data warehousing capabilities have evolved tremendously to meet enterprise standards, addressing velocity, variety, and volume.

Querying – A data warehouse needs to be capable of dealing with repetitive queries. Repetitive queries support dashboards and reporting requirements when serving a large number of visitors.

Scale – This applies to multiple data structures and formats. The data warehouse must handle large amounts of data in order to manage query workloads and query optimization.

Real-time loading – Bulk and batch loading remain the most common methods today. More advanced data warehouse technologies are moving to continuous loading, in which data is loaded from operational sources in real time. This makes it possible to ingest streaming data and perform updates optimized for reads.

The data warehouse supports online analytical processing (OLAP), which enables high-level end

users to gain insight into business operations through interactive and iterative access to the

stored data. This enables business executives to improve corporate strategies and operational

decision making by querying the data warehouse to examine business processes, performance,

and trends.

Movie ratings based on reviews and recommendations to users

Movie reviews provided by customers are essential for tracking a movie and deciding whether it should be suggested to other users. Movies with good reviews need to be filtered; on the other hand, bad reviews should also be taken into consideration when calculating average movie ratings.

Customer profiles and past reviews are analyzed to recommend movies to customers. Customer interests are analyzed to suggest other events and special screenings. This provides a better experience for customers and increases how often they visit.

Analyzing website traffic and transactions

At any point in time, website traffic needs to be monitored to analyze performance, interruptions, and delays, along with the number of transactions per day. This allows the load to be measured and, if needed, the load capacity to be increased by adding servers. Doing so minimizes website crashes, transaction failures, delays while loading pages, and similar problems.

The measures include the number of customers who visited the website, the number of transactions per day, and the number of bookings during peak times such as a new movie release or the launch of a new show. Tracking them improves customer experience by reducing transaction failures and delays during payment gateway transactions and page loads.

Analyzing Feedback and reviews

For any website, feedback and reviews play a prominent role in improving the business, providing better solutions, improving processes, and increasing customer satisfaction. These reviews reflect real problems faced by customers, which must be solved to maximize profits and improve the integrated workflow. Feedback may be positive or negative; all of it should be stored and analyzed to improve the process and make it more convenient for users. This includes analyzing customer feedback on issues encountered during booking and providing quick solutions, and calculating the average rating of a movie from all customer ratings.

Data Mining

Segmenting various customers and recommending various suggestions

Customers are segmented into clusters based on their preferences, booking history, and search results. Based on this analysis, selected groups of customers are targeted to maximize profits, and marketing campaigns are targeted to reach more customers, ultimately growing the customer base. Fraud detection for customer credit cards is performed by validating card details.
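The segmentation idea can be sketched with a simple rule-based grouping. This is a stand-in for a real clustering algorithm, not the method used in the project: customers are grouped by activity level and most-booked genre. The sample rows, the genre field, and the threshold of three bookings are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical booking history rows: (customer_id, genre).
BOOKINGS = [
    ("c1", "horror"), ("c1", "horror"), ("c1", "comedy"), ("c1", "horror"),
    ("c2", "comedy"),
    ("c3", "drama"), ("c3", "drama"), ("c3", "drama"),
]

FREQUENT_THRESHOLD = 3  # assumed cut-off, not taken from the report


def segment_customers(bookings, frequent_threshold=FREQUENT_THRESHOLD):
    """Group customers by (activity level, most-booked genre)."""
    genres_by_customer = defaultdict(Counter)
    for customer_id, genre in bookings:
        genres_by_customer[customer_id][genre] += 1

    segments = defaultdict(list)
    for customer_id, genre_counts in genres_by_customer.items():
        total = sum(genre_counts.values())
        activity = "frequent" if total >= frequent_threshold else "occasional"
        # Highest count wins; ties are broken alphabetically so the result is deterministic.
        top_genre = min(genre_counts, key=lambda g: (-genre_counts[g], g))
        segments[(activity, top_genre)].append(customer_id)
    return {key: sorted(ids) for key, ids in segments.items()}


SEGMENTS = segment_customers(BOOKINGS)
```

Each resulting segment, for example frequent horror viewers, could then be matched with a targeted campaign or recommendation list.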

Strategic and official partner

Strategic partnerships are negotiated with various companies to improve business and secure financial support, which leads to becoming the official ticketing partner for various events. Candidate partners are assessed by analyzing the company, its track record, and the events it has conducted successfully.

Enhancing customer experience

Improving the customer’s online experience is a primary concern for any website. Since this is a ticket booking website, users should constantly be shown upcoming events, new movie releases, and recommended movies. These can be selected by analyzing factors such as the number of visits by a user, customer interests, previous bookings, and reviews or feedback written.

Minimizing booking time

Generally, a user prefers to book a ticket in less than five minutes. Avoiding unnecessary steps while booking reduces overall booking time, which saves customers time, avoids delayed transactions, and improves customer satisfaction. This can be achieved by analyzing the fields a customer has to fill in and minimizing the mandatory ones. Measuring the average booking time per ticket during peak and normal hours shows where further measures are needed.

Choosing a payment gateway

All transactions happen through a payment gateway, so choosing an optimal gateway is essential from both business and financial perspectives. When a transaction fails, the user’s money can be blocked even though the booking was unsuccessful.

The average time taken to process a transaction, the number of failed transactions in a month, and similar data can be analyzed when selecting a gateway. The chosen gateway should also perform at its best during peak load, when traffic is highest.
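A minimal sketch of such a gateway comparison, using Python's built-in sqlite3 module, is shown here. The flat `payment_attempts` table, its columns, the gateway names, and the 'SUCCESS'/'FAILED' status codes are assumptions for illustration; in the dimensional model these values would come from the PAYMENT_INFO and BOOKING dimensions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payment_attempts (
    payment_gateway TEXT,
    processing_seconds REAL,
    transaction_status TEXT  -- 'SUCCESS' or 'FAILED' (assumed codes)
);
INSERT INTO payment_attempts VALUES
    ('gateway_a', 2.0, 'SUCCESS'),
    ('gateway_a', 4.0, 'FAILED'),
    ('gateway_b', 1.0, 'SUCCESS'),
    ('gateway_b', 3.0, 'SUCCESS');
""")

# Average processing time, failed count, and total attempts per gateway.
GATEWAY_SUMMARY_SQL = """
SELECT payment_gateway,
       AVG(processing_seconds) AS avg_seconds,
       SUM(CASE WHEN transaction_status = 'FAILED' THEN 1 ELSE 0 END) AS failed_count,
       COUNT(*) AS attempts
FROM payment_attempts
GROUP BY payment_gateway
ORDER BY payment_gateway
"""

summary = conn.execute(GATEWAY_SUMMARY_SQL).fetchall()
for gateway, avg_seconds, failed_count, attempts in summary:
    print(gateway, avg_seconds, failed_count, attempts)
```

Restricting the same query to peak hours would show how each gateway behaves under load.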

Official partners for events

Choosing the best events also maximizes commissions, profits, and brand value. The company conducting the event should be trustworthy, and there should be no last-minute issues regarding tickets or price changes. Some events may be canceled after tickets are booked, or the event date may be postponed; these issues are unavoidable. A proper agreement, with its terms and clauses defined, should be made before becoming an official partner. Generally, the best event is chosen by analyzing the company, its previous events, and user reviews of those events.

Average bookings for a particular period

Maintaining good average bookings per day, or over a given time period, maximizes profits and helps withstand competition. This can be achieved by analyzing average bookings on special occasions and holidays, and by planning well ahead with discounted ticket prices, special offers, and offers on multiple ticket bookings.

Cancellation and refund assessment

For any company or website that handles money transactions, cancellations and refunds do not contribute to profit. In fact, they reduce profits and cost customer goodwill and loyalty. The reasons for cancellations are therefore analyzed, such as failed transactions, events canceled at the last minute, and postponed movie release dates. These problems can be minimized by resolving the underlying errors or issues with the respective department. The average number of cancellations and the average amount refunded during a particular period are also analyzed; measuring these can help maximize profits and retain customer satisfaction and goodwill.


BUSINESS ANALYTICS QUESTIONS

1. The total commission received per ticket for an event.

2. Total processing charges for all the tickets

3. Top rated movies during a particular year

4. Number of failed transactions at the payment gateway

5. Average age of users who frequently watch horror movies


DIMENSIONAL MODEL


FACT TABLES DESCRIPTION

movie_show_facts
High-level description: Child table of the dimensions movie, theater, special_screening, show, and movie_show_junk. It quantifies facts such as special_charges, screen_capacity, and seats_occupied.
Grain: Coarse-grained
Additive facts: special_charges
Non-additive facts: screen_capacity, seats_occupied

customer_booking_facts
High-level description: Child table of the dimensions admin, discount, ticket, customer, booking, show, theater, movie, and events. It quantifies facts such as total_price, discount_price, ticket_price, processing_charge, and tickets_per_booking.
Grain: Fine-grained
Additive facts: All facts
Non-additive facts: None

reviews_facts
High-level description: Child table of the dimensions feedback_reviews, movie, and customer. It measures movie_rating.
Grain: Coarse-grained
Additive facts: movie_rating
Non-additive facts: None

customer_refund_facts
High-level description: Child table of the dimensions customer, booking, and refund. It measures refund_amount and refund_processing_charges.
Grain: Fine-grained
Additive facts: All facts
Non-additive facts: None
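To make the "child table of various dimensions" idea concrete, here is a minimal sketch of customer_refund_facts as a star schema in SQLite, driven from Python. Table and measure names follow the description of customer_refund_facts; the reduced column lists, the sample rows, and the grain of one row per refund are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (customer_key INTEGER PRIMARY KEY, customer_id TEXT);
CREATE TABLE booking  (booking_key  INTEGER PRIMARY KEY, booking_id  TEXT);
CREATE TABLE refund   (refund_key   INTEGER PRIMARY KEY, refund_id   TEXT);

-- The fact table is a child of its dimensions: a foreign key to each
-- dimension plus the additive measures (assumed grain: one row per refund).
CREATE TABLE customer_refund_facts (
    customer_key INTEGER NOT NULL REFERENCES customer(customer_key),
    booking_key  INTEGER NOT NULL REFERENCES booking(booking_key),
    refund_key   INTEGER NOT NULL REFERENCES refund(refund_key),
    refund_amount REAL NOT NULL,
    refund_processing_charges REAL NOT NULL
);

INSERT INTO customer VALUES (1, 'C-100'), (2, 'C-200');
INSERT INTO booking  VALUES (1, 'B-1'), (2, 'B-2'), (3, 'B-3');
INSERT INTO refund   VALUES (1, 'R-1'), (2, 'R-2'), (3, 'R-3');
INSERT INTO customer_refund_facts VALUES
    (1, 1, 1, 200.0, 10.0),
    (1, 2, 2, 300.0, 15.0),
    (2, 3, 3, 150.0, 5.0);
""")

# Additive facts can be summed across any dimension, here by customer.
totals = conn.execute("""
    SELECT c.customer_id,
           SUM(f.refund_amount) AS total_refunded,
           SUM(f.refund_processing_charges) AS total_charges
    FROM customer_refund_facts AS f
    JOIN customer AS c ON c.customer_key = f.customer_key
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
```

Because both measures are additive, the same SUM works unchanged when grouping by booking or refund attributes instead of customer.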


DIMENSION TABLES AND DESCRIPTION

Movie
Description of dimension: Dimension table with movie_key as the primary key and attributes movie_id, movie_name, release_date, year, month, day, actors.
Dimension type: Affinity
Type of changes: Timestamped
Rich attributes: None
Non-self-evident: Not applicable

Theater
Description of dimension: Dimension table with theater_key as the primary key and attributes theater_id, theater_name, location, website, contact_info.
Dimension type: Affinity
Type of changes: Type 1
Rich attributes: None
Non-self-evident: contact_info is the contact information of the company

Admin
Description of dimension: Dimension table with admin_key as the primary key and attributes admin_id, company_name, password, phone_number, email_id, Address, Address1, city, state.
Dimension type: Affinity
Type of changes: Timestamped
Rich attributes: Address consists of address1, city, and state
Non-self-evident: Not applicable

Discount
Description of dimension: Dimension table with discount_key as the primary key and attributes coupon_code, start_date, end_date.
Dimension type: Affinity
Type of changes: Timestamped
Rich attributes: None
Non-self-evident: Not applicable

Customer
Description of dimension: Dimension table with customer_key as the primary key and attributes customer_id, password, customer_firstName, customer_middleName, customer_lastName, last_movie_booked, email_id, age, profession, phone_number.
Dimension type: Affinity
Type of changes: Timestamped
Rich attributes: Customer_fullName consists of customer_firstName, customer_middleName, and customer_lastName
Non-self-evident: Not applicable

Feedback_reviews
Description of dimension: Dimension table with feedback_reviews_key as the primary key and attributes feedback, comments, queries, date.
Dimension type: Affinity
Type of changes: Timestamped
Rich attributes: None
Non-self-evident: Not applicable

Movie_show_junk
Description of dimension: Junk table with movie_show_junk_key as the primary key and attributes gross_share, distributor, food_menu, isMovieGood, screen_size.
Dimension type: Junk
Type of changes: Type 2
Rich attributes: None
Non-self-evident: Not applicable

Ticket
Description of dimension: Dimension table with ticket_key as the primary key and attributes ticket_id, ticket_number, ticket_type, seat_number, show_date.
Dimension type: Affinity
Type of changes: Type 1
Rich attributes: None
Non-self-evident: Not applicable

Show
Description of dimension: Dimension table with show_key as the primary key and attributes show_id, language, start_time, end_time.
Dimension type: Junk
Type of changes: Type 1
Rich attributes: None
Non-self-evident: Not applicable

Special_screening
Description of dimension: Dimension table with special_screening_key as the primary key and attributes event_id, event_name, release_date.
Dimension type: Affinity
Type of changes: Type 2
Rich attributes: None
Non-self-evident: Not applicable

Booking
Description of dimension: Dimension table with booking_key as the primary key and attributes booking_date, booking_id, day, month, year, booking_time, booking_type, transaction_status_code, transaction_status_description, status_code, status_description.
Dimension type: Affinity
Type of changes: Type 2
Rich attributes: booking_date consists of day, month, and year; transaction_status_code, transaction_status_description, status_code, status_description
Non-self-evident: Not applicable

Refund
Description of dimension: Dimension table with refund_key as the primary key and attributes refund_id, refund_date, refund_status_code, refund_status_description, processing_start_date.
Dimension type: Affinity
Type of changes: Type 2
Rich attributes: processing_start_date, refund_status_code, refund_status_description
Non-self-evident: Not applicable

Credit_card
Description of dimension: Dimension table with credit_card_key as the primary key and attributes card_number, expiry_date, cvv, name_on_card.
Dimension type: Affinity
Type of changes: Type 1
Rich attributes: None
Non-self-evident: Not applicable

Payment_info
Description of dimension: Dimension table with payment_info_key as the primary key and attributes payment_id, payment_date, payment_time, payment_type, payment_method, payment_gateway.
Dimension type: Affinity
Type of changes: Timestamped
Rich attributes: None
Non-self-evident: Not applicable

Events
Description of dimension: Dimension table with events_key as the primary key and attributes event_id, event_date, event_time, location, contact_info, email_id, status, website.
Dimension type: Affinity
Type of changes: Timestamped
Rich attributes: None
Non-self-evident: Not applicable

HIGHLY BROWSABLE DIMENSION

A highly browsable dimension is a dimension table that yields a lot of analytical value from a dimension browse alone. Timestamped dimensions are usually highly browsable.

The Booking table is considered a highly browsable dimension because a lot of information can be obtained by querying this table alone.

The following business problems can be solved from the Booking table alone:

• List of customers booking tickets over the phone
• Number of failed transactions in a day
• Average number of failed transactions in a month
• Number of pending bookings at a given point in time
• Average bookings per day, month, and year
• List of customers booking tickets online
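A dimension browse of this kind can be sketched by querying the Booking dimension on its own, with no fact table involved. The column names follow the Booking dimension; the sample rows, the ISO date format, and the status and booking type values ('FAILED', 'PENDING', 'ONLINE', 'PHONE') are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE booking (
    booking_key INTEGER PRIMARY KEY,
    booking_id TEXT,
    booking_date TEXT,             -- ISO date string (assumed format)
    booking_type TEXT,             -- assumed values: 'ONLINE', 'PHONE'
    transaction_status_code TEXT   -- assumed values: 'SUCCESS', 'FAILED', 'PENDING'
);
INSERT INTO booking VALUES
    (1, 'B-1', '2017-04-01', 'ONLINE', 'SUCCESS'),
    (2, 'B-2', '2017-04-01', 'PHONE',  'FAILED'),
    (3, 'B-3', '2017-04-01', 'ONLINE', 'FAILED'),
    (4, 'B-4', '2017-04-02', 'ONLINE', 'PENDING'),
    (5, 'B-5', '2017-04-02', 'PHONE',  'FAILED');
""")

# Number of failed transactions in a day.
failed_per_day = conn.execute("""
    SELECT booking_date, COUNT(*) AS failed_transactions
    FROM booking
    WHERE transaction_status_code = 'FAILED'
    GROUP BY booking_date
    ORDER BY booking_date
""").fetchall()

# Number of pending bookings at this point in time.
pending_now = conn.execute(
    "SELECT COUNT(*) FROM booking WHERE transaction_status_code = 'PENDING'"
).fetchone()[0]
```

The other items in the list follow the same pattern, filtering on booking_type or grouping by day, month, or year.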


JUNK DIMENSION

Occasionally, there are miscellaneous attributes that don’t fit into tight star schemas. Rather than discarding flag fields and yes/no attributes, place them in a junk dimension. In addition, you can handle comment and open-ended text attributes by creating a text-based junk dimension.

MOVIE_SHOW_JUNK holds details such as gross_share, food_menu, distributor, isMovieGood, and screen_size, along with movie_show_junk_key, which is a surrogate key. It has an identifying relationship with the MOVIE_SHOW_FACTS table. Because these attributes did not fit in any of the other dimensions, they were moved into a junk dimension.
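One common way to populate a junk dimension is to assign a surrogate key to each distinct combination of the miscellaneous attributes. The sketch below shows this for two of the MOVIE_SHOW_JUNK attributes; the function name `build_junk_dimension`, the source rows, and the attribute values are hypothetical.

```python
def build_junk_dimension(source_rows, junk_attributes):
    """Assign one surrogate key per distinct combination of junk attributes.

    Returns (dimension_rows, key_lookup), where key_lookup maps each
    attribute combination to its movie_show_junk_key.
    """
    key_lookup = {}
    dimension_rows = []
    for row in source_rows:
        combo = tuple(row[name] for name in junk_attributes)
        if combo not in key_lookup:
            key_lookup[combo] = len(key_lookup) + 1  # surrogate keys start at 1
            dimension_rows.append(
                {"movie_show_junk_key": key_lookup[combo],
                 **dict(zip(junk_attributes, combo))}
            )
    return dimension_rows, key_lookup


# Hypothetical source rows; attribute names follow MOVIE_SHOW_JUNK.
JUNK_ATTRIBUTES = ("isMovieGood", "screen_size")
SOURCE_ROWS = [
    {"show_id": "S1", "isMovieGood": "Y", "screen_size": "IMAX"},
    {"show_id": "S2", "isMovieGood": "N", "screen_size": "Standard"},
    {"show_id": "S3", "isMovieGood": "Y", "screen_size": "IMAX"},
]

junk_rows, junk_keys = build_junk_dimension(SOURCE_ROWS, JUNK_ATTRIBUTES)
```

Here shows S1 and S3 share one junk row, so the fact table stores a single small key instead of repeating the flags on every row.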

BRIDGE AND THE RELATIONSHIP THAT REQUIRED IT

Since a customer can have many credit cards linked to their account (each customer has at least one credit card), we need to construct a bridge. Hence, a CUSTOMER_CARD_BRIDGE is added between the CUSTOMER and CREDIT_CARD dimension tables in the star schema. This bridge has no attributes other than the keys from the two dimensions.
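The bridge can be sketched in SQLite as a table that holds only the two dimension keys. Table and key names follow the report; the reduced column lists, the composite primary key, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer    (customer_key    INTEGER PRIMARY KEY, customer_id  TEXT);
CREATE TABLE credit_card (credit_card_key INTEGER PRIMARY KEY, name_on_card TEXT);

-- The bridge holds only the keys of the two dimensions it connects.
CREATE TABLE customer_card_bridge (
    customer_key    INTEGER NOT NULL REFERENCES customer(customer_key),
    credit_card_key INTEGER NOT NULL REFERENCES credit_card(credit_card_key),
    PRIMARY KEY (customer_key, credit_card_key)
);

INSERT INTO customer    VALUES (1, 'C-100'), (2, 'C-200');
INSERT INTO credit_card VALUES (10, 'A. Customer'), (11, 'A. Customer'), (12, 'B. Customer');
INSERT INTO customer_card_bridge VALUES (1, 10), (1, 11), (2, 12);
""")

# Count the cards linked to each customer by walking the bridge.
cards_per_customer = conn.execute("""
    SELECT c.customer_id, COUNT(b.credit_card_key) AS card_count
    FROM customer AS c
    JOIN customer_card_bridge AS b ON b.customer_key = c.customer_key
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
```

The composite primary key prevents the same card from being linked to the same customer twice.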


BUSINESS ANALYTICS QUESTIONS AND QUERIES

1. The total commission received per ticket for an event.

Query CUSTOMER_BOOKING_FACTS, the EVENTS dimension, and the ADMIN dimension to get the total commission received per ticket for an event.

2. Total processing charges for all the tickets

Query CUSTOMER_BOOKING_FACTS and the TICKET dimension to get the total processing charges for all the tickets.

3. Top rated movies during a particular year

Query REVIEWS_FACTS, the MOVIE dimension, and the FEEDBACK_REVIEWS dimension to get the top rated movies during a particular year.

4. Number of failed transactions at the payment gateway

Query CUSTOMER_BOOKING_FACTS, the PAYMENT_INFO dimension, and the BOOKING dimension to get the number of failed transactions at the payment gateway.

5. Average age of users who frequently watch horror movies

Query REVIEWS_FACTS, the MOVIE dimension, and the CUSTOMER dimension to get the average age of users who frequently watch horror movies.
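As one possible reading of question 3, the sketch below ranks movies released in a given year by their average rating. It assumes that "year" means the year attribute of the MOVIE dimension and that movie_rating is a numeric score; the reduced tables, sample rows, and movie names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie (movie_key INTEGER PRIMARY KEY, movie_name TEXT, year INTEGER);
CREATE TABLE reviews_facts (
    movie_key INTEGER NOT NULL REFERENCES movie(movie_key),
    customer_key INTEGER NOT NULL,
    movie_rating REAL NOT NULL
);
INSERT INTO movie VALUES (1, 'Movie A', 2016), (2, 'Movie B', 2016), (3, 'Movie C', 2015);
INSERT INTO reviews_facts VALUES
    (1, 101, 4.0), (1, 102, 5.0),
    (2, 101, 3.0), (2, 103, 4.0),
    (3, 102, 5.0);
""")

# Average rating per movie for one year, best first.
TOP_RATED_SQL = """
SELECT m.movie_name, AVG(f.movie_rating) AS avg_rating, COUNT(*) AS review_count
FROM reviews_facts AS f
JOIN movie AS m ON m.movie_key = f.movie_key
WHERE m.year = ?
GROUP BY m.movie_key, m.movie_name
ORDER BY avg_rating DESC, m.movie_name
LIMIT ?
"""

top_2016 = conn.execute(TOP_RATED_SQL, (2016, 10)).fetchall()
```

Returning the review count alongside the average makes it easier to spot movies whose high rating rests on very few reviews.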