View
20
Download
1
Category
Tags:
Preview:
DESCRIPTION
IBM Data Model1
Citation preview
Data Modeling
Introduction Name Experience in Database Expectation from the class
Rajesh Kumar V.S rajesh@aroha.co.in
www.ilearnonline.co.in
Agenda Data Modeling Different types of models Operational Systems
3rd NF (ER Modeling) Enterprise Data Modeling Modeling Lifecycle Modeling stakeholders Validating models Questions…..
www.ilearnonline.co.in
Model Many perspectives of business
How it works can be defined by the process model How the organization is organized can be defined by
the org chart Where the organizations operate can be defined /
explained by location or Geo map To understand the data flow of the application you can
create data flow diagram What information needs to run the business is
defined by Data Model
www.ilearnonline.co.in
Types of Models Subject Area / Contextual
Define main terms and definitions for the high level entities. Not every project will have this, only if the scope is big then
we can have this. Conceptual (Overview)
Important entities which holds the application together Business concepts and rules Identifying the relationship between each of the entities.
Logical Complete detail requirements, resolving M:M relationships
identified in Conceptual Model. Detailing all the attributes needed for every entity Providing relationships in terms of PK and FK.
Physical Model Exploiting the RDBMS to take the advantage of the
model.www.ilearnonline.co.in
Business to Physical ModelingThe Business
Contextual
Conceptual
Logical
Physical
Usual Data ModelingReverse Engineering
Usually 1:5 to 1:7 ratio of entities betweendifferent models.
www.ilearnonline.co.in
About Data Modeling
A data model is the place through which you maintain information about things (entities)
The facts about those things are nothing but rows. It’s a real thing, not a technology It’s stable until business does not change, if business
changes then the data model has to be flexible enough to accommodate that change.
It drives the consistency of dealing with information It should always start simple. Data model is relevant to the business As the business rules and requirements gets complex,
the painful detail comes
www.ilearnonline.co.in
About Data Modeling To be a better data modeler you should have both domain
knowledge and good exposure to data modeling concepts It’s a living thing, so expect changes for sure. Its our
visualization of information and proactive vision of business is key to build model which does not change quite often.
Data model is a non technical description of your business in terms of things it need to know about
By following few techniques a business analyst can build great data models
Collect all the facts about business through knowledge transfer sessions, reading about your business and the process at least 10% of your time.
Involve in discussions with various business line mangers to get overview of what they are doing.
www.ilearnonline.co.in
Information Visualization
Business
Line of Business
Business Process
Activity
Information
DATA
www.ilearnonline.co.in
Information Visualization
Business Line of Business Business Process Activity Information Data
HDFC Credit Card Customer Care Customer verification Verifier, address etc Data
www.ilearnonline.co.in
Information Visualization
Business Line of Business Business Process Activity Information Data
IBM ???? ???? ???? ???? ????
www.ilearnonline.co.in
Zachman Framework
Zachman Framework -- stakeholders
Contextual Data Model
It explains the major processes Focuses on specific problem area (SCOPE) Master entities are defined At this level we never talk about the data what you
are going to store (No details) Excellent starting point for the data modeling Vocabulary, Define the terms.
(Classic example of confusing the terms are vendor / supplier).
www.ilearnonline.co.in
Conceptual / Logical Model Conceptual Model
Entities are identified Not Normalized M:M relationships No keys Can list the important attributes
Logical data model M:M resolved Shows PK, FK and all attributes Its fully Normalized All attributes are atomic
www.ilearnonline.co.in
Generalization of the data model
Customers /markets Products
Services
Any business have to have the following entities through which it can work.What is the product you sell in a Software service company?What is the product you sell in Airlines?
www.ilearnonline.co.in
Create simple data model Create a model to hold all the employee information, career path
individual choose. Each career path has certain set of training programs. Individuals can choose training programs based on the current
capabilities and future needs. Assign to be participants of individuals to up coming training
programs based on the request individual mentioned. You should look for manager recommended candidates for the
training program first then fill the requested resources based on the seats in the training program.
Create the set of training programs and associate the proposed candidates based on the business rules.
System should be able to capture the attendance for the training program.
www.ilearnonline.co.in
Visualizing the functionality
www.ilearnonline.co.in
Attributes – conceptual vs logical
Customer--------------------
ContactsAddressesBank info
Customer_IDCustomer_nm
Effective dtIn_eff_dt
Cust_crdit
Cntc_typ_idShort_dscdescription
Addr_typ_idShort_dscdescription
Cust_idCntc_typ_idContact_nm
Emailphone
Cust_idaddr_typ_idAddress1Address2
CityState
Cust_idAccount_id
Bank_idBank_nmAddress1
City
Conceptual
Logical Model
www.ilearnonline.co.in
How to model (Steps)
Requirement gathering and understanding the business process is the foundation to get the right data model.
Gather the information Questions Answers Analyze and confirm Add to model
Make the above 4 steps as an iterative process before freezing the model.
Apply the business scenarios and see whether the data model accommodates the same.
www.ilearnonline.co.in
Depicting and Documenting
Customer
Order
Product
Order Line
places
containsspecifies
Data ModelDiagram
Data ModelDocument
Order Data Model
IntroductionEntity definitionAttribute definitionAdditional Bus Rules
www.ilearnonline.co.in
About entities
An entity is a person, place, thing, event or any of the interest to the enterprise, about which facts may be recorded
You should name it in a real world term Eventually entity becomes a table in relational database Examples
Employee Region Department Customer
Entities are not supposed to be designed by using the input or output formats (screens and reports). This wont give us enough flexibility in the data model.
www.ilearnonline.co.in
Entity Definition
Answers the question Definition Interesting points Exceptions Examples (demonstrate the data)
List of instances Scenarios
www.ilearnonline.co.in
Entity example Building
A ground based structure which supports the company business operations by housing employees, equipment, or supplies
Examples: office buildings, retail store, plants and warehouse
Can be owned or leased May or many not have street address
Key Points To avoid misunderstandings between entities we have to
do this Its crucial, so do them early
www.ilearnonline.co.in
Entity Types
Kernels / Master The central object in the model
Every thing else either further describes, associates or classifies the kernel entities
Can exists independently Ideally, it’s the starting point for modeling
Associative Relates two other entities Evolves from resolving M:M relationships Important associates shown in conceptual, remaining will be
detailed in the logical model Types
Kind of Kernel entity that classifies or categorizes other entities Example (Attribute Customer Type in customer table , or Account
Type and Account table) Transaction tables
www.ilearnonline.co.in
Attributes
A property of a thing that can be expressed as a piece of information – one of the facts about things that must be maintained
Properties of the entities Example for customer entity, following are the attributes
Cust_name Cust_contact Cust_address Cust_city
Questions to ask Is it a fundamental attribute or could it be derived attribute Business Rules association at the attribute level. Exceptions
www.ilearnonline.co.in
Relationships in the RDBMS
An association between two things (entities) is called a relation
We have three different types of relationships in RDBMS 1:1 (One to One) – rare 1:M (One to Many) – common M:M (Many to Many) – more in conceptual model, none in
Logical model and Physical model Examples
1:1 (Person to PAN ID) 1:M (Customer to Phone) M:M (Doctor and Patient)
M:M relationships
Only RDBMS supports the M:M relationship Only in the conceptual model, we can have this
relationship In the logical model, we have to resolve this
relationship. We resolve M:M by using associate table concept.
doctor Patientdoctor Patient
doc_pat
associate table
www.ilearnonline.co.in
M:M relationship Steps in resolving M:M
Split it into two 1:M relationships Example: Customer and Product has M:M
relationship Always M:M is intersected by some other parameter, in
this case it’s the orders. Example: Customer and Survey.
In the case in survey you will have multiple questions, so the set of questions becomes the parameter.
Example: Course and Branches. In this case a specific course can be conducted in more
than one branch. In one branch you will conduct different courses.
www.ilearnonline.co.in
Special cases Recursion
Empno and mgrno are stored in the same entity Mgrno is also an employee number Usually this goes as self referential integrity constraint We define this as a foreign key which refers the PK of the
same table. A recursive relationship is fully optional.
www.ilearnonline.co.in
customer account phone
Call records
Sample TablesCust_id Cust_name Email id Contact name Cust_since
100 Citi citi@citi.com Bill H 10-JAN-08
101 HSBC hsbc@hsbc.com Tim D 15-APR-10
102 SBI sbi@sbi.com Ram K 12-APR-11
Act_id Cust_id No_of_phones
1234 100 2
1235 101 3
1236 100 2
1237 102 1
Phone_no Act_id
123456776 1234
987654321 1235
456780987 1234
098756782 1234
Call_id Duration Phone_no
56789 14 123456776
56790 15 987654321
--------- --------- ---------
Current system is generating bills based on the customer id. Today, Bank A bought the Bank B, then we should generate the bills bank B tobank A only. How do we change this model.
www.ilearnonline.co.in
Normalization It’s a methodology we follow in order to make sure there is
no redundant data available in the data model. It’s a part of your logical model process with in database
design Advantages:
Reduced space in the db Transaction speed increases
Disadvantage Complex queries i.e. a query which has more number of
table joins will tend to impact the query performance.
Cust_idCust_nameCust_dobCust_phoneCust_emailCust_city
customer
•Assume we have 10000 rows in this table. We have customersfrom 5 different cities. Mumbai has 2000 customers.•One of the city name changed again… (Mumbai -- > ……)•In the real world one activity happen. What statement you will issueto record the same change on our data. How many records you changed?
www.ilearnonline.co.in
Normalization process Its about breaking entities into their most granular form
1st Normal Form (1NF) Every attribute must be atomic Repeating attributes moved to a separate entity
2nd Normal form (2NF)All the attributes should be functionally / partially dependent on
the primary key / concatenated 3rd Normal form (3NF)
All the attributes should be primarily dependent on the primary key
No repeating elements or groups of elements – 1st NF No partial dependencies on a concatenated key -- 2nd NF No dependencies on non-key attributes – 3rd NF
www.ilearnonline.co.in
Table to be normalized
www.ilearnonline.co.in
This excel kind of data is tracking all the transactions happens when a customerrents the vehicle. It has some customer information, vehicle information and therented out and returned back data.
Normalize this table based on your assumptions.
Normalized model
www.ilearnonline.co.in
Tables – to analyzeStore name Address Jan_sales Feb_Sales Mar_sales Apr_Sales May_sales June_sales
ABC Stores 123, MGRoad, Chennai
345609 94040 45958 748490 849938 84949
BBC Stores 124, 5th Cross, Anna Nagar, CHN
849409 440400 9840940 89989 456655 23455
cust_nm Address Email 1 Email 2 Mobile1 mobile2
Ram Kumar 123, Housur Road, BLR
ram@gmail.com ramk@hotmail.com 84940390 484040398
Ramesh K 34,ABC St ramK12@gmail.com
Null 89490490 null
www.ilearnonline.co.in
What should we have in requirements
Agent Management -- Requirements Agent is resource through we which we sell the products to our end customers. With out agent you cannot sell any product we have. Products have different life spans, minimum payment, minimum terms etc.
Product can be sold during its offer only. Company provides the training to agents when a new product has been
launched. Only agents who completed the training can sell that product. When a customer buys a product through an agent, then we create a contract
called as policy. All the policy numbers will uniquely identify a product, a customer, agent.
We get revenue after getting the contract and have to pay the commission to agent. This commission depends on premium then pay.
System should be able to track the premium payment data and intimate to respective agents to do a follow up for the payments which are due.
Need to stored the customer multiple address, agents family, addresses and bank /account information for payment processing.
Organization should be able to generate the expected commission for next month based on the policy premiums we expect.
www.ilearnonline.co.in
Conceptual Model for Agent System
AgentBankfamily
ProductTermsPlans
Comm. plan
Training program
customer policy
Policy transactions
agent commission
www.ilearnonline.co.in
Normalization
1NF, 2NF, 3NF are breaking entities into their most granular form
4NF and 5NF also about granularity, but only the granularity of associative entities with 3 or more parents
A B
AB
A B
ABC
C
No 4NF or 5NF concerns Possible 4NF or 5NF concerns
4NF and 5NFA B
ABC
C
Assume this is wrong
A B
AB
C
4NF
BC
A B
AB
C
5NF
BC CA
Example
Program Worker Role
Assignment
This is OK if…….Any combination is valid
BUT…..Workers have defined rolePrograms require certain rolesWorkers are assigned programs independent of roles
Then we will end up writing lots and lots of coding to make sure to implement thesebusiness rules. If we normalize it some of the rules are automatically taken care of
Modeling ApproachTop Down Good way of getting a model We will be thinking out of the box because we talk about entities,
how two entities relates to each other etc Think about the business in broader sense by using the Business
Analysts / subjects. Most of the companies follow the top down approach and get
validate based on the scenarios.
Bottom Up Based on the output we will create the model Kind of reverse engineering (some times) Easy to normalize, but we may miss out on bigger picture.
www.ilearnonline.co.in
Mutually exclusive relationship
order
customer
Internaldepartment
In this scenario, an order can be placed only by customer or by internal department
So its mutually exclusive (Solid lines exactly one)
The other example is, payment table in telecom billing system can have credit_card,Cash or check. In one payment you will accept either one of these three, so itsmutually exculsive
)
Mutually inclusive relationship
Advertisement
Radio
Televison
In this scenario, an advertisement can be placed in different media
So its mutually inclusive (dotted lines one or more)
)
Internet
Claim Processing system
claims
contract
Contracttype
claimstatus
category
product
city
state
region
Agent
customer
Cust_type
www.aroha.co.in www.ilearnonline.co.in
branches
Claimcategory
Claim transaction
Modeling time and historyPrinciples
Don’t change the stored data, if you do you loose the history
It becomes as is reporting. Add new records when there is a change This way you track the history of changes to that record
and when it got changed. If they can correct the changes, then capture the
correction date also, so that you have all the information to track back what did happen.
Have audit columns in place to capture the time and history.
Logical / Physical Modeling
Logical Data Model Physical Data Model
Business info and rules Model in the database
Entity, Attribute Tables, Columns
Primary Key, Alternate key Primary Key constraint, unique index
Inversion Key Non unique index (for performance)
Rule Check constraint / triggers
Relationship PK /FK
Definition of the entity Comments
www.aroha.co.in www.ilearnonline.co.in
Physical Model Implementation All most same as Logical Model, with little variation.
Taking the advantage of the database what you are dealing with.
Make sure all the PK and FK are numbers. (Try a POC before we say this)
Example: Should I use partitions or materialized views or external tables etc.
Should we use columnar Database (Vertica, Opensource db) for certain data marts.
Designing rolling window based on partitions to improve the performance.
Creating views to simulate the scenario rather than increasing the database size (especially small tables)
Checklist for Physical Model Logical model as the input. Factors of the DB on which we deploy the tables. (Oracle,
SQL Server, DB2, MySQL etc)
Data Velocity factors (Frequency of activities in the business) Initial Load Data & Incremental load planning Data Volume calculation (Capacity Planning) Identifying importance of tables in terms of joins and
table data volume Deciding partitions, indexes, views, materialized views Implementing back up recovery mechanism for the
system.
Channel Partner Payment system hp sells the products through channel partners. The scope of the project is related to only sales. Through presales system hp generates the quotes and provide the
same to channel partners, channel partners sells the products to end customers based on the quote (This is outside of this system)
Certain channel partners can sell only certain products of hp. End customer who buy hp products through channel partner can be
an individual or a company (don’t want to consider end customers as part of the scope).
Based on the sales made by the channel partner, hp have to raise the invoices to channel partner.
System should have the ability to store the grade of the channel partner.
hp provides the commission to channel partners based on the sales they made automatically once in three months.
System should have the ability to store the addresses, different kinds of contacts and bank information of channel partners.
Channel partner payment to our invoices and the payment to channel partner commission are tracked in this system only
De-normalization Is a process where we increase the query
performance Especially used for reporting, not used to
increase the transaction processing To get best of both worlds, is to create the
normalized model for faster transaction processing and take the advantage of oracle’s materialized views concept to get your reports run faster
We are making an another copy of the data, but system takes care of it. This way we don’t implement new bugs.
Insurance Business Company sells insurance to various customers through agents We sell different kind of insurance policies like Risk, kids education,
endowment, pension etc Nominations must exists in each one of the policy. Max of 2
nominations can exists of every policy. We can surrender the policy if we paid the minimum number of
premiums. This number varies between policies Customers can get loan against the policy they have. Agent commission should be processed by the system. Based on the policy as well as the premium paid, the commission to
agent differs. When we surrender the policy, the charges are applicable. When we pay the premium late the interest should be added to the
payment.
Build a conceptual model and logical model for the above mentioned business.
Telephone Billing Customer comes and buys the telephone Customer can have multiple phones. Customers can be either corporate or individuals Billing flexibility should be available based on certain phone
numbers so that I can send the consolidated bill to different groups with in the corporation
Billing cycle can be decided by customer Must maintain different addresses of the customer Always we have to mail the bill to Billing Address, should have
flexibility to send it to an electronic address also. Customer can subscribe for multiple service like wireless,
internet…. Customer can select a specific plan (rate)
Retail Case Study
Assume, we want to create a data model which takes care of a retail chain.
We want to create the model which stores the supplier information, order management through which we place orders to all the suppliers.
We have one warehouse from which point, we distribute to various stores in the city and store the point of sale.
Customer can return the goods with in 7 days. Some of the products cannot be taken back.
One employee can work multiple stores.
Star Schemas for General Insurance
Training program
AGENT
product time
Trgfact
Enrollfact
customer
Premium fact
Exp revfact
Comm. fact
Paymenttype
policy
channel
claims fact
Coveragetype
location
workflowfact
policystatus
Org_cat_idDesc
Org_idOrg_cat_id
DescManager_id
Emp_idEname
JoindateSal
CommOrg_id
Cust_idName
Emp_idAddressPhoneEmailState
Prod_idName
Start_dateEnd_date
Cust_idPolicy_idProd_idStart_dtEnd_dtnotes
Cust_idPolicy_idTrans_id
DateTran_type_id
amt
Tran_type_iddesc
organization
Org_cat
product
Employees
CustomersTransaction_type
Transactions
Cust Policies
Insurance Model - Normalized model
Dpet_idDept_name
Org_id
department
Snow flake - Insurance model
Org_idNameCat_id
Manager_nm
Emp_idEname
Join dateSal
Dept_id
Cust_idName
AddressPhone
State_id
Product_idName
Start_dateEnd_date
Cal_date_idDate
Week_id
Org_idEmp_idCust_id
Product_idPolicy_id
Cal_date_idPayment_amtClaims_amt
org_dim
emp_dim
txn_fact
Cal_dim
cust_dim
policy_dim
Cat_idCat_desc
Cat_lookup
Dept_idDept_name
dept_lookup
Week_idCal_weekMonth_id
Week_lookup
State_idState_desc
State_lookup
Month_idCal_monthQuarter_id
month_lookupQuarter_idQuarterYear_id
Quarter_lookup
Year_idyear
Year_lookup
Normalized Model (Electronic AtoZ Store)
Call Center
In store POS
Online Sales
Thank you
Questions
Recommended