Database Modeling presentation


Database Modeling for radio cab services.

Outline

Database Modeling Introduction
Steps involved in Data Modeling
Logical Model of cab service
E-R diagrams
Example queries

About the Presentation

The focus is on data modeling and related techniques such as normalization. The whole process of data modeling is demonstrated using the example of a radio cab service.

We discuss how we arrived at the cab database in three steps.

Step 1: Conceptual Modeling

The conceptual data model is a (relatively) technology-independent specification of the data to be held in the database.

It is the focus of communication between the data modeler and business stakeholders, and it is usually presented as a diagram with supporting documentation.

Conceptual Modeling

■ Modeler: What is your business model?
■ Business Specialist: We are aggregators, i.e., we aren’t direct service providers.
■ Modeler: OK, so we need not be concerned about the maintenance of cabs. What information about the drivers would you like to keep?
■ Business Specialist: Security issues have been popping up. We would need to verify each driver based on past history, validity of registration, car condition, etc.
■ Modeler: So maybe we could have a separate table containing the details of the drivers and call it “Driver Table”. By the way, would you like to store information about the clientele?
■ Business Specialist: Definitely. We would like to have a detailed record of them so that we can use that data for advertising and business analysis.
■ Modeler: Sure indeed! Let us develop a basic design for you and we shall discuss further.

E-R Diagrams as part of Conceptual Modeling

Based on the communication with the business specialist, various entities and relationships are developed.

Data representation:
Relational notation
Chen notation
Oracle Designer’s notation

E-R Diagram Conventions

We have tried to follow conventions while drawing the E-R diagrams, such as how to name entities, attributes, and relationships. These are fundamental to data modeling.

Conventions followed for:
Entity names
Attribute names
Relationships

Conventions: Why Correct Naming

If the data modeler does not name things correctly, it can confuse the people using the database and result in wrong entries.

For example, one should not abbreviate entity names unnecessarily. In a student administration system, if two entities are named M-Parent and F-Parent, a database user could read M-Parent as either Male Parent or Mother Parent. Ambiguities like this lead to wrong inputs to the database.

Conventions

Other conventions that need to be followed are:

The name of an entity class must be singular and refer to a single instance, i.e., a row, not the whole table.

One should never name an entity class after its most “important” attribute. When other attributes are added later, the entity name no longer makes much sense.

One should never name one entity class by adding a prefix to the name of another. Example: External Employee when there is already an Employee entity class. This creates the unwanted assumption that all employees in the Employee class are internal employees.

E-R Diagrams

Step 2: Logical Modeling Stage

The logical data model is a translation of the conceptual model into structures that can be implemented using a database management system (DBMS).

This model specifies tables and columns. These are the basic building blocks of relational databases.

Logical Modeling Stage

List of tables and attributes at the beginning:

Driver (Driver_ID, Name, Insurance_No, Verification_Date, Licence_No, Phone_No, Address)

Booking (Booking_ID, Booking_Date, From_Location, To_Location, Miles, Cab_Type, Driver_ID)

Client (First_Name, Last_Name, Booking_ID, Customer_ID, Phone_No, Email_ID)

Cab (Cab_Type, Cab_Model, DriverID, Cab_Licence_No)

Billing (Booking_ID, Billing_Date, Amount, Customer_ID)

Compliance (Driver_ID, Insurance_No, Licence_No, Year, Tickets_Issued)
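As a rough illustration of how such a table list might become DDL, here is a minimal sketch using Python's built-in sqlite3 module. The column types, the primary keys, and the foreign key from Booking to Driver are assumptions made for the example; the slides list only table and column names.

```python
import sqlite3

# Assumed types and keys: the slides give only the column names.
SCHEMA = """
CREATE TABLE Driver (
    Driver_ID         INTEGER PRIMARY KEY,
    Name              TEXT NOT NULL,
    Insurance_No      TEXT,
    Verification_Date TEXT,
    Licence_No        TEXT,
    Phone_No          TEXT,
    Address           TEXT
);
CREATE TABLE Booking (
    Booking_ID    INTEGER PRIMARY KEY,
    Booking_Date  TEXT,
    From_Location TEXT,
    To_Location   TEXT,
    Miles         REAL,
    Cab_Type      TEXT,
    Driver_ID     INTEGER REFERENCES Driver(Driver_ID)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(SCHEMA)

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

conn.execute("INSERT INTO Driver (Driver_ID, Name) VALUES (1, 'Sample Driver')")
conn.execute("INSERT INTO Booking (Booking_ID, Driver_ID) VALUES (10, 1)")

# A booking that points at a driver who does not exist should be rejected.
try:
    conn.execute("INSERT INTO Booking (Booking_ID, Driver_ID) VALUES (11, 99)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```

The remaining tables (Client, Cab, Billing, Compliance) would follow the same pattern.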

Normalization of Tables

Why normalization:
Completeness
Nonredundancy
Flexibility of extending repeating groups
Ease of data reuse
Programming simplicity

Note: Denormalization may be needed for performance.

Normalization of Driver Table

Separate table for Address

Steps in Normalization

Steps in Normalization

The table is still subject to update anomalies because it contains redundant records.

Steps in Normalization: Driver Table in 3NF
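The update anomaly caused by redundant records can be illustrated with a small sketch. The tables booking_flat, driver, and booking below, along with their columns and sample rows, are hypothetical and are not the tables from the original slides. The sketch only contrasts a table that repeats driver details with a design that stores them once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical unnormalized table: driver details repeat on every booking.
CREATE TABLE booking_flat (
    booking_id   INTEGER PRIMARY KEY,
    driver_id    INTEGER,
    driver_name  TEXT,
    driver_phone TEXT
);
INSERT INTO booking_flat VALUES (1, 7, 'Asha', '555-0100');
INSERT INTO booking_flat VALUES (2, 7, 'Asha', '555-0100');

-- Normalized alternative: each fact about a driver is stored once.
CREATE TABLE driver (
    driver_id INTEGER PRIMARY KEY,
    name      TEXT,
    phone_no  TEXT
);
CREATE TABLE booking (
    booking_id INTEGER PRIMARY KEY,
    driver_id  INTEGER REFERENCES driver(driver_id)
);
INSERT INTO driver VALUES (7, 'Asha', '555-0100');
INSERT INTO booking VALUES (1, 7);
INSERT INTO booking VALUES (2, 7);
""")

# Update anomaly: changing the phone on one redundant row leaves the other stale.
conn.execute("UPDATE booking_flat SET driver_phone = '555-0199' WHERE booking_id = 1")
flat_phones = conn.execute(
    "SELECT COUNT(DISTINCT driver_phone) FROM booking_flat WHERE driver_id = 7"
).fetchone()[0]

# In the normalized design, one single-row update is seen by every booking.
conn.execute("UPDATE driver SET phone_no = '555-0199' WHERE driver_id = 7")
normalized_phones = conn.execute(
    """SELECT COUNT(DISTINCT d.phone_no)
       FROM booking b JOIN driver d ON d.driver_id = b.driver_id
       WHERE d.driver_id = 7"""
).fetchone()[0]
```

After the updates, the flat table holds two different phone numbers for the same driver, while the normalized tables hold one.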

Efficient Querying

Once the database is in place, it is very important that we query it in the most efficient way.

This can be done by improving the performance of SQL statements, which is called SQL tuning.

SQL Tuning on Cab Database

Use a WHERE Clause to Filter Rows

Many novices retrieve all the rows from a table when they only want one row (or a few rows). This is very wasteful. A better approach is to add a WHERE clause to the query so that only the rows actually needed are retrieved.

SELECT * FROM driver;

BAD (retrieves all rows from the driver table)

SELECT * FROM driver WHERE Driver_ID IN (111012123, 092013456);

GOOD (uses a WHERE clause to limit the rows retrieved)

Use Table Joins Rather than Multiple Queries

SELECT name, driver_id FROM driver WHERE phone_no = 4026610;

SELECT first_line FROM address WHERE driver_id = 111012123;

BAD (two separate queries when one would work)

SELECT d.name, a.first_line FROM driver d, address a WHERE d.driver_id = a.driver_id AND d.driver_id = 111012123;

GOOD (join used instead of two separate queries)
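The filtering and join tips can be tried out with a minimal sketch on an in-memory SQLite database. The column types and the sample rows are assumptions made for illustration; only the table and column names come from the queries in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE driver (driver_id INTEGER PRIMARY KEY, name TEXT, phone_no INTEGER);
CREATE TABLE address (driver_id INTEGER REFERENCES driver(driver_id), first_line TEXT);
INSERT INTO driver VALUES (111012123, 'Asha', 4026610);
INSERT INTO driver VALUES (111012124, 'Ravi', 4026611);
INSERT INTO address VALUES (111012123, '12 Lake Road');
INSERT INTO address VALUES (111012124, '9 Hill Street');
""")

# Without a WHERE clause every row comes back; with one, only the matching row.
all_rows = conn.execute("SELECT * FROM driver").fetchall()
filtered = conn.execute(
    "SELECT * FROM driver WHERE driver_id IN (?)", (111012123,)
).fetchall()

# Two round trips: look up the driver, then the address.
name, driver_id = conn.execute(
    "SELECT name, driver_id FROM driver WHERE phone_no = ?", (4026610,)
).fetchone()
first_line = conn.execute(
    "SELECT first_line FROM address WHERE driver_id = ?", (driver_id,)
).fetchone()[0]
two_queries = (name, first_line)

# One round trip: the join returns the same pair.
joined = conn.execute(
    """SELECT d.name, a.first_line
       FROM driver d JOIN address a ON a.driver_id = d.driver_id
       WHERE d.driver_id = ?""",
    (111012123,),
).fetchone()
```

Both approaches produce the same result here. The join simply lets the database do the matching in a single statement.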

SQL Tuning

Fully Qualified Column References When Performing Joins

Always include table aliases in your queries and use the alias for each column in your query. This is known as “fully qualifying” your column references. That way, the database doesn’t have to search for each column in the tables used in your query.

SELECT d.name, a.first_line, phone_no, Licence_no FROM Driver d, Address a WHERE d.driver_id = a.driver_id;

BAD (the phone_no and Licence_no columns are not fully qualified)

SELECT d.name, a.first_line, a.phone_no, d.Licence_no FROM Driver d, Address a WHERE d.driver_id = a.driver_id;

GOOD (column references are fully qualified)
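Qualifying columns also has a correctness benefit, separate from the lookup cost mentioned in the slide: a column name that exists in both joined tables cannot be resolved without an alias. The sketch below shows this in SQLite, with assumed column types and a made-up sample row. Other databases report the same problem with their own error messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE driver (driver_id INTEGER PRIMARY KEY, name TEXT, licence_no TEXT);
CREATE TABLE address (driver_id INTEGER, first_line TEXT, phone_no INTEGER);
INSERT INTO driver VALUES (111012123, 'Asha', 'LIC-001');
INSERT INTO address VALUES (111012123, '12 Lake Road', 4026610);
""")

# driver_id exists in both tables, so the unqualified reference is rejected.
try:
    conn.execute(
        "SELECT driver_id FROM driver d, address a WHERE d.driver_id = a.driver_id"
    )
    ambiguous = False
except sqlite3.OperationalError as exc:
    ambiguous = "ambiguous" in str(exc)

# With every column qualified, the query is unambiguous.
qualified = conn.execute(
    """SELECT d.name, a.first_line, a.phone_no, d.licence_no
       FROM driver d, address a
       WHERE d.driver_id = a.driver_id"""
).fetchall()
```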

SQL Tuning

Use CASE Expressions Rather than Multiple Queries

SELECT Driver_ID, CASE WHEN Tickets_Issued BETWEEN 8 AND 12 THEN 'High_Risk' WHEN Tickets_Issued BETWEEN 5 AND 7 THEN 'Med_Risk' ELSE 'Fine' END FROM Compliance;
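A minimal runnable sketch of the CASE approach, using a cut-down Compliance table with made-up rows. The thresholds follow the query above, with the second range written as 5 to 7 so that the ranges do not overlap. CASE takes the first matching branch, so for whole-number ticket counts the result is the same. The Risk_Level alias is an addition for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Compliance (Driver_ID INTEGER PRIMARY KEY, Tickets_Issued INTEGER);
INSERT INTO Compliance VALUES (1, 10);
INSERT INTO Compliance VALUES (2, 6);
INSERT INTO Compliance VALUES (3, 1);
""")

# One pass over the table classifies every driver; no per-category query is needed.
risk_rows = conn.execute("""
    SELECT Driver_ID,
           CASE
               WHEN Tickets_Issued BETWEEN 8 AND 12 THEN 'High_Risk'
               WHEN Tickets_Issued BETWEEN 5 AND 7  THEN 'Med_Risk'
               ELSE 'Fine'
           END AS Risk_Level
    FROM Compliance
    ORDER BY Driver_ID
""").fetchall()
```

The alternative would be one SELECT per risk band, each with its own WHERE clause, which scans the table several times.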

Add Indexes to Tables

Generally, you should create an index on a column when you are retrieving a small number of rows from a table containing many rows. A good rule of thumb is to create an index when a query retrieves <= 10 percent of the total rows in a table.

Why shouldn’t we create an index without following the above rule? The downside of indexes is that when a row is added to the table, additional time is required to update the index for the new row.
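One way to check whether an index is actually used is to compare query plans before and after creating it. The sketch below does this in SQLite with EXPLAIN QUERY PLAN. The booking table layout, the generated rows, and the index name idx_booking_driver are assumptions for illustration, and the plan wording varies between SQLite versions and differs in other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE booking (booking_id INTEGER PRIMARY KEY, driver_id INTEGER, miles REAL)"
)
# 5000 bookings spread over 500 drivers, so one driver matches about 0.2% of rows.
conn.executemany(
    "INSERT INTO booking (driver_id, miles) VALUES (?, ?)",
    [(i % 500, float(i % 40)) for i in range(5000)],
)

QUERY = "SELECT booking_id, miles FROM booking WHERE driver_id = 42"

def plan(sql):
    # The fourth column of each EXPLAIN QUERY PLAN row holds the readable detail text.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(QUERY)  # expected to describe a full table scan
conn.execute("CREATE INDEX idx_booking_driver ON booking (driver_id)")
plan_after = plan(QUERY)   # expected to mention idx_booking_driver

matching_rows = conn.execute(
    "SELECT COUNT(*) FROM booking WHERE driver_id = 42"
).fetchone()[0]
```

The query here retrieves well under 10 percent of the rows, which is the situation the rule of thumb describes. The cost is that every later insert into booking must also update the index.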

Step 3: Physical Model

This step is the transition from the logical model to the physical model. The goal is to achieve adequate performance.

We may need to work creatively with the database designer to propose and evaluate changes to the logical model to be incorporated in the physical model, if these are needed to achieve adequate performance.

Similarly, we may need to work with the business stakeholders and process modelers or programmers to assess the impact of such changes on them.

The physical model describes the actual implemented database, including the tables, constraints, etc.

Thank You!
