8/3/2019 Database Design Optimization
http://slidepdf.com/reader/full/database-design-optimization 1/25
DATABASE OPTIMIZATION
Techniques practiced in Data
Modeling to achieve practical gain
Version 1.0
Anindya Mitra
What?
• Design changes are made in the Physical Data Model.
• The logical and physical models no longer look similar.
• The design may appear to drift away from the Normalization Rules.
Why?
• Implementation of a database design is technology dependent.
• The logical design has some proven performance problems.
• Business-rule-driven design approach.
• Design by data access pattern and frequency.
• Design to minimize I/O cost and runtime CPU requirements.
Database Optimization is the set of design adjustments made to meet
high-volume queries or transactions, with or without sacrificing
the normal forms.
Optimization Techniques :-
Various techniques are practiced in the world of data modeling.
Not all of them are useful in every situation.
Understand the merits and demerits of each approach.
Choose the right tool for the right job.
Denormalization
There are many techniques for denormalizing a relational database design.
These include:
Stored Joins :-
• Technique of joining two or more tables and storing the answer set as an additional table.
Adv :-
• Significantly reduces runtime processing for big queries.
Disadv :-
• Requires synchronization when the original data is modified.
• Takes more disk space.
• Applications need to know about these additional tables in order to use them.
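A stored join can be sketched with Python's built-in sqlite3 module. This is only an illustration: the customer and orders tables, the customer_orders answer-set table, and the rebuild_stored_join helper are hypothetical names, and a full rebuild is just one simple way to synchronize.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders   (order_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders   VALUES (10, 1, 250.0), (11, 2, 75.0), (12, 1, 40.0);
""")

def rebuild_stored_join(conn):
    """Re-run the join and store the answer set as an additional table."""
    conn.executescript("""
        DROP TABLE IF EXISTS customer_orders;
        CREATE TABLE customer_orders AS
            SELECT o.order_id, c.cust_id, c.name, o.amount
            FROM orders o JOIN customer c ON c.cust_id = o.cust_id;
    """)

rebuild_stored_join(conn)
# Queries now read one table and perform no join at run time.
rows_before = conn.execute(
    "SELECT name, amount FROM customer_orders ORDER BY order_id").fetchall()

# Changing the base data leaves the stored join stale until it is rebuilt.
conn.execute("UPDATE customer SET name = 'Acme Corp' WHERE cust_id = 1")
stale_name = conn.execute(
    "SELECT name FROM customer_orders WHERE order_id = 10").fetchone()[0]
rebuild_stored_join(conn)
fresh_name = conn.execute(
    "SELECT name FROM customer_orders WHERE order_id = 10").fetchone()[0]
```

The stale read between the update and the rebuild is the synchronization cost listed under Disadv.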
Stored Joins - Database Specific Features :-
• Oracle
Oracle 8i Materialized Views can replace Stored Joins.
Advantages are :-
1) Transparency to application programs
2) Automatic data maintenance
3) Protection against data corruption
Duplicated data :-
• Technique of copying some columns from one table and storing and using them in another.
Adv :-
• Eliminates runtime joins if only those columns are accessed from the parent table. Improves query performance.
Disadv :-
• Duplication may lead to data inconsistency.
• More storage space required.
• Application programs need to know about the duplicated columns.
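A minimal sketch of duplicated data, again using sqlite3. The schema is hypothetical (orders.cust_name is a copy of customer.name), and the trigger is one possible way to limit the inconsistency risk, not something the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    -- cust_name is a duplicated copy of customer.name (hypothetical schema).
    CREATE TABLE orders (
        order_id  INTEGER PRIMARY KEY,
        cust_id   INTEGER REFERENCES customer(cust_id),
        cust_name TEXT,
        amount    REAL
    );
    -- Keep the copy in step with the parent so the two cannot drift apart.
    CREATE TRIGGER sync_cust_name AFTER UPDATE OF name ON customer
    BEGIN
        UPDATE orders SET cust_name = NEW.name WHERE cust_id = NEW.cust_id;
    END;
    INSERT INTO customer VALUES (1, 'Acme', 'Pune');
    INSERT INTO orders VALUES (10, 1, 'Acme', 250.0);
""")

# No join is needed when only the duplicated column is required.
report = conn.execute(
    "SELECT order_id, cust_name, amount FROM orders").fetchall()

# Renaming the customer fires the trigger and refreshes the copy.
conn.execute("UPDATE customer SET name = 'Acme Corp' WHERE cust_id = 1")
synced = conn.execute(
    "SELECT cust_name FROM orders WHERE order_id = 10").fetchone()[0]
```

Without the trigger, the rename would leave orders.cust_name out of date, which is the inconsistency listed under Disadv.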
Duplicated data - Database Specific Features :-
• Oracle
An Oracle 9i Bitmap Join Index stores duplicated columns from another table as the
index entry, not as table columns. It can be used only when the data is of low
cardinality and static in nature, and achieves :-
1) Transparency to application programs
Recurring data groups :-
• Collapsing the subordinate table into the parent table when a fixed and small number of subordinate tables are associated with a table.
Adv :-
• Eliminates runtime joins between those tables.
Disadv :-
• Degrades design quality, as it leads to lower Normal Forms.
Note: Care should be taken to preserve the keys to ensure the “Rule of
Reconstruction”, i.e. joins should not lead to loss of information.
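The reconstruction rule can be checked mechanically. The sketch below assumes a hypothetical employee table with a subordinate emp_phone table holding at most one home and one work number per employee. It collapses the phones into the parent and then unfolds them again to confirm that nothing was lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp_phone (
        emp_id INTEGER,
        kind   TEXT CHECK (kind IN ('home', 'work')),
        number TEXT,
        PRIMARY KEY (emp_id, kind)
    );
    INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO emp_phone VALUES (1, 'home', '555-0101'),
                                 (1, 'work', '555-0102'),
                                 (2, 'work', '555-0202');

    -- Collapsed design: the subordinate rows become columns of the parent.
    CREATE TABLE employee_flat AS
        SELECT e.emp_id, e.name,
               MAX(CASE WHEN p.kind = 'home' THEN p.number END) AS home_phone,
               MAX(CASE WHEN p.kind = 'work' THEN p.number END) AS work_phone
        FROM employee e LEFT JOIN emp_phone p ON p.emp_id = e.emp_id
        GROUP BY e.emp_id, e.name;
""")

original = conn.execute(
    "SELECT emp_id, kind, number FROM emp_phone ORDER BY emp_id, kind").fetchall()

# Reconstruction: unfold the columns back into rows, keyed by emp_id.
rebuilt = conn.execute("""
    SELECT emp_id, 'home' AS kind, home_phone AS number
      FROM employee_flat WHERE home_phone IS NOT NULL
    UNION ALL
    SELECT emp_id, 'work', work_phone
      FROM employee_flat WHERE work_phone IS NOT NULL
    ORDER BY emp_id, kind
""").fetchall()
```

Because emp_id is preserved in the collapsed table, rebuilt matches original row for row.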
Derived data :-
• Composite attributes derivable by computational rules are materialized as columns.
• Aggregate data is stored as precalculated columns.
Adv :-
• Computational attributes save runtime computation.
• Aggregate columns minimize disk I/O and processing time.
Disadv :-
• Because the values are precomputed, accuracy and timeliness are issues.
• Derived columns introduce redundancy and may even violate 3NF in some cases.
Note: Avoid storing aggregate data within the same table that is
being aggregated.
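The sketch below illustrates both kinds of derived data with sqlite3. All names (order_line, line_total, order_total, the add_line trigger) are hypothetical, and only inserts are handled. A real design would also need triggers for updates and deletes to keep the stored values accurate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_line (
        line_id    INTEGER PRIMARY KEY,
        order_id   INTEGER,
        qty        INTEGER,
        unit_price REAL,
        line_total REAL          -- derived: qty * unit_price
    );
    -- Aggregate kept in a separate table, not in the table being aggregated.
    CREATE TABLE order_total (order_id INTEGER PRIMARY KEY, total REAL);

    CREATE TRIGGER add_line AFTER INSERT ON order_line
    BEGIN
        UPDATE order_line SET line_total = NEW.qty * NEW.unit_price
         WHERE line_id = NEW.line_id;
        INSERT OR IGNORE INTO order_total VALUES (NEW.order_id, 0);
        UPDATE order_total SET total = total + NEW.qty * NEW.unit_price
         WHERE order_id = NEW.order_id;
    END;
""")
conn.executemany(
    "INSERT INTO order_line (order_id, qty, unit_price) VALUES (?, ?, ?)",
    [(1, 2, 10.0), (1, 1, 5.5), (2, 3, 4.0)])

# Precalculated aggregate: one small table read, no scan of order_line.
stored = conn.execute(
    "SELECT order_id, total FROM order_total ORDER BY order_id").fetchall()

# The same figures computed at run time, for comparison.
computed = conn.execute(
    "SELECT order_id, SUM(qty * unit_price) FROM order_line "
    "GROUP BY order_id ORDER BY order_id").fetchall()
```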
Derived data - Database Specific Features :-
• Oracle
Oracle 8i Materialized Views can be used to store aggregate results.
Over-Normalization
Vertical Partitioning :-
• Technique of splitting the original logical table into two or more physical tables by assigning some of the columns to one physical table and some to another.
• Both tables end up with the same number of rows and have the same keys.
Adv :-
• Incurs less I/O for queries that never join the split tables.
• May improve performance on DBMSs that handle wide tables poorly.
Disadv :-
• May prove costly if most queries require columns from more than one table, because it introduces joins.
• Introduces multi-table integrity checks and primary key maintenance.
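As a rough illustration of vertical partitioning, the sketch below splits a hypothetical product table into product_core (narrow, frequently read) and product_detail (wide, rarely read). Both share the prod_id key and hold the same number of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Frequently read columns.
    CREATE TABLE product_core (
        prod_id INTEGER PRIMARY KEY, name TEXT, price REAL
    );
    -- Wide, rarely read columns; same key, same number of rows.
    CREATE TABLE product_detail (
        prod_id INTEGER PRIMARY KEY REFERENCES product_core(prod_id),
        long_description TEXT,
        manual_text TEXT
    );
    INSERT INTO product_core VALUES (1, 'Widget', 9.99), (2, 'Gadget', 24.5);
    INSERT INTO product_detail VALUES
        (1, 'A small widget.', 'Widget manual'),
        (2, 'A large gadget.', 'Gadget manual');
""")

# Common query: touches only the narrow table.
price_list = conn.execute(
    "SELECT name, price FROM product_core ORDER BY prod_id").fetchall()

# Less common query: needs both halves, so it pays for a join.
full_row = conn.execute("""
    SELECT c.name, c.price, d.long_description
    FROM product_core c JOIN product_detail d ON d.prod_id = c.prod_id
    WHERE c.prod_id = 1
""").fetchone()
```

Whether the split pays off depends on how often queries resemble the first statement rather than the second.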
SUBTYPE options
Single Table Subtype Design :-
• Map the subtypes onto a single table for the supertype.
• The single table will contain instances of all subtypes.
• Use when the subtypes have few subtype-specific attributes and relationships.
Adv :-
• Access to the supertype is straightforward.
• The subtypes can be accessed and modified using views.
Disadv :-
• Subtype NOT NULL requirements cannot be enforced at the database level.
• Application logic will have to cater to different sets of attributes, depending on TYPE.
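A minimal sketch of the single table design, assuming a hypothetical party supertype with PERSON and COMPANY subtypes. The views here are used for reading only. Whether a view can also be modified depends on the DBMS (SQLite, used below, would need INSTEAD OF triggers).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE party (
        party_id   INTEGER PRIMARY KEY,
        party_type TEXT NOT NULL CHECK (party_type IN ('PERSON', 'COMPANY')),
        name       TEXT NOT NULL,
        birth_date TEXT,   -- PERSON only, so it cannot be declared NOT NULL
        tax_number TEXT    -- COMPANY only, so it cannot be declared NOT NULL
    );
    CREATE VIEW person AS
        SELECT party_id, name, birth_date FROM party
        WHERE party_type = 'PERSON';
    CREATE VIEW company AS
        SELECT party_id, name, tax_number FROM party
        WHERE party_type = 'COMPANY';
    INSERT INTO party VALUES (1, 'PERSON',  'Asha', '1980-04-02', NULL),
                             (2, 'COMPANY', 'Acme', NULL, 'TX-991');
""")

# Supertype access is a plain single-table query.
all_parties = conn.execute(
    "SELECT party_id, name FROM party ORDER BY party_id").fetchall()

# Subtype access goes through a view that filters on the type column.
people = conn.execute("SELECT * FROM person").fetchall()
```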
Separate Tables Subtype Design :-
• Map the subtypes onto separate tables, one for each subtype.
• Each table will contain only instances of that subtype.
• Use when there are many subtype-specific attributes or relationships.
Adv :-
• The subtype's attribute optionality is enforced at database level.
• Application logic does not require checks for subtypes.
Disadv :-
• Access to the supertype requires a UNION or join.
• Joins produce a multi-table view, which makes update logic complex.
• Maintenance of UIDs across subtypes is difficult to implement.
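The separate tables design can be sketched the same way, using hypothetical person and company tables and a party view over them. The NOT NULL constraints now work per subtype, while the supertype has to be assembled with a UNION.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        party_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL, birth_date TEXT NOT NULL
    );
    CREATE TABLE company (
        party_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL, tax_number TEXT NOT NULL
    );
    -- Supertype access needs a UNION over the subtype tables.
    -- Nothing here stops the same party_id appearing in both tables,
    -- which is the UID maintenance problem.
    CREATE VIEW party AS
        SELECT party_id, 'PERSON' AS party_type, name FROM person
        UNION ALL
        SELECT party_id, 'COMPANY', name FROM company;
    INSERT INTO person  VALUES (1, 'Asha', '1980-04-02');
    INSERT INTO company VALUES (2, 'Acme', 'TX-991');
""")

# Subtype optionality is enforced by the database itself.
try:
    conn.execute("INSERT INTO person VALUES (3, 'Ravi', NULL)")
    not_null_enforced = False
except sqlite3.IntegrityError:
    not_null_enforced = True

parties = conn.execute(
    "SELECT party_id, party_type, name FROM party ORDER BY party_id").fetchall()
```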
Example (figure): Separate Table Design compared with Single Table Design.
Other Techniques
Horizontal segmentation :-
Technique of taking horizontal slices of the original table and putting them in different tables. Some rows go to one table and some rows to another, depending on the partition logic.
Example (figure): table Order split into Order_98 and Order_99.
Adv :-
• Incurs less I/O for queries that access data from a single partition table, as the tables become smaller.
• Putting tables in different physical files may improve the performance of parallel processing.
• Very useful technique for data archival. Old, rarely accessed data goes to a separate table from current data.
Disadv :-
• Slows retrieval across partition tables.
• Applications need to know which table to look at for which data. May result in code modification.
• Complexity.
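The sketch below shows horizontal segmentation by year, following the Order_98 / Order_99 example. The table_for and insert_order helpers and the order_all view are hypothetical. They stand for the partition logic that the application has to carry.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_98 (order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL);
    CREATE TABLE order_99 (order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL);
    -- Cross-partition access goes through a UNION ALL view.
    CREATE VIEW order_all AS
        SELECT * FROM order_98 UNION ALL SELECT * FROM order_99;
""")

def table_for(order_date):
    """Partition logic: route a row to a table by its order year."""
    tables = {"1998": "order_98", "1999": "order_99"}
    return tables[order_date[:4]]

def insert_order(conn, order_id, order_date, amount):
    # The table name comes from a fixed whitelist, never from user input.
    conn.execute(
        f"INSERT INTO {table_for(order_date)} VALUES (?, ?, ?)",
        (order_id, order_date, amount))

insert_order(conn, 1, "1998-11-30", 120.0)
insert_order(conn, 2, "1999-01-15", 80.0)
insert_order(conn, 3, "1999-06-01", 45.0)

# Single-partition query: reads only the smaller current-year table.
current_year = conn.execute("SELECT COUNT(*) FROM order_99").fetchone()[0]

# Cross-partition query: has to visit every segment.
all_years = conn.execute("SELECT COUNT(*) FROM order_all").fetchone()[0]
```

The routing function is exactly the information that applications must hold about which table contains which data.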
Horizontal Segmentation - Database Specific Features :-
• Oracle
The Oracle 8 Partitioning Option provides a very useful alternative solution.
The advantages are :-
1) Ease of partition maintenance and augmentation
2) Transparency to application programs
Repeating groups across columns rather than down rows :-
Put repeating groups in additional columns in the same table rather than in a different table as rows.
Conditions :
• The meaning of each column is clear and different.
• There is a fixed number of occurrences.
• The entire group is normally accessed or updated as one unit.
Adv :-
• Results in simpler queries by eliminating joins.
• Improves query or DML performance.
Disadv :-
• The design loses flexibility. Any change in the number of occurrences of the repeating group leads to a design change.
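Quarterly figures are a typical case that meets these conditions. The yearly_sales table below is a hypothetical example with four columns of distinct meaning and a fixed count, read together as one unit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Four quarters: a fixed number of occurrences, each with a clear meaning.
    CREATE TABLE yearly_sales (
        region TEXT,
        year   INTEGER,
        q1 REAL, q2 REAL, q3 REAL, q4 REAL,
        PRIMARY KEY (region, year)
    );
    INSERT INTO yearly_sales VALUES ('East', 1999, 10.0, 12.0, 9.0, 15.0);
""")

# The whole group is read as one unit, with no join to a child table.
q1, q2, q3, q4 = conn.execute(
    "SELECT q1, q2, q3, q4 FROM yearly_sales "
    "WHERE region = 'East' AND year = 1999").fetchone()
year_total = q1 + q2 + q3 + q4
```

Moving to monthly figures would require twelve columns and a schema change, which is the loss of flexibility noted under Disadv.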
Surrogate key in place of Natural Key :-
A short, generated key may be selected as the Primary Key in some situations even if a natural key exists.
Conditions :
• The natural key is wide, so searches take significant time.
• The natural key is a composite key.
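A sketch of a surrogate key, assuming a hypothetical shipment table whose natural key is the wide composite (carrier_code, waybill_no, ship_date). Keeping a UNIQUE constraint on the natural key is a design choice added here, not something the slides state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipment (
        shipment_id  INTEGER PRIMARY KEY,   -- short generated surrogate key
        carrier_code TEXT NOT NULL,
        waybill_no   TEXT NOT NULL,
        ship_date    TEXT NOT NULL,
        UNIQUE (carrier_code, waybill_no, ship_date)  -- natural key still enforced
    );
    -- Child rows carry one small integer instead of three columns.
    CREATE TABLE shipment_event (
        event_id    INTEGER PRIMARY KEY,
        shipment_id INTEGER NOT NULL REFERENCES shipment(shipment_id),
        status      TEXT
    );
""")

cur = conn.execute(
    "INSERT INTO shipment (carrier_code, waybill_no, ship_date) VALUES (?, ?, ?)",
    ("DHL", "WB-000123", "1999-05-01"))
shipment_id = cur.lastrowid   # generated by the database
conn.execute(
    "INSERT INTO shipment_event (shipment_id, status) VALUES (?, 'PICKED_UP')",
    (shipment_id,))

# The natural key is no longer the primary key, but duplicates are still refused.
try:
    conn.execute(
        "INSERT INTO shipment (carrier_code, waybill_no, ship_date) VALUES (?, ?, ?)",
        ("DHL", "WB-000123", "1999-05-01"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```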
Fixed Length columns Left of Variable Length columns :-
• Variable length columns require CPU time to compute where they end and where the columns following them begin within the table row.
• Placing fixed length columns first may help to relieve a CPU bottleneck.
Nullable columns at End of table :-
Note: Oracle does not allocate any space for null columns when they are positioned at the end of the row.
Caveats
• Remain as faithful as you can to the original logical data model.
• Apply tuning techniques that result in database structures inconsistent with the logical data model only when it is proven that doing otherwise would result in functionality or performance failure.
• Separate the logical model from the physical model within the modeling tool; document changes to the physical model.
• Decide on optimization techniques at an early phase of design. This saves a lot of re-coding time.
• OLTP systems should avoid Denormalization, as it makes data updates extremely difficult.