dmcfarlane
DESCRIPTION
As operational database schemas become complex, users resort to denormalization to handle performance issues. This includes a range of techniques from materialized views to using MySQL as a key-value store for blobs containing full objects. While denormalization solves immediate bottlenecks, it comes at a hefty price. In this presentation Ari will explore common denormalization approaches and tradeoffs using real world examples. He will then present a solution under development at Akiban Technologies to alleviate these same problems much more efficiently, and allow users to get the best of both worlds.
RENORMALIZE
Akiban Technologies, Inc. Confidential & Proprietary
Solving Performance Problems in MySQL Without Denormalization
Problem Statement
Schemas scale out
Data volume grows
Joins become a real bottleneck
Two Common Manifestations
SQL Joins: Queries become slower as more tables are joined.
Application Object Creations: Constructing an object is as expensive as SELECTing the sum of its parts.
Denormalize. Problem solved.
Application Growing Pains
[Figure: growth timeline. Axis labels: Time, Complexity & Cost, Customers.]
V1 Release: Get Customers!
V2 Release: De-normalize DB
V3 Release: Replicate DB
V4 Release: Add Caching
V5 Release: Shard Database
V6 Release: Rip & Replace
[Diagram labels: Web Server, Cache Server, MySQL, MySQL Slaves, Sharding, Rip & Replace Database Architecture?]
De·nor·mal·ize [de-nawr-muh-lahyze]
verb, -ized, -iz·ing.
–verb (used with object)
1. the process of attempting to optimize the read performance of a database by adding redundant data or by grouping data (Wikipedia)
2. Denormalize means to allow redundancy in a table so that the table can remain flat (UCSD Blink)
3. The process of restructuring a normalized data model to accommodate operational constraints or system limitations (celiang.tongji.edu.cn)
Materialized Views
- Persistent database object
- Contains the results of a query
- Store summary and pre-joined tables
- Require maintenance/refresh for dynamic data
SELECT DISTINCT(n.nid), n.sticky, n.title, n.created
FROM node n
INNER JOIN term_node tn0 ON n.vid = tn0.vid
WHERE n.status = 1
  AND tn0.tid IN (77)
ORDER BY n.sticky DESC, n.created DESC
LIMIT 0, 25;
Result: using where, using filesort
Drupal Materialized View Project
CREATE TABLE `mv_drupalorg_node_by_term` (
  `entity_type` varchar(64) NOT NULL,
  `entity_id` int(10) unsigned NOT NULL DEFAULT '0',
  `term_tid` int(10) unsigned NOT NULL DEFAULT '0',
  `node_sticky` int(11) NOT NULL DEFAULT '0',
  `last_node_activity` int(11) NOT NULL DEFAULT '0',
  `node_created` int(11) NOT NULL DEFAULT '0',
  `node_title` varchar(255) NOT NULL DEFAULT '',
  PRIMARY KEY (`entity_type`,`entity_id`,`term_tid`),
  KEY `activity` (`term_tid`,`node_sticky`,`last_node_activity`,`node_created`),
  KEY `creation` (`term_tid`,`node_sticky`,`node_created`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
SELECT DISTINCT entity_id AS nid, node_sticky AS sticky, node_title AS title, node_created AS created
FROM mv_drupalorg_node_by_term
WHERE term_tid IN (77)
ORDER BY node_sticky DESC, node_created DESC
LIMIT 0, 25;
Result: using where, using temporary table
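The maintenance cost of a hand-built materialized view can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, a cut-down version of the node and term_node tables, and a hypothetical pre-joined table mv_node_by_term with a hypothetical refresh_mv helper. It is not the Drupal project's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (nid INTEGER PRIMARY KEY, vid INTEGER, status INTEGER,
                       sticky INTEGER, title TEXT, created INTEGER);
    CREATE TABLE term_node (vid INTEGER, tid INTEGER);
    -- Hypothetical pre-joined table standing in for a materialized view.
    CREATE TABLE mv_node_by_term (nid INTEGER, tid INTEGER, sticky INTEGER,
                                  title TEXT, created INTEGER,
                                  PRIMARY KEY (nid, tid));
""")

def refresh_mv(conn):
    """Rebuild the pre-joined table from the base tables (full refresh)."""
    with conn:
        conn.execute("DELETE FROM mv_node_by_term")
        conn.execute("""
            INSERT INTO mv_node_by_term (nid, tid, sticky, title, created)
            SELECT DISTINCT n.nid, tn.tid, n.sticky, n.title, n.created
            FROM node n JOIN term_node tn ON n.vid = tn.vid
            WHERE n.status = 1
        """)

def titles_for_term(conn, tid):
    """Read from the pre-joined table only; no join at query time."""
    rows = conn.execute(
        "SELECT title FROM mv_node_by_term WHERE tid = ? "
        "ORDER BY sticky DESC, created DESC LIMIT 25", (tid,))
    return [r[0] for r in rows]

conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?, ?, ?)",
                 [(1, 1, 1, 0, "first", 100), (2, 2, 1, 1, "pinned", 50)])
conn.executemany("INSERT INTO term_node VALUES (?, ?)", [(1, 77), (2, 77)])
refresh_mv(conn)
before = titles_for_term(conn, 77)

# A write to the base table does not reach the copy until the next refresh.
conn.execute("UPDATE node SET title = 'renamed' WHERE nid = 1")
stale = titles_for_term(conn, 77)
refresh_mv(conn)
fresh = titles_for_term(conn, 77)
```

Between the UPDATE and the second refresh the read path still returns the old title, which is the "require maintenance/refresh" cost in miniature.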
Denormalization Technique Listing
Technique | Pros | Cons
Materialized views | Faster queries (no joins) | Data explosion; manually keep synched
Store object as Blob | Fast object get | No modeling, or querying
Denormalize 1NF: Folding parent-child into parent table | Data in one row | Limited # of child rows; hard to query (UNION hell)
Denormalize 2NF to 1NF: repeat columns from 1 table in M table (Double writing) | Avoid join | Data explosion; manually keep synched
Adding derived columns | Avoid joins, aggregation | Manually keep synched
Property bag (RDF) | Schema flexibility | Manage schema in app; hard to index or perform
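As a rough illustration of the "folding parent-child into parent table" row, the sketch below folds phone numbers into a customer row. The table customer_folded, its columns, and both helper functions are hypothetical, and sqlite3 stands in for MySQL. It shows why lookups over folded columns tend to turn into UNION chains and why the number of children is capped by the number of columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical folded table: at most three phone numbers per customer.
    CREATE TABLE customer_folded (
        cid INTEGER PRIMARY KEY, name TEXT,
        phone1 TEXT, phone2 TEXT, phone3 TEXT);
    INSERT INTO customer_folded VALUES
        (1, 'Ann', '555-0100', '555-0101', NULL),
        (2, 'Bob', '555-0199', NULL, NULL);
""")

def owner_of_phone(conn, phone):
    """Find customers by phone: one SELECT branch per folded column."""
    rows = conn.execute("""
        SELECT cid FROM customer_folded WHERE phone1 = :p
        UNION SELECT cid FROM customer_folded WHERE phone2 = :p
        UNION SELECT cid FROM customer_folded WHERE phone3 = :p
    """, {"p": phone})
    return sorted(r[0] for r in rows)

def all_phones(conn):
    """Rebuild the child rows that a normalized phone table would hold."""
    rows = conn.execute("""
        SELECT cid, phone1 AS phone FROM customer_folded
            WHERE phone1 IS NOT NULL
        UNION ALL SELECT cid, phone2 FROM customer_folded
            WHERE phone2 IS NOT NULL
        UNION ALL SELECT cid, phone3 FROM customer_folded
            WHERE phone3 IS NOT NULL
        ORDER BY cid, phone
    """)
    return rows.fetchall()

owners = owner_of_phone(conn, "555-0101")
phones = all_phones(conn)
```

Reading one customer is a single-row fetch, but every query that treats phones as rows needs a branch per column, and a fourth phone number has nowhere to go without a schema change.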
Renormalization
Join for free
- Improved performance. 10-100x!
- Retrieve an object in one request
Introduction to Table-Groups
Traditional SQL: Schema → Table → Column
Akiban newSQL: Schema → GROUP → Table → Column
Table-Groups are first class citizens
Typical Relational DB Schema
Typical Schema: Grouped
[Figure: the typical schema partitioned into table-groups. Labels: Block Group, User Group, Node Group, Artist Table-group; Logical, Physical.]
Table-Groups Eliminate Joins
Users (Table bTree)
uid | name | pass
1 | rriegel | ***
2 | twegner | ***
Users_Roles (Table bTree)
id | rid
1 | 1
1 | 2
2 | 1
Sessions (Table bTree)
id | timestamp | sid
1 | 2011-10-01-06:02.00 | 19390
2 | 2011-10-04-22:32.10 | 22828
1 | 2011-10-04-16:07.30 | 49377
Group bTree
Benefits of Table-grouping
SQL join operations are fast
- Table Group access is equivalent to a single table access. Joins are free!
- Performance increases 10-100x
Applications do not change
- Maintain the same tables and SQL
- Objects (e.g. ORM) fetched in one request
- Akiban uses standard MySQL replication
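One way to picture why group access can behave like a single table access is to store parent and child rows interleaved under a shared key prefix. The sketch below is an assumption made for illustration, not a description of Akiban's storage format: a sorted Python list stands in for the group bTree, the key layout (uid, table ordinal, child key) is hypothetical, and the sample rows mirror the Users, Users_Roles and Sessions data from the earlier slide.

```python
import bisect

# Table ordinals fix the order of tables inside one user's slice (assumed layout).
USERS, USERS_ROLES, SESSIONS = 0, 1, 2

# Key (uid, table ordinal, child key) -> (table name, row).
entries = [
    ((1, USERS, 0), ("users", {"uid": 1, "name": "rriegel"})),
    ((2, USERS, 0), ("users", {"uid": 2, "name": "twegner"})),
    ((1, USERS_ROLES, 1), ("users_roles", {"id": 1, "rid": 1})),
    ((1, USERS_ROLES, 2), ("users_roles", {"id": 1, "rid": 2})),
    ((2, USERS_ROLES, 1), ("users_roles", {"id": 2, "rid": 1})),
    ((1, SESSIONS, 19390), ("sessions", {"id": 1, "sid": 19390})),
    ((2, SESSIONS, 22828), ("sessions", {"id": 2, "sid": 22828})),
    ((1, SESSIONS, 49377), ("sessions", {"id": 1, "sid": 49377})),
]

# Sorting by key places each user's child rows directly after the user row.
group = sorted(entries, key=lambda e: e[0])
keys = [k for k, _ in group]

def fetch_user_object(uid):
    """Return the user row and all of its child rows with one range scan."""
    lo = bisect.bisect_left(keys, (uid,))
    hi = bisect.bisect_left(keys, (uid + 1,))
    return [row for _, row in group[lo:hi]]

user1 = fetch_user_object(1)
tables_seen = [table for table, _ in user1]
```

With this layout the three-table join for one user reduces to locating a key prefix and reading a contiguous slice, which is the intuition behind "joins are free".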
Design Partner Sample Query
SELECT t1.id, t3.c1, t3.c2, t3.c3, t3.c4
FROM t1
INNER JOIN t2 ON t2.id = t1.id
LEFT JOIN t3 ON t1.id = t3.id
WHERE t2.region IN (1297789)
  AND t1.c1 = '0'
ORDER BY t1.latestLogin DESC
LIMIT 500
Typical MySQL EXPLAIN Plan
[Figure: typical MySQL EXPLAIN plan, drawn as steps numbered 1 to 10.]
- 3 Index Accesses
- Sort
- Temp Table
- 2 Joins
- 2 Table Accesses
- Project Results
Efficiency for Speed and Scale
[Figure: table-group plan, drawn as steps numbered 1 to 3, shown against the typical MySQL EXPLAIN plan.]
- 1 Group Index Access
- 1 Group Access
- Project Results
No Joins, Temp Tables or Sorts!
Design Partner Acceleration: 27x
Concurrent Connections
Object Creation Query Stream
SELECT * FROM t1 WHERE uid = 1387;
SELECT * FROM t2 WHERE uid = 1387;
SELECT * FROM t3 WHERE uid = 1387;
SELECT * FROM t4 WHERE uid = 1387;
SELECT * FROM t5 WHERE uid = 1387;
SELECT * FROM t6 WHERE uid = 1387;
...
Becomes Single ORM Request
SELECT *,
  (SELECT * FROM t2 WHERE t2.uid = t1.uid),
  (SELECT * FROM t3 WHERE t3.uid = t1.uid),
  ...
FROM t1 WHERE t1.uid = 1387;
Or simply: get my_schema:t1:uid=1387
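For comparison, the per-table query stream that an application typically issues to build one object can be sketched as follows. This is a minimal illustration under assumptions: sqlite3 stands in for MySQL, the tables t1 to t3 and their columns are invented sample data, and load_object is a hypothetical helper that counts the statements it sends.

```python
import sqlite3

CHILD_TABLES = ["t2", "t3"]  # hypothetical child tables keyed by uid

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE t1 (uid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE t2 (uid INTEGER, setting TEXT);
    CREATE TABLE t3 (uid INTEGER, item TEXT);
    INSERT INTO t1 VALUES (1387, 'example');
    INSERT INTO t2 VALUES (1387, 'a'), (1387, 'b');
    INSERT INTO t3 VALUES (1387, 'x');
""")

def load_object(conn, uid):
    """Assemble one object with a separate SELECT per table."""
    queries = 0
    root = conn.execute("SELECT * FROM t1 WHERE uid = ?", (uid,)).fetchone()
    queries += 1
    obj = dict(root)
    for table in CHILD_TABLES:
        # Table names come from the fixed list above, never from user input.
        rows = conn.execute(f"SELECT * FROM {table} WHERE uid = ?", (uid,))
        obj[table] = [dict(r) for r in rows]
        queries += 1
    return obj, queries

obj, queries = load_object(conn, 1387)
```

The statement count grows with the number of tables that make up the object, so each object costs that many round trips to the database; the single-request form above collapses them into one.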
Object Access in One Request
Application Integration
[Diagram labels: Akiban Server, MySQL adapter, MySQL Master, MyISAM / InnoDB Storage, Replication, Problem Queries, Write Operations, HA Redirect Enabled]
- Fully independent server
- Data replicated to Akiban
Akiban is looking for Design Partners!
Do you have
• Slow multi-join read queries?
• User concurrency or data volume challenges?
http://www.akiban.com/design-partner-program
Ah, so you’re…
Denormalizing…no.
- Schema doesn’t change
- Data is stored once, more efficiently
Materializing Views…no.
- No triggers or post-processing
- No secondary logical objects
Introducing Write Latency…no.
- Previous design partner showed 2x write improvement
Table-Grouping: A Closer Look
Artist (Table bTree)
id | name | gender
1 | Lennon | M
2 | Joplin | F
Each table maintains its own bTree
Indexes add their own bTrees
• Covering index
• Index on frequently joined columns
• Index on common sort order
[Diagram labels: Covering Index, Join Cols Index, Sort Order Index]
How many indexes do you maintain?
• Slow updates == reduced concurrency
• More resources == more overhead
• Ongoing maintenance == high TCO