Upload
madlyn-townsend
View
213
Download
0
Embed Size (px)
Citation preview
1
Recovery Tuning
Main techniques Put the log on a dedicated disk Delay writing updates to the database
disks as long as possible Setting proper intervals for DB
dumping and checkpointing Reduce the size of large update
transactions
2
Separate Disk for the Log DB2 UDB v7.1 on
Windows 2000 5 % performance
improvement if log is located on a different disk
Controller cache hides negative impact
mid-range server, with Adaptec RAID controller (80Mb RAM) and 2x18Gb disk drives.
0
50
100
150
200
250
300
350
controller cache no controller cache
Th
rou
gh
put
(tu
ples
/sec
)
Log on same disk
Log on separate disk
Figure 2.18 in the textbook shows a 30% improvement
3
Tuning Database Writes Database writes caused by transactions tend
to be random Better to be delayed as much as possible
Sufficient info in the log for recovery But eventually they need to be written to the disk
To reduce recovery time When to write?
Forced: when the buffer is full (or nearly full) Opportunistic: when no extra overhead for disk seeking Checkpoint: force all committed writes to disk
4
Writing Dirty Pages to the Disk When the number of dirty pages is greater
than a given parameter (Oracle 8) When the number of dirty pages crosses a
given threshold (less than 3% of free pages in the database buffer for SQL Server 7)
When the log is full, a checkpoint is forced. This can have a significant impact on performance.
5
Tune Checkpoint Intervals Oracle 8i on Windows
2000 A checkpoint (partial
flush of dirty pages to disk) occurs at regular intervals or when the log is full:
Impacts the performance of on-line processing
+ Reduces the size of log+ Reduces time to
recover from a crash
0
0.2
0.4
0.6
0.8
1
1.2
0 checkpoint 4 checkpoints
Thro
ughp
ut R
atio
6
Group Commit Log-writing a bottleneck if every committing
transaction needs a write to the log Group commit
Write the logs of multiple transactions in batch Need to use a “log buffer” (another thing to
tune!) Better throughput if many concurrent short
update transactions Longer response time for individual transactions
This is a problem if they hold lock Early release of locks can cause problems, but the risk
is remote
7
Log IO - Data
Settings:lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY,
L_LINENUMBER , L_QUANTITY, L_EXTENDEDPRICE , L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS , L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT );
READ COMMITTED isolation level Empty table Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID
controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
8
Log IO - Transactions
No Concurrent Transactions:Insertions [300 000 inserts, 10 threads],
e.g., insert into lineitem values (1,7760,401,1,17,28351.92,0.04,0.02,'N','O','1996-03-13','1996-02-12','1996-03-22','DELIVER IN PERSON','TRUCK','blithely regular ideas caj');
9
Group Commits DB2 UDB v7.1 on
Windows 2000 Log records of many
transactions are written together
Increases throughput by reducing the number of writes
At cost of increased minimum response time.
0
50
100
150
200
250
300
350
1 25
Size of Group Commit
Th
rou
gh
pu
t (t
up
les/
sec)
10
Transaction Chopping Some transactions, in particular
batch transactions, can be very long A lot of log information Very costly for recovery
Solution Transaction chopping An easy to understand concept
Formal work in appendix B of the textbook
11
Summary
In this module, we have covered: The principles of recovery How to optimise recovery-related
options Put the log on a dedicated disk Delay writing updates Using checkpoint and dump properly Reduce the size of update transactions
CS5226 Week 6
Operating System & Database Performance Tuning
13
Outline
Part 1: Operating systems and DBMS
Part 2: OS-related tuning
14
Operating System• Operating system is an
interface between hardware and other software, supporting:
• Processes and threads;
• Paging, buffering and IO scheduling
• Multi-tasking
• File system
• Other utilities such as timing, networking and performing monitoring
hardware
15
Scheduling Process versus thread
Scheduling based on time-slicing, IO, priority etc
Different from transaction scheduling The cost of content switching
When switch is desirable? And when is not? The administrator can set priorities to
processes/threads Case 1: The DBMS runs at a lower priority Case 2: Different transactions run at different priority Case 3: Online transactions with higher priority than
offline transactions
16
Priority Inversion
Let priorities T1 > T2s > T3
T1
T2s
T3
Lock x
Request X
… a solution: priority inheritance
17
Database Buffers
Application buffers
DBMS buffers
OS buffers
•An application can have its own in-memory buffers (e.g., variables in the program; cursors);
•A logical read/write will be issued to the DBMS if the data needs to be read/written to the DBMS;
•A physical read/write is issued by the DBMS using its systematic page replacement algorithm. And such a request is passed to the OS.
•OS may initiate IO operations to support the virtual memory the DBMS buffer is built on.
18
Database Buffer Size Buffer too small, then hit
ratio too smallhit ratio =
(logical acc. - physical acc.) / (logical acc.)
Buffer too large, paging Recommended strategy:
monitor hit ratio and increase buffer size until hit ratio flattens out. If there is still paging, then buy memory.
LOG DATADATA
RAM
Paging Disk
DATABASE PROCESSES
DATABASEBUFFER
19
Buffer Size - Data
Settings:employees(ssnum, name, lat, long, hundreds1,
hundreds2);
clustered index c on employees(lat); (unused) 10 distinct values of lat and long, 100 distinct values
of hundreds1 and hundreds2 20000000 rows (630 Mb); Warm Buffer Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID
controller from Adaptec (80Mb), 4x18Gb drives (10000 RPM), Windows 2000.
20
Buffer Size - Queries
Queries: Scan Queryselect sum(long) from employees;
Multipoint queryselect * from employees where lat = ?;
21
Database Buffer Size SQL Server 7 on
Windows 2000 Scan query:
LRU (least recently used) does badly when table spills to disk as Stonebraker observed 20 years ago.
Multipoint query: Throughput increases
with buffer size until all data is accessed from RAM.
Multipoint Query
0
40
80
120
160
0 200 400 600 800 1000
Buffer Size (Mb)
Th
rou
gh
pu
t (Q
uer
ies/
sec)
Scan Query
0
0.02
0.04
0.06
0.08
0.1
0 200 400 600 800 1000
Buffer Size (Mb)
Th
rou
gh
pu
t (Q
ue
rie
s/s
ec
)
22
Multiprogramming Levels More concurrent users
Better utilization of CPU cycles (and other system resources)
Risk of excessive page swapping More lock conflicts
So how many exactly Depends on transaction profiles
Experiments to find the best value And this parameter may change when application
patterns change Feedback control mechanism
23
Disk Layout and Access Larger disk allocation chunks
improves write performance At the cost of disk utilisation
Setting disk usage factor Low when expecting updates/inserts Higher for scan-type of queries
Prefetching within DBMS; not OS For non-random accesses
24
Scan Performance - Data
Settings:lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY,
L_LINENUMBER , L_QUANTITY, L_EXTENDEDPRICE , L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS , L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT );
600 000 rows Lineitem tuples are ~ 160 bytes long Cold Buffer Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID
controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.
25
Scan Performance - Queries
Queries:select avg(l_discount) from lineitem;
26
Usage Factor DB2 UDB v7.1 on
Windows 2000 Usage factor is the
percentage of the page used by tuples and auxiliary data structures (the rest is reserved for future)
Scan throughput increases with usage factor.
0
0.05
0.1
0.15
0.2
70 80 90 100
Usage Factor (%)
Thro
ughp
ut (T
rans
/sec
)
scan
27
Prefetching DB2 UDB v7.1 on
Windows 2000
Throughput increases up to a certain point when prefetching size increases.
0
0.05
0.1
0.15
0.2
32Kb 64Kb 128Kb 256Kb
Prefetching
Thro
ughp
ut (T
rans
/sec
)
scan
28
Summary
In this module, we have covered: A review of OS from the DBMS
perspective How to optimise OS-related
parameters and options Thread Buffer, and File system