Upload
msr61
View
19
Download
0
Embed Size (px)
Citation preview
Oracle column level compression
IT Tips by Burleson ConsultingSeptember 5, 2009
Column oriented data storage for Oracle
In traditional relational theory, the internal representation of the data on the physical blocks is not supposed to matter, but in the real world, the placement of the data on blocks is critical. Oracle provides tools like sorted hash clusters to group related rows together and row-sequencing can dramatically improve the performance of SQL queries by placing all information on a single data block.
Using Oracle cluster table, you can even group related tables together on the same data block. For example, if you have a busy OLTP database where millions of people query customer and related order rows all day long:
An un-clustered table has high I/O overhead
If you use Oracle cluster tables to put the customer and order rows together on a single data block, greatly reducing the number of trips to the database to fetch the desired result set.
The placement of rows on physical data blocks makes a huge performance difference
See my note here on Oracle 11g table compression and note the tradeoff between run-time performance of the SQL, vs. the processing overhead of compressing and de-compressing the rows.
Source: Oracle Corporation
The point is that this type of compression requires overhead, and the less volatile the table, the better the overall performance.
11gR2 column level data storage for Exadata servers
These column oriented database have another significant advantage, because they store adjacent column data, they can use compression algorithms to detect patterns in the columns and achieve very high rates of data compression. This packs more data onto each data block, making data warehouse queries run even faster.
But what about applications like DSS that want data stored with related columns on the data blocks? While traditional database system wants to group related data items together, data warehouse applications prefer to see related columns of data grouped together on the data blocks.
In a OLAP or DSS system, we are analyzing "facts", individual column values, and we rarely needs row-oriented data display. Hence, it does not make sense to physically store data in rows format. Instead, we break-apart the rows and store the column values adjacent to one another on the data blocks.
Oracle guru Guy Harrison of Quest Software has this great illustration of rows storage vs. column storage on Oracle data blocks:
Source: Guy Harrison
This is very foreign to most relational databases, and column oriented database are similar to indexes, in that they store ROWID and column value pairs together on the data blocks. These column oriented databases were first implemented of guru’s like professor Michael Stonebreaker at MIT with their C-store column-oriented database.
While the 11g compression is row-level compression we now see a Oracle 11g Release 2 new feature called 11g column-level compression.
This option is only available on the Oracle/HP Exadata storage devices, million dollar firmware disks that are optimized for data warehouse data.
The column level compression for a table has syntax like this:
create table tot_sales_compressedcompress for archive level=3 asselect sales_factsfrom sales s;
Note the archive level syntax. According to the Oracle documentation, the archive level arguments s[specifies the amount of compression, and this is directly related to the processing overhead at compress-decompress time.
Oracle guru Guy Harrison of Quest Software performed a benchmark on the 11g Release 2 compression with these results:
Today, it appears that this new column-level data storage is only available on the million dollar Exadata storage boxes being sold by Oracle and HP. If you attempt to define column-level compression on a non Exadata server to get this error message:
ORA-64307: hybrid columnar compression is only supported in tablespaces residing on Exadata storage
Cause: An attempt was made to use hybrid columnar compression on unsupported
storage.
Action: Create this table in a tablespace residing on Exadata storage or use a different compression type.