7
Oracle column level compression IT Tips by Burleson Consulting September 5, 2009 Column oriented data storage for Oracle In traditional relational theory, the internal representation of the data on the physical blocks is not supposed to matter, but in the real world, the placement of the data on blocks is critical. Oracle provides tools like sorted hash clusters to group related rows together and row-sequencing can dramatically improve the performance of SQL queries by placing all information on a single data block. Using Oracle cluster table, you can even group related tables together on the same data block. For example, if you have a busy OLTP database where millions of people query customer and related order rows all day long: An un-clustered table has high I/O overhead

Oracle Column Level Compression

  • Upload
    msr61

  • View
    19

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Oracle Column Level Compression

Oracle column level compression

IT Tips by Burleson ConsultingSeptember 5,  2009

Column oriented data storage for Oracle

In traditional relational theory, the internal representation of the data on the physical blocks is not supposed to matter, but in the real world, the placement of the data on blocks is critical. Oracle provides tools like sorted hash clusters to group related rows together and row-sequencing can dramatically improve the performance of SQL queries by placing all information on a single data block.

Using Oracle cluster table, you can even group related tables together on the same data block. For example, if you have a busy OLTP database where millions of people query customer and related order rows all day long:

An un-clustered table has high I/O overhead

If you use Oracle cluster tables to put the customer and order rows together on a single data block, greatly reducing the number of trips to the database to fetch the desired result set.

Page 2: Oracle Column Level Compression

The placement of rows on physical data blocks makes a huge performance difference

See my note here on Oracle 11g table compression and note the tradeoff between run-time performance of the SQL, vs. the processing overhead of compressing and de-compressing the rows.

Page 3: Oracle Column Level Compression

Source: Oracle Corporation

The point is that this type of compression requires overhead, and the less volatile the table, the better the overall performance.

11gR2 column level data storage for Exadata servers

These column oriented database have another significant advantage, because they store adjacent column data, they can use compression algorithms to detect patterns in the columns and achieve very high rates of data compression. This packs more data onto each data block, making data warehouse queries run even faster.

But what about applications like DSS that want data stored with related columns on the data blocks? While traditional database system wants to group related data items together, data warehouse applications prefer to see related columns of data grouped together on the data blocks.

In a OLAP or DSS system, we are analyzing "facts", individual column values, and we rarely needs row-oriented data display.  Hence, it does not make sense to physically store data in rows format.  Instead, we break-apart the rows and store the column values adjacent to one another on the data blocks.

Oracle guru Guy Harrison of Quest Software has this great illustration of rows storage vs. column storage on Oracle data blocks:

Page 4: Oracle Column Level Compression

 

Source:  Guy Harrison

This is very foreign to most relational databases, and column oriented database are similar to indexes, in that they store ROWID and column value pairs together on the data blocks. These column oriented databases were first implemented of guru’s like professor Michael Stonebreaker at MIT with their C-store column-oriented database.

While the 11g compression is row-level compression we now see a Oracle 11g Release 2 new feature called 11g column-level compression. 

This option is only available on the Oracle/HP Exadata storage devices, million dollar firmware disks that are optimized for data warehouse data. 

Page 5: Oracle Column Level Compression

The column level compression for a table has syntax like this:

create table tot_sales_compressedcompress for archive level=3 asselect sales_factsfrom sales s;

Note the archive level syntax. According to the Oracle documentation, the archive level arguments s[specifies the amount of compression, and this is directly related to the processing overhead at compress-decompress time.

Oracle guru Guy Harrison of Quest Software performed a benchmark on the 11g Release 2 compression with these results:

Today, it appears that this new column-level data storage is only available on the million dollar Exadata storage boxes being sold by Oracle and HP.  If you attempt to define column-level compression on a non Exadata server to get this error message:

ORA-64307: hybrid columnar compression is only supported in tablespaces residing on Exadata storage

Cause: An attempt was made to use hybrid columnar compression on unsupported

Page 6: Oracle Column Level Compression

storage.

Action: Create this table in a tablespace residing on Exadata storage or use a different compression type.