May 6, 2011

Recover HCC Compressed Tables to non Exadata storage

Follow-up post from the 13th of May 2011


If you read my previous post, you might have noticed that I was using a non-partitioned table of a somewhat small size. The basic technique is there, but if you're managing bigger tables you'll probably want to do it with partitions and parallel DDL, to speed things up.

This raises a new challenge: you might not want to uncompress all your partitions, and those you decide to uncompress should be done in parallel. The sole purpose of this post is to show you how you can do it, assuming you want to uncompress all your HCC compressed partitions on the non-Exadata storage... and in parallel :-D

Now the steps

1. Create 4 different tablespaces to accommodate the four partitions of the table we're about to create

2. Create a table called BRAGA (3rd biggest city in Portugal, and finalists of the UEFA Europa League final next 18th of May at the Dublin Arena). This table is hash partitioned, with four partitions pointing to the four tablespaces created previously. The table will use HCC compression at the default level (QUERY HIGH), which uses the same algorithms and transformations as ARCHIVE LOW, but with a smaller Compression Unit (CU).

3. Direct-load the table with info from the Data Dictionary, creating 415805 rows (about 70MB of uncompressed data).

4. Check the space taken by the CUs: 2.75MB. A compression ratio around 25!

5. Gather stats for the table and its partitions

6. Check how the hashing algorithm spread the data across the four partitions. It should be balanced; I've written another post this year about this. Choose the column with the biggest cardinality to spread the data more evenly. You should see around 415805/4 ≈ 103951 rows in each partition.

7. Back up the tablespaces

8. Recover them back to non-Exadata storage, this time to a local disk on node 1 (/home/oracle/lmc). I was using only one instance to avoid having ORA-01157 errors blowing up in my face.

9. Check if the datafiles are in the expected place

10. Query table BRAGA and get error ORA-64307 thrown at you

11. Now comes the tricky bit! First you modify the compression status of each partition to NOCOMPRESS and then ...

12. ... you uncompress them at will and in parallel (after enabling parallel DDL at the session level). Each ALTER TABLE braga MOVE PARTITION is a parallel operation in itself, and the four statements can also run concurrently with each other to raise the bar (see the scheduler sketch further down)!

13. After the uncompression is done you'll see that the table is readable again and takes up 69MB!

14. To play around a bit I decided to compress the partitions again, this time with OLTP compression, and measure the ratio at the end. A nice one too.

Now the code
create tablespace knod1
 datafile '+DGDATA' size 100m
 autoextend on next 250m maxsize 5g;

create tablespace knod2
 datafile '+DGDATA' size 100m
 autoextend on next 250m maxsize 5g;

create tablespace knod3
 datafile '+DGDATA' size 100m
 autoextend on next 250m maxsize 5g;

create tablespace knod4
 datafile '+DGDATA' size 100m
 autoextend on next 250m maxsize 5g;
create table lmc.braga ( -- columns mimic dba_segments
 OWNER            VARCHAR2(30),
 SEGMENT_NAME     VARCHAR2(81),
 PARTITION_NAME   VARCHAR2(30),
 SEGMENT_TYPE     VARCHAR2(18),
 SEGMENT_SUBTYPE  VARCHAR2(10),
 TABLESPACE_NAME  VARCHAR2(30),
 HEADER_FILE      NUMBER,
 HEADER_BLOCK     NUMBER,
 BYTES            NUMBER,
 BLOCKS           NUMBER,
 EXTENTS          NUMBER,
 INITIAL_EXTENT   NUMBER,
 NEXT_EXTENT      NUMBER,
 MIN_EXTENTS      NUMBER,
 MAX_EXTENTS      NUMBER,
 MAX_SIZE         NUMBER,
 RETENTION        VARCHAR2(7),
 MINRETENTION     NUMBER,
 PCT_INCREASE     NUMBER,
 FREELISTS        NUMBER,
 FREELIST_GROUPS  NUMBER,
 RELATIVE_FNO     NUMBER,
 BUFFER_POOL      VARCHAR2(7),
 FLASH_CACHE      VARCHAR2(7),
 CELL_FLASH_CACHE VARCHAR2(7)
)
partition by hash (header_block)
partitions 4
store in (knod1, knod2, knod3, knod4)
compress for query high;

Direct-load it so HCC can kick in:
SQL> insert /*+ APPEND */ into lmc.braga
 2   select * from dba_segments;
415805 rows created.
select sum(bytes)/1024/1024 Megs
 from dba_extents
 where segment_name = 'BRAGA';

      MEGS
----------
      2.75
-- Collect Stats
 exec dbms_stats.gather_table_stats('LMC','BRAGA');
select partition_name, num_rows
 from user_tab_partitions
 where table_name='BRAGA';
PARTITION_NAME             NUM_ROWS
 ------------------------------ ----------
 SYS_P61985               101624
 SYS_P61986               104043
 SYS_P61987               103247
 SYS_P61988               106891

Nice distribution! This means we picked the right column to hash partition on.
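
By the way, you can confirm the compression attribute of each partition straight from the dictionary (COMPRESSION and COMPRESS_FOR are the 11gR2 columns):

select partition_name, compression, compress_for
 from user_tab_partitions
 where table_name = 'BRAGA';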

Now let's move it to non-Exadata storage:

1st) Backup.
You can back them up one by one or all with the same RMAN command, see my example:
RMAN> backup tablespace knod1;
Starting backup at 13-MAY-11
 using target database control file instead of recovery catalog
 allocated channel: ORA_DISK_1
 channel ORA_DISK_1: SID=693 instance=orcl1 device type=DISK
 channel ORA_DISK_1: starting full datafile backup set
 channel ORA_DISK_1: specifying datafile(s) in backup set
 input datafile file number=01006 name=+DGDATA/orcl/datafile/knod1.1153.750943521
 channel ORA_DISK_1: starting piece 1 at 13-MAY-11
 channel ORA_DISK_1: finished piece 1 at 13-MAY-11
 piece handle=+DGARCH/orcl/backupset/2011_05_13/nnndf0_tag20110512t120530_0.279.750945931 tag=TAG20110512T120530 comment=NONE
 channel ORA_DISK_1: backup set complete, elapsed time: 00:00:07
 Finished backup at 13-MAY-11
RMAN> backup tablespace knod2,knod3,knod4;
Starting backup at 13-MAY-11
 using channel ORA_DISK_1
 channel ORA_DISK_1: starting full datafile backup set
 channel ORA_DISK_1: specifying datafile(s) in backup set
 input datafile file number=01007 name=+DGDATA/orcl/datafile/knod2.1154.750943521
 input datafile file number=01008 name=+DGDATA/orcl/datafile/knod3.1155.750943523
 input datafile file number=01009 name=+DGDATA/orcl/datafile/knod4.1156.750943525
 channel ORA_DISK_1: starting piece 1 at 13-MAY-11
 channel ORA_DISK_1: finished piece 1 at 13-MAY-11
 piece handle=+DGARCH/orcl/backupset/2011_05_13/nnndf0_tag20110512t120633_0.278.750945995 tag=TAG20110512T120633 comment=NONE
 channel ORA_DISK_1: backup set complete, elapsed time: 00:00:07
 Finished backup at 13-MAY-11
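
Before restoring, you can sanity-check the backup pieces with RMAN (output omitted here):

RMAN> list backup of tablespace knod1, knod2, knod3, knod4;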

Now restore to "/home/oracle/lmc":
run {
 sql 'alter tablespace "KNOD1" offline immediate';
 sql 'alter tablespace "KNOD2" offline immediate';
 sql 'alter tablespace "KNOD3" offline immediate';
 sql 'alter tablespace "KNOD4" offline immediate';
 set newname for datafile '+DGDATA/orcl/datafile/knod1.1153.750943521' to '/home/oracle/lmc/knod1.dbf';
 restore tablespace 'KNOD1';
 set newname for datafile '+DGDATA/orcl/datafile/knod2.1154.750943521' to '/home/oracle/lmc/knod2.dbf';
 restore tablespace 'KNOD2';
 set newname for datafile '+DGDATA/orcl/datafile/knod3.1155.750943523' to '/home/oracle/lmc/knod3.dbf';
 restore tablespace 'KNOD3';
 set newname for datafile '+DGDATA/orcl/datafile/knod4.1156.750943525' to '/home/oracle/lmc/knod4.dbf';
 restore tablespace 'KNOD4';
 switch datafile all;
 recover tablespace 'KNOD1';
 recover tablespace 'KNOD2';
 recover tablespace 'KNOD3';
 recover tablespace 'KNOD4';
 sql 'alter tablespace "KNOD1" online';
 sql 'alter tablespace "KNOD2" online';
 sql 'alter tablespace "KNOD3" online';
 sql 'alter tablespace "KNOD4" online';
 }

After a while everything is back to non-Exadata storage:
[oracle@db01 lmc]$ ll
 total 410144
 -rw-r----- 1 oracle oinstall 104890368 May 13 18:12 knod1.dbf
 -rw-r----- 1 oracle oinstall 104890368 May 13 18:13 knod2.dbf
 -rw-r----- 1 oracle oinstall 104890368 May 13 18:13 knod3.dbf
 -rw-r----- 1 oracle oinstall 104890368 May 13 18:13 knod4.dbf
 [oracle@db01 lmc]$ pwd
 /home/oracle/lmc
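
You can also cross-check the new locations from inside the database (step 9 above):

select file_name, online_status
 from dba_data_files
 where tablespace_name like 'KNOD%';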

Let's try to query the table:
SQL> select count(*)
 2  from lmc.braga;
 from lmc.braga
 *
 ERROR at line 2:
 ORA-64307: hybrid columnar compression is only supported in tablespaces
 residing on Exadata storage

This is fine! Now let's uncompress it in parallel:
SQL> ALTER SESSION ENABLE PARALLEL DDL;
Session altered.
SQL> alter table braga modify partition SYS_P61985 nocompress;
Table altered.
SQL> alter table braga modify partition SYS_P61986 nocompress;
Table altered.
SQL> alter table braga modify partition SYS_P61987 nocompress;
Table altered.
SQL> alter table braga modify partition SYS_P61988 nocompress;
Table altered.
SQL> select sum(bytes)/1024/1024 Megs
  2  from dba_extents
  3  where segment_name = 'BRAGA';

      MEGS
----------
      2.75
SQL> select count(*)
 2  from braga;
 from braga
 *
 ERROR at line 2:
 ORA-64307: hybrid columnar compression is only supported in tablespaces
 residing on Exadata storage
Still can't!!!!!!! That's expected: MODIFY PARTITION ... NOCOMPRESS only changes the compression attribute for future inserts; the existing Compression Units stay HCC until each partition is rebuilt with MOVE.
SQL> alter table braga move partition SYS_P61985 parallel;
Table altered.
SQL> alter table braga move partition SYS_P61986 parallel;
Table altered.
SQL> alter table braga move partition SYS_P61987 parallel;
Table altered.
SQL> alter table braga move partition SYS_P61988 parallel;
Table altered.
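
A side note: in this transcript the four MOVEs run one after another in a single session (each one internally parallel). If you also want them to run concurrently with each other, one option is to submit each MOVE as its own scheduler job; a rough sketch (the MOVE_ job names are made up for illustration):

begin
  -- one scheduler job per partition, so the four rebuilds run concurrently
  for p in (select partition_name
            from user_tab_partitions
            where table_name = 'BRAGA') loop
    dbms_scheduler.create_job(
      job_name   => 'MOVE_' || p.partition_name,
      job_type   => 'PLSQL_BLOCK',
      job_action => 'begin execute immediate ''alter table braga move partition '
                    || p.partition_name || ' parallel''; end;',
      enabled    => true);
  end loop;
end;
/

Back to the single-session transcript: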
SQL> select count(*)
 2  from braga;
COUNT(*)
 ----------
 415805
SQL> select sum(bytes)/1024/1024 Megs
  2  from dba_extents
  3  where segment_name = 'BRAGA';

      MEGS
----------
        69

Wow! Looks like the ratio is 69/2.75 ≈ 25, or in orageek language:
SQL> select round(69/2.75) Ratio from dual;
RATIO
 ----------
 25

Now that we've been kicked out of Exaland, we can only use OLTP compression. Let's apply it to all four partitions:
alter table braga modify partition SYS_P61985
 compress for oltp;
alter table braga modify partition SYS_P61986
 compress for oltp;
alter table braga modify partition SYS_P61987
 compress for oltp;
alter table braga modify partition SYS_P61988
 compress for oltp;
-- Gosh I was dumb not naming the partitions :-(
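
Since I mentioned it: a minimal sketch of how the partitions could have been named at creation time (table and partition names here are illustrative), which spares you the SYS_Pnnnnn lookups:

create table lmc.braga_named (
 header_block number,
 owner        varchar2(30)
)
partition by hash (header_block)
( partition braga_p1 tablespace knod1,
  partition braga_p2 tablespace knod2,
  partition braga_p3 tablespace knod3,
  partition braga_p4 tablespace knod4
)
compress for query high;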

Now rebuild them:
alter table braga move partition SYS_P61985;
 alter table braga move partition SYS_P61986;
 alter table braga move partition SYS_P61987;
 alter table braga move partition SYS_P61988;
SQL> select sum(bytes)/1024/1024 Megs
  2  from dba_extents
  3  where segment_name = 'BRAGA';

      MEGS
----------
        16

Still a good ratio:
SQL> select round(69/16) Ratio from dual;
RATIO
 ----------
 4

Hope this was of some use.

LMC


Older post from the 6th of May 2011



Is this for me? (always helpful for busy people)

As many of you know by now, Exadata comes with an exclusive way to compress information, organized by columns and stored in a different unit called the "Compression Unit", which goes by the name of Hybrid Columnar Compression (HCC). The only storage in the world that can read this compressed information is the Exadata Storage Server. So if your main production database lives inside an Exadata box and you want to duplicate it to non-Exadata storage (for instance a SAN), then this article is for you.

Stuff you need to know first:

- As of this day, the GoldenGate development teams haven't been able to capture information from compressed tables (HCC or any other Oracle compression algorithm), but they are working on it

- If you use a standby database (Data Guard), either active or in mount mode, you need to do these steps first, before you start synchronizing

- HCC is a very good compression technique. I've seen hundreds of examples, and very few of them fail to reach a 10-to-20 average compression ratio. So you need 10 to 20 times the space on your non-Exadata storage before you consider the following steps. The only way to compensate for that amount of space is to "explode" the HCC tables into OLTP compressed tables. I'll show you that too, with a back-of-envelope sizing sketch right below.
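
To make the space math concrete, here's a rough sizing query (the 15x ratio is an assumed average; plug in whatever you measure on your own segments):

select segment_name,
       round(sum(bytes)/1024/1024)    hcc_mb,
       round(sum(bytes)*15/1024/1024) estimated_uncompressed_mb
 from dba_segments
 where segment_name = 'LISBOA'
 group by segment_name;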

The Steps

These steps were done inside an Exadata X2-2 Full Machine on a server pool with 4 nodes, but can be reproduced on any other Exadata V2/X2 Machine.

The following example will:

1. Create a tablespace called "knod"

2. Create a table "lisboa" inside that tablespace, with HCC compression level 2 (Query High). The table is loaded during creation from the view DBA_EXTENTS to create some volume (28674 rows).

3. Backup the tablespace "knod" with RMAN to the FRA that resides on the ASM diskgroup called "RECO" (inside Exadata).

4. Restore the tablespace from Exadata to a non-Exadata storage, using RMAN.

5. Query the table and get the ORA-64307 Error.

6. Try to move the table back to a tablespace inside Exadata (called "tbs_with_hcc") and see that it can't be read either, throwing ORA-64307 again.

7. Try DML on the table and watch that only the INSERT statement works: you can add data, you just can't read what's already there.

8. To put you out of your misery, the table is decompressed (it could be recompressed with OLTP compression at this stage too)

9. And now you can query or run DML on the table and everything works normally (the table was decompressed from its HCC format).

10. Finally we count the bytes it takes up (3 MB), OLTP compress it and count again (0.625 MB), showing the table ended up almost 5 times smaller.

Main Conclusion: you don't have to "explode" your HCC tables and then OLTP-compress them; you can go *directly* from HCC on Exadata to OLTP compression on non-Exadata storage, as the sketch below shows.
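
In other words, once the datafiles are on non-Exadata storage, the direct route is a single rebuild (the script below walks the longer two-step route first, and ends with exactly this statement):

alter table lisboa move compress for oltp;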

The Script
SQL> create tablespace knod datafile '+DATA_DM01' size 1g;

Tablespace created.

SQL> create table lisboa
  2  tablespace knod
  3  compress for query high
  4  as
  5  select * from dba_extents
  6  ;

Table created.

Elapsed: 00:00:12.00
SQL> select count(*) from lisboa;

  COUNT(*)
----------
     28674

RMAN> backup tablespace knod;

Starting backup at 23-MAR-11
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00009 name=+DATA/orcl/datafile/knod.3800.746570399
channel ORA_DISK_1: starting piece 1 at 23-MAR-11
channel ORA_DISK_1: finished piece 1 at 23-MAR-11
piece handle=+RECO/orcl/backupset/2011_03_23/nnndf0_tag20110323t204617_0.366.746570777 tag=TAG20110323T204617 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 23-MAR-11

---

run {
sql 'alter tablespace "KNOD" offline immediate';
set newname for datafile '+DATA/orcl/datafile/knod.3799.746573393' to '/home/lmc/rec/knod.dbf';
restore tablespace 'KNOD';
switch datafile all;
recover tablespace 'KNOD';
sql 'alter tablespace "KNOD" online';
}

---

SQL> desc lisboa
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OWNER                                              VARCHAR2(30)
 SEGMENT_NAME                                       VARCHAR2(81)
 PARTITION_NAME                                     VARCHAR2(30)
 SEGMENT_TYPE                                       VARCHAR2(18)
 TABLESPACE_NAME                                    VARCHAR2(30)
 EXTENT_ID                                          NUMBER
 FILE_ID                                            NUMBER
 BLOCK_ID                                           NUMBER
 BYTES                                              NUMBER
 BLOCKS                                             NUMBER
 RELATIVE_FNO                                       NUMBER

SQL> select count(*) from lisboa;
select count(*) from lisboa
*
ERROR at line 1:
ORA-64307: hybrid columnar compression is only supported in tablespaces
residing on Exadata storage

SQL> alter table lisboa move tablespace tbs_with_hcc;
alter table lisboa move tablespace tbs_with_hcc
            *
ERROR at line 1:
ORA-64307: hybrid columnar compression is only supported in tablespaces
residing on Exadata storage

SQL> insert into lisboa select * from dba_extents where rownum < 2;

1 row created.

SQL> commit;

SQL> update lisboa
  2  set owner = owner;
update lisboa
       *
ERROR at line 1:
ORA-64307: hybrid columnar compression is only supported in tablespaces
residing on Exadata storage

SQL> delete lisboa where owner = 'KUS';
delete lisboa where owner = 'KUS'
       *
ERROR at line 1:
ORA-64307: hybrid columnar compression is only supported in tablespaces
residing on Exadata storage

------------------- BUT !!!!!!!!!!!!! ----------------------
If you ALTER TABLE ... MOVE it with NOCOMPRESS, things will be different:

SQL> alter table lisboa move nocompress;

Table altered.

SQL> select count(*) from lisboa;

  COUNT(*)
----------
     28675

-- Now I can run all the DML I want:

SQL> r
  1* update lisboa set owner = 'USK' where owner = 'SYSMAN'

1951 rows updated.

SQL> commit;

Commit complete.

SQL> delete lisboa where owner = 'USK';

1951 rows deleted.

SQL> commit;

Commit complete.

-- ****************************************************************
SQL> select sum(bytes)/1024/1024 from dba_extents where segment_name = 'LISBOA';
SUM(BYTES)/1024/1024
--------------------
                   3

SQL> alter table lisboa move compress for oltp;

Table altered.

SQL>  select sum(bytes)/1024/1024 from dba_extents where segment_name = 'LISBOA';

SUM(BYTES)/1024/1024
--------------------
                .625

-- **************************************************************

---

Feel free to bash or inquire.

LMC

Comments:

  1. >The only storage in the world that can read this compressed information is Exadata Storage Server.

    Hi Luis,

    Actually this is wrong. Any Oracle 11gR2 database instance on the planet can process HCC data. Oracle management decided to put a tiny little piece of code in the database server to force Exadata storage. That is, HCC is a totally intrinsic Oracle database feature, but it is disabled unless the storage is Exadata. Anything that functions with cell offload processing disabled is intrinsic. If you disabled cell offload processing you'd see no failures dealing with HCC on your Exadata configuration. This fact had all the beta customers for HCC pretty upset. None of them had Exadata, but they performed their beta roles just fine.
  2. Thanks for your comment Kevin. Always a pleasure to "have" you around here.
    I must say a few words begging to differ, using a metaphor from the auto industry.
    My car is a Mercedes C200 and it has exactly the same engine as a C220, but somehow Daimler-Benz, by some electronic means, decided to cap the horsepower of this model to make it cheaper (for the likes of me!). My car has 134 horsepower (hp) and I just can't go out on the street and claim it has 168 hp, which is the power of the C220 model that (I repeat) has *exactly* the same engine.
    So this capping exists for commercial reasons, and I have to accept that.

    What you're saying is something Oracle customers will have to accept: buying Oracle software and hardware together has to bring much bigger advantages, although most of them will obviously know that some of the features are capped this way.

    Please feel free to drop by my humble blog anytime.

    LMC.
  3. Hi Luis,

    You've pretty much hit the nail on the head regarding the fact that it is just a synthetic limit put in place. I just wanted to point out that it is a totally synthetic limit on non-Exadata systems.

    Your analogy could be modified a bit to be more apt though. Allow me to explain. See, a licensee of Oracle Database 10g gets 11g as a free upgrade. If you were one of those beta customers running 10g and 11g (beta) side by side, with the 11g side using HCC (and no Exadata in sight, mind you), you'd be all excited to upgrade to 11g. However, Oracle came in after the fact (after the beta was over) and applied the horsepower cap, to use your analogy. That is, I assert that the auto analogy would be more correct if enforcers from Mercedes came in and capped the horsepower *after* you already *owned* the car, had been driving it, and had been providing feedback to Mercedes engineering on field capability and so forth.

    But that's just a pet peeve of mine. In fact, I recall sharing it with some of you volcano mates last year.
  4. Kevin,

    I see you're waving the flag of a few possibly angry beta customers, but even they have to cope with the fact that it makes a lot of sense to cap it. From where I stand, I've seen loads of customers in my career that would rather not use some great features, even when those were free. Take ASM, for instance. Putting HCC in a position of value makes a lot more sense, I think. You even have paid features, or options, that people simply don't use because there's no "value culture" around them (the OLAP Option is the example that pops to mind).
    So making HCC free, but only inside an engineered solution, and getting back to my analogy, would be like unlocking the horsepower in my car if I bought more than one. Take the Ferrari FXX: you pay 1.8 million dollars and Ferrari will only let you drive it on a racing track!
    Not everybody is ready for extreme speed ;-)

    LMC
  5. Hi Luis,

    I think we did the thread justice here. I value your opinion, my friend. Hope we can beer (it is a verb) again some time. Keep up the good work.
  6. Great article; still not obsolete at all after 3.5 years!

    I have a few questions though:
    1. When you say "restore tablespace from Exadata to a non-Exadata storage", do you mean you have some non-Exadata storage attached to your Exadata? I thought that was not supported?
    2. Is there any way to restore HCC-compressed tables from RMAN backups to an environment without any Exadata servers?
  7. 1. No, just the same network; not physically attached. Recovery purposes only.
    2. Yes (in 2015), with the ZFS Backup Appliance.