Sunday 28 December 2014

The relationship between groom and nzbackup

There are three basic functions that every Netezza DBA must perform regularly:
  1. Ensure statistics are up to date
  2. Groom your tables
  3. Backup your production databases
Let’s focus on groom and its dependency on nzbackup.  I recently ran into an issue where groom was running every day — but none of the deletes/updates were getting removed.
When any backup operation is run, a new entry is created in _t_backup_history recording the type of backup (0 – full, 1 – differential, 2 – cumulative, 4 – schema only, etc), when and for what database. Another key piece of information captured is the backup operation’s transaction id.
Quick piece of background: every record in a Netezza system has four hidden columns:
  1. createxid is the transaction id of the operation that created the record
  2. deletexid is the transaction id of the operation that deleted the record.
  3. datasliceid is the data slice (disk) that the record resides on
  4. rowid  is the unique row identifier (unique to the entire system)
Netezza logically deletes records by populating the deletexid column with the delete operation’s transaction id. This is an instruction to the FPGA to not allow these records past, eliminating visibility to them completely.
So why is the last backup operation ID important?  This information is used by subsequent differential and cumulative backups to identify newly inserted records, deletes and updates. Records logically deleted since the last backup operation are recorded, ensuring a restore of that increment will result in those records properly being deleted.
Groom also uses this information to ensure that all logical deletes were recognized by a backup before it actually physically moves the row. Consider this example:
1. A table has 5 million rows in it when it gets backed up on Sunday night
2. A user deletes all of the rows (not truncate) on Monday morning
3. A groom operation is run

In this example none of the deletes will be physically removed and the table will not shrink.   To put it simply: a logically deleted record is eligible to be physically removed only if the deletexid is less than the last backup’s operation ID.  If it isn’t then groom will leave the record in place (by default).  You can override this behavior by adding ‘reclaim backupset none’  to your GROOM command.  This instructs groom to ignore any existing backup sets and physically remove any logical deletes.

No comments:

Post a Comment