Girish Srivastava: 2014

Sunday 28 December 2014

NZ_MIGRATE utility

NZ_MIGRATE utility is used to migrate data from source database to the target database. All the tables mentioned in the tablelist parameter are migrated to the target database. There are options to create target tables or truncate target tables before migrating the data. NZ_Migrate utility does the migration by running many parallel threads and hence is faster than nzload.

nz_migrate command usage is below:

./nz_migrate -shost <source_host> -thost <target_host> -sdb <source_database> -tdb <targetdatabase> -suser <source_user> -spassword <source_password> -tuser <target_user> -tpassword <target_password> -t <table1, table2, ...> -cksum fast -genStats Full -TruncateTargetTable YES >> $log

nzbackup vs nz_backup

There are 2 commands to perform backup of Netezza. One is nzbackup. The other one is nz_backup. For nzbackup, not much you can play with it except you need to specify the database to backup. However, nz_backup provides more command options eg, specific table to backup, include where clause etc. Here's the command options for nz_backup.

Usage: nz_backup -dir <dirname> -format <ascii|binary|gzip> [ optional args ]
-or-
nz_restore -dir <dirname> -format <ascii|binary|gzip> [ optional args ]

Purpose: To backup (or restore) one or more tables.

An nz_backup must be run locally (on the NPS host being backed up).

An nz_restore can be used to restore data into a remote NPS host. Just
include the "-host" switch, or set the NZ_HOST environment variable.

Note: When doing an "nz_restore -format binary -host <xxx>", the script
issues an "nzload -compress true". This nzload feature only exists
in nzload as of NPS 4.6. If you want to do this on an older version
of NPS (4.0 or 4.5) then:
o Install a copy of the 4.6 client toolkit somewhere on your box
(it can be used against the older server releases)
o Add it's bin directory to the start of your search PATH
o Then invoke nz_restore

These scripts can process a single table, multiple tables, or an entire database.

The data format that is used can be either
ascii -- which is very portable.
binary -- which is the database's compressed external table format. This
is much faster, and results in significantly smaller backup sets.
gzip -- ascii, which is gzip'ed on the NPS host.

The data can be written to (or read from) disk files or named pipes.
If you use named pipes, another application is used to consume
(or produce) the data.

These scripts just concern themselves with the DATA itself. When backing up
a table, the DDL is not included. When restoring a table, the script expects
the table to already exist. It will not create it. It will not truncate it
(so if the table currently has any data in it, that data will be left untouched
by this script).

To backup tables requires the following permissions:

GRANT SELECT ON _VT_HOSTTXMGR TO <user/group>;
GRANT SELECT ON _VT_HOSTTX_INVISIBLE TO <user/group>;
GRANT SELECT ON _VT_DBOS_CONNECTION TO <user/group>;
--To obtain information about transactions

GRANT LIST ON <DATABASE|dbname> TO <user/group>;
--The user must have access to the database that contains the tables

GRANT SELECT ON <TABLE|tablename> TO <user/group>;
--The user must have access to the tables themselves, and their data

GRANT CREATE EXTERNAL TABLE TO <user/group>;
--The user must be able to create external tables, into which the
--data will be unloaded

To restore/reload a table requires an additional permission:

GRANT INSERT ON <TABLE|tablename> TO <user/group>;
--The user must be able to insert (i.e., reload) data back into the tables

Options: REQUIRED ARGUMENTS
==================

-dir <dirname> [...]

The full path to the directory in which the data files will be written
to (or read from). This directory must already exist and permit write
access to it. The directory name can be as meaningful as you wish to
make it.

If you are running this script as some linux user other than 'nz', please
note that it will actually be one of the 'nz' processes that writes the
data into this directory. So linux user 'nz' must also have write access
to it. If you are using named pipes (rather than files) then this is the
directory where the named pipes will be created.

Examples:
-dir /backups/backup_set_17
-dir /snap_storage/bu/customer_db/2006_11_18
-dir /tmp

If desired, you may split the backup files up across multiple directories/
file systems. Each thread can be associated with a separate "-dir <dirname>"
by specifying them on the command line. If you use this feature, then the
number of directories specified must match the number of threads.

-format <ascii|binary|gzip>

Identifies the format to be used for the output files.

ascii Universal in nature, but typically results in larger files and
slower performance.

binary The database's compressed external table format.

gzip ascii, which is then compressed (using gzip). By definition,
compressing and decompressing data uses up a lot of CPU cycles (i.e.,
it takes a long time). When using the binary format (compressed/
external), the work is done in parallel across all of ths SPUs ... so
it is very quick. But this option uses the NPS host to gzip/gunzip the
data. You will (almost always) want to use multiple threads in order
to get more of the host's SMP processors involved in order to speed
things along. The sweet spot seems to be about 8 threads, though you
can certainly use a larger/smaller number if you want to break the
backup files up into more/fewer pieces.

best --to-- least
====== ====== ======
Speed: binary ascii gzip
Size: gzip binary ascii
Universality: ascii gzip binary

OPTIONAL ARGUMENTS
==================

-h or -? or --help display this help

-[rR]ev or -V print the software revision of this program

-host <name/IP> host name or IP address [NZ_HOST]
-u <user> user name [NZ_USER]
-pw <string> password [NZ_PASSWORD]
-db <database> database name [NZ_DATABASE]

-t <tablename> [...]

# Table(s) within the database to be processed. If none are specified,
# then all tables in the database will be processed.
#
# If you have a file that contains a list of tablenames to be backed up,
# (separated by spaces and/or newlines) it can be used via the following
# syntax:
#
# -t `cat /tmp/the_list_of_tables`

-threads <nnn>

# Breaking the backup into multiple threads (per table) can increase the
# overall thruput, especially for large tables. This will also have the
# effect of creating smaller backup files, since each one will now be
# broken up into multiple pieces.
#
# By default, only a single thread will be used. You can specify a number
# from 1..31. Whatever value you specify for the backup must also be used
# for the restore. In general, the sweet spot seems to be about 6 threads.

-script <scriptname>

# Instead of backing up (or restoring) the data from disk files, you can use
# named pipes -- allowing another application to consume (or produce) the data
# on the fly. To use named pipes, specify that application/script here. The
# script will be automatically invoked and passed (as arg 1) the fully rooted
# pathname of the named pipe that is it supposed to use.
#
# For example scripts, see the file(s)
# nz_backup.script_example
# nz_restore.script_example

-whereclause <clause>

# Normally, nz_backup is used to backup the entire contents of a table --
# all visible rows. This option allows "you" to tack on a WHERE clause
# to the data that gets selected ... allowing "you" to backup a subset
# of the table. All the power (and responsibility) is put into your
# hands. Do wrap the clause in double quotes so it will be passed into the
# script correctly. Examples:
#
# -whereclause "customer_key = 2"
# -whereclause "customer_key in (1,3,5)"
# -whereclause "region_name = 'AMERICA' or region_key = 0"
# -whereClause "order_date between '1998-01-01' and '1998-12-31'"
#
# Because this clause gets applied to all tables being backed up, you would
# probably only want to backup a single table at a time (when using this
# clause) ... since the clause will typically contain column names that are
# specific to that table.
#
# This clause only applies to backups (not restores). Its use will be logged
# in the output of this script (as well as in the pg.log file).

-ignoreTxID

# Tables are individually backed up, one at a time. Since a backup may span
# many hours, this script insures that the backup represents a consistent
# point-in-time by using the transaction IDs attached to each row.
#
# This switch will override that feature ... and backs up each table with
# whatever data it contains when the backup (of that particular table) is
# kicked off. This insures that your backup will include ALL of the data
# in a table that has been committed (so you're not susceptible to long
# running or stale transactions).
#
# This switch is primarily of use with NPS 4.0. In later releases, this
# script is able to do things differently.

-dropdata

# This is for testing purposes only. As the name implies, the backup will
# be written to /dev/null, resulting in no backup at all. This is useful
# for testing the performance of the NPS components that are involved
# (SPUs/S-Blades ==> Host), while excluding the speed/overhead of your host
# storage.

-sizedata

# This is for testing purposes only. Like "-dropdata", but rather than
# sending the backup data directly to /dev/null it will first be piped
# thru "wc -c" in order to count the number of bytes in the backup
# stream (e.g., to provide you with actual sizing information). So it
# has the performance characteristics of "-dropdata" ... but provides
# you additional information.
#
# You can use "-format <ascii|binary>" and 1 or multiple "-threads <nn>"
# when using this switch.
#
# Each table will include the following pieces of information
#
# Info: source table size 80,740,352
# Info: backup file size 76,574,691
#
# And summary lines for the entire backup set will display the
#
# TOTAL source table size : 48,267,526,144
# TOTAL backup file size : 46,088,648,210
#
# The table size is whatever the table size is -- the amount of storage
# space it using on disk (as reported by nz_db_size or nz_tables). This
# script doesn't know/care if the data is compressed on disk (e.g, CTA0/1/2).
# Nor does the script know if there are any logically deleted rows in the
# table (taking up space in the table, but which would not be part of a
# backup data set).
#
# The backup size is the amount of storage that would be required if the
# backup data set was actually written to disk. This would represent either
# the ascii or the binary (compressed external table format) version of the
# data ... whatever you chose.
#
# To get sizing information for a full nzbackup, e.g.
# nzbackup -db DBNAME -dir /tmp
# you would use a command line such as this
# nz_backup -db DBNAME -dir /tmp -format binary -sizedata

Outputs: Status/log/timing information will be sent to standard out ... and will
include information about any ERROR's that might be encountered.

Exit status: 0 = success, non-0 = ERROR's were encountered

Examples: $ nz_backup -format binary -dir /backupdir

$ nz_restore -format ascii -dir /tmp -db my_db -t table1 -script /tmp/my_script

Comparison: nzbackup/nzrestore nz_backup/nz_restore
================== ====================
NPS CLI Utility Add-on Script
Backup Type
Full X X
-differential X
-cumulative X

Granularity
Entire Database X X
Individual Table(s) nzrestore X

Formats Supported
Compressed Binary X X
Ascii X
Ascii gzip'ed X

Output Destination
Veritas (NPS 4.0) X
Tivoli (NPS 4.6) X
Disk X X
Named Pipes X

Multi-Stream Support nzbackup (6.0) X

DDL included as part X Use the nz_ddl* scripts
of the backup set

The nzreclaim command

Use the nzreclaim command to recover disk space that is used by updated or deleted data by using the GROOM TABLE command.

Note: Starting in release 6.0, the SQL GROOM TABLE command replaces the nzreclaim command. The nzreclaim command is now a “wrapper” that calls the GROOM TABLE command to reclaim space. if you have existing scripts that use the nzreclaim command, those scripts continue to run, although some of the options might be deprecated since they are not used by GROOM TABLE. You should use the GROOM TABLE command in your scripts.

The relationship between groom and nzbackup

There are three basic functions that every Netezza DBA must perform regularly:

Ensure statistics are up to date
Groom your tables
Backup your production databases

Let’s focus on groom and its dependency on nzbackup. I recently ran into an issue where groom was running every day — but none of the deletes/updates were getting removed.
When any backup operation is run, a new entry is created in _t_backup_history recording the type of backup (0 – full, 1 – differential, 2 – cumulative, 4 – schema only, etc), when and for what database. Another key piece of information captured is the backup operation’s transaction id.
Quick piece of background: every record in a Netezza system has four hidden columns:

createxid is the transaction id of the operation that created the record
deletexid is the transaction id of the operation that deleted the record.
datasliceid is the data slice (disk) that the record resides on
rowid is the unique row identifier (unique to the entire system)

Netezza logically deletes records by populating the deletexid column with the delete operation’s transaction id. This is an instruction to the FPGA to not allow these records past, eliminating visibility to them completely.
So why is the last backup operation ID important? This information is used by subsequent differential and cumulative backups to identify newly inserted records, deletes and updates. Records logically deleted since the last backup operation are recorded, ensuring a restore of that increment will result in those records properly being deleted.
Groom also uses this information to ensure that all logical deletes were recognized by a backup before it actually physically moves the row. Consider this example:
1. A table has 5 million rows in it when it gets backed up on Sunday night
2. A user deletes all of the rows (not truncate) on Monday morning
3. A groom operation is run

In this example none of the deletes will be physically removed and the table will not shrink. To put it simply: a logically deleted record is eligible to be physically removed only if the deletexid is less than the last backup’s operation ID. If it isn’t then groom will leave the record in place (by default). You can override this behavior by adding ‘reclaim backupset none’ to your GROOM command. This instructs groom to ignore any existing backup sets and physically remove any logical deletes.

Saturday 1 November 2014

Sakila DB Installation in MySql

The Sakila sample database is available from http://dev.mysql.com/doc/index-other.html. A downloadable archive is available in compressed tar file or Zip format. The archive contains three files: sakila-schema.sql, sakila-data.sql, and sakila.mwb.
The sakila-schema.sql file contains all the CREATE statements required to create the structure of the Sakila database including tables, views, stored procedures, and triggers.
The sakila-data.sql file contains the INSERT statements required to populate the structure created by the sakila-schema.sql file, along with definitions for triggers that must be created after the initial data load.
The sakila.mwb file is a MySQL Workbench data model that you can open within MySQL Workbench to examine the database structure.

To install the Sakila sample database, follow these steps:

Extract the installation archive to a temporary location such as C:\temp\ or /tmp/. When you unpack the archive, it creates a directory named sakila-db that contains the sakila-schema.sql and sakila-data.sql files.
Connect to the MySQL server using the mysql command-line client with the following command:
shell> mysql -u root -p

Enter your password when prompted. A non-root account can be used as long as the account has privileges to create new databases.

Execute the sakila-schema.sql script to create the database structure by using the following command:
mysql> SOURCE C:/temp/sakila-db/sakila-schema.sql;

Replace C:/temp/sakila-db with the path to the sakila-schema.sql file on your system.

· Execute the sakila-data.sql script to populate the database structure with the following command:

mysql> SOURCE C:/temp/sakila-db/sakila-data.sql;

Replace C:/temp/sakila-db with the path to the sakila-data.sql file on your system.

· Confirm that the sample database is installed correctly. Execute the following statements. You should see output similar to that shown here.

mysql> USE sakila;

Database changed

mysql> SHOW TABLES;

+----------------------------+

| Tables_in_sakila |

+----------------------------+

| actor |

| address |

| category |

| city |

| country |

| customer |

| customer_list |

| film |

| film_actor |

| film_category |

| film_list |

| film_text |

| inventory |

| language |

| nicer_but_slower_film_list |

| payment |

| rental |

| sales_by_film_category |

| sales_by_store |

| staff |

| staff_list |

| store |

+----------------------------+

22 rows in set (0.00 sec)

mysql> SELECT COUNT(*) FROM film;

+----------+

| COUNT(*) |

+----------+

| 1000 |

+----------+

1 row in set (0.02 sec)

mysql> SELECT COUNT(*) FROM film_text;

+----------+

| COUNT(*) |

+----------+

| 1000 |

+----------+

1 row in set (0.00 sec)

Sunday 12 October 2014

IBM PureDataSystem for Analytics

IBM® PureData™ for Analytics, powered by Netezza technology, is changing the game for data warehouse appliances (again). Designed to help you unlock data’s true potential, the new IBM PureData System for Analytics is:

Big Data and Business Intelligence ready: In addition to the features and benefits of the data warehouse appliance itself, you now get the added benefit of additional software to exploit big data opportunities and leading business intelligence capabilities. Included with the data warehouse appliance are software licenses to get you started with IBM Cognos for business intelligence, IBM InfoSphere DataStage for data integration, IBM InfoSphere BigInsights to augment your data warehouse with Hadoop data services and IBM InfoSphere Streams for an introduction to real-time streaming analytics.
Advanced security at no extra cost: Enhanced data security is provided through self-encrypting drives on the IBM PureData System for Analytics N3001 models as well as support for the Kerberos authentication protocol.
An even broader family of appliance models: The IBM PureData System for Analytics N3001 is available in a range of models to fit a broad range of data capacity needs. New to the N3001 family is a rack mountable appliance to enable midsize organizations to take advantage of the same speed and simplicity provided by the larger models in the family. Also, an 8-rack configuration is available to provide petabyte+ capacity in a single system.

Get some overview idea about IBM® PureData™ for Analytics: IBM PureData System for Analytics

Learn EXT JS from Scratch...

This tutorial provides a tour of ExtJS. Here, you will learn the basics of this javascript framework and how to deal with core concepts as well as with UI components.
Go through the link bellow:
EXTJS