Monday, November 21, 2011

PostgreSQL 9.2 Volume 2 - Summary

4. SQL Syntax
Value Expression

Calling Function
Named notation is especially useful for functions that have a large number of parameters (makes the associations between parameters and actual arguments more explicit and reliable)
Positional notation, a function call is written with its argument values in the same order as they are defined in the function declaration (parameters can only be omitted from right to left)
CREATE FUNCTION concat_lower_or_upper(a text, b text, uppercase boolean DEFAULT false)
       RETURNS text AS $$
       SELECT CASE WHEN $3 THEN UPPER($1 || ' ' || $2) ELSE LOWER($1 || ' ' || $2) END;
Positional Notation
SELECT concat_lower_or_upper('Hello', 'World', true);
SELECT concat_lower_or_upper('Hello', 'World');
 hello world
Named notation
SELECT concat_lower_or_upper(a := 'Hello', b := 'World', uppercase := true);
SELECT concat_lower_or_upper(a := 'Hello', uppercase := true, b := 'World');
Mixed Notation
SELECT concat_lower_or_upper('Hello', 'World', uppercase := true);
5. Data Definition
Constraints - following example demonstrates different methods of constraints usage.
CREATE TABLE products (
CHECK (price > 0),
price numeric,
price1 numeric,
CONSTRAINT positive_price CHECK (price1 > 0),
CHECK (price > price1),
CHECK (discounted_price > 0 AND price > discounted_price),
product_no integer UNIQUE,
product_no integer CONSTRAINT must_be_different UNIQUE,
product_no integer REFERENCES products (product_no), -- foreign key
FOREIGN KEY (b, c) REFERENCES other_table (c1, c2),
product_no integer REFERENCES products ON DELETE RESTRICT,
order_id integer REFERENCES orders ON DELETE CASCADE);
RESTRICT prevents deletion of a referenced row. NO ACTION means that if any referencing rows still exist when the constraint is checked. CASCADE specifies that when a referenced row is deleted, row(s) referencing it should be automatically deleted as well.
Modifying Table
ALTER TABLE products ADD COLUMN description text; --adding a column
ALTER TABLE products DROP COLUMN description CASCADE; -- drop column
-- add constrain
ALTER TABLE products ADD CHECK (name <> ''); 
ALTER TABLE products ADD CONSTRAINT some_name UNIQUE (product_no); 
ALTER TABLE products ADD FOREIGN KEY (product_id) REFERENCES product_groups;
ALTER TABLE products ALTER COLUMN price SET DEFAULT 7.77; -- add default
ALTER TABLE products RENAME COLUMN prod_no TO product_no; -- rename column
Partitioning refers to splitting what is logically one large table into smaller physical pieces. Benefits are:
  • Query performance can be improved, when heavily accessed rows of table are in a single partition. The partitioning substitutes for leading columns of indexes, reducing index size and making it more likely that the heavily-used parts of the indexes fit in memory.
  • When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of sequential scan of that partition instead of using an index..
  • Bulk loads and deletes can be accomplished by adding or removing partitions, if that requirement is planned into the partitioning design. 
  • Seldom-used data can be migrated to cheaper and slower storage media.
Range Partitioning       
The table is partitioned into "ranges" defined by a key column or set of columns, with no overlap
List Partitioning        
The table is partitioned by explicitly listing which key values appear in each partition.

Combining Queries

The results of two queries can be combined using the set operations union, intersection, and difference.   
  • query1 UNION [ALL] query2 - eliminates duplicate rows from its result, in the same way as DISTINCT, unless UNION ALL is used  
  • query1 INTERSECT [ALL] query2 - returns all rows that are both in the result. Duplicate rows are eliminated unless INTERSECT ALL is used.  
  • query1 EXCEPT [ALL] query2 - returns all rows that are in the result of query1 but not in the result of query2. (This is sometimes called the difference between two queries.)

7.With Queries (Common Table Expressions)

WITH provides a way to write auxiliary statements for use in a larger query. These statements, which are often referred to as Common Table Expressions or CTEs, can be thought of as defining temporary tables that exist just for one query.

Data-Modifying Statements in WITH

You can use data-modifying statements (INSERTUPDATE, or DELETE) in WITH. This allows you to perform several different operations in the same query. An example is:
WITH moved_rows AS (
    DELETE FROM products
        "date" >= '2010-10-01' AND
        "date" < '2010-11-01'
INSERT INTO products_log
SELECT * FROM moved_rows;
This query effectively moves rows from products to products_log. The DELETE in WITH deletes the specified rows from products, returning their contents by means of itsRETURNING clause; and then the primary query reads that output and inserts it into products_log.
8. Data type

Chapter 9. Functions and Operators

9.3. Mathematical Functions and Operators
9.4. String Functions and Operators
9.5. Binary String Functions and Operators
9.7. Pattern Matching
9.8. Data Type Formatting Functions
9.9. Date/Time Functions and Operators
9.10. Enum Support Functions 
9.11. Geometric Functions and Operators
9.12. Network Address Functions and Operators
9.13. Text Search Functions and Operators
9.14. XML Functions
9.15. Sequence Manipulation Functions
9.16. Conditional Expressions

SELECT ... WHERE CASE WHEN x <> 0 THEN y/x > 1.5 ELSE false END;
The NULLIF function returns a null value if value1 equals value2; otherwise it returns value1. This can be used to perform the inverse operation of the COALESCE example given above:
SELECT NULLIF(value, '(none)')
9.17. Array Functions and Operators
9.18. Aggregate Functions
9.19. Window Functions 

9.22. Set Returning Functions
This section describes functions that possibly return more than one row.

SELECT * FROM generate_series(2,4);
SELECT * FROM generate_series(5,1,-2);

-- this example relies on the date-plus-integer operator
SELECT current_date + s.a AS dates FROM generate_series(0,14,7) AS s(a);

SELECT * FROM generate_series('2008-03-01 00:00'::timestamp, '2008-03-04 12:00', '10 hours');
 2008-03-01 00:00:00
 2008-03-01 10:00:00
 2008-03-01 20:00:00
 2008-03-02 06:00:00
 2008-03-02 16:00:00

9.23 System Information Functions
Session Information Functions

NameReturn TypeDescription
current_catalognamename of current database (called "catalog" in the SQL standard)
current_database()namename of current database
current_query()texttext of the currently executing query, as submitted by the client (might contain more than one statement)
current_schema[()]namename of current schema
current_schemas(boolean)name[]names of schemas in search path, optionally including implicit schemas
current_usernameuser name of current execution context
inet_client_addr()inetaddress of the remote connection
inet_client_port()intport of the remote connection
inet_server_addr()inetaddress of the local connection
inet_server_port()intport of the local connection
pg_backend_pid()intProcess ID of the server process attached to the current session
pg_conf_load_time()timestamp with time zoneconfiguration load time
pg_is_other_temp_schema(oid)booleanis schema another session's temporary schema?
pg_listening_channels()setof textchannel names that the session is currently listening on
pg_my_temp_schema()oidOID of session's temporary schema, or 0 if none
pg_postmaster_start_time()timestamp with time zoneserver start time
session_usernamesession user name
usernameequivalent to current_user
version()textPostgreSQL version information
9.24. System Administration Functions 

Thursday, November 17, 2011

Useful PDI steps' commands

Wild card expression
File Names Starts with 100047 and end with  DIS.DAT
${Internal.Job.Filename.Directory}  100047*DIS.DAT <

File names end up with xls extension
${Internal.Transformation.Filename.Directory}\matchdata    .*\.xls

Java scripts
Replace white spaces, concatenate strings and print fields with line breaks 
var Data_file;
var Date_time;
var header_line;
date_time = Date_time1.replace(" ", "-");
Data_file = '82246_data_extraction_' + Date_time1.replace(" ", "-") + '.xls';
header_line = '========================================== \r\n' + 
               'Date & Time  :' + date_time + '\r\n' + 'Output File   :' + Data_file + '\r\n'

If and replace function usage
var max_imported_last_call_new;
if (max_imported_last_call != null) 
       { max_imported_last_call_new=replace(max_imported_last_call, "/", "-"); }
else { max_imported_last_call_new='2000-01-01 00:00:00.0'; }

If and Data Type Conversion
var new_dispo_file_name;
var wave;
if (str2num(filename_date) == str2num(SERVER_DATE)){
new_dispo_file_name = XXXXXX + filename_date + 'CVAR' + num2str(str2num(wave) + 1) + 'DIS';}
else {
new_dispo_file_name = XXXXXX + SERVER_DATE + 'CVAR' + 0 + 'DIS';}

AND and OR conditions 

var fail_number = 0;
if (ADDR == null || CITY == null || ZIPCODE == null) {

if (x_salutation == null && (
(country=='BE' && language=='Dutch') ||
(country=='BE' && language=='French') ||
(country=='FR' && language=='French') ||
(country=='CH' && language=='Italian')  )) {

Date type (string to date conversion)
eoweek = str2date(date_val,"dd-MMM-yy");
quarter_no = quarter(eoweek);

var SurveyResults_filename = system_date + "_82430SurveyResults";
var SampleDisposition_filename = system_date + "_CumminsDispositionStatus";

var loaddatetime = system_date.substr(0,12) + "00";
var yesterday = dateAdd(today, "d", -1);
var dayb4yesterday = dateAdd(today, "d", -2);

var c_date = date2str(LAST_CALL,"MM/dd/yyyy") + " " + num2str(LAST_TIME/3600, "00") + ":" + (num2str(LAST_TIME%3600/60, "00")) + ":" + (num2str(LAST_TIME%60, "00")) ;

Regular exression
 if ( substr(timetype,0,4) == 'GMT+' || substr(timetype,0,4) == 'GMT-')
{   var re = /([-+])(\d+)(.?)(3?)/.exec(timetype);
var x = re[2].length;
if ( x == 1) {hr = '0' + re[2]; }
else {hr = re[2]; }
temptime = re[1] + hr + ':' + mins;

.*\.add$ - everything ends with ".add"

Monday, November 14, 2011

PostgreSQL 9.2 Volume 1

PostgreSQL is relational database management system (RDBMS).
Use client / server architecture. Server handles multiple concurrent connections from clients. It starts new process called "forks" for each new connection.

Create DB: $ createdb my_database           $ dropdb my_database
                  $ /usr/local/pgsql/bin/ createdb my_database

Access DB $ psql my_database
my_database => SELECT version();

Internal commands
my_database --> \h    - help
my_database --> \q    - quit from plsql to shell
my_database --> \i    - read command from specific file
my_database --> \b    - backspace
my_database --> \f    - form feed
my_database --> \n    - newline
my_database --> \r    - carriage return
my_database --> \t    - tab
my_database --> \o, \oo, \ooo (o = 0 - 7) octal byte value
my_database --> \xh, \xhh (h = 0 - 9, A - F)   hexadecimal byte value
my_database --> \uxxxx, \Uxxxxxxxx (x = 0 - 9, A - F)  16 or 32-bit hexadecimal Unicode character value

Create table
    code        char(5) PRIMARY KEY DEFAULT nextval('school_id_seq')
    vector  int[][], -- 2 dimension array,
    name   varchar(40) NOT NULL CHECK (name <> '')
    ddid     integer CHECK (did > 100),
    price numeric DEFAULT 9.99,
    date_prod   date,
    kind        varchar(10),
    len         interval hour to minute,
    modtime   timestamp DEFAULT current_timestamp, -- default value is current time
    CONSTRAINT production UNIQUE(date_prod),
    --CONSTRAINT code_title PRIMARY KEY(code,title)
Joining tables 
SELECT * FROM table1 a INNER JOIN table2 b ON ( =; 
SELECT * FROM table1 a LEFT OUTER JOIN table2 b ON ( =; 
SELECT * FROM table1 a RIGHT OUTER JOIN table2 b ON ( =;

3. Advanced Features
Transaction concept (all-or-nothing)
Transactions are a fundamental concept of all database systems. The essential point of a transaction is that it bundles multiple steps into a single, all-or-nothing operation. The intermediate states between the steps are not visible to other concurrent transactions, and if some failure occurs that prevents the transaction from completing, then none of the steps affect the database at all. A transaction is said to be atomic: from the point of view of other transactions, it either happens completely or not at all.

when multiple transactions are running concurrently, each one should not be able to see the incomplete changes made by others. To facilitate these iPostgreSQL, a transaction is set up by surrounding the SQL commands of the transaction with BEGIN and COMMIT commands

If, partway through the transaction, we decide we do not want to commit (perhaps bank balance went negative), we can issue the command ROLLBACK instead of COMMIT, and all our updates so far will be cancelled. if you are sure you won't need to roll back to a particular savepoint again, it can be released, so the system can free some resources. Keep in mind that either releasing or rolling back to a savepoint will automatically release all save-points that were defined after it.
UPDATE accounts SET balance = balance - 100.00 WHERE name = 'Alice';
SAVEPOINT my_savepoint;
UPDATE accounts SET balance = balance + 100.00 WHERE name = 'Bob';
-- oops ... forget that and use Wally's account
ROLLBACK TO my_savepoint;
UPDATE accounts SET balance = balance + 100.00 
    WHERE name = 'Wally';
Window function
window function performs a calculation across a set of table rows that are somehow related to the current row. Unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row. Behind the scenes, the window function is able to access more than just the current row of the query result. Not allowed within where clause. 
This shows how to compare each employee's salary with the average salary in his or her department.
SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary;
SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary;

depname  | empno | salary | rank 
 develop   |     8 |   6000 |    1
 develop   |    10 |   5200 |    2
 develop   |     9 |   4500 |    4
 develop   |     7 |   4200 |    5
 personnel |     2 |   3900 |    1
 personnel |     5 |   3500 |    2
 sales     |     1 |   5000 |    1
 sales     |     4 |   4800 |    2
OVER clause directly following the window function's name and argument(s). The OVERclause determines exactly how the rows of the query are split up for processing by the window function. The PARTITION BY list within OVERspecifies dividing the rows into groups, or partitions, that share the same values of the PARTITION BY expression(s).
When a query involves multiple window functions, it is possible to write out each one with a separate OVER clause, but this is duplicative and error-prone if the same windowing behavior is wanted for several functions. Instead, each windowing behavior can be named in a WINDOW clause and then referenced in OVER. For example:
SELECT sum(salary) OVER w, avg(salary) OVER w
  FROM empsalary


Inheritance is a concept from object-oriented databases. 
CREATE TABLE cities ( name text, population real, altitude int );
CREATE TABLE capitals ( state char(2))  INHERITS (cities);
On the other hand, the following query finds all the cities that are not state capitals and are situated at an altitude of 500 feet or higher:
SELECT name, altitude FROM ONLY cities WHERE altitude > 500;

Monday, November 7, 2011

Temporary table inside a PostgreSQL function

1. First, create a TYPE Object defining returning fields.
CREATE TYPE accounts.weekly_payroll_report_type AS(
batch integer,
project varchar,
language varchar,
intcode VARCHAR,
bcode integer,
description VARCHAR,
p_tot_hrs NUMERIC,
pay_amount NUMERIC );
2. Create the function
CREATE OR REPLACE FUNCTION accounts.weekly_payroll_report_indirect(integer)
RETURNS setof accounts.weekly_payroll_report_type
AS $$
int_payrate NUMERIC;
 cur_timecard CURSOR FOR select t.batch , t.language, t.project, t.intcode AS t_intcode,
FROM accounts.timecard_duration_view t
INNER JOIN task k ON (k.bcode = t.bcode  AND  UPPER(k.billable) = UPPER(t.billable))
WHERE t.batch = $1 AND UPPER(k.billable) = 'Y'
GROUP BY 1, 2, 3, 4, 5 ,6
cur_pay CURSOR FOR SELECT batch, project, intcode as p_intcode , sum_hours AS p_tot_hrs FROM accounts.payroll_hours_view p WHERE batch = $1; 
CREATE TEMPORARY TABLE temp_payroll_val(
tmp_batch INTEGER, tmp_project VARCHAR,
tmp_p_tot_hrs NUMERIC, tmp_pay_amount NUMERIC) on commit drop;
FOR rec in cur_pay
LOOP FOR rec1 in cur_timecard LOOP
IF (rec.batch::INT = rec1.batch::INT AND rec.project::TEXT = rec1.project::TEXT AND rec.p_intcode = rec1.t_intcode) THEN
SELECT SUM(sum_duration) INTO sum_duration_per_intcode FROM accounts.timecard_duration_view
WHERE batch = rec1.batch AND project = rec1.project AND intcode =  rec1.t_intcode ; SELECT r.amount INTO int_payrate FROM accounts.payroll pp  INNER JOIN accounts.ratecode r ON (UPPER(pp.pay_rate) = UPPER(r.pay_rate))
INSERT INTO temp_payroll_val VALUES (rec.batch, rec.project, rec1.language, rec1.t_intcode, rec1.t_bcode, rec1.description, ROUND(prorated_hrs, 2), (int_payrate * ROUND(prorated_hrs, 2)));
prorated_hrs = 0;
FOR tem_val IN select * FROM temp_payroll_val
RETURN NEXT tem_val;
2. Read the data from function         select * from accounts.weekly_payroll_report_indirect(201129);