Thursday, December 22, 2011

Useful PostgreSQL Commands

Update statement joining tables
update deduction set sum=sum+1 from employee e
where e.id=deduction.employeeid and e.groupid=2;

update hall_sample h set panel_id = s.panelid
from (select lower(trim(address)) as email, e.panelid from email e
inner join (select panelid from contact where statusid = 1 and typeid = 2) AS c
on e.panelid = c.panelid) AS s where lower(trim(h.email)) = lower(trim(s.email))

Find duplicates when more than one unique values
select s.studyid, s.sam_no, count(*) from table_name i
inner join table_name s on (i.studyid = s.studyid AND i.sam_no = s.sam_no)
group by 1,2 having count(*) > 1

SELECT key, count(*) FROM ibmcvar.disposition GROUP BY 1 HAVING count(*) > 1;

SELECT * FROM ibmcvar.disposition where key in (select key from (SELECT key, count(*)
FROM ibmcvar.disposition
GROUP BY 1 HAVING count(*) > 1) as t) order by key

Delete one from duplicate rows (minimum fatiguedate)
delete from ibmcvar_2012.disposition where (key, fatiguedate) in (

select key, min(fatiguedate) from ibmcvar_2012.disposition where key in
(select key from (select key, count(*) from ibmcvar_2012.disposition
group by 1 having count(*) > 1 ) as x )
group by 1 order by 2 )

Find records with latest update value
SELECT * FROM table a JOIN (SELECT ID, max(date) maxDate
FROM table GROUP BY ID) b ON a.ID = b.ID AND a.date = b.maxDate

Add sequence to existing column in existing table
First set the maximum number of the field as current value in sequence'

ALTER TABLE ccadmin.tblemployees ALTER COLUMN recordid SET DEFAULT nextval('ccadmin.tblemployees_sec'::regclass)

Select last 5 values for 2 columns as fields for selected column value(person, project)

select puin,
max(case when (row_number = 1) then job_number else null end ) as jobnumber1,
max(case when (row_number = 1) then web_stat_type else null end ) as stat_type1,
max(case when (row_number = 2) then job_number else null end ) as jobnumber2,
max(case when (row_number = 2) then web_stat_type else null end ) as stat_type2,
max(case when (row_number = 3) then job_number else null end ) as jobnumber3,
max(case when (row_number = 3) then web_stat_type else null end ) as stat_type3,
max(case when (row_number = 4) then job_number else null end ) as jobnumber4,
max(case when (row_number = 4) then web_stat_type else null end ) as stat_type4,
max(case when (row_number = 5) then job_number else null end ) as jobnumber5,
max(case when (row_number = 5) then web_stat_type else null end ) as stat_type5
from ( WITH summary AS (
select puin, called_or_mailed_date, job_number, stat_type as web_stat_type,
row_number() over (PARTITION BY puin order by called_or_mailed_date desc)
from reports.add_fields2_samples
where puin in ('C3CE780B-0E3F-4A0E-BA3A', 'EF9ADA63-2D5F-427C-989E')
group by 1,2,3,4 order by puin asc, called_or_mailed_date desc )
SELECT * FROM summary where row_number < 6 ) as a
group by 1

Monday, November 21, 2011

PostgreSQL 9.2 Volume 2 - Summary

4. SQL Syntax
Value Expression
http://developer.postgresql.org/pgdocs/postgres/sql-expressions.html

Calling Function

Named notation is especially useful for functions that have a large number of parameters (makes the associations between parameters and actual arguments more explicit and reliable)

Positional notation, a function call is written with its argument values in the same order as they are defined in the function declaration (parameters can only be omitted from right to left)

CREATE FUNCTION concat_lower_or_upper(a text, b text, uppercase boolean DEFAULT false)
       RETURNS text AS $$
       SELECT CASE WHEN $3 THEN UPPER($1 || ' ' || $2) ELSE LOWER($1 || ' ' || $2) END;
$$ LANGUAGE SQL IMMUTABLE STRICT;
Positional Notation
SELECT concat_lower_or_upper('Hello', 'World', true);
 HELLO WORLD
SELECT concat_lower_or_upper('Hello', 'World');
 hello world
Named notation
SELECT concat_lower_or_upper(a := 'Hello', b := 'World', uppercase := true);
 HELLO WORLD
SELECT concat_lower_or_upper(a := 'Hello', uppercase := true, b := 'World');
 HELLO WORLD
Mixed Notation
SELECT concat_lower_or_upper('Hello', 'World', uppercase := true);
 HELLO WORLD

5. Data Definition
Constraints - following example demonstrates different methods of constraints usage.

CREATE TABLE products (
CHECK (price > 0),
price numeric,
price1 numeric,
CONSTRAINT positive_price CHECK (price1 > 0),
CHECK (price > price1),
CHECK (discounted_price > 0 AND price > discounted_price),
product_no integer UNIQUE,
product_no integer CONSTRAINT must_be_different UNIQUE,
product_no integer REFERENCES products (product_no), -- foreign key
FOREIGN KEY (b, c) REFERENCES other_table (c1, c2),
product_no integer REFERENCES products ON DELETE RESTRICT,
order_id integer REFERENCES orders ON DELETE CASCADE);

RESTRICT prevents deletion of a referenced row. NO ACTION means that if any referencing rows still exist when the constraint is checked. CASCADE specifies that when a referenced row is deleted, row(s) referencing it should be automatically deleted as well.

Modifying Table

ALTER TABLE products ADD COLUMN description text; --adding a column
ALTER TABLE products DROP COLUMN description CASCADE; -- drop column
-- add constrain
ALTER TABLE products ADD CHECK (name <> ''); 
ALTER TABLE products ADD CONSTRAINT some_name UNIQUE (product_no); 
ALTER TABLE products ADD FOREIGN KEY (product_id) REFERENCES product_groups;
ALTER TABLE products ALTER COLUMN price SET DEFAULT 7.77; -- add default
ALTER TABLE products RENAME COLUMN prod_no TO product_no; -- rename column

6.Partitioning

Partitioning refers to splitting what is logically one large table into smaller physical pieces. Benefits are:

Query performance can be improved, when heavily accessed rows of table are in a single partition. The partitioning substitutes for leading columns of indexes, reducing index size and making it more likely that the heavily-used parts of the indexes fit in memory.
When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of sequential scan of that partition instead of using an index..
Bulk loads and deletes can be accomplished by adding or removing partitions, if that requirement is planned into the partitioning design.
Seldom-used data can be migrated to cheaper and slower storage media.

Range Partitioning

The table is partitioned into "ranges" defined by a key column or set of columns, with no overlap

List Partitioning

The table is partitioned by explicitly listing which key values appear in each partition.

http://developer.postgresql.org/pgdocs/postgres/ddl-partitioning.html

Combining Queries

The results of two queries can be combined using the set operations union, intersection, and difference.

query1 UNION [ALL] query2 - eliminates duplicate rows from its result, in the same way as DISTINCT, unless UNION ALL is used
query1 INTERSECT [ALL] query2 - returns all rows that are both in the result. Duplicate rows are eliminated unless INTERSECT ALL is used.
query1 EXCEPT [ALL] query2 - returns all rows that are in the result of query1 but not in the result of query2. (This is sometimes called the difference between two queries.)

7.With Queries (Common Table Expressions)

WITH provides a way to write auxiliary statements for use in a larger query. These statements, which are often referred to as Common Table Expressions or CTEs, can be thought of as defining temporary tables that exist just for one query.

Data-Modifying Statements in WITH

You can use data-modifying statements (INSERT, UPDATE, or DELETE) in WITH. This allows you to perform several different operations in the same query. An example is:

WITH moved_rows AS (
    DELETE FROM products
    WHERE
        "date" >= '2010-10-01' AND
        "date" < '2010-11-01'
    RETURNING *
)
INSERT INTO products_log
SELECT * FROM moved_rows;

This query effectively moves rows from products to products_log. The DELETE in WITH deletes the specified rows from products, returning their contents by means of itsRETURNING clause; and then the primary query reads that output and inserts it into products_log.

8. Data type

Chapter 9. Functions and Operators

9.3. Mathematical Functions and Operators
http://www.postgresql.org/docs/9.1/interactive/functions-math.html
9.4. String Functions and Operators
http://www.postgresql.org/docs/9.1/interactive/functions-string.html
9.5. Binary String Functions and Operators
http://www.postgresql.org/docs/9.1/interactive/functions-binarystring.html
9.7. Pattern Matching
http://www.postgresql.org/docs/9.1/interactive/functions-matching.html
9.8. Data Type Formatting Functions
http://www.postgresql.org/docs/9.1/interactive/functions-formatting.html
9.9. Date/Time Functions and Operators
http://www.postgresql.org/docs/9.1/interactive/functions-datetime.html
9.10. Enum Support Functions
http://www.postgresql.org/docs/9.1/interactive/functions-enum.html

9.11. Geometric Functions and Operators
http://www.postgresql.org/docs/9.1/interactive/functions-geometry.html
9.12. Network Address Functions and Operators
http://www.postgresql.org/docs/9.1/interactive/functions-net.html
9.13. Text Search Functions and Operators
http://www.postgresql.org/docs/9.1/interactive/functions-textsearch.html
9.14. XML Functions
http://www.postgresql.org/docs/9.1/interactive/functions-xml.html
9.15. Sequence Manipulation Functions
http://www.postgresql.org/docs/9.1/interactive/functions-sequence.html
9.16. Conditional Expressions
Case

SELECT ... WHERE CASE WHEN x <> 0 THEN y/x > 1.5 ELSE false END;

NULLIF

The NULLIF function returns a null value if value1 equals value2; otherwise it returns value1. This can be used to perform the inverse operation of the COALESCE example given above:

SELECT NULLIF(value, '(none)')

9.17. Array Functions and Operators
http://www.postgresql.org/docs/9.1/interactive/functions-array.html
9.18. Aggregate Functions
http://www.postgresql.org/docs/9.1/interactive/functions-aggregate.html
9.19. Window Functions
http://www.postgresql.org/docs/9.1/interactive/functions-window.html

9.22. Set Returning Functions
This section describes functions that possibly return more than one row. http://www.postgresql.org/docs/9.1/interactive/functions-info.html

SELECT * FROM generate_series(2,4);
 generate_series
-----------------
               2
               3
               4
SELECT * FROM generate_series(5,1,-2);
 generate_series
-----------------
               5
               3
               1

-- this example relies on the date-plus-integer operator
SELECT current_date + s.a AS dates FROM generate_series(0,14,7) AS s(a);
   dates
------------
 2004-02-05
 2004-02-12
 2004-02-19

SELECT * FROM generate_series('2008-03-01 00:00'::timestamp, '2008-03-04 12:00', '10 hours');
   generate_series   
---------------------
 2008-03-01 00:00:00
 2008-03-01 10:00:00
 2008-03-01 20:00:00
 2008-03-02 06:00:00
 2008-03-02 16:00:00

9.23 System Information Functions
Session Information Functions

Name	Return Type	Description
`current_catalog`	`name`	name of current database (called "catalog" in the SQL standard)
`current_database()`	`name`	name of current database
`current_query()`	`text`	text of the currently executing query, as submitted by the client (might contain more than one statement)
`current_schema[()]`	`name`	name of current schema
`current_schemas(boolean)`	`name[]`	names of schemas in search path, optionally including implicit schemas
`current_user`	`name`	user name of current execution context
`inet_client_addr()`	`inet`	address of the remote connection
`inet_client_port()`	`int`	port of the remote connection
`inet_server_addr()`	`inet`	address of the local connection
`inet_server_port()`	`int`	port of the local connection
`pg_backend_pid()`	`int`	Process ID of the server process attached to the current session
`pg_conf_load_time()`	`timestamp with time zone`	configuration load time
`pg_is_other_temp_schema(oid)`	`boolean`	is schema another session's temporary schema?
`pg_listening_channels()`	`setof text`	channel names that the session is currently listening on
`pg_my_temp_schema()`	`oid`	OID of session's temporary schema, or 0 if none
`pg_postmaster_start_time()`	`timestamp with time zone`	server start time
`session_user`	`name`	session user name
`user`	`name`	equivalent to `current_user`
`version()`	`text`	PostgreSQL version information

9.24. System Administration Functions
http://www.postgresql.org/docs/9.1/interactive/functions-admin.html

Thursday, November 17, 2011

Useful PDI steps' commands

Wild card expression

File Names Starts with 100047 and end with DIS.DAT

${Internal.Job.Filename.Directory} 100047*DIS.DAT <

File names end up with xls extension
${Internal.Transformation.Filename.Directory}\matchdata .*\.xls

Java scripts
Replace white spaces, concatenate strings and print fields with line breaks
var Data_file;
var Date_time;
var header_line;
date_time = Date_time1.replace(" ", "-");
Data_file = '82246_data_extraction_' + Date_time1.replace(" ", "-") + '.xls';
header_line = '========================================== \r\n' +
'Date & Time :' + date_time + '\r\n' + 'Output File :' + Data_file + '\r\n'

If and replace function usage
var max_imported_last_call_new;
if (max_imported_last_call != null)
{ max_imported_last_call_new=replace(max_imported_last_call, "/", "-"); }
else { max_imported_last_call_new='2000-01-01 00:00:00.0'; }

If and Data Type Conversion
var new_dispo_file_name;
var wave;
if (str2num(filename_date) == str2num(SERVER_DATE)){
new_dispo_file_name = XXXXXX + filename_date + 'CVAR' + num2str(str2num(wave) + 1) + 'DIS';}
else {
new_dispo_file_name = XXXXXX + SERVER_DATE + 'CVAR' + 0 + 'DIS';}

AND and OR conditions

var IBM_DISPOSITION_CODE;
var fail_number = 0;
IBM_DISPOSITION_CODE = '000';
if (ADDR == null || CITY == null || ZIPCODE == null) {
IBM_DISPOSITION_CODE = '009';
fail_number++;}

if (x_salutation == null && (
(country=='BE' && language=='Dutch') ||
(country=='BE' && language=='French') ||
(country=='FR' && language=='French') ||
(country=='CH' && language=='Italian') )) {
IBM_DISPOSITION_CODE = '011';
fail_number++;}

Date type (string to date conversion)
eoweek = str2date(date_val,"dd-MMM-yy");
quarter_no = quarter(eoweek);

var SurveyResults_filename = system_date + "_82430SurveyResults";
var SampleDisposition_filename = system_date + "_CumminsDispositionStatus";

var loaddatetime = system_date.substr(0,12) + "00";
var yesterday = dateAdd(today, "d", -1);
var dayb4yesterday = dateAdd(today, "d", -2);

var c_date = date2str(LAST_CALL,"MM/dd/yyyy") + " " + num2str(LAST_TIME/3600, "00") + ":" + (num2str(LAST_TIME%3600/60, "00")) + ":" + (num2str(LAST_TIME%60, "00")) ;

Regular exression
if ( substr(timetype,0,4) == 'GMT+' || substr(timetype,0,4) == 'GMT-')
{ var re = /([-+])(\d+)(.?)(3?)/.exec(timetype);

xmins=re[4]
else{mins='00';}
var x = re[2].length;
if ( x == 1) {hr = '0' + re[2]; }
else {hr = re[2]; }
temptime = re[1] + hr + ':' + mins;
}

.*\.add$ - everything ends with ".add"

Monday, November 14, 2011

PostgreSQL 9.2 Volume 1

1. Getting Started

PostgreSQL is relational database management system (RDBMS).

Architecture

Use client / server architecture. Server handles multiple concurrent connections from clients. It starts new process called "forks" for each new connection.

2. The SQL Language

Create DB: $ createdb my_database $ dropdb my_database

$ /usr/local/pgsql/bin/ createdb my_database

Access DB $ psql my_database

my_database => SELECT version();

Internal commands

my_database --> \h - help

my_database --> \q - quit from plsql to shell

my_database --> \i - read command from specific file

my_database --> \b - backspace

my_database --> \f - form feed

my_database --> \n - newline

my_database --> \r - carriage return

my_database --> \t - tab

my_database --> \o, \oo, \ooo (o = 0 - 7) octal byte value

my_database --> \xh, \xhh (h = 0 - 9, A - F) hexadecimal byte value

my_database --> \uxxxx, \Uxxxxxxxx (x = 0 - 9, A - F) 16 or 32-bit hexadecimal Unicode character value

Create table

CREATE TABLE films (
    code        char(5) PRIMARY KEY DEFAULT nextval('school_id_seq')
    vector  int[][], -- 2 dimension array,
    name   varchar(40) NOT NULL CHECK (name <> '')
    ddid     integer CHECK (did > 100),
    price numeric DEFAULT 9.99,
    date_prod   date,
    kind        varchar(10),
    len         interval hour to minute,
    modtime   timestamp DEFAULT current_timestamp, -- default value is current time
    CONSTRAINT production UNIQUE(date_prod),
    --CONSTRAINT code_title PRIMARY KEY(code,title)
);

Joining tables
SELECT * FROM table1 a INNER JOIN table2 b ON (a.name = b.name);
SELECT * FROM table1 a LEFT OUTER JOIN table2 b ON (a.name = b.name);
SELECT * FROM table1 a RIGHT OUTER JOIN table2 b ON (a.name = b.name);

3. Advanced Features
Transaction concept (all-or-nothing)

Transactions are a fundamental concept of all database systems. The essential point of a transaction is that it bundles multiple steps into a single, all-or-nothing operation. The intermediate states between the steps are not visible to other concurrent transactions, and if some failure occurs that prevents the transaction from completing, then none of the steps affect the database at all. A transaction is said to be atomic: from the point of view of other transactions, it either happens completely or not at all.

when multiple transactions are running concurrently, each one should not be able to see the incomplete changes made by others. To facilitate these in PostgreSQL, a transaction is set up by surrounding the SQL commands of the transaction with BEGIN and COMMIT commands

If, partway through the transaction, we decide we do not want to commit (perhaps bank balance went negative), we can issue the command ROLLBACK instead of COMMIT, and all our updates so far will be cancelled. if you are sure you won't need to roll back to a particular savepoint again, it can be released, so the system can free some resources. Keep in mind that either releasing or rolling back to a savepoint will automatically release all save-points that were defined after it.


BEGIN;
UPDATE accounts SET balance = balance - 100.00 WHERE name = 'Alice';
SAVEPOINT my_savepoint;
UPDATE accounts SET balance = balance + 100.00 WHERE name = 'Bob';
-- oops ... forget that and use Wally's account
ROLLBACK TO my_savepoint;
UPDATE accounts SET balance = balance + 100.00 
    WHERE name = 'Wally';
COMMIT;
END;

Window function

A window function performs a calculation across a set of table rows that are somehow related to the current row. Unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row. Behind the scenes, the window function is able to access more than just the current row of the query result. Not allowed within where clause.

This shows how to compare each employee's salary with the average salary in his or her department.

SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary;
SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary;

depname  | empno | salary | rank 
-----------+-------+--------+------
 develop   |     8 |   6000 |    1
 develop   |    10 |   5200 |    2
 develop   |     9 |   4500 |    4
 develop   |     7 |   4200 |    5
 personnel |     2 |   3900 |    1
 personnel |     5 |   3500 |    2
 sales     |     1 |   5000 |    1
 sales     |     4 |   4800 |    2

OVER clause directly following the window function's name and argument(s). The OVERclause determines exactly how the rows of the query are split up for processing by the window function. The PARTITION BY list within OVERspecifies dividing the rows into groups, or partitions, that share the same values of the PARTITION BY expression(s).

When a query involves multiple window functions, it is possible to write out each one with a separate OVER clause, but this is duplicative and error-prone if the same windowing behavior is wanted for several functions. Instead, each windowing behavior can be named in a WINDOW clause and then referenced in OVER. For example:

SELECT sum(salary) OVER w, avg(salary) OVER w
  FROM empsalary
  WINDOW w AS (PARTITION BY depname ORDER BY salary DESC);

Inheritance

Inheritance is a concept from object-oriented databases.

Ref: http://developer.postgresql.org/pgdocs/postgres/tutorial-inheritance.html

CREATE TABLE cities ( name text, population real, altitude int );
CREATE TABLE capitals ( state char(2))  INHERITS (cities);

On the other hand, the following query finds all the cities that are not state capitals and are situated at an altitude of 500 feet or higher:
SELECT name, altitude FROM ONLY cities WHERE altitude > 500;

Isura's Blog

Pages