PostgreSQL – Database Consultant

Stored-Proc to insert into PostgreSQL

It turned out not to be so straightforward to execute a SQL Server stored-procedure from Entity Framework to insert data into Postgres (yes, I know, I said that in the meeting).

(A stored-procedure with an ordinary INSERT executed perfectly from within SSMS, using a linked server with an ODBC driver.)

After a fruitless day tweaking RPC and Distributed Transactions, my pragmatic (dreary) solution was to manually create a Postgres table, then a Postgres function to do the insert, then a stored-procedure to pass parameters to that function.

Here’s the code …

-- in postgres

-- DROP TABLE public."ErrorLogs";

CREATE TABLE public."ErrorLogs"
(
     "id" serial primary key,
     "edesc" varchar(500),  -- varchar matches the function parameter and avoids char's blank-padding
     "etype" int,
     "appid" int
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE public."ErrorLogs"
    OWNER to postgres;

Then to create the insert function …

--in postgres
CREATE OR REPLACE FUNCTION pg_insert_new_error(edesc varchar(500), etype int, appid int) RETURNS VOID AS
$$
BEGIN
    INSERT INTO "ErrorLogs" (edesc, etype, appid) VALUES (edesc, etype, appid);
END
$$
  LANGUAGE 'plpgsql';
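
Before involving SQL Server at all, it is worth checking the function directly in Postgres. This is only a minimal sketch, and the sample values are made up.

```sql
-- in postgres
-- Call the function (it returns void, so the result column is empty)
SELECT pg_insert_new_error('test error from psql', 1, 1);

-- Confirm the row arrived; the quoted name is needed because the table was created as "ErrorLogs"
SELECT id, edesc, etype, appid
FROM "ErrorLogs"
ORDER BY id DESC
LIMIT 5;
```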

And finally the stored-proc …

-- in sql server
alter procedure [PostgreSQL].[SPU_InsertNewErrorLog]
	@description varchar(500),
	@messagetype int,
	@appid int
as

BEGIN

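	-- A single quote in the description has to be doubled once for Postgres and once more for the OPENQUERY literal
	DECLARE @safe_description VARCHAR(2000) = REPLACE(@description, '''', '''''''''')

	-- VARCHAR(MAX) so a long description cannot truncate the command; VARCHAR(11) holds any INT including its sign
	DECLARE @cmd VARCHAR(MAX) = 'SELECT a.* FROM OPENQUERY(pgserver,''select pg_insert_new_error(''''' + @safe_description + ''''', ' 
								+ CONVERT(VARCHAR(11), @messagetype) + ', ' + CONVERT(VARCHAR(11), @appid) + ')'') a'

	EXEC (@cmd)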

END

GO

(Notice that I had to wrap the whole query, parameters included, in a dynamic SQL string, because OPENQUERY only accepts a literal query string and not variables.)
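
For completeness, here is roughly how I would call the procedure from SSMS and then read the rows back through the same linked server. The parameter values are made up; only the procedure, linked server and table names come from the code above.

```sql
-- in sql server
EXEC [PostgreSQL].[SPU_InsertNewErrorLog]
     @description = 'Test error raised from SSMS',  -- sample value
     @messagetype = 1,                              -- sample value
     @appid       = 42;                             -- sample value

-- Read the latest rows back via the linked server
SELECT a.*
FROM OPENQUERY(pgserver, 'select id, edesc, etype, appid from "ErrorLogs" order by id desc limit 5') a;
```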

Migrating SQL Server to PostgreSQL

1. download and install the latest postgresql (eg: 10.4)
2. download and install the latest admin tool (eg: pgadmin 4.3)
3. create a postgres “Server” that matches the actual server name, and a database (image: ServerDatabase)
4. download the newest postgresql ODBC driver (eg: psqlodbc_x64.msi is the 64-bit installer; pick the bitness that matches the export wizard you will run)
5. install the driver (image: postgresDriver)
6. run SSIS export wizard (image: ssisWizard)

Where the wizard stops with an error, use the following script to change the offending column to varchar(max). The idea at this stage is just to get the data in, in whatever form.

-- convertTablesForPg.sql

-- 1. to find tables with a named column, or

select
so.[name] table_name, sc.[name] column_name, st.[name] data_type
from sysobjects so
inner join syscolumns sc on (so.id = sc.id)
inner join systypes st on (st.xusertype = sc.xusertype) -- joining on [type] can return duplicate or wrong type names
where so.[type] = 'U'
and sc.[name] = 'Email'

-- 1b. to find named tables

select
so.[name] table_name, sc.[name] column_name, st.[name] data_type,
'ALTER TABLE [dbo].[' + so.[name] + '] ALTER COLUMN [' + sc.[name] + '] VARCHAR(MAX);'
from sysobjects so
inner join syscolumns sc on (so.id = sc.id)
inner join systypes st on (st.xusertype = sc.xusertype) -- joining on [type] can return duplicate or wrong type names
where so.[type] = 'U'
and so.[name] = 'translations'
ORDER BY 2

-- 2. to create command to change date/time/bit columns to varchar(max)

select
so.[name] table_name, sc.[name] column_name, st.[name] data_type,
'ALTER TABLE [dbo].[' + so.[name] + '] ALTER COLUMN [' + sc.[name] + '] VARCHAR(MAX);'
from sysobjects so
inner join syscolumns sc on (so.id = sc.id)
inner join systypes st on (st.xusertype = sc.xusertype) -- joining on [type] can return duplicate or wrong type names
where so.[type] = 'U'
and st.[name] IN ('DATE', 'TIME', 'BIT')
ORDER BY 1,2,3

/* missed tables
[dbo].[BuyersBySupplierData]
*/
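
The queries above use the old compatibility views (sysobjects, syscolumns, systypes). Below is a sketch of step 2 against the newer catalog views, assuming SQL Server 2008 or later. It also picks up the real schema name and keeps each column's nullability, neither of which the original script does, so treat those as my additions.

```sql
-- in sql server
-- Generate ALTER commands for date/time/bit columns in every user table
SELECT
    s.[name]  AS schema_name,
    t.[name]  AS table_name,
    c.[name]  AS column_name,
    ty.[name] AS data_type,
    'ALTER TABLE ' + QUOTENAME(s.[name]) + '.' + QUOTENAME(t.[name])
        + ' ALTER COLUMN ' + QUOTENAME(c.[name]) + ' VARCHAR(MAX)'
        + CASE WHEN c.is_nullable = 1 THEN ' NULL' ELSE ' NOT NULL' END
        + ';' AS alter_command
FROM sys.tables t
INNER JOIN sys.schemas s ON s.schema_id     = t.schema_id
INNER JOIN sys.columns c ON c.object_id     = t.object_id
INNER JOIN sys.types ty  ON ty.user_type_id = c.user_type_id
WHERE ty.[name] IN ('date', 'time', 'bit')
ORDER BY s.[name], t.[name], c.[name];
```

QUOTENAME takes care of table or column names containing spaces or brackets.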

See me for part 2 😉

A PostgreSQL instance

A Cluster is a single, complete, running, PostgreSQL server (IE: cluster of databases)

One PostgreSQL Server
Listening on one port (may be multiple addresses)
One set of data files (including tablespaces)
One set of Write Ahead Log (WAL) files

Operations done on a Cluster:

Initialization (initdb)
Start / Stop the cluster
File-level Backup / Restores
Streaming Replication

Objects defined at Cluster level

Users / Roles
Tablespaces
Databases
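
One way to see these cluster-level objects is to query the shared catalogs, which look the same from whichever database you connect to. A small sketch, run from psql or any other client:

```sql
-- in postgres
-- Roles, tablespaces and databases are stored once per cluster
SELECT rolname FROM pg_roles      ORDER BY rolname;
SELECT spcname FROM pg_tablespace ORDER BY spcname;
SELECT datname FROM pg_database   ORDER BY datname;

-- The catalogs that are shared across every database in the cluster
SELECT relname
FROM pg_class
WHERE relisshared
  AND relkind = 'r'
ORDER BY relname;
```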

PostgreSQL – Administration

Maintenance Tasks

Keep autovacuum enabled most of the time
VACUUM regularly as well
Check for unused indexes
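
A rough sketch of the unused-index check, using the statistics views. The counters only cover the period since statistics were last reset, and an index with zero scans may still be enforcing a primary key or unique constraint, so read the output as candidates rather than a drop list.

```sql
-- in postgres
-- Indexes never scanned since the last statistics reset, largest first
SELECT schemaname,
       relname      AS table_name,
       indexrelname AS index_name,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```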

Warnings: (unless you know what you are doing …)

Avoid using VACUUM FULL
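REINDEX CONCURRENTLY does not exist before PostgreSQL 12
Do not use HASH INDEXES (before PostgreSQL 10 they are not WAL-logged, so neither crash-safe nor replicated)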
Do not use fsync = off

Read the manual

PostgreSQL docs are about 2000 pages
Technically accurate
Written and maintained by the developers

Security

Superuser is too powerful for most use cases (use SECURITY DEFINER functions instead)
Use a distinct userid for replication
GRANT minimal access rights
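
As a sketch of the last two points: the role names, the database name (mydb) and the passwords below are all made up for illustration.

```sql
-- in postgres
-- A dedicated login for streaming replication, with no other privileges
CREATE ROLE replicator WITH LOGIN REPLICATION PASSWORD 'change-me';

-- A read-only application login: connect, see the schema, select, nothing else
CREATE ROLE app_reader WITH LOGIN PASSWORD 'change-me';
GRANT CONNECT ON DATABASE mydb TO app_reader;
GRANT USAGE ON SCHEMA public TO app_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_reader;
```

The replication role still needs a matching "replication" entry in pg_hba.conf before it can connect.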

Upgrades

Maintenance releases happen about every 3 months
For best security – upgrade to latest maintenance release
Major release upgrades are harder
UDR technology (that’s UniDirectional Replication) will make major release upgrades much easier from 9.4+

Extensions

PostgreSQL is designed to be extensible
Many new features enabled via extensions (EG: pgaudit, postgis)
Use them!
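
A short sketch of how extensions are found and enabled. It assumes the extension's files are already installed on the server (postgis, for example, comes as a separate package), otherwise CREATE EXTENSION will fail.

```sql
-- in postgres
-- What is available on this server, and what is already installed in this database
SELECT name, default_version, installed_version, comment
FROM pg_available_extensions
ORDER BY name;

-- Enable an extension in the current database
CREATE EXTENSION IF NOT EXISTS postgis;

-- Confirm
SELECT extname, extversion FROM pg_extension;
```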

Scripts

GUIs do not allow you to apply changes in a transaction or easily record your actions
Use scripts for any administrative changes
Test them, before applying
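
Because DDL is transactional in PostgreSQL, a change script can be rehearsed and rolled back. A minimal sketch, assuming a hypothetical table foo:

```sql
-- in postgres
BEGIN;

ALTER TABLE foo ADD COLUMN note text;

-- Inspect the result before deciding
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'foo';

ROLLBACK;  -- swap for COMMIT once the script does what you expect
```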

Schema Change: adding a foreign key can now be split into two parts:

ALTER TABLE foo ADD CONSTRAINT fook FOREIGN KEY … REFERENCES bar NOT VALID;
ALTER TABLE foo VALIDATE CONSTRAINT fook;

The first statement applies the constraint going forward (with a quick write-lock)
The second checks data already in the table (a longer scan that, from 9.4, does not block normal reads and writes)
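
A worked version of the two-step foreign key, as a sketch. The column names (id, bar_id) are my own assumptions; only the names foo, bar and fook come from the notes above.

```sql
-- in postgres
-- Hypothetical tables
CREATE TABLE bar (id int PRIMARY KEY);
CREATE TABLE foo (id int PRIMARY KEY, bar_id int);

-- Step 1: new and updated rows are checked from now on; existing rows are not scanned
ALTER TABLE foo
    ADD CONSTRAINT fook FOREIGN KEY (bar_id) REFERENCES bar (id) NOT VALID;

-- Step 2: scan the existing rows and mark the constraint as valid
ALTER TABLE foo VALIDATE CONSTRAINT fook;

-- convalidated flips from false to true after step 2
SELECT conname, convalidated
FROM pg_constraint
WHERE conname = 'fook';
```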

PostgreSQL study notes – psql

Features

non-interactive usage
command history (up/down arrow)
tab completion (sans Windows)
commands terminate with semi-colon and can wrap lines
defaults to supplying currently logged-in username as pg user

Tasks

explore ‘psql’
a. ‘psql --version’ returns version of postgresql client
b. ‘psql -l -U postgres’ lists installed db’s then exits.
postgreSQL installs 3 default db’s
1. ‘postgres’ – default db for users, utilities and management connections
2. ‘template0’ – vanilla read-only db
3. ‘template1’ – changeable copy of template0, used as template for new db’s
c. ‘psql’ with no options enters interactive mode (duh)
a hash-mark ending the interactive-prompt denotes a ‘superuser’ eg: ‘postgres=#’
d. ‘\h’ – returns sql-specific help eg: ‘ALTER TABLE …’
‘\h create’ filters above to just ‘CREATE …’ commands
e. ‘\?’ – returns ‘psql’ specific help ie: the usable meta-commands (shortcuts)
f. ‘\l’ – returns list of DBs. ‘\l+’ additionally returns DB sizes
g. ‘\du[+]’ – returns list of users with access to postgresql.
h. ‘\!’ – opens a shell from the session (‘exit’ from shell = back to psql)
i. ‘\! [command]’ – runs command in shell non-interactively and returns to psql.
j. ‘\i filename’ – executes the command(s) in the file ie: psql or sql commands
k. multiple commands can be run on one line. Separate with space, terminate with semi-colon (eg: ‘\l \du;’)
l. ‘\c’ – connect to another database or host eg: ‘\c template1’
m. ‘\d’ – lists tables, views etc in current DB, ‘\dS’ lists system tables, ‘\dS+’ lists system tables with sizes.
n. ‘\q’ – quit
o. ‘psql --help’ (or ‘psql -?’) – psql option switches & defaults (short version then long version) eg: -U (username – short version), -l (list – short version), --version (long version) …
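
The listings from ‘\l+’ and ‘\du’ can also be had with plain SQL, which is handy from clients that are not psql. This is only an approximation of what the meta-commands show, not the exact queries psql runs.

```sql
-- in postgres
-- Roughly what \l+ shows: databases, template flags and sizes
SELECT datname,
       datistemplate,
       datallowconn,
       pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY datname;

-- Roughly what \du shows: roles and their main attributes
SELECT rolname, rolsuper, rolcreaterole, rolcreatedb, rolcanlogin
FROM pg_roles
ORDER BY rolname;
```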

PostgreSQL study notes – Installation

Download from EnterpriseDB. The gui installer is the one to go for; remember xhost on linux.

1. Install. For prod you should indicate that data files are stored independently of the source tree. A ‘Cluster’ here is not a classic cluster (ie: a server cluster); it just means all the databases on this particular box.

2. Explore the footprint of the install. ‘\\bin\’ contains utilities like psql.exe (terminal monitor). ‘\\data\’ contains all databases, 3x config files, pg_log/ (log files), pg_xlog/ (write ahead log folder) and postmaster.opts (startup options).

3. Provide access to internal docs via web-browser bookmark eg: file:///C:/Program%20Files/PostgreSQL/9.5/doc/postgresql/html/index.html

4. Add the \\bin folder to the path – for psql.exe etc (eg: C:\Program Files\PostgreSQL\9.5\bin). postgresql clients default to submitting the currently logged-in user’s name as the DB user name (eg: “psql” without -U will assume the user is the windows-user).

5. Add system variable ‘PGUSER=postgres’ as a workaround so psql etc won’t try to log in as the o/s-user
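
Once you can log in, the footprint from step 2 can be confirmed from inside the database. A small sketch; some of these settings are only visible to a superuser.

```sql
-- in postgres
SHOW data_directory;  -- the \\data\ folder
SHOW config_file;     -- postgresql.conf
SHOW hba_file;        -- pg_hba.conf
SHOW ident_file;      -- pg_ident.conf
SHOW port;            -- listener port
```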

PostgreSQL study notes – Features

Object Relational Database Management System (ORDBMS): objects (eg: tables) can be related in a hierarchy: Parent -> Child
Transactional RDBMS: SQL statements have implicit BEGIN; COMMIT; statements. SQL statements may also have explicit BEGIN; COMMIT; statements
developed at UC Berkeley, like bsd-unix
One process per connection: master process = “postmaster” auto-spawns a new process per new connection
Processes (pid’s) use one cpu-core per connection: o/s may start new connections on a different cpu-core. no cross-core queries (parallel query only arrived in 9.6)
Multiple helper processes, which appear as ‘postgres’ instances, always running eg: stats collector, background writer (protects against sudden failure), auto-vacuum (cleanup / space reclaimer), wal sender (that’s write ahead log)
max db size: unlimited – limited by available storage (terabytes, petabytes, exabytes)
max table size: 32tb – stored as multiple 1gb files – changeable (could be a problem for some o/s’s)
max row size: 400gb
max column size: 1gb (per row ie: per field)
max indexes per table: unlimited
max identifier length (db objects – tables, columns, triggers, functions): 63 bytes. this is extensible via source code
default listener is tcp port 5432 (above 1024), so postgresql may be installed and run as a non-privileged user
users are distinct from o/s users
users are authenticated globally (per server), then assigned permissions per database.
inheritance. tables lower in the hierarchy may inherit columns from higher tables (ie: parents) so long as there are no constraints eg: foreign keys.
case insensitive identifiers – without double quotes (eg: select * from syslog;)
case sensitive identifiers – with double quotes (eg: select * from "syslog";)
three primary config files, located in postgres-root: A. pg_hba.conf (host based access) B. postgresql.conf (general settings) C. pg_ident.conf (user mappings)
integrated log rotation (config by age or size)
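
Two of the features above are easier to see than to describe: identifier case folding and table inheritance. A small sketch with made-up tables and sample values:

```sql
-- in postgres
-- Unquoted identifiers are folded to lower case
CREATE TABLE SysLog (id int, msg text);  -- actually creates a table called syslog
SELECT * FROM syslog;                    -- works
SELECT * FROM SYSLOG;                    -- works, folded to syslog
-- SELECT * FROM "SysLog";               -- would fail: relation "SysLog" does not exist

-- Inheritance: the child gets the parent's columns plus its own
CREATE TABLE cities (name text, population int);
CREATE TABLE capitals (country text) INHERITS (cities);

INSERT INTO capitals (name, population, country) VALUES ('Paris', 2100000, 'France');

SELECT name, population FROM cities;       -- includes the row stored in capitals
SELECT name, population FROM ONLY cities;  -- parent rows only, so no rows here
```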

PostgreSQL command-line

Having logged in locally on a linux box, I used these steps to access the database via a terminal session …

 $ sudo su - postgres
 [sudo] password for richard: *****
 $ psql
 Password: ******
 postgres=#
 postgres=# select version();
 PostgreSQL 9.4.4 on x86_64 (Red Hat 4.1.2-55), 64-bit

Line 1) Using sudo, I switched to the linux user “postgres” (including environment variables)
Line 2) I typed in my password
Line 3) And ran the executable (psql)
Line 4) I typed in the password of the postgres user
Line 5) Success! and to prove it …
Line 6) My first SELECT statement, lol

BTW: to leave I typed “\q” to quit the PostgreSQL environment, “exit” to leave the postgres account, then “exit” again to close the terminal session.