PostgreSQL Tips

 

If you run your database on a different server, you might want to set the PGHOST variable in the /etc/profile directory so all your users don't have to type the "-h " part over and over again.

 

Largest ID

To get the largest or smallest value from a postgres database it is highly recommended to not use the max() or min() functions, but resort to a Trick (as discussed in the OpenACS Forums

So instead of "select max(id) from mytable" do a "select id from mytable order by id desc limit 1". 

And instead of "select min(id) from mytable" do a "select id from mytable order by id asc limit 1".

Exclude rows

If you have two tables and you want to select all the entries from table "one" that are not part of table "two" do a left join and make sure that the value from table "two" is null 

select person_id

from one left join (select item_id

from two

where content_type = 'contact_party_revision') twos on item_id = person_id

where item_id is null;

Under no circumstances should you be forced to use the "is not in" statement. 

Performance tuning

This assumes that you have a dedicated PostgreSQL Server with at least 4GB of RAM. 

  1. Change the kernel parameter for maximum shared memory segment size:

    [root root]# echo 288435456 >/proc/sys/kernel/shmmax
    [root root]#

    Make that change permanent by editing /etc/sysctl.conf to add these lines at the end:

    # increase shared memory limit for postgres
    kernel.shmmax = 288435456
  2. Edit the PostgreSQL config file, /usr/local/pgsql/data/postgresql.conf, to use more memory. These values should improve performance in most cases. (more information)

    #       Shared Memory Size
    #
    shared_buffers = 16384     # 2*max_connections, min 16
    
    #       Non-shared Memory Sizes
    #
    sort_mem = 4096            # min 32
    
    vacuum_mem = 32168
    
    work_mem = 8192
    
    max_fsm_pages = 1000000
    
    effective_cache_size = 262144   # typically 8KB each
    random_page_cost = 2            # units are one sequential page fetch cost
    
    wal_buffers = 64 # Useful for long transactions
    #       Write-ahead log (WAL)
    #
    checkpoint_timeout = 900                # range 30-3600, in seconds
    checkpoint_segments = 8     # in logfile segments (16MB each), min 1
    
  3. In Postgres 8.1 add the following 
    autovacuum = on
    stats_start_collector = on
    stats_command_string = on
    stats_row_level = on
    
  4. Restart postgres (service postgresql restart) or (/etc/init.d/postgres restart) so that the changes take effect.

Upgrade

 

If you want to upgrade your PostgreSQL database a couple of steps should be noted:

  • An upgrade between major version numbers (e.g. 8.1 to 8.2) requires a dump and restore of all databases
  • UTF-8 is an issue at least prior to 8.2
  • Always use the pg_dump command of the newer version 

Therefore our suggestion is to follow these steps to get your installation ready:

  1. Install the latest version of postgresql in e.g. /usr/local/pg82
  2. Change the port to run on 5434 (at least something different from your current postgres version)
  3. Shutdown all client applications using the database. 
  4. Dump the database 
    /usr/local/pg82/bin/pg_dump -E UNICODE -f service0.dmp -U service0 service0
  5. Clean up the dumpfile for incorrect UTF8 characters 
    /usr/bin/iconv -c -f UTF-8 -t UTF-8 -o cleanfile.dmp service0.dmp
  6. Import the dumpfile in your database 
    /usr/local/pg82/bin/pgsql -f cleanfile.dmp -U service0 service0
  7. Run your application in testmode against port 5434
  8. If this works reliably, shutdown the old postgres version 
    /etc/init.d/postgresql stop
  9. Edit the port of the new version to run on 5432 again and restart your application on that port.
  10. On the next upgrade, remove /usr/local/pgsql and install e.g. PG8.3 right into /usr/local/pgsql instead of /usr/local/pg83. This helps you keep a clean installation. Alternatively you can always use symbolic links. 

String manipulation

You can do string manipulation in PostgreSQL as well. Assume you have a column which contains strings that are composed of a fixed part and an ID. To only retrieve the ID, use the string functions, e.g. "ltrim"

select ltrim(name, 'csr_contact_') as id from cr_items where content_type = 'contact';

Take into account that ltrim will trim all the characters from the name that are contained removal string. To give an example, ltrim("csr_contact_supervisor","csr_contact_") would return (upervisor), as the "s" is contained in the "csr_contact_" characters. 

To circumwent this you can do a replace(name,'csr_contact_','') if you know there is only one 'csr_contact_' in the name. Otherwise you will have to do a select where you first check if the name starts with 'csr_contact_' and then return a substring starting with the first character after 'csr_contact_' (thanks to Russel Muetzelfeldt):

select CASE WHEN name like 'csr_contact_%' THEN substring(name from 13) ELSE name END

Sadly sorting on the ID does not work, as the id is of type string, which results in a sorting order of {1 10 100 2 3 4 5}. To change this you have to convert the string to a number. Then ordering works:

select to_number(ltrim(name, 'csr_contact_'),'999999999') as id from cr_items where content_type = 'contact' order by id desc;