04 Mar 2015

PostgreSQL, playing with triggers, functions and sequences

I have a very special use case where I want Postgres to automatically generate an identifier for one column when a new row is inserted. This identifier has the following properties:

  • It must be unique.
  • It must have two parts.
    • The first one is the value of another field in the sampe table.
    • The second one must be sequence number padded with zeros.

Basically I want the identifier to have this format:

0-000001
1-000002
2-000001

These are the tables I’ll be using:

    CREATE TABLE warehouses (
        id      serial PRIMARY KEY,
        name    varchar(255)
    );

    CREATE TABLE incomings (
        id              serial PRIMARY KEY,
        warehouse_id    integer,
        identifier      varchar(50),
        data            text
    );

Each time a new warehouse is created I want PostgreSQL to create a customized SEQUENCE to keep track of the next identifier for that warehouse.

    CREATE OR REPLACE FUNCTION make_seq_for_warehouse() RETURNS TRIGGER AS $$
    DECLARE
        sql varchar := 'CREATE SEQUENCE seq_for_warehouse_' || NEW.id;
    BEGIN
        EXECUTE sql;
        return NEW;
    END;
    $$ LANGUAGE plpgsql;

And here is the trigger for the warehouses table that will call the previous function:

    CREATE TRIGGER generate_next_identifier_function
    AFTER INSERT
    ON warehouses
    FOR EACH ROW
    EXECUTE PROCEDURE make_seq_for_warehouse();

If we now insert a new warehouse, it should create a new sequence for us:

test=# INSERT INTO warehouses (name) VALUES ('Warehouse1');
INSERT 0 1
test=# \d
List of relations
Schema |        Name          |   Type   | Owner
-------+--------------.-------+----------+-------
public | incomings            | table    | jose
public | incomings_id_seq     | sequence | jose
public | seq_for_warehouse_1  | sequence | jose
public | warehouses           | table    | jose
public | warehouses_id_seq    | sequence | jose

Now I want the column identifier to be filled in every time a new incoming is created. First we create the function to do the work:

    CREATE OR REPLACE FUNCTION update_incoming_identifier()
        RETURNS TRIGGER AS $$
    BEGIN
        NEW.identifier := NEW.warehouse_id || '-' || lpad(nextval('seq_for_warehouse_' || NEW.warehouse_id)::varchar, 8, '0');
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

Finally we link this function with a trigger on the incomings table:

    CREATE TRIGGER generate_next_identifier
    BEFORE INSERT
    ON incomings
    FOR EACH ROW
    EXECUTE PROCEDURE update_incoming_identifier();

Note that now we use BEFORE TRIGGER so we get the new identifier before the row is saved.

Now we can create a new warehouse, so we have two and create some incomings:

    test=# INSERT INTO warehouses (name) VALUES ('Warehouse 2');
    INSERT 0 1
    test=# INSERT INTO incomings  (warehouse_id, data) VALUES (1, 'data on warehouse 1');
    INSERT 0 1
    test=# INSERT INTO incomings  (warehouse_id, data) VALUES (1, 'more data on warehouse 1');
    INSERT 0 1
    test=# INSERT INTO incomings  (warehouse_id, data) VALUES (2, 'data on warehouse 2');
    INSERT 0 1
    test=# INSERT INTO incomings  (warehouse_id, data) VALUES (2, 'more data on warehouse 2');
    INSERT 0 1
    test=# INSERT INTO incomings  (warehouse_id, data) VALUES (2, 'even more data on warehouse 2');
    INSERT 0 1

And this is the result:

test=# SELECT * FROM incomings;
id | warehouse_id | identifier  |              more_data
---+--------------+-------------+--------------------------------------
1  |            1 | 1-00000001  | data on warehouse 1
2  |            1 | 1-00000002  | more data on warehouse 1
3  |            2 | 2-00000001  | data on warehouse 2
4  |            2 | 2-00000002  | more data on warehouse 2
5  |            2 | 2-00000003  | even more data on warehouse 2

Another way to achive the same result would have been to use a SELECT MAX() + 1 query on the trigger and use an additional column on the incomings table.

DISCLAIMER: Bear in mind that we are creating a new sequence for each warehouse. This is not a problem in my scenario because I expect to have between 2 and 8 warehouses and most of them are created when I setup the application. Do not use this as a pattern. This is just me playing with PostgresSQL.