Globally unique numeric keys in PostgreSQL

Overview

Preface

Using numeric unique identifiers is a very popular approach to identifying individual records in database systems. While other approaches exist, the primary key of a table will often be an integer. Integer keys offer a good balance between storage size, feasible indexing approaches and performance, as column values often fit within a register of a modern CPU.

This type of primary key is so popular that most databases offer some form of autoincrement mechanism. The SQL standard even has a specific modifier, IDENTITY, to provide this functionality. However, not all database systems implement IDENTITY; some offer a sequence-based mechanism instead. PostgreSQL historically provides the SERIAL pseudo-type as a shortcut to its sequence-based mechanism, and version 10 introduced support for IDENTITY clauses.

One of the downsides of numeric identifiers is that their uniqueness is tied to the database where the identifier is created, due to their sequential nature (and, by default, values start from 1) - the same table definition, implemented in two distinct databases with different data, will have colliding entries. While some approaches exist to minimize this problem, it is one of the reasons why some schemas rely on UUIDs instead.

However, there are plenty of scenarios where numeric identifiers are preferred due to their compactness, and it is often desirable to identify the database where a given identifier was generated. In this tutorial, we propose a fairly simple mechanism that uses some of the upper bits of a 64-bit integer to store an additional identifier, which can be used to identify the cluster, server or database where the record was generated.

Encoding a cluster identifier in a numeric key

This mechanism allows numeric identifiers to be used in different instances of a given database while remaining unique across all of them. As a side effect, the database identifier is easily extracted in code with a bit-shift operation, allowing operations on a given record to be routed to its original database; this routing information can also be used by service proxy implementations to resolve the available backends for a given request. For simplicity's sake, we will call this database identifier the cluster identifier throughout this article.

Another advantage is the possibility of designing a database in such a way that it can be independently instantiated (e.g. by country or region), but later merged back into a single database without collisions. This allows highly flexible systems to be designed early on, without having to settle high-level architectural details from the beginning.

It is important to note that this mechanism has its downsides as well. Because it relies on stored procedures and reads from local tables, there is a small performance overhead on insert operations. The number of unique identifiers available (and, by consequence, the maximum number of records) is also reduced - in our case, down to 48 bits, allowing "only" 281,474,976,710,656 possible records per table. Why "possible records"? Because the system will not reuse identifiers from deleted rows or from failed inserts. Given those limitations, this may be a good approach for mostly-read workloads, with predictable delete operations, and tables with a row count below 1 billion entries.

Breaking down a bigint

Our tutorial will use PostgreSQL's BIGINT type, a 64-bit wide integer data type. Our implementation will use the lower 48 bits as a unique record identifier, and the upper 16 bits as a unique cluster identifier. Our generated 64-bit integer will look like this:

[Figure: 64-bit integer layout]

To compute the final value, we take the 16-bit cluster value, shift it left by 48 bits, and add the value generated from a sequence, as in the formula key_value = (cluster_id << 48) + sequence_id.
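As a quick sanity check, the same arithmetic can be reproduced directly in SQL; the sketch below assumes a cluster id of 10 and the first sequence value, matching the numbers used later in this article:

-- hypothetical values: cluster id 10, first sequence value 1
SELECT (10::BIGINT << 48) + 1 AS key_value;
-- key_value = 2814749767106561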

The cluster identifier

The cluster identifier is a unique value that identifies our cluster (or region, or database). It is limited to 2^16 possible values (between 0 and 65535). However, we will only use 2^15 possible values (0-32,767), as the most significant bit is the sign bit of the integer (indicating whether the number is positive or negative).
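A quick way to see why the sign bit matters: shifting a cluster id above 32,767 into the upper 16 bits produces a negative BIGINT, as the sketch below illustrates:

SELECT 32767::BIGINT << 48 AS highest_safe_cluster,  -- 9223090561878065152
       32768::BIGINT << 48 AS sign_bit_set;          -- -9223372036854775808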

In our implementation, the cluster identifier value will reside on a table named platform_config. The column value for this table should be defined accordingly on each cluster.

The unique id

The unique id takes the lower 48 bits, and its value is computed by a stored procedure that uses a sequence, so its behaviour is somewhat similar to the SERIAL implementation. To ease the process of keeping track of the sequences for different tables, we will be using a separate table with BIGSERIAL or IDENTITY columns - one for each table. This table will be called platform_sequence.

Generating unique values in PostgreSQL

Our mechanism relies on an underlying [sequence](https://www.postgresql.org/docs/current/sql-createsequence.html), just like the ones SERIAL or IDENTITY generate. A typical BIGSERIAL usage would look like this:

CREATE TABLE my_table(
    id_my_table BIGSERIAL
);

but internally, PostgreSQL will create a sequence and assign its nextval() call as the default value for the specified column:

CREATE SEQUENCE my_table_id_my_table_seq;

CREATE TABLE my_table (
    id_my_table BIGINT NOT NULL DEFAULT nextval('my_table_id_my_table_seq')
);

ALTER SEQUENCE my_table_id_my_table_seq
OWNED BY my_table.id_my_table;

When using the IDENTITY modifier, the behaviour is essentially the same, but IDENTITY columns have additional features that address known limitations of the SERIAL pseudo-types (see the note below).
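For reference, a minimal IDENTITY-based equivalent of the table above (available since PostgreSQL 10) would look like this:

CREATE TABLE my_table (
    id_my_table BIGINT GENERATED ALWAYS AS IDENTITY
);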

Please note

Using SERIAL for primary keys is discouraged due to known issues. IDENTITY has multiple advantages over SERIAL, but in our case we only use these mechanisms as a shortcut to create the underlying sequences. You can read a detailed article on the differences between the two implementations for more information.

Putting it together

Cluster identifier and sequence table

For demonstration purposes, we'll create a platform_config table to hold our cluster id, and a platform_sequence table with BIGSERIAL columns. The latter will actually remain empty, as we're only using its columns as a way of generating sequences with a predictable name structure to be used by our stored procedure:

CREATE TABLE platform_config (
    id_cluster SMALLINT NOT NULL DEFAULT 0
);

CREATE TABLE platform_sequence (
    table_a BIGSERIAL
);
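As an optional check, pg_get_serial_sequence() can be used to confirm the name of the sequence that the BIGSERIAL column created - the name our stored procedure will rely on (assuming the default public schema):

SELECT pg_get_serial_sequence('platform_sequence', 'table_a');
-- typically returns: public.platform_sequence_table_a_seq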

The stored procedure

The stored procedure is used to compute the unique key value for the tables, and receives the table name as a parameter. It reads the cluster id from platform_config, and obtains nextval() from a sequence whose name is derived from the platform_sequence table and the corresponding table name column. Every table that uses the stored procedure must have a matching sequence named platform_sequence_<table_name>_seq, and the easiest way of achieving this is to add a new column to platform_sequence using the desired table name as the column name, with type BIGSERIAL (an example for a second table is sketched after the function below).

CREATE OR REPLACE FUNCTION clustered_sequence(VARCHAR)
    RETURNS BIGINT AS $$
DECLARE
    sequence_name ALIAS FOR $1;
    cluster_id BIGINT;
    seq        BIGINT;
BEGIN
    -- fetch the cluster identifier for this database
    SELECT id_cluster FROM platform_config INTO cluster_id;
    -- fetch the next value from the table-specific sequence
    SELECT nextval('platform_sequence_' || sequence_name || '_seq') INTO seq;
    -- keep only the lower 48 bits of the sequence value
    seq := (seq << 16) >> 16;
    -- place the cluster id in the upper 16 bits and add the sequence value
    RETURN (cluster_id << 48) + seq;
END;
$$ LANGUAGE PLPGSQL;
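Onboarding additional tables only requires adding the matching column to platform_sequence. The sketch below uses a hypothetical second table named table_b:

-- creates the platform_sequence_table_b_seq sequence used by the function
ALTER TABLE platform_sequence ADD COLUMN table_b BIGSERIAL;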

Application table

Finally, we create our example application table that will contain the actual data. To use our unique identifier, we create a BIGINT primary key column and set its DEFAULT value to be fetched from our stored procedure, using the table name as the argument:

CREATE TABLE table_a (
  id_table_a BIGINT NOT NULL DEFAULT clustered_sequence('table_a') PRIMARY KEY,
  data TEXT NOT NULL
);

Making it work

Our example platform_config table will hold a single row, which needs to be inserted when the database is provisioned, to make sure there is an initial cluster id value:

-- Initial cluster id: 10
INSERT INTO platform_config(id_cluster) VALUES(10);
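Since clustered_sequence() reads id_cluster without a WHERE clause, it assumes platform_config holds exactly one row. A possible safeguard (not part of the original design) is a unique expression index that caps the table at a single row:

-- id_cluster is NOT NULL, so the expression is TRUE for every row;
-- a unique index on it therefore allows at most one row in the table
CREATE UNIQUE INDEX platform_config_single_row
    ON platform_config ((id_cluster IS NOT NULL));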

We can now write some test rows to our table_a table, change the id_cluster value to simulate a different database, and then insert a few more rows:

-- insert some data
INSERT INTO table_a(data) VALUES('Alice');
INSERT INTO table_a(data) VALUES('Bob');

-- change cluster id just for demo purposes
UPDATE platform_config SET id_cluster=20;

-- insert some more data, primary key value should be completely different
INSERT INTO table_a(data) VALUES('Alice');
INSERT INTO table_a(data) VALUES('Bob');

If we check our table contents, we'll see that our primary keys are quite large numbers:

demo_db=> SELECT * FROM table_a;
    id_table_a    | data  
------------------+-------
 2814749767106561 | Alice
 2814749767106562 | Bob
 5629499534213123 | Alice
 5629499534213124 | Bob
(4 rows)

However, if we convert our primary key values to hexadecimal, we can clearly see the upper digits representing our distinct cluster id values, and the lower digits coming from the sequence. In different databases, the sequence would start from the same number, so Alice could have a primary key of 0xa000000000001 in one database and 0x14000000000001 in another - same sequence number but different cluster number, resulting in collision-free keys:

demo_db=> SELECT to_hex(id_table_a), data from table_a;
     to_hex     | data  
----------------+-------
 a000000000001  | Alice
 a000000000002  | Bob
 14000000000003 | Alice
 14000000000004 | Bob
(4 rows)
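As mentioned earlier, the cluster identifier can be recovered from any key with a bit-shift operation; a minimal sketch using the table defined above:

-- split each key back into its cluster id and sequence parts
SELECT id_table_a,
       id_table_a >> 48                     AS cluster_id,
       id_table_a & ((1::BIGINT << 48) - 1) AS sequence_value
FROM table_a;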