I'm not trying to restart the UUID vs. serial integer key debate. I know there are valid points on either side. I'm using UUIDs as the primary key in several of my tables.

  • Column type: "uuidKey" text NOT NULL
  • Index: CREATE UNIQUE INDEX grand_pkey ON grand USING btree ("uuidKey")
  • Primary Key Constraint: ADD CONSTRAINT grand_pkey PRIMARY KEY ("uuidKey");

Here is my first question: with PostgreSQL 9.4, is there any performance benefit to setting the column type to UUID?

The documentation (http://www.postgresql.org/docs/9.4/static/datatype-uuid.html) describes UUIDs, but is there any benefit, aside from type safety, to using this type instead of text? The character types documentation indicates that char(n) has no advantage over text in PostgreSQL:

Tip: There is no performance difference among these three types, apart from increased storage space when using the blank-padded type, and a few extra CPU cycles to check the length when storing into a length-constrained column. While character(n) has performance advantages in some other database systems, there is no such advantage in PostgreSQL; in fact character(n) is usually the slowest of the three because of its additional storage costs. In most situations text or character varying should be used instead.

I'm not worried about disk space; I'm just wondering whether it's worth my time to benchmark UUID vs. text column types.

Second question: hash vs. b-tree indexes. There is no sense in sorting UUID keys, so would a b-tree index have any other advantages over a hash index?

  • If you are creating a unique index in addition to the primary key, it is not necessary. When you set a primary key, a unique index is created on the key. Commented Apr 26, 2015 at 20:03
  • I may have shown it in the wrong order. The index was automagically created by the primary key constraint. Commented Apr 28, 2015 at 11:24
  • It also seems, according to the docs (9.4 being the latest stable version at the time of this comment), that the use of hash indexes is discouraged: postgresql.org/docs/9.4/static/indexes-types.html Commented Jan 2, 2016 at 18:31
  • Maybe I've misunderstood something about this post, but why would you use TEXT when Postgres has a native UUID column type? Are there any benefits to TEXT at all? Commented Feb 25, 2016 at 5:10
  • The UUID column type was added in 9.0. This database was first created in 8. Commented Feb 25, 2016 at 10:54

3 Answers

69

We had a table with about 30k rows that (for a specific unrelated architectural reason) had UUIDs stored in a text field and indexed. I noticed that the query perf was slower than I'd have expected. I created a new UUID column, copied in the text uuid primary key and compared below. 2.652ms vs 0.029ms. Quite a difference!

    -- With text index
    QUERY PLAN
    Index Scan using tmptable_pkey on tmptable (cost=0.41..1024.34 rows=1 width=1797) (actual time=0.183..2.632 rows=1 loops=1)
      Index Cond: (primarykey = '755ad490-9a34-4c9f-8027-45fa37632b04'::text)
    Planning time: 0.121 ms
    Execution time: 2.652 ms

    -- With a uuid index 
    QUERY PLAN
    Index Scan using idx_tmptable on tmptable (cost=0.29..2.51 rows=1 width=1797) (actual time=0.012..0.013 rows=1 loops=1)
      Index Cond: (uuidkey = '755ad490-9a34-4c9f-8027-45fa37632b04'::uuid)
    Planning time: 0.109 ms
    Execution time: 0.029 ms

57

A UUID is a 16-byte value. The same value stored as text is 32 bytes of hexadecimal characters (36 with hyphens), plus a length header. The storage sizes are:

select
    pg_column_size('a0eebc999c0b4ef8bb6d6bb9bd380a11'::text) as text_size,
    pg_column_size('a0eebc999c0b4ef8bb6d6bb9bd380a11'::uuid) as uuid_size;
 text_size | uuid_size 
-----------+-----------
        36 |        16

Smaller tables lead to faster operations.
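
To see where those sizes come from without a database at hand, here is a minimal Python sketch using the standard library `uuid` module. Python is only my choice for illustration; it is not part of the question. The sketch measures the raw value in its three forms and does not model the length header that `pg_column_size` includes for text.

```python
import uuid

# Same example value as in the query above.
value = uuid.UUID("a0eebc999c0b4ef8bb6d6bb9bd380a11")

binary_form = value.bytes        # the 128-bit value as raw bytes
hex_form = value.hex             # hexadecimal digits without hyphens
canonical_form = str(value)      # hexadecimal digits with hyphens

# Report the length of each representation.
print("binary bytes:   ", len(binary_form))
print("hex characters: ", len(hex_form))
print("with hyphens:   ", len(canonical_form))
```

The binary form is what the uuid column type stores; either textual form is at least twice as large before any header is added.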

4 Comments

  • But how does it behave when it comes to UUID comparison? Is there any benefit to using uuid over int?
  • MAX(uuid_column) isn't supported, so that's a real difference.
  • @AllainLalonde Why would you want that?
  • A discussion of max(uuid) can be found here: dba.stackexchange.com/questions/275251/… – there are some use cases and it can easily be added as an aggregate function
2

Moot in Postgres 18+

As others noted, a native UUID value is 128 bits. Any textual representation will be far bigger. There is no reason to use text. And text won’t solve the major issue with using UUIDs as keys in larger tables: inefficient indexing.

That issue is now resolved in Postgres 18.

Define your column as the uuid type. Use the new uuidv7() function to generate UUID values. Version 7 UUIDs provide efficient indexing.

UUID Version 7

In 2024, the specification for UUID was revamped. New Versions 6 and 7 were added to provide for efficient indexing. RFC 9562 obsoletes RFC 4122.

The new RFC 9562 is very well written, clearer and more insightful than its predecessor. I encourage reading the actual spec itself.

Version 7 puts a timestamp first in each UUID value. But rather than grouping the timestamp’s bits arbitrarily, a 48-bit big-endian unsigned number is used. This number is a Unix-style timestamp, a count of milliseconds since the start of 1970 UTC, 1970-01-01T00:00Z.

The rest of the bits are randomly generated, save for the required 2 bits for Variant and 4 bits for Version. So Version 7 does away with the “node” (MAC address) field in Version 1 and Version 6. And, by default, Version 7 does away with the clock_seq field of those same Versions 1 & 6.

An implementation of Version 7 is allowed to optionally use the first chunk of those random bits for sub-millisecond timestamp fraction. And an implementation may choose to use a counter as defined in Version 1 & 6.
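
The default layout described above (48-bit millisecond timestamp, then Version, Variant, and random bits) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `make_uuid7` is hypothetical, it uses neither the sub-millisecond fraction nor a counter, and it is not how Postgres implements `uuidv7()`.

```python
import os
import time
import uuid


def make_uuid7(unix_ms=None):
    """Build a Version 7 UUID from a millisecond timestamp plus random bits.

    Hypothetical helper for illustration only.
    """
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000      # milliseconds since 1970 UTC
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits, 74 are used
    rand_a = (rand >> 68) & 0xFFF                  # 12 random bits
    rand_b = rand & ((1 << 62) - 1)                # 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80      # timestamp in the top 48 bits
    value |= 0x7 << 76                             # 4-bit Version field = 7
    value |= rand_a << 64
    value |= 0b10 << 62                            # 2-bit Variant field
    value |= rand_b
    return uuid.UUID(int=value)


earlier = make_uuid7(1_700_000_000_000)
later = make_uuid7(1_700_000_000_001)
print(earlier, later, earlier < later)
```

Because the timestamp occupies the most significant bits, a value generated in a later millisecond always compares greater than one from an earlier millisecond, which is what keeps new keys clustered at the right-hand edge of a b-tree index.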

Postgres 18+

Postgres 18 adds support for Version 7 UUID.

uuidv7() function

A new built-in function, uuidv7(), generates Version 7 UUID values. No need to add the uuid-ossp extension commonly used in past generations of Postgres for generating UUID values.

The implementation of this function does take that first option mentioned above, a finer sub-millisecond timestamp. For details, read this post by Gwen Shapira on TheNile.dev. To quote:

The Postgres implementation includes a 12-bit sub-millisecond timestamp fraction immediately after the timestamp (as allowed but not required by the standard). This guarantees monotonicity for all UUIDv7 values generated by the same Postgres session (same backend process).
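
As a rough sketch of what a 12-bit sub-millisecond fraction means, the Python below scales the nanoseconds within the current millisecond to the range 0..4095 and places that number where rand_a would otherwise go. The function name `uuid7_with_fraction` is hypothetical, and this is my reading of the option in the RFC, not the Postgres source code.

```python
import os
import uuid


def uuid7_with_fraction(unix_ns):
    """Version 7 UUID whose 12 bits after the Version field hold a
    sub-millisecond fraction instead of random bits (illustrative only)."""
    unix_ms, ns_in_ms = divmod(unix_ns, 1_000_000)
    fraction = (ns_in_ms * 4096) // 1_000_000      # scale 0..999_999 ns to 0..4095
    rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)

    value = (unix_ms & ((1 << 48) - 1)) << 80      # 48-bit millisecond timestamp
    value |= 0x7 << 76                             # Version 7
    value |= fraction << 64                        # 12-bit timestamp fraction
    value |= 0b10 << 62                            # Variant
    value |= rand_b                                # remaining 62 random bits
    return uuid.UUID(int=value)


# Two timestamps inside the same millisecond, half a millisecond apart.
first = uuid7_with_fraction(1_700_000_000_000_000_000)
second = uuid7_with_fraction(1_700_000_000_000_500_000)
print(first < second)
```

With 12 bits the fraction resolves steps of roughly a quarter of a microsecond, so two values produced within the same millisecond still sort in generation order as long as the clock reading differs by at least one step.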

Version 7 bits table

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           unix_ts_ms                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          unix_ts_ms           |  ver  |       rand_a          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|var|                        rand_b                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            rand_b                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
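
Reading the table from the other direction, the fields can be pulled back out of an existing value with shifts and masks. The Python sketch below assumes the layout shown above; the helper name `split_uuid7` and the example value are made up for illustration.

```python
import uuid
from datetime import datetime, timezone


def split_uuid7(value):
    """Split a Version 7 UUID into the fields named in the bits table."""
    n = value.int
    return {
        "unix_ts_ms": n >> 80,              # top 48 bits
        "ver": (n >> 76) & 0xF,             # next 4 bits
        "rand_a": (n >> 64) & 0xFFF,        # next 12 bits
        "var": (n >> 62) & 0b11,            # next 2 bits
        "rand_b": n & ((1 << 62) - 1),      # final 62 bits
    }


# Hypothetical example value, not taken from a real database.
example = uuid.UUID("018bcfe5-6800-7abc-8def-0123456789ab")
fields = split_uuid7(example)

# Convert the embedded millisecond count back to a UTC timestamp.
created = datetime.fromtimestamp(fields["unix_ts_ms"] / 1000, tz=timezone.utc)
print(fields)
print(created.isoformat())
```

Being able to recover the creation time from the key is a side effect of the format, so treat Version 7 values as revealing roughly when a row was created.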
