ULID vs UUID vs ID - Pros and Cons

Pros of using ULIDs for identifiers and how they can simplify developing microservices

What's the issue

Unique identifiers are something most developers don't think about when they start designing a database. When working with SQL databases, most of us find the ID that the database generates on its own perfectly sufficient, but when working with microservices, numeric IDs fall short. Let's look at the pros and cons of each approach:

Number ID

The most well-known type of ID. It usually starts at 1 (or 0 on some occasions) and grows with every new record. Let's look at the pros and cons of using it:

Pros

  • Easy to understand at first sight
  • Easy to sort

Cons

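  • Hard to use with microservices: every service keeps its own counter, so the same ID shows up in different places
  • Potential security risk: sequential IDs are easy to guess and enumerate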
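To make the first drawback concrete, here is a minimal sketch in which two services each keep their own auto-increment counter. The `make_counter` helper and the service names are hypothetical and exist only for illustration.

```python
import itertools

def make_counter(start=1):
    """Return a function that hands out sequential IDs, like an auto-increment column."""
    counter = itertools.count(start)
    return lambda: next(counter)

# Two independent services, each with its own sequence (hypothetical names).
orders_next_id = make_counter()
billing_next_id = make_counter()

order_ids = [orders_next_id() for _ in range(3)]
billing_ids = [billing_next_id() for _ in range(3)]

# Both services produced the same numbers, so an ID alone no longer identifies a record.
overlap = set(order_ids) & set(billing_ids)
print(order_ids, billing_ids, overlap)
```

The same predictability is what makes sequential IDs easy to enumerate from the outside.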

When microservices became mainstream, developers started using UUIDs, which have existed since the 1980s, to overcome the shortcomings of numeric IDs.

UUID

A universally unique identifier (UUID), also known as a globally unique identifier (GUID), is a 128-bit identifier. In the random variant (version 4), 122 of those bits are random. For example, the number of random version-4 UUIDs that need to be generated to have a 50% probability of at least one collision is 2.71 quintillion. This is equivalent to generating 1 billion UUIDs per second for about 86 years.
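The collision figure can be reproduced approximately with the usual birthday-bound formula, n ≈ sqrt(2 · N · ln(1 / (1 − p))). The sketch below assumes 122 random bits, since a version-4 UUID reserves 6 of its 128 bits for the version and variant; the function name is made up for this example.

```python
import math

RANDOM_BITS = 122  # version-4 UUID: 128 bits minus 6 fixed version/variant bits

def ids_for_collision_probability(p, bits=RANDOM_BITS):
    """Approximate how many random IDs give a collision with probability p."""
    space = 2 ** bits
    return math.sqrt(2 * space * math.log(1 / (1 - p)))

n = ids_for_collision_probability(0.5)
seconds = n / 1e9                          # generating one billion UUIDs per second
years = seconds / (365.25 * 24 * 3600)
print(f"{n:.3g} UUIDs, roughly {years:.0f} years")
```

This is an approximation that holds when the number of generated IDs is tiny compared to the size of the ID space, which is the case here.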

UUIDs are usually written as 32 hexadecimal digits, split into five groups by hyphens:

123e4567-e89b-12d3-a456-426614174000
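As a quick illustration, Python's standard `uuid` module can generate a random (version 4) UUID. This is only one of many ways to produce one; the variable names are arbitrary.

```python
import uuid

new_id = uuid.uuid4()      # random (version 4) UUID from the standard library
text = str(new_id)         # canonical form: 8-4-4-4-12 hexadecimal digits

groups = [len(part) for part in text.split("-")]
print(text)
print(len(text), len(new_id.hex), new_id.version, groups)
```

The string form is 36 characters long because the four hyphens are added to the 32 hexadecimal digits.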

Pros

  • Decentralized way of generating IDs
  • Practically no chance of getting the same ID across different applications

Cons

  • No meaningful sort order: a random UUID says nothing about when it was created

UUIDs saved developers a lot of work, but ULIDs fix one of their last shortcomings: sorting.

ULID

ULIDs also use 128 bits to store an identifier, but this time the first 48 bits hold a timestamp, while the remaining 80 bits are random. This is the key difference from UUIDs, because it gives us the ability to sort them by creation time. ULIDs are even lexicographically sortable, which means we can sort them as plain strings!
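The layout described above can be sketched in a few lines. This is a minimal illustration, not a full implementation of the ULID spec: it assumes a millisecond Unix timestamp and Crockford's Base32 alphabet, and the `new_ulid` function and the sample timestamps are hypothetical.

```python
import os
import time

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # Crockford Base32 alphabet

def new_ulid(timestamp_ms=None):
    """Build a ULID-style string: 48-bit millisecond timestamp, then 80 random bits."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    randomness = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    value = (timestamp_ms << 80) | randomness            # 128 bits in total
    # Encode as 26 characters of 5 bits each, most significant bits first.
    return "".join(CROCKFORD[(value >> shift) & 31] for shift in range(125, -1, -5))

# Made-up timestamps one millisecond apart, deliberately listed out of order.
ids = [new_ulid(ts) for ts in (1_700_000_000_002, 1_700_000_000_000, 1_700_000_000_001)]
ordered = sorted(ids)   # plain string sorting restores creation order
print(ordered)
```

String sorting works here because the alphabet is in ascending ASCII order and the timestamp occupies the leading characters.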

Because the identifier is bound to a timestamp, there are 2^80 (about 1.21e+24) possible ULIDs for every millisecond. While UUIDs should theoretically never be duplicated, in practice faulty implementations have left people with identical identifiers for different data. Human error is still possible with ULIDs, but since a collision requires the same random part within the same millisecond, it is a lot less likely.
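The 1.21e+24 figure is simply the number of values that fit into the 80 random bits, which is easy to check:

```python
RANDOM_BITS = 80

combinations = 2 ** RANDOM_BITS
print(combinations)            # 1208925819614629174706176
print(f"{combinations:.2e}")   # 1.21e+24
```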

ULIDs are represented as a 26-character string, encoded with Crockford's Base32:

0001EHZADJ5FDB1RJS00JK7VD8

From this ULID we can extract the timestamp and the random part:

2019-07-05 15:49:06 5FDB1RJS00JK7VD8

As you can see, the first 10 characters encode the date and time at which the ULID was created.
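Splitting a ULID into its two parts can be sketched as follows. The code assumes the timestamp is stored as Unix milliseconds, as the ULID spec defines it; the helper names are hypothetical, and the sample ULID is built from a known timestamp plus a random part borrowed from the example above, so it is not a real-world identifier.

```python
from datetime import datetime, timezone

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # Crockford Base32 alphabet

def decode_base32(text):
    """Turn a Crockford Base32 string into an integer."""
    value = 0
    for char in text:
        value = value * 32 + CROCKFORD.index(char)
    return value

def split_ulid(ulid):
    """Return (creation time in UTC, random part) of a 26-character ULID."""
    if len(ulid) != 26:
        raise ValueError("a ULID has exactly 26 characters")
    timestamp_ms = decode_base32(ulid[:10])    # first 10 characters: timestamp
    created = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return created, ulid[10:]                  # last 16 characters: randomness

def encode_timestamp(timestamp_ms):
    """Encode a millisecond timestamp as the 10-character ULID prefix."""
    return "".join(CROCKFORD[(timestamp_ms >> shift) & 31] for shift in range(45, -1, -5))

# Sample ULID assembled for this example only.
sample = encode_timestamp(1_700_000_000_000) + "5FDB1RJS00JK7VD8"
created, random_part = split_ulid(sample)
print(sample, created.isoformat(), random_part)
```

A library that writes the timestamp in a different unit or displays it in a local time zone would produce different readings, so the unit is worth checking before relying on decoded dates.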

Note: Sorting works only on the timestamp part. We cannot tell which of two ULIDs created in the same millisecond was created first.

Summary

Right now, if we look at a regular SQL table, we usually see something like this:

  1. ID: People still feel more comfortable including it
  2. created_at: Timestamp of when the row was created
  3. UUID: For working with other (micro)services

As we have seen, all of the above could be replaced with a single ULID.
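As a rough illustration of that idea, the sketch below uses an in-memory SQLite table whose only key column is a ULID-style string. The table name, the column names, the `new_ulid` helper, and the timestamps are all assumptions made for this example.

```python
import os
import sqlite3

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # Crockford Base32 alphabet

def new_ulid(timestamp_ms):
    """Sketch of a ULID: 48-bit timestamp plus 80 random bits, Base32-encoded."""
    value = (timestamp_ms << 80) | int.from_bytes(os.urandom(10), "big")
    return "".join(CROCKFORD[(value >> shift) & 31] for shift in range(125, -1, -5))

conn = sqlite3.connect(":memory:")
# A single ULID column stands in for the numeric ID, created_at, and UUID columns.
conn.execute("CREATE TABLE events (id TEXT PRIMARY KEY, name TEXT NOT NULL)")

# Made-up rows, inserted out of chronological order.
rows = [("third", 1_700_000_000_300), ("first", 1_700_000_000_100), ("second", 1_700_000_000_200)]
for name, ts in rows:
    conn.execute("INSERT INTO events VALUES (?, ?)", (new_ulid(ts), name))

# Ordering by the ULID returns rows in creation order, with no created_at column.
names = [name for (name,) in conn.execute("SELECT name FROM events ORDER BY id")]
print(names)   # ['first', 'second', 'third']
```

Storing the ULID as text keeps it readable and sortable; whether that or a 16-byte binary column is the better fit depends on the database and is left open here.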