mysql-8-0-uuid-support

Posted by

In MySQL 8.0.0 we introduced many new features; among those, three new functions that ease and enhance the support for working with UUIDs.

Background

A UUID is just a 128-bit value and it is usually represented in human-readable format as an UTF8 string composed of 5 groups of hexadecimal characters separated by dashes. Ex:

UUIDs can have different underlying structure depending on the version. The RFC4122 specifies 5 versions. The one that MySQL implements in the UUID() function is version 1 which is composed of the timestamp, UUID version and MAC address.

Advantages:

UUIDs are a good alternative to AUTO_INCREMENT PRIMARY KEY and are used mainly because:
– the keys are unique across tables, databases and servers
– are hard(er) to guess (example from an URL)
– can be generated offline (without any exchange of information with the database or collaboration with other components of the system)
– simplifies replication

Disadvantages:

But they also come with some disadvantages:
– increased storage: 36 characters
– more difficult to debug
– performance issues: mainly because of the size and not being ordered

Solution:

With these problems in mind, we added three new functions: UUID_TO_BIN, BIN_TO_UUID, IS_UUID. These will ease the work with UUIDs and will provide a solution around the issues mentioned above.

Let’s start with the pair: UUID_TO_BIN/BIN_TO_UUID. These function will be used to convert from the human-readable format (char/varchar) to the compact format (binary) and back. That means compressing the 32 characters (36 or more with separators) to the 16-bit format or back to the human-readable format.

This is how you would normally use them:

You can observe in the results above that the values most likely to be different on consecutively generated UUIDs are the ones at the beginning of the string, that is because the smaller time units are the hexadecimals characters at the beginning of the string, while the larger time units come next, ending with the MAC address. This will have a significant performance impact, since the values will be inserted in random locations in the index tree which will require a lot of IO when the index tree will not fit in memory anymore.

The UUID_TO_BIN/BIN_TO_UUID functions have a second boolean argument, which is optional, and can be used to avoid this problem. Setting the argument to true while inserting the values: “INSERT INTO t VALUES(UUID_TO_BIN(UUID(), true));” will rearrange the time-related bits so that consecutive generated values will be ordered.

For more information about the problem and solution for previous versions of MySQL check this link.

The other function IS_UUID does a very basic validation of the UUID provided as parameter and returns TRUE if the argument is a valid UUID and FALSE otherwise. A UUID is considered valid if it contains 32 hexadecimal characters with optional separators:
“{“,”-“,”}” . The function does not check the version bit or the timestamp.
All of these will be considered valid:

Conclusion

Please experiment with the new functions and give us feedback. Thank you for using MySQL!

Leave a Reply

Your email address will not be published. Required fields are marked *