Cassandra UUID vs TimeUUID benefits and disadvantages

Question

UUID and TIMEUUID are stored the same way in Cassandra, and they only really represent two different sorting implementations.

TIMEUUID columns are sorted by their time components first, and then by their raw bytes, whereas UUID columns are sorted by their version first, then if both are version 1 by their time component, and finally by their raw bytes. Curiosly the time component sorting implementations are duplicated between UUIDType and TimeUUIDType in the Cassandra code, except for different formatting.

I think of the UUID vs. TIMEUUID question primarily as documentation: if you choose TIMEUUID you’re saying that you’re storing things in chronological order, and that these things can occur at the same time, so a simple timestamp isn’t enough. Using UUID says that you don’t care about order (even if in practice the columns will be ordered by time if you put version 1 UUIDs in them), you just want to make sure that things have unique IDs.

Even if using NOW() to generate UUID values is convenient, it’s also very surprising to other people reading your code.

It probably does not matter much in the grand scheme of things, but sorting non-version 1 UUIDs is a bit faster than version 1, so if you have a UUID column and generate the UUIDs yourself, go for another version.

Leave a Comment Cancel reply