The Dark Side Of Using UUID as a Primary Key in MySQL


Mar 18, 2025 - 20:46

Using a UUID as the primary key of a large MySQL table can hurt performance, especially when inserting and modifying data. This article explains why, including why modifying data forces index updates and why character-based primary keys are less efficient.

Problems with Using UUID as a Primary Key

Characteristics of UUID

  • A UUID is a 128-bit value, typically written as a 36-character string: 32 hexadecimal digits plus 4 hyphens (e.g., 550e8400-e29b-41d4-a716-446655440000).
  • UUIDs are globally unique, making them suitable for generating unique identifiers in distributed systems.
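
A quick way to see the gap between the 36-character text form and the underlying 128-bit value is to ask MySQL for both lengths. This is a minimal sketch that assumes MySQL 8.0 or later, where `UUID_TO_BIN()` and `IS_UUID()` are available.

```sql
-- Assumes MySQL 8.0+ (UUID_TO_BIN and IS_UUID are not available in 5.7).
SELECT
    IS_UUID('550e8400-e29b-41d4-a716-446655440000')             AS is_valid,     -- 1 for a well-formed UUID
    CHAR_LENGTH('550e8400-e29b-41d4-a716-446655440000')         AS text_chars,   -- 32 hex digits + 4 hyphens = 36
    LENGTH(UUID_TO_BIN('550e8400-e29b-41d4-a716-446655440000')) AS binary_bytes; -- 128 bits = 16 bytes
```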

Disadvantages of Using UUID as a Primary Key

  • Low Index Efficiency

    • Index Size: Stored as text in a CHAR(36) column, a UUID takes at least 36 bytes per value, whereas an integer primary key such as BIGINT takes 8 bytes. In InnoDB, every secondary index entry also stores the primary key value, so the extra bytes are repeated in each secondary index. Larger indexes fit fewer entries per page and are less efficient to store and query.
    • Index Splitting: Random UUIDs (such as UUIDv4) have no relation to insertion order, so new rows can cause frequent splitting and rebalancing of the index tree.
  • Poor Insertion Performance

    • Randomness: Because random UUIDs are unordered, each new record may land at an arbitrary position in the index tree, whereas an auto-incrementing key always appends at the right-hand end.
    • Page Splitting: The InnoDB storage engine uses a B+ tree as its index structure. An insert into a page that is already full forces a page split, so random inserts cause more splits, leave pages partly empty, and increase disk I/O.
  • Poor Query Performance

    • Low Comparison Efficiency: Comparing two 36-character strings takes more work than comparing two 8-byte integers, and every index lookup performs many such comparisons.
    • Large Index Scan Range: Because each UUID key is larger, fewer index entries fit in a page, so the same query has to read more pages.
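
One way to check how much the key type costs on your own data is to load the same rows into two tables and compare their reported sizes. The sketch below assumes two hypothetical tables, `users_uuid` and `users_bigint`, that differ only in the primary key type. The figures are InnoDB estimates, not exact byte counts.

```sql
-- Hypothetical tables: users_uuid (CHAR(36) primary key) and
-- users_bigint (BIGINT primary key), loaded with the same rows.
ANALYZE TABLE users_uuid, users_bigint;  -- refresh the table statistics first

SELECT
    TABLE_NAME,
    TABLE_ROWS                           AS approx_rows,
    ROUND(DATA_LENGTH  / 1024 / 1024, 1) AS clustered_index_mb,   -- rows live in the primary key index
    ROUND(INDEX_LENGTH / 1024 / 1024, 1) AS secondary_indexes_mb  -- each entry also carries the primary key
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('users_uuid', 'users_bigint');
```

If the tables have secondary indexes, the difference in `secondary_indexes_mb` shows how the wider key is repeated in each of them.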

Why Modifying Data Causes Index Refreshes

Purpose of Indexes

  • Indexes are data structures (such as B+ trees) created to accelerate queries.
  • When data is modified, the index must also be updated to maintain data consistency.

Impact of Data Modification on Indexes

  • Updating the Primary Key

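    • If the primary key value is modified, MySQL needs to delete the old primary key index record and insert a new one. In InnoDB the primary key is the clustered index, so the whole row moves, and every secondary index entry for that row must be updated because it stores the primary key value.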
    • This process requires adjustments to the index tree, increasing disk I/O operations.
  • Updating Non-Primary Key Columns

    • If the modified column is an indexed column (such as a unique index or regular index), MySQL needs to update the corresponding index records.
    • This process also causes index tree adjustments.
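
The three kinds of update can be compared side by side. This is an illustrative sketch on a hypothetical `accounts` table with a single secondary index. The comments describe InnoDB's general behavior and are not measurements.

```sql
-- Hypothetical table with one secondary index on email.
CREATE TABLE accounts (
    id      CHAR(36)      NOT NULL PRIMARY KEY,
    email   VARCHAR(255)  NOT NULL,
    balance DECIMAL(12,2) NOT NULL DEFAULT 0,
    KEY idx_email (email)
);

INSERT INTO accounts (id, email)
VALUES ('550e8400-e29b-41d4-a716-446655440000', 'a@example.com');

-- 1. Non-indexed column: the row changes, but no index entry has to move.
UPDATE accounts SET balance = balance + 10
WHERE id = '550e8400-e29b-41d4-a716-446655440000';

-- 2. Indexed column: the idx_email entry is removed and re-inserted
--    under the new value.
UPDATE accounts SET email = 'b@example.com'
WHERE id = '550e8400-e29b-41d4-a716-446655440000';

-- 3. Primary key: the row moves within the clustered index, and the
--    idx_email entry (which stores the primary key) is rewritten too.
UPDATE accounts SET id = '6ba7b810-9dad-11d1-80b4-00c04fd430c8'
WHERE id = '550e8400-e29b-41d4-a716-446655440000';
```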

Additional Overhead of UUID Primary Keys

  • Since UUIDs are unordered, modifying the primary key value may insert the new value at a different position in the index tree, leading to frequent index tree adjustments.
  • Compared to ordered primary keys (such as auto-incrementing IDs), modifying a UUID primary key incurs a higher cost.

Why Character-Based Primary Keys Reduce Efficiency

Large Storage Space

  • Character-based primary keys (such as UUIDs stored as CHAR(36)) take up more storage space than integer primary keys.
  • The size of an index directly affects query performance: the larger the index, the less of it fits in memory and the more disk I/O a query needs.

Low Comparison Efficiency

  • String comparisons are slower than integer comparisons because they compare more bytes and must apply the column's collation rules. The difference adds up in large tables, where each lookup compares many keys.
  • For example, WHERE id = '550e8400-e29b-41d4-a716-446655440000' is less efficient than WHERE id = 12345.
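
The key width that each lookup has to compare can be inspected with `EXPLAIN`. The sketch below uses three hypothetical tables that differ only in the key type, and assumes MySQL 8.0+ for `UUID_TO_BIN()`. The `key_len` column of each plan reports how many key bytes are used. For a CHAR column the value depends on the character set (up to 4 bytes per character under utf8mb4).

```sql
-- Hypothetical tables that differ only in the primary key type.
CREATE TABLE lookup_char   (id CHAR(36)   NOT NULL PRIMARY KEY, name VARCHAR(255));
CREATE TABLE lookup_bin    (id BINARY(16) NOT NULL PRIMARY KEY, name VARCHAR(255));
CREATE TABLE lookup_bigint (id BIGINT     NOT NULL PRIMARY KEY, name VARCHAR(255));

-- Insert one row each so the optimizer has a matching row to plan for.
INSERT INTO lookup_char   VALUES ('550e8400-e29b-41d4-a716-446655440000', 'a');
INSERT INTO lookup_bin    VALUES (UUID_TO_BIN('550e8400-e29b-41d4-a716-446655440000'), 'a');
INSERT INTO lookup_bigint VALUES (12345, 'a');

-- Compare the key_len column across the three plans.
EXPLAIN SELECT name FROM lookup_char
WHERE id = '550e8400-e29b-41d4-a716-446655440000';

EXPLAIN SELECT name FROM lookup_bin
WHERE id = UUID_TO_BIN('550e8400-e29b-41d4-a716-446655440000');

EXPLAIN SELECT name FROM lookup_bigint
WHERE id = 12345;
```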

Index Splitting

  • Character-based primary keys such as random UUIDs usually do not follow insertion order, so inserting new data may cause frequent index tree splits and rebalancing.

How to Optimize the Performance of UUID Primary Keys

Use Ordered UUIDs

  • Use time-ordered UUIDs (such as UUIDv7) to reduce index splits and page splits.
  • Ordered UUIDs start with a timestamp, so newly generated values sort after older ones and inserts land near the end of the index.
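
UUIDv7 values would normally be generated in the application. As a related but different technique that needs no application code, the sketch below uses MySQL's own `UUID()` function, which returns version 1 UUIDs, together with the swap flag of `UUID_TO_BIN()`. The flag moves the timestamp fields to the front of the binary value so that values generated later sort later. It assumes MySQL 8.0+ and a hypothetical `events` table.

```sql
-- Hypothetical table; the key is stored as 16 bytes.
CREATE TABLE events (
    id      BINARY(16) NOT NULL PRIMARY KEY,
    payload VARCHAR(255)
);

-- The second argument (1) swaps the time-low and time-high parts of the
-- version 1 UUID so the binary value is ordered by generation time.
INSERT INTO events (id, payload) VALUES (UUID_TO_BIN(UUID(), 1), 'first');
INSERT INTO events (id, payload) VALUES (UUID_TO_BIN(UUID(), 1), 'second');

-- Use the same flag when converting back, or the text form will be wrong.
SELECT BIN_TO_UUID(id, 1) AS id_text, payload
FROM events
ORDER BY id;
```

The swap flag only helps with version 1 UUIDs. It does not make random version 4 values sequential.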

Store UUIDs in Binary Format

  • Store UUIDs as BINARY(16) instead of CHAR(36) to reduce storage space.
CREATE TABLE users (
    id BINARY(16) PRIMARY KEY,
    name VARCHAR(255)
);
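
With a BINARY(16) column, the text form has to be converted on the way in and on the way out. A minimal sketch against the `users` table above, assuming MySQL 8.0+ for `UUID_TO_BIN()` and `BIN_TO_UUID()`:

```sql
-- Convert the 36-character text form to 16 bytes when writing.
INSERT INTO users (id, name)
VALUES (UUID_TO_BIN('550e8400-e29b-41d4-a716-446655440000'), 'alice');

-- Convert the search value, not the column, so the primary key index is used.
-- BIN_TO_UUID turns the stored bytes back into readable text.
SELECT BIN_TO_UUID(id) AS id, name
FROM users
WHERE id = UUID_TO_BIN('550e8400-e29b-41d4-a716-446655440000');
```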

Use Auto-Incrementing Primary Keys + UUID

  • Use an auto-incrementing primary key as the physical primary key and a UUID as the logical primary key.
CREATE TABLE users (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    uuid CHAR(36) NOT NULL UNIQUE,
    name VARCHAR(255)
);
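
In this layout, other tables reference the compact numeric `id`, and the UUID is only used to find a row from the outside. The sketch below adds a hypothetical `orders` table to show the pattern.

```sql
-- Hypothetical child table that references users by the numeric key.
CREATE TABLE orders (
    id      BIGINT AUTO_INCREMENT PRIMARY KEY,
    user_id BIGINT        NOT NULL,
    total   DECIMAL(10,2) NOT NULL,
    KEY idx_user_id (user_id)
);

-- External callers identify a user by UUID. The unique index resolves it
-- to the numeric id once, and the join itself runs on 8-byte keys.
SELECT o.id, o.total
FROM users AS u
JOIN orders AS o ON o.user_id = u.id
WHERE u.uuid = '550e8400-e29b-41d4-a716-446655440000';
```

The trade-off is one extra unique index on `uuid`, which still receives inserts in random order if the UUIDs are random.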

Use Partitioned Tables

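  • Partition large tables to reduce the size of individual index trees and improve query performance. Note that MySQL requires every unique key, including the primary key, to contain all columns used in the partitioning expression.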
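
As a minimal sketch, a table can be hash-partitioned on its primary key. The table name `page_views` and the partition count of 8 are arbitrary choices for illustration.

```sql
-- Hypothetical table. The primary key is the partitioning column, which
-- satisfies MySQL's rule that unique keys include the partitioning columns.
CREATE TABLE page_views (
    id  BIGINT NOT NULL AUTO_INCREMENT,
    url VARCHAR(255),
    PRIMARY KEY (id)
)
PARTITION BY HASH (id)
PARTITIONS 8;

INSERT INTO page_views (id, url) VALUES (42, '/home');

-- A lookup by id only has to search one partition. The partitions column
-- of the EXPLAIN output shows which one.
EXPLAIN SELECT url FROM page_views WHERE id = 42;
```

Each partition has its own smaller index tree, but partitioning alone does not make random keys sequential.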

Summary

  • Disadvantages of Using UUID as a Primary Key

    • Low index efficiency, poor insertion and query performance.
    • Modifying indexed columns forces index updates, which degrades performance.
  • Reasons for Low Efficiency of Character-Based Primary Keys

    • Large storage space, low comparison efficiency, and frequent index splits.
  • Optimization Recommendations

    • Use ordered UUIDs or store them in binary format.
    • Combine auto-incrementing primary keys with UUIDs.
    • Partition large tables.

We are Leapcell, your top choice for hosting backend projects.
