Invention Title:

DISTRIBUTED DATABASE WITH INDEPENDENT SCALING OF COMMIT LAYER AND STORAGE LAYER

Publication number:

US20260154254

Publication date:

Section:

Physics

Class:

G06F16/2379

Inventors:

Assignee:

Applicant:

Smart overview of the Invention

A distributed database system has a distinct commit layer and storage layer, each implemented on a separate set of host computing devices. The two layers can be scaled and sharded independently, so each can use its own sharding scheme. This lets capacity be matched to read and write workloads that differ from one another. The commit layer manages writes, while the storage layer stores and retrieves data.

Background

Traditional database systems scale by adding partitions, which increases read and write capacity together. This coupling can be inefficient because read and write demand often differ across key ranges: the system may end up over-scaled for writes or under-scaled for reads, misallocating resources. The described system addresses this by decoupling the scaling of the commit layer from that of the storage layer.

System Architecture

The database system includes query processors that handle client requests to execute reads and writes. Each write is sent to the commit layer, which durably commits it and returns an acknowledgment once the commit succeeds. The storage layer reads committed writes from the commit layer's journal and stores them across its shards. Within the commit layer, adjudicator instances manage writes and ensure that each write is committed before it is acknowledged. This arrangement lets each layer use a sharding scheme suited to its own workload.
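The write path described above can be illustrated with a minimal in-memory sketch. It is not the claimed implementation: the class names (`Journal`, `Adjudicator`, `StorageLayer`), the `catch_up` method, and the byte-sum shard placement are all hypothetical, and durability is only simulated by appending to a list.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Append-only log of committed writes (hypothetical in-memory stand-in)."""
    entries: list = field(default_factory=list)

    def append(self, key, value):
        self.entries.append((key, value))
        return len(self.entries) - 1  # sequence number of the committed write


@dataclass
class Adjudicator:
    """Commit-layer instance: commits a write to the journal, then acknowledges it."""
    journal: Journal

    def commit(self, key, value):
        seq = self.journal.append(key, value)  # commit first...
        return {"acked": True, "seq": seq}     # ...acknowledge only afterwards


@dataclass
class StorageLayer:
    """Reads committed writes from the journal and stores them across its shards."""
    num_shards: int
    shards: list = field(init=False)
    applied: int = 0  # number of journal entries already applied

    def __post_init__(self):
        self.shards = [dict() for _ in range(self.num_shards)]

    def shard_for(self, key):
        # Illustrative placement only (assumption): deterministic byte-sum modulo.
        return sum(key.encode()) % self.num_shards

    def catch_up(self, journal):
        """Apply every journal entry that has not been applied yet."""
        for key, value in journal.entries[self.applied:]:
            self.shards[self.shard_for(key)][key] = value
        self.applied = len(journal.entries)

    def read(self, key):
        return self.shards[self.shard_for(key)].get(key)


journal = Journal()
adjudicator = Adjudicator(journal)
storage = StorageLayer(num_shards=3)

ack = adjudicator.commit("user:1", "alice")  # write goes to the commit layer
storage.catch_up(journal)                    # storage layer reads the journal
value = storage.read("user:1")               # read is served by the storage layer
```

In this sketch the storage layer never receives writes directly; it only consumes the journal, which is what allows the number of adjudicators and the number of storage shards to be chosen separately.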

Sharding Schemes

The commit and storage layers use different sharding schemes to handle differing volumes of reads and writes. For instance, keys with high write volumes may be sharded differently from keys with high read volumes. The storage layer may include replicas to increase read capacity, and the commit layer does not need to mirror them. Each layer's sharding can thus be tailored to the access patterns of the data it handles, which improves resource allocation.
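As a rough illustration of two layers sharding the same key space differently, the sketch below assumes range-based sharding defined by sorted split points. The `ShardingScheme` class, its fields, and the example workload (write-heavy keys near the start of the key space, read-heavy keys near the end) are assumptions made for this example only.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardingScheme:
    """Range sharding: sorted split points divide the key space into shards."""
    split_points: tuple  # hypothetical representation of shard boundaries
    replicas: int = 1    # copies of each shard; only meaningful for reads here

    def shard_for(self, key):
        # Keys below the first split point map to shard 0, and so on.
        return bisect.bisect_right(self.split_points, key)

    @property
    def num_shards(self):
        return len(self.split_points) + 1


# Assumed workload: keys "a".."f" are write-heavy, keys "t".."z" are read-heavy.
# The commit layer splits the write-heavy range finely; the storage layer splits
# the read-heavy range finely and adds replicas that the commit layer does not mirror.
commit_scheme = ShardingScheme(split_points=("c", "f"))
storage_scheme = ShardingScheme(split_points=("t", "w"), replicas=3)

for key in ("b", "d", "u", "x"):
    print(key, commit_scheme.shard_for(key), storage_scheme.shard_for(key))
```

Keys `"b"` and `"d"` fall on different commit shards but the same storage shard, while `"u"` and `"x"` share a commit shard and are separated in the storage layer.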

Control Plane and Monitoring

A control plane oversees sharding and load distribution across the commit and storage layers. It gathers load statistics from each shard to monitor shard heat, and it adjusts the sharding schemes as needed, re-sharding dynamically to relieve areas of high demand. The system can thereby adapt to changing workloads while maintaining efficiency and performance.
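One way a control plane might turn load statistics into a re-sharding decision is sketched below. The source does not specify the policy, so everything here is an assumption: range sharding by sorted split points, per-key load counts as the statistics, a fixed heat threshold, and a split that roughly halves a hot shard's load. The function names `shard_heat` and `reshard` are hypothetical.

```python
import bisect


def shard_heat(split_points, key_load):
    """Aggregate per-key load into per-shard heat for a range-sharded layer."""
    heat = [0] * (len(split_points) + 1)
    for key, load in key_load.items():
        heat[bisect.bisect_right(split_points, key)] += load
    return heat


def reshard(split_points, key_load, threshold):
    """Split each shard whose heat exceeds threshold into two (single pass)."""
    new_points = list(split_points)
    heat = shard_heat(split_points, key_load)
    for shard, h in enumerate(heat):
        if h <= threshold:
            continue  # shard is not hot; leave it alone
        keys = sorted(
            k for k in key_load
            if bisect.bisect_right(split_points, k) == shard
        )
        running = 0
        # A shard holding a single key cannot be split by range; the loop is empty.
        for i in range(1, len(keys)):
            running += key_load[keys[i - 1]]
            # Split once about half the load is on the left, or at the last key.
            if running >= h / 2 or i == len(keys) - 1:
                new_points.append(keys[i])  # keys >= keys[i] move to the new shard
                break
    return sorted(new_points)


# Assumed statistics for one layer: keys "b" and "c" are hot.
key_load = {"a": 10, "b": 500, "c": 450, "d": 40, "x": 20}
before = shard_heat(["m"], key_load)
after_points = reshard(["m"], key_load, threshold=600)
after = shard_heat(after_points, key_load)
```

Because the function takes one layer's split points and one layer's statistics, it could be run separately with write load for the commit layer and read load for the storage layer, leaving the two layers with different shard boundaries.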