Curvine: High-Performance Distributed Cache (Now Open Source)

Founder of Curvine

What is Curvine

Curvine is a distributed caching system implemented in Rust, designed for high concurrency, high throughput, low latency, and low resource consumption. Unlike KV systems such as Redis or TiKV, Curvine provides file caching only. It is a caching layer, not a storage system: data persistence still relies on the underlying file system or object storage system.
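The distinction between a caching layer and a storage system can be illustrated with a minimal read-through cache. This is only a sketch of the general idea, not Curvine's API: the class name `ReadThroughFileCache`, the dictionary used as a backing store, and the file path are all hypothetical.

```python
class ReadThroughFileCache:
    """Toy read-through cache: hot files are served from memory,
    everything else is fetched from a backing store (hypothetical sketch)."""

    def __init__(self, backing_store):
        # backing_store stands in for the underlying file system or object store
        self.backing_store = backing_store
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def read(self, path):
        if path in self.cache:
            self.hits += 1
            return self.cache[path]
        # On a miss, the backing store remains the source of truth
        self.misses += 1
        data = self.backing_store[path]
        self.cache[path] = data
        return data

    def evict(self, path):
        # Dropping a cached copy never loses data: the backing store still has it
        self.cache.pop(path, None)


backing = {"/warehouse/part-0000.parquet": b"example bytes"}
cache = ReadThroughFileCache(backing)

first = cache.read("/warehouse/part-0000.parquet")   # miss, loaded from backing store
second = cache.read("/warehouse/part-0000.parquet")  # hit, served from cache
cache.evict("/warehouse/part-0000.parquet")
third = cache.read("/warehouse/part-0000.parquet")   # miss again, reloaded
```

Evicting an entry only costs a later reload, which is why a cache of this kind can trade capacity for speed without taking responsibility for durability.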

What problem does it solve

  1. Large-scale Data I/O Performance Bottlenecks;
  2. Single-Machine Cache Capacity Limitations.

 In practical applications, what scenarios are suitable for Curvine acceleration?

Fig. 1: Curvine Application Scenarios.

 As shown in the figure above, Curvine is designed for the following five core scenarios:

  1. Accelerating intermediate data processing in big data shuffle operations
  2. Caching hot table data for faster big data analytics
  3. Boosting AI training efficiency through dataset caching
  4. Accelerating model file distribution via caching layer
  5. Cross-cloud data caching to mitigate performance bottlenecks of dedicated cloud connections

These use cases are just the beginning. In simple terms, Curvine addresses one fundamental problem: the growing conflict between escalating computational demands and the I/O bottlenecks of distributed cache systems.

Performance

We present performance and resource utilization in three areas:

1. Metadata operation performance

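| Operation type | Curvine (QPS) | JuiceFS (QPS) | OSS (QPS) |
|----------------|---------------|---------------|-----------|
| create         | 19,985        | 16,000        | 2,000     |
| open           | 60,376        | 50,000        | 3,900     |
| rename         | 43,009        | 21,000        | 200       |
| delete         | 39,013        | 41,000        | 1,900     |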

Note: All benchmark comparisons were conducted with a concurrency level of 40.

Detailed results: https://curvineio.github.io/docs/Benchmark/meta/

Published benchmark data for the comparable products: https://juicefs.com/zh-cn/blog/engineering/meta-perf-hdfs-oss-jfs
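To make the metadata table easier to read, the ratios between systems can be computed directly from the figures above. The snippet below is a small illustrative calculation; the variable and function names are ours, and the numbers are copied from the table.

```python
# QPS figures copied from the metadata benchmark table above
qps = {
    "create": {"curvine": 19985, "juicefs": 16000, "oss": 2000},
    "open":   {"curvine": 60376, "juicefs": 50000, "oss": 3900},
    "rename": {"curvine": 43009, "juicefs": 21000, "oss": 200},
    "delete": {"curvine": 39013, "juicefs": 41000, "oss": 1900},
}


def ratio_table(qps):
    """Return Curvine's QPS as a multiple of each other system, per operation."""
    out = {}
    for op, row in qps.items():
        out[op] = {
            "vs_juicefs": round(row["curvine"] / row["juicefs"], 2),
            "vs_oss": round(row["curvine"] / row["oss"], 2),
        }
    return out


ratios = ratio_table(qps)
for op, r in ratios.items():
    print(f"{op:<7} {r['vs_juicefs']:>6.2f}x vs JuiceFS  {r['vs_oss']:>7.2f}x vs OSS")
```

By these figures Curvine leads on create, open, and rename, with the largest gap on rename. Delete is the one operation where the JuiceFS figure is slightly higher (41,000 vs 39,013 QPS).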

2. Data Read/Write Performance

Comparison with Alluxio under identical hardware conditions:

● 256K sequential read

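| Thread count | Curvine open source (GiB/s) | Alluxio open source (GiB/s) |
|--------------|-----------------------------|-----------------------------|
| 1            | 2.2                         | 0.6                         |
| 2            | 3.7                         | 1.1                         |
| 4            | 6.8                         | 2.3                         |
| 8            | 8.9                         | 4.5                         |
| 16           | 9.2                         | 7.9                         |
| 32           | 9.5                         | 8.8                         |
| 64           | 9.2                         | N/A                         |
| 128          | 9.2                         | N/A                         |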
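One way to read the sequential-read figures is as scaling efficiency: throughput at N threads divided by N times the single-thread throughput. The following is an illustrative calculation on the Curvine column only; the function name is ours, not part of any Curvine tooling.

```python
# Curvine 256K sequential read throughput (GiB/s) by thread count, from the table above
seq_read = {1: 2.2, 2: 3.7, 4: 6.8, 8: 8.9, 16: 9.2, 32: 9.5, 64: 9.2, 128: 9.2}


def scaling_efficiency(throughput):
    """Throughput at N threads relative to perfect linear scaling from 1 thread."""
    base = throughput[1]
    return {t: round(v / (t * base), 2) for t, v in throughput.items()}


eff = scaling_efficiency(seq_read)
peak_threads = max(seq_read, key=seq_read.get)  # thread count with the highest throughput

for t in seq_read:
    print(f"{t:>3} threads: {seq_read[t]:.1f} GiB/s, efficiency {eff[t]:.2f}")
```

Throughput grows quickly up to 8 threads and then plateaus a little above 9 GiB/s, which suggests a limit other than thread count. The source does not say what that limit is.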

● 256K random read

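| Thread count | Curvine open source (GiB/s) | Alluxio open source (GiB/s) |
|--------------|-----------------------------|-----------------------------|
| 1            | 0.3                         | 0.0                         |
| 2            | 0.7                         | 0.1                         |
| 4            | 1.4                         | 0.1                         |
| 8            | 2.8                         | 0.2                         |
| 16           | 5.2                         | 0.4                         |
| 32           | 7.8                         | 0.3                         |
| 64           | 8.7                         | N/A                         |
| 128          | 9.0                         | N/A                         |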

Alluxio data source (official Alluxio website): https://www.alluxio.com.cn/alluxio-enterprise-vs-open-source/
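The relative throughput of the two systems can be derived from the two read tables for the thread counts where both report a value (1 to 32). This is a rough sketch with names of our own choosing; rows where Alluxio is N/A are left out, and an Alluxio value of 0.0 yields no ratio.

```python
THREADS = [1, 2, 4, 8, 16, 32]

# GiB/s values copied from the two 256K read tables above
sequential = {
    "curvine": [2.2, 3.7, 6.8, 8.9, 9.2, 9.5],
    "alluxio": [0.6, 1.1, 2.3, 4.5, 7.9, 8.8],
}
random_read = {
    "curvine": [0.3, 0.7, 1.4, 2.8, 5.2, 7.8],
    "alluxio": [0.0, 0.1, 0.1, 0.2, 0.4, 0.3],
}


def speedups(table):
    """Curvine throughput as a multiple of Alluxio's; None when Alluxio reads 0.0."""
    result = {}
    for threads, c, a in zip(THREADS, table["curvine"], table["alluxio"]):
        result[threads] = round(c / a, 1) if a > 0 else None
    return result


seq_speedup = speedups(sequential)
rand_speedup = speedups(random_read)
```

Because the published values are rounded to one decimal place, the random-read ratios are coarse: an Alluxio figure shown as 0.1 GiB/s could be anywhere from roughly 0.05 to 0.15, so these multiples should be read as orders of magnitude rather than precise factors.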

3. Resource consumption

In the big data shuffle acceleration scenario, comparing the production resource consumption of Curvine and Alluxio shows a reduction of more than 90% in memory usage and more than 50% in CPU usage, which we attribute to Rust's language features.

Architecture Overview

Curvine's architectural design philosophy: Simplicity, Excellence, and Generality.

Fig. 2: Curvine Architecture Diagram.

Simplicity: Lightweight design with only two roles in the caching service: master and worker. Non-performance-critical modules reuse open-source or existing technologies as much as possible, keeping code complexity to a minimum.

Excellence: Key performance-impacting components (e.g., the underlying RPC communication framework and the FUSE implementation) are independently designed and optimized with a performance-first mindset.

Generality: Compatible with multiple existing access modes. The underlying storage supports mainstream distributed file and object storage systems, ensuring versatility and ease of use.

On Open-Source

 We have achieved significant performance gains by deploying Curvine in high-concurrency, high-throughput big data scenarios internally. Now, we aim to collaborate with external partners to co-build this solution and collectively accelerate the infrastructure transition to Rust.

https://github.com/curvineio/curvine

 Powered by OPPO Bigdata.