Curvine: High-Performance Distributed Cache (Now Open Source)
What is Curvine
Curvine is a distributed caching system implemented in Rust, designed for high concurrency, high throughput, low latency, and low resource consumption. Unlike key-value caches such as Redis or TiKV, Curvine provides file caching only. It is a caching layer, not a storage system: data persistence still relies on the underlying file system or object storage system.
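To make the difference between a caching layer and a storage system concrete, here is a minimal Python sketch of a read-through file cache. It is not Curvine's API or implementation: the class `ReadThroughFileCache`, the dict standing in for the underlying storage, and the example path are all hypothetical and chosen only for illustration.

```python
class ReadThroughFileCache:
    """Toy read-through cache: serves file bytes from memory and
    falls back to a backing store on a miss. Illustrative only."""

    def __init__(self, backing_store):
        # backing_store: any mapping of path -> bytes, standing in for
        # the underlying file system or object store (an assumption).
        self.backing_store = backing_store
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def read(self, path):
        if path in self.cache:
            self.hits += 1
            return self.cache[path]
        # Miss: the persistent copy lives in the backing store.
        self.misses += 1
        data = self.backing_store[path]
        self.cache[path] = data
        return data

    def evict(self, path):
        # Evicting a cached entry never loses data, because the
        # backing store still holds the persistent copy.
        self.cache.pop(path, None)


store = {"/datasets/train.bin": b"example bytes"}  # hypothetical path
cache = ReadThroughFileCache(store)

first = cache.read("/datasets/train.bin")   # miss, loaded from the store
second = cache.read("/datasets/train.bin")  # hit, served from the cache
cache.evict("/datasets/train.bin")
third = cache.read("/datasets/train.bin")   # miss again, reloaded
```

The point of the sketch is the eviction path: because the cache holds only a copy, dropping an entry costs a later reload but never the data itself.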
What problem does it solve
- Large-scale Data I/O Performance Bottlenecks;
- Single-Machine Cache Capacity Limitations.
In practice, which scenarios benefit from Curvine acceleration?

Fig. 1: Curvine Application Scenarios.
As shown in the figure above, Curvine is designed for the following five core scenarios:
- Accelerating intermediate data processing in big data shuffle operations
- Caching hot table data for faster big data analytics
- Boosting AI training efficiency through dataset caching
- Accelerating model file distribution via a caching layer
- Cross-cloud data caching to mitigate performance bottlenecks of dedicated cloud connections
These use cases are just the beginning. In short, Curvine addresses the growing conflict between escalating computational demands and the I/O bottlenecks of distributed cache systems.
Performance
We evaluate performance and resource usage in the following three areas:
1. Metadata operation performance
Operation Type | Curvine (QPS) | JuiceFS (QPS) | OSS (QPS) |
---|---|---|---|
create | 19,985 | 16,000 | 2,000 |
open | 60,376 | 50,000 | 3,900 |
rename | 43,009 | 21,000 | 200 |
delete | 39,013 | 41,000 | 1,900 |
Note: All benchmark comparisons were conducted with a concurrency level of 40.
Detailed results: https://curvineio.github.io/docs/Benchmark/meta/
Published benchmark data for the comparable products: https://juicefs.com/zh-cn/blog/engineering/meta-perf-hdfs-oss-jfs
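The relative differences are easier to read as ratios. The short Python sketch below divides Curvine's QPS by each baseline, using only the figures from the table above; the helper name `ratio_to_curvine` is our own and not part of any tool.

```python
# QPS figures copied from the metadata table above (concurrency level 40).
qps = {
    "create": {"Curvine": 19985, "JuiceFS": 16000, "OSS": 2000},
    "open":   {"Curvine": 60376, "JuiceFS": 50000, "OSS": 3900},
    "rename": {"Curvine": 43009, "JuiceFS": 21000, "OSS": 200},
    "delete": {"Curvine": 39013, "JuiceFS": 41000, "OSS": 1900},
}


def ratio_to_curvine(op, baseline):
    """Return Curvine QPS divided by the baseline's QPS for one operation."""
    row = qps[op]
    return row["Curvine"] / row[baseline]


ratios = {
    op: {b: ratio_to_curvine(op, b) for b in ("JuiceFS", "OSS")}
    for op in qps
}

for op, r in ratios.items():
    print(f"{op:<7} vs JuiceFS: {r['JuiceFS']:.2f}x   vs OSS: {r['OSS']:.2f}x")
```

A ratio above 1 means Curvine is faster. By these numbers, delete is the one operation where JuiceFS is ahead (39,013 vs. 41,000 QPS), and rename shows the largest gap against both baselines.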
2. Data Read/Write Performance
Comparison with Alluxio under identical hardware conditions:
● 256K sequential read
Thread count | Curvine open-source edition (GiB/s) | Alluxio open-source edition (GiB/s) |
---|---|---|
1 | 2.2 | 0.6 |
2 | 3.7 | 1.1 |
4 | 6.8 | 2.3 |
8 | 8.9 | 4.5 |
16 | 9.2 | 7.9 |
32 | 9.5 | 8.8 |
64 | 9.2 | N/A |
128 | 9.2 | N/A |
● 256K random read
Thread count | Curvine open-source edition (GiB/s) | Alluxio open-source edition (GiB/s) |
---|---|---|
1 | 0.3 | 0.0 |
2 | 0.7 | 0.1 |
4 | 1.4 | 0.1 |
8 | 2.8 | 0.2 |
16 | 5.2 | 0.4 |
32 | 7.8 | 0.3 |
64 | 8.7 | N/A |
128 | 9.0 | N/A |
Alluxio data as published on the official Alluxio website: https://www.alluxio.com.cn/alluxio-enterprise-vs-open-source/.
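As a rough way to read the two tables above, the following Python sketch computes the Curvine-to-Alluxio throughput ratio per thread count. The numbers are copied from the tables; the function name `speedups` is our own. Cells marked N/A are represented as `None`, and the Alluxio value of 0.0 (a figure that rounds to zero at one decimal place) is skipped rather than divided by.

```python
threads = [1, 2, 4, 8, 16, 32, 64, 128]

# Throughput in GiB/s, copied from the tables above. None marks N/A cells.
sequential = {
    "curvine": [2.2, 3.7, 6.8, 8.9, 9.2, 9.5, 9.2, 9.2],
    "alluxio": [0.6, 1.1, 2.3, 4.5, 7.9, 8.8, None, None],
}
random_read = {
    "curvine": [0.3, 0.7, 1.4, 2.8, 5.2, 7.8, 8.7, 9.0],
    "alluxio": [0.0, 0.1, 0.1, 0.2, 0.4, 0.3, None, None],
}


def speedups(curvine, alluxio):
    """Curvine/Alluxio ratio per row; None where no ratio can be formed."""
    result = []
    for c, a in zip(curvine, alluxio):
        # Skip N/A cells (None) and values reported as 0.0.
        result.append(c / a if a else None)
    return result


seq_speedup = speedups(sequential["curvine"], sequential["alluxio"])
rand_speedup = speedups(random_read["curvine"], random_read["alluxio"])

for t, s, r in zip(threads, seq_speedup, rand_speedup):
    seq_text = f"{s:.1f}x" if s is not None else "n/a"
    rand_text = f"{r:.1f}x" if r is not None else "n/a"
    print(f"{t:>3} threads  sequential: {seq_text:>6}  random: {rand_text:>6}")
```

Read this way, the sequential-read gap narrows as the thread count grows (from about 3.7x at 1 thread to about 1.1x at 32 threads), while the random-read gap stays above 7x wherever a ratio can be computed. Because the inputs carry only one decimal place, the random-read ratios in particular should be treated as approximate.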
3. Resource consumption
In the big data shuffle acceleration scenario, our comparison of resource consumption in production shows that Curvine uses over 90% less memory and over 50% less CPU than Alluxio, a result we attribute largely to Rust's language features.
Architecture Overview
Curvine's architectural design philosophy: Simplicity, Excellence, and Universality.

Fig. 2: Curvine Application Scenarios.
Simplicity: A lightweight design with only two roles in the caching service: master and worker. Modules that are not performance-critical reuse open-source or existing technologies wherever possible, keeping code complexity to a minimum.
Excellence: Performance-critical components (e.g., the underlying RPC communication framework and the FUSE implementation) are independently designed and optimized with a performance-first mindset.
Universality: Compatible with multiple existing access modes. The underlying storage layer supports mainstream distributed file systems and object storage systems, ensuring versatility and ease of use.
On Open-Source
We have achieved significant performance gains by deploying Curvine internally in high-concurrency, high-throughput big data scenarios. We now aim to work with external partners to co-build the project and collectively accelerate the infrastructure transition to Rust.
https://github.com/curvineio/curvine
Powered by OPPO Bigdata.