Skip to main content

Curvine Distributed Cache System User Guide

ยท 12 min read

Version License Documentation

๐Ÿ“š Table of Contentsโ€‹


๐ŸŽฏ System Overviewโ€‹

Curvine is a high-performance, cloud-native distributed caching system designed for modern data-intensive applications. It provides an intelligent caching layer between underlying storage (UFS) and compute engines, significantly improving data access performance.

๐Ÿ† Performance Advantagesโ€‹

Compared to traditional storage access methods, Curvine can provide:

MetricCloud StorageCurvine CachePerformance Improvement
Read Latency100-500ms1-10ms10-50x
Throughput100-500 MB/s1-10 GB/s10-20x
IOPS1K-10K100K-1M100x
Concurrent Connections100-1K10K-100K100x

Core Componentsโ€‹

  • Master Cluster: Metadata management, cache scheduling, consistency guarantees
  • Worker Nodes: Data caching, I/O processing, task execution
  • Client SDK: Multi-language clients, supporting Rust, Fuse, Java, Python
  • Job Manager: Distributed task scheduling and management
  • Metrics System: Real-time monitoring and performance analysis

๐Ÿ“‚ Path Mount Managementโ€‹

Mounting is the first step in using Curvine cache, which establishes the mapping relationship between underlying storage (UFS) and cache paths.

Mounting Modes Explainedโ€‹

Curvine supports two flexible mounting modes:

๐ŸŽฏ CST Mode (Consistent Path Mode)โ€‹

# Consistent path, easy to manage and maintain
bin/cv mount s3://bucket/data /bucket/data --mnt-type cst

Ideal scenarios:

  • Data lake scenarios with clear path structures
  • Production environments requiring intuitive path mapping
  • Data platforms with multi-team collaboration

๐Ÿ”€ Arch Mode (Orchestration Mode)โ€‹

# Flexible path mapping, supporting complex path transformations
bin/cv mount s3://complex-bucket/deep/nested/path /simple/data --mnt-type arch

Ideal scenarios:

  • Complex storage hierarchies
  • Scenarios requiring path abstraction
  • Multi-cloud storage unified access

Complete Mounting Exampleโ€‹

# Mount S3 storage to Curvine (production-grade configuration)
bin/cv mount \
s3://bucket/warehouse/tpch_500g.db/orders \
/bucket/warehouse/tpch_500g.db/orders \
--ttl-ms 24h \
--ttl-action delete \
--replicas 3 \
--block-size 128MB \
--consistency-strategy always \
--storage-type ssd \
-c s3.endpoint_url=https://s3.ap-southeast-1.amazonaws.com \
-c s3.credentials.access=access_key \
-c s3.credentials.secret=secret_key \
-c s3.region_name=ap-southeast-1

Mounting Parameters Explainedโ€‹

ParameterTypeDefaultDescriptionExample
--ttl-msduration0Cache data expiration time24h, 7d, 30d
--ttl-actionenumnoneExpiration policy: delete/nonedelete
--replicasint1Number of data replicas (1-5)3
--block-sizesize128MBCache block size64MB, 128MB, 256MB
--consistency-strategyenumalwaysConsistency strategynone/always/period
--storage-typeenumdiskStorage medium typemem/ssd/disk

Mount Point Managementโ€‹

# View all mount points
bin/cv mount

# Unmount path
bin/cv unmount /bucket/warehouse/tpch_500g.db/orders

๐Ÿ’พ Intelligent Caching Strategiesโ€‹

Curvine provides multiple intelligent caching strategies, from passive response to active prediction, comprehensively optimizing data access performance.

Active Data Preloadingโ€‹

Active loading allows you to warm up the cache before business peaks to ensure optimal performance:

# Basic loading
bin/cv load s3:/bucket/warehouse/critical-dataset

# Synchronous loading with progress monitoring
bin/cv load s3:/bucket/warehouse/critical-dataset -w

Automatic Caching Strategyโ€‹

Curvine's automatic caching system has significant advantages over traditional solutions:

โœจ Curvine Intelligent Cache Architectureโ€‹

curvine

Core Advantage Comparisonโ€‹

FeatureOpen Source CompetitorsCurvineAdvantage Description
Loading GranularityBlock-levelFile/Directory-levelAvoid fragmentation, ensure integrity
Duplicate ProcessingExists duplicate loadingIntelligent deduplicationSave bandwidth and storage resources
Task SchedulingSimple queueDistributed Job ManagerEfficient concurrency, load balancing
Consistency GuaranteePassive checkingActive awarenessReal-time data synchronization

๐Ÿ”„ Data Consistency Guaranteesโ€‹

Data consistency is a core challenge for caching systems, and Curvine provides multi-level consistency guarantee mechanisms.

Consistency Strategy Detailsโ€‹

1. ๐Ÿšซ None Mode - Highest Performanceโ€‹

bin/cv mount s3://bucket/path /bucket/path --consistency-strategy=none
  • Ideal scenarios: Static data, archived data, read-only datasets
  • Performance: โญโญโญโญโญ (fastest)
  • Consistency: โญโญ (TTL-dependent)

2. โœ… Always Mode - Strong Consistencyโ€‹

bin/cv mount s3://bucket/path /bucket/path --consistency-strategy=always
  • Ideal scenarios: Frequently updated business data, critical business systems
  • Performance: โญโญโญ (has overhead)
  • Consistency: โญโญโญโญโญ (strong consistency)

3. ๐Ÿ•ฐ๏ธ Period Mode - Balanced Solutionโ€‹

bin/cv mount s3://bucket/path /bucket/path \
--consistency-strategy=period \
--check-interval=5m
  • Ideal scenarios: Data with predictable update frequency
  • Performance: โญโญโญโญ (good)
  • Consistency: โญโญโญโญ (periodically guaranteed)

Cache Performance Monitoringโ€‹

Monitoring cache hit ratio is an important way to evaluate the effectiveness of consistency strategies:

# Get cache hit ratio
curl -s http://master:9001/metrics | grep -E "(cache_hits|cache_misses)"
client_mount_cache_hits{id="3108497238"} 823307
client_mount_cache_misses{id="3108497238"} 4380

๐Ÿค– AI/ML Scenario Applicationsโ€‹

AI and machine learning workloads have extremely high requirements for storage performance, and Curvine provides specially optimized functions for this.

Deep Learning Training Optimizationโ€‹

# Optimized data loading for GPU clusters
bin/cv mount s3://datasets/imagenet /datasets/imagenet \
--storage-type=mem \
--block-size=1GB \
--replicas=2

Model Serving Scenariosโ€‹

# Model file caching (low-latency access)
bin/cv mount s3://model/bert-large /models/bert-large \
--storage-type=mem \
--ttl-ms=none \
--consistency-strategy=always

# Inference data caching
bin/cv mount s3://inference/input /inference/input \
--storage-type=ssd \
--ttl-ms=1h \
--consistency-strategy=none

๐Ÿ”— POSIX Semantics and FUSE Accessโ€‹

Curvine perfectly supports POSIX semantics through the FUSE (Filesystem in Userspace) interface, allowing the Curvine cluster to be mounted as a local file system, providing a transparent file access experience for AI/ML applications.

FUSE Usage in AI/ML Trainingโ€‹

1. Deep Learning Training Data Accessโ€‹
# PyTorch training script
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image
import os

class CurvineImageDataset(Dataset):
def __init__(self, root_dir, transform=None):
"""
Directly access data in Curvine through FUSE mount point
root_dir: FUSE mount point path, such as /curvine-fuse/datasets/imagenet
"""
self.root_dir = root_dir
self.transform = transform
self.image_paths = []

# Directly traverse the FUSE-mounted directory
for class_dir in os.listdir(root_dir):
class_path = os.path.join(root_dir, class_dir)
if os.path.isdir(class_path):
for img_file in os.listdir(class_path):
if img_file.lower().endswith(('.png', '.jpg', '.jpeg')):
self.image_paths.append(os.path.join(class_path, img_file))

def __len__(self):
return len(self.image_paths)

def __getitem__(self, idx):
# Access data through standard file operations, enjoying Curvine cache acceleration
img_path = self.image_paths[idx]
image = Image.open(img_path).convert('RGB')

if self.transform:
image = self.transform(image)

# Extract label from path
label = os.path.basename(os.path.dirname(img_path))
return image, label

# Usage example
transform = transforms.Compose([
transforms.Resize((224, 224)),
transforms.ToTensor(),
transforms.Normalize(mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])
])

# Directly use the path of the FUSE mount point
dataset = CurvineImageDataset(
root_dir='/curvine-fuse/datasets/imagenet/train',
transform=transform
)

dataloader = DataLoader(
dataset,
batch_size=64,
shuffle=True,
num_workers=8,
pin_memory=True
)

# Training loop
for epoch in range(num_epochs):
for batch_idx, (data, targets) in enumerate(dataloader):
# Data is automatically loaded from Curvine cache through FUSE
# Enjoy near-memory access speed
outputs = model(data.cuda())
loss = criterion(outputs, targets.cuda())
# ... training logic
2. TensorFlow/Keras Data Pipelineโ€‹
import tensorflow as tf
import os

def create_curvine_dataset(data_dir, batch_size=32):
"""
Create TensorFlow data pipeline through FUSE mount point
data_dir: FUSE-mounted data directory
"""

# Directly access FUSE-mounted data using standard file APIs
def load_and_preprocess_image(path):
# TensorFlow transparently accesses Curvine cache through FUSE
image = tf.io.read_file(path)
image = tf.image.decode_jpeg(image, channels=3)
image = tf.image.resize(image, [224, 224])
image = tf.cast(image, tf.float32) / 255.0
return image

# Scan files in the FUSE-mounted directory
image_paths = []
labels = []

for class_name in os.listdir(data_dir):
class_dir = os.path.join(data_dir, class_name)
if os.path.isdir(class_dir):
for img_file in os.listdir(class_dir):
if img_file.lower().endswith(('.png', '.jpg', '.jpeg')):
image_paths.append(os.path.join(class_dir, img_file))
labels.append(class_name)

# Create dataset
path_ds = tf.data.Dataset.from_tensor_slices(image_paths)
label_ds = tf.data.Dataset.from_tensor_slices(labels)

# Apply preprocessing
image_ds = path_ds.map(
load_and_preprocess_image,
num_parallel_calls=tf.data.AUTOTUNE
)

# Combine data and labels
dataset = tf.data.Dataset.zip((image_ds, label_ds))

return dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)

# Usage example
train_dataset = create_curvine_dataset('/curvine-fuse/datasets/imagenet/train')
val_dataset = create_curvine_dataset('/curvine-fuse/datasets/imagenet/val')

# Model training
model.fit(
train_dataset,
validation_data=val_dataset,
epochs=50,
callbacks=[
tf.keras.callbacks.ModelCheckpoint('/curvine-fuse/models/checkpoints/'),
tf.keras.callbacks.TensorBoard(log_dir='/curvine-fuse/logs/')
]
)

๐Ÿ—„๏ธ Big Data Ecosystem Integrationโ€‹

Curvine seamlessly integrates with mainstream big data frameworks, providing transparent cache acceleration capabilities.

Hadoop Ecosystem Integrationโ€‹

Basic Configurationโ€‹

Add in hdfs-site.xml or core-site.xml:

<!-- Curvine FileSystem implementation -->
<property>
<name>fs.cv.impl</name>
<value>io.curvine.CurvineFileSystem</value>
</property>

<!-- Single cluster configuration -->
<property>
<name>fs.cv.master_addrs</name>
<value>master1:8995,master2:8995,master3:8995</value>
</property>

Multi-cluster Supportโ€‹

<!-- Cluster 1: Production environment -->
<property>
<name>fs.cv.production.master_addrs</name>
<value>prod-master1:8995,prod-master2:8995,prod-master3:8995</value>
</property>

<!-- Cluster 2: Development environment -->
<property>
<name>fs.cv.development.master_addrs</name>
<value>dev-master1:8995,dev-master2:8995</value>
</property>

<!-- Cluster 3: Machine learning dedicated cluster -->
<property>
<name>fs.cv.ml-cluster.master_addrs</name>
<value>ml-master1:8995,ml-master2:8995,ml-master3:8995</value>
</property>

๐Ÿ”„ UFS Transparent Proxyโ€‹

To better support existing Java applications to seamlessly access Curvine cache, we provide a UFS transparent proxy solution. The core advantage of this solution is zero code modification, allowing existing applications to immediately enjoy the cache acceleration effects of Curvine.

โœจ Core Features of Transparent Proxyโ€‹

  • ๐Ÿšซ Zero code modification: Preserves all original interfaces unchanged, no business code modifications required
  • ๐Ÿ” Intelligent path recognition: Only determines whether the path has been mounted to Curvine when opening a file
  • โšก Automatic cache acceleration: Automatically enables cache acceleration for mounted paths, native S3 access for unmounted paths
  • ๐Ÿ”„ Smooth switching: Supports dynamically switching whether to use cache at runtime without restarting the application

๐Ÿ› ๏ธ Configuration Methodโ€‹

Simply replace the S3FileSystem implementation class in Hadoop configuration:

<!-- Traditional S3 access configuration -->
<!--
<property>
<name>fs.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
-->

<!-- Replace with Curvine transparent proxy -->
<property>
<name>fs.s3a.impl</name>
<value>io.curvine.S3AProxyFileSystem</value>
</property>

<property>
<name>fs.cv.impl</name>
<value>io.curvine.CurvineFileSystem</value>
</property>

<!-- Curvine cluster configuration -->
<property>
<name>fs.curvine.master_addrs</name>
<value>master1:8995,master2:8995,master3:8995</value>
</property>

๐Ÿ”ง Working Principleโ€‹

Working Principle

๐Ÿš€ Usage Exampleโ€‹

No need to modify any business code, original code directly enjoys acceleration:

// Business code remains completely unchanged!
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(URI.create("s3a://my-bucket/"), conf);

// If this path is mounted to Curvine, automatically enjoy cache acceleration
FSDataInputStream input = fs.open(new Path("s3a://my-bucket/warehouse/data.parquet"));

// If this path is not mounted, use native S3 access
FSDataInputStream input2 = fs.open(new Path("s3a://my-bucket/archive/old-data.parquet"));

Spark/MapReduce code example:

// Spark code does not need any modification
Dataset<Row> df = spark.read()
.option("header", "true")
// If /warehouse/ path is mounted, automatically use cache acceleration
.csv("s3a://data-lake/warehouse/customer_data/");

df.groupBy("region")
.agg(sum("revenue").alias("total_revenue"))
.orderBy(desc("total_revenue"))
.show(20);

Python PySpark example:

# Python code also does not need modification
from pyspark.sql import SparkSession
from pyspark.sql.functions import sum, desc

spark = SparkSession.builder.appName("TransparentCache").getOrCreate()

# Automatically determine whether to use cache
df = spark.read \
.option("header", "true") \
.csv("s3a://data-lake/analytics/events/")

result = df.groupBy("event_type") \
.agg(sum("count").alias("total_events")) \
.orderBy(desc("total_events"))

result.show()

Apache Spark Optimization Configurationโ€‹

# Spark application startup configuration
spark-submit \
--class com.example.SparkApp \
--master yarn \
--deploy-mode cluster \
--conf spark.hadoop.fs.cv.impl=io.curvine.CurvineFileSystem \
--conf spark.hadoop.fs.cv.master_addrs=master1:8995,master2:8995,master3:8995 \
--conf spark.sql.adaptive.enabled=true \
--jars curvine-hadoop-client.jar \
app.jar

Spark Code Exampleโ€‹

// Scala example
val spark = SparkSession.builder()
.appName("Curvine Demo")
.config("spark.hadoop.fs.cv.impl", "io.curvine.CurvineFileSystem")
.getOrCreate()

// Directly use cv:// protocol to access cached data
val df = spark.read
.option("multiline", "true")
.json("cv://production/warehouse/events/2024/01/01/")

df.groupBy("event_type")
.count()
.show()

// Multi-cluster access
val prodData = spark.read.parquet("cv://production/warehouse/sales/")
val mlData = spark.read.parquet("cv://ml-cluster/features/user_profiles/")
# Python example
from pyspark.sql import SparkSession

spark = SparkSession.builder \
.appName("Curvine Python Demo") \
.config("spark.hadoop.fs.cv.impl", "io.curvine.CurvineFileSystem") \
.config("spark.hadoop.fs.cv.master_addrs", "master1:8995,master2:8995") \
.getOrCreate()

# Read data from cache
df = spark.read.option("header", "true") \
.csv("cv://warehouse/customer_data/")

# Complex queries automatically enjoy cache acceleration
result = df.groupBy("region") \
.agg({"revenue": "sum", "orders": "count"}) \
.orderBy("sum(revenue)", ascending=False)

result.show(20)

Trino/Presto Plugin Integrationโ€‹

Curvine provides an intelligent path replacement plugin, which can achieve non-invasive cache acceleration without requiring business code modifications, achieving completely transparent cache acceleration:

Plugin Workflowโ€‹

Plugin Workflow

Spark plugin usage example:

spark-submit \
--class main.scala.Tpch \
--name tpch_demo \
--conf spark.hadoop.fs.cv.impl=io.curvine.CurvineFileSystem \
--conf spark.hadoop.fs.cv.default.master_addrs=master1:8995,master2:8995 \
--conf spark.sql.extensions=io.curvine.spark.CurvineSparkExtension \
// Flink Table API integration example
TableEnvironment tableEnv = TableEnvironment.create(settings);

// Configure Curvine FileSystem
Configuration config = new Configuration();
config.setString("fs.cv.impl", "io.curvine.CurvineFileSystem");
config.setString("fs.cv.master_addrs", "master1:8995,master2:8995");

// Create Curvine table
tableEnv.executeSql(
"CREATE TABLE user_events (" +
" user_id BIGINT," +
" event_type STRING," +
" timestamp_ms BIGINT," +
" properties MAP<STRING, STRING>" +
") WITH (" +
" 'connector' = 'filesystem'," +
" 'path' = 'cv://streaming/events/'," +
" 'format' = 'json'" +
")"
);

// Real-time query enjoys cache acceleration
Table result = tableEnv.sqlQuery(
"SELECT user_id, COUNT(*) as event_count " +
"FROM user_events " +
"WHERE timestamp_ms > UNIX_TIMESTAMP() * 1000 - 3600000 " +
"GROUP BY user_id"
);


๐Ÿ’ก Best Practicesโ€‹

๐ŸŽฏ Mounting Strategy Best Practicesโ€‹

Tiered Mounting by Business Scenariosโ€‹

# Hot data: high-frequency access, using memory cache
bin/cv mount s3://bucket/hot /bucket/hot \
--storage-type=mem \
--replicas=3 \
--ttl-ms=1d \
--ttl-action=delete

# Warm data: regular access, using SSD cache
bin/cv mount s3://bucket/warm /bucket/warm \
--storage-type=ssd \
--replicas=2 \
--ttl-ms=7d \
--ttl-action=delete


# Cold data: low-frequency access, using disk cache
bin/cv mount s3://bucket/cold /bucket/cold \
--storage-type=disk \
--replicas=1 \
--ttl-ms=30d \
--ttl-action=delete

Optimization by Data Typeโ€‹

# Small file intensive (e.g., logs, configurations)
bin/cv mount s3://bucket/logs /bucket/logs \
--block-size=4MB \
--consistency-strategy=none

# Large file type (e.g., videos, models)
bin/cv mount s3://bucket/models /bucket/models \
--block-size=1GB \
--consistency-strategy=always

# Analytical data (e.g., Parquet)
bin/cv mount s3://bucket/analytics /bucket/analytics \
--block-size=128MB \
--consistency-strategy=none \

๐ŸŽฏ Summaryโ€‹

As a new generation distributed caching system, Curvine provides excellent performance improvements for modern data-intensive applications through intelligent caching strategies, strong consistency guarantees, and seamless ecosystem integration.

๐Ÿ† Core Valuesโ€‹

  • ๐Ÿš€ Performance Improvement: 10-100x access acceleration, significantly reducing data access latency
  • ๐Ÿ’ฐ Cost Optimization: Reduce cloud storage access costs, improve computing resource utilization
  • ๐Ÿ›ก๏ธ Data Security: Multiple consistency guarantees to ensure data accuracy and integrity
  • ๐ŸŒ Ecosystem Friendly: Seamless integration with mainstream big data and AI frameworks

Curvine - Make data access lightning fast โšก

Building a Curvine Cluster from Scratch & FIO Testing

ยท 2 min read
Founder of Curvine

How to quickly get started and try out Curvine's performance? This article will introduce how to build a local small cluster from scratch, allowing everyone to get hands-on experience quickly.

GitHub: https://github.com/CurvineIO/curvine


1. Download the Code:โ€‹

git clone https://github.com/CurvineIO/curvine.git

2. Environment Requirements:โ€‹

GCC: version 10 or later 
Rust: version 1.86 or later
Protobuf: version 3.x
Maven: version 3.8 or later
LLVM: version 12 or later
FUSE: libfuse2 or libfuse3 development packages
JDK: version 1.8 or later
npm: version 9 or later
Python: version 3.7 or later

3. Compile & Runโ€‹

make all

To facilitate compilation, our build script will check dependencies in advance. For macOS users, we will temporarily skip FUSE compilation (currently not adapted for macOS). Interested users can consider using the macfuse project for adaptation.

make-checkenv

4. After Compilation, Start Local Clusterโ€‹

cd build/dist
./bin/restart-all.sh

After successful startup, execute the report command to check if it's working:


bin/cv report

active_master: localhost:8995
journal_nodes: 1,localhost:8996
capacity: 233.5GB
available: 105.0GB (44.99%)
fs_used: 0.0B (0.00%)
non_fs_used: 128.4GB
live_worker_num: 1
lost_worker_num: 0
inode_num: 0
block_num: 0
live_worker_list: 192.168.xxx.xxx:8997,105.0GB/233.5GB (44.99%)
lost_worker_list:

5. View Local Master and Worker WebUIโ€‹

http://localhost:9000/
http://localhost:9001/

webui

6. FIO Testingโ€‹

Test Environment: Alibaba Cloud ecs.r8a.8xlarge instance with one master/worker/client each

  • 32 cores (vCPU)
  • 256 GiB memory
  • System disk and data disk both: ESSD cloud disk 500 GiB (7800 IOPS)
  • Maximum bandwidth: 25Gb

Prepare data (on worker machine):

bin/curvine-bench.sh fuse.write

FIO Sequential Read Test, 8 Concurrent Jobs

fio -iodepth=1 -rw=read -ioengine=libaio -bs=256k
-group_reporting -size=200gb
-filename=/curvine-fuse/fs-bench/0
-name=read_test --readonly -direct=1 --runtime=60
-numjobs=8

FIO Random Read Test, 8 Concurrent Jobs


fio -iodepth=1 -rw=randread -ioengine=libaio -bs=256k
-group_reporting -size=200gb
-filename=/curvine-fuse/fs-bench/0
-name=read_test --readonly -direct=1 --runtime=60
-numjobs=8

Finally, here's a video demonstration of the FIO testing results: