# Curvine and Fluid Integration
This document describes the current Curvine Fluid integration based on the code under curvine-docker/fluid in the Curvine main branch. It replaces the older thin-runtime-only narrative and reflects the unified curvine-fluid image and entrypoint now used by the main branch.
## Overview
Curvine currently supports two Fluid integration modes:
- CacheRuntime mode: Fluid launches Curvine master, worker, and client/FUSE components from the unified `curvine-fluid` image. This mode is driven by `CacheRuntimeClass` and `Dataset`.
- ThinRuntime mode: Fluid launches only a Curvine FUSE-based runtime for a Dataset through `ThinRuntimeProfile`, `Dataset`, and `ThinRuntime`.
Both modes use the same image entrypoint, which auto-detects runtime mode from:
- `FLUID_RUNTIME_TYPE`
- `FLUID_RUNTIME_COMPONENT_TYPE`
- `FLUID_RUNTIME_CONFIG_PATH`
- the explicit `fluid-thin-runtime` command argument
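The exact selection logic lives in `entrypoint.sh`; a simplified sketch of the documented behavior looks like the following (the `start_*` function names and the branch order are placeholders, not the real script):

```bash
# Simplified sketch of the mode selection described above.
if [ "$1" = "fluid-thin-runtime" ] || [ "$FLUID_RUNTIME_TYPE" = "thin" ]; then
  start_thin_runtime                     # ThinRuntime path via config-parse.py
elif [ -n "$FLUID_RUNTIME_COMPONENT_TYPE" ] || [ -f "$FLUID_RUNTIME_CONFIG_PATH" ]; then
  start_cache_runtime "$FLUID_RUNTIME_COMPONENT_TYPE"   # master / worker / client
fi
```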
## Source of Truth
The current implementation lives in:
- `curvine-docker/fluid/Dockerfile`
- `curvine-docker/fluid/entrypoint.sh`
- `curvine-docker/fluid/generate_config.py`
- `curvine-docker/fluid/config-parse.py`
- `curvine-docker/fluid/cache-runtime/curvine-cache-runtime-class.yaml`
- `curvine-docker/fluid/cache-runtime/curvine-dataset.yaml`
- `curvine-docker/fluid/cache-runtime/test-pod.yaml`
- `curvine-docker/fluid/thin-runtime/curvine-thinruntime.yaml`
If this page conflicts with those files, trust the code.
## Runtime Modes
### 1. CacheRuntime mode
CacheRuntime mode is selected when either:
- `FLUID_RUNTIME_COMPONENT_TYPE` is set, or
- a Fluid runtime config file exists at `FLUID_RUNTIME_CONFIG_PATH`
In this mode, `entrypoint.sh`:

- generates a base Curvine config under `$CURVINE_HOME/conf/curvine-cluster.toml`
- merges Fluid topology and component options through `generate_config.py`
- starts one of the Curvine roles: `master`, `worker`, or `client`
The sample `curvine-cache-runtime-class.yaml` defines:

- master as a `StatefulSet`
- worker as a `StatefulSet`
- client as a `DaemonSet`
The client pod runs the FUSE side and mounts into the Fluid target path.
### 2. ThinRuntime mode
ThinRuntime mode is selected when either:
- `FLUID_RUNTIME_TYPE=thin`, or
- the container is started with the `fluid-thin-runtime` argument
In this mode, `entrypoint.sh` calls `config-parse.py`, which:

- parses the Fluid runtime config JSON
- extracts `mountPoint`, `targetPath`, and Dataset options
- generates a minimal Curvine TOML file
- writes a `mount-curvine.sh` wrapper
- launches `curvine-fuse` directly
ThinRuntime mode is therefore the lighter integration path: it does not create Curvine master/worker pods from Fluid, and instead expects a reachable Curvine cluster already running elsewhere.
## Prerequisites
Before integrating Curvine with Fluid, make sure you have:
- A working Kubernetes cluster
- Fluid installed in the cluster
- A published Curvine image or a locally built `curvine-fluid` image available to the cluster
- For ThinRuntime: a reachable external Curvine cluster (`master-endpoints`)
- For CacheRuntime: permission to create `CacheRuntimeClass`, `Dataset`, and test workloads
This page assumes you already know how to build or deploy Curvine itself.
## Build the Unified Fluid Image
The current main branch exposes a unified Fluid image build target:
```bash
cd /path/to/curvine
make docker-build-fluid
```
That target builds `curvine-fluid:latest`.

The resulting image uses `curvine-docker/fluid/Dockerfile`, whose base image is `ghcr.io/curvineio/curvine:${BASE_IMAGE_TAG}`. The image entrypoint is `/entrypoint.sh`.
If you publish your own image, update the example manifests accordingly.
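For example, to publish the locally built image, retag and push it to your own registry (the registry host below is a placeholder):

```bash
# Retag the locally built image and push it to your registry
# (registry.example.com is a placeholder).
docker tag curvine-fluid:latest registry.example.com/curvine-fluid:latest
docker push registry.example.com/curvine-fluid:latest
```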
## CacheRuntime Integration
CacheRuntime mode is the right choice when you want Fluid to manage Curvine services inside the cluster.
### Step 1: Create the CacheRuntimeClass
Start from `curvine-docker/fluid/cache-runtime/curvine-cache-runtime-class.yaml`.
The shipped sample defines:
- `master` workload type: `StatefulSet`
- `worker` workload type: `StatefulSet`
- `client` workload type: `DaemonSet`
- image: `ghcr.io/curvineio/curvine-fluid:latest`
Apply it:

```bash
kubectl apply -f curvine-docker/fluid/cache-runtime/curvine-cache-runtime-class.yaml
```
### Step 2: Create a Dataset
Start from `curvine-docker/fluid/cache-runtime/curvine-dataset.yaml`.
The sample contains:

- `kind: Dataset`
- `kind: CacheRuntime`
- `runtimeClassName: curvine`
- `mountPoint: "curvine:///data"`
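Assembled from those keys, the pair is shaped roughly like the sketch below. The `apiVersion`, resource names, and exact field nesting are assumptions here; treat `curvine-dataset.yaml` as the authoritative manifest.

```yaml
# Sketch only: layout inferred from the keys listed above.
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: curvine-demo            # placeholder name
spec:
  mounts:
    - mountPoint: "curvine:///data"
---
apiVersion: data.fluid.io/v1alpha1
kind: CacheRuntime
metadata:
  name: curvine-demo            # conventionally matches the Dataset name
spec:
  runtimeClassName: curvine
```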
Apply it:

```bash
kubectl apply -f curvine-docker/fluid/cache-runtime/curvine-dataset.yaml
```
### Step 3: Understand how config is generated
`generate_config.py` merges Fluid topology and component options into the Curvine config.

Important behaviors:

- `CURVINE_DATASET_NAME` becomes the Curvine `cluster_id`
- master journal peer addresses are generated from Fluid master pod topology
- `worker.data_dir` is derived from `worker.options.data_dir` or `tieredStore`
- `client.targetPath` becomes `fuse.mnt_path`
- `client.master_addrs` is derived from the generated master endpoints
This means the effective Curvine configuration is not a static hand-written TOML file. It is generated from Fluid runtime config plus environment variables.
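For orientation, the result is shaped roughly like the following sketch. The key names come from the list above; the section layout and all values are illustrative assumptions, not the real generator output.

```toml
# Illustrative only -- the real file is emitted by generate_config.py.
cluster_id = "curvine-demo"                   # from CURVINE_DATASET_NAME

[worker]
data_dir = "/mnt/curvine"                     # from worker.options.data_dir or tieredStore

[client]
master_addrs = "curvine-demo-master-0:8995"   # from the generated master endpoints

[fuse]
mnt_path = "/runtime-mnt/curvine-demo"        # from client.targetPath
```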
### Step 4: Verify the runtime
Check the core resources:
```bash
kubectl get cacheruntimeclass
kubectl get dataset
kubectl get pods -A | grep curvine
```
### Step 5: Run the sample test pod
Start from `curvine-docker/fluid/cache-runtime/test-pod.yaml`.
Apply it and check the logs:

```bash
kubectl apply -f curvine-docker/fluid/cache-runtime/test-pod.yaml
kubectl logs curvine-demo
```
The sample pod mounts the Curvine-backed PVC at /data and performs simple read/write verification.
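If you adapt the sample to your own workload, the essential shape is a pod that mounts the Dataset-backed PVC (Fluid names the PVC after the Dataset; the names below are placeholders, so check `test-pod.yaml` for the real manifest):

```yaml
# Minimal stand-in for test-pod.yaml; claimName assumes a Dataset named curvine-demo.
apiVersion: v1
kind: Pod
metadata:
  name: curvine-demo
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "echo hello > /data/hello.txt && cat /data/hello.txt && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: curvine-demo
```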
## ThinRuntime Integration
ThinRuntime mode is the right choice when you already have a Curvine cluster and only want Fluid to mount it into workloads.
### Step 1: Create the ThinRuntimeProfile
Use the profile from `curvine-docker/fluid/thin-runtime/curvine-thinruntime.yaml`.
Key fields:
```yaml
kind: ThinRuntimeProfile
spec:
  fileSystemType: fuse
  fuse:
    image: ghcr.io/curvineio/curvine-fluid
    imageTag: latest
```
### Step 2: Create the Dataset
The same sample file also contains the Dataset:
```yaml
kind: Dataset
spec:
  mounts:
    - mountPoint: curvine:///data
      options:
        master-endpoints: "127.0.0.1:8995"
```
The most important Dataset options parsed by `config-parse.py` are:

| Option | Required | Meaning |
|---|---|---|
| `master-endpoints` | yes | Curvine Master RPC endpoint, `host:port` |
| `master-web-port` | no | Master web port override |
| `io-threads` | no | FUSE I/O thread count |
| `worker-threads` | no | FUSE worker thread count |
| `mnt-number` | no | FUSE mount count |
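Put together in a Dataset mount, the optional knobs look like this sketch (the endpoint and numeric values are placeholders to tune for your environment):

```yaml
# Options are passed as strings in the mount's options map.
spec:
  mounts:
    - mountPoint: curvine:///data
      options:
        master-endpoints: "curvine-master.default.svc:8995"  # required
        io-threads: "16"       # optional: FUSE I/O threads
        worker-threads: "8"    # optional: FUSE worker threads
        mnt-number: "1"        # optional: FUSE mount count
```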
### Step 3: Create the ThinRuntime
The sample also contains:
```yaml
kind: ThinRuntime
metadata:
  name: curvine-dataset
spec:
  profileName: curvine-profile
```
Apply the combined sample:

```bash
kubectl apply -f curvine-docker/fluid/thin-runtime/curvine-thinruntime.yaml
```
### Step 4: Understand generated files
In ThinRuntime mode, `config-parse.py` generates:

- `$CURVINE_HOME/conf/curvine-cluster.toml`
- `$CURVINE_HOME/mount-curvine.sh`
The generated TOML contains:

- `master.hostname`
- `client.master_addrs`
- `fuse.mnt_path`
- `fuse.fs_path`

all derived from the Fluid Dataset config.
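Given the sample Dataset above (`master-endpoints: "127.0.0.1:8995"`, `mountPoint: curvine:///data`), the generated file is shaped roughly like this sketch; the mount path and exact value formats are assumptions:

```toml
# Illustrative only -- the real file is written by config-parse.py.
[master]
hostname = "127.0.0.1"             # host part of master-endpoints

[client]
master_addrs = "127.0.0.1:8995"    # from master-endpoints

[fuse]
mnt_path = "/runtime-mnt/curvine"  # Fluid targetPath (placeholder)
fs_path = "/data"                  # path part of mountPoint curvine:///data
```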
### Step 5: Verify the ThinRuntime
Check resources:
```bash
kubectl get thinruntime
kubectl get dataset
kubectl get pods -A | grep curvine
```
## Verification Checklist
For either mode, verify:
- the relevant Fluid resource is `Ready`/`Bound`
- the Curvine or FUSE pods are running
- the mount path inside the workload is accessible
- reads and writes behave as expected for your selected mode
Useful commands:
```bash
kubectl get dataset
kubectl get thinruntime
kubectl get cacheruntime
kubectl get pods -A | grep curvine
kubectl describe dataset <name>
```
## Troubleshooting
### Image starts in the wrong mode
Check:
- `FLUID_RUNTIME_TYPE`
- `FLUID_RUNTIME_COMPONENT_TYPE`
- whether `FLUID_RUNTIME_CONFIG_PATH` exists
- the container arguments (`master`, `worker`, `client`, or `fluid-thin-runtime`)
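You can inspect these inputs directly on a running pod:

```bash
# Dump the Fluid-related environment inside the container
kubectl exec <pod> -- env | grep FLUID_RUNTIME

# Confirm which arguments the container was started with
kubectl get pod <pod> -o jsonpath='{.spec.containers[0].args}'
```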
### ThinRuntime cannot reach Curvine
Check the Dataset option:
```yaml
master-endpoints: "host:port"
```
It must point to a reachable Curvine Master RPC endpoint.
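A quick reachability check from inside the FUSE pod, assuming `nc` is available in the image:

```bash
# Substitute the host and port from your master-endpoints value
kubectl exec <fuse-pod> -- nc -zv <master-host> <master-port>
```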
### FUSE mount is missing
Inspect:
```bash
kubectl logs <pod>
```
and check whether:
- `/dev/fuse` is available
- the runtime is privileged when required
- `targetPath` and `mountPoint` are correct
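The first two checks can be run directly against the pod:

```bash
# Is the FUSE device present in the container?
kubectl exec <pod> -- ls -l /dev/fuse

# Is anything actually mounted at the expected target path?
kubectl exec <pod> -- mount | grep -i curvine
```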
### CacheRuntime topology looks wrong
Inspect the generated runtime config and logs:
```bash
kubectl logs <master-pod>
kubectl logs <worker-pod>
kubectl logs <client-pod>
```
`generate_config.py` relies on Fluid topology metadata to derive journal peers and service FQDNs. If topology is missing or malformed, the generated Curvine config will also be wrong.
## Recommended Documentation Pattern
When you extend this integration in the future, keep the documentation aligned to the code with this structure:
- Runtime modes
- Source-of-truth files
- Build image
- Install manifests
- Verification
- Troubleshooting
That keeps the integration guide stable even when manifest layout changes.