Data Orchestration
Curvine provides UFS (Unified File System) view to manage all supported distributed storage systems, including s3/hdfs etc.
Mountingâ
Curvine supports connecting multiple UFS sources by mounting them to different Curvine paths. Curvine does not provide default UFS configuration, which means if you want to load data from UFS, you must first mount the UFS source.

Curvine persists the mount table in metadata, so there is no need to remount when Curvine restarts. However, some rules must be followed.
- Mounting to the root path is not allowed.
- Mounting other UFS under a mounted path is not allowed.
- The same mount path cannot be mounted to different UFS.
Mount command:
bin/cv mount ufs_path curvine_path [configs]
- ufs_path: UFS path, e.g., s3://bucket/path.
- curvine_path: Curvine path, e.g., /ufs/path.
- configs: Optional parameters, e.g., access_key_id, secret_access_key, region, endpoint, etc.
Example:
bin/cv mount s3://ai/xuen-test /s3 \
-c s3.endpoint_url=http://hostname.com \
-c s3.region_name=cn \
-c s3.credentials.access=access_key \
-c s3.credentials.secret=secret_key \
-c s3.path_style=true
You can use command line, API to access ufs directories and files after UFS is mounted, but ufs data will not be automatically synchronized to curvine unless you load specific paths.
Mounting Parametersâ
| Parameter | Type | Default | Description | Example |
|---|---|---|---|---|
--ttl-ms | duration | 0 | Cache data expiration time | 24h, 7d, 30d |
--ttl-action | enum | none | Expiration policy: delete/none | delete |
--replicas | int | 1 | Number of data replicas (1-5) | 3 |
--block-size | size | 128MB | Cache block size | 64MB, 128MB, 256MB |
--consistency-strategy | enum | always | Consistency strategy | none/always/period |
--storage-type | enum | disk | Storage medium type | mem/ssd/disk |
Mount Modesâ
Write Cacheâ
WriteType controls the write behavior between Curvine cache and underlying storage (UFS)
| Mode | Behavior (Sync/Async) | Consistency | Use Cases |
|---|---|---|---|
| Cache mode | Write only to Curvine cache, not sync to UFS | Cache-only consistency, no data in UFS | Temporary data, scratch data, disposable intermediate results |
| Through mode | Write directly to UFS, bypass cache | Strong consistency, data directly persisted to UFS | Write-once-read-many scenarios, cache-unbeneficial data |
| AsyncThrough mode (default) | Write to cache first, async sync to UFS, return immediately | Eventual consistency, cache readable immediately, UFS updated async | Machine learning training, scenarios balancing performance and persistence |
| CacheThrough mode | Sync write to cache and UFS, wait for UFS completion | Strong consistency, ensure data exists in both cache and UFS | Shared data, scenarios requiring strong persistence guarantees |
Read Cacheâ
ConsistencyStrategy determines whether consistency validation with underlying storage (UFS) is needed when reading logs
| Mode | Behavior | Consistency | Use Cases |
|---|---|---|---|
| None | Trust cache, no consistency validation during reads | No consistency guarantee, may read stale data | High-performance read scenarios, relatively static base data, temporary data/intermediate results |
| Always | Validate cache matches UFS on every read | Strong consistency guarantee, ensure latest data | Intermittent data updates, multi-client shared data, strong consistency requirement scenarios |
TTL controls read cache behavior in Curvine by determining cache validity to trigger automatic cache refresh or cleanup operations.
Unified Accessâ
After UFS is mounted, Curvine provides a unified file system view, and you can access the UFS file system just like accessing the Curvine file system; Clients, command line tools, fuse, etc. can all access the UFS file system through a unified path.
- Curvine does not cache UFS metadata, so there is no data consistency issue when accessing. Accessing UFS through Curvine is no different from accessing UFS directly. When Curvine cache data read fails, it automatically falls back to reading data from UFS.
- If using the cv command, you can use the cache-only parameter to temporarily disable unified access to view files only cached in curvine. See fsSubcommandsubcommand for details.
Disabling Unified Accessâ
If you don't want to use unified access, you can add or modify the following configuration:
[client]
enable_unified_fs = false