Performance Tuning
This chapter introduces how to optimize Curvine's performance.
Server-Side Performance Tuningâ
The default configuration is suitable for most scenarios. In high-load environments, you can adjust the following parameters:
- Thread count tuning: Adjust the number of worker threads based on CPU cores and business load
- Kernel parameter optimization: Adjust network buffers, resource limits, and other parameters
Curvine configuration adjustment example:
# io_threads = CPU cores, worker_threads = CPU cores * 2
[master]
io_threads = 32
worker_threads = 64
[worker]
io_threads = 32
worker_threads = 64
Kernel parameter adjustment example (configuration examples show which parameters affect performance, for reference only, adjust according to actual needs):
# Adjust maximum file open count
ulimit -n 128000
# Adjust network parameters
echo "
net.ipv4.tcp_rmem=8192 128000 33554432
net.ipv4.tcp_wmem=8192 128000 33554432
net.core.rmem_max=67108864
net.core.wmem_max=67108864
net.ipv4.tcp_mem=94500000 915000000 927000000
" > /etc/sysctl.d/99-curvine-sysctl.conf
Client-Side Performance Tuningâ
Sequential Read/Write Scenariosâ
Sequential read/write scenarios can achieve optimal performance using default configurations.
High Concurrency Read/Write Scenariosâ
When clients need to handle a large number of file I/O operations simultaneously:
- Set
read_chunk_num=1
andwrite_chunk_num=1
to reduce memory usage - Suitable for scenarios with concurrent file count > 1000
- Single file throughput will decrease, but overall throughput remains unchanged, memory usage will decrease by 90%
Configuration example:
[client]
read_chunk_num = 1
write_chunk_num = 1
Large File Sequential Readâ
For sequential reading of single large files (>100GB):
- Recommended parallelism = 4-16
- Performance can improve 3-5 times, reaching 2-3GB/s
Configuration example:
# Set read parallelism
[client]
read_parallel = 8
Random Readâ
For completely random reads, it's recommended to:
- Set buffer to 1 to avoid read amplification
- Make chunk_size close to average I/O size (e.g., 4KB/256KB)
- For large files, set block_size âĨ 1GB to reduce connection count
Configuration example:
[client]
read_chunk_num = 1
# Adjust according to actual I/O patterns
read_chunk_size = "256KB"
In completely random read scenarios, the client will cache connections with workers to reduce connection establishment overhead. However, if a file is very large, it will have many blocks, leading to too many connections. It's recommended to set block_size to 1GB or above when writing such large files to reduce connection count.
FUSE Performance Optimizationâ
Curvine FUSE currently supports FUSE2 and FUSE3. FUSE2's performance is approximately only 50% of FUSE3, so it's recommended to use FUSE3 in production environments.
Memory Controlâ
If FUSE is deployed on memory-constrained machines, you can set queue sizes to reduce memory usage:
Configuration example:
[fuse]
# Set FUSE request queue size
fuse_channel_size = 1024
# Set read/write data queue buffer size
stream_channel_size = 16
Curvine FUSE currently does not support page cache.
Metadata Performance Optimizationâ
In traditional libfuse implementations, metadata may be blocked by data read/write requests, causing request delays. Users are usually advised to set attr_timeout
and entry_timeout
to let the kernel cache metadata and improve performance.
However, in Curvine's FUSE implementation, FUSE request processing is completely asynchronous, so there's no need to set attr_timeout
and entry_timeout
.
Of course, if needed, you can also increase cache time. Configuration example:
# Set metadata expiration time to 60s, default is 1s
[fuse]
entry_timeout = 60
attr_timeout = 60
If you want modifications to be visible in real-time, you can set expiration time to 0, which won't affect FUSE performance:
[fuse]
entry_timeout = 0
attr_timeout = 0
FUSE2 Optimization Solutionâ
If your environment can only use FUSE2 and you want to improve performance, you can use multiple mount points as a solution. In actual testing, 4-8 mount points can achieve about 80% of FUSE3's performance.
Configuration example:
[fuse]
mnt_number = 4
Multiple mount points will mount multiple directories in the operating system. Programs need to use random or round-robin methods to select mount points to distribute the load.