Debugging Guide
This guide provides instructions for debugging Curvine components and troubleshooting common issues.
Logging Configuration
Server Logging
Curvine uses structured logging with configurable levels. Configure logging in curvine.toml:
[log]
level = "info" # trace, debug, info, warn, error
dir = "/var/log/curvine"
max_size = "100MB"
max_files = 10
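When a node logs less than expected, it can help to confirm which level the file on disk actually sets. The following is a minimal sketch, not a Curvine tool: `toml_log_level` is a hypothetical helper that reads only the simple `key = "value"` layout shown above and is not a full TOML parser.

```shell
# Hypothetical helper (not part of Curvine): print the quoted value of
# `level` inside one table of a TOML file. The table name defaults to "log".
toml_log_level() {
  local file="$1" section="${2:-log}"
  awk -v sec="$section" '
    # A line starting with "[" opens a new table; remember whether it is ours.
    /^\[/ { in_sec = ($0 == "[" sec "]") }
    # Inside our table, take the text between the first pair of double quotes.
    in_sec && /^[[:space:]]*level[[:space:]]*=/ {
      split($0, parts, "\"")
      print parts[2]
      exit
    }
  ' "$file"
}
```

For example, `toml_log_level curvine.toml` reads the [log] table and `toml_log_level curvine.toml client.log` reads the [client.log] table.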
Client Logging
For client debugging, set the log level in the client configuration:
[client.log]
level = "debug"
dir = "/tmp/curvine-client"
Common Debugging Scenarios
Master/Worker Connection Issues
- Check network connectivity:
  telnet <master-host> <master-port>
- Verify configuration:
  - Check master_addrs in worker configuration
  - Verify firewall settings
  - Check DNS resolution
- Review logs:
  tail -f /var/log/curvine/master.log
  tail -f /var/log/curvine/worker.log
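The connectivity and DNS checks above can be scripted so they run the same way on every node. This is a sketch under stated assumptions: the helper names are hypothetical, addresses are assumed to be written as host:port, and `check_port` relies on bash's /dev/tcp feature and the coreutils `timeout` command being available.

```shell
# Hypothetical helpers; none of these ship with Curvine.

# Split "host:port" into two words: "host port".
split_addr() {
  local addr="$1"
  printf '%s %s\n' "${addr%:*}" "${addr##*:}"
}

# Return 0 if a TCP connection to host:port opens within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so telnet does not need to be installed.
check_port() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Return 0 if the host name resolves (getent consults /etc/hosts and DNS).
check_dns() {
  getent hosts "$1" >/dev/null
}
```

A typical call is `check_dns <master-host> && check_port <master-host> <master-port>`. A DNS failure points at name resolution, while a DNS success followed by a port failure points at the firewall or a master that is not listening.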
Performance Issues
- Monitor system resources:
  htop
  iostat -x 1
  sar -n DEV 1
- Check Curvine metrics:
  curl http://localhost:9000/metrics # Master metrics
  curl http://localhost:9001/metrics # Worker metrics
- Profile with perf:
  perf record -g ./curvine-server
  perf report
FUSE Mount Issues
- Check mount status:
  mount | grep curvine
  fusermount -u /mnt/curvine # Unmount if needed
- Debug FUSE operations:
  # Enable FUSE debug logging
  curvine-fuse -o debug /mnt/curvine
- Check permissions:
  ls -la /dev/fuse
  groups $USER # Check if user is in fuse group
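The mount and permission checks above can also be expressed as functions that return an exit status, which is convenient in scripts. This is a sketch only: the helper names are hypothetical, and it assumes a Linux host where FUSE mounts appear in /proc/mounts with a filesystem type beginning with "fuse".

```shell
# Hypothetical helpers; not part of Curvine.

# Return 0 if the given path is listed as a FUSE mount.
# Each line of the mounts table is: device mountpoint fstype options ...
# The table defaults to /proc/mounts; a second argument overrides it.
is_fuse_mounted() {
  local mountpoint="$1" table="${2:-/proc/mounts}"
  awk -v mp="$mountpoint" '
    $2 == mp && $3 ~ /^fuse/ { found = 1 }
    END { exit !found }
  ' "$table"
}

# Return 0 if the current user can open /dev/fuse for reading and writing.
can_use_fuse_dev() {
  [ -r /dev/fuse ] && [ -w /dev/fuse ]
}
```

For example, `is_fuse_mounted /mnt/curvine || echo "not mounted"` distinguishes a missing mount from a mount that exists but does not respond.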
Debugging Tools
Built-in Tools
- curvine-cli: Command-line interface for cluster management
- curvine-bench: Performance testing and profiling
- Health checks: Built-in health monitoring endpoints
External Tools
- strace: System call tracing
- gdb: Debugging with core dumps
- valgrind: Memory debugging
- perf: Performance profiling
Core Dump Analysis
- Enable core dumps:
  ulimit -c unlimited # Applies to the current shell and processes started from it
  echo '/tmp/core.%e.%p' > /proc/sys/kernel/core_pattern # Requires root
- Analyze with gdb:
  gdb ./curvine-server /tmp/core.curvine-server.12345
  (gdb) bt
  (gdb) info registers
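For unattended collection, the same gdb commands can be run non-interactively. The sketch below assumes core files land in /tmp as core.<name>.<pid>, matching the core_pattern shown above; `latest_core` and `core_backtrace` are hypothetical helper names.

```shell
# Hypothetical helpers; not part of Curvine.

# Print the most recently modified core file in a directory (default /tmp).
# Prints nothing if no core file is present.
latest_core() {
  local dir="${1:-/tmp}"
  ls -t "$dir"/core.* 2>/dev/null | head -n 1
}

# Print backtraces for all threads plus the registers of the crashing thread.
# -batch makes gdb exit once the commands passed with -ex have run.
core_backtrace() {
  local binary="$1" core="$2"
  gdb -batch -ex 'thread apply all bt' -ex 'info registers' "$binary" "$core"
}
```

A typical call is `core_backtrace ./curvine-server "$(latest_core /tmp)" > backtrace.txt`, which produces a file that can be attached to an issue report.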
Log Analysis
Key Log Patterns
- Connection errors: Look for "connection refused" or "timeout"
- Memory issues: Search for "out of memory" or "allocation failed"
- Disk errors: Check for "I/O error" or "disk full"
- Performance: Monitor "slow operation" warnings
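The patterns listed above can be tallied in one pass to see which category dominates a log file. This is a minimal sketch: `count_log_patterns` is a hypothetical helper, matching is case-insensitive on fixed strings, and the exact wording of Curvine's log messages is taken from the list above rather than verified against the source.

```shell
# Hypothetical helper; not part of Curvine.
# Print "pattern<TAB>count" for each key pattern, counting matching lines.
count_log_patterns() {
  local log="$1" pattern
  for pattern in "connection refused" "timeout" "out of memory" \
                 "allocation failed" "I/O error" "disk full" "slow operation"; do
    # grep -c prints the number of matching lines, including 0.
    printf '%s\t%s\n' "$pattern" "$(grep -ciF -- "$pattern" "$log")"
  done
}
```

For example, `count_log_patterns /var/log/curvine/worker.log` prints seven lines, one per pattern.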
Log Aggregation
# Collect logs from all nodes
for host in master worker1 worker2; do
  mkdir -p "./logs/$host" # scp does not create the destination directory
  scp "$host:/var/log/curvine/*.log" "./logs/$host/"
done
# Search across all logs
grep -r "ERROR" ./logs/
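Once the logs are collected, a per-host count shows quickly which node is producing the errors. The sketch below assumes the ./logs/<host>/ layout created by the loop above; `errors_per_host` is a hypothetical helper name.

```shell
# Hypothetical helper; not part of Curvine.
# Print "host<TAB>count" for every host directory under the log root,
# where count is the number of lines containing "ERROR".
errors_per_host() {
  local root="${1:-./logs}" dir
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue # Skip the unexpanded glob when root is empty
    printf '%s\t%s\n' "$(basename "$dir")" \
      "$(grep -rh "ERROR" "$dir" | wc -l | tr -d ' ')"
  done
}
```

Piping the output through `sort -t "$(printf '\t')" -k2,2nr` lists the noisiest host first.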
Troubleshooting Checklist
- System Health:
  - Sufficient disk space
  - Memory availability
  - Network connectivity
  - Process status
- Configuration:
  - Valid configuration files
  - Correct network addresses
  - Proper permissions
  - Environment variables
- Logs:
  - No critical errors
  - Reasonable log levels
  - Recent log entries
  - Consistent timestamps
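Parts of the System Health checks can be automated. The following sketch covers disk space and process status only; the function names and the 90 percent default threshold are assumptions for illustration, and the process name curvine-server is taken from the binary name used earlier in this guide.

```shell
# Hypothetical helpers; not part of Curvine.

# Print the used-space percentage (integer, no % sign) of the filesystem
# that holds the given path. df -P guarantees one line per filesystem.
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return 0 if disk usage is below the threshold (default 90).
disk_ok() {
  local pct
  pct=$(disk_used_pct "$1") || return 1
  [ "$pct" -lt "${2:-90}" ]
}

# Return 0 if a process with exactly this name is running.
process_running() {
  pgrep -x -- "$1" >/dev/null
}
```

For example, `disk_ok /var/log/curvine || echo "log disk nearly full"` and `process_running curvine-server || echo "server not running"` can be combined into a periodic health script.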
Getting Help
- Check the GitHub Issues
- Review documentation and FAQ
- Join the community discussions
- Provide detailed logs and configuration when reporting issues