Debugging Guide

This guide provides instructions for debugging Curvine components and troubleshooting common issues.

Logging Configuration

Server Logging

Curvine uses structured logging with configurable levels. Configure logging in curvine.toml:

[log]
level = "info"  # trace, debug, info, warn, error
dir = "/var/log/curvine"
max_size = "100MB"
max_files = 10

Client Logging

For client debugging, set the log level in client configuration:

[client.log]
level = "debug"
dir = "/tmp/curvine-client"

Common Debugging Scenarios

Master/Worker Connection Issues

Check network connectivity:
```
telnet <master-host> <master-port>
```
Verify configuration:
- Check master_addrs in worker configuration
- Verify firewall settings
- Check DNS resolution

Review logs:

tail -f /var/log/curvine/master.log
tail -f /var/log/curvine/worker.log

Performance Issues

Monitor system resources:
```
htop
iostat -x 1
sar -n DEV 1
```

Check Curvine metrics:

curl http://localhost:9000/metrics  # Master metrics
curl http://localhost:9001/metrics  # Worker metrics

Profile with perf:

perf record -g ./curvine-server
perf report

FUSE Mount Issues

Check mount status:

mount | grep curvine
fusermount -u /mnt/curvine  # Unmount if needed

Debug FUSE operations:

# Enable FUSE debug logging
curvine-fuse -o debug /mnt/curvine

Check permissions:

ls -la /dev/fuse
groups $USER  # Check if user is in fuse group

Debugging Tools

Built-in Tools

curvine-cli: Command-line interface for cluster management
curvine-bench: Performance testing and profiling
Health checks: Built-in health monitoring endpoints

External Tools

strace: System call tracing
gdb: Debugging with core dumps
valgrind: Memory debugging
perf: Performance profiling

Core Dump Analysis

Enable core dumps:

ulimit -c unlimited
echo '/tmp/core.%e.%p' > /proc/sys/kernel/core_pattern

Analyze with gdb:

gdb ./curvine-server /tmp/core.curvine-server.12345
(gdb) bt
(gdb) info registers

Log Analysis

Key Log Patterns

Connection errors: Look for "connection refused" or "timeout"
Memory issues: Search for "out of memory" or "allocation failed"
Disk errors: Check for "I/O error" or "disk full"
Performance: Monitor "slow operation" warnings

Log Aggregation

# Collect logs from all nodes
for host in master worker1 worker2; do
  scp $host:/var/log/curvine/*.log ./logs/$host/
done

# Search across all logs
grep -r "ERROR" ./logs/

Troubleshooting Checklist

System Health:
- Sufficient disk space
- Memory availability
- Network connectivity
- Process status
Configuration:
- Valid configuration files
- Correct network addresses
- Proper permissions
- Environment variables
Logs:
- No critical errors
- Reasonable log levels
- Recent log entries
- Consistent timestamps

Getting Help

Check the GitHub Issues
Review documentation and FAQ
Join the community discussions
Provide detailed logs and configuration when reporting issues

Logging Configuration
- Server Logging
- Client Logging
Common Debugging Scenarios
Debugging Tools
- Built-in Tools
- External Tools
Core Dump Analysis
Log Analysis
- Key Log Patterns
- Log Aggregation
Troubleshooting Checklist
Getting Help

Logging Configuration​

Server Logging​

Client Logging​

Common Debugging Scenarios​

Master/Worker Connection Issues​

Performance Issues​

FUSE Mount Issues​

Debugging Tools​

Built-in Tools​

External Tools​

Core Dump Analysis​

Log Analysis​

Key Log Patterns​

Log Aggregation​

Troubleshooting Checklist​

Getting Help​