Auto-Discovery Celery Metrics Exporter
The Celery metrics exporter now automatically discovers all Redis databases and their queues without requiring manual configuration. It scans all Redis databases (0-15) and identifies potential Celery queues based on patterns and naming conventions.
How Auto-Discovery Works
Automatic Database Scanning
- Scans Redis databases 0-15 by default
- Only monitors databases that contain keys
- Only includes databases that have identifiable queues
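The scan described above can be sketched in a few lines. This is an illustration under assumptions, not the exporter's actual code: `find_nonempty_databases` and the `connect` factory are hypothetical names, and the only client call relied on is `dbsize()`, which redis-py clients provide.

```python
from typing import Callable, Dict, Protocol


class RedisLike(Protocol):
    """Anything with a dbsize() method, e.g. a redis-py client."""

    def dbsize(self) -> int: ...


def find_nonempty_databases(
    connect: Callable[[int], RedisLike], max_databases: int = 16
) -> Dict[int, int]:
    """Return {db_number: key_count} for every database that holds keys."""
    found: Dict[int, int] = {}
    for db in range(max_databases):
        key_count = connect(db).dbsize()
        if key_count > 0:  # empty databases are skipped entirely
            found[db] = key_count
    return found
```

With redis-py installed, `connect` could be something like `lambda db: redis.Redis(host=REDIS_HOST, db=db)`, where `REDIS_HOST` is a placeholder for your own host.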
Automatic Queue Discovery
The exporter supports two discovery modes:
Smart Filtering Mode (Default: monitor_all_lists: false)
Identifies queues using multiple strategies:
- Pattern Matching: Matches known queue patterns from your applications:
  - celery, *_priority, default, mailers, push, scheduler
  - streams, images, suggested_users, email, connectors, lists, inbox, imports, import_triggered, misc (BookWyrm)
  - background, send (PieFed)
  - high, mmo (Pixelfed/Laravel)
- Heuristic Detection: Identifies Redis lists containing queue-related keywords:
  - Keys containing: queue, celery, task, job, work
- Type Checking: Only considers Redis list-type keys (Celery queues are Redis lists)
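The three strategies can be combined into a single predicate. The sketch below is a guess at the shape of that logic, not the exporter's implementation: `looks_like_queue` is a hypothetical name, glob matching via `fnmatch` and case-insensitive keyword matching are assumptions, and the pattern list is only a subset of those listed above.

```python
from fnmatch import fnmatchcase
from typing import Sequence

# Subset of the patterns and keywords listed above, for illustration only.
QUEUE_PATTERNS = ["celery", "*_priority", "default", "mailers", "push", "scheduler"]
QUEUE_KEYWORDS = ["queue", "celery", "task", "job", "work"]


def looks_like_queue(
    key: str,
    key_type: str,
    patterns: Sequence[str] = QUEUE_PATTERNS,
    keywords: Sequence[str] = QUEUE_KEYWORDS,
) -> bool:
    """Decide whether a Redis key is probably a Celery queue."""
    # Type checking: Celery queues are Redis lists, so anything else is out.
    if key_type != "list":
        return False
    # Pattern matching: glob-style match against the known queue names.
    if any(fnmatchcase(key, pattern) for pattern in patterns):
        return True
    # Heuristic detection: fall back to queue-related keywords in the key.
    lowered = key.lower()
    return any(word in lowered for word in keywords)
```

For example, `looks_like_queue("high_priority", "list")` matches the `*_priority` pattern, while a list named `sessions` matches nothing and is ignored.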
Monitor Everything Mode (monitor_all_lists: true)
- Monitors ALL Redis list-type keys in all databases
- No filtering or pattern matching
- Maximum visibility but potentially more noise
- Useful for debugging or comprehensive monitoring
Which Mode Should You Use?
Use Smart Filtering (default) when:
- ✅ You want clean, relevant metrics
- ✅ You care about Prometheus cardinality limits
- ✅ Your applications use standard queue naming
- ✅ You want to avoid monitoring non-queue Redis lists
Use Monitor Everything when:
- ✅ You're debugging queue discovery issues
- ✅ You have non-standard queue names not covered by patterns
- ✅ You want absolute certainty you're not missing anything
- ✅ You have sufficient Prometheus storage/performance headroom
- ❌ Trade-off: expect noise from Redis lists that are not queues
Configuration (Optional)
While the exporter works completely automatically, you can customize its behavior via the celery-exporter-config ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: celery-exporter-config
  namespace: celery-monitoring
data:
  config.yaml: |
    # Auto-discovery settings
    auto_discovery:
      enabled: true
      scan_databases: true      # Scan all Redis databases 0-15
      scan_queues: true         # Auto-discover queues in each database
      monitor_all_lists: false  # If true, monitor ALL Redis lists, not just queue-like ones

    # Queue patterns to look for (Redis list keys that are likely Celery queues)
    queue_patterns:
      - "celery"
      - "*_priority"
      - "default"
      - "mailers"
      - "push"
      - "scheduler"
      - "broadcast"
      - "federation"
      - "media"
      - "user_dir"

    # Optional: Database name mapping (if you want friendly names)
    # If not specified, databases will be named "db_0", "db_1", etc.
    database_names:
      0: "piefed"
      1: "mastodon"
      2: "matrix"
      3: "bookwyrm"

    # Minimum queue length to report (avoid noise from empty queues)
    min_queue_length: 0

    # Maximum number of databases to scan (safety limit)
    max_databases: 16
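To make the defaults concrete, here is one way the parsed `config.yaml` could be turned into settings. Everything in this sketch is an assumption: `ExporterSettings` and `settings_from_dict` are hypothetical names, the input is taken to be the dictionary a YAML parser would return, and the key nesting (flags under `auto_discovery`, the rest at the top level) is inferred from the example above rather than confirmed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ExporterSettings:
    enabled: bool = True
    scan_databases: bool = True
    scan_queues: bool = True
    monitor_all_lists: bool = False
    queue_patterns: List[str] = field(default_factory=list)
    database_names: Dict[int, str] = field(default_factory=dict)
    min_queue_length: int = 0
    max_databases: int = 16


def settings_from_dict(raw: Dict[str, Any]) -> ExporterSettings:
    """Build settings from a parsed config, falling back to defaults."""
    auto = raw.get("auto_discovery") or {}
    names = raw.get("database_names") or {}
    return ExporterSettings(
        enabled=bool(auto.get("enabled", True)),
        scan_databases=bool(auto.get("scan_databases", True)),
        scan_queues=bool(auto.get("scan_queues", True)),
        monitor_all_lists=bool(auto.get("monitor_all_lists", False)),
        queue_patterns=list(raw.get("queue_patterns") or []),
        # Keys may arrive as ints or strings depending on how they were quoted.
        database_names={int(k): str(v) for k, v in names.items()},
        min_queue_length=int(raw.get("min_queue_length", 0)),
        max_databases=int(raw.get("max_databases", 16)),
    )
```

An empty dictionary yields the defaults shown in the ConfigMap, which matches the claim that the exporter works without any configuration.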
Adding New Applications
No configuration needed! New applications are automatically discovered when they:
- Use a Redis database (any database 0-15)
- Create queues that match common patterns or contain queue-related keywords
- Use Redis lists for their queues (standard Celery behavior)
Custom Queue Patterns
If your application uses non-standard queue names, add them to the queue_patterns list:
kubectl edit configmap celery-exporter-config -n celery-monitoring
Add your pattern:
queue_patterns:
  - "celery"
  - "*_priority"
  - "my_custom_queue_*"  # Add your pattern here
Friendly Database Names
To give databases friendly names instead of db_0, db_1, etc.:
database_names:
  0: "piefed"
  1: "mastodon"
  2: "matrix"
  3: "bookwyrm"
  4: "my_new_app"  # Add your app here
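The fallback naming rule is small enough to show directly. `database_label` is a hypothetical helper written for illustration; it only encodes the behavior stated above (mapped name if present, otherwise `db_N`).

```python
from typing import Dict, Optional


def database_label(db_number: int, database_names: Optional[Dict[int, str]] = None) -> str:
    """Return the friendly name for a database, or "db_<n>" when unmapped."""
    names = database_names or {}
    return names.get(db_number, f"db_{db_number}")
```

With the mapping above, database 4 is labelled `my_new_app`, while an unmapped database 7 is labelled `db_7`.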
Metrics Produced
The exporter produces these metrics for each discovered database:
celery_queue_length
- Labels: queue_name, database, db_number
- Description: Number of pending tasks in each queue
- Example: celery_queue_length{queue_name="celery", database="piefed", db_number="0"} 1234
- Special: queue_name="_total" shows total tasks across all queues in a database
redis_connection_status
- Labels: database, db_number
- Description: Connection status per database (1=connected, 0=disconnected)
- Example: redis_connection_status{database="piefed", db_number="0"} 1
celery_databases_discovered
- Description: Total number of databases with queues discovered
- Example: celery_databases_discovered 4
celery_queues_discovered
- Labels: database
- Description: Number of queues discovered per database
- Example: celery_queues_discovered{database="bookwyrm"} 5
celery_queue_info
- Description: General information about all monitored queues
- Includes: Total lengths, Redis host, last update timestamp, auto-discovery status
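To show how the per-queue samples and the `_total` sample relate, the sketch below renders `celery_queue_length` lines by hand in the Prometheus text format. It is illustrative only: `render_queue_lengths` is a hypothetical name, and a real exporter would more likely use a Prometheus client library than build strings.

```python
from typing import Dict, List


def render_queue_lengths(database: str, db_number: int, lengths: Dict[str, int]) -> List[str]:
    """Render one sample per queue plus a "_total" sample for the database."""

    def sample(queue_name: str, value: int) -> str:
        return (
            f'celery_queue_length{{queue_name="{queue_name}",'
            f'database="{database}",db_number="{db_number}"}} {value}'
        )

    lines = [sample(name, length) for name, length in sorted(lengths.items())]
    # The synthetic "_total" queue carries the sum across the database.
    lines.append(sample("_total", sum(lengths.values())))
    return lines
```

Because `_total` is emitted alongside the real queues, queries that sum over queues need `queue_name!="_total"` to avoid double counting, as the PromQL examples do.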
PromQL Query Examples
Discovery Overview
# How many databases were discovered
celery_databases_discovered
# How many queues per database
celery_queues_discovered
# Auto-discovery status
celery_queue_info
All Applications Overview
# All queue lengths grouped by database
sum by (database) (celery_queue_length{queue_name!="_total"})
# Total tasks across all databases
sum(celery_queue_length{queue_name="_total"})
# Individual queues (excluding totals)
celery_queue_length{queue_name!="_total"}
# Only active queues (> 0 tasks)
celery_queue_length{queue_name!="_total"} > 0
Specific Applications
# PieFed queues only
celery_queue_length{database="piefed", queue_name!="_total"}
# BookWyrm high priority queue (if it exists)
celery_queue_length{database="bookwyrm", queue_name="high_priority"}
# All applications' main celery queue
celery_queue_length{queue_name="celery"}
# Database totals only
celery_queue_length{queue_name="_total"}
Processing Rates
# Net tasks drained per minute (positive = queue shrinking, negative = queue growing)
# deriv() is used instead of rate() because queue length is a gauge, not a counter
deriv(celery_queue_length{queue_name!="_total"}[5m]) * -60
# Net drain rate by database (using totals)
deriv(celery_queue_length{queue_name="_total"}[5m]) * -60
# Overall net drain rate across all databases
sum(deriv(celery_queue_length{queue_name="_total"}[5m]) * -60)
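The arithmetic behind the `* -60` factor can be checked by hand. The sketch below uses a simple two-point slope, which is a simplification of what Prometheus computes over a range; `net_drain_per_minute` is a hypothetical helper. Note that this measures net change in queue length, so tasks enqueued during the window offset tasks processed.

```python
from typing import Tuple

Sample = Tuple[float, float]  # (unix_timestamp_seconds, queue_length)


def net_drain_per_minute(first: Sample, last: Sample) -> float:
    """Positive when the queue shrank between the two samples."""
    (t0, len0), (t1, len1) = first, last
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    slope_per_second = (len1 - len0) / (t1 - t0)
    # Negate so that a shrinking queue reads as a positive drain rate,
    # then scale from per-second to per-minute.
    return -slope_per_second * 60
```

A queue that falls from 1234 to 934 tasks over 300 seconds has a slope of -1 task per second, which is a net drain of 60 tasks per minute.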
Health Monitoring
# Databases with connection issues
redis_connection_status == 0
# Queues growing too fast (delta() is the gauge equivalent of increase())
delta(celery_queue_length{queue_name!="_total"}[5m]) > 1000
# Stalled processing (no change in 15 minutes)
changes(celery_queue_length{queue_name="_total"}[15m]) == 0 and celery_queue_length{queue_name="_total"} > 100
# Databases that stopped being discovered (changes() is never negative, so compare the delta instead)
delta(celery_databases_discovered[10m]) < 0
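The stalled-processing rule combines two conditions: the total has not changed over the window, and a backlog is still present. The same logic in plain Python, as a hypothetical helper for reasoning about the alert rather than anything the exporter ships:

```python
from typing import Sequence


def is_stalled(lengths: Sequence[float], min_backlog: float = 100) -> bool:
    """True when every sample in the window is identical and a backlog remains."""
    if not lengths:
        return False
    unchanged = all(value == lengths[0] for value in lengths)
    return unchanged and lengths[-1] > min_backlog
```

A window of `[500, 500, 500]` is stalled; `[500, 480, 470]` is draining, and `[50, 50]` is flat but below the backlog threshold, so neither fires.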
Troubleshooting
Check Auto-Discovery Status
# View current configuration
kubectl get configmap celery-exporter-config -n celery-monitoring -o yaml
# Check exporter logs for discovery results
kubectl logs -n celery-monitoring deployment/celery-metrics-exporter
# Look for discovery messages like:
# "Database 0 (piefed): 1 queues, 245 total keys"
# "Auto-discovery complete: Found 3 databases with queues"
Test Redis Connectivity
# Test connection to specific database
kubectl exec -n redis-system redis-master-0 -- redis-cli -a PASSWORD -n DB_NUMBER ping
# Check what keys exist in a database (KEYS blocks Redis while it runs; prefer redis-cli --scan on large databases)
kubectl exec -n redis-system redis-master-0 -- redis-cli -a PASSWORD -n DB_NUMBER keys '*'
# Check if a key is a list (queue)
kubectl exec -n redis-system redis-master-0 -- redis-cli -a PASSWORD -n DB_NUMBER type QUEUE_NAME
# Check queue length manually
kubectl exec -n redis-system redis-master-0 -- redis-cli -a PASSWORD -n DB_NUMBER llen QUEUE_NAME
Validate Metrics
# Port forward and check metrics endpoint
kubectl port-forward -n celery-monitoring svc/celery-metrics-exporter 8000:8000
# Check discovery metrics
curl http://localhost:8000/metrics | grep celery_databases_discovered
curl http://localhost:8000/metrics | grep celery_queues_discovered
# Check queue metrics
curl http://localhost:8000/metrics | grep celery_queue_length
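When `grep` is not enough, the metrics text can be parsed into labels and values for closer inspection. The sketch below is a minimal parser for the simple sample lines shown in this document, not a full implementation of the Prometheus exposition format; `parse_samples` is a hypothetical name.

```python
import re
from typing import Dict, List, Tuple

_SAMPLE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text: str, metric: str) -> List[Tuple[Dict[str, str], float]]:
    """Return (labels, value) pairs for every sample of the given metric."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match is None or match.group("name") != metric:
            continue
        labels = dict(_LABEL.findall(match.group("labels") or ""))
        samples.append((labels, float(match.group("value"))))
    return samples
```

With the port-forward above running, the input text could come from `urllib.request.urlopen("http://localhost:8000/metrics").read().decode()`.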
Debug Discovery Issues
If queues aren't being discovered:
- Check queue patterns - Add your queue names to queue_patterns
- Verify queue type - Ensure queues are Redis lists: redis-cli type queue_name
- Check database numbers - Verify your app uses the expected Redis database
- Review logs - Look for discovery debug messages in exporter logs
Force Restart Discovery
# Restart the exporter to re-run discovery
kubectl rollout restart deployment/celery-metrics-exporter -n celery-monitoring
Security Notes
- The exporter connects to Redis using the shared redis-credentials secret
- All database connections use the same Redis host and password
- Only queue length information is exposed, not queue contents
- The exporter scans all databases but only reports queue-like keys
- Metrics are scraped via ServiceMonitor for OpenTelemetry collection