# Auto-Discovery Celery Metrics Exporter

The Celery metrics exporter now **automatically discovers** all Redis databases and their queues without requiring manual configuration. It scans all Redis databases (0-15) and identifies potential Celery queues based on patterns and naming conventions.

## How Auto-Discovery Works

### Automatic Database Scanning

- Scans Redis databases 0-15 by default
- Only monitors databases that contain keys
- Only includes databases that have identifiable queues

### Automatic Queue Discovery

The exporter supports two discovery modes:

#### Smart Filtering Mode (Default: `monitor_all_lists: false`)

Identifies queues using multiple strategies:

1. **Pattern Matching**: Matches known queue patterns from your applications:
   - `celery`, `*_priority`, `default`, `mailers`, `push`, `scheduler`
   - `streams`, `images`, `suggested_users`, `email`, `connectors`, `lists`, `inbox`, `imports`, `import_triggered`, `misc` (BookWyrm)
   - `background`, `send` (PieFed)
   - `high`, `mmo` (Pixelfed/Laravel)
2. **Heuristic Detection**: Identifies Redis lists containing queue-related keywords:
   - Keys containing: `queue`, `celery`, `task`, `job`, `work`
3. **Type Checking**: Only considers Redis `list` type keys (Celery queues are Redis lists)

#### Monitor Everything Mode (`monitor_all_lists: true`)

- Monitors **ALL** Redis list-type keys in all databases
- No filtering or pattern matching
- Maximum visibility but potentially more noise
- Useful for debugging or comprehensive monitoring

### Which Mode Should You Use?

**Use Smart Filtering (default)** when:

- ✅ You want clean, relevant metrics
- ✅ You care about Prometheus cardinality limits
- ✅ Your applications use standard queue naming
- ✅ You want to avoid monitoring non-queue Redis lists

**Use Monitor Everything** when:

- ✅ You're debugging queue discovery issues
- ✅ You have non-standard queue names not covered by patterns
- ✅ You want absolute certainty you're not missing anything
- ✅ You have sufficient Prometheus storage/performance headroom
- ❌ You don't mind potential noise from non-queue lists

## Configuration (Optional)

While the exporter works completely automatically, you can customize its behavior via the `celery-exporter-config` ConfigMap:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: celery-exporter-config
  namespace: celery-monitoring
data:
  config.yaml: |
    # Auto-discovery settings
    auto_discovery:
      enabled: true
      scan_databases: true      # Scan all Redis databases 0-15
      scan_queues: true         # Auto-discover queues in each database
      monitor_all_lists: false  # If true, monitor ALL Redis lists, not just queue-like ones

    # Queue patterns to look for (Redis list keys that are likely Celery queues)
    queue_patterns:
      - "celery"
      - "*_priority"
      - "default"
      - "mailers"
      - "push"
      - "scheduler"
      - "broadcast"
      - "federation"
      - "media"
      - "user_dir"

    # Optional: Database name mapping (if you want friendly names)
    # If not specified, databases will be named "db_0", "db_1", etc.
    database_names:
      0: "piefed"
      1: "mastodon"
      2: "matrix"
      3: "bookwyrm"

    # Minimum queue length to report (avoid noise from empty queues)
    min_queue_length: 0

    # Maximum number of databases to scan (safety limit)
    max_databases: 16
```

## Adding New Applications

**No configuration needed!** New applications are automatically discovered when they:

1. **Use a Redis database** (any database 0-15)
2. **Create queues** that match common patterns or contain queue-related keywords
3. **Use Redis lists** for their queues (standard Celery behavior)

### Custom Queue Patterns

If your application uses non-standard queue names, add them to the `queue_patterns` list:

```bash
kubectl edit configmap celery-exporter-config -n celery-monitoring
```

Add your pattern:

```yaml
queue_patterns:
  - "celery"
  - "*_priority"
  - "my_custom_queue_*"  # Add your pattern here
```

### Friendly Database Names

To give databases friendly names instead of `db_0`, `db_1`, etc.:

```yaml
database_names:
  0: "piefed"
  1: "mastodon"
  2: "matrix"
  3: "bookwyrm"
  4: "my_new_app"  # Add your app here
```

## Metrics Produced

The exporter produces these metrics for each discovered database:

### `celery_queue_length`

- **Labels**: `queue_name`, `database`, `db_number`
- **Description**: Number of pending tasks in each queue
- **Example**: `celery_queue_length{queue_name="celery", database="piefed", db_number="0"} 1234`
- **Special**: `queue_name="_total"` shows total tasks across all queues in a database

### `redis_connection_status`

- **Labels**: `database`, `db_number`
- **Description**: Connection status per database (1=connected, 0=disconnected)
- **Example**: `redis_connection_status{database="piefed", db_number="0"} 1`

### `celery_databases_discovered`

- **Description**: Total number of databases with queues discovered
- **Example**: `celery_databases_discovered 4`

### `celery_queues_discovered`

- **Labels**: `database`
- **Description**: Number of queues discovered per database
- **Example**: `celery_queues_discovered{database="bookwyrm"} 5`

### `celery_queue_info`

- **Description**: General information about all monitored queues
- **Includes**: Total lengths, Redis host, last update timestamp, auto-discovery status

## PromQL Query Examples

### Discovery Overview

```promql
# How many databases were discovered
celery_databases_discovered

# How many queues per database
celery_queues_discovered

# Auto-discovery status
celery_queue_info
```

### All Applications Overview

```promql
# All queue lengths grouped by database
sum by (database) (celery_queue_length{queue_name!="_total"})

# Total tasks across all databases
sum(celery_queue_length{queue_name="_total"})

# Individual queues (excluding totals)
celery_queue_length{queue_name!="_total"}

# Only active queues (> 0 tasks)
celery_queue_length{queue_name!="_total"} > 0
```

### Specific Applications

```promql
# PieFed queues only
celery_queue_length{database="piefed", queue_name!="_total"}

# BookWyrm high priority queue (if it exists)
celery_queue_length{database="bookwyrm", queue_name="high_priority"}

# All applications' main celery queue
celery_queue_length{queue_name="celery"}

# Database totals only
celery_queue_length{queue_name="_total"}
```

### Processing Rates

`celery_queue_length` is a gauge, so these queries use `deriv()` rather than `rate()`, which is intended for counters and treats every decrease as a counter reset. The result is the net change in queue length (tasks consumed minus tasks enqueued), not the true task throughput.

```promql
# Net tasks drained per minute (positive = queue shrinking, negative = queue growing)
deriv(celery_queue_length{queue_name!="_total"}[5m]) * -60

# Net drain rate by database (using totals)
deriv(celery_queue_length{queue_name="_total"}[5m]) * -60

# Overall net drain rate across all databases
sum(deriv(celery_queue_length{queue_name="_total"}[5m])) * -60
```

### Health Monitoring

```promql
# Databases with connection issues
redis_connection_status == 0

# Queues growing too fast (net growth of more than 1000 tasks in 5 minutes)
delta(celery_queue_length{queue_name!="_total"}[5m]) > 1000

# Stalled processing (no change in 15 minutes)
changes(celery_queue_length{queue_name="_total"}[15m]) == 0 and celery_queue_length{queue_name="_total"} > 100

# Databases that stopped being discovered (count dropped within 10 minutes)
delta(celery_databases_discovered[10m]) < 0
```

## Troubleshooting

### Check Auto-Discovery Status

```bash
# View current configuration
kubectl get configmap celery-exporter-config -n celery-monitoring -o yaml

# Check exporter logs for discovery results
kubectl logs -n celery-monitoring deployment/celery-metrics-exporter

# Look for discovery messages like:
# "Database 0 (piefed): 1 queues, 245 total keys"
# "Auto-discovery complete: Found 3 databases with queues"
```

### Test Redis Connectivity

```bash
# Test connection to specific database
kubectl exec -n redis-system redis-master-0 -- redis-cli -a PASSWORD -n DB_NUMBER ping

# Check what keys exist in a database
# (--scan iterates incrementally; KEYS '*' blocks Redis on large databases)
kubectl exec -n redis-system redis-master-0 -- redis-cli -a PASSWORD -n DB_NUMBER --scan --pattern '*'

# Check if a key is a list (queue)
kubectl exec -n redis-system redis-master-0 -- redis-cli -a PASSWORD -n DB_NUMBER type QUEUE_NAME

# Check queue length manually
kubectl exec -n redis-system redis-master-0 -- redis-cli -a PASSWORD -n DB_NUMBER llen QUEUE_NAME
```

### Validate Metrics

```bash
# Port forward the metrics endpoint (stays in the foreground; run it in a separate terminal)
kubectl port-forward -n celery-monitoring svc/celery-metrics-exporter 8000:8000

# Check discovery metrics
curl -s http://localhost:8000/metrics | grep celery_databases_discovered
curl -s http://localhost:8000/metrics | grep celery_queues_discovered

# Check queue metrics
curl -s http://localhost:8000/metrics | grep celery_queue_length
```

### Debug Discovery Issues

If queues aren't being discovered:

1. **Check queue patterns** - Add your queue names to `queue_patterns`
2. **Verify queue type** - Ensure queues are Redis lists: `redis-cli type queue_name`
3. **Check database numbers** - Verify your app uses the expected Redis database
4. **Review logs** - Look for discovery debug messages in exporter logs

### Force Restart Discovery

```bash
# Restart the exporter to re-run discovery
kubectl rollout restart deployment/celery-metrics-exporter -n celery-monitoring
```

## Security Notes

- The exporter connects to Redis using the shared `redis-credentials` secret
- All database connections use the same Redis host and password
- Only queue length information is exposed, not queue contents
- The exporter scans all databases but only reports queue-like keys
- Metrics are scraped via ServiceMonitor for OpenTelemetry collection
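
## Appendix: Smart Filtering Sketch

The Smart Filtering and Monitor Everything modes described under "Automatic Queue Discovery" can be illustrated with a short Python sketch. This is not the exporter's actual source: the function names `looks_like_queue` and `discover_queues` are hypothetical, the order of the checks is an assumption, and so is the case-insensitive keyword match. The patterns and keywords are copied from the lists in this document, and the sketch works on plain `(key, type)` pairs so it runs without a Redis connection.

```python
from fnmatch import fnmatchcase

# Patterns and keywords taken from this document; all names below are illustrative.
QUEUE_PATTERNS = ["celery", "*_priority", "default", "mailers", "push", "scheduler"]
QUEUE_KEYWORDS = ("queue", "celery", "task", "job", "work")


def looks_like_queue(key, key_type, patterns=QUEUE_PATTERNS, monitor_all_lists=False):
    """Return True if a Redis key should be reported as a queue."""
    # Type checking: Celery queues are Redis lists, so every other type is skipped.
    if key_type != "list":
        return False
    # Monitor Everything mode: report every list without further filtering.
    if monitor_all_lists:
        return True
    # Pattern matching: shell-style globs such as "*_priority".
    if any(fnmatchcase(key, pattern) for pattern in patterns):
        return True
    # Heuristic detection: a queue-related keyword anywhere in the key name
    # (case-insensitive here, which is an assumption).
    lowered = key.lower()
    return any(word in lowered for word in QUEUE_KEYWORDS)


def discover_queues(keys_with_types, **kwargs):
    """Filter (key, redis_type) pairs down to a sorted list of queue names."""
    return sorted(k for k, t in keys_with_types if looks_like_queue(k, t, **kwargs))


if __name__ == "__main__":
    sample = [
        ("celery", "list"),
        ("high_priority", "list"),
        ("session:abc", "string"),
        ("recent_posts", "list"),
        ("background_jobs", "list"),
    ]
    print(discover_queues(sample))
    # ['background_jobs', 'celery', 'high_priority']
    print(discover_queues(sample, monitor_all_lists=True))
    # ['background_jobs', 'celery', 'high_priority', 'recent_posts']
```

In this sketch `recent_posts` is a list but matches neither a pattern nor a keyword, so it only appears in Monitor Everything mode. Passing a custom `patterns` list mirrors adding entries to `queue_patterns` in the ConfigMap.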