Auto-Discovery Celery Metrics Exporter

The Celery metrics exporter now automatically discovers all Redis databases and their queues without requiring manual configuration. It scans all Redis databases (0-15) and identifies potential Celery queues based on patterns and naming conventions.

How Auto-Discovery Works

Automatic Database Scanning

  • Scans Redis databases 0-15 by default
  • Only monitors databases that contain keys
  • Only includes databases that have identifiable queues
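The first two rules above can be sketched in Python as follows. This is a minimal illustration, not the exporter's actual code: the function name, the `connect` factory, and the skip-on-error policy are all assumptions. It relies only on a client object exposing `dbsize()`, which redis-py's `redis.Redis` provides.

```python
from typing import Callable, Dict, Protocol


class RedisLike(Protocol):
    """The small subset of a Redis client this sketch relies on."""

    def dbsize(self) -> int: ...


def find_nonempty_databases(
    connect: Callable[[int], RedisLike],
    max_databases: int = 16,
) -> Dict[int, int]:
    """Return {db_number: key_count} for every database that holds keys."""
    nonempty: Dict[int, int] = {}
    for db_number in range(max_databases):
        try:
            key_count = connect(db_number).dbsize()
        except Exception:
            # Assumed policy: an unreachable database is skipped, not fatal.
            continue
        if key_count > 0:
            nonempty[db_number] = key_count
    return nonempty


# With redis-py, the factory could look like this (host and password are placeholders):
#   import redis
#   connect = lambda n: redis.Redis(host="REDIS_HOST", db=n, password="PASSWORD")
#   find_nonempty_databases(connect)
```

The third rule (keeping only databases with identifiable queues) would be applied afterwards, once the keys in each non-empty database have been classified.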

Automatic Queue Discovery

The exporter supports two discovery modes:

Smart Filtering Mode (Default: monitor_all_lists: false)

Identifies queues using multiple strategies:

  1. Pattern Matching: Matches known queue patterns from your applications:

    • celery, *_priority, default, mailers, push, scheduler
    • streams, images, suggested_users, email, connectors, lists, inbox, imports, import_triggered, misc (BookWyrm)
    • background, send (PieFed)
    • high, mmo (Pixelfed/Laravel)
  2. Heuristic Detection: Identifies Redis lists containing queue-related keywords:

    • Keys containing: queue, celery, task, job, work
  3. Type Checking: Only considers Redis list type keys (Celery queues are Redis lists)
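One way to combine the three strategies is sketched below. The function name and the order of the checks are assumptions made for illustration; only the patterns, keywords, and the `monitor_all_lists` flag come from this document. Glob patterns such as `*_priority` are matched with the standard library's `fnmatch`.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# A subset of the patterns and keywords listed above.
QUEUE_PATTERNS = ("celery", "*_priority", "default", "mailers", "push", "scheduler")
QUEUE_KEYWORDS = ("queue", "celery", "task", "job", "work")


def is_queue_candidate(
    key: str,
    key_type: str,
    patterns: Iterable[str] = QUEUE_PATTERNS,
    keywords: Iterable[str] = QUEUE_KEYWORDS,
    monitor_all_lists: bool = False,
) -> bool:
    """Decide whether a Redis key should be reported as a queue."""
    # Type checking: anything that is not a Redis list is never a queue.
    if key_type != "list":
        return False
    # Monitor Everything mode: every list qualifies, no further filtering.
    if monitor_all_lists:
        return True
    # Pattern matching against the known queue names.
    if any(fnmatchcase(key, pattern) for pattern in patterns):
        return True
    # Heuristic detection: substring match on queue-related keywords.
    lowered = key.lower()
    return any(word in lowered for word in keywords)
```

Note that a plain substring heuristic is permissive: in this sketch a list named `network_cache` would match the keyword `work`, which is one source of the noise that smart filtering tries to limit.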

Monitor Everything Mode (monitor_all_lists: true)

  • Monitors ALL Redis list-type keys in all databases
  • No filtering or pattern matching
  • Maximum visibility but potentially more noise
  • Useful for debugging or comprehensive monitoring

Which Mode Should You Use?

Use Smart Filtering (default) when:

  • You want clean, relevant metrics
  • You care about Prometheus cardinality limits
  • Your applications use standard queue naming
  • You want to avoid monitoring non-queue Redis lists

Use Monitor Everything when:

  • You're debugging queue discovery issues
  • You have non-standard queue names not covered by patterns
  • You want absolute certainty you're not missing anything
  • You have sufficient Prometheus storage/performance headroom
  • You don't mind potential noise from non-queue lists

Configuration (Optional)

While the exporter works completely automatically, you can customize its behavior via the celery-exporter-config ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: celery-exporter-config
  namespace: celery-monitoring
data:
  config.yaml: |
    # Auto-discovery settings
    auto_discovery:
      enabled: true
      scan_databases: true  # Scan all Redis databases 0-15
      scan_queues: true     # Auto-discover queues in each database
      monitor_all_lists: false  # If true, monitor ALL Redis lists, not just queue-like ones
      
    # Queue patterns to look for (Redis list keys that are likely Celery queues)
    queue_patterns:
      - "celery"
      - "*_priority"
      - "default"
      - "mailers"
      - "push"
      - "scheduler"
      - "broadcast"
      - "federation"
      - "media"
      - "user_dir"
      
    # Optional: Database name mapping (if you want friendly names)
    # If not specified, databases will be named "db_0", "db_1", etc.
    database_names:
      0: "piefed"
      1: "mastodon"
      2: "matrix"
      3: "bookwyrm"
      
    # Minimum queue length to report (avoid noise from empty queues)
    min_queue_length: 0
    
    # Maximum number of databases to scan (safety limit)
    max_databases: 16

Adding New Applications

No configuration needed! New applications are automatically discovered when they:

  1. Use a Redis database (any database 0-15)
  2. Create queues that match common patterns or contain queue-related keywords
  3. Use Redis lists for their queues (standard Celery behavior)

Custom Queue Patterns

If your application uses non-standard queue names, add them to the queue_patterns list:

kubectl edit configmap celery-exporter-config -n celery-monitoring

Add your pattern:

queue_patterns:
  - "celery"
  - "*_priority"
  - "my_custom_queue_*"  # Add your pattern here

Friendly Database Names

To give databases friendly names instead of db_0, db_1, etc.:

database_names:
  0: "piefed"
  1: "mastodon"
  2: "matrix"
  3: "bookwyrm"
  4: "my_new_app"  # Add your app here
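The fallback naming can be expressed in a few lines of Python. The helper name is hypothetical, and the sketch assumes the mapping has integer keys, which is how a YAML loader would typically read the block above.

```python
from typing import Mapping


def database_label(db_number: int, database_names: Mapping[int, str]) -> str:
    """Return the friendly name if one is configured, else "db_<number>"."""
    return database_names.get(db_number, f"db_{db_number}")


names = {0: "piefed", 1: "mastodon", 2: "matrix", 3: "bookwyrm"}
print(database_label(3, names))  # configured name
print(database_label(7, names))  # falls back to the db_N form
```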

Metrics Produced

The exporter produces these metrics for each discovered database:

celery_queue_length

  • Labels: queue_name, database, db_number
  • Description: Number of pending tasks in each queue
  • Example: celery_queue_length{queue_name="celery", database="piefed", db_number="0"} 1234
  • Special: queue_name="_total" shows total tasks across all queues in a database
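To make the `_total` series concrete, the sketch below builds the sample lines for one database from a dictionary of queue lengths. It is illustrative only: the function name is hypothetical, the output simply mirrors the example line above, and a real exporter would normally let a Prometheus client library do the formatting and escaping.

```python
from typing import Dict, List


def render_queue_length(database: str, db_number: int, lengths: Dict[str, int]) -> List[str]:
    """Render one celery_queue_length line per queue, plus the _total line."""
    samples = dict(lengths)
    # The special _total series is the sum over all queues in this database.
    samples["_total"] = sum(lengths.values())
    return [
        f'celery_queue_length{{queue_name="{name}", database="{database}", '
        f'db_number="{db_number}"}} {value}'
        for name, value in samples.items()
    ]


for line in render_queue_length("piefed", 0, {"celery": 1200, "background": 34}):
    print(line)
```

Because `_total` duplicates the per-queue values, queries that sum over queues need the `queue_name!="_total"` filter to avoid double counting.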

redis_connection_status

  • Labels: database, db_number
  • Description: Connection status per database (1=connected, 0=disconnected)
  • Example: redis_connection_status{database="piefed", db_number="0"} 1

celery_databases_discovered

  • Description: Total number of databases with queues discovered
  • Example: celery_databases_discovered 4

celery_queues_discovered

  • Labels: database
  • Description: Number of queues discovered per database
  • Example: celery_queues_discovered{database="bookwyrm"} 5

celery_queue_info

  • Description: General information about all monitored queues
  • Includes: Total lengths, Redis host, last update timestamp, auto-discovery status

PromQL Query Examples

Discovery Overview

# How many databases were discovered
celery_databases_discovered

# How many queues per database
celery_queues_discovered

# Auto-discovery status
celery_queue_info

All Applications Overview

# All queue lengths grouped by database
sum by (database) (celery_queue_length{queue_name!="_total"})

# Total tasks across all databases
sum(celery_queue_length{queue_name="_total"})

# Individual queues (excluding totals)
celery_queue_length{queue_name!="_total"}

# Only active queues (> 0 tasks)
celery_queue_length{queue_name!="_total"} > 0

Specific Applications

# PieFed queues only
celery_queue_length{database="piefed", queue_name!="_total"}

# BookWyrm high priority queue (if it exists)
celery_queue_length{database="bookwyrm", queue_name="high_priority"}

# All applications' main celery queue
celery_queue_length{queue_name="celery"}

# Database totals only
celery_queue_length{queue_name="_total"}

Processing Rates

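# Net drain rate in tasks per minute (positive = queue shrinking).
# celery_queue_length is a gauge, so use deriv(); rate() expects a counter and
# treats every decrease as a counter reset. This is net change (processed minus
# newly enqueued), not a true count of processed tasks.
deriv(celery_queue_length{queue_name!="_total"}[5m]) * -60

# Net drain rate by database (using totals)
deriv(celery_queue_length{queue_name="_total"}[5m]) * -60

# Overall net drain rate across all databases
sum(deriv(celery_queue_length{queue_name="_total"}[5m]) * -60)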
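The arithmetic behind these queries (a per-second slope multiplied by -60) can be checked offline with a short Python sketch. It fits a least-squares line through the samples, which is the approach PromQL's `deriv()` takes for gauges; the function name and the sample data are made up for illustration.

```python
from typing import Sequence, Tuple


def drain_rate_per_minute(samples: Sequence[Tuple[float, float]]) -> float:
    """Net tasks drained per minute from (timestamp_seconds, queue_length) samples.

    Positive means the queue is shrinking; negative means it is growing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        raise ValueError("samples need distinct timestamps")
    slope_per_second = covariance / variance  # least-squares slope
    return slope_per_second * -60


# A queue that drops by 60 tasks every minute drains at 60 tasks/minute.
print(drain_rate_per_minute([(0, 1000), (60, 940), (120, 880)]))
```

The result is a net figure: if producers enqueue as fast as workers consume, it stays near zero even though tasks are being processed.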

Health Monitoring

# Databases with connection issues
redis_connection_status == 0

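# Queues growing too fast (more than 1000 tasks added in 5 minutes)
# delta() is the gauge counterpart of increase(), which is meant for counters
delta(celery_queue_length{queue_name!="_total"}[5m]) > 1000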

# Stalled processing (no change in 15 minutes)
changes(celery_queue_length{queue_name="_total"}[15m]) == 0 and celery_queue_length{queue_name="_total"} > 100

# Databases that stopped being discovered
# changes() counts value changes and is never negative, so compare with delta() instead
delta(celery_databases_discovered[10m]) < 0
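The stalled-processing condition combines two checks: the value never changed inside the window, and the backlog is above a threshold. A small Python sketch of that logic, with hypothetical function names, can help when choosing the threshold. `count_changes` mirrors what PromQL's `changes()` reports, namely how many times consecutive samples differ.

```python
from typing import Sequence


def count_changes(samples: Sequence[float]) -> int:
    """Number of times the value differs from the previous sample."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if cur != prev)


def is_stalled(samples: Sequence[float], min_backlog: float = 100) -> bool:
    """True when the queue length is flat over the window and above min_backlog."""
    return bool(samples) and count_changes(samples) == 0 and samples[-1] > min_backlog


print(is_stalled([500, 500, 500]))  # flat with a backlog
print(is_stalled([500, 480, 500]))  # moving, so not stalled
print(is_stalled([50, 50, 50]))     # flat but below the threshold
```

A flat, non-empty queue can also mean producers and workers are exactly balanced, so the check is a hint rather than proof that workers are down.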

Troubleshooting

Check Auto-Discovery Status

# View current configuration
kubectl get configmap celery-exporter-config -n celery-monitoring -o yaml

# Check exporter logs for discovery results
kubectl logs -n celery-monitoring deployment/celery-metrics-exporter

# Look for discovery messages like:
# "Database 0 (piefed): 1 queues, 245 total keys"
# "Auto-discovery complete: Found 3 databases with queues"

Test Redis Connectivity

# Test connection to specific database
kubectl exec -n redis-system redis-master-0 -- redis-cli -a PASSWORD -n DB_NUMBER ping

# Check what keys exist in a database
# (KEYS blocks Redis while it runs; on large databases prefer redis-cli --scan)
kubectl exec -n redis-system redis-master-0 -- redis-cli -a PASSWORD -n DB_NUMBER keys '*'

# Check if a key is a list (queue)
kubectl exec -n redis-system redis-master-0 -- redis-cli -a PASSWORD -n DB_NUMBER type QUEUE_NAME

# Check queue length manually
kubectl exec -n redis-system redis-master-0 -- redis-cli -a PASSWORD -n DB_NUMBER llen QUEUE_NAME

Validate Metrics

# Port forward and check metrics endpoint
kubectl port-forward -n celery-monitoring svc/celery-metrics-exporter 8000:8000

# Check discovery metrics
curl http://localhost:8000/metrics | grep celery_databases_discovered
curl http://localhost:8000/metrics | grep celery_queues_discovered

# Check queue metrics
curl http://localhost:8000/metrics | grep celery_queue_length
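For scripted checks, the text returned by the metrics endpoint can be parsed instead of grepped. The sketch below is a deliberately small parser for `celery_queue_length` lines only; the function name is hypothetical, and it skips lines with escaped quotes in label values or a trailing timestamp. A full parser from a Prometheus client library would be the more robust choice.

```python
import re
from typing import Dict, Tuple

_SAMPLE = re.compile(r'^celery_queue_length\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_queue_lengths(metrics_text: str) -> Dict[Tuple[str, str], float]:
    """Map (database, queue_name) to the reported queue length."""
    result: Dict[Tuple[str, str], float] = {}
    for line in metrics_text.splitlines():
        match = _SAMPLE.match(line.strip())
        if not match:
            continue  # comments, other metrics, or unsupported line shapes
        labels = dict(_LABEL.findall(match.group("labels")))
        key = (labels.get("database", ""), labels.get("queue_name", ""))
        result[key] = float(match.group("value"))
    return result


sample = '''# TYPE celery_queue_length gauge
celery_queue_length{queue_name="celery", database="piefed", db_number="0"} 1234
redis_connection_status{database="piefed", db_number="0"} 1
'''
print(parse_queue_lengths(sample))
```

With the port-forward above running, the input text could come from `urllib.request.urlopen("http://localhost:8000/metrics").read().decode()`.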

Debug Discovery Issues

If queues aren't being discovered:

  1. Check queue patterns - Add your queue names to queue_patterns
  2. Verify queue type - Ensure queues are Redis lists: redis-cli type queue_name
  3. Check database numbers - Verify your app uses the expected Redis database
  4. Review logs - Look for discovery debug messages in exporter logs

Force Restart Discovery

# Restart the exporter to re-run discovery
kubectl rollout restart deployment/celery-metrics-exporter -n celery-monitoring

Security Notes

  • The exporter connects to Redis using the shared redis-credentials secret
  • All database connections use the same Redis host and password
  • Only queue length information is exposed, not queue contents
  • The exporter scans all databases but only reports queue-like keys
  • Metrics are scraped via ServiceMonitor for OpenTelemetry collection