Michael DiLeo 7327d77dcd redaction (#1)
Add the redacted source file for demo purposes

Reviewed-on: https://source.michaeldileo.org/michael_dileo/Keybard-Vagabond-Demo/pulls/1
Co-authored-by: Michael DiLeo <michael_dileo@proton.me>
Co-committed-by: Michael DiLeo <michael_dileo@proton.me>
2025-12-24 13:40:47 +00:00

# Celery Monitoring (Flower)
This directory contains the infrastructure for monitoring Celery tasks across all applications in the cluster using Flower.
## Overview
- **Flower**: Web-based tool for monitoring and administrating Celery clusters
- **Multi-Application**: Monitors both PieFed and BookWyrm Celery tasks
- **Namespace**: `celery-monitoring`
- **URL**: `https://flower.keyboardvagabond.com`
## Components
- `namespace.yaml` - Dedicated namespace for monitoring
- `flower-deployment.yaml` - Flower application deployment
- `service.yaml` - Internal service for Flower
- `ingress.yaml` - External access with TLS and basic auth
- `kustomization.yaml` - Kustomize configuration
## Redis Database Monitoring
Flower monitors multiple Redis databases:
- **Database 0**: PieFed Celery broker
- **Database 3**: BookWyrm Celery broker
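To spot-check what Flower should be seeing, the pending task count can be read directly from a broker database. This is a minimal sketch, assuming the brokers use Celery's default Redis transport (each queue is a Redis list) and that `REDIS_HOST` is a hypothetical placeholder for the Redis master address. The helper name `broker_queue_length` is also made up for illustration.

```bash
# Assumptions: REDIS_HOST is a placeholder, not the real service name,
# and queues are stored as Redis lists (Celery's default Redis transport).
REDIS_CLI="${REDIS_CLI:-redis-cli}"
REDIS_HOST="${REDIS_HOST:-redis.example.internal}"

# Print the number of pending tasks in queue $2 of Redis database $1.
broker_queue_length() {
  "$REDIS_CLI" -h "$REDIS_HOST" -n "$1" LLEN "$2"
}

# Usage:
#   broker_queue_length 0 celery        # PieFed broker (DB 0)
#   broker_queue_length 3 <queue-name>  # BookWyrm broker (DB 3)
```

BookWyrm's queue names depend on its configuration, so the second call leaves the queue name open.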
## Access & Security
- **Access Method**: kubectl port-forward (local access only)
- **Command**: `kubectl port-forward -n celery-monitoring svc/celery-flower 8080:5555`
- **URL**: http://localhost:8080
- **Security**: No authentication required (local access only)
- **Network Policies**: Cilium policies allow cluster and health check access only
### Port-Forward Setup
1. **Prerequisites**:
   - Valid kubeconfig with access to the cluster
   - kubectl installed and configured
   - RBAC permissions to create port-forwards in celery-monitoring namespace
2. **Network Policies**: Cilium policies ensure:
   - Port 5555 access from cluster and host (for port-forward)
   - Redis access for monitoring (DB 0 & 3)
   - Cluster-internal health checks
3. **No Authentication Required**:
   - Port-forward provides secure local access
   - No additional credentials needed
## **🔒 Simplified Security Architecture**
**Current Status**: ✅ **Local access via kubectl port-forward**
### **Security Model**
**1. Local Access Only**
- **Port-Forward**: `kubectl port-forward` provides secure tunnel to the service
- **No External Exposure**: Service is not accessible from outside the cluster
- **Authentication**: Kubernetes RBAC controls who can create port-forwards
- **Encryption**: Traffic encrypted via Kubernetes API tunnel
**2. Network Layer (Cilium Network Policies)**
- **`celery-flower-ingress`**: Allows cluster and host access for port-forward and health checks
- **`celery-flower-egress`**: Restricts outbound to Redis and DNS only
- **DNS Resolution**: Explicit DNS access for service discovery
- **Redis Connectivity**: Targeted access to Redis master (DB 0 & 3)
**3. Pod-Level Security**
- Resource limits (CPU: 500m, Memory: 256Mi)
- Health checks (liveness/readiness probes)
- Non-root container execution
- Read-only root filesystem (where possible)
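The pod-level settings listed above can be checked against the live deployment. The sketch below assumes Flower is the first container in the `celery-flower` deployment. The helper name `flower_container_field` is hypothetical.

```bash
KUBECTL="${KUBECTL:-kubectl}"

# Print one field of the first container in the celery-flower deployment.
# $1 is a JSONPath fragment relative to that container, e.g. "resources.limits".
flower_container_field() {
  "$KUBECTL" get deployment celery-flower -n celery-monitoring \
    -o "jsonpath={.spec.template.spec.containers[0].$1}"
}

# Usage:
#   flower_container_field resources.limits   # expect CPU 500m, memory 256Mi
#   flower_container_field securityContext    # non-root / read-only root FS settings
#   flower_container_field livenessProbe
```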
### **How It Works**
1. **Access Layer**: kubectl port-forward creates secure tunnel via Kubernetes API
2. **Network Layer**: Cilium policies ensure only cluster traffic reaches pods
3. **Application Layer**: Flower connects only to authorized Redis databases
4. **Monitoring Layer**: Health checks ensure service availability
5. **Local Security**: Access requires valid kubeconfig and RBAC permissions
## Features
- **Flower Web UI**: Real-time task monitoring and worker status
- **Prometheus Metrics**: Custom Celery queue metrics exported to OpenObserve
- **Automated Alerts**: Queue size and connection status monitoring
- **Dashboard**: Visual monitoring of queue trends and processing rates
## Monitoring & Alerts
### Metrics Exported
**From Celery Metrics Exporter** (celery-monitoring namespace):
1. **`celery_queue_length`**: Number of pending tasks in each queue
   - Labels: `queue_name`, `database` (piefed/bookwyrm)
2. **`redis_connection_status`**: Redis connectivity status (1=connected, 0=disconnected)
3. **`celery_queue_info`**: General information about queue status
**From Redis Exporter** (redis-system namespace):
4. **`redis_list_length`**: General Redis list lengths including Celery queues
5. **`redis_memory_used_bytes`**: Redis memory usage
6. **`redis_connected_clients`**: Number of connected Redis clients
7. **`redis_commands_total`**: Total Redis commands executed
### Alert Thresholds
- **PieFed Warning**: > 10,000 pending tasks
- **PieFed Critical**: > 50,000 pending tasks
- **BookWyrm Warning**: > 1,000 pending tasks
- **Redis Connection**: Connection lost alert
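For ad-hoc checks outside OpenObserve, the thresholds above can be written as a small shell function. This is only an illustration of the threshold logic, not how the alerts are evaluated in OpenObserve. The function name `classify_queue` is hypothetical.

```bash
# Map an application name and a pending-task count to a severity level,
# using the thresholds listed above (strictly greater-than comparisons).
classify_queue() {
  app="$1"
  length="$2"
  case "$app" in
    piefed)
      if [ "$length" -gt 50000 ]; then echo critical
      elif [ "$length" -gt 10000 ]; then echo warning
      else echo ok
      fi ;;
    bookwyrm)
      if [ "$length" -gt 1000 ]; then echo warning
      else echo ok
      fi ;;
    *)
      echo "unknown app: $app" >&2
      return 2 ;;
  esac
}

# Usage:
#   classify_queue piefed 12000    # prints "warning"
#   classify_queue bookwyrm 500    # prints "ok"
```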
### OpenObserve Setup
1. **Deploy the monitoring infrastructure**:
```bash
kubectl apply -k manifests/infrastructure/celery-monitoring/
```
2. **Import alerts and dashboard**:
- Access OpenObserve dashboard
- Import alert configurations from the `openobserve-alert-configs` ConfigMap
- Import dashboard from the same ConfigMap
- Configure webhook URLs for notifications
3. **Verify metrics collection**:
```sql
SELECT * FROM metrics WHERE __name__ LIKE 'celery_%' ORDER BY _timestamp DESC LIMIT 10
```
### Useful Monitoring Queries
**Current queue sizes**:
```sql
SELECT queue_name, database, MAX(celery_queue_length) AS queue_length
FROM metrics
WHERE _timestamp >= now() - interval '5 minutes'
GROUP BY queue_name, database
ORDER BY queue_length DESC
**Queue processing rate** (change in queue length between consecutive samples; negative values mean the queue is draining):
```sql
SELECT _timestamp,
celery_queue_length - LAG(celery_queue_length, 1) OVER (ORDER BY _timestamp) as processing_rate
FROM metrics
WHERE queue_name='celery' AND database='piefed'
AND _timestamp >= now() - interval '1 hour'
```
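The `LAG` expression in the query above subtracts the previous sample from the current one. The same arithmetic on a small, made-up series of queue lengths looks like this (the function name `queue_deltas` and the sample values are hypothetical):

```bash
# Read one queue length per line; for every line after the first,
# print (current - previous). Negative output means the queue shrank.
queue_deltas() {
  awk 'NR > 1 { print $1 - prev } { prev = $1 }'
}

# Hypothetical samples: the queue grows by 30, drains by 60, then holds steady.
printf '%s\n' 120 150 90 90 | queue_deltas
# prints:
#   30
#   -60
#   0
```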
The Flower UI also provides:
- Queue length monitoring
- Task history and details
- Performance metrics
- Multi-broker support
## Dependencies
- Redis (for Celery brokers)
- kubectl (for port-forward access)
- Valid kubeconfig with cluster access
## Testing & Validation
### Quick Access
```bash
# Start port-forward (runs in background)
kubectl port-forward -n celery-monitoring svc/celery-flower 8080:5555 &
# Access Flower UI (`open` is macOS-only; on Linux use xdg-open)
open http://localhost:8080
# or visit http://localhost:8080 in your browser
# Stop port-forward when done
pkill -f "kubectl port-forward.*celery-flower"
```
### Manual Testing Checklist
1. **Port-Forward Access**: ✅ Can access http://localhost:8080 after port-forward
2. **No External Access**: ❌ Service not accessible from outside cluster
3. **Redis Connectivity**: 📊 Shows tasks from both PieFed (DB 0) and BookWyrm (DB 3)
4. **Health Checks**: ✅ Pod shows Ready status
5. **Network Policies**: 🛡️ Egress restricted to DNS and Redis only
### Troubleshooting Commands
```bash
# Check Flower pod status
kubectl get pods -n celery-monitoring -l app.kubernetes.io/name=celery-flower
# View Flower logs
kubectl logs -n celery-monitoring -l app.kubernetes.io/name=celery-flower
# Test that the Flower UI responds from inside the pod (assumes wget exists in the image)
kubectl exec -n celery-monitoring deployment/celery-flower -- wget -qO- http://localhost:5555
# Check network policies
kubectl get cnp -n celery-monitoring
# Check that your account may create port-forwards (kubectl port-forward has no --dry-run flag)
kubectl auth can-i create pods/portforward -n celery-monitoring
```
## Deployment
Deployed automatically via Flux GitOps from `manifests/cluster/flux-system/celery-monitoring.yaml`.