# PieFed Database Migration Setup

## Overview

Database migrations are now handled by a **dedicated Kubernetes Job** that runs before the web and worker pods start. Because a single Job applies the migrations, this eliminates the race conditions of the old per-pod setup and follows the usual Kubernetes pattern of running one-off tasks as Jobs.

## Architecture

```
1. piefed-db-init Job (runs once)
   ├── Uses entrypoint-init.sh
   ├── Waits for DB and Redis
   ├── Runs: flask db upgrade
   └── Exits on completion

2. Web/Worker Deployments (wait for Job)
   ├── Init Container: wait-for-migrations
   │   ├── Watches Job status
   │   └── Blocks until Job completes
   └── Main Container: starts after init passes
```

## Components

### 1. Database Init Job

**File**: `job-db-init.yaml`

- Runs migrations using `entrypoint-init.sh`
- Must complete before any web or worker pods start
- Retries up to 3 times on failure
- Kept for 24h after completion (for debugging)

### 2. Init Containers (Web & Worker)

**Files**: `deployment-web.yaml`, `deployment-worker.yaml`

- Wait for the `piefed-db-init` Job to complete
- Time out after 10 minutes
- Show migration logs if the Job fails
- Block pod startup until migrations succeed

### 3. RBAC Permissions

**File**: `rbac-init-checker.yaml`

- ServiceAccount: `piefed-init-checker`
- Permissions to read Job status and logs
- Scoped to the `piefed-application` namespace only

## Deployment Flow

```mermaid
sequenceDiagram
    participant Flux
    participant RBAC as RBAC Resources
    participant Job as DB Init Job
    participant Init as Init Containers
    participant Pods as Web/Worker Pods

    Flux->>RBAC: 1. Create ServiceAccount + Role
    Flux->>Job: 2. Create Job
    Job->>Job: 3. Run migrations
    Flux->>Init: 4. Start Deployments
    Init->>Job: 5. Wait for Job complete
    Job-->>Init: 6. Job successful
    Init->>Pods: 7. Start main containers
```

## First-Time Setup

### 1. Build New Container Images

The base image now includes `entrypoint-init.sh`:

```bash
cd build/piefed
./build-all.sh
```

### 2. Apply Manifests

Flux picks up changes automatically. To apply them manually instead:

```bash
# Apply everything
kubectl apply -k manifests/applications/piefed/

# Watch the migration Job
kubectl logs -f -n piefed-application job/piefed-db-init

# Watch pods waiting for migrations
kubectl get pods -n piefed-application -w
```

## Upgrade Process (New Versions)

When upgrading PieFed to a new version with schema changes:

```bash
# 1. Build and push new images
cd build/piefed
./build-all.sh

# 2. Delete the old Job so it re-runs with the new image
#    (it may already be gone once the 24h retention has passed)
kubectl delete job piefed-db-init -n piefed-application --ignore-not-found

# 3. Apply manifests (the Job will be recreated)
kubectl apply -k manifests/applications/piefed/

# 4. Watch migration progress
kubectl logs -f -n piefed-application job/piefed-db-init

# 5. Verify the Job completed
kubectl wait --for=condition=complete --timeout=300s \
  job/piefed-db-init -n piefed-application

# 6. Restart deployments to pick up the new image
kubectl rollout restart deployment piefed-web -n piefed-application
kubectl rollout restart deployment piefed-worker -n piefed-application
```

## Troubleshooting

### Migration Job Failed

```bash
# Check Job status
kubectl get job piefed-db-init -n piefed-application

# View full logs
kubectl logs -n piefed-application job/piefed-db-init

# Check the current schema revision (also confirms database connectivity).
# This needs a web pod whose main container is already running.
kubectl exec -n piefed-application deployment/piefed-web -- \
  flask db current
```

### Pods Stuck in Init

```bash
# Check init container logs (use deployment/piefed-worker for worker pods)
kubectl logs -n piefed-application deployment/piefed-web -c wait-for-migrations

# Check if the Job is running
kubectl get job piefed-db-init -n piefed-application

# Manual Job completion check (prints "True" once the Job has completed)
kubectl get job piefed-db-init -n piefed-application \
  -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}'
```

### RBAC Permissions Issue

```bash
# Verify the ServiceAccount exists
kubectl get sa piefed-init-checker -n piefed-application

# Check the RoleBinding
kubectl get rolebinding piefed-init-checker -n piefed-application

# Test permissions by impersonating the ServiceAccount
kubectl auth can-i get jobs \
  --as=system:serviceaccount:piefed-application:piefed-init-checker \
  -n piefed-application
```

## Benefits

- ✅ **No Race Conditions**: A single Job runs migrations sequentially
- ✅ **Proper Ordering**: Init containers enforce dependencies
- ✅ **Clean Separation**: Web and worker pods focus on their primary roles
- ✅ **Easy Debugging**: Clear logs for each stage
- ✅ **GitOps Compatible**: Works with Flux CD
- ✅ **Idempotent**: Safe to re-run; the Job tracks its own completion state
- ✅ **Fast Scaling**: Web and worker pods start as soon as migrations have completed

## Migration from Old Setup

The old setup had `PIEFED_INIT_CONTAINER=true` on all pods, so every pod attempted the migrations at startup, causing race conditions.

**Changes Made**:

1. ✅ Removed the `PIEFED_INIT_CONTAINER` env var from all pods
2. ✅ Removed migration logic from `entrypoint-common.sh`
3. ✅ Created a dedicated `entrypoint-init.sh` for the Job
4. ✅ Added init containers that wait for the Job
5. ✅ Created RBAC for Job status checking

**Before deploying**, ensure you rebuild the images so they include the new entrypoint script!
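
## Example: Waiting for the Migration Job

The `wait-for-migrations` init container is described above only by its behaviour: poll the `piefed-db-init` Job, give up after 10 minutes, and print the migration logs if the Job fails. The following is a minimal sketch of how such a check could look, not the script actually used in `deployment-web.yaml` or `deployment-worker.yaml`. The function names, the environment variables, the 5-second polling interval, and the exit codes are all assumptions made for illustration, and it assumes the init container image provides `kubectl`.

```bash
#!/bin/sh
# Hypothetical sketch of a wait-for-migrations check. Names, defaults, and
# exit codes are illustrative assumptions, not the project's real script.

NAMESPACE="${NAMESPACE:-piefed-application}"
JOB_NAME="${JOB_NAME:-piefed-db-init}"
TIMEOUT_SECONDS="${TIMEOUT_SECONDS:-600}"  # 10 minutes, matching the documented timeout
POLL_SECONDS="${POLL_SECONDS:-5}"          # assumed polling interval

# Print the status ("True", "False", or empty) of one Job condition,
# for example "Complete" or "Failed".
job_condition() {
  kubectl get job "$JOB_NAME" -n "$NAMESPACE" \
    -o jsonpath="{.status.conditions[?(@.type==\"$1\")].status}" 2>/dev/null
}

# Block until the Job finishes.
# Returns 0 when it completed, 1 when it failed (after printing its logs),
# and 2 when the timeout expired first.
wait_for_migrations() {
  elapsed=0
  while [ "$elapsed" -lt "$TIMEOUT_SECONDS" ]; do
    if [ "$(job_condition Complete)" = "True" ]; then
      echo "Migrations complete"
      return 0
    fi
    if [ "$(job_condition Failed)" = "True" ]; then
      echo "Migration Job failed; logs follow:" >&2
      kubectl logs -n "$NAMESPACE" "job/$JOB_NAME" >&2
      return 1
    fi
    sleep "$POLL_SECONDS"
    elapsed=$((elapsed + POLL_SECONDS))
  done
  echo "Timed out after ${TIMEOUT_SECONDS}s waiting for $JOB_NAME" >&2
  return 2
}
```

An init container could end its command with `wait_for_migrations || exit 1`, so that any non-zero result keeps the main container from starting. Both `kubectl get job` and `kubectl logs` are called here, which is why the `piefed-init-checker` ServiceAccount needs read access to Job status and logs.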