Keybard-Vagabond-Demo/manifests/applications/piefed/MIGRATION-SETUP.md
Michael DiLeo 7327d77dcd redaction (#1)
Add the redacted source file for demo purposes

Reviewed-on: https://source.michaeldileo.org/michael_dileo/Keybard-Vagabond-Demo/pulls/1
Co-authored-by: Michael DiLeo <michael_dileo@proton.me>
Co-committed-by: Michael DiLeo <michael_dileo@proton.me>
2025-12-24 13:40:47 +00:00

PieFed Database Migration Setup

Overview

Database migrations are now handled by a dedicated Kubernetes Job that runs before the web and worker containers start. Because only one process runs flask db upgrade, pods can no longer race each other to apply the same migration.

Architecture

1. piefed-db-init Job (runs once)
   ├── Uses entrypoint-init.sh
   ├── Waits for DB and Redis
   ├── Runs: flask db upgrade
   └── Exits on completion

2. Web/Worker Deployments (wait for Job)
   ├── Init Container: wait-for-migrations
   │   ├── Watches Job status
   │   └── Blocks until Job completes
   └── Main Container: starts after init passes

Components

1. Database Init Job

File: job-db-init.yaml

  • Runs migrations using entrypoint-init.sh
  • Must complete before any web or worker main container starts
  • Retries up to 3 times on failure
  • Kept for 24h after completion (for debugging)
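
As a rough sketch of how these bullets could map onto Job fields, the shell function below prints a candidate manifest. backoffLimit and ttlSecondsAfterFinished are standard Job spec fields; the function name render_db_init_job, the image reference, the container name, and the script path are assumptions for illustration and are not taken from job-db-init.yaml.

```shell
# Hypothetical helper: prints a Job manifest that matches the bullets above.
# The image and command path are placeholders, not the repository's real values.
render_db_init_job() {
  image="${1:-registry.example.com/piefed-web:latest}"  # assumed image reference
  cat <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: piefed-db-init
  namespace: piefed-application
spec:
  backoffLimit: 3                  # retry up to 3 times on failure
  ttlSecondsAfterFinished: 86400   # keep the finished Job for 24h
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: db-init
          image: ${image}
          command: ["/entrypoint-init.sh"]   # assumed script location
EOF
}
```

To inspect the result without touching the cluster state, the output can be piped into kubectl apply --dry-run=client -f -.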

2. Init Containers (Web & Worker)

Files: deployment-web.yaml, deployment-worker.yaml

  • Wait for piefed-db-init Job to complete
  • Timeout after 10 minutes
  • Show migration logs if Job fails
  • Block pod startup until migrations succeed
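
The behaviour listed above can be approximated with a small polling loop. This is a sketch under assumptions, not the actual wait-for-migrations script: the function name wait_for_job, its arguments, the 5-second polling interval, and the number of log lines shown are all hypothetical. It relies only on kubectl get with a jsonpath filter and kubectl logs.

```shell
# Sketch of a wait loop; the function name and arguments are hypothetical.
# Returns 0 when the Job reports Complete, 1 on Failed or on timeout.
wait_for_job() {
  job="$1"; ns="$2"
  timeout="${3:-600}"    # 10 minutes, matching the timeout above
  interval="${4:-5}"     # assumed polling interval in seconds
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    complete="$(kubectl get job "$job" -n "$ns" \
      -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}' 2>/dev/null)"
    failed="$(kubectl get job "$job" -n "$ns" \
      -o jsonpath='{.status.conditions[?(@.type=="Failed")].status}' 2>/dev/null)"
    if [ "$complete" = "True" ]; then
      echo "migrations complete"
      return 0
    fi
    if [ "$failed" = "True" ]; then
      # Surface the migration logs so the failure is visible from the pod
      echo "migration Job failed; recent logs:" >&2
      kubectl logs -n "$ns" "job/$job" --tail=50 >&2 || true
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out after ${timeout}s waiting for $job" >&2
  return 1
}
```

An init container could call it as wait_for_job piefed-db-init piefed-application; a non-zero exit makes the init container fail, which keeps the main container from starting.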

3. RBAC Permissions

File: rbac-init-checker.yaml

  • ServiceAccount: piefed-init-checker
  • Permissions to read Job status and logs
  • Scoped to piefed-application namespace only
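
For orientation, roughly equivalent objects can be generated with kubectl's imperative create commands. The sketch below only prints YAML and applies nothing. The verb list and the inclusion of pods and pods/log are assumptions about what the init container needs, and the function name render_init_checker_rbac is hypothetical; rbac-init-checker.yaml remains the source of truth. kubectl may still contact the cluster to resolve resource names.

```shell
# Sketch only: prints (does not apply) RBAC objects comparable to those above.
# The verbs and resources are assumptions about what the init container needs.
render_init_checker_rbac() {
  ns="piefed-application"
  name="piefed-init-checker"
  kubectl create serviceaccount "$name" -n "$ns" --dry-run=client -o yaml
  echo "---"
  # A Role (not a ClusterRole) keeps the permissions inside one namespace
  kubectl create role "$name" -n "$ns" \
    --verb=get,list,watch \
    --resource=jobs.batch,pods,pods/log \
    --dry-run=client -o yaml
  echo "---"
  kubectl create rolebinding "$name" -n "$ns" \
    --role="$name" \
    --serviceaccount="$ns:$name" \
    --dry-run=client -o yaml
}
```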

Deployment Flow

sequenceDiagram
    participant Flux
    participant RBAC as RBAC Resources
    participant Job as DB Init Job
    participant Init as Init Containers
    participant Pods as Web/Worker Pods

    Flux->>RBAC: 1. Create ServiceAccount + Role
    Flux->>Job: 2. Create Job
    Job->>Job: 3. Run migrations
    Flux->>Init: 4. Start Deployments
    Init->>Job: 5. Wait for Job complete
    Job-->>Init: 6. Job successful
    Init->>Pods: 7. Start main containers

First-Time Setup

1. Build New Container Images

The base image now includes entrypoint-init.sh:

cd build/piefed
./build-all.sh

2. Apply Manifests

Flux will automatically pick up changes, or apply manually:

# Apply everything
kubectl apply -k manifests/applications/piefed/

# Watch the migration Job
kubectl logs -f -n piefed-application job/piefed-db-init

# Watch pods waiting for migrations
kubectl get pods -n piefed-application -w

Upgrade Process (New Versions)

When upgrading PieFed to a new version with schema changes:

# 1. Build and push new images
cd build/piefed
./build-all.sh

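# 2. Delete the old Job (a Job's pod template is immutable, so the Job must be
#    recreated to run with the new image; it may already be gone after the 24h TTL)
kubectl delete job piefed-db-init -n piefed-application --ignore-not-found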

# 3. Apply manifests (Job will recreate)
kubectl apply -k manifests/applications/piefed/

# 4. Watch migration progress
kubectl logs -f -n piefed-application job/piefed-db-init

# 5. Verify Job completed
kubectl wait --for=condition=complete --timeout=300s \
  job/piefed-db-init -n piefed-application

# 6. Restart deployments to pick up new image
kubectl rollout restart deployment piefed-web -n piefed-application
kubectl rollout restart deployment piefed-worker -n piefed-application

Troubleshooting

Migration Job Failed

# Check Job status
kubectl get job piefed-db-init -n piefed-application

# View full logs
kubectl logs -n piefed-application job/piefed-db-init

# Check the database connection and current revision (requires a running piefed-web pod)
kubectl exec -n piefed-application deployment/piefed-web -- \
  flask db current
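
When the Job has retried, each attempt may have run in its own pod, and kubectl logs job/piefed-db-init shows only one of them. The helper below is a sketch that prints the logs of every pod carrying the job-name label that the Job controller sets. The function name job_attempt_logs and the 100-line tail are hypothetical, and it assumes the Job uses restartPolicy: Never so that each retry creates a new pod.

```shell
# Sketch: print logs from every pod the Job created, one attempt at a time.
# Assumes restartPolicy: Never, so each retry runs in a separate pod.
job_attempt_logs() {
  job="${1:-piefed-db-init}"
  ns="${2:-piefed-application}"
  # "-o name" yields entries such as pod/piefed-db-init-abcde
  for pod in $(kubectl get pods -n "$ns" -l "job-name=$job" -o name); do
    echo "=== $pod ==="
    kubectl logs -n "$ns" "$pod" --tail=100
  done
}
```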

Pods Stuck in Init

# Check init container logs
kubectl logs -n piefed-application <pod-name> -c wait-for-migrations

# Check if Job is running
kubectl get job piefed-db-init -n piefed-application

# Manual Job completion check
kubectl get job piefed-db-init -n piefed-application \
  -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}'

RBAC Permissions Issue

# Verify ServiceAccount exists
kubectl get sa piefed-init-checker -n piefed-application

# Check RoleBinding
kubectl get rolebinding piefed-init-checker -n piefed-application

# Test permissions by impersonating the ServiceAccount
kubectl auth can-i get jobs \
  --as=system:serviceaccount:piefed-application:piefed-init-checker \
  -n piefed-application

Benefits

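  • No Race Conditions: A single Job runs the migrations, so they are applied once and in order
  • Proper Ordering: Init containers block the main containers until the Job succeeds
  • Clean Separation: Web and worker containers no longer carry migration logic
  • Easy Debugging: The Job and the init containers each have their own logs
  • GitOps Compatible: The Job, RBAC resources, and Deployments are plain manifests that Flux CD can reconcile
  • Idempotent: Re-applying the manifests does not re-run a Job that has already completed
  • Fast Scaling: Once the Job has completed, new web/worker pods pass the init check without running migrations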

Migration from Old Setup

The old setup had PIEFED_INIT_CONTAINER=true on all pods, so every pod ran the migration logic on startup, which caused race conditions.

Changes Made:

  1. Removed PIEFED_INIT_CONTAINER env var from all pods
  2. Removed migration logic from entrypoint-common.sh
  3. Created dedicated entrypoint-init.sh for Job
  4. Added init containers to wait for Job
  5. Created RBAC for Job status checking

Before deploying, ensure you rebuild images with the new entrypoint script!
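
To confirm that change 1 has actually reached the cluster, a check along these lines can be run against each Deployment. It is a minimal sketch: the function name has_legacy_init_env is hypothetical, and it inspects only env entries set directly on the containers, not values injected through envFrom or a ConfigMap.

```shell
# Hypothetical check: succeeds (exit 0) if the Deployment still sets
# PIEFED_INIT_CONTAINER directly in a container's env list.
has_legacy_init_env() {
  deploy="$1"
  ns="${2:-piefed-application}"
  # jsonpath prints the matching env names separated by spaces
  kubectl get deployment "$deploy" -n "$ns" \
    -o jsonpath='{.spec.template.spec.containers[*].env[*].name}' \
    | tr ' ' '\n' | grep -qx 'PIEFED_INIT_CONTAINER'
}
```

Running has_legacy_init_env piefed-web and has_legacy_init_env piefed-worker should both return a non-zero status once the new manifests are live.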