redaction (#1)
Add the redacted source file for demo purposes Reviewed-on: https://source.michaeldileo.org/michael_dileo/Keybard-Vagabond-Demo/pulls/1 Co-authored-by: Michael DiLeo <michael_dileo@proton.me> Co-committed-by: Michael DiLeo <michael_dileo@proton.me>
This commit was merged in pull request #1.
This commit is contained in:
329
docs/CLOUDFLARE-TUNNEL-NGINX-MIGRATION.md
Normal file
329
docs/CLOUDFLARE-TUNNEL-NGINX-MIGRATION.md
Normal file
@@ -0,0 +1,329 @@
|
||||
# Cloudflare Tunnel to Nginx Ingress Migration
|
||||
|
||||
## Project Overview
|
||||
|
||||
**Goal**: Route Cloudflare Zero Trust tunnel traffic through nginx ingress controller to enable unified request metrics collection for all fediverse applications.
|
||||
|
||||
**Problem**: Currently only Harbor registry shows up in nginx ingress metrics because fediverse apps (PieFed, Mastodon, Pixelfed, BookWyrm) use Cloudflare tunnels that bypass nginx ingress entirely.
|
||||
|
||||
**Solution**: Reconfigure Cloudflare tunnels to route traffic through nginx ingress controller instead of directly to application services.
|
||||
|
||||
## Current vs Target Architecture
|
||||
|
||||
### Current Architecture
|
||||
```
|
||||
Internet → Cloudflare Tunnel → Direct to App Services → Fediverse Apps (NO METRICS)
|
||||
Internet → External IPs → nginx ingress → Harbor (HAS METRICS)
|
||||
```
|
||||
|
||||
### Target Architecture
|
||||
```
|
||||
Internet → Cloudflare Tunnel → nginx ingress → All Applications (UNIFIED METRICS)
|
||||
```
|
||||
|
||||
## Migration Strategy
|
||||
|
||||
**Approach**: Gradual rollout per application to minimize risk and allow monitoring at each stage.
|
||||
|
||||
**Order**: BookWyrm → Pixelfed → PieFed → Mastodon (lowest to highest traffic/criticality)
|
||||
|
||||
## Application Migration Checklist
|
||||
|
||||
### Phase 1: BookWyrm (STARTING) ⏳
|
||||
- [ ] **Pre-migration checks**
|
||||
- [ ] Verify BookWyrm ingress configuration
|
||||
- [ ] Baseline nginx ingress resource usage
|
||||
- [ ] Test nginx ingress accessibility from within cluster
|
||||
- [ ] Document current Cloudflare tunnel config for BookWyrm
|
||||
- [ ] **Migration execution**
|
||||
- [ ] Update Cloudflare tunnel: `bookwyrm.keyboardvagabond.com` → `http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80`
|
||||
- [ ] Test BookWyrm accessibility immediately after change
|
||||
- [ ] Verify nginx metrics show BookWyrm requests
|
||||
- [ ] **Post-migration monitoring (24-48 hours)**
|
||||
- [ ] Monitor nginx ingress pod CPU/memory usage
|
||||
- [ ] Check BookWyrm response times and error rates
|
||||
- [ ] Verify BookWyrm appears in nginx metrics with expected traffic
|
||||
- [ ] Confirm no nginx ingress errors in logs
|
||||
|
||||
### Phase 2: Pixelfed (PENDING) 📋
|
||||
- [ ] **Pre-migration checks**
|
||||
- [ ] Verify lessons learned from BookWyrm migration
|
||||
- [ ] Check nginx resource usage after BookWyrm
|
||||
- [ ] Baseline Pixelfed performance metrics
|
||||
- [ ] **Migration execution**
|
||||
- [ ] Update Cloudflare tunnel: `pixelfed.keyboardvagabond.com` → nginx ingress
|
||||
- [ ] Test and monitor as per BookWyrm process
|
||||
- [ ] **Post-migration monitoring**
|
||||
- [ ] Monitor combined BookWyrm + Pixelfed traffic impact
|
||||
|
||||
### Phase 3: PieFed (PENDING) 📋
|
||||
- [ ] **Pre-migration checks**
|
||||
- [ ] PieFed has heaviest ActivityPub federation traffic
|
||||
- [ ] Ensure nginx can handle federation bursts
|
||||
- [ ] Review PieFed rate limiting configuration
|
||||
- [ ] **Migration execution**
|
||||
- [ ] Update Cloudflare tunnel: `piefed.keyboardvagabond.com` → nginx ingress
|
||||
- [ ] Monitor federation traffic patterns closely
|
||||
- [ ] **Post-migration monitoring**
|
||||
- [ ] Watch for ActivityPub federation performance impact
|
||||
- [ ] Verify rate limiting still works effectively
|
||||
|
||||
### Phase 4: Mastodon (PENDING) 📋
|
||||
- [ ] **Pre-migration checks**
|
||||
- [ ] Most critical application - proceed with extra caution
|
||||
- [ ] Verify all previous migrations stable
|
||||
- [ ] Review Mastodon streaming service impact
|
||||
- [ ] **Migration execution**
|
||||
- [ ] Update Cloudflare tunnel: `mastodon.keyboardvagabond.com` → nginx ingress
|
||||
- [ ] Update streaming tunnel: `streamingmastodon.keyboardvagabond.com` → nginx ingress
|
||||
- [ ] **Post-migration monitoring**
|
||||
- [ ] Monitor Mastodon federation and streaming performance
|
||||
- [ ] Verify WebSocket connections work correctly
|
||||
|
||||
## Current Configuration
|
||||
|
||||
### Nginx Ingress Service
|
||||
```bash
|
||||
# Main ingress controller service (internal)
|
||||
kubectl get svc ingress-nginx-controller -n ingress-nginx
|
||||
# ClusterIP: 10.101.136.40, Port: 80
|
||||
|
||||
# Public service (external IPs for Harbor)
|
||||
kubectl get svc ingress-nginx-public -n ingress-nginx
|
||||
# LoadBalancer: 10.107.187.45, ExternalIPs: <NODE_1_EXTERNAL_IP>,<NODE_2_EXTERNAL_IP>
|
||||
```
|
||||
|
||||
### Current Cloudflare Tunnel Routes (TO BE CHANGED)
|
||||
```
|
||||
bookwyrm.keyboardvagabond.com → http://bookwyrm-web.bookwyrm-application.svc.cluster.local:80
|
||||
pixelfed.keyboardvagabond.com → http://pixelfed-web.pixelfed-application.svc.cluster.local:80
|
||||
piefed.keyboardvagabond.com → http://piefed-web.piefed-application.svc.cluster.local:80
|
||||
mastodon.keyboardvagabond.com → http://mastodon-web.mastodon-application.svc.cluster.local:3000
|
||||
streamingmastodon.keyboardvagabond.com → http://mastodon-streaming.mastodon-application.svc.cluster.local:4000
|
||||
```
|
||||
|
||||
### Target Cloudflare Tunnel Routes
|
||||
```
|
||||
bookwyrm.keyboardvagabond.com → http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80
|
||||
pixelfed.keyboardvagabond.com → http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80
|
||||
piefed.keyboardvagabond.com → http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80
|
||||
mastodon.keyboardvagabond.com → http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80
|
||||
streamingmastodon.keyboardvagabond.com → http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80
|
||||
```
|
||||
|
||||
## Monitoring Commands
|
||||
|
||||
### Pre-Migration Baseline
|
||||
```bash
|
||||
# Check nginx ingress resource usage
|
||||
kubectl top pods -n ingress-nginx
|
||||
|
||||
# Check current request metrics (should only show Harbor)
|
||||
# Your existing query:
|
||||
# (sum(rate(nginx_ingress_controller_requests{status=~"2.."}[5m])) by (host) / sum(rate(nginx_ingress_controller_requests[5m])) by (host)) * 100
|
||||
|
||||
# Monitor nginx ingress logs
|
||||
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx --tail=50
|
||||
```
|
||||
|
||||
### Post-Migration Verification
|
||||
```bash
|
||||
# Verify nginx metrics include new application
|
||||
# Run your metrics query - should now show BookWyrm traffic
|
||||
|
||||
# Check nginx ingress is handling traffic
|
||||
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx --tail=20 | grep bookwyrm
|
||||
|
||||
# Monitor resource impact
|
||||
kubectl top pods -n ingress-nginx
|
||||
```
|
||||
|
||||
## Rollback Procedures
|
||||
|
||||
### Quick Rollback (Per Application)
|
||||
1. **Immediate**: Revert Cloudflare tunnel configuration in Zero Trust dashboard
|
||||
2. **Verify**: Test application accessibility
|
||||
3. **Monitor**: Confirm traffic flows correctly
|
||||
|
||||
### Full Rollback (All Applications)
|
||||
1. Revert all Cloudflare tunnel configurations to direct service routing
|
||||
2. Verify all applications accessible
|
||||
3. Confirm metrics collection returns to Harbor-only state
|
||||
|
||||
## Risk Mitigation
|
||||
|
||||
### Resource Monitoring
|
||||
- **nginx Pod Resources**: Watch CPU/memory usage after each migration
|
||||
- **Response Times**: Monitor application response times for degradation
|
||||
- **Error Rates**: Check for increased 5xx errors in nginx logs
|
||||
|
||||
### Traffic Impact Assessment
|
||||
- **Federation Traffic**: Especially important for PieFed and Mastodon
|
||||
- **Rate Limiting**: Verify existing rate limits still function correctly
|
||||
- **WebSocket Connections**: Critical for Mastodon streaming
|
||||
|
||||
## Success Criteria
|
||||
|
||||
✅ **Migration Complete When**:
|
||||
- All fediverse applications route through nginx ingress
|
||||
- Unified metrics show traffic for all applications
|
||||
- No performance degradation observed
|
||||
- All rate limiting and security policies functional
|
||||
- nginx ingress resource usage within acceptable limits
|
||||
|
||||
## Notes & Lessons Learned
|
||||
|
||||
### Phase 1 (BookWyrm) - Status: PRE-MIGRATION COMPLETE ✅
|
||||
|
||||
**Pre-Migration Checks (2025-08-25)**:
|
||||
- ✅ **BookWyrm Ingress**: Correctly configured with host `bookwyrm.keyboardvagabond.com`, nginx class, proper CORS settings
|
||||
- ✅ **BookWyrm Service**: `bookwyrm-web.bookwyrm-application.svc.cluster.local:80` accessible (ClusterIP: 10.96.26.11)
|
||||
- ✅ **Nginx Baseline Resources**:
|
||||
- n1 (625nz): 9m CPU, 174Mi memory
|
||||
- n2 (br8rg): 4m CPU, 169Mi memory
|
||||
- n3 (rkddn): 14m CPU, 159Mi memory
|
||||
- ✅ **Nginx Accessibility Test**: Successfully accessed BookWyrm through nginx ingress with correct Host header
|
||||
- Response: HTTP 200, BookWyrm page served correctly
|
||||
- CORS headers applied properly
|
||||
- No nginx routing issues
|
||||
|
||||
**Current Cloudflare Tunnel Config**:
|
||||
```
|
||||
bookwyrm.keyboardvagabond.com → http://bookwyrm-web.bookwyrm-application.svc.cluster.local:80
|
||||
```
|
||||
|
||||
**Ready for Migration**: All pre-checks passed. Nginx ingress can successfully route BookWyrm traffic.
|
||||
|
||||
**Migration Executed (2025-08-25 16:06 UTC)**: ✅ SUCCESS
|
||||
- **Cloudflare Tunnel Updated**: `bookwyrm.keyboardvagabond.com` → `http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80`
|
||||
- **Immediate Verification**: BookWyrm web UI accessible, no downtime
|
||||
- **nginx Logs Confirmation**: BookWyrm traffic flowing through nginx ingress:
|
||||
```
|
||||
136.41.98.74 - "GET / HTTP/1.1" 200 [bookwyrm-application-bookwyrm-web-80]
|
||||
143.110.147.80 - "POST /inbox HTTP/1.1" 200 [bookwyrm-application-bookwyrm-web-80]
|
||||
```
|
||||
- **Resource Impact**: Minimal increase in nginx CPU (9-15m cores), memory stable (~170Mi)
|
||||
- **Next**: Monitor for 24-48 hours, verify metrics collection
|
||||
|
||||
**METRICS VERIFICATION**: ✅ SUCCESS!
|
||||
- **BookWyrm now appears in nginx metrics query**: `bookwyrm.keyboardvagabond.com` visible alongside `<YOUR_REGISTRY_URL>`
|
||||
- **Unified metrics collection achieved**: Both Harbor and BookWyrm traffic now measured through nginx ingress
|
||||
- **Phase 1 COMPLETE**: Ready to monitor for stability before Phase 2
|
||||
|
||||
### Phase 2 (Pixelfed) - Status: PRE-MIGRATION STARTING ⏳
|
||||
|
||||
**Lessons Learned from BookWyrm**:
|
||||
- Migration process works flawlessly
|
||||
- nginx ingress handles additional load without issues
|
||||
- Metrics integration successful
|
||||
- Zero downtime achieved
|
||||
|
||||
**Pre-Migration Checks (2025-08-25)**: ✅ COMPLETE
|
||||
- ✅ **Pixelfed Ingress**: Correctly configured with host `pixelfed.keyboardvagabond.com`, nginx class, 20MB upload limit, rate limiting
|
||||
- ✅ **Pixelfed Service**: `pixelfed-web.pixelfed-application.svc.cluster.local:80` accessible (ClusterIP: 10.97.130.244)
|
||||
- ✅ **nginx Post-BookWyrm Resources**: Stable performance after BookWyrm migration
|
||||
- n1 (625nz): 8m CPU, 173Mi memory
|
||||
- n2 (br8rg): 10m CPU, 169Mi memory
|
||||
- n3 (rkddn): 11m CPU, 159Mi memory
|
||||
- ✅ **nginx Accessibility Test**: Successfully accessed Pixelfed through nginx ingress with correct Host header
|
||||
- Response: HTTP 200, Pixelfed Laravel application served correctly
|
||||
- Proper session cookies and security headers
|
||||
- No nginx routing issues
|
||||
|
||||
**Current Cloudflare Tunnel Config**:
|
||||
```
|
||||
pixelfed.keyboardvagabond.com → http://pixelfed-web.pixelfed-application.svc.cluster.local:80
|
||||
```
|
||||
|
||||
**Ready for Migration**: All pre-checks passed. nginx ingress can successfully route Pixelfed traffic.
|
||||
|
||||
**Migration Executed (2025-08-25 16:19 UTC)**: ✅ SUCCESS
|
||||
- **Cloudflare Tunnel Updated**: `pixelfed.keyboardvagabond.com` → `http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80`
|
||||
- **Immediate Verification**: Pixelfed web UI accessible, no downtime
|
||||
- **nginx Logs Confirmation**: Pixelfed traffic flowing through nginx ingress:
|
||||
```
|
||||
136.41.98.74 - "HEAD / HTTP/1.1" 200 [pixelfed-application-pixelfed-web-80]
|
||||
136.41.98.74 - "GET / HTTP/1.1" 302 [pixelfed-application-pixelfed-web-80]
|
||||
136.41.98.74 - "GET /sw.js HTTP/1.1" 200 [pixelfed-application-pixelfed-web-80]
|
||||
```
|
||||
- **Resource Impact**: Stable nginx performance (3-10m CPU cores), memory unchanged
|
||||
- **Multi-App Success**: Both BookWyrm AND Pixelfed now routing through nginx ingress
|
||||
- **Metrics Fix**: Updated query to include 3xx redirects as success (`status=~"[23].."`)
|
||||
- **PHASE 2 COMPLETE**: Pixelfed metrics now showing correctly in unified dashboard
|
||||
|
||||
### Phase 3 (PieFed) - Status: PRE-MIGRATION STARTING ⏳
|
||||
|
||||
**Lessons Learned from BookWyrm + Pixelfed**:
|
||||
- Migration process consistently successful across different app types
|
||||
- nginx ingress handles additional load without issues
|
||||
- Metrics integration working with proper 2xx+3xx success criteria
|
||||
- Zero downtime achieved for both migrations
|
||||
- Traffic patterns clearly visible in nginx logs
|
||||
|
||||
**Pre-Migration Checks (2025-08-25)**: ✅ COMPLETE
|
||||
- ✅ **PieFed Ingress**: Correctly configured with host `piefed.keyboardvagabond.com`, nginx class, 20MB upload limit, rate limiting (100/min)
|
||||
- ✅ **PieFed Service**: `piefed-web.piefed-application.svc.cluster.local:80` accessible (ClusterIP: 10.104.62.239)
|
||||
- ✅ **nginx Post-2-Apps Resources**: Stable performance after BookWyrm + Pixelfed migrations
|
||||
- n1 (625nz): 10m CPU, 173Mi memory
|
||||
- n2 (br8rg): 16m CPU, 169Mi memory
|
||||
- n3 (rkddn): 3m CPU, 161Mi memory
|
||||
- ✅ **nginx Accessibility Test**: Successfully accessed PieFed through nginx ingress with correct Host header
|
||||
- Response: HTTP 200, PieFed application served correctly (343KB response)
|
||||
- Proper security headers and CSP policies
|
||||
- Flask session handling working correctly
|
||||
- ✅ **Federation Traffic Assessment**: **HEAVY** ActivityPub load confirmed
|
||||
- **58 federation requests** in last 30 Cloudflare tunnel logs
|
||||
- Constant ActivityPub `/inbox` POST requests from multiple Lemmy instances
|
||||
- Sources: lemmy.dbzer0.com, lemmy.world, and others
|
||||
- This will significantly increase nginx ingress load
|
||||
|
||||
**Current Cloudflare Tunnel Config**:
|
||||
```
|
||||
piefed.keyboardvagabond.com → http://piefed-web.piefed-application.svc.cluster.local:80
|
||||
```
|
||||
|
||||
**Ready for Migration**: All pre-checks passed. ⚠️ **CAUTION**: PieFed has the heaviest federation traffic - monitor nginx closely during/after migration.
|
||||
|
||||
**Migration Executed (2025-08-25 17:26 UTC)**: ✅ SUCCESS
|
||||
- **Cloudflare Tunnel Updated**: `piefed.keyboardvagabond.com` → `http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80`
|
||||
- **Immediate Verification**: PieFed web UI accessible, no downtime
|
||||
- **nginx Logs Confirmation**: **HEAVY** federation traffic flowing through nginx ingress:
|
||||
```
|
||||
135.181.143.221 - "POST /inbox HTTP/1.1" 200 [piefed-application-piefed-web-80]
|
||||
135.181.143.221 - "POST /inbox HTTP/1.1" 200 [piefed-application-piefed-web-80]
|
||||
Multiple ActivityPub federation requests per second from lemmy.world
|
||||
```
|
||||
- **Resource Impact**: nginx ingress handling heavy load excellently
|
||||
- CPU: 9-17m cores (slight increase, well within limits)
|
||||
- Memory: 160-174Mi (stable)
|
||||
- Response times: 0.045-0.066s (excellent performance)
|
||||
- **Load Balancing**: Traffic properly distributed across multiple PieFed pods
|
||||
- **Federation Success**: All ActivityPub requests returning HTTP 200
|
||||
- **PHASE 3 COMPLETE**: PieFed successfully migrated with heaviest traffic load
|
||||
|
||||
### Phase 4 (Mastodon) - Status: COMPLETE ✅
|
||||
|
||||
**Migration Executed (2025-08-25 17:36 UTC)**: ✅ SUCCESS
|
||||
- **Issue Encountered**: Complex nginx rate limiting configuration caused host header validation failures
|
||||
- **Root Cause**: `server-snippet` and `configuration-snippet` annotations interfered with proper request routing
|
||||
- **Solution**: Simplified ingress configuration by removing complex rate limiting annotations
|
||||
- **Fix Process**:
|
||||
1. Suspended Flux applications to prevent config reversion
|
||||
2. Deleted and recreated ingress resources to clear nginx cache
|
||||
3. Applied clean ingress configuration
|
||||
- **Cloudflare Tunnel Updated**: Both Mastodon routes to nginx ingress:
|
||||
- `mastodon.keyboardvagabond.com` → `http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80`
|
||||
- `streamingmastodon.keyboardvagabond.com` → `http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80`
|
||||
- **Immediate Verification**: Mastodon web UI accessible, HTTP 200 responses
|
||||
- **nginx Logs Confirmation**: Mastodon traffic flowing through nginx ingress:
|
||||
```
|
||||
136.41.98.74 - "HEAD / HTTP/1.1" 200 [mastodon-application-mastodon-web-3000]
|
||||
```
|
||||
- **Performance**: Fast response times (0.100s), all security headers working correctly
|
||||
- **🎉 MIGRATION COMPLETE**: All 4 fediverse applications successfully migrated to unified nginx ingress routing!
|
||||
|
||||
---
|
||||
|
||||
**Created**: 2025-08-25
|
||||
**Last Updated**: 2025-08-25
|
||||
**Status**: Phase 1 (BookWyrm) Starting
|
||||
Reference in New Issue
Block a user