330 lines
15 KiB
Markdown
330 lines
15 KiB
Markdown
|
|
# Cloudflare Tunnel to Nginx Ingress Migration
|
||
|
|
|
||
|
|
## Project Overview
|
||
|
|
|
||
|
|
**Goal**: Route Cloudflare Zero Trust tunnel traffic through nginx ingress controller to enable unified request metrics collection for all fediverse applications.
|
||
|
|
|
||
|
|
**Problem**: Currently only Harbor registry shows up in nginx ingress metrics because fediverse apps (PieFed, Mastodon, Pixelfed, BookWyrm) use Cloudflare tunnels that bypass nginx ingress entirely.
|
||
|
|
|
||
|
|
**Solution**: Reconfigure Cloudflare tunnels to route traffic through nginx ingress controller instead of directly to application services.
|
||
|
|
|
||
|
|
## Current vs Target Architecture
|
||
|
|
|
||
|
|
### Current Architecture
|
||
|
|
```
|
||
|
|
Internet → Cloudflare Tunnel → Direct to App Services → Fediverse Apps (NO METRICS)
|
||
|
|
Internet → External IPs → nginx ingress → Harbor (HAS METRICS)
|
||
|
|
```
|
||
|
|
|
||
|
|
### Target Architecture
|
||
|
|
```
|
||
|
|
Internet → Cloudflare Tunnel → nginx ingress → All Applications (UNIFIED METRICS)
|
||
|
|
```
|
||
|
|
|
||
|
|
## Migration Strategy
|
||
|
|
|
||
|
|
**Approach**: Gradual rollout per application to minimize risk and allow monitoring at each stage.
|
||
|
|
|
||
|
|
**Order**: BookWyrm → Pixelfed → PieFed → Mastodon (lowest to highest traffic/criticality)
|
||
|
|
|
||
|
|
## Application Migration Checklist
|
||
|
|
|
||
|
|
### Phase 1: BookWyrm (STARTING) ⏳
|
||
|
|
- [ ] **Pre-migration checks**
|
||
|
|
- [ ] Verify BookWyrm ingress configuration
|
||
|
|
- [ ] Baseline nginx ingress resource usage
|
||
|
|
- [ ] Test nginx ingress accessibility from within cluster
|
||
|
|
- [ ] Document current Cloudflare tunnel config for BookWyrm
|
||
|
|
- [ ] **Migration execution**
|
||
|
|
- [ ] Update Cloudflare tunnel: `bookwyrm.keyboardvagabond.com` → `http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80`
|
||
|
|
- [ ] Test BookWyrm accessibility immediately after change
|
||
|
|
- [ ] Verify nginx metrics show BookWyrm requests
|
||
|
|
- [ ] **Post-migration monitoring (24-48 hours)**
|
||
|
|
- [ ] Monitor nginx ingress pod CPU/memory usage
|
||
|
|
- [ ] Check BookWyrm response times and error rates
|
||
|
|
- [ ] Verify BookWyrm appears in nginx metrics with expected traffic
|
||
|
|
- [ ] Confirm no nginx ingress errors in logs
|
||
|
|
|
||
|
|
### Phase 2: Pixelfed (PENDING) 📋
|
||
|
|
- [ ] **Pre-migration checks**
|
||
|
|
- [ ] Verify lessons learned from BookWyrm migration
|
||
|
|
- [ ] Check nginx resource usage after BookWyrm
|
||
|
|
- [ ] Baseline Pixelfed performance metrics
|
||
|
|
- [ ] **Migration execution**
|
||
|
|
- [ ] Update Cloudflare tunnel: `pixelfed.keyboardvagabond.com` → nginx ingress
|
||
|
|
- [ ] Test and monitor as per BookWyrm process
|
||
|
|
- [ ] **Post-migration monitoring**
|
||
|
|
- [ ] Monitor combined BookWyrm + Pixelfed traffic impact
|
||
|
|
|
||
|
|
### Phase 3: PieFed (PENDING) 📋
|
||
|
|
- [ ] **Pre-migration checks**
|
||
|
|
- [ ] PieFed has heaviest ActivityPub federation traffic
|
||
|
|
- [ ] Ensure nginx can handle federation bursts
|
||
|
|
- [ ] Review PieFed rate limiting configuration
|
||
|
|
- [ ] **Migration execution**
|
||
|
|
- [ ] Update Cloudflare tunnel: `piefed.keyboardvagabond.com` → nginx ingress
|
||
|
|
- [ ] Monitor federation traffic patterns closely
|
||
|
|
- [ ] **Post-migration monitoring**
|
||
|
|
- [ ] Watch for ActivityPub federation performance impact
|
||
|
|
- [ ] Verify rate limiting still works effectively
|
||
|
|
|
||
|
|
### Phase 4: Mastodon (PENDING) 📋
|
||
|
|
- [ ] **Pre-migration checks**
|
||
|
|
- [ ] Most critical application - proceed with extra caution
|
||
|
|
- [ ] Verify all previous migrations stable
|
||
|
|
- [ ] Review Mastodon streaming service impact
|
||
|
|
- [ ] **Migration execution**
|
||
|
|
- [ ] Update Cloudflare tunnel: `mastodon.keyboardvagabond.com` → nginx ingress
|
||
|
|
- [ ] Update streaming tunnel: `streamingmastodon.keyboardvagabond.com` → nginx ingress
|
||
|
|
- [ ] **Post-migration monitoring**
|
||
|
|
- [ ] Monitor Mastodon federation and streaming performance
|
||
|
|
- [ ] Verify WebSocket connections work correctly
|
||
|
|
|
||
|
|
## Current Configuration
|
||
|
|
|
||
|
|
### Nginx Ingress Service
|
||
|
|
```bash
|
||
|
|
# Main ingress controller service (internal)
|
||
|
|
kubectl get svc ingress-nginx-controller -n ingress-nginx
|
||
|
|
# ClusterIP: 10.101.136.40, Port: 80
|
||
|
|
|
||
|
|
# Public service (external IPs for Harbor)
|
||
|
|
kubectl get svc ingress-nginx-public -n ingress-nginx
|
||
|
|
# LoadBalancer: 10.107.187.45, ExternalIPs: <NODE_1_EXTERNAL_IP>,<NODE_2_EXTERNAL_IP>
|
||
|
|
```
|
||
|
|
|
||
|
|
### Current Cloudflare Tunnel Routes (TO BE CHANGED)
|
||
|
|
```
|
||
|
|
bookwyrm.keyboardvagabond.com → http://bookwyrm-web.bookwyrm-application.svc.cluster.local:80
|
||
|
|
pixelfed.keyboardvagabond.com → http://pixelfed-web.pixelfed-application.svc.cluster.local:80
|
||
|
|
piefed.keyboardvagabond.com → http://piefed-web.piefed-application.svc.cluster.local:80
|
||
|
|
mastodon.keyboardvagabond.com → http://mastodon-web.mastodon-application.svc.cluster.local:3000
|
||
|
|
streamingmastodon.keyboardvagabond.com → http://mastodon-streaming.mastodon-application.svc.cluster.local:4000
|
||
|
|
```
|
||
|
|
|
||
|
|
### Target Cloudflare Tunnel Routes
|
||
|
|
```
|
||
|
|
bookwyrm.keyboardvagabond.com → http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80
|
||
|
|
pixelfed.keyboardvagabond.com → http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80
|
||
|
|
piefed.keyboardvagabond.com → http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80
|
||
|
|
mastodon.keyboardvagabond.com → http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80
|
||
|
|
streamingmastodon.keyboardvagabond.com → http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80
|
||
|
|
```
|
||
|
|
|
||
|
|
## Monitoring Commands
|
||
|
|
|
||
|
|
### Pre-Migration Baseline
|
||
|
|
```bash
|
||
|
|
# Check nginx ingress resource usage
|
||
|
|
kubectl top pods -n ingress-nginx
|
||
|
|
|
||
|
|
# Check current request metrics (should only show Harbor)
|
||
|
|
# Your existing query:
|
||
|
|
# (sum(rate(nginx_ingress_controller_requests{status=~"2.."}[5m])) by (host) / sum(rate(nginx_ingress_controller_requests[5m])) by (host)) * 100
|
||
|
|
|
||
|
|
# Monitor nginx ingress logs
|
||
|
|
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx --tail=50
|
||
|
|
```
|
||
|
|
|
||
|
|
### Post-Migration Verification
|
||
|
|
```bash
|
||
|
|
# Verify nginx metrics include new application
|
||
|
|
# Run your metrics query - should now show BookWyrm traffic
|
||
|
|
|
||
|
|
# Check nginx ingress is handling traffic
|
||
|
|
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx --tail=20 | grep bookwyrm
|
||
|
|
|
||
|
|
# Monitor resource impact
|
||
|
|
kubectl top pods -n ingress-nginx
|
||
|
|
```
|
||
|
|
|
||
|
|
## Rollback Procedures
|
||
|
|
|
||
|
|
### Quick Rollback (Per Application)
|
||
|
|
1. **Immediate**: Revert Cloudflare tunnel configuration in Zero Trust dashboard
|
||
|
|
2. **Verify**: Test application accessibility
|
||
|
|
3. **Monitor**: Confirm traffic flows correctly
|
||
|
|
|
||
|
|
### Full Rollback (All Applications)
|
||
|
|
1. Revert all Cloudflare tunnel configurations to direct service routing
|
||
|
|
2. Verify all applications accessible
|
||
|
|
3. Confirm metrics collection returns to Harbor-only state
|
||
|
|
|
||
|
|
## Risk Mitigation
|
||
|
|
|
||
|
|
### Resource Monitoring
|
||
|
|
- **nginx Pod Resources**: Watch CPU/memory usage after each migration
|
||
|
|
- **Response Times**: Monitor application response times for degradation
|
||
|
|
- **Error Rates**: Check for increased 5xx errors in nginx logs
|
||
|
|
|
||
|
|
### Traffic Impact Assessment
|
||
|
|
- **Federation Traffic**: Especially important for PieFed and Mastodon
|
||
|
|
- **Rate Limiting**: Verify existing rate limits still function correctly
|
||
|
|
- **WebSocket Connections**: Critical for Mastodon streaming
|
||
|
|
|
||
|
|
## Success Criteria
|
||
|
|
|
||
|
|
✅ **Migration Complete When**:
|
||
|
|
- All fediverse applications route through nginx ingress
|
||
|
|
- Unified metrics show traffic for all applications
|
||
|
|
- No performance degradation observed
|
||
|
|
- All rate limiting and security policies functional
|
||
|
|
- nginx ingress resource usage within acceptable limits
|
||
|
|
|
||
|
|
## Notes & Lessons Learned
|
||
|
|
|
||
|
|
### Phase 1 (BookWyrm) - Status: PRE-MIGRATION COMPLETE ✅
|
||
|
|
|
||
|
|
**Pre-Migration Checks (2025-08-25)**:
|
||
|
|
- ✅ **BookWyrm Ingress**: Correctly configured with host `bookwyrm.keyboardvagabond.com`, nginx class, proper CORS settings
|
||
|
|
- ✅ **BookWyrm Service**: `bookwyrm-web.bookwyrm-application.svc.cluster.local:80` accessible (ClusterIP: 10.96.26.11)
|
||
|
|
- ✅ **Nginx Baseline Resources**:
|
||
|
|
- n1 (625nz): 9m CPU, 174Mi memory
|
||
|
|
- n2 (br8rg): 4m CPU, 169Mi memory
|
||
|
|
- n3 (rkddn): 14m CPU, 159Mi memory
|
||
|
|
- ✅ **Nginx Accessibility Test**: Successfully accessed BookWyrm through nginx ingress with correct Host header
|
||
|
|
- Response: HTTP 200, BookWyrm page served correctly
|
||
|
|
- CORS headers applied properly
|
||
|
|
- No nginx routing issues
|
||
|
|
|
||
|
|
**Current Cloudflare Tunnel Config**:
|
||
|
|
```
|
||
|
|
bookwyrm.keyboardvagabond.com → http://bookwyrm-web.bookwyrm-application.svc.cluster.local:80
|
||
|
|
```
|
||
|
|
|
||
|
|
**Ready for Migration**: All pre-checks passed. Nginx ingress can successfully route BookWyrm traffic.
|
||
|
|
|
||
|
|
**Migration Executed (2025-08-25 16:06 UTC)**: ✅ SUCCESS
|
||
|
|
- **Cloudflare Tunnel Updated**: `bookwyrm.keyboardvagabond.com` → `http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80`
|
||
|
|
- **Immediate Verification**: BookWyrm web UI accessible, no downtime
|
||
|
|
- **nginx Logs Confirmation**: BookWyrm traffic flowing through nginx ingress:
|
||
|
|
```
|
||
|
|
136.41.98.74 - "GET / HTTP/1.1" 200 [bookwyrm-application-bookwyrm-web-80]
|
||
|
|
143.110.147.80 - "POST /inbox HTTP/1.1" 200 [bookwyrm-application-bookwyrm-web-80]
|
||
|
|
```
|
||
|
|
- **Resource Impact**: Minimal increase in nginx CPU (9-15m cores), memory stable (~170Mi)
|
||
|
|
- **Next**: Monitor for 24-48 hours, verify metrics collection
|
||
|
|
|
||
|
|
**METRICS VERIFICATION**: ✅ SUCCESS!
|
||
|
|
- **BookWyrm now appears in nginx metrics query**: `bookwyrm.keyboardvagabond.com` visible alongside `<YOUR_REGISTRY_URL>`
|
||
|
|
- **Unified metrics collection achieved**: Both Harbor and BookWyrm traffic now measured through nginx ingress
|
||
|
|
- **Phase 1 COMPLETE**: Ready to monitor for stability before Phase 2
|
||
|
|
|
||
|
|
### Phase 2 (Pixelfed) - Status: PRE-MIGRATION STARTING ⏳
|
||
|
|
|
||
|
|
**Lessons Learned from BookWyrm**:
|
||
|
|
- Migration process works flawlessly
|
||
|
|
- nginx ingress handles additional load without issues
|
||
|
|
- Metrics integration successful
|
||
|
|
- Zero downtime achieved
|
||
|
|
|
||
|
|
**Pre-Migration Checks (2025-08-25)**: ✅ COMPLETE
|
||
|
|
- ✅ **Pixelfed Ingress**: Correctly configured with host `pixelfed.keyboardvagabond.com`, nginx class, 20MB upload limit, rate limiting
|
||
|
|
- ✅ **Pixelfed Service**: `pixelfed-web.pixelfed-application.svc.cluster.local:80` accessible (ClusterIP: 10.97.130.244)
|
||
|
|
- ✅ **nginx Post-BookWyrm Resources**: Stable performance after BookWyrm migration
|
||
|
|
- n1 (625nz): 8m CPU, 173Mi memory
|
||
|
|
- n2 (br8rg): 10m CPU, 169Mi memory
|
||
|
|
- n3 (rkddn): 11m CPU, 159Mi memory
|
||
|
|
- ✅ **nginx Accessibility Test**: Successfully accessed Pixelfed through nginx ingress with correct Host header
|
||
|
|
- Response: HTTP 200, Pixelfed Laravel application served correctly
|
||
|
|
- Proper session cookies and security headers
|
||
|
|
- No nginx routing issues
|
||
|
|
|
||
|
|
**Current Cloudflare Tunnel Config**:
|
||
|
|
```
|
||
|
|
pixelfed.keyboardvagabond.com → http://pixelfed-web.pixelfed-application.svc.cluster.local:80
|
||
|
|
```
|
||
|
|
|
||
|
|
**Ready for Migration**: All pre-checks passed. nginx ingress can successfully route Pixelfed traffic.
|
||
|
|
|
||
|
|
**Migration Executed (2025-08-25 16:19 UTC)**: ✅ SUCCESS
|
||
|
|
- **Cloudflare Tunnel Updated**: `pixelfed.keyboardvagabond.com` → `http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80`
|
||
|
|
- **Immediate Verification**: Pixelfed web UI accessible, no downtime
|
||
|
|
- **nginx Logs Confirmation**: Pixelfed traffic flowing through nginx ingress:
|
||
|
|
```
|
||
|
|
136.41.98.74 - "HEAD / HTTP/1.1" 200 [pixelfed-application-pixelfed-web-80]
|
||
|
|
136.41.98.74 - "GET / HTTP/1.1" 302 [pixelfed-application-pixelfed-web-80]
|
||
|
|
136.41.98.74 - "GET /sw.js HTTP/1.1" 200 [pixelfed-application-pixelfed-web-80]
|
||
|
|
```
|
||
|
|
- **Resource Impact**: Stable nginx performance (3-10m CPU cores), memory unchanged
|
||
|
|
- **Multi-App Success**: Both BookWyrm AND Pixelfed now routing through nginx ingress
|
||
|
|
- **Metrics Fix**: Updated query to include 3xx redirects as success (`status=~"[23].."`)
|
||
|
|
- **PHASE 2 COMPLETE**: Pixelfed metrics now showing correctly in unified dashboard
|
||
|
|
|
||
|
|
### Phase 3 (PieFed) - Status: PRE-MIGRATION STARTING ⏳
|
||
|
|
|
||
|
|
**Lessons Learned from BookWyrm + Pixelfed**:
|
||
|
|
- Migration process consistently successful across different app types
|
||
|
|
- nginx ingress handles additional load without issues
|
||
|
|
- Metrics integration working with proper 2xx+3xx success criteria
|
||
|
|
- Zero downtime achieved for both migrations
|
||
|
|
- Traffic patterns clearly visible in nginx logs
|
||
|
|
|
||
|
|
**Pre-Migration Checks (2025-08-25)**: ✅ COMPLETE
|
||
|
|
- ✅ **PieFed Ingress**: Correctly configured with host `piefed.keyboardvagabond.com`, nginx class, 20MB upload limit, rate limiting (100/min)
|
||
|
|
- ✅ **PieFed Service**: `piefed-web.piefed-application.svc.cluster.local:80` accessible (ClusterIP: 10.104.62.239)
|
||
|
|
- ✅ **nginx Post-2-Apps Resources**: Stable performance after BookWyrm + Pixelfed migrations
|
||
|
|
- n1 (625nz): 10m CPU, 173Mi memory
|
||
|
|
- n2 (br8rg): 16m CPU, 169Mi memory
|
||
|
|
- n3 (rkddn): 3m CPU, 161Mi memory
|
||
|
|
- ✅ **nginx Accessibility Test**: Successfully accessed PieFed through nginx ingress with correct Host header
|
||
|
|
- Response: HTTP 200, PieFed application served correctly (343KB response)
|
||
|
|
- Proper security headers and CSP policies
|
||
|
|
- Flask session handling working correctly
|
||
|
|
- ✅ **Federation Traffic Assessment**: **HEAVY** ActivityPub load confirmed
|
||
|
|
- **58 federation requests** in last 30 Cloudflare tunnel logs
|
||
|
|
- Constant ActivityPub `/inbox` POST requests from multiple Lemmy instances
|
||
|
|
- Sources: lemmy.dbzer0.com, lemmy.world, and others
|
||
|
|
- This will significantly increase nginx ingress load
|
||
|
|
|
||
|
|
**Current Cloudflare Tunnel Config**:
|
||
|
|
```
|
||
|
|
piefed.keyboardvagabond.com → http://piefed-web.piefed-application.svc.cluster.local:80
|
||
|
|
```
|
||
|
|
|
||
|
|
**Ready for Migration**: All pre-checks passed. ⚠️ **CAUTION**: PieFed has the heaviest federation traffic - monitor nginx closely during/after migration.
|
||
|
|
|
||
|
|
**Migration Executed (2025-08-25 17:26 UTC)**: ✅ SUCCESS
|
||
|
|
- **Cloudflare Tunnel Updated**: `piefed.keyboardvagabond.com` → `http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80`
|
||
|
|
- **Immediate Verification**: PieFed web UI accessible, no downtime
|
||
|
|
- **nginx Logs Confirmation**: **HEAVY** federation traffic flowing through nginx ingress:
|
||
|
|
```
|
||
|
|
135.181.143.221 - "POST /inbox HTTP/1.1" 200 [piefed-application-piefed-web-80]
|
||
|
|
135.181.143.221 - "POST /inbox HTTP/1.1" 200 [piefed-application-piefed-web-80]
|
||
|
|
Multiple ActivityPub federation requests per second from lemmy.world
|
||
|
|
```
|
||
|
|
- **Resource Impact**: nginx ingress handling heavy load excellently
|
||
|
|
- CPU: 9-17m cores (slight increase, well within limits)
|
||
|
|
- Memory: 160-174Mi (stable)
|
||
|
|
- Response times: 0.045-0.066s (excellent performance)
|
||
|
|
- **Load Balancing**: Traffic properly distributed across multiple PieFed pods
|
||
|
|
- **Federation Success**: All ActivityPub requests returning HTTP 200
|
||
|
|
- **PHASE 3 COMPLETE**: PieFed successfully migrated with heaviest traffic load
|
||
|
|
|
||
|
|
### Phase 4 (Mastodon) - Status: COMPLETE ✅
|
||
|
|
|
||
|
|
**Migration Executed (2025-08-25 17:36 UTC)**: ✅ SUCCESS
|
||
|
|
- **Issue Encountered**: Complex nginx rate limiting configuration caused host header validation failures
|
||
|
|
- **Root Cause**: `server-snippet` and `configuration-snippet` annotations interfered with proper request routing
|
||
|
|
- **Solution**: Simplified ingress configuration by removing complex rate limiting annotations
|
||
|
|
- **Fix Process**:
|
||
|
|
1. Suspended Flux applications to prevent config reversion
|
||
|
|
2. Deleted and recreated ingress resources to clear nginx cache
|
||
|
|
3. Applied clean ingress configuration
|
||
|
|
- **Cloudflare Tunnel Updated**: Both Mastodon routes to nginx ingress:
|
||
|
|
- `mastodon.keyboardvagabond.com` → `http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80`
|
||
|
|
- `streamingmastodon.keyboardvagabond.com` → `http://ingress-nginx-controller.ingress-nginx.svc.cluster.local:80`
|
||
|
|
- **Immediate Verification**: Mastodon web UI accessible, HTTP 200 responses
|
||
|
|
- **nginx Logs Confirmation**: Mastodon traffic flowing through nginx ingress:
|
||
|
|
```
|
||
|
|
136.41.98.74 - "HEAD / HTTP/1.1" 200 [mastodon-application-mastodon-web-3000]
|
||
|
|
```
|
||
|
|
- **Performance**: Fast response times (0.100s), all security headers working correctly
|
||
|
|
- **🎉 MIGRATION COMPLETE**: All 4 fediverse applications successfully migrated to unified nginx ingress routing!
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
**Created**: 2025-08-25
|
||
|
|
**Last Updated**: 2025-08-25
|
||
|
|
**Status**: Phase 1 (BookWyrm) Starting
|