# Migrating from External DNS to CF Zero Trust

Now that the Cloudflare (CF) domain is set up, it's time to move the other apps and services over to it, and then potentially seal off as many of the Talos and k8s ports as I can.

## Zero-Downtime Migration Process

### Step 1: Discover Service Configuration
```bash
# Find service name and port
kubectl get svc -n <namespace>
# Example output: service-name ClusterIP 10.x.x.x <none> 9898/TCP
```

### Step 2: Create Tunnel Route (FIRST!)
1. Go to **Cloudflare Zero Trust Dashboard** → **Networks** → **Tunnels**
2. Find your tunnel and click **Configure**
3. Add a **Public Hostname**:
   - **Subdomain**: `app`
   - **Domain**: `keyboardvagabond.com`
   - **Service**: `http://service-name.namespace.svc.cluster.local:port`
4. **Test** that the tunnel URL works before proceeding!

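The **Service** value in step 3 follows the standard in-cluster DNS pattern, so it can be assembled from the `kubectl get svc` output in Step 1. A minimal sketch, assuming a plain HTTP backend; the helper name `tunnel_service_url` and the `podinfo` example values are hypothetical:

```bash
# Build the tunnel "Service" URL from a service name, namespace, and port.
# Hypothetical helper; assumes the backend speaks plain HTTP inside the cluster.
tunnel_service_url() {
  name=$1
  namespace=$2
  port=$3
  printf 'http://%s.%s.svc.cluster.local:%s\n' "$name" "$namespace" "$port"
}

# Example with illustrative values
tunnel_service_url podinfo podinfo 9898
```

Paste the printed URL into the **Service** field; swap the scheme if the backend is not HTTP.
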
### Step 3: Update Application Configuration
Clear the external-dns annotations and TLS configuration:
```yaml
# In Helm values or ingress manifest:
ingress:
  annotations: {}  # Explicitly empty - removes cert-manager and external-dns annotations
  tls: []          # Explicitly empty array - no certificates needed
```

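Before deploying, it can help to confirm that the rendered manifest no longer carries the old annotations. A minimal sketch, assuming the usual `external-dns.alpha.kubernetes.io/` and `cert-manager.io/` annotation prefixes; the helper name `has_dns_or_cert_annotations` is hypothetical:

```bash
# Succeed if a manifest read from stdin still carries external-dns or
# cert-manager annotations. Hypothetical helper, matching on key prefixes only.
has_dns_or_cert_annotations() {
  grep -E -q 'external-dns\.alpha\.kubernetes\.io/|cert-manager\.io/'
}

# Example: a manifest fragment that still has an external-dns annotation
printf '%s\n' 'metadata:' '  annotations:' \
  '    external-dns.alpha.kubernetes.io/hostname: app.keyboardvagabond.com' |
  has_dns_or_cert_annotations && echo "annotations still present"
```

Against a live cluster, the same check could read from `kubectl get ingress <name> -n <namespace> -o yaml`.
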
### Step 4: Deploy Changes
```bash
# For Helm apps via Flux:
flux reconcile helmrelease <app-name> -n <namespace>

# For direct manifests:
kubectl apply -f <manifest-file>
```

### Step 5: Clean Up Certificates
```bash
# Delete certificate resources
kubectl delete certificate <cert-name> -n <namespace>

# Find and delete TLS secrets
kubectl get secrets -n <namespace> | grep tls
kubectl delete secret <tls-secret-name> -n <namespace>
```

### Step 6: Verify Clean State
```bash
# Check no new certificates are being created
kubectl get certificate,secret -n <namespace> | grep <app-name>

# Should only show Helm release secrets, no certificate or TLS secrets
```

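The "only Helm release secrets" check can be made less eyeball-dependent by filtering on the secret type column. A minimal sketch, assuming the default table output of `kubectl get secrets` (name in column 1, type in column 2); the helper name `list_tls_secrets` and the sample rows are hypothetical:

```bash
# Print the names of TLS-type secrets from `kubectl get secrets` output on stdin.
# Assumes the default table layout: NAME in column 1, TYPE in column 2.
list_tls_secrets() {
  awk '$2 == "kubernetes.io/tls" { print $1 }'
}

# Example with canned output (names are illustrative)
printf '%s\n' \
  'NAME                            TYPE                 DATA   AGE' \
  'podinfo-tls                     kubernetes.io/tls    2      30d' \
  'sh.helm.release.v1.podinfo.v3   helm.sh/release.v1   1      5m' |
  list_tls_secrets
```

Piping `kubectl get secrets -n <namespace>` into the function should print nothing once cleanup is complete.
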
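### Step 7: DNS Record Management
**How it works:**
- **Tunnel automatically creates**: CNAME record → `tunnel-id.cfargotunnel.com`
- **External-DNS created**: A records → your cluster IPs
- **DNS conflict rule**: in standard DNS a CNAME cannot coexist with other records at the same name, so the two record types are never served side by side; if the tunnel hostname reuses an existing name, the old A records have to be removed before the CNAME can take over
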
**Cleanup options:**
```bash
# Option 1: Auto-cleanup (recommended) - wait about 5 minutes after removing annotations
# External-DNS deletes the A records on a later sync, provided it runs with
# --policy=sync (with upsert-only, records are never deleted)

# Option 2: Manual cleanup (immediate)
# Go to Cloudflare DNS dashboard and manually delete A records
# Keep the CNAME record (created by tunnel)
```

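**Verification:**
```bash
# Check that DNS no longer returns the cluster's A records
dig podinfo.keyboardvagabond.com

# Tunnel CNAMEs are proxied, so the answer normally shows Cloudflare edge IPs
# rather than the CNAME itself. The record in the Cloudflare DNS dashboard should be:
# podinfo.keyboardvagabond.com. CNAME tunnel-id.cfargotunnel.com.
```
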
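The DNS check boils down to "none of the answers is a cluster IP any more". A minimal sketch of that comparison, fed with canned answers; the helper name `stale_a_records` is hypothetical, and `203.0.113.10` / `203.0.113.11` are documentation addresses standing in for the real node IPs:

```bash
# Print every DNS answer (one per line on stdin) that matches one of the
# cluster IPs passed as arguments. No output means no stale A record was seen.
stale_a_records() {
  while IFS= read -r answer; do
    for ip in "$@"; do
      [ "$answer" = "$ip" ] && echo "$answer"
    done
  done
  return 0
}

# Example with canned answers; 203.0.113.10 stands in for a cluster IP
printf '%s\n' 104.21.0.1 203.0.113.10 | stale_a_records 203.0.113.10 203.0.113.11
```

In practice the input would come from `dig +short A podinfo.keyboardvagabond.com`, with the real node IPs as arguments.
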
## Rollback Plan
If the tunnel doesn't work:
1. **Revert** Helm values/manifests (add back annotations and TLS)
2. **Redeploy**: `flux reconcile` or `kubectl apply`
3. **Wait** for cert-manager to recreate certificates

## Benefits After Migration
- ✅ **No exposed public IPs** - cluster nodes not directly accessible
- ✅ **Automatic DDoS protection** via Cloudflare
- ✅ **Centralized SSL management** - Cloudflare handles certificates
- ✅ **Better observability** - Cloudflare analytics and logs

**It should work!** 🚀 (And now we have a plan if it doesn't!)

## Advanced: Securing Administrative Access

### Securing Kubernetes & Talos APIs

Once application migration is complete, you can secure administrative access:

#### Option 1: TCP Proxy (Simpler)
```yaml
# Cloudflare Zero Trust → Tunnels → Configure
Public Hostname:
  Subdomain: api
  Domain: keyboardvagabond.com
  Service: tcp://localhost:6443 # Kubernetes API

Public Hostname:
  Subdomain: talos
  Domain: keyboardvagabond.com
  Service: tcp://<NODE_1_IP>:50000 # Talos API
```

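**Client configuration:**
```bash
# Caveat: a tcp:// public hostname is normally reachable only through
# `cloudflared access tcp` running on the client, so verify that these
# direct endpoints work in your setup before relying on them.

# Update kubectl config
kubectl config set-cluster keyboardvagabond \
  --server=https://api.keyboardvagabond.com:443 # Note: 443, not 6443

# Update talosctl config
talosctl config endpoint talos.keyboardvagabond.com:443
```
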
#### Option 2: Private Network via WARP (Most Secure)

**Step 1: Configure Private Network**
```yaml
# Cloudflare Zero Trust → Tunnels → Configure → Private Networks
Private Network:
  CIDR: 10.132.0.0/24 # Your NetCup VLAN network
  Description: "Keyboard Vagabond Cluster Internal Network"
```

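**Step 2: Configure Split Tunnels**
```yaml
# Zero Trust → Settings → WARP Client → Device settings → Split Tunnels
Mode: Exclude (recommended)
Remove: 10.0.0.0/8 # Remove broad private range
Add back (everything in 10.0.0.0/8 except 10.132.0.0/24):
  - 10.0.0.0/9    # 10.0.0.0 - 10.127.255.255
  - 10.128.0.0/14 # 10.128.0.0 - 10.131.255.255
  - 10.133.0.0/16 # 10.133.0.0 - 10.133.255.255
  - 10.134.0.0/15 # 10.134.0.0 - 10.135.255.255
  - 10.136.0.0/13, 10.144.0.0/12, 10.160.0.0/11, 10.192.0.0/10
  - 10.132.1.0/24, 10.132.2.0/23, 10.132.4.0/22, 10.132.8.0/21
  - 10.132.16.0/20, 10.132.32.0/19, 10.132.64.0/18, 10.132.128.0/17
# Together these leave only 10.132.0.0/24 routed through WARP
```
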
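The add-back ranges can be derived mechanically instead of by hand: walking from the parent prefix down to the target prefix, each step contributes the sibling block that does not contain the target. For `10.132.0.0/24` inside `10.0.0.0/8` that yields 16 blocks. A minimal sketch in POSIX shell arithmetic, assuming the target really lies inside the parent range; the function names are hypothetical:

```bash
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Convert an integer back to a dotted-quad IPv4 address.
int_to_ip() {
  echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# cidr_complement PARENT_LEN TARGET_IP TARGET_LEN
# Print the CIDR blocks covering the parent range minus the target block.
cidr_complement() {
  target=$(ip_to_int "$2")
  len=$(( $1 + 1 ))
  while [ "$len" -le "$3" ]; do
    # Block of this prefix length that contains the target
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    own=$(( target & mask ))
    # Its sibling: same prefix with the last prefix bit flipped
    sibling=$(( own ^ (1 << (32 - len)) ))
    echo "$(int_to_ip "$sibling")/$len"
    len=$(( len + 1 ))
  done
}

# Everything in 10.0.0.0/8 except 10.132.0.0/24
cidr_complement 8 10.132.0.0 24
```

The first line printed is `10.0.0.0/9` and the last is `10.132.1.0/24`; each printed block becomes one Exclude entry.
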
**Step 3: Client Configuration**
```bash
# Install WARP client on admin machines
# macOS: brew install --cask cloudflare-warp
# Connect to Zero Trust organization
warp-cli registration new

# Configure kubectl to use internal IPs
kubectl config set-cluster keyboardvagabond \
  --server=https://<NODE_1_IP>:6443 # Direct to internal node IP

# Configure talosctl to use internal IPs
# (endpoints are space-separated; 50000 is the default Talos API port)
talosctl config endpoint <NODE_1_IP> <NODE_2_IP>
```

**Step 4: Access Policies (Recommended)**
```yaml
# Zero Trust → Access → Applications → Add application
Application Type: Private Network
Name: "Kubernetes Cluster Admin Access"
Application Domain: 10.132.0.0/24

Policies:
  - Name: "Admin Team Only"
    Action: Allow
    Rules:
      - Email domain: @yourdomain.com
      - Device Posture: Managed device required
```

**Step 5: Device Enrollment**
```bash
# On admin device
# 1. Install WARP: https://1.1.1.1/
# 2. Login with Zero Trust organization
# 3. Verify private network access:
ping <NODE_1_IP> # Should work through WARP

# 4. Test API access
kubectl get nodes # Should connect to internal cluster
talosctl version # Should connect to internal Talos API
```

**Step 6: Lock Down External Access**
Once WARP is working, update the Talos machine configs to block external access:
```yaml
# In machineconfigs/n1.yaml and n2.yaml
machine:
  network:
    extraHostEntries:
      # Placeholder only: extraHostEntries manages /etc/hosts and does not filter traffic.
      # Real rules belong in Talos ingress firewall documents
      # (NetworkDefaultActionConfig / NetworkRuleConfig, Talos 1.6+).
      - ip: 127.0.0.1
```

#### WARP Benefits:
- ✅ **No public DNS entries** - Admin endpoints not discoverable
- ✅ **Device control** - Only managed devices can access cluster
- ✅ **Zero-trust policies** - Granular access control per user/device
- ✅ **Audit logs** - Full visibility into who accessed what when
- ✅ **Device posture** - Require encryption, OS updates, etc.
- ✅ **Split tunneling** - Only cluster traffic goes through tunnel
- ✅ **Automatic failover** - Multiple WARP data centers

## Testing WARP Implementation

### Before WARP (Current State)
```bash
# Current kubectl configuration
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
# Output: https://api.keyboardvagabond.com:6443

# This goes through internet → external IPs
kubectl get nodes
```

### After WARP Setup
```bash
# 1. Test private network connectivity first
ping <NODE_1_IP> # Should work once WARP is connected

# 2. Back up the kubeconfig and create a backup context
# The context alone is not a backup: it shares the cluster entry that step 3 rewrites.
# Assumes the default kubeconfig path.
cp ~/.kube/config ~/.kube/config.external.bak
kubectl config set-context keyboardvagabond-external \
  --cluster=keyboardvagabond.com \
  --user=admin@keyboardvagabond.com

# 3. Update main context to use internal IP
kubectl config set-cluster keyboardvagabond.com \
  --server=https://<NODE_1_IP>:6443

# 4. Test internal access
kubectl get nodes # Should work through WARP → private network

# 5. Verify traffic path
# WARP status should show "Connected" in system tray
warp-cli status # Should show connected to your Zero Trust org
```

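A quick way to confirm which path `kubectl` is configured for is to classify the server URL from the kubeconfig. A minimal sketch, assuming the node IPs sit in the `10.132.0.0/24` private network from Step 1; the helper name `is_internal_endpoint` and the `10.132.0.10` address are hypothetical:

```bash
# Succeed when the API server URL points into the cluster's private /24.
# The 10.132.0.x pattern mirrors the private network configured earlier.
is_internal_endpoint() {
  case $1 in
    https://10.132.0.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with literal URLs (10.132.0.10 is an illustrative node IP)
is_internal_endpoint "https://10.132.0.10:6443" && echo "internal (via WARP)"
is_internal_endpoint "https://api.keyboardvagabond.com:6443" || echo "external"
```

The argument could equally be the output of `kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'`. The prefix match is deliberately loose and only meant as a sanity check.
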
### Rollback Plan
```bash
# If WARP doesn't work, quickly restore external access:
kubectl config set-cluster keyboardvagabond.com \
  --server=https://api.keyboardvagabond.com:6443

# Test external access still works
kubectl get nodes
```

## Next Steps After WARP

Once WARP is proven working:
1. **Configure Talos firewall** to block external access to ports 6443 and 50000
2. **Remove public API DNS entry** (api.keyboardvagabond.com)
3. **Document emergency access procedure** (temporary firewall rule + external DNS)
4. **Set up additional WARP devices** for other administrators

This gives you a **zero-trust administrative access model** where cluster APIs are completely invisible from the internet! 🔒