2025-12-24 14:35:17 +01:00
# Cilium Host Firewall Policy Audit Mode Testing
## Overview
This guide explains how to test Cilium host firewall policies in audit mode before applying them in enforcement mode. This prevents accidentally locking yourself out of the cluster.
## Prerequisites
- `kubectl` configured and working
- Access to the cluster (via Tailscale or direct connection)
- Cilium installed and running
## Quick Start
Run the automated test script:
```bash
./tools/test-cilium-policy-audit.sh
```
This script will:
1. Find the Cilium pod
2. Locate the host endpoint (identity 1)
3. Enable PolicyAuditMode
4. Start monitoring policy verdicts
5. Test basic connectivity
6. Show audit log entries
## Manual Testing Steps
### 1. Find Cilium Pod
```bash
kubectl -n kube-system get pods -l "k8s-app=cilium"
```
### 2. Find Host Endpoint
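The host endpoint has the reserved identity `1`. Each node runs its own Cilium agent with its own host endpoint, so repeat these steps against the Cilium pod on every node you want to test. Find the endpoint ID: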
```bash
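# Pick one Cilium agent pod (this selects the first one returned).
CILIUM_POD=$(kubectl -n kube-system get pods -l "k8s-app=cilium" -o jsonpath='{.items[0].metadata.name}')
# Print the ID of the endpoint whose security identity is 1 (the host).
kubectl exec -n kube-system "${CILIUM_POD}" -- \
  cilium endpoint list -o jsonpath='{[?(@.status.identity.id==1)].id}'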
```
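Because the host endpoint is per node, it can help to wrap the lookup in small functions. The following is a minimal sketch, not part of the repository's tooling: the function names are hypothetical, and on newer Cilium releases the in-pod CLI may be named `cilium-dbg`, in which case the command needs adjusting.
```bash
# Name of the Cilium agent pod scheduled on the node given as $1.
cilium_pod_on_node() {
  kubectl -n kube-system get pods -l k8s-app=cilium \
    --field-selector "spec.nodeName=$1" \
    -o jsonpath='{.items[0].metadata.name}'
}

# Host endpoint ID as reported by the Cilium pod given as $1.
host_endpoint_id() {
  kubectl exec -n kube-system "$1" -- \
    cilium endpoint list -o jsonpath='{[?(@.status.identity.id==1)].id}'
}

# Succeed only if $1 is a non-empty string of digits,
# which guards against an empty or multi-value lookup result.
is_endpoint_id() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage (requires cluster access; <NODE_NAME> is a placeholder):
#   POD=$(cilium_pod_on_node <NODE_NAME>)
#   EP=$(host_endpoint_id "$POD")
#   is_endpoint_id "$EP" || echo "unexpected endpoint ID: '$EP'" >&2
```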
### 3. Enable Audit Mode
```bash
kubectl exec -n kube-system ${CILIUM_POD} -- \
  cilium endpoint config <ENDPOINT_ID> PolicyAuditMode=Enabled
```
### 4. Verify Audit Mode
```bash
kubectl exec -n kube-system ${CILIUM_POD} -- \
  cilium endpoint config <ENDPOINT_ID> | grep PolicyAuditMode
```
The output should include a line like `PolicyAuditMode : Enabled`.
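
In a script, the same check can be turned into an exit status. This sketch assumes the option is printed as its name followed by whitespace, an optional colon, and `Enabled`; the exact spacing may differ between Cilium versions, so the pattern is deliberately loose. Both function names are hypothetical.
```bash
# Read `cilium endpoint config` output on stdin and succeed only if
# PolicyAuditMode is reported as Enabled.
audit_mode_enabled() {
  grep -Eq '^PolicyAuditMode[[:space:]:]+Enabled[[:space:]]*$'
}

# $1 = Cilium pod name, $2 = host endpoint ID (requires cluster access).
check_host_audit_mode() {
  kubectl exec -n kube-system "$1" -- \
    cilium endpoint config "$2" | audit_mode_enabled
}
```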
### 5. Start Monitoring
In a separate terminal, start monitoring policy verdicts:
```bash
kubectl exec -n kube-system ${CILIUM_POD} -- \
  cilium monitor -t policy-verdict --related-to <ENDPOINT_ID>
```
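On a busy node the verdict stream is mostly `action allow` lines. One possible way to surface only the interesting entries is to pipe the monitor output through a small filter. The function name and the `ENDPOINT_ID` shell variable below are hypothetical.
```bash
# Pass through only verdict lines whose action is audit or deny.
# fflush() keeps the output live when reading from a long-running stream.
would_be_denied() {
  awk '/action (audit|deny)/ { print; fflush() }'
}

# Usage (requires cluster access):
#   kubectl exec -n kube-system "$CILIUM_POD" -- \
#     cilium monitor -t policy-verdict --related-to "$ENDPOINT_ID" | would_be_denied
```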
### 6. Test Connectivity
While monitoring, test various connections:
**Kubernetes API:**
```bash
kubectl get nodes
kubectl get pods -A
```
**Talos API (if `talosctl` is available):**
```bash
talosctl -n <NODE_IP> time
talosctl -n <NODE_IP> version
```
**Cluster Internal:**
```bash
kubectl get services -A
```
### 7. Review Audit Log
Look for entries in the monitor output:
- `action allow` - Traffic allowed by policy
- `action audit` - Traffic would be denied but is being audited (not dropped)
- `action deny` - Traffic denied (only in enforcement mode)
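
If the monitor output was saved to a file, a quick tally of the three actions gives an overview before reading individual lines. This is a sketch that relies only on the `action <verdict>` text described above; the function name and the log file name are hypothetical.
```bash
# Count allow/audit/deny verdicts in monitor output read from stdin.
summarize_verdicts() {
  awk '
    match($0, /action (allow|audit|deny)/) {
      # The match is "action <verdict>"; skip the 7-character "action " prefix.
      counts[substr($0, RSTART + 7, RLENGTH - 7)]++
    }
    END {
      printf "allow=%d audit=%d deny=%d\n", counts["allow"], counts["audit"], counts["deny"]
    }
  '
}

# Usage:
#   summarize_verdicts < verdicts.log
```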
### 8. Disable Audit Mode (When Ready)
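Once you've verified that all necessary traffic is allowed, disable audit mode. From that point on the host policies are enforced, and traffic that previously showed `action audit` is dropped: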
```bash
kubectl exec -n kube-system ${CILIUM_POD} -- \
  cilium endpoint config <ENDPOINT_ID> PolicyAuditMode=Disabled
```
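Before running the command above, a guard like the following can refuse to proceed while would-be-denied traffic is still present in a captured monitor log. It is a sketch under the assumption that the monitor output was saved to a file; the function name and file name are hypothetical.
```bash
# $1 = file containing saved `cilium monitor -t policy-verdict` output.
# Returns 0 if verdicts were captured and none is "action audit",
# 1 if audit verdicts are present, 2 if the log is empty or missing.
safe_to_enforce() {
  log="$1"
  if [ ! -s "$log" ]; then
    echo "no verdicts captured in $log; generate traffic first" >&2
    return 2
  fi
  if grep -q 'action audit' "$log"; then
    echo "would-be-denied traffic found:" >&2
    grep 'action audit' "$log" >&2
    return 1
  fi
  return 0
}

# Usage:
#   safe_to_enforce verdicts.log && echo "OK to disable audit mode"
```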
## Expected Results
With the current policies, you should see `action allow` for:
1. **Kubernetes API (6443)** from:
   - Tailscale network (100.64.0.0/10)
   - VLAN subnet (10.132.0.0/24)
   - VIP (<VIP_IP>)
   - External IPs (152.53.x.x)
   - Cluster entities
2. **Talos API (50000, 50001)** from:
   - Tailscale network
   - VLAN subnet
   - VIP
   - External IPs
   - Cluster entities
3. **Cluster Internal Traffic** from:
   - Cluster entities
   - Remote nodes
   - Host
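
When reading verdict lines, it can be useful to tell quickly which of the ranges above a source address belongs to. The helper below is a hypothetical sketch covering only the two fully specified ranges, 100.64.0.0/10 (100.64.0.0 through 100.127.255.255) and 10.132.0.0/24; the VIP and external addresses are not handled because they are not spelled out here.
```bash
# Print "tailscale", "vlan", or "other" for the IPv4 address given as $1.
classify_source() {
  printf '%s\n' "$1" | awk -F. '
    # 100.64.0.0/10: first octet 100, second octet 64 through 127.
    $1 == 100 && $2 >= 64 && $2 <= 127 { print "tailscale"; next }
    # 10.132.0.0/24: first three octets fixed.
    $1 == 10 && $2 == 132 && $3 == 0 { print "vlan"; next }
    { print "other" }
  '
}
```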
## Troubleshooting
### No Policy Verdicts Appearing
- Ensure PolicyAuditMode is enabled
- Check that policies are actually applied: `kubectl get ciliumclusterwidenetworkpolicies`
- Generate more traffic to trigger policy evaluation
### Seeing `action audit` (Would Be Denied)
This means the traffic would be dropped in enforcement mode. Note the source, destination, and port from the verdict line, then add a rule that allows it if the traffic is legitimate.
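
As an illustration of what such a rule might look like, the sketch below prints a host policy that admits Kubernetes API traffic (6443) from the Tailscale range. The policy name and the node label are assumptions made for the example and are not taken from this repository's policies; adapt both before use.
```bash
# Print a hypothetical CiliumClusterwideNetworkPolicy manifest to stdout.
print_example_host_policy() {
  cat <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: example-allow-kube-api-from-tailscale
spec:
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/control-plane: ""
  ingress:
    - fromCIDR:
        - 100.64.0.0/10
      toPorts:
        - ports:
            - port: "6443"
              protocol: TCP
EOF
}

# Preview without applying (requires cluster access):
#   print_example_host_policy | kubectl apply --dry-run=client -f -
```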
### Locked Out After Disabling Audit Mode
If you lose access after disabling audit mode:
1. Use the Hetzner Robot firewall escape hatch (if configured)
2. Or access via Tailscale network (should still work)
3. Re-enable audit mode via direct node access if needed
## Policy Verification Checklist
Before disabling audit mode, verify:
- [ ] Kubernetes API accessible from Tailscale
- [ ] Kubernetes API accessible from VLAN
- [ ] Talos API accessible from Tailscale
- [ ] Talos API accessible from VLAN
- [ ] Cluster internal communication working
- [ ] Worker nodes can reach control plane
- [ ] No unexpected `action audit` entries for critical services
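
Parts of this checklist can be scripted. The sketch below is one possible shape, with hypothetical function names and a placeholder `NODE_IP` variable. It only tests reachability from the machine it runs on, so it would need to be run once from the Tailscale network and once from the VLAN.
```bash
# Run a command quietly and report PASS or FAIL with the given label.
run_check() {
  label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $label"
  else
    echo "FAIL: $label"
    return 1
  fi
}

# Run the connectivity checks from this guide; succeeds only if all pass.
verify_access() {
  failures=0
  run_check "Kubernetes API reachable" kubectl get nodes || failures=$((failures + 1))
  run_check "Cluster services listable" kubectl get services -A || failures=$((failures + 1))
  # NODE_IP is a placeholder for the address of a Talos node.
  run_check "Talos API reachable" talosctl -n "$NODE_IP" version || failures=$((failures + 1))
  [ "$failures" -eq 0 ]
}
```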
## References
- [Cilium Host Firewall Documentation](https://docs.cilium.io/en/stable/policy/language/#host-firewall)
- [Policy Audit Mode Guide](https://datavirke.dk/posts/bare-metal-kubernetes-part-2-cilium-and-firewalls/#policy-audit-mode)
- [Cilium Network Policies](https://docs.cilium.io/en/stable/policy/language/)