# Adding a New Node for Nginx Ingress Metrics Collection

This guide documents the steps required to add a new node to the cluster and ensure nginx ingress controller metrics are properly collected from it.

## Overview

The nginx ingress controller is deployed as a **DaemonSet** (kind: DaemonSet), which means it automatically deploys one pod per node. However, for metrics collection to work properly, additional configuration steps are required.

## Current Configuration

Currently, the cluster has 3 nodes with metrics collection configured for:

- **n1 ()**: Control plane + worker
- **n2 ()**: Worker
- **n3 ()**: Worker

## Steps to Add a New Node

### 1. Add the Node to Kubernetes Cluster

Follow your standard node addition process (this is outside the scope of this guide). Ensure the new node:

- Is properly joined to the cluster
- Has the nginx ingress controller pod deployed (should happen automatically due to the DaemonSet)
- Is accessible on the cluster network

### 2. Verify Nginx Ingress Controller Deployment

Check that the nginx ingress controller pod is running on the new node:

```bash
kubectl get pods -n ingress-nginx -o wide
```

Look for a pod on your new node. The nginx ingress controller should automatically deploy due to the DaemonSet configuration.

### 3. Update OpenTelemetry Collector Configuration

**File to modify**: `manifests/infrastructure/openobserve-collector/gateway-collector.yaml`

**Current configuration** (lines 217-219):

```yaml
- job_name: 'nginx-ingress'
  static_configs:
    - targets: [':10254', ':10254', ':10254']
```

**Add the new node IP** to the targets list:

```yaml
- job_name: 'nginx-ingress'
  static_configs:
    - targets: [':10254', ':10254', ':10254', 'NEW_NODE_IP:10254']
```

Replace `NEW_NODE_IP` with the actual IP address of your new node.
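For context, the `nginx-ingress` job above is a standard Prometheus scrape job; in an OpenTelemetry Collector configuration it normally sits under the `prometheus` receiver's `scrape_configs`. The sketch below shows that nesting only for orientation; the `scrape_interval` value and the `EXISTING_NODE_IP`/`NEW_NODE_IP` placeholders are assumptions, not the actual contents of `gateway-collector.yaml`:

```yaml
# Sketch only: nesting and values are assumptions; check the real
# gateway-collector.yaml (around lines 217-219) for the exact structure.
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'nginx-ingress'
          scrape_interval: 30s              # assumed interval
          static_configs:
            - targets:
                - 'EXISTING_NODE_IP:10254'  # placeholder for a current node IP
                - 'NEW_NODE_IP:10254'       # the node being added
```

Because static targets bypass service discovery, each new node has to be added to this list by hand; that is the only manual step in this guide that metrics collection strictly depends on.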
### 4. Update Host Firewall Policies (if applicable)

**File to check**: `manifests/infrastructure/cluster-policies/host-fw-worker-nodes.yaml`

Ensure the firewall allows nginx metrics port access (should already be configured):

```yaml
# NGINX Ingress Controller metrics port
- fromEntities:
    - cluster
  toPorts:
    - ports:
        - port: "10254"
          protocol: "TCP" # NGINX Ingress metrics
```

### 5. Apply the Configuration Changes

```bash
# Apply the updated collector configuration
kubectl apply -f manifests/infrastructure/openobserve-collector/gateway-collector.yaml

# Restart the collector to pick up the new configuration
kubectl rollout restart statefulset/openobserve-collector-gateway-collector -n openobserve-collector
```

### 6. Verification Steps

1. **Check that the nginx pod is running on the new node**:

   ```bash
   kubectl get pods -n ingress-nginx -o wide | grep NEW_NODE_NAME
   ```

2. **Verify metrics endpoint is accessible**:

   ```bash
   curl -s http://NEW_NODE_IP:10254/metrics | grep nginx_ingress_controller_requests | head -3
   ```

3. **Check collector logs for the new target**:

   ```bash
   kubectl logs -n openobserve-collector openobserve-collector-gateway-collector-0 --tail=50 | grep -i nginx
   ```

4. **Verify target discovery**: Look for log entries like:

   ```
   Scrape job added {"jobName": "nginx-ingress"}
   ```

5. **Test metrics in OpenObserve**: Your dashboard query should now include metrics from the new node:

   ```promql
   sum(increase(nginx_ingress_controller_requests[5m])) by (host)
   ```

## Important Notes

### Automatic vs Manual Configuration

- ✅ **Automatic**: Nginx ingress controller deployment (DaemonSet handles this)
- ✅ **Automatic**: ServiceMonitor discovery (target allocator handles this)
- ❌ **Manual**: Static scrape configuration (requires updating the targets list)

### Why Both ServiceMonitor and Static Config?

The current setup uses **both approaches** for redundancy:

1. **ServiceMonitor**: Automatically discovers nginx ingress services
2. **Static Configuration**: Ensures specific node IPs are always monitored

### Network Requirements

- Port **10254** must be accessible from the OpenTelemetry collector pods
- The new node should be on the same network as existing nodes
- Host firewall policies should allow metrics collection

### Monitoring Best Practices

- Always verify metrics are flowing after adding a node
- Test your dashboard queries to ensure the new node's metrics appear
- Monitor collector logs for any scraping errors

## Troubleshooting

### Common Issues

1. **Nginx pod not starting**: Check node labels and taints
2. **Metrics endpoint not accessible**: Verify network connectivity and firewall rules
3. **Collector not scraping**: Check collector logs and restart if needed
4. **Missing metrics in dashboard**: Wait 30-60 seconds for metrics to propagate

### Useful Commands

```bash
# Check nginx ingress pods
kubectl get pods -n ingress-nginx -o wide

# Test metrics endpoint
curl -s http://NODE_IP:10254/metrics | grep nginx_ingress_controller_requests

# Check collector status
kubectl get pods -n openobserve-collector

# View collector logs
kubectl logs -n openobserve-collector openobserve-collector-gateway-collector-0 --tail=50

# Check ServiceMonitor
kubectl get servicemonitor -n ingress-nginx -o yaml
```

## Configuration Files Summary

Files that may need updates when adding a node:

1. **Required**: `manifests/infrastructure/openobserve-collector/gateway-collector.yaml`
   - Update static targets list (line ~219)
2. **Optional**: `manifests/infrastructure/cluster-policies/host-fw-worker-nodes.yaml`
   - Usually already configured for port 10254
3. **Automatic**: `manifests/infrastructure/ingress-nginx/ingress-nginx.yaml`
   - No changes needed (DaemonSet handles deployment)
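## Appendix: ServiceMonitor Sketch

The "Why Both ServiceMonitor and Static Config?" section notes that nginx ingress metrics are also discovered through a ServiceMonitor via the target allocator. For reference, a minimal ServiceMonitor for this kind of setup usually looks like the sketch below; the name, label selector, and port name are assumptions and will differ from the actual object in the cluster (inspect it with `kubectl get servicemonitor -n ingress-nginx -o yaml`):

```yaml
# Hypothetical sketch only: name, labels, and port name are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ingress-nginx-metrics                # assumed name
  namespace: ingress-nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx  # assumed label selector
  endpoints:
    - port: metrics                          # assumed Service port name exposing 10254
      interval: 30s
```

Because a ServiceMonitor selects a Service rather than individual node IPs, it picks up new nodes automatically; that is why only the static targets list in `gateway-collector.yaml` needs a manual edit when a node is added.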