
Adding a New Node for Nginx Ingress Metrics Collection

This guide documents the steps required to add a new node to the cluster and ensure nginx ingress controller metrics are properly collected from it.

Overview

The nginx ingress controller is deployed as a DaemonSet (kind: DaemonSet), so Kubernetes automatically schedules one controller pod on every node. Collecting metrics from those pods, however, requires the additional configuration steps described in this guide.
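
To see this in practice, you can list the DaemonSet and compare its desired/ready pod counts to the number of nodes it is scheduled on (this assumes the controller runs in the ingress-nginx namespace, as elsewhere in this guide):

# Show the ingress controller DaemonSet; DESIRED and READY should match the node count
kubectl get daemonset -n ingress-nginx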

Current Configuration

The cluster currently has three nodes, with metrics collection configured for:

  • n1 (<NODE_1_EXTERNAL_IP>): Control plane + worker
  • n2 (<NODE_2_EXTERNAL_IP>): Worker
  • n3 (<NODE_3_EXTERNAL_IP>): Worker
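
To confirm the node names and addresses before touching any configuration, a plain node listing is usually enough (EXTERNAL-IP may be empty on clusters that only report internal addresses):

# List nodes with their roles and IP addresses
kubectl get nodes -o wide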

Steps to Add a New Node

1. Add the Node to Kubernetes Cluster

Follow your standard node addition process (this is outside the scope of this guide). Ensure the new node:

  • Is properly joined to the cluster
  • Has the nginx ingress controller pod deployed (should happen automatically due to DaemonSet)
  • Is accessible on the cluster network
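
A quick sanity check once the node has joined (NEW_NODE_NAME is a placeholder for your node's Kubernetes name):

# Confirm the new node is registered and reports Ready
kubectl get nodes -o wide | grep NEW_NODE_NAME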

2. Verify Nginx Ingress Controller Deployment

Check that the nginx ingress controller pod is running on the new node:

kubectl get pods -n ingress-nginx -o wide

Look for a pod scheduled on the new node; the DaemonSet should have deployed one there automatically.
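
If the pod list is long, you can filter directly by node name instead of scanning the output (again, NEW_NODE_NAME is a placeholder):

# Show only the ingress-nginx pods scheduled on the new node
kubectl get pods -n ingress-nginx -o wide --field-selector spec.nodeName=NEW_NODE_NAME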

3. Update OpenTelemetry Collector Configuration

File to modify: manifests/infrastructure/openobserve-collector/gateway-collector.yaml

Current configuration (lines 217-219):

- job_name: 'nginx-ingress'
  static_configs:
    - targets: ['<NODE_1_EXTERNAL_IP>:10254', '<NODE_2_EXTERNAL_IP>:10254', '<NODE_3_EXTERNAL_IP>:10254']

Add the new node IP to the targets list:

- job_name: 'nginx-ingress'
  static_configs:
    - targets: ['<NODE_1_EXTERNAL_IP>:10254', '<NODE_2_EXTERNAL_IP>:10254', '<NODE_3_EXTERNAL_IP>:10254', 'NEW_NODE_IP:10254']

Replace NEW_NODE_IP with the actual IP address of your new node.

4. Update Host Firewall Policies (if applicable)

File to check: manifests/infrastructure/cluster-policies/host-fw-worker-nodes.yaml

Ensure the host firewall policy allows access to the nginx metrics port 10254 (this rule should already be present):

# NGINX Ingress Controller metrics port
- fromEntities:
  - cluster
  toPorts:
  - ports:
    - port: "10254"
      protocol: "TCP"  # NGINX Ingress metrics
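
The fromEntities/toPorts syntax above is Cilium host-policy syntax; assuming the host firewall is implemented as Cilium clusterwide policies, you can confirm the rule is present on the live cluster with something like:

# Check that the 10254 rule exists in the applied Cilium clusterwide policies
kubectl get ciliumclusterwidenetworkpolicies -o yaml | grep -B2 -A3 '10254'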

5. Apply the Configuration Changes

# Apply the updated collector configuration
kubectl apply -f manifests/infrastructure/openobserve-collector/gateway-collector.yaml

# Restart the collector to pick up the new configuration
kubectl rollout restart statefulset/openobserve-collector-gateway-collector -n openobserve-collector
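
To confirm the restart completed before moving on to verification:

# Wait for the restarted collector pods to become ready
kubectl rollout status statefulset/openobserve-collector-gateway-collector -n openobserve-collector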

6. Verification Steps

  1. Check that the nginx pod is running on the new node:

    kubectl get pods -n ingress-nginx -o wide | grep NEW_NODE_NAME
    
  2. Verify metrics endpoint is accessible:

    curl -s http://NEW_NODE_IP:10254/metrics | grep nginx_ingress_controller_requests | head -3
    
  3. Check collector logs for the new target:

    kubectl logs -n openobserve-collector openobserve-collector-gateway-collector-0 --tail=50 | grep -i nginx
    
  4. Verify target discovery: Look for log entries like:

    Scrape job added {"jobName": "nginx-ingress"}
    
  5. Test metrics in OpenObserve: Your dashboard query should now include metrics from the new node:

    sum(increase(nginx_ingress_controller_requests[5m])) by (host)
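
Because the static scrape config addresses each node directly, the per-target instance label should distinguish nodes (assuming the collector pipeline preserves the default instance label). A variant of the dashboard query that makes the new node's contribution visible:

sum(increase(nginx_ingress_controller_requests[5m])) by (instance)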
    

Important Notes

Automatic vs Manual Configuration

  • Automatic: Nginx ingress controller deployment (DaemonSet handles this)
  • Automatic: ServiceMonitor discovery (target allocator handles this)
  • Manual: Static scrape configuration (requires updating the targets list)

Why Both ServiceMonitor and Static Config?

The current setup uses both approaches for redundancy:

  1. ServiceMonitor: Automatically discovers nginx ingress services
  2. Static Configuration: Ensures specific node IPs are always monitored
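
For reference, a hypothetical ServiceMonitor for the controller's metrics service might look like the sketch below; the names, namespace, and labels are illustrative, and the actual resource in this cluster may differ (inspect it with the ServiceMonitor command under Useful Commands):

# Illustrative sketch only -- names and labels are assumptions, not this cluster's actual resource
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
  endpoints:
    - port: metrics
      interval: 30s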

Network Requirements

  • Port 10254 must be accessible from the OpenTelemetry collector pods
  • The new node should be on the same network as existing nodes
  • Host firewall policies should allow metrics collection
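
To check reachability from the collector's point of view rather than from your workstation, you can exec into a collector pod (the image may not ship curl; substitute wget or a temporary debug pod if needed, and replace NEW_NODE_IP):

# Expect HTTP 200 if the metrics port is reachable from the collector
kubectl exec -n openobserve-collector openobserve-collector-gateway-collector-0 -- \
  curl -s -o /dev/null -w '%{http_code}\n' http://NEW_NODE_IP:10254/metrics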

Monitoring Best Practices

  • Always verify metrics are flowing after adding a node
  • Test your dashboard queries to ensure the new node's metrics appear
  • Monitor collector logs for any scraping errors

Troubleshooting

Common Issues

  1. Nginx pod not starting: Check node labels and taints
  2. Metrics endpoint not accessible: Verify network connectivity and firewall rules
  3. Collector not scraping: Check collector logs and restart if needed
  4. Missing metrics in dashboard: Wait 30-60 seconds for metrics to propagate
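
For the first two issues, these generic checks usually narrow things down (NEW_NODE_NAME and NEW_NODE_IP are placeholders):

# Issue 1: look for taints that keep the DaemonSet pod off the node
kubectl describe node NEW_NODE_NAME | grep -A3 -i taints
# Issue 2: confirm the metrics endpoint responds from a host that can reach the node
curl -sv http://NEW_NODE_IP:10254/metrics > /dev/null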

Useful Commands

# Check nginx ingress pods
kubectl get pods -n ingress-nginx -o wide

# Test metrics endpoint
curl -s http://NODE_IP:10254/metrics | grep nginx_ingress_controller_requests

# Check collector status
kubectl get pods -n openobserve-collector

# View collector logs
kubectl logs -n openobserve-collector openobserve-collector-gateway-collector-0 --tail=50

# Check ServiceMonitor
kubectl get servicemonitor -n ingress-nginx -o yaml

Configuration Files Summary

Files that may need updates when adding a node:

  1. Required: manifests/infrastructure/openobserve-collector/gateway-collector.yaml

    • Update static targets list (line ~219)
  2. Optional: manifests/infrastructure/cluster-policies/host-fw-worker-nodes.yaml

    • Usually already configured for port 10254
  3. Automatic: manifests/infrastructure/ingress-nginx/ingress-nginx.yaml

    • No changes needed (DaemonSet handles deployment)