# Adding a New Node for Nginx Ingress Metrics Collection

This guide documents the steps required to add a new node to the cluster and ensure nginx ingress controller metrics are properly collected from it.

## Overview

The nginx ingress controller is deployed as a **DaemonSet** (kind: DaemonSet), which means it automatically deploys one pod per node. However, for metrics collection to work properly, additional configuration steps are required.

## Current Configuration

Currently, the cluster has 3 nodes with metrics collection configured for:

- **n1 ()**: Control plane + worker
- **n2 ()**: Worker
- **n3 ()**: Worker

## Steps to Add a New Node

### 1. Add the Node to Kubernetes Cluster

Follow your standard node addition process (this is outside the scope of this guide). Ensure the new node:

- Is properly joined to the cluster
- Has the nginx ingress controller pod deployed (should happen automatically due to the DaemonSet)
- Is accessible on the cluster network

### 2. Verify Nginx Ingress Controller Deployment

Check that the nginx ingress controller pod is running on the new node:

```bash
kubectl get pods -n ingress-nginx -o wide
```

Look for a pod on your new node. The nginx ingress controller should automatically deploy due to the DaemonSet configuration.

### 3. Update OpenTelemetry Collector Configuration

**File to modify**: `manifests/infrastructure/openobserve-collector/gateway-collector.yaml`

**Current configuration** (lines 217-219):

```yaml
- job_name: 'nginx-ingress'
  static_configs:
    - targets: [':10254', ':10254', ':10254']
```

**Add the new node IP** to the targets list:

```yaml
- job_name: 'nginx-ingress'
  static_configs:
    - targets: [':10254', ':10254', ':10254', 'NEW_NODE_IP:10254']
```

Replace `NEW_NODE_IP` with the actual IP address of your new node.
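For context, the `nginx-ingress` job above is a standard Prometheus scrape job; in an OpenTelemetry Collector configuration it normally sits under the `prometheus` receiver's `scrape_configs`. The sketch below shows that nesting only for orientation; the `scrape_interval` value and the `EXISTING_NODE_IP`/`NEW_NODE_IP` placeholders are assumptions, not the actual contents of `gateway-collector.yaml`:

```yaml
# Sketch only: nesting and values are assumptions; check the real
# gateway-collector.yaml (around lines 217-219) for the exact structure.
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'nginx-ingress'
          scrape_interval: 30s              # assumed interval
          static_configs:
            - targets:
                - 'EXISTING_NODE_IP:10254'  # placeholder for a current node IP
                - 'NEW_NODE_IP:10254'       # the node being added
```

Because static targets bypass service discovery, each new node has to be added to this list by hand; that is the only manual step in this guide that metrics collection strictly depends on.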
### 4. Update Host Firewall Policies (if applicable)

**File to check**: `manifests/infrastructure/cluster-policies/host-fw-worker-nodes.yaml`

Ensure the firewall allows nginx metrics port access (should already be configured):

```yaml
# NGINX Ingress Controller metrics port
- fromEntities:
    - cluster
  toPorts:
    - ports:
        - port: "10254"
          protocol: "TCP" # NGINX Ingress metrics
```

### 5. Apply the Configuration Changes

```bash
# Apply the updated collector configuration
kubectl apply -f manifests/infrastructure/openobserve-collector/gateway-collector.yaml

# Restart the collector to pick up the new configuration
kubectl rollout restart statefulset/openobserve-collector-gateway-collector -n openobserve-collector
```

### 6. Verification Steps

1. **Check that the nginx pod is running on the new node**:

   ```bash
   kubectl get pods -n ingress-nginx -o wide | grep NEW_NODE_NAME
   ```

2. **Verify metrics endpoint is accessible**:

   ```bash
   curl -s http://NEW_NODE_IP:10254/metrics | grep nginx_ingress_controller_requests | head -3
   ```

3. **Check collector logs for the new target**:

   ```bash
   kubectl logs -n openobserve-collector openobserve-collector-gateway-collector-0 --tail=50 | grep -i nginx
   ```

4. **Verify target discovery**: Look for log entries like:

   ```
   Scrape job added {"jobName": "nginx-ingress"}
   ```

5. **Test metrics in OpenObserve**: Your dashboard query should now include metrics from the new node:

   ```promql
   sum(increase(nginx_ingress_controller_requests[5m])) by (host)
   ```

## Important Notes

### Automatic vs Manual Configuration

- ✅ **Automatic**: Nginx ingress controller deployment (DaemonSet handles this)
- ✅ **Automatic**: ServiceMonitor discovery (target allocator handles this)
- ❌ **Manual**: Static scrape configuration (requires updating the targets list)

### Why Both ServiceMonitor and Static Config?

The current setup uses **both approaches** for redundancy:

1. **ServiceMonitor**: Automatically discovers nginx ingress services
2. **Static Configuration**: Ensures specific node IPs are always monitored

### Network Requirements

- Port **10254** must be accessible from the OpenTelemetry collector pods
- The new node should be on the same network as existing nodes
- Host firewall policies should allow metrics collection

### Monitoring Best Practices

- Always verify metrics are flowing after adding a node
- Test your dashboard queries to ensure the new node's metrics appear
- Monitor collector logs for any scraping errors

## Troubleshooting

### Common Issues

1. **Nginx pod not starting**: Check node labels and taints
2. **Metrics endpoint not accessible**: Verify network connectivity and firewall rules
3. **Collector not scraping**: Check collector logs and restart if needed
4. **Missing metrics in dashboard**: Wait 30-60 seconds for metrics to propagate

### Useful Commands

```bash
# Check nginx ingress pods
kubectl get pods -n ingress-nginx -o wide

# Test metrics endpoint
curl -s http://NODE_IP:10254/metrics | grep nginx_ingress_controller_requests

# Check collector status
kubectl get pods -n openobserve-collector

# View collector logs
kubectl logs -n openobserve-collector openobserve-collector-gateway-collector-0 --tail=50

# Check ServiceMonitor
kubectl get servicemonitor -n ingress-nginx -o yaml
```

## Configuration Files Summary

Files that may need updates when adding a node:

1. **Required**: `manifests/infrastructure/openobserve-collector/gateway-collector.yaml`
   - Update static targets list (line ~219)
2. **Optional**: `manifests/infrastructure/cluster-policies/host-fw-worker-nodes.yaml`
   - Usually already configured for port 10254
3. **Automatic**: `manifests/infrastructure/ingress-nginx/ingress-nginx.yaml`
   - No changes needed (DaemonSet handles deployment)
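## Appendix: ServiceMonitor Sketch

The "Why Both ServiceMonitor and Static Config?" section notes that nginx ingress metrics are also discovered through a ServiceMonitor via the target allocator. For reference, a minimal ServiceMonitor for this kind of setup usually looks like the sketch below; the name, label selector, and port name are assumptions and will differ from the actual object in the cluster (inspect it with `kubectl get servicemonitor -n ingress-nginx -o yaml`):

```yaml
# Hypothetical sketch only: name, labels, and port name are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ingress-nginx-metrics                # assumed name
  namespace: ingress-nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx  # assumed label selector
  endpoints:
    - port: metrics                          # assumed Service port name exposing 10254
      interval: 30s
```

Because a ServiceMonitor selects a Service rather than individual node IPs, it picks up new nodes automatically; that is why only the static targets list in `gateway-collector.yaml` needs a manual edit when a node is added.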