# Kubernetes Metrics Server

## Overview
This configuration deploys the Kubernetes Metrics Server, which provides CPU and memory metrics for nodes and pods. It enables the `kubectl top` commands and supplies the resource metrics used by the Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA).

## Architecture

### Current Deployment (Simple)
- **Version**: v0.7.2 (latest stable at the time of writing)
- **Replicas**: 2 (HA across both cluster nodes)
- **TLS Mode**: Kubelet certificate verification disabled for the initial deployment (`--kubelet-insecure-tls=true`)
- **Integration**: OpenObserve monitoring via ServiceMonitor

### Security Configuration
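The current deployment sets `--kubelet-insecure-tls=true` for compatibility with Talos Linux. With this flag the metrics server still connects to kubelets over TLS but does not verify their serving certificates. This is considered acceptable for internal cluster metrics because:
- Metrics traffic stays within the cluster network
- The VLAN provides network isolation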
- No sensitive data is exposed via metrics
- Proper RBAC controls access to the metrics API
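
To confirm which kubelet TLS mode a running deployment actually uses, the container arguments can be inspected. The following is a minimal sketch, not part of this repository: it assumes the Deployment is named `metrics-server` (the upstream default, not confirmed here) and lives in the `metrics-server-system` namespace used in the Troubleshooting commands. The helper name `kubelet_tls_mode` is hypothetical.

```bash
# Print "insecure" if kubelet certificate verification is disabled, else "verified".
# Assumption: the Deployment is named "metrics-server"; adjust to match your manifests.
kubelet_tls_mode() {
  args=$(kubectl get deployment metrics-server -n metrics-server-system \
    -o jsonpath='{.spec.template.spec.containers[0].args}') || return 1
  case "$args" in
    *--kubelet-insecure-tls=false*) echo "verified" ;;  # flag present but turned off
    *--kubelet-insecure-tls*)       echo "insecure" ;;  # flag present (bare or =true)
    *)                              echo "verified" ;;  # flag absent: default behaviour
  esac
}
```

Running `kubelet_tls_mode` against the deployment described here should print `insecure`, provided the name assumption holds.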

### Future Enhancements (Optional)
For production hardening, the repository includes:
- `certificate.yaml`: cert-manager certificates for proper TLS
- `metrics-server.yaml`: Full TLS-enabled deployment

To switch to secure TLS when needed, update `kustomization.yaml`.
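
Before applying a changed `kustomization.yaml`, it may help to render the manifests locally and check whether the insecure flag is still present. This is a rough sketch under assumptions: the function name `rendered_tls_mode` is hypothetical, and the check is a plain text match, so it would also report `insecure` for `--kubelet-insecure-tls=false` or for a comment that mentions the flag.

```bash
# Report whether the kustomize output still contains --kubelet-insecure-tls.
# $1: directory containing kustomization.yaml (defaults to the current directory).
rendered_tls_mode() {
  rendered=$(kubectl kustomize "${1:-.}") || return 1
  if printf '%s\n' "$rendered" | grep -q -- '--kubelet-insecure-tls'; then
    echo "insecure"
  else
    echo "secure"
  fi
}
```

If `rendered_tls_mode .` prints `secure`, `kubectl diff -k .` can then show what would change in the cluster before running `kubectl apply -k .`.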

## Usage

### Basic Commands
```bash
# View node resource usage
kubectl top nodes

# View pod resource usage (all namespaces)
kubectl top pods --all-namespaces

# View pod resource usage (specific namespace)
kubectl top pods -n kube-system

# View pod resource usage with containers
kubectl top pods --containers
```

### Integration with Monitoring
The metrics server is automatically discovered by OpenObserve via ServiceMonitor for:
- Metrics server performance monitoring
- Resource usage dashboards
- Alerting on high resource consumption

## Troubleshooting

### Common Issues
1. **"Metrics API not available"**: Check pod status with `kubectl get pods -n metrics-server-system`
2. **TLS certificate errors**: Verify APIService with `kubectl get apiservice v1beta1.metrics.k8s.io`
3. **Resource limits**: Under high cluster load, pods may exceed their memory limit and be OOMKilled
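
To check the third case, the last termination reason of each container can be read from the pod status. A minimal sketch, assuming every pod in `metrics-server-system` belongs to the metrics server; the function name `oomkilled_pods` is hypothetical.

```bash
# Print "<pod-name> OOMKilled" for each pod whose previous container run was OOM-killed.
# Returns non-zero (grep's exit status) when no such pod is found.
oomkilled_pods() {
  kubectl get pods -n metrics-server-system \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}' \
    | grep 'OOMKilled'
}
```

For example, `oomkilled_pods || echo "no OOM kills recorded"` prints the affected pods or the fallback message.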

### Verification
```bash
# Check metrics server status
kubectl get pods -n metrics-server-system

# Verify API registration
kubectl get apiservice v1beta1.metrics.k8s.io

# Test metrics collection
kubectl top nodes
kubectl top pods -n metrics-server-system
```
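
The APIService listing above can also be reduced to a single pass/fail check by reading its `Available` condition. This is a sketch for scripting purposes; the function name `metrics_api_available` is hypothetical.

```bash
# Succeed only if the metrics APIService reports Available=True.
metrics_api_available() {
  status=$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}') || return 1
  [ "$status" = "True" ]
}
```

A possible use is `metrics_api_available && kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes`, which queries the Metrics API directly once the APIService is healthy.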

## Configuration

### Resource Requests/Limits
- **CPU**: 100m request, 500m limit
- **Memory**: 200Mi request, 500Mi limit
- **Priority**: system-cluster-critical

### Node Scheduling
- Tolerates control plane taints
- Can schedule on both n1 (control plane) and n2 (worker)
- Uses node selector for Linux nodes only
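
To see whether the two replicas actually landed on different nodes, the distinct node names of the running pods can be counted. A minimal sketch, assuming the `metrics-server-system` namespace contains only metrics server pods; the function name `metrics_server_node_count` is hypothetical.

```bash
# Count the distinct nodes that currently host pods in metrics-server-system.
metrics_server_node_count() {
  kubectl get pods -n metrics-server-system \
    -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' \
    | grep -v '^$' | sort -u | wc -l | tr -d ' '
}
```

A result of `2` would indicate the replicas are spread across n1 and n2; `kubectl get pods -n metrics-server-system -o wide` shows the same placement in tabular form.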

## Monitoring Integration
- **ServiceMonitor**: Automatically scraped by OpenObserve
- **Metrics Path**: `/metrics` on the HTTPS port
- **Scrape Interval**: 30 seconds
- **Dashboard**: Available in OpenObserve for resource analysis