# Kubernetes Metrics Server

## Overview

This deploys the Kubernetes Metrics Server to provide resource metrics for nodes and pods. The metrics server enables the `kubectl top` commands and supplies metrics for Horizontal Pod Autoscaling (HPA) and Vertical Pod Autoscaling (VPA).

## Architecture

### Current Deployment (Simple)

- **Version**: v0.7.2 (latest stable)
- **Replicas**: 2 (HA across both cluster nodes)
- **TLS Mode**: Insecure TLS for initial deployment (`--kubelet-insecure-tls=true`)
- **Integration**: OpenObserve monitoring via ServiceMonitor

### Security Configuration

The current deployment uses `--kubelet-insecure-tls=true` for compatibility with Talos Linux. With this flag the metrics server does not verify the kubelet serving certificates. This is acceptable for internal cluster metrics because:

- Metrics traffic stays within the cluster network
- The VLAN provides network isolation
- No sensitive data is exposed via metrics
- Proper RBAC controls access to the metrics API

### Future Enhancements (Optional)

For production hardening, the repository includes:

- `certificate.yaml`: cert-manager certificates for proper TLS
- `metrics-server.yaml`: Full TLS-enabled deployment

To switch to secure TLS when needed, update `kustomization.yaml`.

## Usage

### Basic Commands

```bash
# View node resource usage
kubectl top nodes

# View pod resource usage (all namespaces)
kubectl top pods --all-namespaces

# View pod resource usage (specific namespace)
kubectl top pods -n kube-system

# View pod resource usage with containers
kubectl top pods --containers
```

### Integration with Monitoring

The metrics server is automatically discovered by OpenObserve via ServiceMonitor for:

- Metrics server performance monitoring
- Resource usage dashboards
- Alerting on high resource consumption

## Troubleshooting

### Common Issues

1. **"Metrics API not available"**: Check pod status with `kubectl get pods -n metrics-server-system`
2. **TLS certificate errors**: Verify the APIService with `kubectl get apiservice v1beta1.metrics.k8s.io`
3. **Resource limits**: Pods may be OOMKilled if cluster load is high

### Verification

```bash
# Check metrics server status
kubectl get pods -n metrics-server-system

# Verify API registration
kubectl get apiservice v1beta1.metrics.k8s.io

# Test metrics collection
kubectl top nodes
kubectl top pods -n metrics-server-system
```

## Configuration

### Resource Requests/Limits

- **CPU**: 100m request, 500m limit
- **Memory**: 200Mi request, 500Mi limit
- **Priority**: system-cluster-critical

### Node Scheduling

- Tolerates control plane taints
- Can schedule on both n1 (control plane) and n2 (worker)
- Uses a node selector for Linux nodes only

## Monitoring Integration

- **ServiceMonitor**: Automatically scraped by OpenObserve
- **Metrics Path**: `/metrics` on the HTTPS port
- **Scrape Interval**: 30 seconds
- **Dashboard**: Available in OpenObserve for resource analysis
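
The manual checks listed under Verification can also be scripted. The following is a minimal sketch, not part of the repository: the function names `metrics_api_available` and `verify_metrics_server` are hypothetical, and it assumes `kubectl` is on the `PATH` and pointed at the target cluster. The namespace and APIService name are taken from the commands above.

```bash
#!/usr/bin/env bash
# Sketch: scripted version of the manual verification steps.
# Assumption: kubectl is installed and configured for the target cluster.

# Names taken from the commands in this README.
METRICS_NS="metrics-server-system"
APISERVICE="v1beta1.metrics.k8s.io"

# Hypothetical helper: succeeds when the APIService reports its
# "Available" condition as "True".
metrics_api_available() {
  local status
  status="$(kubectl get apiservice "$APISERVICE" \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}')"
  [ "$status" = "True" ]
}

# Hypothetical helper: runs the checks in order and stops at the first failure.
verify_metrics_server() {
  kubectl get pods -n "$METRICS_NS" || return 1
  metrics_api_available || { echo "metrics API not available" >&2; return 1; }
  kubectl top nodes || return 1
  kubectl top pods -n "$METRICS_NS"
}
```

After sourcing the file, run `verify_metrics_server`. A non-zero exit status indicates which stage to look at first: pod status, API registration, or metrics collection.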