Kubernetes Metrics Server
Overview
This deploys the Kubernetes Metrics Server to provide resource metrics for nodes and pods. The Metrics Server backs the kubectl top commands and supplies the resource metrics consumed by the Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA).
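As a quick illustration of what the resource metrics API enables, a minimal HorizontalPodAutoscaler could look like the sketch below. The target deployment name and namespace are placeholders, not part of this repository:

# Sketch only: scales a hypothetical "example-app" Deployment on CPU utilization,
# which relies on the resource metrics served by the Metrics Server.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-app
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80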
Architecture
Current Deployment (Simple)
- Version: v0.7.2 (latest stable)
- Replicas: 2 (HA across both cluster nodes)
- TLS Mode: Insecure TLS for initial deployment (--kubelet-insecure-tls=true); see the Deployment sketch after this list
- Integration: OpenObserve monitoring via ServiceMonitor
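For reference, the relevant parts of the Deployment would look roughly like the sketch below; the actual manifest in this repository is authoritative, and only the replica count, version, and --kubelet-insecure-tls flag are confirmed by this document:

# Sketch of the Deployment described above (2 replicas, insecure kubelet TLS).
# Labels and the extra flag are common defaults, not verified against the repo.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: metrics-server-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: metrics-server
  template:
    metadata:
      labels:
        app.kubernetes.io/name: metrics-server
    spec:
      containers:
        - name: metrics-server
          image: registry.k8s.io/metrics-server/metrics-server:v0.7.2
          args:
            - --kubelet-insecure-tls=true
            - --kubelet-preferred-address-types=InternalIP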
Security Configuration
The current deployment uses --kubelet-insecure-tls=true for compatibility with Talos Linux. This is acceptable for internal cluster metrics because:
- Metrics traffic stays within the cluster network
- The VLAN provides network isolation
- No sensitive data is exposed via metrics
- Proper RBAC controls access to the metrics API
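As an illustration of RBAC-gated access to the metrics API, a read-only grant could look like the sketch below. The role name and subject group are placeholders; the RBAC objects used by the Metrics Server itself ship with its upstream manifests:

# Sketch: grants read access to the resource metrics API for a hypothetical group.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metrics-reader
rules:
  - apiGroups: ["metrics.k8s.io"]
    resources: ["nodes", "pods"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-reader
subjects:
  - kind: Group
    name: ops-team        # placeholder group
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: metrics-reader
  apiGroup: rbac.authorization.k8s.io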
Future Enhancements (Optional)
For production hardening, the repository includes:
- certificate.yaml: cert-manager certificates for proper TLS
- metrics-server.yaml: Full TLS-enabled deployment
- Switch to secure TLS by updating kustomization.yaml when needed
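A cert-manager Certificate for this purpose might look roughly like the following sketch; the issuer, secret name, and DNS names are assumptions, and the certificate.yaml in the repository is authoritative:

# Sketch only: issues a serving certificate for the metrics-server Service.
# Issuer and secret names are placeholders, not taken from certificate.yaml.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: metrics-server-tls
  namespace: metrics-server-system
spec:
  secretName: metrics-server-tls
  issuerRef:
    name: cluster-issuer        # placeholder issuer
    kind: ClusterIssuer
  dnsNames:
    - metrics-server.metrics-server-system.svc
    - metrics-server.metrics-server-system.svc.cluster.local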
Usage
Basic Commands
# View node resource usage
kubectl top nodes
# View pod resource usage (all namespaces)
kubectl top pods --all-namespaces
# View pod resource usage (specific namespace)
kubectl top pods -n kube-system
# View pod resource usage with containers
kubectl top pods --containers
Integration with Monitoring
The metrics server is automatically discovered by OpenObserve via ServiceMonitor for:
- Metrics server performance monitoring
- Resource usage dashboards
- Alerting on high resource consumption
Troubleshooting
Common Issues
- "Metrics API not available": Check pod status with
kubectl get pods -n metrics-server-system - TLS certificate errors: Verify APIService with
kubectl get apiservice v1beta1.metrics.k8s.io - Resource limits: Pods may be OOMKilled if cluster load is high
Verification
# Check metrics server status
kubectl get pods -n metrics-server-system
# Verify API registration
kubectl get apiservice v1beta1.metrics.k8s.io
# Test metrics collection
kubectl top nodes
kubectl top pods -n metrics-server-system
Configuration
Resource Requests/Limits
- CPU: 100m request, 500m limit
- Memory: 200Mi request, 500Mi limit
- Priority: system-cluster-critical
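In the container spec, the settings listed above would appear roughly as follows (a sketch of the values, not a copy of the repository manifest):

# Sketch of the resource and priority settings listed above.
spec:
  priorityClassName: system-cluster-critical
  containers:
    - name: metrics-server
      resources:
        requests:
          cpu: 100m
          memory: 200Mi
        limits:
          cpu: 500m
          memory: 500Mi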
Node Scheduling
- Tolerates control plane taints
- Can schedule on both n1 (control plane) and n2 (worker)
- Uses node selector for Linux nodes only
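Expressed in the pod spec, the scheduling rules above would look roughly like this sketch; the toleration matches the standard control-plane taint, but the exact manifest may differ:

# Sketch of the scheduling constraints listed above.
spec:
  nodeSelector:
    kubernetes.io/os: linux
  tolerations:
    - key: node-role.kubernetes.io/control-plane
      operator: Exists
      effect: NoSchedule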
Monitoring Integration
- ServiceMonitor: Automatically scraped by OpenObserve
- Metrics Path: /metrics on the HTTPS port
- Scrape Interval: 30 seconds
- Dashboard: Available in OpenObserve for resource analysis
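For illustration, a ServiceMonitor matching these settings could look like the sketch below; the label selector and port name are assumptions, and OpenObserve must be configured to honor ServiceMonitor resources:

# Sketch: scrapes /metrics over HTTPS every 30 seconds.
# Selector labels and port name are placeholders, not verified against the repo.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: metrics-server
  namespace: metrics-server-system
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: metrics-server
  endpoints:
    - port: https
      scheme: https
      path: /metrics
      interval: 30s
      tlsConfig:
        insecureSkipVerify: true   # placeholder; align with the cluster's TLS setup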