Add the redacted source file for demo purposes Reviewed-on: https://source.michaeldileo.org/michael_dileo/Keybard-Vagabond-Demo/pulls/1 Co-authored-by: Michael DiLeo <michael_dileo@proton.me> Co-committed-by: Michael DiLeo <michael_dileo@proton.me>
# Elasticsearch Infrastructure

This directory contains the Elasticsearch setup using ECK (Elastic Cloud on Kubernetes) operator for full-text search on the Kubernetes cluster.

## Architecture

- **ECK Operator**: Production-grade Elasticsearch deployment on Kubernetes
- **Single-node cluster**: Optimized for your 2-node cluster (can be scaled later)
- **Security enabled**: X-Pack security with custom role and user for Mastodon
- **Longhorn storage**: Distributed storage with 2-replica redundancy
- **Self-signed certificates**: Internal cluster communication with TLS

## Components

### **Core Components**
- `namespace.yaml`: Elasticsearch system namespace
- `repository.yaml`: Elastic Helm repository
- `operator.yaml`: ECK operator deployment
- Uses existing `longhorn-retain` storage class with backup labels on PVCs
- `cluster.yaml`: Elasticsearch and Kibana cluster configuration

### **Security Components**
- `secret.yaml`: SOPS-encrypted credentials for Elasticsearch admin and Mastodon user
- `security-setup.yaml`: Job to create Mastodon role and user after cluster deployment

### **Monitoring Components**
- `monitoring.yaml`: ServiceMonitor for OpenObserve integration + optional Kibana ingress
- Built-in metrics: Elasticsearch Prometheus exporter

## Services Created

ECK automatically creates these services:

- `elasticsearch-es-http`: HTTPS API access (port 9200)
- `elasticsearch-es-transport`: Internal cluster transport (port 9300)
- `kibana-kb-http`: Kibana web UI (port 5601) - optional management interface

## Connection Information

### For Applications (Mastodon)

Applications should connect using these connection parameters:

**Elasticsearch Connection:**
```yaml
host: elasticsearch-es-http.elasticsearch-system.svc.cluster.local
port: 9200
scheme: https  # ECK uses HTTPS with self-signed certificates
user: mastodon
password: <password from elasticsearch-credentials secret>
```

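
As a rough illustration of how the parameters above might map onto Mastodon's environment, the sketch below prints the corresponding `ES_*` variables. The variable names are Mastodon's own; the helper name `print_mastodon_es_env` is hypothetical, and carrying the scheme inside `ES_HOST` is an assumption to verify against the Mastodon version in use. How the self-signed CA is trusted is not covered here.

```bash
# Hypothetical helper: print Mastodon ES_* settings for this cluster.
# Assumes ES_PASSWORD already holds the mastodon user's password.
print_mastodon_es_env() {
  local host="elasticsearch-es-http.elasticsearch-system.svc.cluster.local"
  printf 'ES_ENABLED=true\n'
  printf 'ES_HOST=https://%s\n' "$host"   # assumption: scheme is carried in ES_HOST
  printf 'ES_PORT=9200\n'
  printf 'ES_USER=mastodon\n'
  printf 'ES_PASS=%s\n' "${ES_PASSWORD:?set ES_PASSWORD first}"
  printf 'ES_PRESET=single_node_cluster\n'
}
```

Calling `print_mastodon_es_env` with `ES_PASSWORD` exported yields one `KEY=value` line per setting, which can be compared against the values in the Mastodon secret.
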
### Getting Credentials

The Elasticsearch credentials are stored in Kubernetes secrets. The admin password is auto-generated by ECK; the Mastodon user password is defined in the SOPS-encrypted secret:

```bash
# Get the admin password (auto-generated by ECK)
kubectl get secret elasticsearch-es-elastic-user -n elasticsearch-system -o jsonpath="{.data.elastic}" | base64 -d

# Get the Mastodon user password (set during security setup)
kubectl get secret elasticsearch-credentials -n elasticsearch-system -o jsonpath="{.data.password}" | base64 -d
```

## Deployment Steps

### 1. Encrypt Secrets
Before deploying, encrypt the secrets with SOPS:

```bash
# Edit and encrypt the Elasticsearch credentials
sops manifests/infrastructure/elasticsearch/secret.yaml

# Edit and encrypt the Mastodon Elasticsearch credentials
sops manifests/applications/mastodon/elasticsearch-secret.yaml
```

### 2. Deploy Infrastructure
The infrastructure will be deployed automatically by Flux when you commit:

```bash
git add manifests/infrastructure/elasticsearch/
git add manifests/cluster/flux-system/elasticsearch.yaml
git add manifests/cluster/flux-system/kustomization.yaml
git commit -m "Add Elasticsearch infrastructure for Mastodon search"
git push
```

### 3. Wait for Deployment
```bash
# Monitor ECK operator deployment
kubectl get pods -n elasticsearch-system -w

# Monitor Elasticsearch cluster startup
kubectl get elasticsearch -n elasticsearch-system -w

# Check cluster health
kubectl get elasticsearch elasticsearch -n elasticsearch-system -o yaml
```

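
For scripted deployments, watching by hand can be replaced with a polling loop. The sketch below reads the `.status.health` field that ECK reports on the `Elasticsearch` resource and returns once it matches the wanted value. The function name `wait_for_es_health` and its defaults are hypothetical, not part of this repository.

```bash
# Hypothetical helper: poll the ECK-reported health until it matches.
# Usage: wait_for_es_health [wanted] [attempts] [delay_seconds]
wait_for_es_health() {
  local wanted="${1:-green}"
  local attempts="${2:-60}"
  local delay="${3:-10}"
  local health=""
  local i=1
  while [ "$i" -le "$attempts" ]; do
    health="$(kubectl get elasticsearch elasticsearch -n elasticsearch-system \
      -o jsonpath='{.status.health}' 2>/dev/null)"
    if [ "$health" = "$wanted" ]; then
      echo "cluster health: $health"
      return 0
    fi
    echo "attempt $i/$attempts: health=${health:-unknown}, waiting ${delay}s" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_for_es_health green 30 10` gives the cluster up to five minutes to report green and returns a non-zero status otherwise.
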
### 4. Verify Security Setup
```bash
# Check if security setup job completed successfully
kubectl get jobs -n elasticsearch-system

# Verify Mastodon user was created
kubectl logs -n elasticsearch-system job/elasticsearch-security-setup
```

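
Beyond reading the job logs, the user can be checked directly by authenticating as it. The sketch below calls Elasticsearch's `_security/_authenticate` endpoint, which returns the username and roles for the supplied credentials. The helper name `es_whoami` is hypothetical, and the command is assumed to run somewhere the service DNS name resolves (for example inside a pod), with `ES_PASSWORD` holding the Mastodon user's password.

```bash
# Hypothetical helper: ask Elasticsearch which user and roles the
# mastodon credentials resolve to. -k skips verification of the
# self-signed certificate.
es_whoami() {
  curl -sk -u "mastodon:${ES_PASSWORD:?set ES_PASSWORD first}" \
    "https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200/_security/_authenticate?pretty"
}
```

If the setup job succeeded, the response should name the `mastodon` user and list the `mastodon_full_access` role; a 401 suggests the user or password is not in place.
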
### 5. Update Mastodon
After Elasticsearch is running, deploy the updated Mastodon configuration:

```bash
git add manifests/applications/mastodon/
git commit -m "Enable Elasticsearch in Mastodon"
git push
```

### 6. Populate Search Indices
Once Mastodon is running with Elasticsearch enabled, populate the search indices:

```bash
# Get a Mastodon web pod
MASTODON_POD=$(kubectl get pods -n mastodon-application -l app.kubernetes.io/component=web -o jsonpath='{.items[0].metadata.name}')

# Run the search deployment command
kubectl exec -n mastodon-application "$MASTODON_POD" -- bin/tootctl search deploy
```

## Configuration Details

### Elasticsearch Configuration
- **Version**: 7.17.27 (latest 7.x compatible with Mastodon)
- **Preset**: `single_node_cluster` (optimized for single-node deployment)
- **Memory**: 2GB heap size (50% of 4GB container limit)
- **Storage**: 50GB persistent volume with existing `longhorn-retain` storage class
- **Security**: X-Pack security enabled with custom roles

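
The heap figure above follows the common guideline of giving the JVM about half of the container's memory. A minimal sketch of that arithmetic is shown below; the helper name `heap_opts_for_limit` is hypothetical, and the 31 GiB cap reflects the usual advice to stay below the compressed-pointer threshold rather than anything configured in this repository.

```bash
# Hypothetical helper: derive JVM heap flags from a container memory
# limit given in MiB. Uses half the limit, capped at 31 GiB.
heap_opts_for_limit() {
  local limit_mib="$1"
  local heap_mib=$((limit_mib / 2))
  local cap_mib=$((31 * 1024))
  if [ "$heap_mib" -gt "$cap_mib" ]; then
    heap_mib="$cap_mib"
  fi
  echo "-Xms${heap_mib}m -Xmx${heap_mib}m"
}

heap_opts_for_limit 4096   # prints: -Xms2048m -Xmx2048m
```

Setting `-Xms` and `-Xmx` to the same value avoids heap resizing at runtime, which is why the helper emits both flags together.
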
### Security Configuration
Following the [Mastodon Elasticsearch documentation](https://docs.joinmastodon.org/admin/elasticsearch/), the setup includes:

- **Custom Role**: `mastodon_full_access` with minimal required permissions
- **Dedicated User**: `mastodon` with the custom role
- **TLS Encryption**: All connections use HTTPS with self-signed certificates

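
To make the role and user concrete, the sketch below shows roughly what creating them through the Elasticsearch security API could look like. The role body is modeled on the Mastodon documentation linked above. The job in `security-setup.yaml` is not reproduced here and may differ. The function names, `ES_URL`, and the `ELASTIC_PASSWORD` and `ES_PASSWORD` variables are assumptions for illustration, and the password is assumed not to contain characters that need JSON escaping.

```bash
ES_URL="https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200"

# Role body modeled on the Mastodon documentation.
mastodon_role_json() {
  cat <<'EOF'
{
  "cluster": ["monitor"],
  "indices": [
    { "names": ["*"], "privileges": ["read", "monitor", "write", "manage"] }
  ]
}
EOF
}

# Hypothetical helper: create the role, then the user that holds it.
# ELASTIC_PASSWORD is the ECK-generated admin password; ES_PASSWORD is
# the password to assign to the mastodon user.
create_mastodon_role_and_user() {
  mastodon_role_json | curl -sk -u "elastic:${ELASTIC_PASSWORD:?}" \
    -X POST "$ES_URL/_security/role/mastodon_full_access" \
    -H 'Content-Type: application/json' -d @-
  curl -sk -u "elastic:${ELASTIC_PASSWORD:?}" \
    -X POST "$ES_URL/_security/user/mastodon" \
    -H 'Content-Type: application/json' \
    -d "{\"password\": \"${ES_PASSWORD:?}\", \"roles\": [\"mastodon_full_access\"]}"
}
```

Keeping the role body in its own function makes it easy to review the granted privileges separately from the API calls.
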
### Performance Configuration
- **JVM Settings**: Optimized for your cluster's resource constraints
- **Discovery**: Single-node discovery (can be changed for multi-node scaling)
- **Memory**: Conservative settings for 2-node cluster compatibility
- **Storage**: Optimized for SSD performance with proper disk watermarks

## Mastodon Integration

### Search Features Enabled
Once configured, Mastodon will provide full-text search for:

- Public statuses from accounts that opted into search results
- User's own statuses
- User's mentions, favourites, and bookmarks
- Account information (display names, usernames, bios)

### Search Index Deployment
The `tootctl search deploy` command will create these indices:

- `accounts_index`: User accounts and profiles
- `statuses_index`: User's own statuses, mentions, favourites, bookmarks
- `public_statuses_index`: Public searchable content
- `tags_index`: Hashtag search

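
One way to confirm that the indices exist after `tootctl search deploy` is to list them through the `_cat/indices` API. The sketch below is a minimal example. The helper name `es_list_indices` is hypothetical, `ES_PASSWORD` is assumed to hold the Mastodon user's password, and the index names Elasticsearch reports may not match the labels above character for character.

```bash
# Hypothetical helper: list indices with health, document count and size.
# Assumes the service DNS name resolves from where this runs.
es_list_indices() {
  curl -sk -u "mastodon:${ES_PASSWORD:?set ES_PASSWORD first}" \
    "https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200/_cat/indices?v&h=index,health,docs.count,store.size"
}
```

A growing `docs.count` between two calls is a simple sign that indexing is still in progress.
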
## Monitoring Integration

### OpenObserve Metrics
Elasticsearch metrics are automatically collected and sent to OpenObserve:

- **Cluster Health**: Node status, cluster state, allocation
- **Performance**: Query latency, indexing rate, search performance
- **Storage**: Disk usage, index sizes, shard distribution
- **JVM**: Memory usage, garbage collection, heap statistics

### Kibana Management UI
Optional Kibana web interface available at `https://kibana.keyboardvagabond.com` for:

- Index management and monitoring
- Query development and testing
- Cluster configuration and troubleshooting
- Visual dashboards for Elasticsearch data

## Scaling Considerations

### Current Setup
- **Single-node cluster**: Optimized for current 2-node Kubernetes cluster
- **50GB storage**: Sufficient for small-to-medium Mastodon instances
- **2GB heap**: Conservative memory allocation

### Future Scaling
When adding more Kubernetes nodes:

1. Change `discovery.type` from `single-node` to `zen` (the 7.x multi-node default), or remove the setting entirely, in the cluster configuration
2. Increase the node set's `count` (`nodeSets[].count`) to 2 or 3 for high availability
3. Change `ES_PRESET` to `small_cluster` in Mastodon configuration
4. Consider increasing storage and memory allocations

## Troubleshooting

### Common Issues

**Elasticsearch pods pending:**
- Check storage class and PVC creation
- Verify Longhorn is healthy and has available space

**Security setup job failing:**
- Check Elasticsearch cluster health
- Verify admin credentials are available
- Review job logs for API errors

**Mastodon search not working:**
- Verify Elasticsearch credentials in Mastodon secret
- Check network connectivity between namespaces
- Ensure search indices are created with `tootctl search deploy`

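
For the pending-pod case, the checks above usually come down to looking at PVC binding, pod placement, and recent events. The sketch below bundles those three lookups; the helper name `es_debug_pending` is hypothetical.

```bash
# Hypothetical helper: gather the usual evidence for a Pending pod.
es_debug_pending() {
  local ns="elasticsearch-system"
  # Is the PVC Bound, and does it use the expected storage class?
  kubectl get pvc -n "$ns"
  # Which node, if any, was each pod scheduled to?
  kubectl get pods -n "$ns" -o wide
  # The most recent events often name the scheduling or volume problem.
  kubectl get events -n "$ns" --sort-by=.lastTimestamp | tail -n 20
}
```

A PVC stuck in `Pending` points at the storage class or Longhorn capacity, while a `Bound` PVC with an unscheduled pod points at node resources or affinity.
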
### Useful Commands

```bash
# Check Elasticsearch cluster status
kubectl get elasticsearch -n elasticsearch-system

# View Elasticsearch logs
kubectl logs -n elasticsearch-system -l elasticsearch.k8s.elastic.co/cluster-name=elasticsearch

# Check security setup
kubectl describe job elasticsearch-security-setup -n elasticsearch-system

# Test connectivity from Mastodon
# (security is enabled, so the request must authenticate; ES_PASSWORD is read from the local shell)
kubectl exec -n mastodon-application deployment/mastodon-web -- curl -k -u "mastodon:$ES_PASSWORD" https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200/_cluster/health
```

## Backup Integration

### S3 Backup Strategy
- **Longhorn Integration**: Elasticsearch volumes are automatically backed up to Backblaze B2
- **Volume Labels**: `backup.longhorn.io/enable: "true"` enables automatic S3 backup
- **Backup Frequency**: Follows existing Longhorn backup schedule

### Index Backup
For additional protection, consider periodic index snapshots. On Elasticsearch 7.x this requires the `repository-s3` plugin and S3 credentials in the Elasticsearch keystore. The commands below authenticate as the `elastic` admin user, on the assumption that the `mastodon` role does not grant snapshot management privileges:

```bash
# Create snapshot repository (one-time setup)
curl -k -u "elastic:$ELASTIC_PASSWORD" -X PUT "https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200/_snapshot/s3_repository" -H 'Content-Type: application/json' -d'
{
  "type": "s3",
  "settings": {
    "bucket": "longhorn-backup-bucket",
    "region": "eu-central-003",
    "endpoint": "<REPLACE_WITH_S3_ENDPOINT>"
  }
}'

# Create manual snapshot
curl -k -u "elastic:$ELASTIC_PASSWORD" -X PUT "https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200/_snapshot/s3_repository/snapshot_1"
```
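
After a snapshot has been requested, its progress can be read back from the same API. The sketch below is a minimal example under the same assumptions as the commands above: the repository is named `s3_repository`, and `ELASTIC_PASSWORD` holds the admin password. The helper name `es_snapshot_info` is hypothetical.

```bash
# Hypothetical helper: show details for one snapshot (default: snapshot_1).
# The response includes a "state" field such as IN_PROGRESS or SUCCESS.
es_snapshot_info() {
  local name="${1:-snapshot_1}"
  curl -sk -u "elastic:${ELASTIC_PASSWORD:?set ELASTIC_PASSWORD first}" \
    "https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200/_snapshot/s3_repository/${name}?pretty"
}
```

Checking the reported state before relying on a snapshot avoids treating a partial or failed run as a valid backup.
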