
Michael DiLeo 74324d5a1b add source code and readme

2025-12-24 14:35:17 +01:00


Elasticsearch Infrastructure

This directory contains the Elasticsearch setup, deployed with the ECK (Elastic Cloud on Kubernetes) operator, that provides full-text search on the Kubernetes cluster.

Architecture

ECK Operator: Production-grade Elasticsearch deployment on Kubernetes
Single-node cluster: One Elasticsearch node, sized for the current 2-node Kubernetes cluster (can be scaled later)
Security enabled: X-Pack security with custom role and user for Mastodon
Longhorn storage: Distributed storage with 2-replica redundancy
Self-signed certificates: Internal cluster communication with TLS

Components

Core Components

namespace.yaml: Elasticsearch system namespace
repository.yaml: Elastic Helm repository
operator.yaml: ECK operator deployment
Storage: uses the existing longhorn-retain storage class, with backup labels on PVCs
cluster.yaml: Elasticsearch and Kibana cluster configuration

Security Components

secret.yaml: SOPS-encrypted credentials for Elasticsearch admin and Mastodon user
security-setup.yaml: Job to create Mastodon role and user after cluster deployment

Monitoring Components

monitoring.yaml: ServiceMonitor for OpenObserve integration + optional Kibana ingress
Built-in metrics: Elasticsearch Prometheus exporter

Services Created

ECK automatically creates these services:

elasticsearch-es-http: HTTPS API access (port 9200)
elasticsearch-es-transport: Internal cluster transport (port 9300)
kibana-kb-http: Kibana web UI (port 5601) - optional management interface

Connection Information

For Applications (Mastodon)

Applications should connect using these connection parameters:

Elasticsearch Connection:

host: elasticsearch-es-http.elasticsearch-system.svc.cluster.local
port: 9200
scheme: https  # ECK uses HTTPS with self-signed certificates
user: mastodon
password: <password from elasticsearch-credentials secret>
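As a quick check of these parameters, the following sketch assembles the base URL and queries cluster health as the mastodon user. The helper names `es_url` and `es_health` and the `ES_PASSWORD` variable are assumptions introduced for illustration, not part of the repository.

```shell
# Connection parameters from the list above.
ES_HOST="elasticsearch-es-http.elasticsearch-system.svc.cluster.local"
ES_PORT="9200"
ES_SCHEME="https"
ES_USER="mastodon"

# Print the base URL assembled from the parameters.
es_url() {
  printf '%s://%s:%s\n' "$ES_SCHEME" "$ES_HOST" "$ES_PORT"
}

# Query cluster health as the Mastodon user (hypothetical helper).
# Expects ES_PASSWORD in the environment; -k accepts the self-signed certificate.
es_health() {
  curl -sk -u "${ES_USER}:${ES_PASSWORD:?set ES_PASSWORD first}" \
    "$(es_url)/_cluster/health?pretty"
}
```

The service name only resolves inside the cluster, so `es_health` would need to run from a pod (or through a port-forward with `ES_HOST` adjusted accordingly).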

Getting Credentials

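The Mastodon user credentials are stored in a SOPS-encrypted secret; the admin password is generated by ECK and stored in its own secret. Both can be read with kubectl: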

# Get the admin password (auto-generated by ECK)
kubectl get secret elasticsearch-es-elastic-user -n elasticsearch-system -o jsonpath="{.data.elastic}" | base64 -d

# Get the Mastodon user password (set during security setup)
kubectl get secret elasticsearch-credentials -n elasticsearch-system -o jsonpath="{.data.password}" | base64 -d
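Both commands follow the same pattern, so they could be wrapped in a small helper. This is a minimal sketch; the function name `get_secret_field` is hypothetical.

```shell
# Read one key from a Kubernetes secret and base64-decode it (hypothetical helper).
# Usage: get_secret_field <namespace> <secret-name> <key>
get_secret_field() {
  local namespace="$1" secret="$2" key="$3"
  kubectl get secret "$secret" -n "$namespace" \
    -o jsonpath="{.data.${key}}" | base64 -d
}
```

For example, `ES_PASSWORD=$(get_secret_field elasticsearch-system elasticsearch-credentials password)` would load the Mastodon user password into a variable without printing it.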

Deployment Steps

1. Encrypt Secrets

Before deploying, encrypt the secrets with SOPS:

# Edit and encrypt the Elasticsearch credentials
sops manifests/infrastructure/elasticsearch/secret.yaml

# Edit and encrypt the Mastodon Elasticsearch credentials  
sops manifests/applications/mastodon/elasticsearch-secret.yaml

2. Deploy Infrastructure

The infrastructure will be deployed automatically by Flux when you commit:

git add manifests/infrastructure/elasticsearch/
git add manifests/cluster/flux-system/elasticsearch.yaml
git add manifests/cluster/flux-system/kustomization.yaml
git commit -m "Add Elasticsearch infrastructure for Mastodon search"
git push

3. Wait for Deployment

# Monitor ECK operator deployment
kubectl get pods -n elasticsearch-system -w

# Monitor Elasticsearch cluster startup
kubectl get elasticsearch -n elasticsearch-system -w

# Check cluster health
kubectl get elasticsearch elasticsearch -n elasticsearch-system -o yaml
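Instead of watching the output by hand, the wait can be scripted. The sketch below polls the `.status.health` field that ECK writes on the Elasticsearch resource; the function name, timeout, and poll interval are assumptions chosen for illustration.

```shell
# Poll the Elasticsearch resource until ECK reports green health (hypothetical helper).
# Usage: wait_for_es_health [timeout_seconds] [poll_interval_seconds]
wait_for_es_health() {
  local timeout="${1:-600}" interval="${2:-10}" elapsed=0 health=""
  while [ "$elapsed" -lt "$timeout" ]; do
    # .status.health is set by the ECK operator (green, yellow, red, or unknown).
    health=$(kubectl get elasticsearch elasticsearch -n elasticsearch-system \
      -o jsonpath='{.status.health}' 2>/dev/null)
    if [ "$health" = "green" ]; then
      echo "Elasticsearch is green after ${elapsed}s"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "Timed out after ${timeout}s (last health: ${health:-unknown})" >&2
  return 1
}
```

Note that a single-node cluster stays yellow if any index is configured with replica shards, so a timeout here does not necessarily mean the deployment failed.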

4. Verify Security Setup

# Check if security setup job completed successfully
kubectl get jobs -n elasticsearch-system

# Verify Mastodon user was created
kubectl logs -n elasticsearch-system job/elasticsearch-security-setup
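For a scripted check, one option is to wait on the job's completion condition and then confirm that the new credentials authenticate. This is a sketch only: the helper names are hypothetical, and the second function relies on the standard `_security/_authenticate` endpoint returning HTTP 200 for valid credentials.

```shell
# Block until the security setup job reports completion (up to 5 minutes).
wait_for_security_setup() {
  kubectl wait --for=condition=complete job/elasticsearch-security-setup \
    -n elasticsearch-system --timeout=300s
}

# Return success if the given credentials authenticate against Elasticsearch.
# Usage: es_can_authenticate <user> <password>
es_can_authenticate() {
  local user="$1" password="$2" status
  # -o /dev/null discards the body; -w prints only the HTTP status code.
  status=$(curl -sk -o /dev/null -w '%{http_code}' -u "${user}:${password}" \
    "https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200/_security/_authenticate")
  [ "$status" = "200" ]
}
```

A call such as `es_can_authenticate mastodon "$ES_PASSWORD" && echo ok` would have to run from inside the cluster, where the service name resolves.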

5. Update Mastodon

After Elasticsearch is running, deploy the updated Mastodon configuration:

git add manifests/applications/mastodon/
git commit -m "Enable Elasticsearch in Mastodon"
git push

6. Populate Search Indices

Once Mastodon is running with Elasticsearch enabled, populate the search indices:

# Get a Mastodon web pod
MASTODON_POD=$(kubectl get pods -n mastodon-application -l app.kubernetes.io/component=web -o jsonpath='{.items[0].metadata.name}')

# Run the search deployment command
kubectl exec -n mastodon-application "$MASTODON_POD" -- bin/tootctl search deploy

Configuration Details

Elasticsearch Configuration

Version: 7.17.27 (latest 7.x compatible with Mastodon)
Preset: single_node_cluster (optimized for single-node deployment)
Memory: 2GB heap size (50% of 4GB container limit)
Storage: 50GB persistent volume with existing longhorn-retain storage class
Security: X-Pack security enabled with custom roles

Security Configuration

Following the Mastodon Elasticsearch documentation, the setup includes:

Custom Role: mastodon_full_access with minimal required permissions
Dedicated User: mastodon with the custom role
TLS Encryption: All connections use HTTPS with self-signed certificates
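The role and user creation performed by the security setup job could look roughly like the sketch below. The privilege list is an assumption based on the role described in the Mastodon Elasticsearch documentation, and the helper names are hypothetical; the actual job in security-setup.yaml may differ.

```shell
ES_URL="https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200"

# JSON body for the custom role (privileges are an assumption, see above).
mastodon_role_payload() {
  cat <<'EOF'
{
  "cluster": ["monitor"],
  "indices": [
    { "names": ["*"], "privileges": ["read", "monitor", "write", "manage"] }
  ]
}
EOF
}

# JSON body for the dedicated user. Usage: mastodon_user_payload <password>
# The password is inserted verbatim, so it must not contain quotes or backslashes.
mastodon_user_payload() {
  printf '{"password": "%s", "roles": ["mastodon_full_access"]}\n' "$1"
}

# Create the role, then the user, authenticating as the elastic superuser.
# Usage: create_mastodon_user <admin-password> <mastodon-password>
create_mastodon_user() {
  local admin_password="$1" mastodon_password="$2"
  mastodon_role_payload | curl -sk -u "elastic:${admin_password}" -X PUT \
    "${ES_URL}/_security/role/mastodon_full_access" \
    -H 'Content-Type: application/json' --data-binary @- || return 1
  mastodon_user_payload "$mastodon_password" | curl -sk -u "elastic:${admin_password}" -X PUT \
    "${ES_URL}/_security/user/mastodon" \
    -H 'Content-Type: application/json' --data-binary @-
}
```

Keeping the payloads in separate functions makes it easy to review exactly which privileges the mastodon user receives before anything is sent to the cluster.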

Performance Configuration

JVM Settings: Optimized for your cluster's resource constraints
Discovery: Single-node discovery (can be changed for multi-node scaling)
Memory: Conservative settings for 2-node cluster compatibility
Storage: Optimized for SSD performance with proper disk watermarks

Mastodon Integration

Search Features Enabled

Once configured, Mastodon will provide full-text search for:

Public statuses from accounts that opted into search results
User's own statuses
User's mentions, favourites, and bookmarks
Account information (display names, usernames, bios)

Search Index Deployment

The tootctl search deploy command will create these indices:

accounts_index: User accounts and profiles
statuses_index: User's own statuses, mentions, favourites, bookmarks
public_statuses_index: Public searchable content
tags_index: Hashtag search
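To confirm that the deploy command created everything, the index listing from Elasticsearch can be compared against the expected names. The helper below is a hypothetical sketch that reads a listing on stdin; the names as they appear in Elasticsearch (assumed here to be the names above without the `_index` suffix) should be checked against the actual `_cat/indices` output.

```shell
# Print each expected index name that has no match in the listing on stdin.
# Usage: <index listing, one name per line> | missing_indices <name>...
missing_indices() {
  local listing name
  listing=$(cat)
  for name in "$@"; do
    # Match on the prefix in case index names carry a suffix.
    printf '%s\n' "$listing" | grep -q "^${name}" || echo "$name"
  done
}
```

A possible invocation, run from inside the cluster, is `curl -sk -u "mastodon:$ES_PASSWORD" "https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200/_cat/indices?h=index" | missing_indices accounts statuses public_statuses tags`; empty output means every expected index was found.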

Monitoring Integration

OpenObserve Metrics

Elasticsearch metrics are automatically collected and sent to OpenObserve:

Cluster Health: Node status, cluster state, allocation
Performance: Query latency, indexing rate, search performance
Storage: Disk usage, index sizes, shard distribution
JVM: Memory usage, garbage collection, heap statistics

Kibana Management UI

Optional Kibana web interface available at https://kibana.keyboardvagabond.com for:

Index management and monitoring
Query development and testing
Cluster configuration and troubleshooting
Visual dashboards for Elasticsearch data

Scaling Considerations

Current Setup

Single-node cluster: Optimized for current 2-node Kubernetes cluster
50GB storage: Sufficient for small-to-medium Mastodon instances
2GB heap: Conservative memory allocation

Future Scaling

When adding more Kubernetes nodes:

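Remove discovery.type: single-node from the cluster configuration, or change it to zen, so that nodes can discover each other
Increase nodeSets.count to 3 for high availability; with only two master-eligible nodes, losing one can leave the cluster unable to elect a master
Change ES_PRESET to small_cluster in Mastodon configuration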
Consider increasing storage and memory allocations

Troubleshooting

Common Issues

Elasticsearch pods pending:

Check storage class and PVC creation
Verify Longhorn is healthy and has available space

Security setup job failing:

Check Elasticsearch cluster health
Verify admin credentials are available
Review job logs for API errors

Mastodon search not working:

Verify Elasticsearch credentials in Mastodon secret
Check network connectivity between namespaces
Ensure search indices are created with tootctl search deploy

Useful Commands

# Check Elasticsearch cluster status
kubectl get elasticsearch -n elasticsearch-system

# View Elasticsearch logs
kubectl logs -n elasticsearch-system -l elasticsearch.k8s.elastic.co/cluster-name=elasticsearch

# Check security setup
kubectl describe job elasticsearch-security-setup -n elasticsearch-system

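# Test connectivity from Mastodon (with security enabled, an unauthenticated request returns 401, which still confirms the service is reachable)
kubectl exec -n mastodon-application deployment/mastodon-web -- curl -k https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200/_cluster/health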

Backup Integration

S3 Backup Strategy

Longhorn Integration: Elasticsearch volumes are automatically backed up to Backblaze B2
Volume Labels: backup.longhorn.io/enable: "true" enables automatic S3 backup
Backup Frequency: Follows existing Longhorn backup schedule

Index Backup

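For additional protection, consider periodic index snapshots. Registering a repository and taking snapshots require cluster-level snapshot privileges that a minimal mastodon role is unlikely to include, so run these commands as the elastic admin user. On Elasticsearch 7.x the s3 repository type also requires the repository-s3 plugin, with the S3 access key and secret key stored in the Elasticsearch keystore.

# Create snapshot repository (one-time setup)
curl -k -u "elastic:$ES_ADMIN_PASSWORD" -X PUT "https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200/_snapshot/s3_repository" -H 'Content-Type: application/json' -d'
{
  "type": "s3",
  "settings": {
    "bucket": "longhorn-backup-bucket",
    "region": "eu-central-003",
    "endpoint": "<REPLACE_WITH_S3_ENDPOINT>"
  }
}'

# Create manual snapshot
curl -k -u "elastic:$ES_ADMIN_PASSWORD" -X PUT "https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200/_snapshot/s3_repository/snapshot_1"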
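Snapshot names must be unique within a repository, so periodic snapshots need a naming scheme. The sketch below uses a UTC timestamp; the scheme and the helper names are assumptions, and `wait_for_completion=true` makes the request block until the snapshot finishes.

```shell
# Build a date-stamped snapshot name (hypothetical naming scheme).
# Usage: snapshot_name [prefix]
snapshot_name() {
  printf '%s_%s\n' "${1:-snapshot}" "$(date -u +%Y%m%d_%H%M%S)"
}

# Create a snapshot in s3_repository and wait for it to finish.
# Usage: create_snapshot <user> <password>
create_snapshot() {
  local name
  name=$(snapshot_name)
  curl -sk -u "$1:$2" -X PUT \
    "https://elasticsearch-es-http.elasticsearch-system.svc.cluster.local:9200/_snapshot/s3_repository/${name}?wait_for_completion=true"
}
```

Running `create_snapshot` from a scheduled job would then produce one uniquely named snapshot per run.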
