esddns

ESDDNS Kopf Operator - Implementation Summary

Overview

A complete Kubernetes operator implementation for ESDDNS with LoadBalancer service deployment, using the Kopf framework for node monitoring and automatic DNS synchronization.

What Was Created

1. Core Operator Implementation

File: k8s/esddns_operator.py

A Kopf-based Kubernetes operator with optimized centralized IP detection:

Architecture:

# Centralized IP Detection (runs once on leader)
class CentralizedIPDetector
├── get_current_ip()         # Retrieve WAN IP
├── update_configmap(ip)     # Store in ConfigMap
└── detect_and_store()       # Main detection logic

# Distributed DNS Updates (runs on all nodes)
class NodeDNSUpdater
├── get_cached_ip()          # Read from ConfigMap
├── fallback_detect_ip()     # Direct detection if needed
├── update_dns(ip)           # Update DNS record
├── sync_dns_from_cache()    # Main sync operation
└── metrics                  # Track reads, failures, fallbacks

Event handlers:

@kopf.timer('ConfigMap')      # Centralized IP detection (leader election)
@kopf.on.event('Node')        # DaemonSet: DNS update on node changes
@kopf.timer('Node')           # DaemonSet: Periodic DNS sync from cache
@kopf.on.update('Deployment') # Monitor service health

2. Kubernetes Manifests

Base Configuration (k8s/base/)

Core cluster setup:

Deployments:

Services:

Configuration:

Environment Overlays

Development (k8s/overlays/development/)

Production (k8s/overlays/production/)

3. Monitoring Integration

Prometheus ServiceMonitor (k8s/monitoring/prometheus-servicemonitor.yaml)

ServiceMonitor:
  - esddns-operator    # Metrics from operator DaemonSet port 8080
  - esddns-service     # Metrics from service deployment port 51339

PrometheusRules (k8s/monitoring/prometheus-rules.yaml)

Alert definitions:

4. Metrics Implementation

File: esddns_service/metrics.py

Prometheus metrics exported:

dns_updates_total               # Successful DNS updates
dns_update_failures_total       # Failed DNS updates
dns_update_duration_seconds     # Time to update DNS
last_dns_update_timestamp       # Last update time
current_wan_ip_info             # Current WAN IP
wan_ip_changes_total            # IP change events
wan_ip_discovery_duration       # Time to discover IP
wan_ip_discovery_failures       # Failed discovery attempts
state_in_sync                   # Sync status
dns_record_ip_matches           # IP match status
service_health                  # Service health (1=up, 0=down)
cache_hits/misses               # Endpoint cache statistics
request_duration_seconds        # HTTP request latency
endpoint_requests_total         # Total HTTP requests

Metric recording functions:

5. Container & Deployment

Dockerfile (k8s/Dockerfile)

FROM python:3.11-slim
WORKDIR /app
RUN apt-get install -y --no-install-recommends curl git
RUN pip install kopf[kubernetes]==1.37.2 prometheus-client==0.18.0 flask-cors==4.0.0
COPY . .
USER esddns:esddns (uid: 1000)
HEALTHCHECK every 30s
CMD ["kopf", "run", "--liveness=http://0.0.0.0:8080/healthz", "k8s/esddns_operator.py"]

6. Deployment Automation

Script: k8s/deploy.sh

Automated deployment with:

./k8s/deploy.sh development  # Deploy to dev
./k8s/deploy.sh production   # Deploy to prod

Features:

7. Documentation

DEPLOYMENT.md - Complete deployment guide

README.md - Quick reference

Architecture Overview

┌──────────────────────────────────────────────────────────┐
│       Kubernetes Cluster (Any Cloud)                     │
├──────────────────────────────────────────────────────────┤
│                                                          │
│  CENTRALIZED IP DETECTION (Leader via Kopf lock)         │
│  ┌──────────────────────────────────────────────────┐    │
│  │ Kopf Operator (Single Instance)                  │    │
│  │ CentralizedIPDetector.detect_wan_ip()            │    │
│  │ • Detects IP every 5 minutes                     │    │
│  │ • Stores in ConfigMap: esddns-wan-ip             │    │
│  └────────────────┬─────────────────────────────────┘    │
│                   │                                      │
│                   ▼                                      │
│       ┌─────────────────────────┐                        │
│       │ ConfigMap               │                        │
│       │ esddns-wan-ip           │                        │
│       │ current_ip: X.X.X.X     │                        │
│       │ detected_at: timestamp  │                        │
│       └────────┬────────────────┘                        │
│                │                                         │
│  DISTRIBUTED DNS UPDATES (All DaemonSet pods)            │
│  Node 1              Node 2        Node N                │
│  ┌──────────────────┐ ┌──────────────────┐               │
│  │ Operator Pod     │ │ Operator Pod     │ ...           │
│  │ (hostNetwork)    │ │ (hostNetwork)    │               │
│  │ NodeDNSUpdater   │ │ NodeDNSUpdater   │               │
│  │ • Read ConfigMap │ │ • Read ConfigMap │               │
│  │ • Update DNS     │ │ • Update DNS     │               │
│  │ • Fallback if    │ │ • Fallback if    │               │
│  │   stale/missing  │ │   stale/missing  │               │
│  └────┬─────────────┘ └────┬─────────────┘               │
│       │                    │                             │
│       └────────────────────┴──────────────┐              │
│                                           │              │
│  ┌──────────────────────────────────┐     │              │
│  │  esddns-service (Deployment)     │     │              │
│  │  Flask web service (1 replica)   │     │              │
│  │  :51339 HTTP endpoint            │     │              │
│  └──────────────────────────────────┘     │              │
│             │                             │              │
│  ┌──────────▼────────────────────┐        │              │
│  │  LoadBalancer Service         │        │              │
│  │  External IP (cloud provider) │        │              │
│  │  Ports: 80, 443 → 51339       │        │              │
│  └──────────┬────────────────────┘        │              │
│             │                             │              │
└─────────────┼─────────────────────────────┘              │
              │                                            │
        ┌─────▼────────────────────────┐                   │
        │  Gandi.net LiveDNS API       │◄──────────────────┘
        │  A Record Updates            │
        │  https://api.gandi.net/v5    │
        └──────────────────────────────┘

Configuration:
├── ConfigMap: esddns-config (dns.ini bundled)
├── ConfigMap: esddns-wan-ip (cached WAN IP + timestamp)
├── Secret: esddns-gandi-credentials (API key)
├── RBAC: ServiceAccount + ClusterRole (with ConfigMap read/write)
└── Monitoring: Prometheus ServiceMonitor + Rules

Integration Points

Existing Code Reuse

New Components

How They Work Together

NodeIPMonitor (operator.py)
  ├─ Uses: ESDDNS class → esddns.py
  ├─ Uses: DomainManagement → api/dns_manager.py
  ├─ Uses: WANIPState → api/get_ip.py
  ├─ Uses: logger_wrapper → api/logs.py
  └─ Reads: dns.ini from ConfigMap

esddns-service Deployment
  ├─ Runs: esddns_endpoint.py (Flask)
  ├─ Uses: metrics.py for Prometheus export
  ├─ Reads: ConfigMap for dns.ini
  ├─ Reads: Secret for API key
  └─ Exposes: /raw and / endpoints

Deployment Scenarios

AWS EKS

# LoadBalancer creates AWS ELB automatically
./k8s/deploy.sh production
kubectl get svc esddns-service -n esddns
# Type: LoadBalancer
# External IP: <AWS ELB IP>

Google GKE

# LoadBalancer creates Google Cloud LB
./k8s/deploy.sh production
kubectl get svc esddns-service -n esddns
# Type: LoadBalancer
# External IP: <GCP Load Balancer IP>

Azure AKS

# LoadBalancer creates Azure LB
./k8s/deploy.sh production
kubectl get svc esddns-service -n esddns
# Type: LoadBalancer
# External IP: <Azure LB IP>

On-Premises

# For non-cloud clusters, use NodePort instead
kubectl patch svc esddns-service -n esddns \
  -p '{"spec":{"type":"NodePort"}}'
# Access via: <node-ip>:<node-port>

Key Features

Centralized IP Detection - Single leader detects WAN IP once (90-95% fewer API calls)

Distributed DNS Updates - All nodes update DNS from ConfigMap cache

High Availability - Fallback to direct detection if ConfigMap unavailable/stale

Network Isolation Safe - Works even if nodes can’t reach external services

Zero Configuration Drift - All nodes use same detected IP source

Cloud Integration - LoadBalancer with stable external IP

Prometheus Metrics - Full observability with fallback tracking

Alert Rules - PrometheusRules for critical events

Environment Isolation - Dev/prod kustomization overlays

Configuration Management - ConfigMap + Secrets + IP cache

Health Checks - Liveness and readiness probes

RBAC - Minimal required permissions (with ConfigMap access)

Automated Deployment - deploy.sh script

Complete Documentation - DEPLOYMENT.md + README.md

Security - Non-root for web service, secrets management

Getting Started

1. Quick Deploy

cd k8s
export GANDI_API_KEY="your-api-key"
./deploy.sh production

2. Verify Deployment

kubectl get all -n esddns
kubectl logs -n esddns -l app=esddns-operator -f

3. Get External IP

kubectl get svc -n esddns esddns-service
# Wait for External-IP to be assigned

4. Access Service

EXTERNAL_IP=$(kubectl get svc -n esddns esddns-service \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl http://$EXTERNAL_IP/

Next Steps

  1. Update dns.ini - Modify target domain and record settings in ConfigMap
  2. Update API Key - Create proper secret with real Gandi API key
  3. Monitor Setup - Install Prometheus and configure ServiceMonitor scraping
  4. Alerting - Apply PrometheusRules for notifications
  5. CI/CD Integration - Automate builds and deployments
  6. Network Policies - Restrict traffic for security

Files Created

k8s/
├── esddns_operator.py                 # Kopf operator
├── Dockerfile                         # Container image
├── deploy.sh                          # Deployment script
├── README.md                          # Quick reference
├── DEPLOYMENT.md                      # Detailed guide
├── base/
│   ├── kustomization.yaml
│   ├── namespace.yaml
│   ├── serviceaccount.yaml
│   ├── clusterrole.yaml
│   ├── clusterrolebinding.yaml
│   ├── daemon-deployment.yaml
│   ├── service-deployment.yaml
│   ├── service.yaml
│   ├── configmap.yaml
│   └── secrets.yaml
├── overlays/
│   ├── development/
│   │   ├── kustomization.yaml
│   │   ├── daemon-dev-patch.yaml
│   │   └── service-dev-patch.yaml
│   └── production/
│       ├── kustomization.yaml
│       ├── daemon-prod-patch.yaml
│       └── service-prod-patch.yaml
└── monitoring/
    ├── prometheus-servicemonitor.yaml
    └── prometheus-rules.yaml

esddns_service/
└── metrics.py                         # Prometheus metrics

requirements-k8s.txt                  # K8s-specific dependencies

OPERATOR_SUMMARY.md                   # This file

Customization

Change Domain

kubectl edit configmap esddns-config -n esddns
# Update target-domain field
kubectl rollout restart daemonset/esddns-operator-daemon -n esddns

Change Update Interval

# Edit daemon_thread_interval in ConfigMap (seconds)
kubectl patch configmap esddns-config -n esddns \
  -p '{"data":{"daemon_thread_interval":"600"}}'

Adjust Resources

Edit overlays/production/(daemon|service)-prod-patch.yaml

Change Replicas

Edit overlays/production/kustomization.yaml replicas field

Troubleshooting

See DEPLOYMENT.md for detailed troubleshooting steps.

Quick checks:

# Operator status
kubectl get daemonset -n esddns esddns-operator-daemon
kubectl logs -n esddns -l app=esddns-operator -f

# Service status
kubectl get deployment -n esddns esddns-service
kubectl logs -n esddns -l app=esddns-service -f

# LoadBalancer status
kubectl get svc -n esddns esddns-service
kubectl describe svc -n esddns esddns-service

# Metrics
kubectl port-forward -n esddns daemonset/esddns-operator-daemon 8080:8080
curl http://localhost:8080/metrics

Support & Issues

  1. Check DEPLOYMENT.md troubleshooting
  2. Review operator logs
  3. Check Prometheus metrics
  4. Open GitHub issue with logs and configuration

Implementation Date: November 12, 2025 Framework: Kopf 1.37.2 + Kubernetes Python Version: 3.11+ License: MIT (same as ESDDNS)