Netcup RS 8000 Migration & AI Orchestration Setup Plan
🎯 Overview
Complete migration plan from DigitalOcean droplets to the Netcup RS 8000 G12 Pro, with a smart AI orchestration layer that routes between local CPU (RS 8000) and serverless GPU (RunPod).
Server Specs:
- 20 cores, 64GB RAM, 3TB storage
- IP: 159.195.32.209
- Location: Germany (EU)
- SSH:
ssh netcup
Expected Savings: $64-134/month ($768-1,608/year) - see the cost breakdown at the end of this plan
📋 Phase 1: Pre-Migration Preparation
1.1 Inventory Current Services
DigitalOcean Main Droplet (143.198.39.165):
# Document all running services
ssh droplet "docker ps --format '{{.Names}}\t{{.Image}}\t{{.Ports}}'"
ssh droplet "pm2 list"
ssh droplet "systemctl list-units --type=service --state=running"
# Backup configurations
ssh droplet "tar -czf ~/configs-backup.tar.gz /etc/nginx /etc/systemd/system ~/.config"
scp droplet:~/configs-backup.tar.gz ~/backups/droplet-configs-$(date +%Y%m%d).tar.gz
DigitalOcean AI Services Droplet (178.128.238.87):
# Document AI services
ssh ai-droplet "docker ps --format '{{.Names}}\t{{.Image}}\t{{.Ports}}'"
ssh ai-droplet "nvidia-smi" # Check GPU usage
ssh ai-droplet "df -h" # Check disk usage for models
# Backup AI model weights and configs
ssh ai-droplet "tar -czf ~/ai-models-backup.tar.gz ~/models ~/.cache/huggingface"
scp ai-droplet:~/ai-models-backup.tar.gz ~/backups/ai-models-$(date +%Y%m%d).tar.gz
Create Service Inventory Document:
cat > ~/migration-inventory.md << 'EOF'
# Service Inventory
## Main Droplet (143.198.39.165)
- [ ] nginx reverse proxy
- [ ] canvas-website
- [ ] Other web apps: ________________
- [ ] Databases: ________________
- [ ] Monitoring: ________________
## AI Droplet (178.128.238.87)
- [ ] Stable Diffusion
- [ ] Ollama/LLM services
- [ ] Model storage location: ________________
- [ ] Current GPU usage: ________________
## Data to Migrate
- [ ] Databases (size: ___GB)
- [ ] User uploads (size: ___GB)
- [ ] AI models (size: ___GB)
- [ ] Configuration files
- [ ] SSL certificates
- [ ] Environment variables
EOF
1.2 Test Netcup RS 8000 Access
# Verify SSH access
ssh netcup "hostname && uname -a && df -h"
# Check system resources
ssh netcup "nproc && free -h && lscpu | grep 'Model name'"
# Install basic tools
ssh netcup "apt update && apt install -y docker.io docker-compose git htop ncdu curl wget"
# Configure Docker
ssh netcup "systemctl enable docker && systemctl start docker"
ssh netcup "docker run hello-world"
1.3 Setup Directory Structure on Netcup
ssh netcup << 'EOF'
# Create organized directory structure
mkdir -p /opt/{ai-orchestrator,apps,databases,monitoring,backups}
mkdir -p /data/{models,uploads,databases}
mkdir -p /etc/docker/compose
# Set permissions
chown -R $USER:$USER /opt /data
chmod 755 /opt /data
ls -la /opt /data
EOF
📋 Phase 2: Deploy AI Orchestration Infrastructure
2.1 Transfer AI Orchestration Stack
# Create the AI orchestration directory structure
cat > /tmp/create-ai-orchestrator.sh << 'SCRIPT'
#!/bin/bash
set -e
BASE_DIR="/opt/ai-orchestrator"
mkdir -p $BASE_DIR/{services/{router,workers,monitor},configs,data/{redis,postgres,prometheus}}
echo "✅ Created AI orchestrator directory structure"
ls -R $BASE_DIR
SCRIPT
# Copy to Netcup and execute
scp /tmp/create-ai-orchestrator.sh netcup:/tmp/
ssh netcup "chmod +x /tmp/create-ai-orchestrator.sh && /tmp/create-ai-orchestrator.sh"
2.2 Deploy Docker Compose Stack
Create main docker-compose.yml:
ssh netcup "cat > /opt/ai-orchestrator/docker-compose.yml" << 'EOF'
version: '3.8'
services:
# Redis for job queues
redis:
image: redis:7-alpine
container_name: ai-redis
ports:
- "6379:6379"
volumes:
- ./data/redis:/data
command: redis-server --appendonly yes
restart: unless-stopped
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 5s
timeout: 3s
retries: 5
# PostgreSQL for job history and analytics
postgres:
image: postgres:15-alpine
container_name: ai-postgres
environment:
POSTGRES_DB: ai_orchestrator
POSTGRES_USER: aiuser
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-changeme}
ports:
- "5432:5432"
volumes:
- ./data/postgres:/var/lib/postgresql/data
restart: unless-stopped
healthcheck:
test: ["CMD-SHELL", "pg_isready -U aiuser"]
interval: 5s
timeout: 3s
retries: 5
# Smart Router API (FastAPI)
router:
build: ./services/router
container_name: ai-router
ports:
- "8000:8000"
environment:
REDIS_URL: redis://redis:6379
DATABASE_URL: postgresql://aiuser:${POSTGRES_PASSWORD:-changeme}@postgres:5432/ai_orchestrator
RUNPOD_API_KEY: ${RUNPOD_API_KEY}
OLLAMA_URL: http://ollama:11434
SD_CPU_URL: http://stable-diffusion-cpu:7860
depends_on:
redis:
condition: service_healthy
postgres:
condition: service_healthy
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
interval: 10s
timeout: 5s
retries: 3
# Text Worker (processes text generation queue)
text-worker:
build: ./services/workers
# no container_name here so the service can be scaled to multiple replicas
environment:
REDIS_URL: redis://redis:6379
DATABASE_URL: postgresql://aiuser:${POSTGRES_PASSWORD:-changeme}@postgres:5432/ai_orchestrator
WORKER_TYPE: text
OLLAMA_URL: http://ollama:11434
RUNPOD_API_KEY: ${RUNPOD_API_KEY}
depends_on:
- redis
- postgres
- router
restart: unless-stopped
deploy:
replicas: 2
# Image Worker (processes image generation queue)
image-worker:
build: ./services/workers
container_name: ai-image-worker
environment:
REDIS_URL: redis://redis:6379
DATABASE_URL: postgresql://aiuser:${POSTGRES_PASSWORD:-changeme}@postgres:5432/ai_orchestrator
WORKER_TYPE: image
SD_CPU_URL: http://stable-diffusion-cpu:7860
RUNPOD_API_KEY: ${RUNPOD_API_KEY}
depends_on:
- redis
- postgres
- router
restart: unless-stopped
# Code Worker (processes code generation queue)
code-worker:
build: ./services/workers
container_name: ai-code-worker
environment:
REDIS_URL: redis://redis:6379
DATABASE_URL: postgresql://aiuser:${POSTGRES_PASSWORD:-changeme}@postgres:5432/ai_orchestrator
WORKER_TYPE: code
OLLAMA_URL: http://ollama:11434
depends_on:
- redis
- postgres
- router
restart: unless-stopped
# Video Worker (processes video generation queue - always RunPod)
video-worker:
build: ./services/workers
container_name: ai-video-worker
environment:
REDIS_URL: redis://redis:6379
DATABASE_URL: postgresql://aiuser:${POSTGRES_PASSWORD:-changeme}@postgres:5432/ai_orchestrator
WORKER_TYPE: video
RUNPOD_API_KEY: ${RUNPOD_API_KEY}
RUNPOD_VIDEO_ENDPOINT_ID: ${RUNPOD_VIDEO_ENDPOINT_ID}
depends_on:
- redis
- postgres
- router
restart: unless-stopped
# Ollama (local LLM server)
ollama:
image: ollama/ollama:latest
container_name: ai-ollama
ports:
- "11434:11434"
volumes:
- /data/models/ollama:/root/.ollama
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
interval: 30s
timeout: 10s
retries: 3
# Stable Diffusion (CPU fallback)
stable-diffusion-cpu:
image: ghcr.io/stablecog/sc-worker:latest
container_name: ai-sd-cpu
ports:
- "7860:7860"
volumes:
- /data/models/stable-diffusion:/models
environment:
USE_CPU: "true"
MODEL_PATH: /models/sd-v2.1
restart: unless-stopped
# Cost Monitor & Analytics
monitor:
build: ./services/monitor
container_name: ai-monitor
ports:
- "3000:3000"
environment:
REDIS_URL: redis://redis:6379
DATABASE_URL: postgresql://aiuser:${POSTGRES_PASSWORD:-changeme}@postgres:5432/ai_orchestrator
depends_on:
- redis
- postgres
restart: unless-stopped
# Prometheus (metrics collection)
prometheus:
image: prom/prometheus:latest
container_name: ai-prometheus
ports:
- "9090:9090"
volumes:
- ./configs/prometheus.yml:/etc/prometheus/prometheus.yml
- ./data/prometheus:/prometheus
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--storage.tsdb.path=/prometheus'
restart: unless-stopped
# Grafana (dashboards)
grafana:
image: grafana/grafana:latest
container_name: ai-grafana
ports:
- "3001:3000"
volumes:
- ./data/grafana:/var/lib/grafana
- ./configs/grafana-dashboards:/etc/grafana/provisioning/dashboards
environment:
GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD:-admin}
depends_on:
- prometheus
restart: unless-stopped
networks:
default:
name: ai-orchestrator-network
EOF
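Before deploying, create the Prometheus config that the compose file mounts (otherwise Docker mounts an empty directory in its place and Prometheus fails to start), and pre-create the Grafana data directory owned by uid 472, which the official Grafana image runs as. The scrape config below is a minimal self-scrape sketch; the router does not expose a /metrics endpoint yet, so only add a job for it once one exists.
ssh netcup "mkdir -p /opt/ai-orchestrator/configs /opt/ai-orchestrator/data/grafana"
ssh netcup "chown -R 472:472 /opt/ai-orchestrator/data/grafana"
ssh netcup "cat > /opt/ai-orchestrator/configs/prometheus.yml" << 'EOF'
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]
  # Add a job for the router here once it exposes /metrics
EOF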
2.3 Create Smart Router Service
ssh netcup "mkdir -p /opt/ai-orchestrator/services/router"
ssh netcup "cat > /opt/ai-orchestrator/services/router/Dockerfile" << 'EOF'
FROM python:3.11-slim
WORKDIR /app
# curl is needed by the compose healthcheck; python:3.11-slim does not ship it
RUN apt-get update && apt-get install -y --no-install-recommends curl && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir \
fastapi==0.104.1 \
uvicorn[standard]==0.24.0 \
redis==5.0.1 \
asyncpg==0.29.0 \
httpx==0.25.1 \
pydantic==2.5.0 \
pydantic-settings==2.1.0
COPY main.py .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
EOF
Create Router API:
ssh netcup "cat > /opt/ai-orchestrator/services/router/main.py" << 'EOF'
from fastapi import FastAPI, HTTPException, BackgroundTasks
from pydantic import BaseModel
from typing import Optional, Literal
import redis.asyncio as redis
import asyncpg
import httpx
import json
import time
import os
from datetime import datetime
import uuid
app = FastAPI(title="AI Orchestrator", version="1.0.0")
# Configuration
REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
DATABASE_URL = os.getenv("DATABASE_URL")
RUNPOD_API_KEY = os.getenv("RUNPOD_API_KEY")
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://localhost:11434")
SD_CPU_URL = os.getenv("SD_CPU_URL", "http://localhost:7860")
# Redis connection pool
redis_pool = None
@app.on_event("startup")
async def startup():
global redis_pool
redis_pool = redis.ConnectionPool.from_url(REDIS_URL, decode_responses=True)
@app.on_event("shutdown")
async def shutdown():
if redis_pool:
await redis_pool.disconnect()
# Request Models
class TextGenerationRequest(BaseModel):
prompt: str
model: str = "llama3-70b"
priority: Literal["low", "normal", "high"] = "normal"
user_id: Optional[str] = None
wait: bool = False # Wait for result or return job_id
class ImageGenerationRequest(BaseModel):
prompt: str
model: str = "sdxl"
priority: Literal["low", "normal", "high"] = "normal"
size: str = "1024x1024"
user_id: Optional[str] = None
wait: bool = False
class VideoGenerationRequest(BaseModel):
prompt: str
model: str = "wan2.1-i2v"
duration: int = 3 # seconds
user_id: Optional[str] = None
wait: bool = False
class CodeGenerationRequest(BaseModel):
prompt: str
language: str = "python"
priority: Literal["low", "normal", "high"] = "normal"
user_id: Optional[str] = None
wait: bool = False
# Response Models
class JobResponse(BaseModel):
job_id: str
status: str
message: str
class ResultResponse(BaseModel):
job_id: str
status: str
result: Optional[dict] = None
cost: Optional[float] = None
provider: Optional[str] = None
processing_time: Optional[float] = None
# Health Check
@app.get("/health")
async def health_check():
return {"status": "healthy", "timestamp": datetime.utcnow().isoformat()}
# Smart Routing Logic
async def route_text_job(request: TextGenerationRequest) -> str:
"""
Text routing logic:
- Always use local Ollama (FREE, fast enough with 20 cores)
- Only use RunPod for extremely large context or special models
"""
return "local" # 99% of text goes to local CPU
async def route_image_job(request: ImageGenerationRequest) -> str:
"""
Image routing logic:
- Low priority → Local SD CPU (slow but FREE)
- Normal priority → Check queue depth, route to faster option
- High priority → RunPod GPU (fast, $0.02)
"""
if request.priority == "high":
return "runpod"
if request.priority == "low":
return "local"
# Normal priority: check queue depth
r = redis.Redis(connection_pool=redis_pool)
queue_depth = await r.llen("queue:image:local")
# If local queue is backed up (>10 jobs), use RunPod for faster response
if queue_depth > 10:
return "runpod"
return "local"
async def route_video_job(request: VideoGenerationRequest) -> str:
"""
Video routing logic:
- Always RunPod (no local option for video generation)
"""
return "runpod"
async def route_code_job(request: CodeGenerationRequest) -> str:
"""
Code routing logic:
- Always local (CodeLlama/DeepSeek on Ollama)
"""
return "local"
# Text Generation Endpoint
@app.post("/generate/text", response_model=JobResponse)
async def generate_text(request: TextGenerationRequest, background_tasks: BackgroundTasks):
job_id = str(uuid.uuid4())
provider = await route_text_job(request)
# Add to queue
r = redis.Redis(connection_pool=redis_pool)
job_data = {
"job_id": job_id,
"type": "text",
"provider": provider,
"request": request.dict(),
"created_at": datetime.utcnow().isoformat(),
"status": "queued"
}
await r.lpush(f"queue:text:{provider}", json.dumps(job_data))
await r.set(f"job:{job_id}", json.dumps(job_data))
return JobResponse(
job_id=job_id,
status="queued",
message=f"Job queued on {provider} provider"
)
# Image Generation Endpoint
@app.post("/generate/image", response_model=JobResponse)
async def generate_image(request: ImageGenerationRequest):
job_id = str(uuid.uuid4())
provider = await route_image_job(request)
r = redis.Redis(connection_pool=redis_pool)
job_data = {
"job_id": job_id,
"type": "image",
"provider": provider,
"request": request.dict(),
"created_at": datetime.utcnow().isoformat(),
"status": "queued"
}
await r.lpush(f"queue:image:{provider}", json.dumps(job_data))
await r.set(f"job:{job_id}", json.dumps(job_data))
return JobResponse(
job_id=job_id,
status="queued",
message=f"Job queued on {provider} provider (priority: {request.priority})"
)
# Video Generation Endpoint
@app.post("/generate/video", response_model=JobResponse)
async def generate_video(request: VideoGenerationRequest):
job_id = str(uuid.uuid4())
provider = "runpod" # Always RunPod for video
r = redis.Redis(connection_pool=redis_pool)
job_data = {
"job_id": job_id,
"type": "video",
"provider": provider,
"request": request.dict(),
"created_at": datetime.utcnow().isoformat(),
"status": "queued"
}
await r.lpush(f"queue:video:{provider}", json.dumps(job_data))
await r.set(f"job:{job_id}", json.dumps(job_data))
return JobResponse(
job_id=job_id,
status="queued",
message="Video generation queued on RunPod GPU"
)
# Code Generation Endpoint
@app.post("/generate/code", response_model=JobResponse)
async def generate_code(request: CodeGenerationRequest):
job_id = str(uuid.uuid4())
provider = "local" # Always local for code
r = redis.Redis(connection_pool=redis_pool)
job_data = {
"job_id": job_id,
"type": "code",
"provider": provider,
"request": request.dict(),
"created_at": datetime.utcnow().isoformat(),
"status": "queued"
}
await r.lpush(f"queue:code:{provider}", json.dumps(job_data))
await r.set(f"job:{job_id}", json.dumps(job_data))
return JobResponse(
job_id=job_id,
status="queued",
message="Code generation queued on local provider"
)
# Job Status Endpoint
@app.get("/job/{job_id}", response_model=ResultResponse)
async def get_job_status(job_id: str):
r = redis.Redis(connection_pool=redis_pool)
job_data = await r.get(f"job:{job_id}")
if not job_data:
raise HTTPException(status_code=404, detail="Job not found")
job = json.loads(job_data)
return ResultResponse(
job_id=job_id,
status=job.get("status", "unknown"),
result=job.get("result"),
cost=job.get("cost"),
provider=job.get("provider"),
processing_time=job.get("processing_time")
)
# Queue Status Endpoint
@app.get("/queue/status")
async def get_queue_status():
r = redis.Redis(connection_pool=redis_pool)
queues = {
"text_local": await r.llen("queue:text:local"),
"text_runpod": await r.llen("queue:text:runpod"),
"image_local": await r.llen("queue:image:local"),
"image_runpod": await r.llen("queue:image:runpod"),
"video_runpod": await r.llen("queue:video:runpod"),
"code_local": await r.llen("queue:code:local"),
}
return {
"queues": queues,
"total_pending": sum(queues.values()),
"timestamp": datetime.utcnow().isoformat()
}
# Cost Summary Endpoint
@app.get("/costs/summary")
async def get_cost_summary():
# This would query PostgreSQL for cost data
# For now, return mock data
return {
"today": {
"local": 0.00,
"runpod": 2.45,
"total": 2.45
},
"this_month": {
"local": 0.00,
"runpod": 45.20,
"total": 45.20
},
"breakdown": {
"text": 0.00,
"image": 12.50,
"video": 32.70,
"code": 0.00
}
}
EOF
2.4 Create Worker Service
ssh netcup "cat > /opt/ai-orchestrator/services/workers/Dockerfile" << 'EOF'
FROM python:3.11-slim
WORKDIR /app
RUN pip install --no-cache-dir \
redis==5.0.1 \
asyncpg==0.29.0 \
httpx==0.25.1 \
openai==1.3.0
COPY worker.py .
CMD ["python", "worker.py"]
EOF
ssh netcup "cat > /opt/ai-orchestrator/services/workers/worker.py" << 'EOF'
import redis
import json
import os
import time
import httpx
import asyncio
from datetime import datetime
REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
WORKER_TYPE = os.getenv("WORKER_TYPE", "text")
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://localhost:11434")
SD_CPU_URL = os.getenv("SD_CPU_URL", "http://localhost:7860")
RUNPOD_API_KEY = os.getenv("RUNPOD_API_KEY")
r = redis.Redis.from_url(REDIS_URL, decode_responses=True)
async def process_text_job(job_data):
"""Process text generation job using Ollama"""
request = job_data["request"]
provider = job_data["provider"]
start_time = time.time()
if provider == "local":
# Use Ollama
async with httpx.AsyncClient() as client:
response = await client.post(
f"{OLLAMA_URL}/api/generate",
json={
"model": request["model"],
"prompt": request["prompt"],
"stream": False
},
timeout=120.0
)
result = response.json()
return {
"text": result.get("response", ""),
"cost": 0.00, # Local is free
"provider": "ollama",
"processing_time": time.time() - start_time
}
else:
# Use RunPod (fallback)
# Implementation for RunPod text endpoint
return {
"text": "RunPod text generation",
"cost": 0.01,
"provider": "runpod",
"processing_time": time.time() - start_time
}
async def process_image_job(job_data):
"""Process image generation job"""
request = job_data["request"]
provider = job_data["provider"]
start_time = time.time()
if provider == "local":
# Use local Stable Diffusion (CPU)
async with httpx.AsyncClient() as client:
response = await client.post(
f"{SD_CPU_URL}/sdapi/v1/txt2img",
json={
"prompt": request["prompt"],
"steps": 20,
"width": 512,
"height": 512
},
timeout=180.0
)
result = response.json()
return {
"image_url": result.get("images", [""])[0],
"cost": 0.00, # Local is free
"provider": "stable-diffusion-cpu",
"processing_time": time.time() - start_time
}
else:
# Use RunPod SDXL
# Implementation for RunPod image endpoint
return {
"image_url": "runpod_image_url",
"cost": 0.02,
"provider": "runpod-sdxl",
"processing_time": time.time() - start_time
}
async def process_video_job(job_data):
"""Process video generation job (always RunPod)"""
request = job_data["request"]
start_time = time.time()
# Implementation for RunPod video endpoint (Wan2.1)
return {
"video_url": "runpod_video_url",
"cost": 0.50,
"provider": "runpod-wan2.1",
"processing_time": time.time() - start_time
}
async def process_code_job(job_data):
"""Process code generation job (local only)"""
request = job_data["request"]
start_time = time.time()
# Use Ollama with CodeLlama
async with httpx.AsyncClient() as client:
response = await client.post(
f"{OLLAMA_URL}/api/generate",
json={
"model": "codellama",
"prompt": request["prompt"],
"stream": False
},
timeout=120.0
)
result = response.json()
return {
"code": result.get("response", ""),
"cost": 0.00,
"provider": "ollama-codellama",
"processing_time": time.time() - start_time
}
async def worker_loop():
"""Main worker loop"""
print(f"🚀 Starting {WORKER_TYPE} worker...")
processors = {
"text": process_text_job,
"image": process_image_job,
"video": process_video_job,
"code": process_code_job
}
processor = processors.get(WORKER_TYPE)
if not processor:
raise ValueError(f"Unknown worker type: {WORKER_TYPE}")
while True:
try:
# Try both local and runpod queues
for provider in ["local", "runpod"]:
queue_name = f"queue:{WORKER_TYPE}:{provider}"
# Block for 1 second waiting for job
job_json = r.brpop(queue_name, timeout=1)
if job_json:
_, job_data_str = job_json
job_data = json.loads(job_data_str)
job_id = job_data["job_id"]
print(f"📝 Processing job {job_id} ({WORKER_TYPE}/{provider})")
# Update status to processing
job_data["status"] = "processing"
r.set(f"job:{job_id}", json.dumps(job_data))
try:
# Process the job
result = await processor(job_data)
# Update job with result
job_data["status"] = "completed"
job_data["result"] = result
job_data["cost"] = result.get("cost", 0)
job_data["processing_time"] = result.get("processing_time", 0)
job_data["completed_at"] = datetime.utcnow().isoformat()
r.set(f"job:{job_id}", json.dumps(job_data))
print(f"✅ Completed job {job_id} (cost: ${result.get('cost', 0):.4f})")
except Exception as e:
print(f"❌ Error processing job {job_id}: {e}")
job_data["status"] = "failed"
job_data["error"] = str(e)
r.set(f"job:{job_id}", json.dumps(job_data))
break # Processed a job, start loop again
# Small delay to prevent tight loop
await asyncio.sleep(0.1)
except Exception as e:
print(f"❌ Worker error: {e}")
await asyncio.sleep(5)
if __name__ == "__main__":
asyncio.run(worker_loop())
EOF
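The RunPod branches in worker.py are left as placeholders. Below is a minimal sketch of the call they would make, assuming RunPod's serverless queue API (POST /run, then poll GET /status/{id}) and an endpoint that accepts the request fields as its input payload; verify the exact input/output schema against your deployed endpoints before wiring it in.
# Sketch only: could be added to worker.py and called from the RunPod branches
import os
import asyncio
import httpx

RUNPOD_API_KEY = os.getenv("RUNPOD_API_KEY")

async def run_runpod_job(endpoint_id: str, payload: dict, poll_seconds: float = 2.0) -> dict:
    """Submit a job to a RunPod serverless endpoint and poll until it finishes."""
    headers = {"Authorization": f"Bearer {RUNPOD_API_KEY}"}
    base = f"https://api.runpod.ai/v2/{endpoint_id}"
    async with httpx.AsyncClient(timeout=60.0) as client:
        # Queue the job
        submit = await client.post(f"{base}/run", headers=headers, json={"input": payload})
        submit.raise_for_status()
        runpod_id = submit.json()["id"]
        # Poll until the endpoint reports a terminal status
        while True:
            status = await client.get(f"{base}/status/{runpod_id}", headers=headers)
            status.raise_for_status()
            data = status.json()
            if data.get("status") == "COMPLETED":
                return data.get("output", {})
            if data.get("status") in ("FAILED", "CANCELLED", "TIMED_OUT"):
                raise RuntimeError(f"RunPod job {runpod_id} ended with status {data.get('status')}")
            await asyncio.sleep(poll_seconds)
process_video_job could then call run_runpod_job(os.getenv("RUNPOD_VIDEO_ENDPOINT_ID"), {"prompt": request["prompt"], "duration": request["duration"]}) and map the returned output to video_url and a real cost figure.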
2.5 Create Environment Configuration
ssh netcup "cat > /opt/ai-orchestrator/.env" << 'EOF'
# PostgreSQL
POSTGRES_PASSWORD=change_this_password_$(openssl rand -hex 16)
# RunPod API Keys
RUNPOD_API_KEY=your_runpod_api_key_here
RUNPOD_TEXT_ENDPOINT_ID=your_text_endpoint_id
RUNPOD_IMAGE_ENDPOINT_ID=your_image_endpoint_id
RUNPOD_VIDEO_ENDPOINT_ID=your_video_endpoint_id
# Grafana
GRAFANA_PASSWORD=change_this_password_$(openssl rand -hex 16)
# Monitoring
ALERT_EMAIL=your@email.com
COST_ALERT_THRESHOLD=100 # Alert if daily cost exceeds 100 USD
EOF
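The compose file also builds ./services/monitor, which this plan never populates, so docker-compose would fail at build time without it. A minimal placeholder service keeps the stack deployable until the real cost monitor is written; the endpoints and response shapes below are assumptions chosen to match the URLs used in Phase 8.2.
ssh netcup "cat > /opt/ai-orchestrator/services/monitor/Dockerfile" << 'EOF'
FROM python:3.11-slim
WORKDIR /app
RUN pip install --no-cache-dir fastapi==0.104.1 uvicorn[standard]==0.24.0
COPY main.py .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "3000"]
EOF
ssh netcup "cat > /opt/ai-orchestrator/services/monitor/main.py" << 'EOF'
from fastapi import FastAPI

app = FastAPI(title="AI Cost Monitor (placeholder)")

@app.get("/health")
async def health():
    return {"status": "healthy"}

@app.get("/api/costs/summary")
async def costs_summary():
    # Placeholder values until costs are aggregated from the job history in PostgreSQL
    return {"today": {"local": 0.0, "runpod": 0.0, "total": 0.0}}

@app.get("/api/recommendations")
async def recommendations():
    return {"recommendations": ["No cost data collected yet"]}
EOF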
2.6 Deploy AI Orchestration Stack
# Deploy the stack
ssh netcup "cd /opt/ai-orchestrator && docker-compose up -d"
# Check status
ssh netcup "cd /opt/ai-orchestrator && docker-compose ps"
# View logs
ssh netcup "cd /opt/ai-orchestrator && docker-compose logs -f router"
# Test health
ssh netcup "curl http://localhost:8000/health"
ssh netcup "curl http://localhost:8000/docs" # API documentation
📋 Phase 3: Setup Local AI Models
3.1 Download and Configure Ollama Models
# Pull recommended models
ssh netcup << 'EOF'
docker exec ai-ollama ollama pull llama3:70b
docker exec ai-ollama ollama pull codellama:34b
docker exec ai-ollama ollama pull deepseek-coder:33b
# The router defaults to the model name "llama3-70b" and worker.py requests "codellama",
# so alias those names to the pulled tags
docker exec ai-ollama ollama cp llama3:70b llama3-70b
docker exec ai-ollama ollama cp codellama:34b codellama
docker exec ai-ollama ollama pull mistral:7b
# List installed models
docker exec ai-ollama ollama list
# Test a model
docker exec ai-ollama ollama run llama3:70b "Hello, how are you?"
EOF
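Also verify the HTTP API that the router and workers call (the same /api/generate endpoint used in worker.py); mistral:7b is used here only because it is the smallest model pulled above:
ssh netcup << 'EOF'
curl http://localhost:11434/api/tags
curl http://localhost:11434/api/generate -d '{"model": "mistral:7b", "prompt": "Say hello in one sentence.", "stream": false}'
EOF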
3.2 Setup Stable Diffusion Models
# Download Stable Diffusion v2.1 weights
ssh netcup << 'EOF'
mkdir -p /data/models/stable-diffusion/sd-v2.1
# Download from HuggingFace
cd /data/models/stable-diffusion/sd-v2.1
wget https://huggingface.co/stabilityai/stable-diffusion-2-1/resolve/main/v2-1_768-ema-pruned.safetensors
# Verify download
ls -lh /data/models/stable-diffusion/sd-v2.1/
EOF
3.3 Setup Video Generation Models (Wan2.1)
# Download Wan2.1 I2V model weights
ssh netcup << 'EOF'
# Install huggingface-cli if not already installed
pip install huggingface-hub
# Download Wan2.1 I2V 14B 720p model
mkdir -p /data/models/video-generation
cd /data/models/video-generation
huggingface-cli download Wan-AI/Wan2.1-I2V-14B-720P \
--include "*.safetensors" \
--local-dir wan2.1_i2v_14b
# Verify download
du -sh wan2.1_i2v_14b
ls -lh wan2.1_i2v_14b/
EOF
Note: The Wan2.1 weights are very large (~28GB) and are designed to run on a RunPod GPU, not locally on CPU. Downloading them here is optional and mainly useful for staging them onto a RunPod network volume; video generation itself will go through RunPod endpoints.
📋 Phase 4: Migrate Existing Services
4.1 Migrate canvas-website
# On Netcup, create app directory
ssh netcup "mkdir -p /opt/apps/canvas-website"
# From local machine, sync the code
rsync -avz --exclude 'node_modules' --exclude '.git' \
~/Github/canvas-website/ \
netcup:/opt/apps/canvas-website/
# Build and deploy on Netcup
ssh netcup << 'EOF'
cd /opt/apps/canvas-website
# Install dependencies
npm install
# Build
npm run build
# Create systemd service or Docker container
# Option 1: Docker (recommended)
cat > Dockerfile << 'DOCKER'
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
# dev dependencies are required for the build step
RUN npm ci
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]
DOCKER
docker build -t canvas-website .
docker run -d --name canvas-website -p 3000:3000 canvas-website
# Option 2: PM2 (alternative; skip if using Docker above)
# pm2 start npm --name canvas-website -- start
# pm2 save
EOF
4.2 Setup Nginx Reverse Proxy
ssh netcup << 'EOF'
apt install -y nginx certbot python3-certbot-nginx
# Create nginx config
cat > /etc/nginx/sites-available/canvas-website << 'NGINX'
server {
listen 80;
server_name canvas.jeffemmett.com;
location / {
proxy_pass http://localhost:3000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
}
# AI Orchestrator API
server {
listen 80;
server_name ai-api.jeffemmett.com;
location / {
proxy_pass http://localhost:8000;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
}
NGINX
# Enable site
ln -s /etc/nginx/sites-available/canvas-website /etc/nginx/sites-enabled/
nginx -t
systemctl reload nginx
# Setup SSL
certbot --nginx -d canvas.jeffemmett.com -d ai-api.jeffemmett.com
EOF
4.3 Migrate Databases
# Export from DigitalOcean
ssh droplet << 'EOF'
# PostgreSQL
pg_dump -U postgres your_database > /tmp/db_backup.sql
# MongoDB (if you have it)
mongodump --out /tmp/mongo_backup
EOF
# Transfer to Netcup
scp droplet:/tmp/db_backup.sql /tmp/
scp /tmp/db_backup.sql netcup:/tmp/
# Import on Netcup
ssh netcup << 'EOF'
# PostgreSQL (create the target database before importing)
createdb -U postgres your_database 2>/dev/null || true
psql -U postgres -d your_database < /tmp/db_backup.sql
# Verify
psql -U postgres -d your_database -c "SELECT COUNT(*) FROM your_table;"
EOF
4.4 Migrate User Uploads and Data
# Sync user uploads
# rsync cannot copy directly between two remote hosts, so run it from the droplet
# (assumes the droplet has an SSH key that can reach the Netcup server; adjust the user if you are not using root)
ssh droplet "rsync -avz --progress /var/www/uploads/ root@159.195.32.209:/data/uploads/"
# Sync any other data directories
ssh droplet "rsync -avz --progress /var/www/data/ root@159.195.32.209:/data/app-data/"
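Optionally verify the transfer afterwards with a checksum-based dry run; it should list no changed files:
ssh droplet "rsync -avzc --dry-run /var/www/uploads/ root@159.195.32.209:/data/uploads/"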
📋 Phase 5: Update canvas-website for AI Orchestration
5.1 Update Environment Variables
Now let's update the canvas-website configuration to use the new AI orchestrator:
# Create updated .env file for canvas-website
cat > .env.local << 'EOF'
# AI Orchestrator
VITE_AI_ORCHESTRATOR_URL=http://159.195.32.209:8000
# Or use domain: https://ai-api.jeffemmett.com
# RunPod (direct access, fallback)
VITE_RUNPOD_API_KEY=your_runpod_api_key_here
VITE_RUNPOD_TEXT_ENDPOINT_ID=your_text_endpoint_id
VITE_RUNPOD_IMAGE_ENDPOINT_ID=your_image_endpoint_id
VITE_RUNPOD_VIDEO_ENDPOINT_ID=your_video_endpoint_id
# Other existing vars...
VITE_GOOGLE_CLIENT_ID=your_google_client_id
VITE_GOOGLE_MAPS_API_KEY=your_google_maps_api_key
VITE_DAILY_DOMAIN=your_daily_domain
VITE_TLDRAW_WORKER_URL=your_worker_url
EOF
5.2 Disable Mock Mode for Image Generation
Let's fix the ImageGenShapeUtil to use the real AI orchestrator:
# Update USE_MOCK_API flag
sed -i 's/const USE_MOCK_API = true/const USE_MOCK_API = false/' \
src/shapes/ImageGenShapeUtil.tsx
5.3 Create AI Orchestrator Client
Create a new client library for the AI orchestrator:
// src/lib/aiOrchestrator.ts
export interface AIJob {
job_id: string
status: 'queued' | 'processing' | 'completed' | 'failed'
result?: any
cost?: number
provider?: string
processing_time?: number
}
export class AIOrchestrator {
private baseUrl: string
constructor(baseUrl?: string) {
this.baseUrl = baseUrl ||
import.meta.env.VITE_AI_ORCHESTRATOR_URL ||
'http://localhost:8000'
}
async generateText(
prompt: string,
options: {
model?: string
priority?: 'low' | 'normal' | 'high'
userId?: string
wait?: boolean
} = {}
): Promise<AIJob> {
const response = await fetch(`${this.baseUrl}/generate/text`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
prompt,
model: options.model || 'llama3-70b',
priority: options.priority || 'normal',
user_id: options.userId,
wait: options.wait || false
})
})
const job = await response.json()
if (options.wait) {
return this.waitForJob(job.job_id)
}
return job
}
async generateImage(
prompt: string,
options: {
model?: string
priority?: 'low' | 'normal' | 'high'
size?: string
userId?: string
wait?: boolean
} = {}
): Promise<AIJob> {
const response = await fetch(`${this.baseUrl}/generate/image`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
prompt,
model: options.model || 'sdxl',
priority: options.priority || 'normal',
size: options.size || '1024x1024',
user_id: options.userId,
wait: options.wait || false
})
})
const job = await response.json()
if (options.wait) {
return this.waitForJob(job.job_id)
}
return job
}
async generateVideo(
prompt: string,
options: {
model?: string
duration?: number
userId?: string
wait?: boolean
} = {}
): Promise<AIJob> {
const response = await fetch(`${this.baseUrl}/generate/video`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
prompt,
model: options.model || 'wan2.1-i2v',
duration: options.duration || 3,
user_id: options.userId,
wait: options.wait || false
})
})
const job = await response.json()
if (options.wait) {
return this.waitForJob(job.job_id)
}
return job
}
async generateCode(
prompt: string,
options: {
language?: string
priority?: 'low' | 'normal' | 'high'
userId?: string
wait?: boolean
} = {}
): Promise<AIJob> {
const response = await fetch(`${this.baseUrl}/generate/code`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
prompt,
language: options.language || 'python',
priority: options.priority || 'normal',
user_id: options.userId,
wait: options.wait || false
})
})
const job = await response.json()
if (options.wait) {
return this.waitForJob(job.job_id)
}
return job
}
async getJobStatus(jobId: string): Promise<AIJob> {
const response = await fetch(`${this.baseUrl}/job/${jobId}`)
return response.json()
}
async waitForJob(
jobId: string,
maxAttempts: number = 120,
pollInterval: number = 1000
): Promise<AIJob> {
for (let i = 0; i < maxAttempts; i++) {
const job = await this.getJobStatus(jobId)
if (job.status === 'completed') {
return job
}
if (job.status === 'failed') {
throw new Error(`Job failed: ${JSON.stringify(job)}`)
}
await new Promise(resolve => setTimeout(resolve, pollInterval))
}
throw new Error(`Job ${jobId} timed out after ${maxAttempts} attempts`)
}
async getQueueStatus() {
const response = await fetch(`${this.baseUrl}/queue/status`)
return response.json()
}
async getCostSummary() {
const response = await fetch(`${this.baseUrl}/costs/summary`)
return response.json()
}
}
// Singleton instance
export const aiOrchestrator = new AIOrchestrator()
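As a usage sketch, a shape util such as ImageGenShapeUtil could call the client like this; the result.image_url field name mirrors what the image worker above returns and is an assumption until the worker contracts are finalized:
// e.g. inside src/shapes/ImageGenShapeUtil.tsx (hypothetical call site)
import { aiOrchestrator } from '../lib/aiOrchestrator'

async function renderGeneratedImage(prompt: string): Promise<string> {
  // wait: true polls /job/{id} until the job completes or fails
  const job = await aiOrchestrator.generateImage(prompt, {
    priority: 'normal',
    size: '1024x1024',
    wait: true,
  })
  if (!job.result?.image_url) {
    throw new Error(`Image job ${job.job_id} completed without an image`)
  }
  return job.result.image_url
}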
📋 Phase 6: Testing & Validation
6.1 Test AI Orchestrator
# Test text generation
curl -X POST http://159.195.32.209:8000/generate/text \
-H "Content-Type: application/json" \
-d '{
"prompt": "Write a hello world program in Python",
"priority": "normal",
"wait": false
}'
# Get job status
curl http://159.195.32.209:8000/job/YOUR_JOB_ID
# Check queue status
curl http://159.195.32.209:8000/queue/status
# Check costs
curl http://159.195.32.209:8000/costs/summary
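To chain the submit and status calls in one go (assumes jq is installed locally):
JOB_ID=$(curl -s -X POST http://159.195.32.209:8000/generate/text \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Write a haiku about servers", "priority": "normal"}' | jq -r .job_id)
# Poll the job every 2 seconds until it completes
watch -n 2 "curl -s http://159.195.32.209:8000/job/$JOB_ID | jq ."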
6.2 Test Image Generation
# Low priority (local CPU)
curl -X POST http://159.195.32.209:8000/generate/image \
-H "Content-Type: application/json" \
-d '{
"prompt": "A beautiful landscape",
"priority": "low"
}'
# High priority (RunPod GPU)
curl -X POST http://159.195.32.209:8000/generate/image \
-H "Content-Type: application/json" \
-d '{
"prompt": "A beautiful landscape",
"priority": "high"
}'
6.3 Validate Migration
Checklist:
- [ ] All services reachable at the new IP (159.195.32.209)
- [ ] SSL certificates installed and working
- [ ] Databases migrated and verified
- [ ] User uploads accessible
- [ ] AI orchestrator responding
- [ ] Monitoring dashboards working
- [ ] Cost tracking functional
📋 Phase 7: DNS Updates & Cutover
7.1 Update DNS Records
# Update A records to point to Netcup RS 8000
# Old IP: 143.198.39.165 (DigitalOcean)
# New IP: 159.195.32.209 (Netcup)
# Update these domains:
# - canvas.jeffemmett.com → 159.195.32.209
# - ai-api.jeffemmett.com → 159.195.32.209
# - Any other domains hosted on droplet
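After updating the records, confirm propagation before cutting traffic over:
dig +short canvas.jeffemmett.com
dig +short ai-api.jeffemmett.com
# Both should return 159.195.32.209 once the new records have propagated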
7.2 Parallel Running Period
Run both servers in parallel for 1-2 weeks:
- Monitor traffic on both
- Compare performance
- Watch for issues
- Verify all features work on new server
7.3 Final Cutover
Once validated:
- Update DNS TTL to 300 seconds (5 min)
- Switch DNS to Netcup IPs
- Monitor for 48 hours
- Shut down DigitalOcean droplets
- Cancel DigitalOcean subscription
📋 Phase 8: Monitoring & Optimization
8.1 Setup Monitoring Dashboards
Access your monitoring:
- Grafana: http://159.195.32.209:3001
- Prometheus: http://159.195.32.209:9090
- AI API Docs: http://159.195.32.209:8000/docs
8.2 Cost Optimization Recommendations
# Get optimization suggestions
curl http://159.195.32.209:3000/api/recommendations
# Review daily costs
curl http://159.195.32.209:3000/api/costs/summary
8.3 Performance Tuning
Based on usage patterns:
- Adjust worker pool sizes (see the scaling example below)
- Tune queue routing thresholds
- Optimize model choices
- Scale RunPod endpoints
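For example, worker counts can be changed without editing the compose file (Docker Compose v2 syntax shown; classic docker-compose v1 accepts the same --scale flag, and deploy.replicas is ignored outside Swarm either way; a service must not set container_name to be scalable):
ssh netcup "cd /opt/ai-orchestrator && docker compose up -d --scale text-worker=4"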
💰 Expected Cost Breakdown
Before Migration (DigitalOcean):
- Main Droplet (2 vCPU, 2GB): $18/mo
- AI Droplet (2 vCPU, 4GB): $36/mo
- RunPod persistent pods: $100-200/mo
- Total: $154-254/mo
After Migration (Netcup + RunPod):
- RS 8000 G12 Pro: €55.57/mo (~$60/mo)
- RunPod serverless (70% reduction): $30-60/mo
- Total: $90-120/mo
Savings:
- Monthly: $64-134
- Annual: $768-1,608
Plus you get:
- 10x CPU cores (20 vs 2)
- 32x RAM (64GB vs 2GB)
- 25x storage (3TB vs 120GB)
🎯 Next Steps Summary
- TODAY: Verify Netcup RS 8000 access
- Week 1: Deploy AI orchestration stack
- Week 2: Migrate canvas-website and test
- Week 3: Migrate remaining services
- Week 4: DNS cutover and monitoring
- Week 5: Decommission DigitalOcean
Total migration timeline: 4-5 weeks for a safe, validated migration.
📚 Additional Resources
- AI Orchestrator API Docs: http://159.195.32.209:8000/docs
- Grafana Dashboards: http://159.195.32.209:3001
- Queue Monitoring: http://159.195.32.209:8000/queue/status
- Cost Tracking: http://159.195.32.209:3000/api/costs/summary
Ready to start? Let's begin with Phase 1: Pre-Migration Preparation! 🚀