## 🔧 AUTO-APPROVED OPERATIONS
The following operations are auto-approved and do not require user confirmation:
- **Read**: All file read operations (`Read(*)`)
- **Glob**: All file pattern matching (`Glob(*)`)
- **Grep**: All content searching (`Grep(*)`)
These permissions are configured in `~/.claude/settings.json`.
---
## ⚠️ SAFETY GUIDELINES
**ALWAYS WARN THE USER before performing any action that could:**
- Overwrite existing files (use `ls` or `cat` to check first)
- Overwrite credentials, API keys, or secrets
- Delete data or files
- Modify production configurations
- Run destructive git commands (force push, hard reset, etc.)
- Drop databases or truncate tables
**Best practices:**
- Before writing to a file, check if it exists and show its contents
- Use `>>` (append) instead of `>` (overwrite) for credential files
- Create backups before modifying critical configs (e.g., `cp file file.backup`)
- Ask for confirmation before irreversible actions
**Sudo commands:**
- **NEVER run sudo commands directly** - the Bash tool doesn't support interactive input
- Instead, **provide the user with the exact sudo command** they need to run in their terminal
- Format the command clearly in a code block for easy copy-paste (see the example below)
- After the user runs the sudo command, continue with the workflow
- Alternative: if the user has run sudo recently (within ~15 min), subsequent sudo commands may not require a password
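For example, a hypothetical dependency install would be handed back like this:
```bash
# Run this in your own terminal - the Bash tool cannot enter the sudo password:
sudo apt-get install -y ffmpeg
```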
---
## 🔑 ACCESS & CREDENTIALS
### Version Control & Code Hosting
- **Gitea**: Self-hosted at `gitea.jeffemmett.com` - PRIMARY repository
- Push here FIRST, then mirror to GitHub
- Private repos and source of truth
- SSH Key: `~/.ssh/gitea_ed25519` (private), `~/.ssh/gitea_ed25519.pub` (public)
- Public Key: `ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIIE2+2UZElEYptgZ9GFs2CXW0PIA57BfQcU9vlyV6fz4 gitea@jeffemmett.com`
- **Gitea CLI (tea)**: ✅ Installed at `~/bin/tea` (added to PATH)
- **GitHub**: Public mirror and collaboration
- Receives pushes from Gitea via mirror sync
- Token: `(REDACTED-GITHUB-TOKEN)`
- SSH Key: `~/.ssh/github_deploy_key` (private), `~/.ssh/github_deploy_key.pub` (public)
- **GitHub CLI (gh)**: ✅ Installed and available for PR/issue management
### Git Workflow
**Two-way sync between Gitea and GitHub:**
**Gitea-Primary Repos (Default):**
1. Develop locally in `/home/jeffe/Github/`
2. Commit and push to Gitea first
3. Gitea automatically mirrors TO GitHub (built-in push mirror; registration sketch after this list)
4. GitHub used for public collaboration and visibility
**GitHub-Primary Repos (Mirror Repos):**
For repos where GitHub is source of truth (v0.dev exports, client collabs):
1. Push to GitHub
2. Deploy webhook pulls from GitHub and deploys
3. Webhook triggers Gitea to sync FROM GitHub
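Push mirrors are registered once per repo; a hedged sketch using the Gitea API (endpoint available in Gitea 1.17+; `<repo>` and both tokens are placeholders):
```bash
# Register a Gitea -> GitHub push mirror; interval uses Go duration syntax
curl -X POST "https://gitea.jeffemmett.com/api/v1/repos/jeffemmett/<repo>/push_mirrors" \
  -H "Authorization: token <gitea-token>" \
  -H "Content-Type: application/json" \
  -d '{"remote_address":"https://github.com/jeffemmett/<repo>.git","remote_username":"jeffemmett","remote_password":"<github-token>","interval":"8h0m0s","sync_on_commit":true}'
```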
### 🔀 DEV BRANCH WORKFLOW (MANDATORY)
**CRITICAL: All development work on canvas-website (and other active projects) MUST use a dev branch.**
#### Branch Strategy
```
main (production)
└── dev (integration/staging)
└── feature/* (optional feature branches)
```
#### Development Rules
1. **ALWAYS work on the `dev` branch** for new features and changes:
```bash
cd /home/jeffe/Github/canvas-website
git checkout dev
git pull origin dev
```
2. **After completing a feature**, push to dev:
```bash
git add .
git commit -m "feat: description of changes"
git push origin dev
```
3. **Update backlog task** immediately after pushing:
```bash
backlog task edit <task-id> --status "Done" --append-notes "Pushed to dev branch"
```
4. **NEVER push directly to main** - main is for tested, verified features only
5. **Merge dev → main manually** when features are verified working:
```bash
git checkout main
git pull origin main
git merge dev
git push origin main
git checkout dev # Return to dev for continued work
```
#### Complete Feature Deployment Checklist
- [ ] Work on `dev` branch (not main)
- [ ] Test locally before committing
- [ ] Commit with descriptive message
- [ ] Push to `dev` branch on Gitea
- [ ] Update backlog task status to "Done"
- [ ] Add notes to backlog task about what was implemented
- [ ] (Later) When verified working: merge dev → main manually
#### Why This Matters
- **Protects production**: main branch always has known-working code
- **Enables testing**: dev branch can be deployed to staging for verification
- **Clean history**: main only gets complete, tested features
- **Easy rollback**: if dev breaks, main is still stable
### Server Infrastructure
- **Netcup RS 8000 G12 Pro**: Primary application & AI server
- IP: `159.195.32.209`
- 20 cores, 64GB RAM, 3TB storage
- Hosts local AI models (Ollama, Stable Diffusion)
- All websites and apps deployed here in Docker containers
- Location: Germany (low latency EU)
- SSH Key (local): `~/.ssh/netcup_ed25519` (private), `~/.ssh/netcup_ed25519.pub` (public)
- Public Key: `ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKmp4A2klKv/YIB1C6JAsb2UzvlzzE+0EcJ0jtkyFuhO netcup-rs8000@jeffemmett.com`
- SSH Access: `ssh netcup`
- **SSH Keys ON the server** (for git operations):
- Gitea: `~/.ssh/gitea_ed25519` (public key: `ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIIE2+2UZElEYptgZ9GFs2CXW0PIA57BfQcU9vlyV6fz4 gitea@jeffemmett.com`)
- GitHub: `~/.ssh/github_ed25519` (public key: `ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIC6xXNICy0HXnqHO+U7+y7ui+pZBGe0bm0iRMS23pR1E github-deploy@netcup-rs8000`)
- **RunPod**: GPU burst capacity for AI workloads
- Host: `ssh.runpod.io`
- Serverless GPU pods (pay-per-use)
- Used for: SDXL/SD3, video generation, training
- Smart routing from RS 8000 orchestrator
- SSH Key: `~/.ssh/runpod_ed25519` (private), `~/.ssh/runpod_ed25519.pub` (public)
- Public Key: `ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAC7NYjI0U/2ChGaZBBWP7gKt/V12Ts6FgatinJOQ8JG runpod@jeffemmett.com`
- SSH Access: `ssh runpod`
- **API Key**: `(REDACTED-RUNPOD-KEY)`
- **CLI Config**: `~/.runpod/config.toml`
- **Serverless Endpoints**:
- Image (SD): `tzf1j3sc3zufsy` (Automatic1111)
- Video (Wan2.2): `4jql4l7l0yw0f3`
- Text (vLLM): `03g5hz3hlo8gr2`
- Whisper: `lrtisuv8ixbtub`
- ComfyUI: `5zurj845tbf8he`
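A hedged example of calling one of these endpoints (`/runsync` is the standard RunPod serverless route; the worker's `input` schema is an assumption):
```bash
# Synchronous call to the Whisper endpoint; expects RUNPOD_API_KEY in the environment.
# The "input" payload shape depends on the deployed worker - illustrative only.
curl -s -X POST "https://api.runpod.ai/v2/lrtisuv8ixbtub/runsync" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": {"audio": "https://example.com/sample.wav"}}'
```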
### API Keys & Services
**IMPORTANT**: All API keys and tokens are stored securely on the Netcup server. Never store credentials locally.
- Access credentials via: `ssh netcup "cat ~/.cloudflare-credentials.env"` or `ssh netcup "cat ~/.porkbun_credentials"`
- All API operations should be performed FROM the Netcup server, not locally
#### Credential Files on Netcup (`/root/`)
| File | Contents |
|------|----------|
| `~/.cloudflare-credentials.env` | Cloudflare API tokens, account ID, tunnel token |
| `~/.cloudflare_credentials` | Legacy/DNS token |
| `~/.porkbun_credentials` | Porkbun API key and secret |
| `~/.v0_credentials` | V0.dev API key |
#### Cloudflare
- **Account ID**: `0e7b3338d5278ed1b148e6456b940913`
- **Tokens stored on Netcup** - source `~/.cloudflare-credentials.env`:
- `CLOUDFLARE_API_TOKEN` - Zone read, Worker:read/edit, R2:read/edit
- `CLOUDFLARE_TUNNEL_TOKEN` - Tunnel management
- `CLOUDFLARE_ZONE_TOKEN` - Zone:Edit, DNS:Edit (for adding domains)
#### Porkbun (Domain Registrar)
- **Credentials stored on Netcup** - source `~/.porkbun_credentials`:
- `PORKBUN_API_KEY` and `PORKBUN_SECRET_KEY`
- **API Endpoint**: `https://api-ipv4.porkbun.com/api/json/v3/`
- **API Docs**: https://porkbun.com/api/json/v3/documentation
- **Important**: JSON must have `secretapikey` before `apikey` in requests (see the ping check below)
- **Capabilities**: Update nameservers, get auth codes for transfers, manage DNS
- **Note**: Each domain must have "API Access" enabled individually in Porkbun dashboard
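A minimal connectivity check, run from Netcup, illustrating that field order (assumes `~/.porkbun_credentials` exports both variables):
```bash
# Porkbun "ping" - verifies the keys work; note secretapikey precedes apikey
source ~/.porkbun_credentials
curl -s -X POST "https://api-ipv4.porkbun.com/api/json/v3/ping" \
  -H "Content-Type: application/json" \
  -d "{\"secretapikey\":\"$PORKBUN_SECRET_KEY\",\"apikey\":\"$PORKBUN_API_KEY\"}"
```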
#### Domain Onboarding Workflow (Porkbun → Cloudflare)
Run these commands FROM Netcup (`ssh netcup`):
1. Add domain to Cloudflare (creates zone, returns nameservers)
2. Update nameservers at Porkbun to point to Cloudflare
3. Add CNAME record pointing to Cloudflare tunnel
4. Add hostname to tunnel config and restart cloudflared
5. Domain is live through the tunnel!
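A hedged sketch of steps 1-2 via the Cloudflare v4 API and Porkbun's `updateNs` endpoint (run from Netcup; `example.com` is a placeholder and `jq` is assumed installed):
```bash
source ~/.cloudflare-credentials.env
source ~/.porkbun_credentials
DOMAIN="example.com"  # placeholder

# 1. Create the Cloudflare zone; the response lists the assigned nameservers
readarray -t NS < <(curl -s -X POST "https://api.cloudflare.com/client/v4/zones" \
  -H "Authorization: Bearer $CLOUDFLARE_ZONE_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"name\":\"$DOMAIN\",\"account\":{\"id\":\"0e7b3338d5278ed1b148e6456b940913\"}}" \
  | jq -r '.result.name_servers[]')

# 2. Point the domain at those nameservers via Porkbun (secretapikey before apikey)
curl -s -X POST "https://api-ipv4.porkbun.com/api/json/v3/domain/updateNs/$DOMAIN" \
  -H "Content-Type: application/json" \
  -d "{\"secretapikey\":\"$PORKBUN_SECRET_KEY\",\"apikey\":\"$PORKBUN_API_KEY\",\"ns\":[\"${NS[0]}\",\"${NS[1]}\"]}"
```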
#### V0.dev (AI UI Generation)
- **Credentials stored on Netcup** - source `~/.v0_credentials`:
- `V0_API_KEY` - Platform API access
- **API Key**: `(REDACTED-V0-KEY)`
- **SDK**: `npm install v0-sdk` (use `v0` CLI for adding components)
- **Docs**: https://v0.app/docs/v0-platform-api
- **Capabilities**:
- List/create/update/delete projects
- Manage chats and versions
- Download generated code
- Create deployments
- Manage environment variables
- **Limitations**: GitHub-only for git integration (no Gitea/GitLab support)
- **Usage**:
```javascript
const { v0 } = require('v0-sdk');
// Uses V0_API_KEY env var automatically
const projects = await v0.projects.find();
const chats = await v0.chats.find();
```
#### Other Services
- **HuggingFace**: CLI access available for model downloads
- **RunPod**: API access for serverless GPU orchestration (see Server Infrastructure above)
### Dev Ops Stack & Principles
- **Platform**: Linux WSL2 (Ubuntu on Windows) for development
- **Working Directory**: `/home/jeffe/Github`
- **Container Strategy**:
- ALL repos should be Dockerized
- Optimized containers for production deployment
- Docker Compose for multi-service orchestration
- **Process Management**: PM2 available for Node.js services
- **Version Control**: Git configured with GitHub + Gitea mirrors
- **Package Managers**: npm/pnpm/yarn available
### 🚀 Traefik Reverse Proxy (Central Routing)
All HTTP services on Netcup RS 8000 route through Traefik for automatic service discovery.
**Architecture:**
```
Internet → Cloudflare Tunnel → Traefik (:80/:443) → Docker Services
├── gitea.jeffemmett.com → gitea:3000
├── mycofi.earth → mycofi:3000
├── games.jeffemmett.com → games:80
└── [auto-discovered via Docker labels]
```
**Location:** `/root/traefik/` on Netcup RS 8000
**Adding a New Service:**
```yaml
# In your docker-compose.yml, add these labels:
services:
myapp:
image: myapp:latest
labels:
- "traefik.enable=true"
- "traefik.http.routers.myapp.rule=Host(`myapp.jeffemmett.com`)"
- "traefik.http.services.myapp.loadbalancer.server.port=3000"
networks:
- traefik-public
networks:
traefik-public:
external: true
```
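Once the service is up, routing can be smoke-tested from the server itself (hostname matches the label above):
```bash
# Ask Traefik directly; the Host header selects the router defined by the labels
curl -s -o /dev/null -w "%{http_code}\n" -H "Host: myapp.jeffemmett.com" http://localhost:80
```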
**Traefik Dashboard:** `http://159.195.32.209:8888` (internal only)
**SSH Git Access:**
- SSH goes direct (not through Traefik): `git.jeffemmett.com:223` → `159.195.32.209:223`
- Web UI goes through Traefik: `gitea.jeffemmett.com` → Traefik → gitea:3000
### ☁️ Cloudflare Tunnel Configuration
**Location:** `/root/cloudflared/` on Netcup RS 8000
The tunnel uses a token-based configuration managed via Cloudflare Zero Trust Dashboard.
All public hostnames should point to `http://localhost:80` (Traefik), which routes based on Host header.
**Managed hostnames:**
- `gitea.jeffemmett.com` → Traefik → Gitea
- `photos.jeffemmett.com` → Traefik → Immich
- `movies.jeffemmett.com` → Traefik → Jellyfin
- `search.jeffemmett.com` → Traefik → Semantic Search
- `mycofi.earth` → Traefik → MycoFi
- `games.jeffemmett.com` → Traefik → Games Platform
- `decolonizeti.me` → Traefik → Decolonize Time
**Tunnel ID:** `a838e9dc-0af5-4212-8af2-6864eb15e1b5`
**Tunnel CNAME Target:** `a838e9dc-0af5-4212-8af2-6864eb15e1b5.cfargotunnel.com`
**To deploy a new website/service:**
1. **Dockerize the project** with Traefik labels in `docker-compose.yml`:
```yaml
services:
myapp:
build: .
labels:
- "traefik.enable=true"
- "traefik.http.routers.myapp.rule=Host(`mydomain.com`) || Host(`www.mydomain.com`)"
- "traefik.http.services.myapp.loadbalancer.server.port=3000"
networks:
- traefik-public
networks:
traefik-public:
external: true
```
2. **Deploy to Netcup:**
```bash
ssh netcup "cd /opt/websites && git clone <repo-url>"
ssh netcup "cd /opt/websites/<project> && docker compose up -d --build"
```
3. **Add hostname to tunnel config** (`/root/cloudflared/config.yml`):
```yaml
- hostname: mydomain.com
service: http://localhost:80
- hostname: www.mydomain.com
service: http://localhost:80
```
Then restart: `ssh netcup "docker restart cloudflared"`
4. **Configure DNS in Cloudflare dashboard** (CRITICAL - prevents 525 SSL errors):
- Go to Cloudflare Dashboard → select domain → DNS → Records
- Delete any existing A/AAAA records for `@` and `www`
- Add CNAME records:
| Type | Name | Target | Proxy |
|------|------|--------|-------|
| CNAME | `@` | `a838e9dc-0af5-4212-8af2-6864eb15e1b5.cfargotunnel.com` | Proxied ✓ |
| CNAME | `www` | `a838e9dc-0af5-4212-8af2-6864eb15e1b5.cfargotunnel.com` | Proxied ✓ |
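The same records can be created from the CLI instead of the dashboard; a hedged sketch against the Cloudflare DNS records API (assumes the zone token carries DNS:Edit and `jq` is installed):
```bash
source ~/.cloudflare-credentials.env
DOMAIN="mydomain.com"  # placeholder
TUNNEL="a838e9dc-0af5-4212-8af2-6864eb15e1b5.cfargotunnel.com"

# Look up the zone ID, then create proxied CNAMEs for apex and www
ZONE_ID=$(curl -s "https://api.cloudflare.com/client/v4/zones?name=$DOMAIN" \
  -H "Authorization: Bearer $CLOUDFLARE_ZONE_TOKEN" | jq -r '.result[0].id')
for NAME in "$DOMAIN" "www.$DOMAIN"; do
  curl -s -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records" \
    -H "Authorization: Bearer $CLOUDFLARE_ZONE_TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"type\":\"CNAME\",\"name\":\"$NAME\",\"content\":\"$TUNNEL\",\"proxied\":true}"
done
```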
**API Credentials** (on Netcup at `~/.cloudflare*`):
- `CLOUDFLARE_API_TOKEN` - Zone read access only
- `CLOUDFLARE_TUNNEL_TOKEN` - Tunnel management only
- See **API Keys & Services** section above for Domain Management Token (required for DNS automation)
### 🔄 Auto-Deploy Webhook System
**Location:** `/opt/deploy-webhook/` on Netcup RS 8000
**Endpoint:** `https://deploy.jeffemmett.com/deploy/<repo-name>`
**Secret:** `gitea-deploy-secret-2025`
Pushes to Gitea automatically trigger rebuilds. The webhook receiver:
1. Validates HMAC signature from Gitea
2. Runs `git pull && docker compose up -d --build`
3. Returns build status
**Adding a new repo to auto-deploy:**
1. Add entry to `/opt/deploy-webhook/webhook.py` REPOS dict
2. Restart: `ssh netcup "cd /opt/deploy-webhook && docker compose up -d --build"`
3. Add Gitea webhook:
```bash
curl -X POST "https://gitea.jeffemmett.com/api/v1/repos/jeffemmett/<repo>/hooks" \
-H "Authorization: token <gitea-token>" \
-H "Content-Type: application/json" \
-d '{"type":"gitea","active":true,"events":["push"],"config":{"url":"https://deploy.jeffemmett.com/deploy/<repo>","content_type":"json","secret":"gitea-deploy-secret-2025"}}'
```
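To verify an entry without pushing, the endpoint can be smoke-tested with a hand-signed payload (assumes the receiver checks Gitea's `X-Gitea-Signature` header, an HMAC-SHA256 hex digest of the raw body):
```bash
SECRET="gitea-deploy-secret-2025"
BODY='{"ref":"refs/heads/main","repository":{"name":"<repo>"}}'  # illustrative payload
SIG=$(printf '%s' "$BODY" | openssl dgst -sha256 -hmac "$SECRET" | awk '{print $2}')
curl -s -X POST "https://deploy.jeffemmett.com/deploy/<repo>" \
  -H "Content-Type: application/json" \
  -H "X-Gitea-Signature: $SIG" \
  -d "$BODY"
```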
**Currently auto-deploying:**
- `decolonize-time-website` → /opt/websites/decolonize-time-website
- `mycofi-earth-website` → /opt/websites/mycofi-earth-website
- `games-platform` → /opt/apps/games-platform
### 🔐 SSH Keys Quick Reference
**Local keys** (in `~/.ssh/` on your laptop):
| Service | Private Key | Public Key | Purpose |
|---------|-------------|------------|---------|
| **Gitea** | `gitea_ed25519` | `gitea_ed25519.pub` | Primary git repository |
| **GitHub** | `github_deploy_key` | `github_deploy_key.pub` | Public mirror sync |
| **Netcup RS 8000** | `netcup_ed25519` | `netcup_ed25519.pub` | Primary server SSH |
| **RunPod** | `runpod_ed25519` | `runpod_ed25519.pub` | GPU pods SSH |
| **Default** | `id_ed25519` | `id_ed25519.pub` | General purpose/legacy |
**Server-side keys** (in `/root/.ssh/` on Netcup RS 8000):
| Service | Key File | Purpose |
|---------|----------|---------|
| **Gitea** | `gitea_ed25519` | Server pulls from Gitea repos |
| **GitHub** | `github_ed25519` | Server pulls from GitHub (mirror repos) |
**SSH Config**: `~/.ssh/config` contains all host configurations
**Quick Access**:
- `ssh netcup` - Connect to Netcup RS 8000
- `ssh runpod` - Connect to RunPod
- `ssh gitea.jeffemmett.com` - Git operations
---
## 🤖 AI ORCHESTRATION ARCHITECTURE
### Smart Routing Strategy
All AI requests go through intelligent orchestration layer on RS 8000:
**Routing Logic:**
- **Text/Code (70-80% of workload)**: Always local RS 8000 CPU (Ollama) → FREE
- **Images - Low Priority**: RS 8000 CPU (SD 1.5/2.1) → FREE but slow (~60s)
- **Images - High Priority**: RunPod GPU (SDXL/SD3) → $0.02/image, fast
- **Video Generation**: Always RunPod GPU → $0.50/video (only option)
- **Training/Fine-tuning**: RunPod GPU on-demand
**Queue System:**
- Redis-based queues: text, image, code, video
- Priority-based routing (low/normal/high)
- Worker pools scale based on load
- Cost tracking per job, per user
**Cost Optimization:**
- Target: $90-120/mo (vs $136-236/mo current)
- Savings: $552-1,392/year
- 70-80% of workload FREE (local CPU)
- GPU only when needed (serverless = no idle costs)
### Deployment Architecture
```
RS 8000 G12 Pro (Netcup)
├── Cloudflare Tunnel (secure ingress)
├── Traefik Reverse Proxy (auto-discovery)
│ └── Routes to all services via Docker labels
├── Core Services
│ ├── Gitea (git hosting) - gitea.jeffemmett.com
│ └── Other internal tools
├── AI Services
│ ├── Ollama (text/code models)
│ ├── Stable Diffusion (CPU fallback)
│ └── Smart Router API (FastAPI)
├── Queue Infrastructure
│ ├── Redis (job queues)
│ └── PostgreSQL (job history/analytics)
├── Monitoring
│ ├── Prometheus (metrics)
│ ├── Grafana (dashboards)
│ └── Cost tracking API
└── Application Hosting
├── All websites (Dockerized + Traefik labels)
├── All apps (Dockerized + Traefik labels)
└── Backend services (Dockerized)
RunPod Serverless (GPU Burst)
├── SDXL/SD3 endpoints
├── Video generation (Wan2.1)
└── Training/fine-tuning jobs
```
### Integration Pattern for Projects
All projects use unified AI client SDK:
```python
from orchestrator_client import AIOrchestrator
ai = AIOrchestrator("http://rs8000-ip:8000")
# Automatically routes based on priority & model
result = await ai.generate_text(prompt, priority="low") # → FREE CPU
result = await ai.generate_image(prompt, priority="high") # → RunPod GPU
```
---
## 💰 GPU COST ANALYSIS & MIGRATION PLAN
### Current Infrastructure Costs (Monthly)
| Service | Type | Cost | Notes |
|---------|------|------|-------|
| Netcup RS 8000 G12 Pro | Fixed | ~€45 | 20 cores, 64GB RAM, 3TB (CPU-only) |
| RunPod Serverless | Variable | $50-100 | Pay-per-use GPU (images, video) |
| DigitalOcean Droplets | Fixed | ~$48 | ⚠️ DEPRECATED - migrate ASAP |
| **Current Total** | | **~$140-190/mo** | |
### GPU Provider Comparison
#### Netcup vGPU (NEW - Early Access, Ends July 7, 2025)
| Plan | GPU | VRAM | vCores | RAM | Storage | Price/mo | Price/hr equiv |
|------|-----|------|--------|-----|---------|----------|----------------|
| RS 2000 vGPU 7 | H200 | 7 GB dedicated | 8 | 16 GB DDR5 | 512 GB NVMe | €137.31 (~$150) | $0.21/hr |
| RS 4000 vGPU 14 | H200 | 14 GB dedicated | 12 | 32 GB DDR5 | 1 TB NVMe | €261.39 (~$285) | $0.40/hr |
**Pros:**
- NVIDIA H200 (latest gen, better than H100 for inference)
- Dedicated VRAM (no noisy neighbors)
- Germany location (EU data sovereignty, low latency to RS 8000)
- Fixed monthly cost = predictable budgeting
- 24/7 availability, no cold starts
**Cons:**
- Pay even when idle
- Limited to 7GB or 14GB VRAM options
- Early access = limited availability
#### RunPod Serverless (Current)
| GPU | VRAM | Price/hr | Typical Use |
|-----|------|----------|-------------|
| RTX 4090 | 24 GB | ~$0.44/hr | SDXL, medium models |
| A100 40GB | 40 GB | ~$1.14/hr | Large models, training |
| H100 80GB | 80 GB | ~$2.49/hr | Largest models |
**Current Endpoint Costs:**
- Image (SD/SDXL): ~$0.02/image (~2s compute)
- Video (Wan2.2): ~$0.50/video (~60s compute)
- Text (vLLM): ~$0.001/request
- Whisper: ~$0.01/minute audio
**Pros:**
- Zero idle costs
- Unlimited burst capacity
- Wide GPU selection (up to 80GB VRAM)
- Pay only for actual compute
**Cons:**
- Cold start delays (10-30s first request)
- Variable availability during peak times
- Per-request costs add up at scale
### Break-even Analysis
**When does Netcup vGPU become cheaper than RunPod?** At ~$0.02/image and ~$0.50/video, the RS 2000 vGPU 7's ~$150/mo is matched at $150 ÷ $0.02 = 7,500 images or $150 ÷ $0.50 = 300 videos per month; the RS 4000 vGPU 14's ~$285/mo at 14,250 images.
| Scenario | RunPod Cost | Netcup RS 2000 vGPU 7 | Netcup RS 4000 vGPU 14 |
|----------|-------------|----------------------|------------------------|
| 1,000 images/mo | $20 | $150 ❌ | $285 ❌ |
| 5,000 images/mo | $100 | $150 ❌ | $285 ❌ |
| **7,500 images/mo** | **$150** | **$150 ✅** | $285 ❌ |
| 10,000 images/mo | $200 | $150 ✅ | $285 ❌ |
| **14,250 images/mo** | **$285** | $150 ✅ | **$285 ✅** |
| 100 videos/mo | $50 | $150 ❌ | $285 ❌ |
| **300 videos/mo** | **$150** | **$150 ✅** | $285 ❌ |
| 500 videos/mo | $250 | $150 ✅ | $285 ❌ |
**Recommendation by Usage Pattern:**
| Monthly Usage | Best Option | Est. Cost |
|---------------|-------------|-----------|
| < 5,000 images OR < 250 videos | RunPod Serverless | $50-100 |
| 5,000-10,000 images OR 250-500 videos | **Netcup RS 2000 vGPU 7** | $150 fixed |
| > 10,000 images OR > 500 videos + training | **Netcup RS 4000 vGPU 14** | $285 fixed |
| Unpredictable/bursty workloads | RunPod Serverless | Variable |
### Migration Strategy
#### Phase 1: Immediate (Before July 7, 2025)
**Decision Point: Secure Netcup vGPU Early Access?**
- [ ] Monitor actual GPU usage for 2-4 weeks
- [ ] Calculate average monthly image/video generation
- [ ] If consistently > 5,000 images/mo → Consider RS 2000 vGPU 7
- [ ] If consistently > 10,000 images/mo → Consider RS 4000 vGPU 14
- [ ] **ACTION**: Redeem early access code if usage justifies fixed GPU
#### Phase 2: Hybrid Architecture (If vGPU Acquired)
```
RS 8000 G12 Pro (CPU - Current)
├── Ollama (text/code) → FREE
├── SD 1.5/2.1 CPU fallback → FREE
└── Orchestrator API
Netcup vGPU Server (NEW - If purchased)
├── Primary GPU workloads
├── SDXL/SD3 generation
├── Video generation (Wan2.1 I2V)
├── Model inference (14B params with 14GB VRAM)
└── Connected via internal netcup network (low latency)
RunPod Serverless (Burst Only)
├── Overflow capacity
├── Models requiring > 14GB VRAM
├── Training/fine-tuning jobs
└── Geographic distribution needs
```
#### Phase 3: Cost Optimization Targets
| Scenario | Current | With vGPU Migration | Savings |
|----------|---------|---------------------|---------|
| Low usage | $140/mo | $95/mo (RS8000 + minimal RunPod) | $540/yr |
| Medium usage | $190/mo | $195/mo (RS8000 + vGPU 7) | Break-even |
| High usage | $250/mo | $195/mo (RS8000 + vGPU 7) | $660/yr |
| Very high usage | $350/mo | $330/mo (RS8000 + vGPU 14) | $240/yr |
### Model VRAM Requirements Reference
| Model | VRAM Needed | Fits vGPU 7? | Fits vGPU 14? |
|-------|-------------|--------------|---------------|
| SD 1.5 | ~4 GB | ✅ | ✅ |
| SD 2.1 | ~5 GB | ✅ | ✅ |
| SDXL | ~7 GB | ⚠️ Tight | ✅ |
| SD3 Medium | ~8 GB | ❌ | ✅ |
| Wan2.1 I2V 14B | ~12 GB | ❌ | ✅ |
| Wan2.1 T2V 14B | ~14 GB | ❌ | ⚠️ Tight |
| Flux.1 Dev | ~12 GB | ❌ | ✅ |
| LLaMA 3 8B (Q4) | ~6 GB | ✅ | ✅ |
| LLaMA 3 70B (Q4) | ~40 GB | ❌ | ❌ (RunPod) |
### Decision Framework
```
┌─────────────────────────────────────────────────────────┐
│ GPU WORKLOAD DECISION TREE │
├─────────────────────────────────────────────────────────┤
│ │
│ Is usage predictable and consistent? │
│ ├── YES → Is monthly GPU spend > $150? │
│ │ ├── YES → Netcup vGPU (fixed cost wins) │
│ │ └── NO → RunPod Serverless (no idle cost) │
│ └── NO → RunPod Serverless (pay for what you use) │
│ │
│ Does model require > 14GB VRAM? │
│ ├── YES → RunPod (A100/H100 on-demand) │
│ └── NO → Netcup vGPU or RS 8000 CPU │
│ │
│ Is low latency critical? │
│ ├── YES → Netcup vGPU (same datacenter as RS 8000) │
│ └── NO → RunPod Serverless (acceptable for batch) │
│ │
└─────────────────────────────────────────────────────────┘
```
### Monitoring & Review Schedule
- **Weekly**: Review RunPod spend dashboard
- **Monthly**: Calculate total GPU costs, compare to vGPU break-even
- **Quarterly**: Re-evaluate architecture, consider plan changes
- **Annually**: Full infrastructure cost audit
### Action Items
- [ ] **URGENT**: Decide on Netcup vGPU early access before July 7, 2025
- [ ] Set up GPU usage tracking in orchestrator
- [ ] Create Grafana dashboard for cost monitoring
- [ ] Test Wan2.1 I2V 14B model on vGPU 14 (if acquired)
- [ ] Document migration runbook for vGPU setup
- [ ] Complete DigitalOcean deprecation (separate from GPU decision)
---
## 📁 PROJECT PORTFOLIO STRUCTURE
### Repository Organization
- **Location**: `/home/jeffe/Github/`
- **Primary Flow**: Gitea (source of truth) → GitHub (public mirror)
- **Containerization**: ALL repos must be Dockerized with optimized production containers
### 🎯 MAIN PROJECT: canvas-website
**Location**: `/home/jeffe/Github/canvas-website`
**Description**: Collaborative canvas deployment - the integration hub where all tools come together
- Tldraw-based collaborative canvas platform
- Integrates Hyperindex, rSpace, MycoFi, and other tools
- Real-time collaboration features
- Deployed on RS 8000 in Docker
- Uses AI orchestrator for intelligent features
### Project Categories
**AI & Infrastructure:**
- AI Orchestrator (smart routing between RS 8000 & RunPod)
- Model hosting & fine-tuning pipelines
- Cost optimization & monitoring dashboards
**Web Applications & Sites:**
- **canvas-website**: Main collaborative canvas (integration hub)
- All deployed in Docker containers on RS 8000
- Cloudflare Workers for edge functions (Hyperindex)
- Static sites + dynamic backends containerized
**Supporting Projects:**
- **Hyperindex**: Tldraw canvas integration (Cloudflare stack) - integrates into canvas-website
- **rSpace**: Real-time collaboration platform - integrates into canvas-website
- **MycoFi**: DeFi/Web3 project - integrates into canvas-website
- **Canvas-related tools**: Knowledge graph & visualization components
### Deployment Strategy
1. **Development**: Local WSL2 environment (`/home/jeffe/Github/`)
2. **Version Control**: Push to Gitea FIRST → Auto-mirror to GitHub
3. **Containerization**: Build optimized Docker images with Traefik labels
4. **Deployment**: Deploy to RS 8000 via Docker Compose (join `traefik-public` network)
5. **Routing**: Traefik auto-discovers service via labels, no config changes needed
6. **DNS**: Add hostname to Cloudflare tunnel (if new domain) or it just works (existing domains)
7. **AI Integration**: Connect to local orchestrator API
8. **Monitoring**: Grafana dashboards for all services
### Infrastructure Philosophy
- **Self-hosted first**: Own your infrastructure (RS 8000 + Gitea)
- **Cloud for edge cases**: Cloudflare (edge), RunPod (GPU burst)
- **Cost-optimized**: Local CPU for 70-80% of workload
- **Dockerized everything**: Reproducible, scalable, maintainable
- **Smart orchestration**: Right compute for the right job
---
## 📥 WAN2.1 MODEL DOWNLOAD
Make sure the `huggingface-cli` download targets a non-deprecated model revision, then proceed with Image-to-Video 14B 720p (RECOMMENDED):
```bash
huggingface-cli download Wan-AI/Wan2.1-I2V-14B-720P \
  --include "*.safetensors" \
  --local-dir models/diffusion_models/wan2.1_i2v_14b
```
---
## 🕸️ HYPERINDEX PROJECT - TOP PRIORITY
**Location:** `/home/jeffe/Github/hyperindex-system/`
When the user is ready to work on the hyperindexing system:
1. Reference `HYPERINDEX_PROJECT.md` for complete architecture and implementation details
2. Follow `HYPERINDEX_TODO.md` for step-by-step checklist
3. Start with Phase 1 (Database & Core Types), then proceed sequentially through Phase 5
4. This is a tldraw canvas integration project using Cloudflare Workers, D1, R2, and Durable Objects
5. Creates a "living, mycelial network" of web discoveries that spawn on the canvas in real-time
---
## 📋 BACKLOG.MD - UNIFIED TASK MANAGEMENT
**All projects use Backlog.md for task tracking.** Tasks are managed as markdown files and can be viewed at `backlog.jeffemmett.com` for a unified cross-project view.
### MCP Integration
Backlog.md is integrated via MCP server. Available tools:
- `backlog.task_create` - Create new tasks
- `backlog.task_list` - List tasks with filters
- `backlog.task_update` - Update task status/details
- `backlog.task_view` - View task details
- `backlog.search` - Search across tasks, docs, decisions
### Task Lifecycle Workflow
**CRITICAL: Claude agents MUST follow this workflow for ALL development tasks:**
#### 1. Task Discovery (Before Starting Work)
```bash
# Check if task already exists
backlog search "<task description>" --plain
# List current tasks
backlog task list --plain
```
#### 2. Task Creation (If Not Exists)
```bash
# Create task with full details
backlog task create "Task Title" \
--desc "Detailed description" \
--priority high \
--status "To Do"
```
#### 3. Starting Work (Move to In Progress)
```bash
# Update status when starting
backlog task edit <task-id> --status "In Progress"
```
#### 4. During Development (Update Notes)
```bash
# Append progress notes
backlog task edit <task-id> --append-notes "Completed X, working on Y"
# Update acceptance criteria
backlog task edit <task-id> --check-ac 1
```
#### 5. Completion (Move to Done)
```bash
# Mark complete when finished
backlog task edit <task-id> --status "Done"
```
### Project Initialization
When starting work in a new repository that doesn't yet have Backlog.md initialized:
```bash
cd /path/to/repo
backlog init "Project Name" --integration-mode mcp --defaults
```
This creates the `backlog/` directory structure:
```
backlog/
├── config.yml # Project configuration
├── tasks/ # Active tasks
├── completed/ # Finished tasks
├── drafts/ # Draft tasks
├── docs/ # Project documentation
├── decisions/ # Architecture decision records
└── archive/ # Archived tasks
```
### Task File Format
Tasks are markdown files with YAML frontmatter:
```yaml
---
id: task-001
title: Feature implementation
status: In Progress
assignee: [@claude]
created_date: '2025-12-03 14:30'
labels: [feature, backend]
priority: high
dependencies: [task-002]
---
## Description
What needs to be done...
## Plan
1. Step one
2. Step two
## Acceptance Criteria
- [ ] Criterion 1
- [x] Criterion 2 (completed)
## Notes
Progress updates go here...
```
### Cross-Project Aggregation (backlog.jeffemmett.com)
**Architecture:**
```
┌─────────────────────────────────────────────────────────────┐
│ backlog.jeffemmett.com │
│ (Unified Kanban Dashboard) │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ canvas-web │ │ hyperindex │ │ mycofi │ ... │
│ │ (purple) │ │ (green) │ │ (blue) │ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │ │
│ └────────────────┴────────────────┘ │
│ │ │
│ ┌───────────┴───────────┐ │
│ │ Aggregation API │ │
│ │ (polls all projects) │ │
│ └───────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Data Sources:
├── Local: /home/jeffe/Github/*/backlog/
└── Remote: ssh netcup "ls /opt/*/backlog/"
```
**Color Coding by Project:**
| Project | Color | Location |
|---------|-------|----------|
| canvas-website | Purple | Local + Netcup |
| hyperindex-system | Green | Local |
| mycofi-earth | Blue | Local + Netcup |
| decolonize-time | Orange | Local + Netcup |
| ai-orchestrator | Red | Netcup |
**Aggregation Service** (to be deployed on Netcup):
- Polls all project `backlog/tasks/` directories
- Serves unified JSON API at `api.backlog.jeffemmett.com`
- Web UI at `backlog.jeffemmett.com` shows combined Kanban
- Real-time updates via WebSocket
- Filter by project, status, priority, assignee
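Until that service exists, a minimal local sketch of the polling idea (assumes `yq` and `jq` are installed; the output shape is illustrative, not the final API):
```bash
# Collect YAML frontmatter from every local project's tasks into one JSON array
for f in /home/jeffe/Github/*/backlog/tasks/*.md; do
  project=$(basename "$(dirname "$(dirname "$(dirname "$f")")")")
  # Print only the lines between the first pair of '---' markers (the frontmatter)
  awk '/^---$/{n++; next} n==1' "$f" \
    | yq -o=json ". + {\"project\": \"$project\"}"
done | jq -s .
```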
### Agent Behavior Requirements
**When Claude starts working on ANY task:**
1. **Check for existing backlog** in the repo:
```bash
ls backlog/config.yml 2>/dev/null || echo "Backlog not initialized"
```
2. **If backlog exists**, search for related tasks:
```bash
backlog search "<relevant keywords>" --plain
```
3. **Create or update task** before writing code:
```bash
# If new task needed:
backlog task create "Task title" --status "In Progress"
# If task exists:
backlog task edit <id> --status "In Progress"
```
4. **Update task on completion**:
```bash
backlog task edit <id> --status "Done" --append-notes "Implementation complete"
```
5. **Never leave tasks in "In Progress"** when stopping work - either complete them or add notes explaining blockers.
### Viewing Tasks
**Terminal Kanban Board:**
```bash
backlog board
```
**Web Interface (single project):**
```bash
backlog browser --port 6420
```
**Unified View (all projects):**
Visit `backlog.jeffemmett.com` (served from Netcup)
### Backlog CLI Quick Reference
#### Task Operations
| Action | Command |
|--------|---------|
| View task | `backlog task 42 --plain` |
| List tasks | `backlog task list --plain` |
| Search tasks | `backlog search "topic" --plain` |
| Filter by status | `backlog task list -s "In Progress" --plain` |
| Create task | `backlog task create "Title" -d "Description" --ac "Criterion 1"` |
| Edit task | `backlog task edit 42 -t "New Title" -s "In Progress"` |
| Assign task | `backlog task edit 42 -a @claude` |
#### Acceptance Criteria Management
| Action | Command |
|--------|---------|
| Add AC | `backlog task edit 42 --ac "New criterion"` |
| Check AC #1 | `backlog task edit 42 --check-ac 1` |
| Check multiple | `backlog task edit 42 --check-ac 1 --check-ac 2` |
| Uncheck AC | `backlog task edit 42 --uncheck-ac 1` |
| Remove AC | `backlog task edit 42 --remove-ac 2` |
#### Multi-line Input (Description/Plan/Notes)
The CLI preserves input literally. Use shell-specific syntax for real newlines:
```bash
# Bash/Zsh (ANSI-C quoting)
backlog task edit 42 --notes $'Line1\nLine2\nLine3'
backlog task edit 42 --plan $'1. Step one\n2. Step two'
# POSIX portable
backlog task edit 42 --notes "$(printf 'Line1\nLine2')"
# Append notes progressively
backlog task edit 42 --append-notes $'- Completed X\n- Working on Y'
```
#### Definition of Done (DoD)
A task is **Done** only when ALL of these are complete:
**Via CLI:**
1. All acceptance criteria checked: `--check-ac <index>` for each
2. Implementation notes added: `--notes "..."` or `--append-notes "..."`
3. Status set to Done: `-s Done`
**Via Code/Testing:**
4. Tests pass (run test suite and linting)
5. Documentation updated if needed
6. Code self-reviewed
7. No regressions
**NEVER mark a task as Done without completing ALL items above.**
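All three CLI items can be combined into one close-out command (task id and notes are illustrative):
```bash
backlog task edit 42 --check-ac 1 --check-ac 2 \
  --append-notes $'- Implemented feature X\n- Test suite green' \
  -s Done
```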
### Configuration Reference
Default `backlog/config.yml`:
```yaml
project_name: "Project Name"
default_status: "To Do"
statuses: ["To Do", "In Progress", "Done"]
labels: []
milestones: []
date_format: yyyy-mm-dd
max_column_width: 20
auto_open_browser: true
default_port: 6420
remote_operations: true
auto_commit: true
zero_padded_ids: 3
bypass_git_hooks: false
check_active_branches: true
active_branch_days: 60
```
---
## 🔧 TROUBLESHOOTING
### tmux "server exited unexpectedly"
This error occurs when a stale socket file is left behind by a crashed tmux server.
**Fix:**
```bash
rm -f /tmp/tmux-$(id -u)/default
```
Then start a new session normally with `tmux` or `tmux new -s <name>`.