AI Orchestrator - Smart routing between Ollama (free) and RunPod (GPU)
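The "smart routing" idea from the tagline can be sketched as a simple backend selector: prefer the free local Ollama instance and fall back to the RunPod GPU endpoint when Ollama is unavailable or the request looks heavy. This is a hypothetical illustration of the concept; the function and threshold below are not taken from `server.py`.

```python
def choose_backend(prompt: str, ollama_available: bool, heavy_threshold: int = 2000) -> str:
    """Pick 'ollama' (free, local) or 'runpod' (paid GPU) for a request.

    Illustrative sketch: prompt length in characters stands in as a rough
    proxy for workload size; the real orchestrator may use other signals.
    """
    if not ollama_available or len(prompt) > heavy_threshold:
        return "runpod"  # GPU fallback for heavy or failed-over requests
    return "ollama"      # default to the free local backend
```
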
Latest commit: cc2d06bbfb by Jeff Emmett
fix: use sampling_params for RunPod vLLM max_tokens + add API key env

vLLM ignores a top-level max_tokens; it only reads the value from sampling_params.
Also adds RUNPOD_API_KEY to compose for explicit env injection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Date: 2026-03-23 10:15:12 -07:00
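The fix described above concerns where generation options live in the request body: the RunPod vLLM worker reads them from a nested `sampling_params` object, so a top-level `max_tokens` is silently ignored. A minimal sketch of the corrected payload shape, assuming a standard RunPod serverless-style `input` envelope (the prompt and option values are illustrative):

```python
import os

def build_runpod_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build a request body with max_tokens nested inside sampling_params.

    Sketch of the payload shape the fix targets: max_tokens placed at the
    top level of "input" would be ignored by the vLLM worker.
    """
    return {
        "input": {
            "prompt": prompt,
            # Generation options must live inside sampling_params
            "sampling_params": {
                "max_tokens": max_tokens,
                "temperature": 0.7,  # illustrative value
            },
        }
    }

def auth_header() -> dict:
    # RUNPOD_API_KEY is injected via the environment (see docker-compose.yml
    # and entrypoint.sh, which wire it through Infisical/compose env).
    return {"Authorization": f"Bearer {os.environ.get('RUNPOD_API_KEY', '')}"}
```
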
| File               | Last commit message                                                  | Date                       |
|--------------------|----------------------------------------------------------------------|----------------------------|
| .dockerignore      | Add .dockerignore for optimized Docker builds                        | 2026-02-21 17:59:08 -07:00 |
| .env.example       | Initial commit: AI Orchestrator with Ollama + RunPod smart routing   | 2025-11-26 19:11:58 -08:00 |
| .gitignore         | Initial commit: AI Orchestrator with Ollama + RunPod smart routing   | 2025-11-26 19:11:58 -08:00 |
| Dockerfile         | Wire Infisical secret injection for RUNPOD_API_KEY                   | 2026-02-24 09:32:34 -08:00 |
| docker-compose.yml | fix: use sampling_params for RunPod vLLM max_tokens + add API key env | 2026-03-23 10:15:12 -07:00 |
| entrypoint.sh      | Wire Infisical secret injection for RUNPOD_API_KEY                   | 2026-02-24 09:32:34 -08:00 |
| requirements.txt   | Initial commit: AI Orchestrator with Ollama + RunPod smart routing   | 2025-11-26 19:11:58 -08:00 |
| server.py          | fix: use sampling_params for RunPod vLLM max_tokens + add API key env | 2026-03-23 10:15:12 -07:00 |
| test_api.py        | Initial commit: AI Orchestrator with Ollama + RunPod smart routing   | 2025-11-26 19:11:58 -08:00 |