CPU-based Ollama inference on Netcup is too slow due to server memory pressure. Add OpenAI-compatible API support so we can use Gemini Flash or other cloud APIs for clip analysis. Also increase the transcript sample size to 20K chars, since cloud APIs handle it easily.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
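A minimal sketch of what the OpenAI-compatible path could look like, assuming the `openai` Python client is added to requirements.txt. The environment variable names (`OPENAI_BASE_URL`, `OPENAI_API_KEY`, `OPENAI_MODEL`), the default model, and the `analyze_clip` helper are hypothetical placeholders, not the app's actual config; the Gemini base URL is shown only as one example of an OpenAI-compatible endpoint.

```python
import os

from openai import OpenAI

# Hypothetical config; actual env var names and defaults depend on the app.
OPENAI_BASE_URL = os.environ.get(
    "OPENAI_BASE_URL",
    "https://generativelanguage.googleapis.com/v1beta/openai/",  # example: Gemini's OpenAI-compatible endpoint
)
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "")
OPENAI_MODEL = os.environ.get("OPENAI_MODEL", "gemini-2.0-flash")

# Transcript sample raised to 20K chars since cloud APIs handle it easily.
TRANSCRIPT_SAMPLE_CHARS = 20_000


def analyze_clip(transcript: str) -> str:
    """Send a transcript sample to an OpenAI-compatible chat endpoint for clip analysis."""
    client = OpenAI(base_url=OPENAI_BASE_URL, api_key=OPENAI_API_KEY)
    sample = transcript[:TRANSCRIPT_SAMPLE_CHARS]
    response = client.chat.completions.create(
        model=OPENAI_MODEL,
        messages=[
            {"role": "system", "content": "You analyze video transcripts and suggest clip-worthy segments."},
            {"role": "user", "content": sample},
        ],
    )
    return response.choices[0].message.content
```

Because the client only needs a base URL and key, the same code path can still point at a local Ollama server's OpenAI-compatible endpoint if cloud access is unavailable.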