vLLM ignores top-level max_tokens, only reads from sampling_params. Also adds RUNPOD_API_KEY to compose for explicit env injection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

| Name | | |
|---|---|---|
| .dockerignore | ||
| .env.example | ||
| .gitignore | ||
| Dockerfile | ||
| docker-compose.yml | ||
| entrypoint.sh | ||
| requirements.txt | ||
| server.py | ||
| test_api.py | ||
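
The commit message above says that `max_tokens` only takes effect when it is nested under `sampling_params`, and that `RUNPOD_API_KEY` is injected through the environment. A minimal sketch of what a client might do under those two constraints follows. The helper names `build_vllm_input` and `auth_headers`, the outer `input` wrapper, and the `Bearer` header format are assumptions for illustration; they are not taken from `server.py` or `test_api.py`.

```python
import os
from typing import Any, Mapping


def build_vllm_input(prompt: str, max_tokens: int = 256, **sampling: Any) -> dict:
    """Build a request body with max_tokens nested under sampling_params.

    Hypothetical helper: the "input" wrapper is an assumed payload shape.
    A top-level max_tokens is deliberately never emitted, since the commit
    message says it is ignored.
    """
    sampling_params = {"max_tokens": max_tokens, **sampling}
    return {"input": {"prompt": prompt, "sampling_params": sampling_params}}


def auth_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Read RUNPOD_API_KEY from the environment and fail loudly if it is absent.

    Hypothetical helper: the Bearer header format is an assumption.
    """
    api_key = env.get("RUNPOD_API_KEY")
    if not api_key:
        raise RuntimeError("RUNPOD_API_KEY is not set")
    return {"Authorization": f"Bearer {api_key}"}


if __name__ == "__main__":
    payload = build_vllm_input("Hello", max_tokens=64, temperature=0.7)
    print(payload)
```

Keeping the token limit inside `sampling_params` in one helper means callers cannot accidentally pass it at the top level, where it would be silently dropped.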