Go to file
Jeff Emmett 3215283f97 Speed up bot: use llama3.2:1b, reduce context, limit tokens
- Switch default model from llama3.1:8b to llama3.2:1b (2x faster on CPU)
- Limit Ollama context to 2048 tokens and max output to 512 tokens
- Reduce retrieval chunks from 4 to 3, chunk content from 800 to 500 chars
- Trim conversation history from 10 to 6 messages
- Shorten system prompt to reduce input tokens

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 19:44:04 -07:00
app Speed up bot: use llama3.2:1b, reduce context, limit tokens 2026-02-16 19:44:04 -07:00
backlog Initialize backlog and record deployment setup 2026-02-16 18:51:00 -07:00
.env.example Initial commit: Erowid conversational bot 2026-02-17 01:19:49 +00:00
.gitignore Initial commit: Erowid conversational bot 2026-02-17 01:19:49 +00:00
Dockerfile Initial commit: Erowid conversational bot 2026-02-17 01:19:49 +00:00
docker-compose.yml Update Traefik host to erowid.psilo-cyber.net 2026-02-16 18:38:58 -07:00
requirements.txt Initial commit: Erowid conversational bot 2026-02-17 01:19:49 +00:00