- Cut context to 512 tokens, max output to 128
- Only 2 retrieval chunks of 150 chars each (no headers)
- Keep only last 2 conversation messages
- Minimized system prompt overhead

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

| Name | | |
|---|---|---|
| scraper | ||
| static | ||
| __init__.py | ||
| config.py | ||
| database.py | ||
| embeddings.py | ||
| llm.py | ||
| main.py | ||
| models.py | ||
| rag.py | ||
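
The limits listed in the commit message above could be applied roughly as follows. This is only a sketch: the names `Limits`, `trim_chunks`, `trim_history` and `build_prompt` are hypothetical and are not taken from `config.py`, `rag.py` or `llm.py`, whose contents are not shown here. The two token limits are carried as plain fields, since enforcing them depends on the tokenizer and model client in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    # Values taken from the commit message; field names are assumptions.
    max_context_tokens: int = 512
    max_output_tokens: int = 128
    max_chunks: int = 2
    max_chunk_chars: int = 150
    max_history_messages: int = 2


def trim_chunks(chunks, limits):
    """Keep the first N retrieval chunks, each cut to a fixed character length."""
    return [c[: limits.max_chunk_chars] for c in chunks[: limits.max_chunks]]


def trim_history(messages, limits):
    """Keep only the most recent N (role, text) conversation messages."""
    n = limits.max_history_messages
    return messages[-n:] if n > 0 else []


def build_prompt(system, chunks, messages, question, limits=Limits()):
    """Join system prompt, trimmed chunks (no headers), recent history, and question."""
    parts = [system]
    parts.extend(trim_chunks(chunks, limits))
    parts.extend(f"{role}: {text}" for role, text in trim_history(messages, limits))
    parts.append(f"user: {question}")
    return "\n".join(parts)
```

Keeping the limits in one frozen dataclass means a later change to any of them touches a single place, and the trimming helpers stay trivial to test.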