P2P Wiki AI — RAG chat and article generation for the P2P Foundation Wiki
Jeff Emmett dcd576944c Make ingress async to avoid Cloudflare timeout
The LLM analysis step was taking too long and causing 524 errors.
Now returns immediately and processes in background.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 10:46:09 +01:00


P2P Wiki AI

AI-augmented system for the P2P Foundation Wiki with two main features:

  1. Conversational Agent - Ask questions about the 23,000+ wiki articles using RAG (Retrieval Augmented Generation)
  2. Article Ingress Pipeline - Drop article URLs to automatically analyze content, find matching wiki articles for citations, and generate draft articles

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    P2P Wiki AI System                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────┐     ┌─────────────────┐                   │
│  │   Chat (Q&A)    │     │  Ingress Tool   │                   │
│  │   via RAG       │     │  (URL Drop)     │                   │
│  └────────┬────────┘     └────────┬────────┘                   │
│           │                       │                             │
│           └───────────┬───────────┘                             │
│                       ▼                                         │
│           ┌───────────────────────┐                             │
│           │    FastAPI Backend    │                             │
│           └───────────┬───────────┘                             │
│                       │                                         │
│        ┌──────────────┼──────────────┐                         │
│        ▼              ▼              ▼                          │
│  ┌──────────┐  ┌─────────────┐  ┌──────────────┐               │
│  │ ChromaDB │  │ Ollama/     │  │   Article    │               │
│  │ (Vector) │  │ Claude      │  │   Scraper    │               │
│  └──────────┘  └─────────────┘  └──────────────┘               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Quick Start

1. Prerequisites

  • Python 3.10+
  • Ollama installed locally (or access to a remote Ollama server)
  • Optional: Anthropic API key for Claude (higher quality article drafts)

2. Install Dependencies

cd p2pwiki-content
pip install -e .

3. Parse Wiki Content

Convert the MediaWiki XML dumps to searchable JSON:

python -m src.parser

This creates data/articles.json with all parsed articles (~23,000 pages).
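The exact schema of data/articles.json isn't documented here; assuming it is a JSON list of records with at least "title" and "text" fields (an assumption, not confirmed by the source), a quick sanity check after parsing might look like:

```python
import json

def load_articles(path="data/articles.json"):
    """Load the parsed wiki articles produced by src.parser."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def titles_matching(articles, keyword):
    """Return article titles containing the keyword, case-insensitively."""
    kw = keyword.lower()
    return [a["title"] for a in articles if kw in a["title"].lower()]
```

For example, `titles_matching(load_articles(), "peer")` should surface the peer-production pages if parsing succeeded.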

4. Generate Embeddings

Create the vector store for semantic search:

python -m src.embeddings

This creates the ChromaDB vector store in data/chroma/. Takes a few minutes.
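Semantic search works by ranking article embeddings by similarity to the query embedding. ChromaDB does this efficiently at scale; the core idea can be sketched in a few lines (illustrative only, not the src.embeddings implementation):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, doc_vecs, k=3):
    """Return the k document titles most similar to the query vector."""
    ranked = sorted(doc_vecs, key=lambda t: cosine(query_vec, doc_vecs[t]), reverse=True)
    return ranked[:k]
```

In the real system, the vectors come from a sentence-transformers model rather than being hand-built.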

5. Configure Environment

cp .env.example .env
# Edit .env with your settings

6. Run the Server

python -m src.api

Visit http://localhost:8420/ui for the web interface.

Docker Deployment

For production deployment on the RS 8000:

# Build and run
docker compose up -d --build

# Check logs
docker compose logs -f

# Access at http://localhost:8420/ui
# Or via Traefik at https://wiki-ai.jeffemmett.com

API Endpoints

Chat

# Ask a question
curl -X POST http://localhost:8420/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "What is commons-based peer production?"}'
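The same request can be made from Python with only the standard library. The response schema is whatever /chat returns; this sketch just parses it as JSON:

```python
import json
import urllib.request

API = "http://localhost:8420"  # local dev server from the Quick Start

def build_chat_request(query, api=API):
    """Build a JSON POST request for the /chat endpoint."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{api}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(query, api=API, timeout=120):
    """Send a question and return the parsed JSON response."""
    with urllib.request.urlopen(build_chat_request(query, api), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A generous timeout is sensible here, since the answer is generated by an LLM.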

Ingress

# Process an external article
curl -X POST http://localhost:8420/ingress \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/article-about-cooperatives"}'

Review Queue

# Get all items in review queue
curl http://localhost:8420/review

# Approve a draft article
curl -X POST http://localhost:8420/review/action \
  -H "Content-Type: application/json" \
  -d '{"filepath": "/path/to/item.json", "item_type": "draft", "item_index": 0, "action": "approve"}'

Search

# Direct vector search
curl "http://localhost:8420/search?q=cooperative%20economics&n=10"

# List article titles
curl "http://localhost:8420/articles?limit=100"
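When building these URLs programmatically, the query string needs percent-encoding (the %20 in the curl example above); a small helper (names are illustrative):

```python
from urllib.parse import urlencode

def search_url(query, n=10, api="http://localhost:8420"):
    """Build a /search URL with a safely encoded query string."""
    return f"{api}/search?{urlencode({'q': query, 'n': n})}"
```

Note that urlencode encodes spaces as "+", which the server treats the same as "%20".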

Hybrid AI Routing

The system uses intelligent routing between local (Ollama) and cloud (Claude) LLMs:

Task              Default LLM                    Reasoning
Chat Q&A          Ollama                         Fast, free, good enough for retrieval-based answers
Content Analysis  Claude                         Better at extracting topics and identifying wiki relevance
Draft Generation  Claude                         Higher quality article writing
Embeddings        Local (sentence-transformers)  Fast, free, optimized for semantic search

Configure in .env:

USE_CLAUDE_FOR_DRAFTS=true
USE_OLLAMA_FOR_CHAT=true
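A sketch of how these two flags could drive the routing table above (function and task names are assumptions for illustration, not the actual src.llm API; in particular, tying content analysis to the drafts flag is a guess):

```python
import os

def pick_backend(task, env=None):
    """Map a task name to an LLM backend using the two .env flags."""
    env = os.environ if env is None else env
    claude_drafts = env.get("USE_CLAUDE_FOR_DRAFTS", "true").lower() == "true"
    ollama_chat = env.get("USE_OLLAMA_FOR_CHAT", "true").lower() == "true"
    if task == "chat":
        return "ollama" if ollama_chat else "claude"
    if task in ("analysis", "draft"):
        return "claude" if claude_drafts else "ollama"
    if task == "embeddings":
        return "sentence-transformers"  # embeddings always stay local
    raise ValueError(f"unknown task: {task!r}")
```

With no flags set, this reproduces the defaults in the table: Ollama for chat, Claude for analysis and drafts, local embeddings.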

Project Structure

p2pwiki-content/
├── src/
│   ├── api.py          # FastAPI backend
│   ├── config.py       # Configuration settings
│   ├── embeddings.py   # Vector store (ChromaDB)
│   ├── ingress.py      # Article scraper & analyzer
│   ├── llm.py          # LLM client (Ollama/Claude)
│   ├── parser.py       # MediaWiki XML parser
│   └── rag.py          # RAG chat system
├── web/
│   └── index.html      # Web UI
├── data/
│   ├── articles.json   # Parsed wiki content
│   ├── chroma/         # Vector store
│   └── review_queue/   # Pending ingress items
├── xmldump/            # MediaWiki XML dumps
├── docker-compose.yml
├── Dockerfile
└── pyproject.toml

Content Coverage

The P2P Foundation Wiki contains ~23,000 articles covering:

  • Peer-to-peer networks and culture
  • Commons-based peer production (CBPP)
  • Alternative economics and post-capitalism
  • Cooperative business models
  • Open source and free culture
  • Collaborative governance
  • Sustainability and ecology

License

The wiki content comes from the P2P Foundation and remains under its respective licenses. The AI system code is provided as-is for educational purposes.