P2P Wiki AI — RAG chat and article generation for the P2P Foundation Wiki
Jeff Emmett 156a5324a2
ci: run smoke test via SSH from host for reliable DNS
Runner container can't always resolve Cloudflare-tunneled domains.
Run curl from host via SSH instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-01 12:32:34 -07:00
.gitea/workflows ci: run smoke test via SSH from host for reliable DNS 2026-04-01 12:32:34 -07:00
backlog chore: add backlog-notify onStatusChange hook 2026-03-10 15:30:29 -07:00
blog_sample Add wiki draft approval UI, MediaWiki client, HitCounters extension, and blog parser 2026-02-12 21:03:20 -07:00
blog_static fix: match all blog* hostnames in rdata analytics mu-plugin 2026-03-10 14:16:56 -07:00
drafts Add draft approval gadgets and wiki draft articles 2026-02-02 16:58:51 +00:00
src Add wiki draft approval UI, MediaWiki client, HitCounters extension, and blog parser 2026-02-12 21:03:20 -07:00
web feat: add rData (Umami) analytics to p2pwiki, p2pf blog and website 2026-03-10 14:06:08 -07:00
wiki_deploy Add CirrusSearch deployment script and backlog tasks 2026-02-21 14:41:32 -07:00
wiki_scripts Add MediaWiki client and gadget installation script 2026-02-05 12:57:53 +00:00
.dockerignore Add .dockerignore for optimized Docker builds 2026-02-21 17:48:54 -07:00
.env.example Initial commit: P2P Wiki AI system 2026-01-23 13:53:29 +01:00
.gitignore Add wiki draft approval UI, MediaWiki client, HitCounters extension, and blog parser 2026-02-12 21:03:20 -07:00
Dockerfile Wire Infisical secret injection for ANTHROPIC_API_KEY 2026-02-24 09:32:37 -08:00
README.md Initial commit: P2P Wiki AI system 2026-01-23 13:53:29 +01:00
docker-compose.yml Update docker-compose configuration 2026-03-21 21:20:33 +00:00
entrypoint.sh Wire Infisical secret injection for ANTHROPIC_API_KEY 2026-02-24 09:32:37 -08:00
pagenames.txt Add wiki draft approval UI, MediaWiki client, HitCounters extension, and blog parser 2026-02-12 21:03:20 -07:00
pyproject.toml Initial commit: P2P Wiki AI system 2026-01-23 13:53:29 +01:00

README.md

P2P Wiki AI

AI-augmented system for the P2P Foundation Wiki with two main features:

  1. Conversational Agent - Ask questions about the 23,000+ wiki articles using RAG (Retrieval-Augmented Generation)
  2. Article Ingress Pipeline - Drop article URLs to automatically analyze content, find matching wiki articles for citations, and generate draft articles

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    P2P Wiki AI System                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────┐     ┌─────────────────┐                   │
│  │   Chat (Q&A)    │     │  Ingress Tool   │                   │
│  │   via RAG       │     │  (URL Drop)     │                   │
│  └────────┬────────┘     └────────┬────────┘                   │
│           │                       │                             │
│           └───────────┬───────────┘                             │
│                       ▼                                         │
│           ┌───────────────────────┐                             │
│           │    FastAPI Backend    │                             │
│           └───────────┬───────────┘                             │
│                       │                                         │
│        ┌──────────────┼──────────────┐                         │
│        ▼              ▼              ▼                          │
│  ┌──────────┐  ┌─────────────┐  ┌──────────────┐               │
│  │ ChromaDB │  │ Ollama/     │  │   Article    │               │
│  │ (Vector) │  │ Claude      │  │   Scraper    │               │
│  └──────────┘  └─────────────┘  └──────────────┘               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
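
The chat path in the diagram (retrieve from the vector store, then generate with an LLM) can be sketched as below. This is a minimal illustration, not the project's src/rag.py: the function names, the prompt wording, and the passage shape (dicts with "title" and "text") are all assumptions, and the retriever and generator are passed in as plain callables so the flow can be read without ChromaDB or an LLM installed.

```python
from typing import Callable


def build_prompt(question: str, passages: list[dict]) -> str:
    """Assemble one prompt from the retrieved wiki excerpts (hypothetical format)."""
    context = "\n\n".join(f"[{p['title']}]\n{p['text']}" for p in passages)
    return (
        "Answer the question using only the wiki excerpts below.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


def answer(
    question: str,
    retrieve: Callable[[str], list[dict]],
    generate: Callable[[str], str],
) -> dict:
    """Retrieve passages, build the prompt, and return the answer with its sources."""
    passages = retrieve(question)
    prompt = build_prompt(question, passages)
    return {
        "answer": generate(prompt),
        "sources": [p["title"] for p in passages],
    }
```

In a real deployment `retrieve` would query the vector store and `generate` would call Ollama or Claude; returning the source titles alongside the answer lets the UI cite the articles that were used.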

Quick Start

1. Prerequisites

  • Python 3.10+
  • Ollama installed locally (or access to a remote Ollama server)
  • Optional: Anthropic API key for Claude (higher quality article drafts)

2. Install Dependencies

cd p2pwiki-content  # your local checkout of this repository
pip install -e .

3. Parse Wiki Content

Convert the MediaWiki XML dumps to searchable JSON:

python -m src.parser

This creates data/articles.json with all parsed articles (~23,000 pages).
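
To give a sense of what this step involves, here is a minimal sketch of pulling titles and wikitext out of a MediaWiki XML export with the standard library. It is not the code in src/parser.py: the function names and the output keys ("title", "content") are assumptions, and the inline sample stands in for a real dump file.

```python
import json
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_dump(xml_text: str) -> list[dict]:
    """Return one dict per <page>, with its title and the last <text> found."""
    articles = []
    root = ET.fromstring(xml_text)
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        title, text = None, ""
        for node in page.iter():
            name = local_name(node.tag)
            if name == "title":
                title = node.text
            elif name == "text":
                text = node.text or ""
        if title:
            articles.append({"title": title, "content": text})
    return articles


# Tiny stand-in for a real export file.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Commons</title>
    <revision><text>Shared resources managed by a community.</text></revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    print(json.dumps(parse_dump(SAMPLE), indent=2))
```

For a dump with tens of thousands of pages, a streaming parser such as `xml.etree.ElementTree.iterparse` would likely be a better fit than loading the whole file into memory.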

4. Generate Embeddings

Create the vector store for semantic search:

python -m src.embeddings

This creates the ChromaDB vector store in data/chroma/. The process takes a few minutes.
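
The idea behind a vector store can be shown with a toy, in-memory version. The sketch below is not how src/embeddings.py works: it swaps sentence-transformers for simple word counts and swaps ChromaDB for a Python list, and every name in it is hypothetical. What it does share with the real thing is the principle of ranking documents by cosine similarity to the query vector.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Keeps (title, vector) pairs in memory and ranks them per query."""

    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add(self, title: str, text: str) -> None:
        self.items.append((title, embed(text)))

    def query(self, question: str, n: int = 3) -> list[str]:
        q = embed(question)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [title for title, _ in ranked[:n]]
```

A real embedding model also matches paraphrases that share no words with the query, which word counts cannot do.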

5. Configure Environment

cp .env.example .env
# Edit .env with your settings

6. Run the Server

python -m src.api

Visit http://localhost:8420/ui for the web interface.

Docker Deployment

For production deployment on the RS 8000:

# Build and run
docker compose up -d --build

# Check logs
docker compose logs -f

# Access at http://localhost:8420/ui
# Or via Traefik at https://wiki-ai.jeffemmett.com

API Endpoints

Chat

# Ask a question
curl -X POST http://localhost:8420/chat \
  -H "Content-Type: application/json" \
  -d '{"query": "What is commons-based peer production?"}'
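
The same request can be made from Python using only the standard library. This is a sketch under two assumptions: the server is running on localhost:8420 as in the Quick Start, and the endpoint returns a JSON object (its exact fields are not documented here, so the result is returned undecorated). The function names are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8420"


def build_chat_request(query: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /chat request without sending it."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str) -> dict:
    """Send the question and return the decoded JSON response (shape assumed)."""
    with urllib.request.urlopen(build_chat_request(query), timeout=60) as response:
        return json.load(response)
```

Calling `ask("What is commons-based peer production?")` requires the server from step 6 to be running.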

Ingress

# Process an external article
curl -X POST http://localhost:8420/ingress \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/article-about-cooperatives"}'
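
The first stage of the ingress pipeline is scraping the article at the given URL. As a rough illustration of that stage only, the sketch below extracts a page title and paragraph text from raw HTML with the standard library. It is not the scraper in src/ingress.py, which may use a different library and heuristics; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class ArticleText(HTMLParser):
    """Collect the <title> text and the text inside <p> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.paragraphs: list[str] = []
        self._current = None  # "title" or "p" while inside one, else None

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "p"):
            self._current = tag
            if tag == "p":
                self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "p":
            # Text from inline tags such as <a> is kept with its paragraph.
            self.paragraphs[-1] += data


def extract_article(html: str) -> dict:
    """Return the page title and the non-empty paragraphs joined by blank lines."""
    parser = ArticleText()
    parser.feed(html)
    parser.close()
    text = "\n\n".join(p.strip() for p in parser.paragraphs if p.strip())
    return {"title": parser.title.strip(), "text": text}
```

Text outside paragraphs, such as navigation menus, is ignored, which is a crude but useful way to keep boilerplate out of the analysis step.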

Review Queue

# Get all items in review queue
curl http://localhost:8420/review

# Approve a draft article
curl -X POST http://localhost:8420/review/action \
  -H "Content-Type: application/json" \
  -d '{"filepath": "/path/to/item.json", "item_type": "draft", "item_index": 0, "action": "approve"}'

Search

# Direct vector search
curl "http://localhost:8420/search?q=cooperative%20economics&n=10"

# List article titles
curl "http://localhost:8420/articles?limit=100"
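
When calling these GET endpoints from code, the query string needs the same percent-encoding that the curl examples apply by hand. A small sketch, assuming the `q`, `n`, and `limit` parameters shown above and the default local address; the helper names are illustrative:

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8420"


def search_url(q: str, n: int = 10, base_url: str = BASE_URL) -> str:
    """Build the /search URL, encoding spaces as %20 rather than '+'."""
    return f"{base_url}/search?{urlencode({'q': q, 'n': n}, quote_via=quote)}"


def articles_url(limit: int = 100, base_url: str = BASE_URL) -> str:
    """Build the /articles URL with a result limit."""
    return f"{base_url}/articles?{urlencode({'limit': limit})}"
```

Either URL can then be passed to `urllib.request.urlopen` or any HTTP client.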

Hybrid AI Routing

The system routes each task to either a local LLM (Ollama) or a cloud LLM (Claude), depending on the task type:

Task              Default LLM                    Reasoning
----------------  -----------------------------  ----------------------------------------------------------
Chat Q&A          Ollama                         Fast, free, good enough for retrieval-based answers
Content Analysis  Claude                         Better at extracting topics and identifying wiki relevance
Draft Generation  Claude                         Higher quality article writing
Embeddings        Local (sentence-transformers)  Fast, free, optimized for semantic search

Configure in .env:

USE_CLAUDE_FOR_DRAFTS=true
USE_OLLAMA_FOR_CHAT=true
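
These two flags could drive a routing function along the following lines. This is a minimal sketch, not the logic in src/llm.py: the `env_flag` and `pick_backend` names, the task labels ("chat", "draft", "analysis"), and the set of strings treated as true are all assumptions.

```python
import os


def env_flag(name: str, default: bool) -> bool:
    """Read a boolean flag from the environment (accepted spellings are assumed)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def pick_backend(task: str) -> str:
    """Return "ollama" or "claude" for a task, following the table's defaults."""
    if task == "chat":
        return "ollama" if env_flag("USE_OLLAMA_FOR_CHAT", True) else "claude"
    if task == "draft":
        return "claude" if env_flag("USE_CLAUDE_FOR_DRAFTS", True) else "ollama"
    if task == "analysis":
        # The table lists Claude; no flag is shown for this task, so it is fixed here.
        return "claude"
    raise ValueError(f"unknown task: {task!r}")
```

Keeping the decision in one function means the rest of the code can ask for a backend by task name without reading environment variables itself.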

Project Structure

p2pwiki-content/
├── src/
│   ├── api.py          # FastAPI backend
│   ├── config.py       # Configuration settings
│   ├── embeddings.py   # Vector store (ChromaDB)
│   ├── ingress.py      # Article scraper & analyzer
│   ├── llm.py          # LLM client (Ollama/Claude)
│   ├── parser.py       # MediaWiki XML parser
│   └── rag.py          # RAG chat system
├── web/
│   └── index.html      # Web UI
├── data/
│   ├── articles.json   # Parsed wiki content
│   ├── chroma/         # Vector store
│   └── review_queue/   # Pending ingress items
├── xmldump/            # MediaWiki XML dumps
├── docker-compose.yml
├── Dockerfile
└── pyproject.toml

Content Coverage

The P2P Foundation Wiki contains ~23,000 articles covering:

  • Peer-to-peer networks and culture
  • Commons-based peer production (CBPP)
  • Alternative economics and post-capitalism
  • Cooperative business models
  • Open source and free culture
  • Collaborative governance
  • Sustainability and ecology

License

The wiki content comes from the P2P Foundation and remains under its respective licenses. The AI system code is provided as-is for educational purposes.