feat(meeting-intelligence): add backend infrastructure for transcription and AI summaries
Add complete Meeting Intelligence System infrastructure:
Backend Services:
- PostgreSQL schema with pgvector for semantic search
- Transcription service using whisper.cpp and resemblyzer for diarization
- Meeting Intelligence API with FastAPI
- Jibri configuration for recording
API Endpoints:
- /meetings - List, get, delete meetings
- /meetings/{id}/transcript - Get transcripts with speaker attribution
- /meetings/{id}/summary - Generate AI summaries via Ollama
- /search - Full-text and semantic search
- /meetings/{id}/export - Export as PDF, Markdown, JSON
- /webhooks/recording-complete - Jibri callback
Features:
- Zero-cost local transcription (whisper.cpp CPU)
- Speaker diarization (who said what)
- AI-powered summaries with key points, action items, decisions
- Vector embeddings for semantic search
- Multi-format export
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
parent
f56986818b
commit
4cb219db0f
|
|
@ -0,0 +1,17 @@
|
||||||
|
# Meeting Intelligence System - Environment Variables
|
||||||
|
# Copy this file to .env and update values
|
||||||
|
|
||||||
|
# PostgreSQL
|
||||||
|
POSTGRES_PASSWORD=your-secure-password-here
|
||||||
|
|
||||||
|
# API Security
|
||||||
|
API_SECRET_KEY=your-api-secret-key-here
|
||||||
|
|
||||||
|
# Jibri XMPP Configuration
|
||||||
|
XMPP_SERVER=meet.jeffemmett.com
|
||||||
|
XMPP_DOMAIN=meet.jeffemmett.com
|
||||||
|
JIBRI_XMPP_PASSWORD=jibri-xmpp-password
|
||||||
|
JIBRI_RECORDER_PASSWORD=recorder-password
|
||||||
|
|
||||||
|
# Ollama (uses host.docker.internal by default)
|
||||||
|
# OLLAMA_URL=http://host.docker.internal:11434
|
||||||
|
|
@ -0,0 +1,151 @@
|
||||||
|
# Meeting Intelligence System
|
||||||
|
|
||||||
|
A fully self-hosted, zero-cost meeting intelligence system for Jeffsi Meet that provides:
|
||||||
|
- Automatic meeting recording via Jibri
|
||||||
|
- Local transcription via whisper.cpp (CPU-only)
|
||||||
|
- Speaker diarization (who said what)
|
||||||
|
- AI-powered summaries via Ollama
|
||||||
|
- Searchable meeting archive with dashboard
|
||||||
|
|
||||||
|
## Architecture
|
||||||
|
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────────────────────────┐
|
||||||
|
│ Netcup RS 8000 (Backend) │
|
||||||
|
│ │
|
||||||
|
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │
|
||||||
|
│ │ Jibri │───▶│ Whisper │───▶│ AI Processor │ │
|
||||||
|
│ │ Recording │ │ Transcriber │ │ (Ollama + Summarizer) │ │
|
||||||
|
│ │ Container │ │ Service │ │ │ │
|
||||||
|
│ └─────────────┘ └─────────────┘ └─────────────────────────┘ │
|
||||||
|
│ │ │ │ │
|
||||||
|
│ ▼ ▼ ▼ │
|
||||||
|
│ ┌─────────────────────────────────────────────────────────────┐ │
|
||||||
|
│ │ PostgreSQL + pgvector │ │
|
||||||
|
│ └─────────────────────────────────────────────────────────────┘ │
|
||||||
|
└─────────────────────────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
## Components
|
||||||
|
|
||||||
|
| Service | Port | Description |
|
||||||
|
|---------|------|-------------|
|
||||||
|
| PostgreSQL | 5432 | Database with pgvector for semantic search |
|
||||||
|
| Redis | 6379 | Job queue for async processing |
|
||||||
|
| Transcriber | 8001 | whisper.cpp + speaker diarization |
|
||||||
|
| API | 8000 | REST API for meetings, transcripts, search |
|
||||||
|
| Jibri | - | Recording service (joins meetings as hidden participant) |
|
||||||
|
|
||||||
|
## Deployment
|
||||||
|
|
||||||
|
### Prerequisites
|
||||||
|
|
||||||
|
1. Docker and Docker Compose installed
|
||||||
|
2. Ollama running on the host (for AI summaries)
|
||||||
|
3. Jeffsi Meet configured with recording enabled
|
||||||
|
|
||||||
|
### Setup
|
||||||
|
|
||||||
|
1. Copy environment file:
|
||||||
|
```bash
|
||||||
|
cp .env.example .env
|
||||||
|
```
|
||||||
|
|
||||||
|
2. Edit `.env` with your configuration:
|
||||||
|
```bash
|
||||||
|
vim .env
|
||||||
|
```
|
||||||
|
|
||||||
|
3. Create storage directories:
|
||||||
|
```bash
|
||||||
|
sudo mkdir -p /opt/meetings/{recordings,audio}
|
||||||
|
sudo chown -R 1000:1000 /opt/meetings
|
||||||
|
```
|
||||||
|
|
||||||
|
4. Start services:
|
||||||
|
```bash
|
||||||
|
docker compose up -d
|
||||||
|
```
|
||||||
|
|
||||||
|
5. Check logs:
|
||||||
|
```bash
|
||||||
|
docker compose logs -f
|
||||||
|
```
|
||||||
|
|
||||||
|
## API Endpoints
|
||||||
|
|
||||||
|
Base URL: `https://meet.jeffemmett.com/api/intelligence`
|
||||||
|
|
||||||
|
### Meetings
|
||||||
|
- `GET /meetings` - List all meetings
|
||||||
|
- `GET /meetings/{id}` - Get meeting details
|
||||||
|
- `DELETE /meetings/{id}` - Delete meeting
|
||||||
|
|
||||||
|
### Transcripts
|
||||||
|
- `GET /meetings/{id}/transcript` - Get full transcript
|
||||||
|
- `GET /meetings/{id}/transcript/text` - Get as plain text
|
||||||
|
- `GET /meetings/{id}/speakers` - Get speaker statistics
|
||||||
|
|
||||||
|
### Summaries
|
||||||
|
- `GET /meetings/{id}/summary` - Get AI summary
|
||||||
|
- `POST /meetings/{id}/summary` - Generate summary
|
||||||
|
|
||||||
|
### Search
|
||||||
|
- `POST /search` - Search transcripts (text + semantic)
|
||||||
|
- `GET /search/suggest` - Get search suggestions
|
||||||
|
|
||||||
|
### Export
|
||||||
|
- `GET /meetings/{id}/export?format=markdown` - Export as Markdown
|
||||||
|
- `GET /meetings/{id}/export?format=json` - Export as JSON
|
||||||
|
- `GET /meetings/{id}/export?format=pdf` - Export as PDF
|
||||||
|
|
||||||
|
### Webhooks
|
||||||
|
- `POST /webhooks/recording-complete` - Jibri recording callback
|
||||||
|
|
||||||
|
## Processing Pipeline
|
||||||
|
|
||||||
|
1. **Recording** - Jibri joins meeting and records
|
||||||
|
2. **Webhook** - Jibri calls `/webhooks/recording-complete`
|
||||||
|
3. **Audio Extraction** - FFmpeg extracts audio from video
|
||||||
|
4. **Transcription** - whisper.cpp transcribes audio
|
||||||
|
5. **Diarization** - resemblyzer identifies speakers
|
||||||
|
6. **Embedding** - Generate vector embeddings for search
|
||||||
|
7. **Summary** - Ollama generates AI summary
|
||||||
|
8. **Ready** - Meeting available in dashboard
|
||||||
|
|
||||||
|
## Resource Usage
|
||||||
|
|
||||||
|
| Service | CPU | RAM | Storage |
|
||||||
|
|---------|-----|-----|---------|
|
||||||
|
| Transcriber | 8 cores | 12GB | 5GB (models) |
|
||||||
|
| API | 1 core | 2GB | - |
|
||||||
|
| PostgreSQL | 2 cores | 4GB | ~50GB |
|
||||||
|
| Jibri | 2 cores | 4GB | - |
|
||||||
|
| Redis | 0.5 cores | 512MB | - |
|
||||||
|
|
||||||
|
## Troubleshooting
|
||||||
|
|
||||||
|
### Transcription is slow
|
||||||
|
- Check CPU usage: `docker stats meeting-intelligence-transcriber`
|
||||||
|
- Increase `WHISPER_THREADS` in docker-compose.yml
|
||||||
|
- Consider using the `tiny` model for faster (less accurate) transcription
|
||||||
|
|
||||||
|
### No summary generated
|
||||||
|
- Check Ollama is running: `curl http://localhost:11434/api/tags`
|
||||||
|
- Check logs: `docker compose logs api`
|
||||||
|
- Verify model is available: `ollama list`
|
||||||
|
|
||||||
|
### Recording not starting
|
||||||
|
- Check Jibri logs: `docker compose logs jibri`
|
||||||
|
- Verify XMPP credentials in `.env`
|
||||||
|
- Check Prosody recorder virtual host configuration
|
||||||
|
|
||||||
|
## Cost Analysis
|
||||||
|
|
||||||
|
| Component | Monthly Cost |
|
||||||
|
|-----------|-------------|
|
||||||
|
| Jibri recording | $0 (local) |
|
||||||
|
| Whisper transcription | $0 (local CPU) |
|
||||||
|
| Ollama summarization | $0 (local) |
|
||||||
|
| PostgreSQL | $0 (local) |
|
||||||
|
| **Total** | **$0/month** |
|
||||||
|
|
@ -0,0 +1,32 @@
|
||||||
|
# Meeting Intelligence API
|
||||||
|
# Provides REST API for meeting transcripts, summaries, and search
|
||||||
|
|
||||||
|
FROM python:3.11-slim
|
||||||
|
|
||||||
|
# Install dependencies
|
||||||
|
RUN apt-get update && apt-get install -y --no-install-recommends \
|
||||||
|
curl \
|
||||||
|
&& rm -rf /var/lib/apt/lists/*
|
||||||
|
|
||||||
|
WORKDIR /app
|
||||||
|
|
||||||
|
# Install Python dependencies
|
||||||
|
COPY requirements.txt .
|
||||||
|
RUN pip install --no-cache-dir -r requirements.txt
|
||||||
|
|
||||||
|
# Copy application code
|
||||||
|
COPY app/ ./app/
|
||||||
|
|
||||||
|
# Create directories
|
||||||
|
RUN mkdir -p /recordings /logs
|
||||||
|
|
||||||
|
# Environment variables
|
||||||
|
ENV PYTHONUNBUFFERED=1
|
||||||
|
|
||||||
|
# Health check
|
||||||
|
HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
|
||||||
|
CMD curl -f http://localhost:8000/health || exit 1
|
||||||
|
|
||||||
|
# Run the service
|
||||||
|
EXPOSE 8000
|
||||||
|
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
|
||||||
|
|
@ -0,0 +1 @@
|
||||||
|
# Meeting Intelligence API
|
||||||
|
|
@ -0,0 +1,50 @@
|
||||||
|
"""
|
||||||
|
Configuration settings for the Meeting Intelligence API.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from typing import List
|
||||||
|
from pydantic_settings import BaseSettings
|
||||||
|
|
||||||
|
|
||||||
|
class Settings(BaseSettings):
|
||||||
|
"""Application settings loaded from environment variables."""
|
||||||
|
|
||||||
|
# Database
|
||||||
|
postgres_url: str = "postgresql://meeting_intelligence:changeme@localhost:5432/meeting_intelligence"
|
||||||
|
|
||||||
|
# Redis
|
||||||
|
redis_url: str = "redis://localhost:6379"
|
||||||
|
|
||||||
|
# Ollama (for AI summaries)
|
||||||
|
ollama_url: str = "http://localhost:11434"
|
||||||
|
ollama_model: str = "llama3.2"
|
||||||
|
|
||||||
|
# File paths
|
||||||
|
recordings_path: str = "/recordings"
|
||||||
|
|
||||||
|
# Security
|
||||||
|
secret_key: str = "changeme"
|
||||||
|
api_key: str = "" # Optional API key authentication
|
||||||
|
|
||||||
|
# CORS
|
||||||
|
cors_origins: List[str] = [
|
||||||
|
"https://meet.jeffemmett.com",
|
||||||
|
"http://localhost:8080",
|
||||||
|
"http://localhost:3000"
|
||||||
|
]
|
||||||
|
|
||||||
|
# Embeddings model for semantic search
|
||||||
|
embedding_model: str = "all-MiniLM-L6-v2"
|
||||||
|
|
||||||
|
# Export settings
|
||||||
|
export_temp_dir: str = "/tmp/exports"
|
||||||
|
|
||||||
|
# Transcriber service URL
|
||||||
|
transcriber_url: str = "http://transcriber:8001"
|
||||||
|
|
||||||
|
class Config:
|
||||||
|
env_file = ".env"
|
||||||
|
env_file_encoding = "utf-8"
|
||||||
|
|
||||||
|
|
||||||
|
settings = Settings()
|
||||||
|
|
@ -0,0 +1,355 @@
|
||||||
|
"""
|
||||||
|
Database operations for the Meeting Intelligence API.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import uuid
|
||||||
|
from datetime import datetime
|
||||||
|
from typing import Optional, List, Dict, Any
|
||||||
|
|
||||||
|
import asyncpg
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
|
||||||
|
class Database:
|
||||||
|
"""Database operations for Meeting Intelligence API."""
|
||||||
|
|
||||||
|
def __init__(self, connection_string: str):
|
||||||
|
self.connection_string = connection_string
|
||||||
|
self.pool: Optional[asyncpg.Pool] = None
|
||||||
|
|
||||||
|
async def connect(self):
|
||||||
|
"""Establish database connection pool."""
|
||||||
|
log.info("Connecting to database...")
|
||||||
|
self.pool = await asyncpg.create_pool(
|
||||||
|
self.connection_string,
|
||||||
|
min_size=2,
|
||||||
|
max_size=20
|
||||||
|
)
|
||||||
|
log.info("Database connected")
|
||||||
|
|
||||||
|
async def disconnect(self):
|
||||||
|
"""Close database connection pool."""
|
||||||
|
if self.pool:
|
||||||
|
await self.pool.close()
|
||||||
|
log.info("Database disconnected")
|
||||||
|
|
||||||
|
async def health_check(self):
|
||||||
|
"""Check database connectivity."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
await conn.fetchval("SELECT 1")
|
||||||
|
|
||||||
|
# ==================== Meetings ====================
|
||||||
|
|
||||||
|
async def list_meetings(
|
||||||
|
self,
|
||||||
|
limit: int = 50,
|
||||||
|
offset: int = 0,
|
||||||
|
status: Optional[str] = None
|
||||||
|
) -> List[Dict[str, Any]]:
|
||||||
|
"""List meetings with pagination."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
if status:
|
||||||
|
rows = await conn.fetch("""
|
||||||
|
SELECT id, conference_id, conference_name, title,
|
||||||
|
started_at, ended_at, duration_seconds,
|
||||||
|
status, created_at
|
||||||
|
FROM meetings
|
||||||
|
WHERE status = $1
|
||||||
|
ORDER BY created_at DESC
|
||||||
|
LIMIT $2 OFFSET $3
|
||||||
|
""", status, limit, offset)
|
||||||
|
else:
|
||||||
|
rows = await conn.fetch("""
|
||||||
|
SELECT id, conference_id, conference_name, title,
|
||||||
|
started_at, ended_at, duration_seconds,
|
||||||
|
status, created_at
|
||||||
|
FROM meetings
|
||||||
|
ORDER BY created_at DESC
|
||||||
|
LIMIT $1 OFFSET $2
|
||||||
|
""", limit, offset)
|
||||||
|
|
||||||
|
return [dict(row) for row in rows]
|
||||||
|
|
||||||
|
async def get_meeting(self, meeting_id: str) -> Optional[Dict[str, Any]]:
|
||||||
|
"""Get meeting details."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
row = await conn.fetchrow("""
|
||||||
|
SELECT m.id, m.conference_id, m.conference_name, m.title,
|
||||||
|
m.started_at, m.ended_at, m.duration_seconds,
|
||||||
|
m.recording_path, m.audio_path, m.status,
|
||||||
|
m.metadata, m.created_at,
|
||||||
|
(SELECT COUNT(*) FROM transcripts WHERE meeting_id = m.id) as segment_count,
|
||||||
|
(SELECT COUNT(*) FROM meeting_participants WHERE meeting_id = m.id) as participant_count,
|
||||||
|
(SELECT id FROM summaries WHERE meeting_id = m.id LIMIT 1) as summary_id
|
||||||
|
FROM meetings m
|
||||||
|
WHERE m.id = $1::uuid
|
||||||
|
""", meeting_id)
|
||||||
|
|
||||||
|
if row:
|
||||||
|
return dict(row)
|
||||||
|
return None
|
||||||
|
|
||||||
|
async def create_meeting(
|
||||||
|
self,
|
||||||
|
conference_id: str,
|
||||||
|
conference_name: Optional[str] = None,
|
||||||
|
title: Optional[str] = None,
|
||||||
|
recording_path: Optional[str] = None,
|
||||||
|
started_at: Optional[datetime] = None,
|
||||||
|
metadata: Optional[dict] = None
|
||||||
|
) -> str:
|
||||||
|
"""Create a new meeting record."""
|
||||||
|
meeting_id = str(uuid.uuid4())
|
||||||
|
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
await conn.execute("""
|
||||||
|
INSERT INTO meetings (
|
||||||
|
id, conference_id, conference_name, title,
|
||||||
|
recording_path, started_at, status, metadata
|
||||||
|
)
|
||||||
|
VALUES ($1, $2, $3, $4, $5, $6, 'recording', $7)
|
||||||
|
""", meeting_id, conference_id, conference_name, title,
|
||||||
|
recording_path, started_at or datetime.utcnow(), metadata or {})
|
||||||
|
|
||||||
|
return meeting_id
|
||||||
|
|
||||||
|
async def update_meeting(
|
||||||
|
self,
|
||||||
|
meeting_id: str,
|
||||||
|
**kwargs
|
||||||
|
):
|
||||||
|
"""Update meeting fields."""
|
||||||
|
if not kwargs:
|
||||||
|
return
|
||||||
|
|
||||||
|
set_clauses = []
|
||||||
|
values = []
|
||||||
|
i = 1
|
||||||
|
|
||||||
|
for key, value in kwargs.items():
|
||||||
|
if key in ['status', 'title', 'ended_at', 'duration_seconds',
|
||||||
|
'recording_path', 'audio_path', 'error_message']:
|
||||||
|
set_clauses.append(f"{key} = ${i}")
|
||||||
|
values.append(value)
|
||||||
|
i += 1
|
||||||
|
|
||||||
|
if not set_clauses:
|
||||||
|
return
|
||||||
|
|
||||||
|
values.append(meeting_id)
|
||||||
|
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
await conn.execute(f"""
|
||||||
|
UPDATE meetings
|
||||||
|
SET {', '.join(set_clauses)}, updated_at = NOW()
|
||||||
|
WHERE id = ${i}::uuid
|
||||||
|
""", *values)
|
||||||
|
|
||||||
|
# ==================== Transcripts ====================
|
||||||
|
|
||||||
|
async def get_transcript(
|
||||||
|
self,
|
||||||
|
meeting_id: str,
|
||||||
|
speaker_filter: Optional[str] = None
|
||||||
|
) -> List[Dict[str, Any]]:
|
||||||
|
"""Get transcript segments for a meeting."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
if speaker_filter:
|
||||||
|
rows = await conn.fetch("""
|
||||||
|
SELECT id, segment_index, start_time, end_time,
|
||||||
|
speaker_id, speaker_name, speaker_label,
|
||||||
|
text, confidence, language
|
||||||
|
FROM transcripts
|
||||||
|
WHERE meeting_id = $1::uuid AND speaker_id = $2
|
||||||
|
ORDER BY segment_index ASC
|
||||||
|
""", meeting_id, speaker_filter)
|
||||||
|
else:
|
||||||
|
rows = await conn.fetch("""
|
||||||
|
SELECT id, segment_index, start_time, end_time,
|
||||||
|
speaker_id, speaker_name, speaker_label,
|
||||||
|
text, confidence, language
|
||||||
|
FROM transcripts
|
||||||
|
WHERE meeting_id = $1::uuid
|
||||||
|
ORDER BY segment_index ASC
|
||||||
|
""", meeting_id)
|
||||||
|
|
||||||
|
return [dict(row) for row in rows]
|
||||||
|
|
||||||
|
async def get_speakers(self, meeting_id: str) -> List[Dict[str, Any]]:
|
||||||
|
"""Get speaker statistics for a meeting."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
rows = await conn.fetch("""
|
||||||
|
SELECT speaker_id, speaker_label,
|
||||||
|
COUNT(*) as segment_count,
|
||||||
|
SUM(end_time - start_time) as speaking_time,
|
||||||
|
SUM(LENGTH(text)) as character_count
|
||||||
|
FROM transcripts
|
||||||
|
WHERE meeting_id = $1::uuid AND speaker_id IS NOT NULL
|
||||||
|
GROUP BY speaker_id, speaker_label
|
||||||
|
ORDER BY speaking_time DESC
|
||||||
|
""", meeting_id)
|
||||||
|
|
||||||
|
return [dict(row) for row in rows]
|
||||||
|
|
||||||
|
# ==================== Summaries ====================
|
||||||
|
|
||||||
|
async def get_summary(self, meeting_id: str) -> Optional[Dict[str, Any]]:
|
||||||
|
"""Get AI summary for a meeting."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
row = await conn.fetchrow("""
|
||||||
|
SELECT id, meeting_id, summary_text, key_points,
|
||||||
|
action_items, decisions, topics, sentiment,
|
||||||
|
model_used, generated_at
|
||||||
|
FROM summaries
|
||||||
|
WHERE meeting_id = $1::uuid
|
||||||
|
ORDER BY generated_at DESC
|
||||||
|
LIMIT 1
|
||||||
|
""", meeting_id)
|
||||||
|
|
||||||
|
if row:
|
||||||
|
return dict(row)
|
||||||
|
return None
|
||||||
|
|
||||||
|
async def save_summary(
|
||||||
|
self,
|
||||||
|
meeting_id: str,
|
||||||
|
summary_text: str,
|
||||||
|
key_points: List[str],
|
||||||
|
action_items: List[dict],
|
||||||
|
decisions: List[str],
|
||||||
|
topics: List[dict],
|
||||||
|
sentiment: str,
|
||||||
|
model_used: str,
|
||||||
|
prompt_tokens: int = 0,
|
||||||
|
completion_tokens: int = 0
|
||||||
|
) -> int:
|
||||||
|
"""Save AI-generated summary."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
row = await conn.fetchrow("""
|
||||||
|
INSERT INTO summaries (
|
||||||
|
meeting_id, summary_text, key_points, action_items,
|
||||||
|
decisions, topics, sentiment, model_used,
|
||||||
|
prompt_tokens, completion_tokens
|
||||||
|
)
|
||||||
|
VALUES ($1::uuid, $2, $3, $4, $5, $6, $7, $8, $9, $10)
|
||||||
|
RETURNING id
|
||||||
|
""", meeting_id, summary_text, key_points, action_items,
|
||||||
|
decisions, topics, sentiment, model_used,
|
||||||
|
prompt_tokens, completion_tokens)
|
||||||
|
|
||||||
|
return row["id"]
|
||||||
|
|
||||||
|
# ==================== Search ====================
|
||||||
|
|
||||||
|
async def fulltext_search(
|
||||||
|
self,
|
||||||
|
query: str,
|
||||||
|
meeting_id: Optional[str] = None,
|
||||||
|
limit: int = 50
|
||||||
|
) -> List[Dict[str, Any]]:
|
||||||
|
"""Full-text search across transcripts."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
if meeting_id:
|
||||||
|
rows = await conn.fetch("""
|
||||||
|
SELECT t.id, t.meeting_id, t.start_time, t.end_time,
|
||||||
|
t.speaker_label, t.text, m.title as meeting_title,
|
||||||
|
ts_rank(to_tsvector('english', t.text),
|
||||||
|
plainto_tsquery('english', $1)) as rank
|
||||||
|
FROM transcripts t
|
||||||
|
JOIN meetings m ON t.meeting_id = m.id
|
||||||
|
WHERE t.meeting_id = $2::uuid
|
||||||
|
AND to_tsvector('english', t.text) @@ plainto_tsquery('english', $1)
|
||||||
|
ORDER BY rank DESC
|
||||||
|
LIMIT $3
|
||||||
|
""", query, meeting_id, limit)
|
||||||
|
else:
|
||||||
|
rows = await conn.fetch("""
|
||||||
|
SELECT t.id, t.meeting_id, t.start_time, t.end_time,
|
||||||
|
t.speaker_label, t.text, m.title as meeting_title,
|
||||||
|
ts_rank(to_tsvector('english', t.text),
|
||||||
|
plainto_tsquery('english', $1)) as rank
|
||||||
|
FROM transcripts t
|
||||||
|
JOIN meetings m ON t.meeting_id = m.id
|
||||||
|
WHERE to_tsvector('english', t.text) @@ plainto_tsquery('english', $1)
|
||||||
|
ORDER BY rank DESC
|
||||||
|
LIMIT $2
|
||||||
|
""", query, limit)
|
||||||
|
|
||||||
|
return [dict(row) for row in rows]
|
||||||
|
|
||||||
|
async def semantic_search(
|
||||||
|
self,
|
||||||
|
embedding: List[float],
|
||||||
|
meeting_id: Optional[str] = None,
|
||||||
|
threshold: float = 0.7,
|
||||||
|
limit: int = 20
|
||||||
|
) -> List[Dict[str, Any]]:
|
||||||
|
"""Semantic search using vector embeddings."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
embedding_str = f"[{','.join(map(str, embedding))}]"
|
||||||
|
|
||||||
|
if meeting_id:
|
||||||
|
rows = await conn.fetch("""
|
||||||
|
SELECT te.transcript_id, te.meeting_id, te.chunk_text,
|
||||||
|
t.start_time, t.speaker_label, m.title as meeting_title,
|
||||||
|
1 - (te.embedding <=> $1::vector) as similarity
|
||||||
|
FROM transcript_embeddings te
|
||||||
|
JOIN transcripts t ON te.transcript_id = t.id
|
||||||
|
JOIN meetings m ON te.meeting_id = m.id
|
||||||
|
WHERE te.meeting_id = $2::uuid
|
||||||
|
AND 1 - (te.embedding <=> $1::vector) > $3
|
||||||
|
ORDER BY te.embedding <=> $1::vector
|
||||||
|
LIMIT $4
|
||||||
|
""", embedding_str, meeting_id, threshold, limit)
|
||||||
|
else:
|
||||||
|
rows = await conn.fetch("""
|
||||||
|
SELECT te.transcript_id, te.meeting_id, te.chunk_text,
|
||||||
|
t.start_time, t.speaker_label, m.title as meeting_title,
|
||||||
|
1 - (te.embedding <=> $1::vector) as similarity
|
||||||
|
FROM transcript_embeddings te
|
||||||
|
JOIN transcripts t ON te.transcript_id = t.id
|
||||||
|
JOIN meetings m ON te.meeting_id = m.id
|
||||||
|
WHERE 1 - (te.embedding <=> $1::vector) > $2
|
||||||
|
ORDER BY te.embedding <=> $1::vector
|
||||||
|
LIMIT $3
|
||||||
|
""", embedding_str, threshold, limit)
|
||||||
|
|
||||||
|
return [dict(row) for row in rows]
|
||||||
|
|
||||||
|
# ==================== Webhooks ====================
|
||||||
|
|
||||||
|
async def save_webhook_event(
|
||||||
|
self,
|
||||||
|
event_type: str,
|
||||||
|
payload: dict
|
||||||
|
) -> int:
|
||||||
|
"""Save a webhook event for processing."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
row = await conn.fetchrow("""
|
||||||
|
INSERT INTO webhook_events (event_type, payload)
|
||||||
|
VALUES ($1, $2)
|
||||||
|
RETURNING id
|
||||||
|
""", event_type, payload)
|
||||||
|
|
||||||
|
return row["id"]
|
||||||
|
|
||||||
|
# ==================== Jobs ====================
|
||||||
|
|
||||||
|
async def create_job(
|
||||||
|
self,
|
||||||
|
meeting_id: str,
|
||||||
|
job_type: str,
|
||||||
|
priority: int = 5,
|
||||||
|
result: Optional[dict] = None
|
||||||
|
) -> int:
|
||||||
|
"""Create a processing job."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
row = await conn.fetchrow("""
|
||||||
|
INSERT INTO processing_jobs (meeting_id, job_type, priority, result)
|
||||||
|
VALUES ($1::uuid, $2, $3, $4)
|
||||||
|
RETURNING id
|
||||||
|
""", meeting_id, job_type, priority, result or {})
|
||||||
|
|
||||||
|
return row["id"]
|
||||||
|
|
@ -0,0 +1,113 @@
|
||||||
|
"""
|
||||||
|
Meeting Intelligence API
|
||||||
|
|
||||||
|
Provides REST API for:
|
||||||
|
- Meeting management
|
||||||
|
- Transcript retrieval
|
||||||
|
- AI-powered summaries
|
||||||
|
- Semantic search
|
||||||
|
- Export functionality
|
||||||
|
"""
|
||||||
|
|
||||||
|
import os
|
||||||
|
from contextlib import asynccontextmanager
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
from fastapi import FastAPI
|
||||||
|
from fastapi.middleware.cors import CORSMiddleware
|
||||||
|
|
||||||
|
from .config import settings
|
||||||
|
from .database import Database
|
||||||
|
from .routes import meetings, transcripts, summaries, search, webhooks, export
|
||||||
|
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
|
||||||
|
# Application state
|
||||||
|
class AppState:
|
||||||
|
db: Optional[Database] = None
|
||||||
|
|
||||||
|
|
||||||
|
state = AppState()
|
||||||
|
|
||||||
|
|
||||||
|
@asynccontextmanager
|
||||||
|
async def lifespan(app: FastAPI):
|
||||||
|
"""Application startup and shutdown."""
|
||||||
|
log.info("Starting Meeting Intelligence API...")
|
||||||
|
|
||||||
|
# Initialize database
|
||||||
|
state.db = Database(settings.postgres_url)
|
||||||
|
await state.db.connect()
|
||||||
|
|
||||||
|
# Make database available to routes
|
||||||
|
app.state.db = state.db
|
||||||
|
|
||||||
|
log.info("Meeting Intelligence API started successfully")
|
||||||
|
|
||||||
|
yield
|
||||||
|
|
||||||
|
# Shutdown
|
||||||
|
log.info("Shutting down Meeting Intelligence API...")
|
||||||
|
if state.db:
|
||||||
|
await state.db.disconnect()
|
||||||
|
|
||||||
|
log.info("Meeting Intelligence API stopped")
|
||||||
|
|
||||||
|
|
||||||
|
app = FastAPI(
|
||||||
|
title="Meeting Intelligence API",
|
||||||
|
description="API for meeting transcripts, summaries, and search",
|
||||||
|
version="1.0.0",
|
||||||
|
lifespan=lifespan,
|
||||||
|
docs_url="/docs",
|
||||||
|
redoc_url="/redoc"
|
||||||
|
)
|
||||||
|
|
||||||
|
# CORS configuration
|
||||||
|
app.add_middleware(
|
||||||
|
CORSMiddleware,
|
||||||
|
allow_origins=settings.cors_origins,
|
||||||
|
allow_credentials=True,
|
||||||
|
allow_methods=["*"],
|
||||||
|
allow_headers=["*"],
|
||||||
|
)
|
||||||
|
|
||||||
|
# Include routers
|
||||||
|
app.include_router(meetings.router, prefix="/meetings", tags=["Meetings"])
|
||||||
|
app.include_router(transcripts.router, prefix="/meetings", tags=["Transcripts"])
|
||||||
|
app.include_router(summaries.router, prefix="/meetings", tags=["Summaries"])
|
||||||
|
app.include_router(search.router, prefix="/search", tags=["Search"])
|
||||||
|
app.include_router(webhooks.router, prefix="/webhooks", tags=["Webhooks"])
|
||||||
|
app.include_router(export.router, prefix="/meetings", tags=["Export"])
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/health")
|
||||||
|
async def health_check():
|
||||||
|
"""Health check endpoint."""
|
||||||
|
db_ok = False
|
||||||
|
|
||||||
|
try:
|
||||||
|
if state.db:
|
||||||
|
await state.db.health_check()
|
||||||
|
db_ok = True
|
||||||
|
except Exception as e:
|
||||||
|
log.error("Database health check failed", error=str(e))
|
||||||
|
|
||||||
|
return {
|
||||||
|
"status": "healthy" if db_ok else "unhealthy",
|
||||||
|
"database": db_ok,
|
||||||
|
"version": "1.0.0"
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/")
|
||||||
|
async def root():
|
||||||
|
"""Root endpoint."""
|
||||||
|
return {
|
||||||
|
"service": "Meeting Intelligence API",
|
||||||
|
"version": "1.0.0",
|
||||||
|
"docs": "/docs"
|
||||||
|
}
|
||||||
|
|
@ -0,0 +1,2 @@
|
||||||
|
# API Routes
|
||||||
|
from . import meetings, transcripts, summaries, search, webhooks, export
|
||||||
|
|
@ -0,0 +1,319 @@
|
||||||
|
"""
|
||||||
|
Export routes for Meeting Intelligence.
|
||||||
|
|
||||||
|
Supports exporting meetings as PDF, Markdown, and JSON.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import io
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
from datetime import datetime
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
from fastapi import APIRouter, HTTPException, Request, Response
|
||||||
|
from fastapi.responses import StreamingResponse
|
||||||
|
from pydantic import BaseModel
|
||||||
|
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
router = APIRouter()
|
||||||
|
|
||||||
|
|
||||||
|
class ExportRequest(BaseModel):
|
||||||
|
format: str = "markdown" # "pdf", "markdown", "json"
|
||||||
|
include_transcript: bool = True
|
||||||
|
include_summary: bool = True
|
||||||
|
|
||||||
|
|
||||||
|
@router.get("/{meeting_id}/export")
|
||||||
|
async def export_meeting(
|
||||||
|
request: Request,
|
||||||
|
meeting_id: str,
|
||||||
|
format: str = "markdown",
|
||||||
|
include_transcript: bool = True,
|
||||||
|
include_summary: bool = True
|
||||||
|
):
|
||||||
|
"""Export meeting data in various formats."""
|
||||||
|
db = request.app.state.db
|
||||||
|
|
||||||
|
# Get meeting data
|
||||||
|
meeting = await db.get_meeting(meeting_id)
|
||||||
|
if not meeting:
|
||||||
|
raise HTTPException(status_code=404, detail="Meeting not found")
|
||||||
|
|
||||||
|
# Get transcript if requested
|
||||||
|
transcript = None
|
||||||
|
if include_transcript:
|
||||||
|
transcript = await db.get_transcript(meeting_id)
|
||||||
|
|
||||||
|
# Get summary if requested
|
||||||
|
summary = None
|
||||||
|
if include_summary:
|
||||||
|
summary = await db.get_summary(meeting_id)
|
||||||
|
|
||||||
|
# Export based on format
|
||||||
|
if format == "json":
|
||||||
|
return _export_json(meeting, transcript, summary)
|
||||||
|
elif format == "markdown":
|
||||||
|
return _export_markdown(meeting, transcript, summary)
|
||||||
|
elif format == "pdf":
|
||||||
|
return await _export_pdf(meeting, transcript, summary)
|
||||||
|
else:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=400,
|
||||||
|
detail=f"Unsupported format: {format}. Use: json, markdown, pdf"
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _export_json(meeting: dict, transcript: list, summary: dict) -> Response:
|
||||||
|
"""Export as JSON."""
|
||||||
|
data = {
|
||||||
|
"meeting": {
|
||||||
|
"id": str(meeting["id"]),
|
||||||
|
"conference_id": meeting["conference_id"],
|
||||||
|
"title": meeting.get("title"),
|
||||||
|
"started_at": meeting["started_at"].isoformat() if meeting.get("started_at") else None,
|
||||||
|
"ended_at": meeting["ended_at"].isoformat() if meeting.get("ended_at") else None,
|
||||||
|
"duration_seconds": meeting.get("duration_seconds"),
|
||||||
|
"status": meeting["status"]
|
||||||
|
},
|
||||||
|
"transcript": [
|
||||||
|
{
|
||||||
|
"start_time": s["start_time"],
|
||||||
|
"end_time": s["end_time"],
|
||||||
|
"speaker": s.get("speaker_label"),
|
||||||
|
"text": s["text"]
|
||||||
|
}
|
||||||
|
for s in (transcript or [])
|
||||||
|
] if transcript else None,
|
||||||
|
"summary": {
|
||||||
|
"text": summary["summary_text"],
|
||||||
|
"key_points": summary["key_points"],
|
||||||
|
"action_items": summary["action_items"],
|
||||||
|
"decisions": summary["decisions"],
|
||||||
|
"topics": summary["topics"],
|
||||||
|
"sentiment": summary.get("sentiment")
|
||||||
|
} if summary else None,
|
||||||
|
"exported_at": datetime.utcnow().isoformat()
|
||||||
|
}
|
||||||
|
|
||||||
|
filename = f"meeting-{meeting['conference_id']}-{datetime.utcnow().strftime('%Y%m%d')}.json"
|
||||||
|
|
||||||
|
return Response(
|
||||||
|
content=json.dumps(data, indent=2),
|
||||||
|
media_type="application/json",
|
||||||
|
headers={
|
||||||
|
"Content-Disposition": f'attachment; filename="{filename}"'
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _export_markdown(meeting: dict, transcript: list, summary: dict) -> Response:
|
||||||
|
"""Export as Markdown."""
|
||||||
|
lines = []
|
||||||
|
|
||||||
|
# Header
|
||||||
|
title = meeting.get("title") or f"Meeting: {meeting['conference_id']}"
|
||||||
|
lines.append(f"# {title}")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# Metadata
|
||||||
|
lines.append("## Meeting Details")
|
||||||
|
lines.append("")
|
||||||
|
lines.append(f"- **Conference ID:** {meeting['conference_id']}")
|
||||||
|
if meeting.get("started_at"):
|
||||||
|
lines.append(f"- **Date:** {meeting['started_at'].strftime('%Y-%m-%d %H:%M UTC')}")
|
||||||
|
if meeting.get("duration_seconds"):
|
||||||
|
minutes = meeting["duration_seconds"] // 60
|
||||||
|
lines.append(f"- **Duration:** {minutes} minutes")
|
||||||
|
lines.append(f"- **Status:** {meeting['status']}")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# Summary
|
||||||
|
if summary:
|
||||||
|
lines.append("## Summary")
|
||||||
|
lines.append("")
|
||||||
|
lines.append(summary["summary_text"])
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# Key Points
|
||||||
|
if summary.get("key_points"):
|
||||||
|
lines.append("### Key Points")
|
||||||
|
lines.append("")
|
||||||
|
for point in summary["key_points"]:
|
||||||
|
lines.append(f"- {point}")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# Action Items
|
||||||
|
if summary.get("action_items"):
|
||||||
|
lines.append("### Action Items")
|
||||||
|
lines.append("")
|
||||||
|
for item in summary["action_items"]:
|
||||||
|
task = item.get("task", item) if isinstance(item, dict) else item
|
||||||
|
assignee = item.get("assignee", "") if isinstance(item, dict) else ""
|
||||||
|
checkbox = "[ ]"
|
||||||
|
if assignee:
|
||||||
|
lines.append(f"- {checkbox} {task} *(Assigned: {assignee})*")
|
||||||
|
else:
|
||||||
|
lines.append(f"- {checkbox} {task}")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# Decisions
|
||||||
|
if summary.get("decisions"):
|
||||||
|
lines.append("### Decisions")
|
||||||
|
lines.append("")
|
||||||
|
for decision in summary["decisions"]:
|
||||||
|
lines.append(f"- {decision}")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# Transcript
|
||||||
|
if transcript:
|
||||||
|
lines.append("## Transcript")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
current_speaker = None
|
||||||
|
for segment in transcript:
|
||||||
|
speaker = segment.get("speaker_label") or "Speaker"
|
||||||
|
time_str = _format_time(segment["start_time"])
|
||||||
|
|
||||||
|
if speaker != current_speaker:
|
||||||
|
lines.append("")
|
||||||
|
lines.append(f"**{speaker}** *({time_str})*")
|
||||||
|
current_speaker = speaker
|
||||||
|
|
||||||
|
lines.append(f"> {segment['text']}")
|
||||||
|
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# Footer
|
||||||
|
lines.append("---")
|
||||||
|
lines.append(f"*Exported on {datetime.utcnow().strftime('%Y-%m-%d %H:%M UTC')} by Meeting Intelligence*")
|
||||||
|
|
||||||
|
content = "\n".join(lines)
|
||||||
|
filename = f"meeting-{meeting['conference_id']}-{datetime.utcnow().strftime('%Y%m%d')}.md"
|
||||||
|
|
||||||
|
return Response(
|
||||||
|
content=content,
|
||||||
|
media_type="text/markdown",
|
||||||
|
headers={
|
||||||
|
"Content-Disposition": f'attachment; filename="{filename}"'
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def _export_pdf(meeting: dict, transcript: list, summary: dict) -> StreamingResponse:
|
||||||
|
"""Export as PDF using reportlab."""
|
||||||
|
try:
|
||||||
|
from reportlab.lib.pagesizes import letter
|
||||||
|
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
|
||||||
|
from reportlab.lib.units import inch
|
||||||
|
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, ListFlowable, ListItem
|
||||||
|
except ImportError:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=501,
|
||||||
|
detail="PDF export requires reportlab. Use markdown or json format."
|
||||||
|
)
|
||||||
|
|
||||||
|
buffer = io.BytesIO()
|
||||||
|
|
||||||
|
# Create PDF document
|
||||||
|
doc = SimpleDocTemplate(
|
||||||
|
buffer,
|
||||||
|
pagesize=letter,
|
||||||
|
rightMargin=72,
|
||||||
|
leftMargin=72,
|
||||||
|
topMargin=72,
|
||||||
|
bottomMargin=72
|
||||||
|
)
|
||||||
|
|
||||||
|
styles = getSampleStyleSheet()
|
||||||
|
story = []
|
||||||
|
|
||||||
|
# Title
|
||||||
|
title = meeting.get("title") or f"Meeting: {meeting['conference_id']}"
|
||||||
|
story.append(Paragraph(title, styles['Title']))
|
||||||
|
story.append(Spacer(1, 12))
|
||||||
|
|
||||||
|
# Metadata
|
||||||
|
story.append(Paragraph("Meeting Details", styles['Heading2']))
|
||||||
|
if meeting.get("started_at"):
|
||||||
|
story.append(Paragraph(
|
||||||
|
f"Date: {meeting['started_at'].strftime('%Y-%m-%d %H:%M UTC')}",
|
||||||
|
styles['Normal']
|
||||||
|
))
|
||||||
|
if meeting.get("duration_seconds"):
|
||||||
|
minutes = meeting["duration_seconds"] // 60
|
||||||
|
story.append(Paragraph(f"Duration: {minutes} minutes", styles['Normal']))
|
||||||
|
story.append(Spacer(1, 12))
|
||||||
|
|
||||||
|
# Summary
|
||||||
|
if summary:
|
||||||
|
story.append(Paragraph("Summary", styles['Heading2']))
|
||||||
|
story.append(Paragraph(summary["summary_text"], styles['Normal']))
|
||||||
|
story.append(Spacer(1, 12))
|
||||||
|
|
||||||
|
if summary.get("key_points"):
|
||||||
|
story.append(Paragraph("Key Points", styles['Heading3']))
|
||||||
|
for point in summary["key_points"]:
|
||||||
|
story.append(Paragraph(f"• {point}", styles['Normal']))
|
||||||
|
story.append(Spacer(1, 12))
|
||||||
|
|
||||||
|
if summary.get("action_items"):
|
||||||
|
story.append(Paragraph("Action Items", styles['Heading3']))
|
||||||
|
for item in summary["action_items"]:
|
||||||
|
task = item.get("task", item) if isinstance(item, dict) else item
|
||||||
|
story.append(Paragraph(f"☐ {task}", styles['Normal']))
|
||||||
|
story.append(Spacer(1, 12))
|
||||||
|
|
||||||
|
# Transcript (abbreviated for PDF)
|
||||||
|
if transcript:
|
||||||
|
story.append(Paragraph("Transcript", styles['Heading2']))
|
||||||
|
current_speaker = None
|
||||||
|
|
||||||
|
for segment in transcript[:100]: # Limit segments for PDF
|
||||||
|
speaker = segment.get("speaker_label") or "Speaker"
|
||||||
|
|
||||||
|
if speaker != current_speaker:
|
||||||
|
story.append(Spacer(1, 6))
|
||||||
|
story.append(Paragraph(
|
||||||
|
f"<b>{speaker}</b> ({_format_time(segment['start_time'])})",
|
||||||
|
styles['Normal']
|
||||||
|
))
|
||||||
|
current_speaker = speaker
|
||||||
|
|
||||||
|
story.append(Paragraph(segment['text'], styles['Normal']))
|
||||||
|
|
||||||
|
if len(transcript) > 100:
|
||||||
|
story.append(Spacer(1, 12))
|
||||||
|
story.append(Paragraph(
|
||||||
|
f"[... {len(transcript) - 100} more segments not shown in PDF]",
|
||||||
|
styles['Normal']
|
||||||
|
))
|
||||||
|
|
||||||
|
# Build PDF
|
||||||
|
doc.build(story)
|
||||||
|
buffer.seek(0)
|
||||||
|
|
||||||
|
filename = f"meeting-{meeting['conference_id']}-{datetime.utcnow().strftime('%Y%m%d')}.pdf"
|
||||||
|
|
||||||
|
return StreamingResponse(
|
||||||
|
buffer,
|
||||||
|
media_type="application/pdf",
|
||||||
|
headers={
|
||||||
|
"Content-Disposition": f'attachment; filename="{filename}"'
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _format_time(seconds: float) -> str:
|
||||||
|
"""Format seconds as HH:MM:SS or MM:SS."""
|
||||||
|
total_seconds = int(seconds)
|
||||||
|
hours = total_seconds // 3600
|
||||||
|
minutes = (total_seconds % 3600) // 60
|
||||||
|
secs = total_seconds % 60
|
||||||
|
|
||||||
|
if hours > 0:
|
||||||
|
return f"{hours}:{minutes:02d}:{secs:02d}"
|
||||||
|
return f"{minutes}:{secs:02d}"
|
||||||
|
|
@ -0,0 +1,112 @@
|
||||||
|
"""
|
||||||
|
Meeting management routes.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from typing import Optional, List
|
||||||
|
|
||||||
|
from fastapi import APIRouter, HTTPException, Request, Query
|
||||||
|
from pydantic import BaseModel
|
||||||
|
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
router = APIRouter()
|
||||||
|
|
||||||
|
|
||||||
|
class MeetingResponse(BaseModel):
|
||||||
|
id: str
|
||||||
|
conference_id: str
|
||||||
|
conference_name: Optional[str]
|
||||||
|
title: Optional[str]
|
||||||
|
started_at: Optional[str]
|
||||||
|
ended_at: Optional[str]
|
||||||
|
duration_seconds: Optional[int]
|
||||||
|
status: str
|
||||||
|
created_at: str
|
||||||
|
segment_count: Optional[int] = None
|
||||||
|
participant_count: Optional[int] = None
|
||||||
|
has_summary: Optional[bool] = None
|
||||||
|
|
||||||
|
|
||||||
|
class MeetingListResponse(BaseModel):
|
||||||
|
meetings: List[MeetingResponse]
|
||||||
|
total: int
|
||||||
|
limit: int
|
||||||
|
offset: int
|
||||||
|
|
||||||
|
|
||||||
|
@router.get("", response_model=MeetingListResponse)
|
||||||
|
async def list_meetings(
|
||||||
|
request: Request,
|
||||||
|
limit: int = Query(default=50, le=100),
|
||||||
|
offset: int = Query(default=0, ge=0),
|
||||||
|
status: Optional[str] = Query(default=None)
|
||||||
|
):
|
||||||
|
"""List all meetings with pagination."""
|
||||||
|
db = request.app.state.db
|
||||||
|
|
||||||
|
meetings = await db.list_meetings(limit=limit, offset=offset, status=status)
|
||||||
|
|
||||||
|
return MeetingListResponse(
|
||||||
|
meetings=[
|
||||||
|
MeetingResponse(
|
||||||
|
id=str(m["id"]),
|
||||||
|
conference_id=m["conference_id"],
|
||||||
|
conference_name=m.get("conference_name"),
|
||||||
|
title=m.get("title"),
|
||||||
|
started_at=m["started_at"].isoformat() if m.get("started_at") else None,
|
||||||
|
ended_at=m["ended_at"].isoformat() if m.get("ended_at") else None,
|
||||||
|
duration_seconds=m.get("duration_seconds"),
|
||||||
|
status=m["status"],
|
||||||
|
created_at=m["created_at"].isoformat()
|
||||||
|
)
|
||||||
|
for m in meetings
|
||||||
|
],
|
||||||
|
total=len(meetings), # TODO: Add total count query
|
||||||
|
limit=limit,
|
||||||
|
offset=offset
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@router.get("/{meeting_id}", response_model=MeetingResponse)
|
||||||
|
async def get_meeting(request: Request, meeting_id: str):
|
||||||
|
"""Get meeting details."""
|
||||||
|
db = request.app.state.db
|
||||||
|
|
||||||
|
meeting = await db.get_meeting(meeting_id)
|
||||||
|
|
||||||
|
if not meeting:
|
||||||
|
raise HTTPException(status_code=404, detail="Meeting not found")
|
||||||
|
|
||||||
|
return MeetingResponse(
|
||||||
|
id=str(meeting["id"]),
|
||||||
|
conference_id=meeting["conference_id"],
|
||||||
|
conference_name=meeting.get("conference_name"),
|
||||||
|
title=meeting.get("title"),
|
||||||
|
started_at=meeting["started_at"].isoformat() if meeting.get("started_at") else None,
|
||||||
|
ended_at=meeting["ended_at"].isoformat() if meeting.get("ended_at") else None,
|
||||||
|
duration_seconds=meeting.get("duration_seconds"),
|
||||||
|
status=meeting["status"],
|
||||||
|
created_at=meeting["created_at"].isoformat(),
|
||||||
|
segment_count=meeting.get("segment_count"),
|
||||||
|
participant_count=meeting.get("participant_count"),
|
||||||
|
has_summary=meeting.get("summary_id") is not None
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@router.delete("/{meeting_id}")
|
||||||
|
async def delete_meeting(request: Request, meeting_id: str):
|
||||||
|
"""Delete a meeting and all associated data."""
|
||||||
|
db = request.app.state.db
|
||||||
|
|
||||||
|
meeting = await db.get_meeting(meeting_id)
|
||||||
|
|
||||||
|
if not meeting:
|
||||||
|
raise HTTPException(status_code=404, detail="Meeting not found")
|
||||||
|
|
||||||
|
# TODO: Implement cascade delete
|
||||||
|
# For now, just mark as deleted
|
||||||
|
await db.update_meeting(meeting_id, status="deleted")
|
||||||
|
|
||||||
|
return {"status": "deleted", "meeting_id": meeting_id}
|
||||||
|
|
@ -0,0 +1,173 @@
|
||||||
|
"""
|
||||||
|
Search routes for Meeting Intelligence.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from typing import Optional, List
|
||||||
|
|
||||||
|
from fastapi import APIRouter, HTTPException, Request, Query
|
||||||
|
from pydantic import BaseModel
|
||||||
|
from sentence_transformers import SentenceTransformer
|
||||||
|
|
||||||
|
from ..config import settings
|
||||||
|
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
router = APIRouter()
|
||||||
|
|
||||||
|
# Lazy-load embedding model
|
||||||
|
_embedding_model = None
|
||||||
|
|
||||||
|
|
||||||
|
def get_embedding_model():
|
||||||
|
"""Get or initialize the embedding model."""
|
||||||
|
global _embedding_model
|
||||||
|
if _embedding_model is None:
|
||||||
|
log.info("Loading embedding model...", model=settings.embedding_model)
|
||||||
|
_embedding_model = SentenceTransformer(settings.embedding_model)
|
||||||
|
log.info("Embedding model loaded")
|
||||||
|
return _embedding_model
|
||||||
|
|
||||||
|
|
||||||
|
class SearchResult(BaseModel):
|
||||||
|
meeting_id: str
|
||||||
|
meeting_title: Optional[str]
|
||||||
|
text: str
|
||||||
|
start_time: Optional[float]
|
||||||
|
speaker_label: Optional[str]
|
||||||
|
score: float
|
||||||
|
search_type: str
|
||||||
|
|
||||||
|
|
||||||
|
class SearchResponse(BaseModel):
|
||||||
|
query: str
|
||||||
|
results: List[SearchResult]
|
||||||
|
total: int
|
||||||
|
search_type: str
|
||||||
|
|
||||||
|
|
||||||
|
class SearchRequest(BaseModel):
|
||||||
|
query: str
|
||||||
|
meeting_id: Optional[str] = None
|
||||||
|
search_type: str = "combined" # "text", "semantic", "combined"
|
||||||
|
limit: int = 20
|
||||||
|
|
||||||
|
|
||||||
|
@router.post("", response_model=SearchResponse)
|
||||||
|
async def search_transcripts(request: Request, body: SearchRequest):
|
||||||
|
"""Search across meeting transcripts.
|
||||||
|
|
||||||
|
Search types:
|
||||||
|
- text: Full-text search using PostgreSQL ts_vector
|
||||||
|
- semantic: Semantic search using vector embeddings
|
||||||
|
- combined: Both text and semantic search, merged results
|
||||||
|
"""
|
||||||
|
db = request.app.state.db
|
||||||
|
|
||||||
|
if not body.query or len(body.query.strip()) < 2:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=400,
|
||||||
|
detail="Query must be at least 2 characters"
|
||||||
|
)
|
||||||
|
|
||||||
|
results = []
|
||||||
|
|
||||||
|
# Full-text search
|
||||||
|
if body.search_type in ["text", "combined"]:
|
||||||
|
text_results = await db.fulltext_search(
|
||||||
|
query=body.query,
|
||||||
|
meeting_id=body.meeting_id,
|
||||||
|
limit=body.limit
|
||||||
|
)
|
||||||
|
|
||||||
|
for r in text_results:
|
||||||
|
results.append(SearchResult(
|
||||||
|
meeting_id=str(r["meeting_id"]),
|
||||||
|
meeting_title=r.get("meeting_title"),
|
||||||
|
text=r["text"],
|
||||||
|
start_time=r.get("start_time"),
|
||||||
|
speaker_label=r.get("speaker_label"),
|
||||||
|
score=float(r["rank"]),
|
||||||
|
search_type="text"
|
||||||
|
))
|
||||||
|
|
||||||
|
# Semantic search
|
||||||
|
if body.search_type in ["semantic", "combined"]:
|
||||||
|
try:
|
||||||
|
model = get_embedding_model()
|
||||||
|
query_embedding = model.encode(body.query).tolist()
|
||||||
|
|
||||||
|
semantic_results = await db.semantic_search(
|
||||||
|
embedding=query_embedding,
|
||||||
|
meeting_id=body.meeting_id,
|
||||||
|
threshold=0.6,
|
||||||
|
limit=body.limit
|
||||||
|
)
|
||||||
|
|
||||||
|
for r in semantic_results:
|
||||||
|
results.append(SearchResult(
|
||||||
|
meeting_id=str(r["meeting_id"]),
|
||||||
|
meeting_title=r.get("meeting_title"),
|
||||||
|
text=r["chunk_text"],
|
||||||
|
start_time=r.get("start_time"),
|
||||||
|
speaker_label=r.get("speaker_label"),
|
||||||
|
score=float(r["similarity"]),
|
||||||
|
search_type="semantic"
|
||||||
|
))
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
log.error("Semantic search failed", error=str(e))
|
||||||
|
if body.search_type == "semantic":
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=500,
|
||||||
|
detail=f"Semantic search failed: {str(e)}"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Deduplicate and sort by score
|
||||||
|
seen = set()
|
||||||
|
unique_results = []
|
||||||
|
for r in sorted(results, key=lambda x: x.score, reverse=True):
|
||||||
|
key = (r.meeting_id, r.text[:100])
|
||||||
|
if key not in seen:
|
||||||
|
seen.add(key)
|
||||||
|
unique_results.append(r)
|
||||||
|
|
||||||
|
return SearchResponse(
|
||||||
|
query=body.query,
|
||||||
|
results=unique_results[:body.limit],
|
||||||
|
total=len(unique_results),
|
||||||
|
search_type=body.search_type
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@router.get("/suggest")
|
||||||
|
async def search_suggestions(
|
||||||
|
request: Request,
|
||||||
|
q: str = Query(..., min_length=2)
|
||||||
|
):
|
||||||
|
"""Get search suggestions based on partial query."""
|
||||||
|
db = request.app.state.db
|
||||||
|
|
||||||
|
# Simple prefix search on common terms
|
||||||
|
results = await db.fulltext_search(query=q, limit=5)
|
||||||
|
|
||||||
|
# Extract unique phrases
|
||||||
|
suggestions = []
|
||||||
|
for r in results:
|
||||||
|
# Get surrounding context
|
||||||
|
text = r["text"]
|
||||||
|
words = text.split()
|
||||||
|
|
||||||
|
# Find matching words and get context
|
||||||
|
for i, word in enumerate(words):
|
||||||
|
if q.lower() in word.lower():
|
||||||
|
start = max(0, i - 2)
|
||||||
|
end = min(len(words), i + 3)
|
||||||
|
phrase = " ".join(words[start:end])
|
||||||
|
if phrase not in suggestions:
|
||||||
|
suggestions.append(phrase)
|
||||||
|
if len(suggestions) >= 5:
|
||||||
|
break
|
||||||
|
|
||||||
|
return {"suggestions": suggestions}
|
||||||
|
|
@ -0,0 +1,251 @@
|
||||||
|
"""
|
||||||
|
AI Summary routes.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import json
|
||||||
|
from typing import Optional, List
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
from fastapi import APIRouter, HTTPException, Request, BackgroundTasks
|
||||||
|
from pydantic import BaseModel
|
||||||
|
|
||||||
|
from ..config import settings
|
||||||
|
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
router = APIRouter()
|
||||||
|
|
||||||
|
|
||||||
|
class ActionItem(BaseModel):
|
||||||
|
task: str
|
||||||
|
assignee: Optional[str] = None
|
||||||
|
due_date: Optional[str] = None
|
||||||
|
completed: bool = False
|
||||||
|
|
||||||
|
|
||||||
|
class Topic(BaseModel):
|
||||||
|
topic: str
|
||||||
|
duration_seconds: Optional[float] = None
|
||||||
|
relevance_score: Optional[float] = None
|
||||||
|
|
||||||
|
|
||||||
|
class SummaryResponse(BaseModel):
|
||||||
|
meeting_id: str
|
||||||
|
summary_text: str
|
||||||
|
key_points: List[str]
|
||||||
|
action_items: List[ActionItem]
|
||||||
|
decisions: List[str]
|
||||||
|
topics: List[Topic]
|
||||||
|
sentiment: Optional[str]
|
||||||
|
model_used: str
|
||||||
|
generated_at: str
|
||||||
|
|
||||||
|
|
||||||
|
class GenerateSummaryRequest(BaseModel):
|
||||||
|
force_regenerate: bool = False
|
||||||
|
|
||||||
|
|
||||||
|
# Summarization prompt template
|
||||||
|
SUMMARY_PROMPT = """You are analyzing a meeting transcript. Your task is to extract key information and provide a structured summary.
|
||||||
|
|
||||||
|
## Meeting Transcript:
|
||||||
|
{transcript}
|
||||||
|
|
||||||
|
## Instructions:
|
||||||
|
Analyze the transcript and extract the following information. Be concise and accurate.
|
||||||
|
|
||||||
|
Respond ONLY with a valid JSON object in this exact format (no markdown, no extra text):
|
||||||
|
{{
|
||||||
|
"summary": "A 2-3 sentence overview of what was discussed in the meeting",
|
||||||
|
"key_points": ["Point 1", "Point 2", "Point 3"],
|
||||||
|
"action_items": [
|
||||||
|
{{"task": "Description of task", "assignee": "Person name or null", "due_date": "Date or null"}}
|
||||||
|
],
|
||||||
|
"decisions": ["Decision 1", "Decision 2"],
|
||||||
|
"topics": [
|
||||||
|
{{"topic": "Topic name", "relevance_score": 0.9}}
|
||||||
|
],
|
||||||
|
"sentiment": "positive" or "neutral" or "negative" or "mixed"
|
||||||
|
}}
|
||||||
|
|
||||||
|
Remember:
|
||||||
|
- key_points: 3-5 most important points discussed
|
||||||
|
- action_items: Tasks that need to be done, with assignees if mentioned
|
||||||
|
- decisions: Any decisions or conclusions reached
|
||||||
|
- topics: Main themes discussed with relevance scores (0-1)
|
||||||
|
- sentiment: Overall tone of the meeting
|
||||||
|
"""
|
||||||
|
|
||||||
|
|
||||||
|
@router.get("/{meeting_id}/summary", response_model=SummaryResponse)
|
||||||
|
async def get_summary(request: Request, meeting_id: str):
|
||||||
|
"""Get AI-generated summary for a meeting."""
|
||||||
|
db = request.app.state.db
|
||||||
|
|
||||||
|
# Verify meeting exists
|
||||||
|
meeting = await db.get_meeting(meeting_id)
|
||||||
|
if not meeting:
|
||||||
|
raise HTTPException(status_code=404, detail="Meeting not found")
|
||||||
|
|
||||||
|
summary = await db.get_summary(meeting_id)
|
||||||
|
|
||||||
|
if not summary:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=404,
|
||||||
|
detail="No summary available. Use POST to generate one."
|
||||||
|
)
|
||||||
|
|
||||||
|
return SummaryResponse(
|
||||||
|
meeting_id=meeting_id,
|
||||||
|
summary_text=summary["summary_text"],
|
||||||
|
key_points=summary["key_points"] or [],
|
||||||
|
action_items=[
|
||||||
|
ActionItem(**item) for item in (summary["action_items"] or [])
|
||||||
|
],
|
||||||
|
decisions=summary["decisions"] or [],
|
||||||
|
topics=[
|
||||||
|
Topic(**topic) for topic in (summary["topics"] or [])
|
||||||
|
],
|
||||||
|
sentiment=summary.get("sentiment"),
|
||||||
|
model_used=summary["model_used"],
|
||||||
|
generated_at=summary["generated_at"].isoformat()
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@router.post("/{meeting_id}/summary", response_model=SummaryResponse)
|
||||||
|
async def generate_summary(
|
||||||
|
request: Request,
|
||||||
|
meeting_id: str,
|
||||||
|
body: GenerateSummaryRequest,
|
||||||
|
background_tasks: BackgroundTasks
|
||||||
|
):
|
||||||
|
"""Generate AI summary for a meeting."""
|
||||||
|
db = request.app.state.db
|
||||||
|
|
||||||
|
# Verify meeting exists
|
||||||
|
meeting = await db.get_meeting(meeting_id)
|
||||||
|
if not meeting:
|
||||||
|
raise HTTPException(status_code=404, detail="Meeting not found")
|
||||||
|
|
||||||
|
# Check if summary already exists
|
||||||
|
if not body.force_regenerate:
|
||||||
|
existing = await db.get_summary(meeting_id)
|
||||||
|
if existing:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=409,
|
||||||
|
detail="Summary already exists. Set force_regenerate=true to regenerate."
|
||||||
|
)
|
||||||
|
|
||||||
|
# Get transcript
|
||||||
|
segments = await db.get_transcript(meeting_id)
|
||||||
|
if not segments:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=400,
|
||||||
|
detail="No transcript available for summarization"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Format transcript for LLM
|
||||||
|
transcript_text = _format_transcript(segments)
|
||||||
|
|
||||||
|
# Generate summary using Ollama
|
||||||
|
summary_data = await _generate_summary_with_ollama(transcript_text)
|
||||||
|
|
||||||
|
# Save summary
|
||||||
|
await db.save_summary(
|
||||||
|
meeting_id=meeting_id,
|
||||||
|
summary_text=summary_data["summary"],
|
||||||
|
key_points=summary_data["key_points"],
|
||||||
|
action_items=summary_data["action_items"],
|
||||||
|
decisions=summary_data["decisions"],
|
||||||
|
topics=summary_data["topics"],
|
||||||
|
sentiment=summary_data["sentiment"],
|
||||||
|
model_used=settings.ollama_model
|
||||||
|
)
|
||||||
|
|
||||||
|
# Update meeting status
|
||||||
|
await db.update_meeting(meeting_id, status="ready")
|
||||||
|
|
||||||
|
# Get the saved summary
|
||||||
|
summary = await db.get_summary(meeting_id)
|
||||||
|
|
||||||
|
return SummaryResponse(
|
||||||
|
meeting_id=meeting_id,
|
||||||
|
summary_text=summary["summary_text"],
|
||||||
|
key_points=summary["key_points"] or [],
|
||||||
|
action_items=[
|
||||||
|
ActionItem(**item) for item in (summary["action_items"] or [])
|
||||||
|
],
|
||||||
|
decisions=summary["decisions"] or [],
|
||||||
|
topics=[
|
||||||
|
Topic(**topic) for topic in (summary["topics"] or [])
|
||||||
|
],
|
||||||
|
sentiment=summary.get("sentiment"),
|
||||||
|
model_used=summary["model_used"],
|
||||||
|
generated_at=summary["generated_at"].isoformat()
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _format_transcript(segments: list) -> str:
|
||||||
|
"""Format transcript segments for LLM processing."""
|
||||||
|
lines = []
|
||||||
|
current_speaker = None
|
||||||
|
|
||||||
|
for s in segments:
|
||||||
|
speaker = s.get("speaker_label") or "Speaker"
|
||||||
|
|
||||||
|
if speaker != current_speaker:
|
||||||
|
lines.append(f"\n[{speaker}]")
|
||||||
|
current_speaker = speaker
|
||||||
|
|
||||||
|
lines.append(s["text"])
|
||||||
|
|
||||||
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
|
||||||
|
async def _generate_summary_with_ollama(transcript: str) -> dict:
|
||||||
|
"""Generate summary using Ollama."""
|
||||||
|
prompt = SUMMARY_PROMPT.format(transcript=transcript[:15000]) # Limit context
|
||||||
|
|
||||||
|
async with httpx.AsyncClient(timeout=120.0) as client:
|
||||||
|
try:
|
||||||
|
response = await client.post(
|
||||||
|
f"{settings.ollama_url}/api/generate",
|
||||||
|
json={
|
||||||
|
"model": settings.ollama_model,
|
||||||
|
"prompt": prompt,
|
||||||
|
"stream": False,
|
||||||
|
"format": "json"
|
||||||
|
}
|
||||||
|
)
|
||||||
|
response.raise_for_status()
|
||||||
|
|
||||||
|
result = response.json()
|
||||||
|
response_text = result.get("response", "")
|
||||||
|
|
||||||
|
# Parse JSON from response
|
||||||
|
summary_data = json.loads(response_text)
|
||||||
|
|
||||||
|
# Validate required fields
|
||||||
|
return {
|
||||||
|
"summary": summary_data.get("summary", "No summary generated"),
|
||||||
|
"key_points": summary_data.get("key_points", []),
|
||||||
|
"action_items": summary_data.get("action_items", []),
|
||||||
|
"decisions": summary_data.get("decisions", []),
|
||||||
|
"topics": summary_data.get("topics", []),
|
||||||
|
"sentiment": summary_data.get("sentiment", "neutral")
|
||||||
|
}
|
||||||
|
|
||||||
|
except httpx.HTTPError as e:
|
||||||
|
log.error("Ollama request failed", error=str(e))
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=503,
|
||||||
|
detail=f"AI service unavailable: {str(e)}"
|
||||||
|
)
|
||||||
|
except json.JSONDecodeError as e:
|
||||||
|
log.error("Failed to parse Ollama response", error=str(e))
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=500,
|
||||||
|
detail="Failed to parse AI response"
|
||||||
|
)
|
||||||
|
|
@ -0,0 +1,161 @@
|
||||||
|
"""
|
||||||
|
Transcript routes.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from typing import Optional, List
|
||||||
|
|
||||||
|
from fastapi import APIRouter, HTTPException, Request, Query
|
||||||
|
from pydantic import BaseModel
|
||||||
|
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
router = APIRouter()
|
||||||
|
|
||||||
|
|
||||||
|
class TranscriptSegment(BaseModel):
|
||||||
|
id: int
|
||||||
|
segment_index: int
|
||||||
|
start_time: float
|
||||||
|
end_time: float
|
||||||
|
speaker_id: Optional[str]
|
||||||
|
speaker_name: Optional[str]
|
||||||
|
speaker_label: Optional[str]
|
||||||
|
text: str
|
||||||
|
confidence: Optional[float]
|
||||||
|
language: Optional[str]
|
||||||
|
|
||||||
|
|
||||||
|
class TranscriptResponse(BaseModel):
|
||||||
|
meeting_id: str
|
||||||
|
segments: List[TranscriptSegment]
|
||||||
|
total_segments: int
|
||||||
|
duration: Optional[float]
|
||||||
|
|
||||||
|
|
||||||
|
class SpeakerStats(BaseModel):
|
||||||
|
speaker_id: str
|
||||||
|
speaker_label: Optional[str]
|
||||||
|
segment_count: int
|
||||||
|
speaking_time: float
|
||||||
|
character_count: int
|
||||||
|
|
||||||
|
|
||||||
|
class SpeakersResponse(BaseModel):
|
||||||
|
meeting_id: str
|
||||||
|
speakers: List[SpeakerStats]
|
||||||
|
|
||||||
|
|
||||||
|
@router.get("/{meeting_id}/transcript", response_model=TranscriptResponse)
|
||||||
|
async def get_transcript(
|
||||||
|
request: Request,
|
||||||
|
meeting_id: str,
|
||||||
|
speaker: Optional[str] = Query(default=None, description="Filter by speaker ID")
|
||||||
|
):
|
||||||
|
"""Get full transcript for a meeting."""
|
||||||
|
db = request.app.state.db
|
||||||
|
|
||||||
|
# Verify meeting exists
|
||||||
|
meeting = await db.get_meeting(meeting_id)
|
||||||
|
if not meeting:
|
||||||
|
raise HTTPException(status_code=404, detail="Meeting not found")
|
||||||
|
|
||||||
|
segments = await db.get_transcript(meeting_id, speaker_filter=speaker)
|
||||||
|
|
||||||
|
if not segments:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=404,
|
||||||
|
detail="No transcript available for this meeting"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Calculate duration from last segment
|
||||||
|
duration = segments[-1]["end_time"] if segments else None
|
||||||
|
|
||||||
|
return TranscriptResponse(
|
||||||
|
meeting_id=meeting_id,
|
||||||
|
segments=[
|
||||||
|
TranscriptSegment(
|
||||||
|
id=s["id"],
|
||||||
|
segment_index=s["segment_index"],
|
||||||
|
start_time=s["start_time"],
|
||||||
|
end_time=s["end_time"],
|
||||||
|
speaker_id=s.get("speaker_id"),
|
||||||
|
speaker_name=s.get("speaker_name"),
|
||||||
|
speaker_label=s.get("speaker_label"),
|
||||||
|
text=s["text"],
|
||||||
|
confidence=s.get("confidence"),
|
||||||
|
language=s.get("language")
|
||||||
|
)
|
||||||
|
for s in segments
|
||||||
|
],
|
||||||
|
total_segments=len(segments),
|
||||||
|
duration=duration
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@router.get("/{meeting_id}/speakers", response_model=SpeakersResponse)
|
||||||
|
async def get_speakers(request: Request, meeting_id: str):
|
||||||
|
"""Get speaker statistics for a meeting."""
|
||||||
|
db = request.app.state.db
|
||||||
|
|
||||||
|
# Verify meeting exists
|
||||||
|
meeting = await db.get_meeting(meeting_id)
|
||||||
|
if not meeting:
|
||||||
|
raise HTTPException(status_code=404, detail="Meeting not found")
|
||||||
|
|
||||||
|
speakers = await db.get_speakers(meeting_id)
|
||||||
|
|
||||||
|
return SpeakersResponse(
|
||||||
|
meeting_id=meeting_id,
|
||||||
|
speakers=[
|
||||||
|
SpeakerStats(
|
||||||
|
speaker_id=s["speaker_id"],
|
||||||
|
speaker_label=s.get("speaker_label"),
|
||||||
|
segment_count=s["segment_count"],
|
||||||
|
speaking_time=float(s["speaking_time"] or 0),
|
||||||
|
character_count=s["character_count"] or 0
|
||||||
|
)
|
||||||
|
for s in speakers
|
||||||
|
]
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@router.get("/{meeting_id}/transcript/text")
|
||||||
|
async def get_transcript_text(request: Request, meeting_id: str):
|
||||||
|
"""Get transcript as plain text."""
|
||||||
|
db = request.app.state.db
|
||||||
|
|
||||||
|
# Verify meeting exists
|
||||||
|
meeting = await db.get_meeting(meeting_id)
|
||||||
|
if not meeting:
|
||||||
|
raise HTTPException(status_code=404, detail="Meeting not found")
|
||||||
|
|
||||||
|
segments = await db.get_transcript(meeting_id)
|
||||||
|
|
||||||
|
if not segments:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=404,
|
||||||
|
detail="No transcript available for this meeting"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Format as plain text
|
||||||
|
lines = []
|
||||||
|
current_speaker = None
|
||||||
|
|
||||||
|
for s in segments:
|
||||||
|
speaker = s.get("speaker_label") or "Unknown"
|
||||||
|
|
||||||
|
if speaker != current_speaker:
|
||||||
|
lines.append(f"\n{speaker}:")
|
||||||
|
current_speaker = speaker
|
||||||
|
|
||||||
|
lines.append(f" {s['text']}")
|
||||||
|
|
||||||
|
text = "\n".join(lines)
|
||||||
|
|
||||||
|
return {
|
||||||
|
"meeting_id": meeting_id,
|
||||||
|
"text": text,
|
||||||
|
"format": "plain"
|
||||||
|
}
|
||||||
|
|
@ -0,0 +1,139 @@
|
||||||
|
"""
|
||||||
|
Webhook routes for Jibri recording callbacks.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from datetime import datetime
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
from fastapi import APIRouter, HTTPException, Request, BackgroundTasks
|
||||||
|
from pydantic import BaseModel
|
||||||
|
|
||||||
|
from ..config import settings
|
||||||
|
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
router = APIRouter()
|
||||||
|
|
||||||
|
|
||||||
|
class RecordingCompletePayload(BaseModel):
|
||||||
|
event_type: str
|
||||||
|
conference_id: str
|
||||||
|
recording_path: str
|
||||||
|
recording_dir: Optional[str] = None
|
||||||
|
file_size_bytes: Optional[int] = None
|
||||||
|
completed_at: Optional[str] = None
|
||||||
|
metadata: Optional[dict] = None
|
||||||
|
|
||||||
|
|
||||||
|
class WebhookResponse(BaseModel):
|
||||||
|
status: str
|
||||||
|
meeting_id: str
|
||||||
|
message: str
|
||||||
|
|
||||||
|
|
||||||
|
@router.post("/recording-complete", response_model=WebhookResponse)
|
||||||
|
async def recording_complete(
|
||||||
|
request: Request,
|
||||||
|
payload: RecordingCompletePayload,
|
||||||
|
background_tasks: BackgroundTasks
|
||||||
|
):
|
||||||
|
"""
|
||||||
|
Webhook called by Jibri when a recording completes.
|
||||||
|
|
||||||
|
This triggers the processing pipeline:
|
||||||
|
1. Create meeting record
|
||||||
|
2. Queue transcription job
|
||||||
|
3. (Later) Generate summary
|
||||||
|
"""
|
||||||
|
db = request.app.state.db
|
||||||
|
|
||||||
|
log.info(
|
||||||
|
"Recording complete webhook received",
|
||||||
|
conference_id=payload.conference_id,
|
||||||
|
recording_path=payload.recording_path
|
||||||
|
)
|
||||||
|
|
||||||
|
# Save webhook event for audit
|
||||||
|
await db.save_webhook_event(
|
||||||
|
event_type=payload.event_type,
|
||||||
|
payload=payload.model_dump()
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create meeting record
|
||||||
|
meeting_id = await db.create_meeting(
|
||||||
|
conference_id=payload.conference_id,
|
||||||
|
conference_name=payload.conference_id, # Use conference_id as name for now
|
||||||
|
title=f"Meeting - {payload.conference_id}",
|
||||||
|
recording_path=payload.recording_path,
|
||||||
|
started_at=datetime.utcnow(), # Will be updated from recording metadata
|
||||||
|
metadata=payload.metadata or {}
|
||||||
|
)
|
||||||
|
|
||||||
|
log.info("Meeting record created", meeting_id=meeting_id)
|
||||||
|
|
||||||
|
# Update meeting status
|
||||||
|
await db.update_meeting(meeting_id, status="extracting_audio")
|
||||||
|
|
||||||
|
# Queue transcription job
|
||||||
|
job_id = await db.create_job(
|
||||||
|
meeting_id=meeting_id,
|
||||||
|
job_type="transcribe",
|
||||||
|
priority=5,
|
||||||
|
result={
|
||||||
|
"video_path": payload.recording_path,
|
||||||
|
"enable_diarization": True
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
log.info("Transcription job queued", job_id=job_id, meeting_id=meeting_id)
|
||||||
|
|
||||||
|
# Trigger transcription service asynchronously
|
||||||
|
background_tasks.add_task(
|
||||||
|
_notify_transcriber,
|
||||||
|
meeting_id,
|
||||||
|
payload.recording_path
|
||||||
|
)
|
||||||
|
|
||||||
|
return WebhookResponse(
|
||||||
|
status="accepted",
|
||||||
|
meeting_id=meeting_id,
|
||||||
|
message="Recording queued for processing"
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def _notify_transcriber(meeting_id: str, recording_path: str):
|
||||||
|
"""Notify the transcription service to start processing."""
|
||||||
|
try:
|
||||||
|
async with httpx.AsyncClient(timeout=30.0) as client:
|
||||||
|
response = await client.post(
|
||||||
|
f"{settings.transcriber_url}/transcribe",
|
||||||
|
json={
|
||||||
|
"meeting_id": meeting_id,
|
||||||
|
"video_path": recording_path,
|
||||||
|
"enable_diarization": True
|
||||||
|
}
|
||||||
|
)
|
||||||
|
response.raise_for_status()
|
||||||
|
log.info(
|
||||||
|
"Transcriber notified",
|
||||||
|
meeting_id=meeting_id,
|
||||||
|
response=response.json()
|
||||||
|
)
|
||||||
|
except Exception as e:
|
||||||
|
log.error(
|
||||||
|
"Failed to notify transcriber",
|
||||||
|
meeting_id=meeting_id,
|
||||||
|
error=str(e)
|
||||||
|
)
|
||||||
|
# Job is in database, transcriber will pick it up on next poll
|
||||||
|
|
||||||
|
|
||||||
|
@router.post("/test")
|
||||||
|
async def test_webhook(request: Request):
|
||||||
|
"""Test endpoint for webhook connectivity."""
|
||||||
|
body = await request.json()
|
||||||
|
log.info("Test webhook received", body=body)
|
||||||
|
return {"status": "ok", "received": body}
|
||||||
|
|
@ -0,0 +1,37 @@
|
||||||
|
# Meeting Intelligence API Dependencies
|
||||||
|
|
||||||
|
# Web framework
|
||||||
|
fastapi==0.109.2
|
||||||
|
uvicorn[standard]==0.27.1
|
||||||
|
python-multipart==0.0.9
|
||||||
|
|
||||||
|
# Database
|
||||||
|
asyncpg==0.29.0
|
||||||
|
sqlalchemy[asyncio]==2.0.25
|
||||||
|
psycopg2-binary==2.9.9
|
||||||
|
|
||||||
|
# Redis
|
||||||
|
redis==5.0.1
|
||||||
|
|
||||||
|
# HTTP client (for Ollama)
|
||||||
|
httpx==0.26.0
|
||||||
|
aiohttp==3.9.3
|
||||||
|
|
||||||
|
# Validation
|
||||||
|
pydantic==2.6.1
|
||||||
|
pydantic-settings==2.1.0
|
||||||
|
|
||||||
|
# Sentence embeddings (for semantic search)
|
||||||
|
sentence-transformers==2.3.1
|
||||||
|
numpy==1.26.4
|
||||||
|
|
||||||
|
# PDF export
|
||||||
|
reportlab==4.0.8
|
||||||
|
markdown2==2.4.12
|
||||||
|
|
||||||
|
# Utilities
|
||||||
|
python-dotenv==1.0.1
|
||||||
|
tenacity==8.2.3
|
||||||
|
|
||||||
|
# Logging
|
||||||
|
structlog==24.1.0
|
||||||
|
|
@ -0,0 +1,186 @@
|
||||||
|
# Meeting Intelligence System - Full Docker Compose
|
||||||
|
# Deploy on Netcup RS 8000 at /opt/meeting-intelligence/
|
||||||
|
#
|
||||||
|
# Components:
|
||||||
|
# - Jibri (recording)
|
||||||
|
# - Transcriber (whisper.cpp + diarization)
|
||||||
|
# - Meeting Intelligence API
|
||||||
|
# - PostgreSQL (storage)
|
||||||
|
# - Redis (job queue)
|
||||||
|
|
||||||
|
services:
|
||||||
|
# ============================================================
|
||||||
|
# PostgreSQL Database
|
||||||
|
# ============================================================
|
||||||
|
postgres:
|
||||||
|
image: pgvector/pgvector:pg16
|
||||||
|
container_name: meeting-intelligence-db
|
||||||
|
restart: unless-stopped
|
||||||
|
environment:
|
||||||
|
POSTGRES_USER: meeting_intelligence
|
||||||
|
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-changeme}
|
||||||
|
POSTGRES_DB: meeting_intelligence
|
||||||
|
volumes:
|
||||||
|
- postgres_data:/var/lib/postgresql/data
|
||||||
|
- ./postgres/init.sql:/docker-entrypoint-initdb.d/init.sql:ro
|
||||||
|
healthcheck:
|
||||||
|
test: ["CMD-SHELL", "pg_isready -U meeting_intelligence"]
|
||||||
|
interval: 10s
|
||||||
|
timeout: 5s
|
||||||
|
retries: 5
|
||||||
|
networks:
|
||||||
|
- meeting-intelligence
|
||||||
|
|
||||||
|
# ============================================================
|
||||||
|
# Redis Job Queue
|
||||||
|
# ============================================================
|
||||||
|
redis:
|
||||||
|
image: redis:7-alpine
|
||||||
|
container_name: meeting-intelligence-redis
|
||||||
|
restart: unless-stopped
|
||||||
|
command: redis-server --appendonly yes
|
||||||
|
volumes:
|
||||||
|
- redis_data:/data
|
||||||
|
healthcheck:
|
||||||
|
test: ["CMD", "redis-cli", "ping"]
|
||||||
|
interval: 10s
|
||||||
|
timeout: 5s
|
||||||
|
retries: 5
|
||||||
|
networks:
|
||||||
|
- meeting-intelligence
|
||||||
|
|
||||||
|
# ============================================================
|
||||||
|
# Transcription Service (whisper.cpp + diarization)
|
||||||
|
# ============================================================
|
||||||
|
transcriber:
|
||||||
|
build:
|
||||||
|
context: ./transcriber
|
||||||
|
dockerfile: Dockerfile
|
||||||
|
container_name: meeting-intelligence-transcriber
|
||||||
|
restart: unless-stopped
|
||||||
|
environment:
|
||||||
|
REDIS_URL: redis://redis:6379
|
||||||
|
POSTGRES_URL: postgresql://meeting_intelligence:${POSTGRES_PASSWORD:-changeme}@postgres:5432/meeting_intelligence
|
||||||
|
WHISPER_MODEL: small
|
||||||
|
WHISPER_THREADS: 8
|
||||||
|
NUM_WORKERS: 4
|
||||||
|
volumes:
|
||||||
|
- recordings:/recordings:ro
|
||||||
|
- audio_processed:/audio
|
||||||
|
- whisper_models:/models
|
||||||
|
depends_on:
|
||||||
|
postgres:
|
||||||
|
condition: service_healthy
|
||||||
|
redis:
|
||||||
|
condition: service_healthy
|
||||||
|
deploy:
|
||||||
|
resources:
|
||||||
|
limits:
|
||||||
|
cpus: '12'
|
||||||
|
memory: 16G
|
||||||
|
networks:
|
||||||
|
- meeting-intelligence
|
||||||
|
|
||||||
|
# ============================================================
|
||||||
|
# Meeting Intelligence API
|
||||||
|
# ============================================================
|
||||||
|
api:
|
||||||
|
build:
|
||||||
|
context: ./api
|
||||||
|
dockerfile: Dockerfile
|
||||||
|
container_name: meeting-intelligence-api
|
||||||
|
restart: unless-stopped
|
||||||
|
environment:
|
||||||
|
REDIS_URL: redis://redis:6379
|
||||||
|
POSTGRES_URL: postgresql://meeting_intelligence:${POSTGRES_PASSWORD:-changeme}@postgres:5432/meeting_intelligence
|
||||||
|
OLLAMA_URL: http://host.docker.internal:11434
|
||||||
|
RECORDINGS_PATH: /recordings
|
||||||
|
SECRET_KEY: ${API_SECRET_KEY:-changeme}
|
||||||
|
volumes:
|
||||||
|
- recordings:/recordings
|
||||||
|
depends_on:
|
||||||
|
postgres:
|
||||||
|
condition: service_healthy
|
||||||
|
redis:
|
||||||
|
condition: service_healthy
|
||||||
|
labels:
|
||||||
|
- "traefik.enable=true"
|
||||||
|
- "traefik.http.routers.meeting-intelligence.rule=Host(`meet.jeffemmett.com`) && PathPrefix(`/api/intelligence`)"
|
||||||
|
- "traefik.http.services.meeting-intelligence.loadbalancer.server.port=8000"
|
||||||
|
- "traefik.http.routers.meeting-intelligence.middlewares=strip-intelligence-prefix"
|
||||||
|
- "traefik.http.middlewares.strip-intelligence-prefix.stripprefix.prefixes=/api/intelligence"
|
||||||
|
networks:
|
||||||
|
- meeting-intelligence
|
||||||
|
- traefik-public
|
||||||
|
|
||||||
|
# ============================================================
|
||||||
|
# Jibri Recording Service
|
||||||
|
# ============================================================
|
||||||
|
jibri:
|
||||||
|
image: jitsi/jibri:stable-9584
|
||||||
|
container_name: meeting-intelligence-jibri
|
||||||
|
restart: unless-stopped
|
||||||
|
privileged: true
|
||||||
|
environment:
|
||||||
|
# XMPP Connection
|
||||||
|
XMPP_SERVER: ${XMPP_SERVER:-meet.jeffemmett.com}
|
||||||
|
XMPP_DOMAIN: ${XMPP_DOMAIN:-meet.jeffemmett.com}
|
||||||
|
XMPP_AUTH_DOMAIN: auth.${XMPP_DOMAIN:-meet.jeffemmett.com}
|
||||||
|
XMPP_INTERNAL_MUC_DOMAIN: internal.auth.${XMPP_DOMAIN:-meet.jeffemmett.com}
|
||||||
|
XMPP_RECORDER_DOMAIN: recorder.${XMPP_DOMAIN:-meet.jeffemmett.com}
|
||||||
|
XMPP_MUC_DOMAIN: muc.${XMPP_DOMAIN:-meet.jeffemmett.com}
|
||||||
|
|
||||||
|
# Jibri Settings
|
||||||
|
JIBRI_BREWERY_MUC: JibriBrewery
|
||||||
|
JIBRI_PENDING_TIMEOUT: 90
|
||||||
|
JIBRI_RECORDING_DIR: /recordings
|
||||||
|
JIBRI_FINALIZE_RECORDING_SCRIPT_PATH: /config/finalize.sh
|
||||||
|
JIBRI_XMPP_USER: jibri
|
||||||
|
JIBRI_XMPP_PASSWORD: ${JIBRI_XMPP_PASSWORD:-changeme}
|
||||||
|
JIBRI_RECORDER_USER: recorder
|
||||||
|
JIBRI_RECORDER_PASSWORD: ${JIBRI_RECORDER_PASSWORD:-changeme}
|
||||||
|
|
||||||
|
# Display Settings
|
||||||
|
DISPLAY: ":0"
|
||||||
|
CHROMIUM_FLAGS: --use-fake-ui-for-media-stream,--start-maximized,--kiosk,--enabled,--disable-infobars,--autoplay-policy=no-user-gesture-required
|
||||||
|
|
||||||
|
# Public URL
|
||||||
|
PUBLIC_URL: https://${XMPP_DOMAIN:-meet.jeffemmett.com}
|
||||||
|
|
||||||
|
# Timezone
|
||||||
|
TZ: UTC
|
||||||
|
volumes:
|
||||||
|
- recordings:/recordings
|
||||||
|
- ./jibri/config:/config
|
||||||
|
- /dev/shm:/dev/shm
|
||||||
|
cap_add:
|
||||||
|
- SYS_ADMIN
|
||||||
|
- NET_BIND_SERVICE
|
||||||
|
security_opt:
|
||||||
|
- seccomp:unconfined
|
||||||
|
shm_size: 2gb
|
||||||
|
networks:
|
||||||
|
- meeting-intelligence
|
||||||
|
|
||||||
|
volumes:
|
||||||
|
postgres_data:
|
||||||
|
redis_data:
|
||||||
|
recordings:
|
||||||
|
driver: local
|
||||||
|
driver_opts:
|
||||||
|
type: none
|
||||||
|
o: bind
|
||||||
|
device: /opt/meetings/recordings
|
||||||
|
audio_processed:
|
||||||
|
driver: local
|
||||||
|
driver_opts:
|
||||||
|
type: none
|
||||||
|
o: bind
|
||||||
|
device: /opt/meetings/audio
|
||||||
|
whisper_models:
|
||||||
|
|
||||||
|
networks:
|
||||||
|
meeting-intelligence:
|
||||||
|
driver: bridge
|
||||||
|
traefik-public:
|
||||||
|
external: true
|
||||||
|
|
@ -0,0 +1,104 @@
|
||||||
|
#!/bin/bash
|
||||||
|
# Jibri Recording Finalize Script
|
||||||
|
# Called when Jibri finishes recording a meeting
|
||||||
|
#
|
||||||
|
# Arguments:
|
||||||
|
# $1 - Recording directory path (e.g., /recordings/<conference_id>/<timestamp>)
|
||||||
|
#
|
||||||
|
# This script:
|
||||||
|
# 1. Finds the recording file
|
||||||
|
# 2. Notifies the Meeting Intelligence API to start processing
|
||||||
|
|
||||||
|
set -e
|
||||||
|
|
||||||
|
RECORDING_DIR="$1"
|
||||||
|
API_URL="${MEETING_INTELLIGENCE_API:-http://api:8000}"
|
||||||
|
LOG_FILE="/var/log/jibri/finalize.log"
|
||||||
|
|
||||||
|
log() {
|
||||||
|
echo "[$(date -Iseconds)] $1" >> "$LOG_FILE"
|
||||||
|
echo "[$(date -Iseconds)] $1"
|
||||||
|
}
|
||||||
|
|
||||||
|
log "=== Finalize script started ==="
|
||||||
|
log "Recording directory: $RECORDING_DIR"
|
||||||
|
|
||||||
|
# Validate recording directory
|
||||||
|
if [ -z "$RECORDING_DIR" ] || [ ! -d "$RECORDING_DIR" ]; then
|
||||||
|
log "ERROR: Invalid recording directory: $RECORDING_DIR"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Find the recording file (MP4 or WebM)
|
||||||
|
RECORDING_FILE=$(find "$RECORDING_DIR" -type f \( -name "*.mp4" -o -name "*.webm" \) | head -1)
|
||||||
|
|
||||||
|
if [ -z "$RECORDING_FILE" ]; then
|
||||||
|
log "ERROR: No recording file found in $RECORDING_DIR"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
|
||||||
|
log "Found recording file: $RECORDING_FILE"
|
||||||
|
|
||||||
|
# Get file info
|
||||||
|
FILE_SIZE=$(stat -c%s "$RECORDING_FILE" 2>/dev/null || echo "0")
|
||||||
|
log "Recording file size: $FILE_SIZE bytes"
|
||||||
|
|
||||||
|
# Extract conference info from path
|
||||||
|
# Expected format: /recordings/<conference_id>/<timestamp>/recording.mp4
|
||||||
|
CONFERENCE_ID=$(echo "$RECORDING_DIR" | awk -F'/' '{print $(NF-1)}')
|
||||||
|
if [ -z "$CONFERENCE_ID" ]; then
|
||||||
|
CONFERENCE_ID=$(basename "$(dirname "$RECORDING_DIR")")
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Look for metadata file (Jibri sometimes creates this)
|
||||||
|
METADATA_FILE="$RECORDING_DIR/metadata.json"
|
||||||
|
if [ -f "$METADATA_FILE" ]; then
|
||||||
|
log "Found metadata file: $METADATA_FILE"
|
||||||
|
METADATA=$(cat "$METADATA_FILE")
|
||||||
|
else
|
||||||
|
METADATA="{}"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Prepare webhook payload
|
||||||
|
PAYLOAD=$(cat <<EOF
|
||||||
|
{
|
||||||
|
"event_type": "recording_completed",
|
||||||
|
"conference_id": "$CONFERENCE_ID",
|
||||||
|
"recording_path": "$RECORDING_FILE",
|
||||||
|
"recording_dir": "$RECORDING_DIR",
|
||||||
|
"file_size_bytes": $FILE_SIZE,
|
||||||
|
"completed_at": "$(date -Iseconds)",
|
||||||
|
"metadata": $METADATA
|
||||||
|
}
|
||||||
|
EOF
|
||||||
|
)
|
||||||
|
|
||||||
|
log "Sending webhook to $API_URL/webhooks/recording-complete"
|
||||||
|
log "Payload: $PAYLOAD"
|
||||||
|
|
||||||
|
# Send webhook to Meeting Intelligence API
|
||||||
|
RESPONSE=$(curl -s -w "\n%{http_code}" \
|
||||||
|
-X POST \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d "$PAYLOAD" \
|
||||||
|
"$API_URL/webhooks/recording-complete" 2>&1)
|
||||||
|
|
||||||
|
HTTP_CODE=$(echo "$RESPONSE" | tail -1)
|
||||||
|
BODY=$(echo "$RESPONSE" | head -n -1)
|
||||||
|
|
||||||
|
if [ "$HTTP_CODE" = "200" ] || [ "$HTTP_CODE" = "201" ] || [ "$HTTP_CODE" = "202" ]; then
|
||||||
|
log "SUCCESS: Webhook accepted (HTTP $HTTP_CODE)"
|
||||||
|
log "Response: $BODY"
|
||||||
|
else
|
||||||
|
log "WARNING: Webhook returned HTTP $HTTP_CODE"
|
||||||
|
log "Response: $BODY"
|
||||||
|
|
||||||
|
# Don't fail the script - the recording is still saved
|
||||||
|
# The API can be retried later
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Optional: Clean up old recordings (keep last 30 days)
|
||||||
|
# find /recordings -type f -mtime +30 -delete
|
||||||
|
|
||||||
|
log "=== Finalize script completed ==="
|
||||||
|
exit 0
|
||||||
|
|
@ -0,0 +1,310 @@
|
||||||
|
-- Meeting Intelligence System - PostgreSQL Schema
|
||||||
|
-- Uses pgvector extension for semantic search
|
||||||
|
|
||||||
|
-- Enable required extensions
|
||||||
|
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
|
||||||
|
CREATE EXTENSION IF NOT EXISTS "vector";
|
||||||
|
|
||||||
|
-- ============================================================
|
||||||
|
-- Meetings Table
|
||||||
|
-- ============================================================
|
||||||
|
CREATE TABLE meetings (
|
||||||
|
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
|
||||||
|
conference_id VARCHAR(255) NOT NULL,
|
||||||
|
conference_name VARCHAR(255),
|
||||||
|
title VARCHAR(500),
|
||||||
|
started_at TIMESTAMP WITH TIME ZONE,
|
||||||
|
ended_at TIMESTAMP WITH TIME ZONE,
|
||||||
|
duration_seconds INTEGER,
|
||||||
|
recording_path VARCHAR(1000),
|
||||||
|
audio_path VARCHAR(1000),
|
||||||
|
status VARCHAR(50) DEFAULT 'recording',
|
||||||
|
-- Status: 'recording', 'extracting_audio', 'transcribing', 'diarizing', 'summarizing', 'ready', 'failed'
|
||||||
|
error_message TEXT,
|
||||||
|
metadata JSONB DEFAULT '{}',
|
||||||
|
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
|
||||||
|
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_meetings_conference_id ON meetings(conference_id);
|
||||||
|
CREATE INDEX idx_meetings_status ON meetings(status);
|
||||||
|
CREATE INDEX idx_meetings_started_at ON meetings(started_at DESC);
|
||||||
|
CREATE INDEX idx_meetings_created_at ON meetings(created_at DESC);
|
||||||
|
|
||||||
|
-- ============================================================
|
||||||
|
-- Meeting Participants
|
||||||
|
-- ============================================================
|
||||||
|
CREATE TABLE meeting_participants (
|
||||||
|
id SERIAL PRIMARY KEY,
|
||||||
|
meeting_id UUID NOT NULL REFERENCES meetings(id) ON DELETE CASCADE,
|
||||||
|
participant_id VARCHAR(255) NOT NULL,
|
||||||
|
display_name VARCHAR(255),
|
||||||
|
email VARCHAR(255),
|
||||||
|
joined_at TIMESTAMP WITH TIME ZONE,
|
||||||
|
left_at TIMESTAMP WITH TIME ZONE,
|
||||||
|
duration_seconds INTEGER,
|
||||||
|
is_moderator BOOLEAN DEFAULT FALSE,
|
||||||
|
metadata JSONB DEFAULT '{}',
|
||||||
|
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_participants_meeting_id ON meeting_participants(meeting_id);
|
||||||
|
CREATE INDEX idx_participants_participant_id ON meeting_participants(participant_id);
|
||||||
|
|
||||||
|
-- ============================================================
|
||||||
|
-- Transcripts
|
||||||
|
-- ============================================================
|
||||||
|
CREATE TABLE transcripts (
|
||||||
|
id SERIAL PRIMARY KEY,
|
||||||
|
meeting_id UUID NOT NULL REFERENCES meetings(id) ON DELETE CASCADE,
|
||||||
|
segment_index INTEGER NOT NULL,
|
||||||
|
start_time FLOAT NOT NULL,
|
||||||
|
end_time FLOAT NOT NULL,
|
||||||
|
speaker_id VARCHAR(255),
|
||||||
|
speaker_name VARCHAR(255),
|
||||||
|
speaker_label VARCHAR(50), -- e.g., "Speaker 1", "Speaker 2"
|
||||||
|
text TEXT NOT NULL,
|
||||||
|
confidence FLOAT,
|
||||||
|
language VARCHAR(10) DEFAULT 'en',
|
||||||
|
word_timestamps JSONB, -- Array of {word, start, end, confidence}
|
||||||
|
metadata JSONB DEFAULT '{}',
|
||||||
|
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_transcripts_meeting_id ON transcripts(meeting_id);
|
||||||
|
CREATE INDEX idx_transcripts_speaker_id ON transcripts(speaker_id);
|
||||||
|
CREATE INDEX idx_transcripts_start_time ON transcripts(meeting_id, start_time);
|
||||||
|
CREATE INDEX idx_transcripts_text_search ON transcripts USING gin(to_tsvector('english', text));
|
||||||
|
|
||||||
|
-- ============================================================
|
||||||
|
-- Transcript Embeddings (for semantic search)
|
||||||
|
-- ============================================================
|
||||||
|
CREATE TABLE transcript_embeddings (
|
||||||
|
id SERIAL PRIMARY KEY,
|
||||||
|
transcript_id INTEGER NOT NULL REFERENCES transcripts(id) ON DELETE CASCADE,
|
||||||
|
meeting_id UUID NOT NULL REFERENCES meetings(id) ON DELETE CASCADE,
|
||||||
|
embedding vector(384), -- all-MiniLM-L6-v2 dimensions
|
||||||
|
chunk_text TEXT, -- The text chunk this embedding represents
|
||||||
|
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_embeddings_transcript_id ON transcript_embeddings(transcript_id);
|
||||||
|
CREATE INDEX idx_embeddings_meeting_id ON transcript_embeddings(meeting_id);
|
||||||
|
CREATE INDEX idx_embeddings_vector ON transcript_embeddings USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
|
||||||
|
|
||||||
|
-- ============================================================
|
||||||
|
-- AI Summaries
|
||||||
|
-- ============================================================
|
||||||
|
CREATE TABLE summaries (
|
||||||
|
id SERIAL PRIMARY KEY,
|
||||||
|
meeting_id UUID NOT NULL REFERENCES meetings(id) ON DELETE CASCADE,
|
||||||
|
summary_text TEXT,
|
||||||
|
key_points JSONB, -- Array of key point strings
|
||||||
|
action_items JSONB, -- Array of {task, assignee, due_date, completed}
|
||||||
|
decisions JSONB, -- Array of decision strings
|
||||||
|
topics JSONB, -- Array of {topic, duration_seconds, relevance_score}
|
||||||
|
sentiment VARCHAR(50), -- 'positive', 'neutral', 'negative', 'mixed'
|
||||||
|
sentiment_scores JSONB, -- {positive: 0.7, neutral: 0.2, negative: 0.1}
|
||||||
|
participants_summary JSONB, -- {participant_id: {speaking_time, word_count, topics}}
|
||||||
|
model_used VARCHAR(100),
|
||||||
|
prompt_tokens INTEGER,
|
||||||
|
completion_tokens INTEGER,
|
||||||
|
generated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
|
||||||
|
metadata JSONB DEFAULT '{}'
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_summaries_meeting_id ON summaries(meeting_id);
|
||||||
|
CREATE INDEX idx_summaries_generated_at ON summaries(generated_at DESC);
|
||||||
|
|
||||||
|
-- ============================================================
|
||||||
|
-- Processing Jobs Queue
|
||||||
|
-- ============================================================
|
||||||
|
CREATE TABLE processing_jobs (
|
||||||
|
id SERIAL PRIMARY KEY,
|
||||||
|
meeting_id UUID NOT NULL REFERENCES meetings(id) ON DELETE CASCADE,
|
||||||
|
job_type VARCHAR(50) NOT NULL, -- 'extract_audio', 'transcribe', 'diarize', 'summarize', 'embed'
|
||||||
|
status VARCHAR(50) DEFAULT 'pending', -- 'pending', 'processing', 'completed', 'failed', 'cancelled'
|
||||||
|
priority INTEGER DEFAULT 5, -- 1 = highest, 10 = lowest
|
||||||
|
attempts INTEGER DEFAULT 0,
|
||||||
|
max_attempts INTEGER DEFAULT 3,
|
||||||
|
started_at TIMESTAMP WITH TIME ZONE,
|
||||||
|
completed_at TIMESTAMP WITH TIME ZONE,
|
||||||
|
error_message TEXT,
|
||||||
|
result JSONB,
|
||||||
|
worker_id VARCHAR(100),
|
||||||
|
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
|
||||||
|
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_jobs_meeting_id ON processing_jobs(meeting_id);
|
||||||
|
CREATE INDEX idx_jobs_status ON processing_jobs(status, priority, created_at);
|
||||||
|
CREATE INDEX idx_jobs_type_status ON processing_jobs(job_type, status);
|
||||||
|
|
||||||
|
-- ============================================================
|
||||||
|
-- Search History (for analytics)
|
||||||
|
-- ============================================================
|
||||||
|
CREATE TABLE search_history (
|
||||||
|
id SERIAL PRIMARY KEY,
|
||||||
|
user_id VARCHAR(255),
|
||||||
|
query TEXT NOT NULL,
|
||||||
|
search_type VARCHAR(50), -- 'text', 'semantic', 'combined'
|
||||||
|
results_count INTEGER,
|
||||||
|
meeting_ids UUID[],
|
||||||
|
filters JSONB,
|
||||||
|
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_search_history_created_at ON search_history(created_at DESC);
|
||||||
|
|
||||||
|
-- ============================================================
|
||||||
|
-- Webhook Events (for Jibri callbacks)
|
||||||
|
-- ============================================================
|
||||||
|
CREATE TABLE webhook_events (
|
||||||
|
id SERIAL PRIMARY KEY,
|
||||||
|
event_type VARCHAR(100) NOT NULL,
|
||||||
|
payload JSONB NOT NULL,
|
||||||
|
processed BOOLEAN DEFAULT FALSE,
|
||||||
|
processed_at TIMESTAMP WITH TIME ZONE,
|
||||||
|
error_message TEXT,
|
||||||
|
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_webhooks_processed ON webhook_events(processed, created_at);
|
||||||
|
|
||||||
|
-- ============================================================
|
||||||
|
-- Functions
|
||||||
|
-- ============================================================
|
||||||
|
|
||||||
|
-- Update timestamp trigger
|
||||||
|
CREATE OR REPLACE FUNCTION update_updated_at()
|
||||||
|
RETURNS TRIGGER AS $$
|
||||||
|
BEGIN
|
||||||
|
NEW.updated_at = NOW();
|
||||||
|
RETURN NEW;
|
||||||
|
END;
|
||||||
|
$$ LANGUAGE plpgsql;
|
||||||
|
|
||||||
|
CREATE TRIGGER meetings_updated_at
|
||||||
|
BEFORE UPDATE ON meetings
|
||||||
|
FOR EACH ROW
|
||||||
|
EXECUTE FUNCTION update_updated_at();
|
||||||
|
|
||||||
|
CREATE TRIGGER jobs_updated_at
|
||||||
|
BEFORE UPDATE ON processing_jobs
|
||||||
|
FOR EACH ROW
|
||||||
|
EXECUTE FUNCTION update_updated_at();
|
||||||
|
|
||||||
|
-- Semantic search function
|
||||||
|
CREATE OR REPLACE FUNCTION semantic_search(
|
||||||
|
query_embedding vector(384),
|
||||||
|
match_threshold FLOAT DEFAULT 0.7,
|
||||||
|
match_count INT DEFAULT 10,
|
||||||
|
meeting_filter UUID DEFAULT NULL
|
||||||
|
)
|
||||||
|
RETURNS TABLE (
|
||||||
|
transcript_id INT,
|
||||||
|
meeting_id UUID,
|
||||||
|
chunk_text TEXT,
|
||||||
|
similarity FLOAT
|
||||||
|
) AS $$
|
||||||
|
BEGIN
|
||||||
|
RETURN QUERY
|
||||||
|
SELECT
|
||||||
|
te.transcript_id,
|
||||||
|
te.meeting_id,
|
||||||
|
te.chunk_text,
|
||||||
|
1 - (te.embedding <=> query_embedding) AS similarity
|
||||||
|
FROM transcript_embeddings te
|
||||||
|
WHERE
|
||||||
|
(meeting_filter IS NULL OR te.meeting_id = meeting_filter)
|
||||||
|
AND 1 - (te.embedding <=> query_embedding) > match_threshold
|
||||||
|
ORDER BY te.embedding <=> query_embedding
|
||||||
|
LIMIT match_count;
|
||||||
|
END;
|
||||||
|
$$ LANGUAGE plpgsql;
|
||||||
|
|
||||||
|
-- Full-text search function
|
||||||
|
CREATE OR REPLACE FUNCTION fulltext_search(
|
||||||
|
search_query TEXT,
|
||||||
|
meeting_filter UUID DEFAULT NULL,
|
||||||
|
match_count INT DEFAULT 50
|
||||||
|
)
|
||||||
|
RETURNS TABLE (
|
||||||
|
transcript_id INT,
|
||||||
|
meeting_id UUID,
|
||||||
|
text TEXT,
|
||||||
|
speaker_name VARCHAR,
|
||||||
|
start_time FLOAT,
|
||||||
|
rank FLOAT
|
||||||
|
) AS $$
|
||||||
|
BEGIN
|
||||||
|
RETURN QUERY
|
||||||
|
SELECT
|
||||||
|
t.id,
|
||||||
|
t.meeting_id,
|
||||||
|
t.text,
|
||||||
|
t.speaker_name,
|
||||||
|
t.start_time,
|
||||||
|
ts_rank(to_tsvector('english', t.text), plainto_tsquery('english', search_query)) AS rank
|
||||||
|
FROM transcripts t
|
||||||
|
WHERE
|
||||||
|
(meeting_filter IS NULL OR t.meeting_id = meeting_filter)
|
||||||
|
AND to_tsvector('english', t.text) @@ plainto_tsquery('english', search_query)
|
||||||
|
ORDER BY rank DESC
|
||||||
|
LIMIT match_count;
|
||||||
|
END;
|
||||||
|
$$ LANGUAGE plpgsql;
|
||||||
|
|
||||||
|
-- ============================================================
|
||||||
|
-- Views
|
||||||
|
-- ============================================================
|
||||||
|
|
||||||
|
-- Meeting overview with stats
|
||||||
|
CREATE VIEW meeting_overview AS
|
||||||
|
SELECT
|
||||||
|
m.id,
|
||||||
|
m.conference_id,
|
||||||
|
m.conference_name,
|
||||||
|
m.title,
|
||||||
|
m.started_at,
|
||||||
|
m.ended_at,
|
||||||
|
m.duration_seconds,
|
||||||
|
m.status,
|
||||||
|
m.recording_path,
|
||||||
|
COUNT(DISTINCT mp.id) AS participant_count,
|
||||||
|
COUNT(DISTINCT t.id) AS transcript_segment_count,
|
||||||
|
COALESCE(SUM(LENGTH(t.text)), 0) AS total_characters,
|
||||||
|
s.id IS NOT NULL AS has_summary,
|
||||||
|
m.created_at
|
||||||
|
FROM meetings m
|
||||||
|
LEFT JOIN meeting_participants mp ON m.id = mp.meeting_id
|
||||||
|
LEFT JOIN transcripts t ON m.id = t.meeting_id
|
||||||
|
LEFT JOIN summaries s ON m.id = s.meeting_id
|
||||||
|
GROUP BY m.id, s.id;
|
||||||
|
|
||||||
|
-- Speaker stats per meeting
|
||||||
|
CREATE VIEW speaker_stats AS
|
||||||
|
SELECT
|
||||||
|
t.meeting_id,
|
||||||
|
t.speaker_id,
|
||||||
|
t.speaker_name,
|
||||||
|
t.speaker_label,
|
||||||
|
COUNT(*) AS segment_count,
|
||||||
|
SUM(t.end_time - t.start_time) AS speaking_time_seconds,
|
||||||
|
SUM(LENGTH(t.text)) AS character_count,
|
||||||
|
SUM(array_length(regexp_split_to_array(t.text, '\s+'), 1)) AS word_count
|
||||||
|
FROM transcripts t
|
||||||
|
GROUP BY t.meeting_id, t.speaker_id, t.speaker_name, t.speaker_label;
|
||||||
|
|
||||||
|
-- ============================================================
|
||||||
|
-- Sample Data (for testing - remove in production)
|
||||||
|
-- ============================================================
|
||||||
|
|
||||||
|
-- INSERT INTO meetings (conference_id, conference_name, title, started_at, status)
|
||||||
|
-- VALUES ('test-room-123', 'Test Room', 'Test Meeting', NOW() - INTERVAL '1 hour', 'ready');
|
||||||
|
|
||||||
|
COMMENT ON TABLE meetings IS 'Stores meeting metadata and processing status';
|
||||||
|
COMMENT ON TABLE transcripts IS 'Stores time-stamped transcript segments with speaker attribution';
|
||||||
|
COMMENT ON TABLE summaries IS 'Stores AI-generated meeting summaries and extracted information';
|
||||||
|
COMMENT ON TABLE transcript_embeddings IS 'Stores vector embeddings for semantic search';
|
||||||
|
COMMENT ON TABLE processing_jobs IS 'Job queue for async processing tasks';
|
||||||
|
|
@ -0,0 +1,67 @@
|
||||||
|
# Meeting Intelligence Transcription Service
|
||||||
|
# Uses whisper.cpp for fast CPU-based transcription
|
||||||
|
# Uses resemblyzer for speaker diarization
|
||||||
|
|
||||||
|
FROM python:3.11-slim AS builder
|
||||||
|
|
||||||
|
# Install build dependencies
|
||||||
|
RUN apt-get update && apt-get install -y --no-install-recommends \
|
||||||
|
build-essential \
|
||||||
|
cmake \
|
||||||
|
git \
|
||||||
|
ffmpeg \
|
||||||
|
&& rm -rf /var/lib/apt/lists/*
|
||||||
|
|
||||||
|
# Build whisper.cpp
|
||||||
|
WORKDIR /build
|
||||||
|
RUN git clone https://github.com/ggerganov/whisper.cpp.git && \
|
||||||
|
cd whisper.cpp && \
|
||||||
|
cmake -B build -DWHISPER_BUILD_EXAMPLES=ON && \
|
||||||
|
cmake --build build --config Release -j$(nproc) && \
|
||||||
|
cp build/bin/whisper-cli /usr/local/bin/whisper && \
|
||||||
|
cp build/bin/whisper-server /usr/local/bin/whisper-server 2>/dev/null || true
|
||||||
|
|
||||||
|
# Download whisper models
|
||||||
|
WORKDIR /models
|
||||||
|
RUN cd /build/whisper.cpp && \
|
||||||
|
bash models/download-ggml-model.sh small && \
|
||||||
|
mv models/ggml-small.bin /models/
|
||||||
|
|
||||||
|
# Production image
|
||||||
|
FROM python:3.11-slim
|
||||||
|
|
||||||
|
# Install runtime dependencies
|
||||||
|
RUN apt-get update && apt-get install -y --no-install-recommends \
|
||||||
|
ffmpeg \
|
||||||
|
libsndfile1 \
|
||||||
|
&& rm -rf /var/lib/apt/lists/*
|
||||||
|
|
||||||
|
# Copy whisper binary and models
|
||||||
|
COPY --from=builder /usr/local/bin/whisper /usr/local/bin/whisper
|
||||||
|
COPY --from=builder /models /models
|
||||||
|
|
||||||
|
# Set up Python environment
|
||||||
|
WORKDIR /app
|
||||||
|
|
||||||
|
# Install Python dependencies
|
||||||
|
COPY requirements.txt .
|
||||||
|
RUN pip install --no-cache-dir -r requirements.txt
|
||||||
|
|
||||||
|
# Copy application code
|
||||||
|
COPY app/ ./app/
|
||||||
|
|
||||||
|
# Create directories
|
||||||
|
RUN mkdir -p /recordings /audio /logs
|
||||||
|
|
||||||
|
# Environment variables
|
||||||
|
ENV PYTHONUNBUFFERED=1
|
||||||
|
ENV WHISPER_MODEL=/models/ggml-small.bin
|
||||||
|
ENV WHISPER_THREADS=8
|
||||||
|
|
||||||
|
# Health check
|
||||||
|
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
|
||||||
|
CMD curl -f http://localhost:8001/health || exit 1
|
||||||
|
|
||||||
|
# Run the service
|
||||||
|
EXPOSE 8001
|
||||||
|
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8001", "--workers", "1"]
|
||||||
|
|
@ -0,0 +1 @@
|
||||||
|
# Meeting Intelligence Transcription Service
|
||||||
|
|
@ -0,0 +1,45 @@
|
||||||
|
"""
|
||||||
|
Configuration settings for the Transcription Service.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import os
|
||||||
|
from pydantic_settings import BaseSettings
|
||||||
|
|
||||||
|
|
||||||
|
class Settings(BaseSettings):
|
||||||
|
"""Application settings loaded from environment variables."""
|
||||||
|
|
||||||
|
# Redis configuration
|
||||||
|
redis_url: str = "redis://localhost:6379"
|
||||||
|
|
||||||
|
# PostgreSQL configuration
|
||||||
|
postgres_url: str = "postgresql://meeting_intelligence:changeme@localhost:5432/meeting_intelligence"
|
||||||
|
|
||||||
|
# Whisper configuration
|
||||||
|
whisper_model: str = "/models/ggml-small.bin"
|
||||||
|
whisper_threads: int = 8
|
||||||
|
whisper_language: str = "en"
|
||||||
|
|
||||||
|
# Worker configuration
|
||||||
|
num_workers: int = 4
|
||||||
|
job_timeout: int = 7200 # 2 hours in seconds
|
||||||
|
|
||||||
|
# Audio processing
|
||||||
|
audio_sample_rate: int = 16000
|
||||||
|
audio_channels: int = 1
|
||||||
|
|
||||||
|
# Diarization settings
|
||||||
|
min_speaker_duration: float = 0.5 # Minimum speaker segment in seconds
|
||||||
|
max_speakers: int = 10
|
||||||
|
|
||||||
|
# Paths
|
||||||
|
recordings_path: str = "/recordings"
|
||||||
|
audio_output_path: str = "/audio"
|
||||||
|
temp_path: str = "/tmp/transcriber"
|
||||||
|
|
||||||
|
class Config:
|
||||||
|
env_file = ".env"
|
||||||
|
env_file_encoding = "utf-8"
|
||||||
|
|
||||||
|
|
||||||
|
settings = Settings()
|
||||||
|
|
@ -0,0 +1,245 @@
|
||||||
|
"""
|
||||||
|
Database operations for the Transcription Service.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import uuid
|
||||||
|
from typing import Optional, List, Dict, Any
|
||||||
|
|
||||||
|
import asyncpg
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
|
||||||
|
class Database:
|
||||||
|
"""Database operations for transcription service."""
|
||||||
|
|
||||||
|
def __init__(self, connection_string: str):
|
||||||
|
self.connection_string = connection_string
|
||||||
|
self.pool: Optional[asyncpg.Pool] = None
|
||||||
|
|
||||||
|
async def connect(self):
|
||||||
|
"""Establish database connection pool."""
|
||||||
|
log.info("Connecting to database...")
|
||||||
|
self.pool = await asyncpg.create_pool(
|
||||||
|
self.connection_string,
|
||||||
|
min_size=2,
|
||||||
|
max_size=10
|
||||||
|
)
|
||||||
|
log.info("Database connected")
|
||||||
|
|
||||||
|
async def disconnect(self):
|
||||||
|
"""Close database connection pool."""
|
||||||
|
if self.pool:
|
||||||
|
await self.pool.close()
|
||||||
|
log.info("Database disconnected")
|
||||||
|
|
||||||
|
async def health_check(self):
|
||||||
|
"""Check database connectivity."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
await conn.fetchval("SELECT 1")
|
||||||
|
|
||||||
|
async def create_transcription_job(
|
||||||
|
self,
|
||||||
|
meeting_id: str,
|
||||||
|
audio_path: Optional[str] = None,
|
||||||
|
video_path: Optional[str] = None,
|
||||||
|
enable_diarization: bool = True,
|
||||||
|
language: Optional[str] = None,
|
||||||
|
priority: int = 5
|
||||||
|
) -> str:
|
||||||
|
"""Create a new transcription job."""
|
||||||
|
job_id = str(uuid.uuid4())
|
||||||
|
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
await conn.execute("""
|
||||||
|
INSERT INTO processing_jobs (
|
||||||
|
id, meeting_id, job_type, status, priority,
|
||||||
|
result
|
||||||
|
)
|
||||||
|
VALUES ($1, $2::uuid, 'transcribe', 'pending', $3, $4)
|
||||||
|
""", job_id, meeting_id, priority, {
|
||||||
|
"audio_path": audio_path,
|
||||||
|
"video_path": video_path,
|
||||||
|
"enable_diarization": enable_diarization,
|
||||||
|
"language": language
|
||||||
|
})
|
||||||
|
|
||||||
|
log.info("Created transcription job", job_id=job_id, meeting_id=meeting_id)
|
||||||
|
return job_id
|
||||||
|
|
||||||
|
async def get_job(self, job_id: str) -> Optional[Dict[str, Any]]:
|
||||||
|
"""Get a job by ID."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
row = await conn.fetchrow("""
|
||||||
|
SELECT id, meeting_id, job_type, status, priority,
|
||||||
|
attempts, started_at, completed_at,
|
||||||
|
error_message, result, created_at
|
||||||
|
FROM processing_jobs
|
||||||
|
WHERE id = $1
|
||||||
|
""", job_id)
|
||||||
|
|
||||||
|
if row:
|
||||||
|
return dict(row)
|
||||||
|
return None
|
||||||
|
|
||||||
|
async def get_next_pending_job(self) -> Optional[Dict[str, Any]]:
|
||||||
|
"""Get the next pending job and mark it as processing."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
# Use FOR UPDATE SKIP LOCKED to prevent race conditions
|
||||||
|
row = await conn.fetchrow("""
|
||||||
|
UPDATE processing_jobs
|
||||||
|
SET status = 'processing',
|
||||||
|
started_at = NOW(),
|
||||||
|
attempts = attempts + 1
|
||||||
|
WHERE id = (
|
||||||
|
SELECT id FROM processing_jobs
|
||||||
|
WHERE status = 'pending'
|
||||||
|
AND job_type = 'transcribe'
|
||||||
|
ORDER BY priority ASC, created_at ASC
|
||||||
|
FOR UPDATE SKIP LOCKED
|
||||||
|
LIMIT 1
|
||||||
|
)
|
||||||
|
RETURNING id, meeting_id, job_type, result
|
||||||
|
""")
|
||||||
|
|
||||||
|
if row:
|
||||||
|
result = dict(row)
|
||||||
|
# Merge result JSON into the dict
|
||||||
|
if result.get("result"):
|
||||||
|
result.update(result["result"])
|
||||||
|
return result
|
||||||
|
return None
|
||||||
|
|
||||||
|
async def update_job_status(
|
||||||
|
self,
|
||||||
|
job_id: str,
|
||||||
|
status: str,
|
||||||
|
error_message: Optional[str] = None,
|
||||||
|
result: Optional[dict] = None,
|
||||||
|
progress: Optional[float] = None
|
||||||
|
):
|
||||||
|
"""Update job status."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
if status == "completed":
|
||||||
|
await conn.execute("""
|
||||||
|
UPDATE processing_jobs
|
||||||
|
SET status = $1,
|
||||||
|
completed_at = NOW(),
|
||||||
|
error_message = $2,
|
||||||
|
result = COALESCE($3::jsonb, result)
|
||||||
|
WHERE id = $4
|
||||||
|
""", status, error_message, result, job_id)
|
||||||
|
else:
|
||||||
|
update_result = result
|
||||||
|
if progress is not None:
|
||||||
|
update_result = result or {}
|
||||||
|
update_result["progress"] = progress
|
||||||
|
|
||||||
|
await conn.execute("""
|
||||||
|
UPDATE processing_jobs
|
||||||
|
SET status = $1,
|
||||||
|
error_message = $2,
|
||||||
|
result = COALESCE($3::jsonb, result)
|
||||||
|
WHERE id = $4
|
||||||
|
""", status, error_message, update_result, job_id)
|
||||||
|
|
||||||
|
async def update_job_audio_path(self, job_id: str, audio_path: str):
|
||||||
|
"""Update the audio path for a job."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
await conn.execute("""
|
||||||
|
UPDATE processing_jobs
|
||||||
|
SET result = result || $1::jsonb
|
||||||
|
WHERE id = $2
|
||||||
|
""", {"audio_path": audio_path}, job_id)
|
||||||
|
|
||||||
|
async def update_meeting_status(self, meeting_id: str, status: str):
|
||||||
|
"""Update meeting processing status."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
await conn.execute("""
|
||||||
|
UPDATE meetings
|
||||||
|
SET status = $1,
|
||||||
|
updated_at = NOW()
|
||||||
|
WHERE id = $2::uuid
|
||||||
|
""", status, meeting_id)
|
||||||
|
|
||||||
|
async def insert_transcript_segment(
|
||||||
|
self,
|
||||||
|
meeting_id: str,
|
||||||
|
segment_index: int,
|
||||||
|
start_time: float,
|
||||||
|
end_time: float,
|
||||||
|
text: str,
|
||||||
|
speaker_id: Optional[str] = None,
|
||||||
|
speaker_label: Optional[str] = None,
|
||||||
|
confidence: Optional[float] = None,
|
||||||
|
language: str = "en"
|
||||||
|
):
|
||||||
|
"""Insert a transcript segment."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
await conn.execute("""
|
||||||
|
INSERT INTO transcripts (
|
||||||
|
meeting_id, segment_index, start_time, end_time,
|
||||||
|
text, speaker_id, speaker_label, confidence, language
|
||||||
|
)
|
||||||
|
VALUES ($1::uuid, $2, $3, $4, $5, $6, $7, $8, $9)
|
||||||
|
""", meeting_id, segment_index, start_time, end_time,
|
||||||
|
text, speaker_id, speaker_label, confidence, language)
|
||||||
|
|
||||||
|
async def get_transcript(self, meeting_id: str) -> List[Dict[str, Any]]:
|
||||||
|
"""Get all transcript segments for a meeting."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
rows = await conn.fetch("""
|
||||||
|
SELECT id, segment_index, start_time, end_time,
|
||||||
|
speaker_id, speaker_label, text, confidence, language
|
||||||
|
FROM transcripts
|
||||||
|
WHERE meeting_id = $1::uuid
|
||||||
|
ORDER BY segment_index ASC
|
||||||
|
""", meeting_id)
|
||||||
|
|
||||||
|
return [dict(row) for row in rows]
|
||||||
|
|
||||||
|
async def get_meeting(self, meeting_id: str) -> Optional[Dict[str, Any]]:
|
||||||
|
"""Get meeting details."""
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
row = await conn.fetchrow("""
|
||||||
|
SELECT id, conference_id, conference_name, title,
|
||||||
|
started_at, ended_at, duration_seconds,
|
||||||
|
recording_path, audio_path, status,
|
||||||
|
metadata, created_at
|
||||||
|
FROM meetings
|
||||||
|
WHERE id = $1::uuid
|
||||||
|
""", meeting_id)
|
||||||
|
|
||||||
|
if row:
|
||||||
|
return dict(row)
|
||||||
|
return None
|
||||||
|
|
||||||
|
async def create_meeting(
|
||||||
|
self,
|
||||||
|
conference_id: str,
|
||||||
|
conference_name: Optional[str] = None,
|
||||||
|
title: Optional[str] = None,
|
||||||
|
recording_path: Optional[str] = None,
|
||||||
|
metadata: Optional[dict] = None
|
||||||
|
) -> str:
|
||||||
|
"""Create a new meeting record."""
|
||||||
|
meeting_id = str(uuid.uuid4())
|
||||||
|
|
||||||
|
async with self.pool.acquire() as conn:
|
||||||
|
await conn.execute("""
|
||||||
|
INSERT INTO meetings (
|
||||||
|
id, conference_id, conference_name, title,
|
||||||
|
recording_path, status, metadata
|
||||||
|
)
|
||||||
|
VALUES ($1, $2, $3, $4, $5, 'recording', $6)
|
||||||
|
""", meeting_id, conference_id, conference_name, title,
|
||||||
|
recording_path, metadata or {})
|
||||||
|
|
||||||
|
log.info("Created meeting", meeting_id=meeting_id, conference_id=conference_id)
|
||||||
|
return meeting_id
|
||||||
|
|
||||||
|
|
||||||
|
class DatabaseError(Exception):
|
||||||
|
"""Database operation error."""
|
||||||
|
pass
|
||||||
|
|
@ -0,0 +1,338 @@
|
||||||
|
"""
|
||||||
|
Speaker Diarization using resemblyzer.
|
||||||
|
|
||||||
|
Identifies who spoke when in the audio.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import os
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from typing import List, Optional, Tuple
|
||||||
|
|
||||||
|
import numpy as np
|
||||||
|
import soundfile as sf
|
||||||
|
from resemblyzer import VoiceEncoder, preprocess_wav
|
||||||
|
from sklearn.cluster import AgglomerativeClustering
|
||||||
|
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class SpeakerSegment:
|
||||||
|
"""A segment attributed to a speaker."""
|
||||||
|
start: float
|
||||||
|
end: float
|
||||||
|
speaker_id: str
|
||||||
|
speaker_label: str # e.g., "Speaker 1"
|
||||||
|
confidence: Optional[float] = None
|
||||||
|
|
||||||
|
|
||||||
|
class SpeakerDiarizer:
|
||||||
|
"""Speaker diarization using voice embeddings."""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
min_segment_duration: float = 0.5,
|
||||||
|
max_speakers: int = 10,
|
||||||
|
embedding_step: float = 0.5 # Step size for embeddings in seconds
|
||||||
|
):
|
||||||
|
self.min_segment_duration = min_segment_duration
|
||||||
|
self.max_speakers = max_speakers
|
||||||
|
self.embedding_step = embedding_step
|
||||||
|
|
||||||
|
# Load voice encoder (this downloads the model on first use)
|
||||||
|
log.info("Loading voice encoder model...")
|
||||||
|
self.encoder = VoiceEncoder()
|
||||||
|
log.info("Voice encoder loaded")
|
||||||
|
|
||||||
|
def diarize(
|
||||||
|
self,
|
||||||
|
audio_path: str,
|
||||||
|
num_speakers: Optional[int] = None,
|
||||||
|
transcript_segments: Optional[List[dict]] = None
|
||||||
|
) -> List[SpeakerSegment]:
|
||||||
|
"""
|
||||||
|
Perform speaker diarization on an audio file.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
audio_path: Path to audio file (WAV, 16kHz mono)
|
||||||
|
num_speakers: Number of speakers (if known), otherwise auto-detected
|
||||||
|
transcript_segments: Optional transcript segments to align with
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of SpeakerSegment with speaker attributions
|
||||||
|
"""
|
||||||
|
if not os.path.exists(audio_path):
|
||||||
|
raise FileNotFoundError(f"Audio file not found: {audio_path}")
|
||||||
|
|
||||||
|
log.info("Starting speaker diarization", audio_path=audio_path)
|
||||||
|
|
||||||
|
# Load and preprocess audio
|
||||||
|
wav, sample_rate = sf.read(audio_path)
|
||||||
|
|
||||||
|
if sample_rate != 16000:
|
||||||
|
log.warning(f"Audio sample rate is {sample_rate}, expected 16000")
|
||||||
|
|
||||||
|
# Ensure mono
|
||||||
|
if len(wav.shape) > 1:
|
||||||
|
wav = wav.mean(axis=1)
|
||||||
|
|
||||||
|
# Preprocess for resemblyzer
|
||||||
|
wav = preprocess_wav(wav)
|
||||||
|
|
||||||
|
if len(wav) == 0:
|
||||||
|
log.warning("Audio file is empty after preprocessing")
|
||||||
|
return []
|
||||||
|
|
||||||
|
# Generate embeddings for sliding windows
|
||||||
|
embeddings, timestamps = self._generate_embeddings(wav, sample_rate)
|
||||||
|
|
||||||
|
if len(embeddings) == 0:
|
||||||
|
log.warning("No embeddings generated")
|
||||||
|
return []
|
||||||
|
|
||||||
|
# Cluster embeddings to identify speakers
|
||||||
|
speaker_labels = self._cluster_speakers(
|
||||||
|
embeddings,
|
||||||
|
num_speakers=num_speakers
|
||||||
|
)
|
||||||
|
|
||||||
|
# Convert to speaker segments
|
||||||
|
segments = self._create_segments(timestamps, speaker_labels)
|
||||||
|
|
||||||
|
# If transcript segments provided, align them
|
||||||
|
if transcript_segments:
|
||||||
|
segments = self._align_with_transcript(segments, transcript_segments)
|
||||||
|
|
||||||
|
log.info(
|
||||||
|
"Diarization complete",
|
||||||
|
num_segments=len(segments),
|
||||||
|
num_speakers=len(set(s.speaker_id for s in segments))
|
||||||
|
)
|
||||||
|
|
||||||
|
return segments
|
||||||
|
|
||||||
|
def _generate_embeddings(
|
||||||
|
self,
|
||||||
|
wav: np.ndarray,
|
||||||
|
sample_rate: int
|
||||||
|
) -> Tuple[np.ndarray, List[float]]:
|
||||||
|
"""Generate voice embeddings for sliding windows."""
|
||||||
|
embeddings = []
|
||||||
|
timestamps = []
|
||||||
|
|
||||||
|
# Window size in samples (1.5 seconds for good speaker representation)
|
||||||
|
window_size = int(1.5 * sample_rate)
|
||||||
|
step_size = int(self.embedding_step * sample_rate)
|
||||||
|
|
||||||
|
# Slide through audio
|
||||||
|
for start_sample in range(0, len(wav) - window_size, step_size):
|
||||||
|
end_sample = start_sample + window_size
|
||||||
|
window = wav[start_sample:end_sample]
|
||||||
|
|
||||||
|
# Get embedding for this window
|
||||||
|
try:
|
||||||
|
embedding = self.encoder.embed_utterance(window)
|
||||||
|
embeddings.append(embedding)
|
||||||
|
timestamps.append(start_sample / sample_rate)
|
||||||
|
except Exception as e:
|
||||||
|
log.debug(f"Failed to embed window at {start_sample/sample_rate}s: {e}")
|
||||||
|
continue
|
||||||
|
|
||||||
|
return np.array(embeddings), timestamps
|
||||||
|
|
||||||
|
def _cluster_speakers(
|
||||||
|
self,
|
||||||
|
embeddings: np.ndarray,
|
||||||
|
num_speakers: Optional[int] = None
|
||||||
|
) -> np.ndarray:
|
||||||
|
"""Cluster embeddings to identify speakers."""
|
||||||
|
if len(embeddings) == 0:
|
||||||
|
return np.array([])
|
||||||
|
|
||||||
|
# If number of speakers not specified, estimate it
|
||||||
|
if num_speakers is None:
|
||||||
|
num_speakers = self._estimate_num_speakers(embeddings)
|
||||||
|
|
||||||
|
# Ensure we don't exceed max speakers or embedding count
|
||||||
|
num_speakers = min(num_speakers, self.max_speakers, len(embeddings))
|
||||||
|
num_speakers = max(num_speakers, 1)
|
||||||
|
|
||||||
|
log.info(f"Clustering with {num_speakers} speakers")
|
||||||
|
|
||||||
|
# Use agglomerative clustering
|
||||||
|
clustering = AgglomerativeClustering(
|
||||||
|
n_clusters=num_speakers,
|
||||||
|
metric="cosine",
|
||||||
|
linkage="average"
|
||||||
|
)
|
||||||
|
|
||||||
|
labels = clustering.fit_predict(embeddings)
|
||||||
|
|
||||||
|
return labels
|
||||||
|
|
||||||
|
def _estimate_num_speakers(self, embeddings: np.ndarray) -> int:
|
||||||
|
"""Estimate the number of speakers from embeddings."""
|
||||||
|
if len(embeddings) < 2:
|
||||||
|
return 1
|
||||||
|
|
||||||
|
# Try different numbers of clusters and find the best
|
||||||
|
best_score = -1
|
||||||
|
best_n = 2
|
||||||
|
|
||||||
|
for n in range(2, min(6, len(embeddings))):
|
||||||
|
try:
|
||||||
|
clustering = AgglomerativeClustering(
|
||||||
|
n_clusters=n,
|
||||||
|
metric="cosine",
|
||||||
|
linkage="average"
|
||||||
|
)
|
||||||
|
labels = clustering.fit_predict(embeddings)
|
||||||
|
|
||||||
|
# Calculate silhouette-like score
|
||||||
|
score = self._cluster_quality_score(embeddings, labels)
|
||||||
|
|
||||||
|
if score > best_score:
|
||||||
|
best_score = score
|
||||||
|
best_n = n
|
||||||
|
except Exception:
|
||||||
|
continue
|
||||||
|
|
||||||
|
log.info(f"Estimated {best_n} speakers (score: {best_score:.3f})")
|
||||||
|
return best_n
|
||||||
|
|
||||||
|
def _cluster_quality_score(
|
||||||
|
self,
|
||||||
|
embeddings: np.ndarray,
|
||||||
|
labels: np.ndarray
|
||||||
|
) -> float:
|
||||||
|
"""Calculate a simple cluster quality score."""
|
||||||
|
unique_labels = np.unique(labels)
|
||||||
|
|
||||||
|
if len(unique_labels) < 2:
|
||||||
|
return 0.0
|
||||||
|
|
||||||
|
# Calculate average intra-cluster distance
|
||||||
|
intra_distances = []
|
||||||
|
for label in unique_labels:
|
||||||
|
cluster_embeddings = embeddings[labels == label]
|
||||||
|
if len(cluster_embeddings) > 1:
|
||||||
|
# Cosine distance within cluster
|
||||||
|
for i in range(len(cluster_embeddings)):
|
||||||
|
for j in range(i + 1, len(cluster_embeddings)):
|
||||||
|
dist = 1 - np.dot(cluster_embeddings[i], cluster_embeddings[j])
|
||||||
|
intra_distances.append(dist)
|
||||||
|
|
||||||
|
if not intra_distances:
|
||||||
|
return 0.0
|
||||||
|
|
||||||
|
avg_intra = np.mean(intra_distances)
|
||||||
|
|
||||||
|
# Calculate average inter-cluster distance
|
||||||
|
inter_distances = []
|
||||||
|
cluster_centers = []
|
||||||
|
for label in unique_labels:
|
||||||
|
cluster_embeddings = embeddings[labels == label]
|
||||||
|
center = cluster_embeddings.mean(axis=0)
|
||||||
|
cluster_centers.append(center)
|
||||||
|
|
||||||
|
for i in range(len(cluster_centers)):
|
||||||
|
for j in range(i + 1, len(cluster_centers)):
|
||||||
|
dist = 1 - np.dot(cluster_centers[i], cluster_centers[j])
|
||||||
|
inter_distances.append(dist)
|
||||||
|
|
||||||
|
avg_inter = np.mean(inter_distances) if inter_distances else 1.0
|
||||||
|
|
||||||
|
# Score: higher inter-cluster distance, lower intra-cluster distance is better
|
||||||
|
return (avg_inter - avg_intra) / max(avg_inter, avg_intra, 0.001)
|
||||||
|
|
||||||
|
def _create_segments(
|
||||||
|
self,
|
||||||
|
timestamps: List[float],
|
||||||
|
labels: np.ndarray
|
||||||
|
) -> List[SpeakerSegment]:
|
||||||
|
"""Convert clustered timestamps to speaker segments."""
|
||||||
|
if len(timestamps) == 0:
|
||||||
|
return []
|
||||||
|
|
||||||
|
segments = []
|
||||||
|
current_speaker = labels[0]
|
||||||
|
segment_start = timestamps[0]
|
||||||
|
|
||||||
|
for i in range(1, len(timestamps)):
|
||||||
|
if labels[i] != current_speaker:
|
||||||
|
# End current segment
|
||||||
|
segment_end = timestamps[i]
|
||||||
|
|
||||||
|
if segment_end - segment_start >= self.min_segment_duration:
|
||||||
|
segments.append(SpeakerSegment(
|
||||||
|
start=segment_start,
|
||||||
|
end=segment_end,
|
||||||
|
speaker_id=f"speaker_{current_speaker}",
|
||||||
|
speaker_label=f"Speaker {current_speaker + 1}"
|
||||||
|
))
|
||||||
|
|
||||||
|
# Start new segment
|
||||||
|
current_speaker = labels[i]
|
||||||
|
segment_start = timestamps[i]
|
||||||
|
|
||||||
|
# Add final segment
|
||||||
|
if len(timestamps) > 0:
|
||||||
|
segment_end = timestamps[-1] + self.embedding_step
|
||||||
|
if segment_end - segment_start >= self.min_segment_duration:
|
||||||
|
segments.append(SpeakerSegment(
|
||||||
|
start=segment_start,
|
||||||
|
end=segment_end,
|
||||||
|
speaker_id=f"speaker_{current_speaker}",
|
||||||
|
speaker_label=f"Speaker {current_speaker + 1}"
|
||||||
|
))
|
||||||
|
|
||||||
|
return segments
|
||||||
|
|
||||||
|
def _align_with_transcript(
|
||||||
|
self,
|
||||||
|
speaker_segments: List[SpeakerSegment],
|
||||||
|
transcript_segments: List[dict]
|
||||||
|
) -> List[SpeakerSegment]:
|
||||||
|
"""Align speaker segments with transcript segments."""
|
||||||
|
aligned = []
|
||||||
|
|
||||||
|
for trans in transcript_segments:
|
||||||
|
trans_start = trans.get("start", 0)
|
||||||
|
trans_end = trans.get("end", 0)
|
||||||
|
trans_mid = (trans_start + trans_end) / 2
|
||||||
|
|
||||||
|
# Find the speaker segment that best overlaps
|
||||||
|
best_speaker = None
|
||||||
|
best_overlap = 0
|
||||||
|
|
||||||
|
for speaker in speaker_segments:
|
||||||
|
# Calculate overlap
|
||||||
|
overlap_start = max(trans_start, speaker.start)
|
||||||
|
overlap_end = min(trans_end, speaker.end)
|
||||||
|
overlap = max(0, overlap_end - overlap_start)
|
||||||
|
|
||||||
|
if overlap > best_overlap:
|
||||||
|
best_overlap = overlap
|
||||||
|
best_speaker = speaker
|
||||||
|
|
||||||
|
if best_speaker:
|
||||||
|
aligned.append(SpeakerSegment(
|
||||||
|
start=trans_start,
|
||||||
|
end=trans_end,
|
||||||
|
speaker_id=best_speaker.speaker_id,
|
||||||
|
speaker_label=best_speaker.speaker_label,
|
||||||
|
confidence=best_overlap / (trans_end - trans_start) if trans_end > trans_start else 0
|
||||||
|
))
|
||||||
|
else:
|
||||||
|
# No match, assign unknown speaker
|
||||||
|
aligned.append(SpeakerSegment(
|
||||||
|
start=trans_start,
|
||||||
|
end=trans_end,
|
||||||
|
speaker_id="speaker_unknown",
|
||||||
|
speaker_label="Unknown Speaker",
|
||||||
|
confidence=0
|
||||||
|
))
|
||||||
|
|
||||||
|
return aligned
|
||||||
|
|
@ -0,0 +1,274 @@
|
||||||
|
"""
|
||||||
|
Meeting Intelligence Transcription Service
|
||||||
|
|
||||||
|
FastAPI service that handles:
|
||||||
|
- Audio extraction from video recordings
|
||||||
|
- Transcription using whisper.cpp
|
||||||
|
- Speaker diarization using resemblyzer
|
||||||
|
- Job queue management via Redis
|
||||||
|
"""
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import os
|
||||||
|
from contextlib import asynccontextmanager
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
from fastapi import FastAPI, BackgroundTasks, HTTPException
|
||||||
|
from fastapi.responses import JSONResponse
|
||||||
|
from pydantic import BaseModel
|
||||||
|
from redis import Redis
|
||||||
|
from rq import Queue
|
||||||
|
|
||||||
|
from .config import settings
|
||||||
|
from .transcriber import WhisperTranscriber
|
||||||
|
from .diarizer import SpeakerDiarizer
|
||||||
|
from .processor import JobProcessor
|
||||||
|
from .database import Database
|
||||||
|
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
|
||||||
|
# Pydantic models
|
||||||
|
class TranscribeRequest(BaseModel):
|
||||||
|
meeting_id: str
|
||||||
|
audio_path: str
|
||||||
|
priority: int = 5
|
||||||
|
enable_diarization: bool = True
|
||||||
|
language: Optional[str] = None
|
||||||
|
|
||||||
|
|
||||||
|
class TranscribeResponse(BaseModel):
|
||||||
|
job_id: str
|
||||||
|
status: str
|
||||||
|
message: str
|
||||||
|
|
||||||
|
|
||||||
|
class JobStatus(BaseModel):
|
||||||
|
job_id: str
|
||||||
|
status: str
|
||||||
|
progress: Optional[float] = None
|
||||||
|
result: Optional[dict] = None
|
||||||
|
error: Optional[str] = None
|
||||||
|
|
||||||
|
|
||||||
|
# Application state
|
||||||
|
class AppState:
|
||||||
|
redis: Optional[Redis] = None
|
||||||
|
queue: Optional[Queue] = None
|
||||||
|
db: Optional[Database] = None
|
||||||
|
transcriber: Optional[WhisperTranscriber] = None
|
||||||
|
diarizer: Optional[SpeakerDiarizer] = None
|
||||||
|
processor: Optional[JobProcessor] = None
|
||||||
|
|
||||||
|
|
||||||
|
state = AppState()
|
||||||
|
|
||||||
|
|
||||||
|
@asynccontextmanager
|
||||||
|
async def lifespan(app: FastAPI):
|
||||||
|
"""Application startup and shutdown."""
|
||||||
|
log.info("Starting transcription service...")
|
||||||
|
|
||||||
|
# Initialize Redis connection
|
||||||
|
state.redis = Redis.from_url(settings.redis_url)
|
||||||
|
state.queue = Queue("transcription", connection=state.redis)
|
||||||
|
|
||||||
|
# Initialize database
|
||||||
|
state.db = Database(settings.postgres_url)
|
||||||
|
await state.db.connect()
|
||||||
|
|
||||||
|
# Initialize transcriber
|
||||||
|
state.transcriber = WhisperTranscriber(
|
||||||
|
model_path=settings.whisper_model,
|
||||||
|
threads=settings.whisper_threads
|
||||||
|
)
|
||||||
|
|
||||||
|
# Initialize diarizer
|
||||||
|
state.diarizer = SpeakerDiarizer()
|
||||||
|
|
||||||
|
# Initialize job processor
|
||||||
|
state.processor = JobProcessor(
|
||||||
|
transcriber=state.transcriber,
|
||||||
|
diarizer=state.diarizer,
|
||||||
|
db=state.db,
|
||||||
|
redis=state.redis
|
||||||
|
)
|
||||||
|
|
||||||
|
# Start background worker
|
||||||
|
asyncio.create_task(state.processor.process_jobs())
|
||||||
|
|
||||||
|
log.info("Transcription service started successfully")
|
||||||
|
|
||||||
|
yield
|
||||||
|
|
||||||
|
# Shutdown
|
||||||
|
log.info("Shutting down transcription service...")
|
||||||
|
if state.processor:
|
||||||
|
await state.processor.stop()
|
||||||
|
if state.db:
|
||||||
|
await state.db.disconnect()
|
||||||
|
if state.redis:
|
||||||
|
state.redis.close()
|
||||||
|
|
||||||
|
log.info("Transcription service stopped")
|
||||||
|
|
||||||
|
|
||||||
|
app = FastAPI(
|
||||||
|
title="Meeting Intelligence Transcription Service",
|
||||||
|
description="Transcription and speaker diarization for meeting recordings",
|
||||||
|
version="1.0.0",
|
||||||
|
lifespan=lifespan
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/health")
|
||||||
|
async def health_check():
|
||||||
|
"""Health check endpoint."""
|
||||||
|
redis_ok = False
|
||||||
|
db_ok = False
|
||||||
|
|
||||||
|
try:
|
||||||
|
if state.redis:
|
||||||
|
state.redis.ping()
|
||||||
|
redis_ok = True
|
||||||
|
except Exception as e:
|
||||||
|
log.error("Redis health check failed", error=str(e))
|
||||||
|
|
||||||
|
try:
|
||||||
|
if state.db:
|
||||||
|
await state.db.health_check()
|
||||||
|
db_ok = True
|
||||||
|
except Exception as e:
|
||||||
|
log.error("Database health check failed", error=str(e))
|
||||||
|
|
||||||
|
status = "healthy" if (redis_ok and db_ok) else "unhealthy"
|
||||||
|
|
||||||
|
return {
|
||||||
|
"status": status,
|
||||||
|
"redis": redis_ok,
|
||||||
|
"database": db_ok,
|
||||||
|
"whisper_model": settings.whisper_model,
|
||||||
|
"threads": settings.whisper_threads
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/status")
|
||||||
|
async def service_status():
|
||||||
|
"""Get service status and queue info."""
|
||||||
|
queue_length = state.queue.count if state.queue else 0
|
||||||
|
processing = state.processor.active_jobs if state.processor else 0
|
||||||
|
|
||||||
|
return {
|
||||||
|
"status": "running",
|
||||||
|
"queue_length": queue_length,
|
||||||
|
"active_jobs": processing,
|
||||||
|
"workers": settings.num_workers,
|
||||||
|
"model": os.path.basename(settings.whisper_model)
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/transcribe", response_model=TranscribeResponse)
|
||||||
|
async def queue_transcription(request: TranscribeRequest, background_tasks: BackgroundTasks):
|
||||||
|
"""Queue a transcription job."""
|
||||||
|
log.info(
|
||||||
|
"Received transcription request",
|
||||||
|
meeting_id=request.meeting_id,
|
||||||
|
audio_path=request.audio_path
|
||||||
|
)
|
||||||
|
|
||||||
|
# Validate audio file exists
|
||||||
|
if not os.path.exists(request.audio_path):
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=404,
|
||||||
|
detail=f"Audio file not found: {request.audio_path}"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create job record in database
|
||||||
|
try:
|
||||||
|
job_id = await state.db.create_transcription_job(
|
||||||
|
meeting_id=request.meeting_id,
|
||||||
|
audio_path=request.audio_path,
|
||||||
|
enable_diarization=request.enable_diarization,
|
||||||
|
language=request.language,
|
||||||
|
priority=request.priority
|
||||||
|
)
|
||||||
|
except Exception as e:
|
||||||
|
log.error("Failed to create job", error=str(e))
|
||||||
|
raise HTTPException(status_code=500, detail=str(e))
|
||||||
|
|
||||||
|
# Queue the job
|
||||||
|
state.queue.enqueue(
|
||||||
|
"app.worker.process_transcription",
|
||||||
|
job_id,
|
||||||
|
job_timeout="2h",
|
||||||
|
result_ttl=86400 # 24 hours
|
||||||
|
)
|
||||||
|
|
||||||
|
log.info("Job queued", job_id=job_id)
|
||||||
|
|
||||||
|
return TranscribeResponse(
|
||||||
|
job_id=job_id,
|
||||||
|
status="queued",
|
||||||
|
message="Transcription job queued successfully"
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/transcribe/{job_id}", response_model=JobStatus)
|
||||||
|
async def get_job_status(job_id: str):
|
||||||
|
"""Get the status of a transcription job."""
|
||||||
|
job = await state.db.get_job(job_id)
|
||||||
|
|
||||||
|
if not job:
|
||||||
|
raise HTTPException(status_code=404, detail="Job not found")
|
||||||
|
|
||||||
|
return JobStatus(
|
||||||
|
job_id=job_id,
|
||||||
|
status=job["status"],
|
||||||
|
progress=job.get("progress"),
|
||||||
|
result=job.get("result"),
|
||||||
|
error=job.get("error_message")
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@app.delete("/transcribe/{job_id}")
|
||||||
|
async def cancel_job(job_id: str):
|
||||||
|
"""Cancel a pending transcription job."""
|
||||||
|
job = await state.db.get_job(job_id)
|
||||||
|
|
||||||
|
if not job:
|
||||||
|
raise HTTPException(status_code=404, detail="Job not found")
|
||||||
|
|
||||||
|
if job["status"] not in ["pending", "queued"]:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=400,
|
||||||
|
detail=f"Cannot cancel job in status: {job['status']}"
|
||||||
|
)
|
||||||
|
|
||||||
|
await state.db.update_job_status(job_id, "cancelled")
|
||||||
|
|
||||||
|
return {"status": "cancelled", "job_id": job_id}
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/meetings/{meeting_id}/transcript")
|
||||||
|
async def get_transcript(meeting_id: str):
|
||||||
|
"""Get the transcript for a meeting."""
|
||||||
|
transcript = await state.db.get_transcript(meeting_id)
|
||||||
|
|
||||||
|
if not transcript:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=404,
|
||||||
|
detail=f"No transcript found for meeting: {meeting_id}"
|
||||||
|
)
|
||||||
|
|
||||||
|
return {
|
||||||
|
"meeting_id": meeting_id,
|
||||||
|
"segments": transcript,
|
||||||
|
"segment_count": len(transcript)
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
import uvicorn
|
||||||
|
uvicorn.run(app, host="0.0.0.0", port=8001)
|
||||||
|
|
@ -0,0 +1,282 @@
|
||||||
|
"""
|
||||||
|
Job Processor for the Transcription Service.
|
||||||
|
|
||||||
|
Handles the processing pipeline:
|
||||||
|
1. Audio extraction from video
|
||||||
|
2. Transcription
|
||||||
|
3. Speaker diarization
|
||||||
|
4. Database storage
|
||||||
|
"""
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import os
|
||||||
|
import subprocess
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
from .config import settings
|
||||||
|
from .transcriber import WhisperTranscriber, TranscriptionResult
|
||||||
|
from .diarizer import SpeakerDiarizer, SpeakerSegment
|
||||||
|
from .database import Database
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
|
||||||
|
class JobProcessor:
|
||||||
|
"""Processes transcription jobs from the queue."""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
transcriber: WhisperTranscriber,
|
||||||
|
diarizer: SpeakerDiarizer,
|
||||||
|
db: Database,
|
||||||
|
redis
|
||||||
|
):
|
||||||
|
self.transcriber = transcriber
|
||||||
|
self.diarizer = diarizer
|
||||||
|
self.db = db
|
||||||
|
self.redis = redis
|
||||||
|
self.active_jobs = 0
|
||||||
|
self._running = False
|
||||||
|
self._workers = []
|
||||||
|
|
||||||
|
async def process_jobs(self):
|
||||||
|
"""Main job processing loop."""
|
||||||
|
self._running = True
|
||||||
|
log.info("Job processor started", num_workers=settings.num_workers)
|
||||||
|
|
||||||
|
# Start worker tasks
|
||||||
|
for i in range(settings.num_workers):
|
||||||
|
worker = asyncio.create_task(self._worker(i))
|
||||||
|
self._workers.append(worker)
|
||||||
|
|
||||||
|
# Wait for all workers
|
||||||
|
await asyncio.gather(*self._workers, return_exceptions=True)
|
||||||
|
|
||||||
|
async def stop(self):
|
||||||
|
"""Stop the job processor."""
|
||||||
|
self._running = False
|
||||||
|
for worker in self._workers:
|
||||||
|
worker.cancel()
|
||||||
|
log.info("Job processor stopped")
|
||||||
|
|
||||||
|
async def _worker(self, worker_id: int):
|
||||||
|
"""Worker that processes individual jobs."""
|
||||||
|
log.info(f"Worker {worker_id} started")
|
||||||
|
|
||||||
|
while self._running:
|
||||||
|
try:
|
||||||
|
# Get next job from database
|
||||||
|
job = await self.db.get_next_pending_job()
|
||||||
|
|
||||||
|
if job is None:
|
||||||
|
# No jobs, wait a bit
|
||||||
|
await asyncio.sleep(2)
|
||||||
|
continue
|
||||||
|
|
||||||
|
job_id = job["id"]
|
||||||
|
meeting_id = job["meeting_id"]
|
||||||
|
|
||||||
|
log.info(
|
||||||
|
f"Worker {worker_id} processing job",
|
||||||
|
job_id=job_id,
|
||||||
|
meeting_id=meeting_id
|
||||||
|
)
|
||||||
|
|
||||||
|
self.active_jobs += 1
|
||||||
|
|
||||||
|
try:
|
||||||
|
await self._process_job(job)
|
||||||
|
except Exception as e:
|
||||||
|
log.error(
|
||||||
|
"Job processing failed",
|
||||||
|
job_id=job_id,
|
||||||
|
error=str(e)
|
||||||
|
)
|
||||||
|
await self.db.update_job_status(
|
||||||
|
job_id,
|
||||||
|
"failed",
|
||||||
|
error_message=str(e)
|
||||||
|
)
|
||||||
|
finally:
|
||||||
|
self.active_jobs -= 1
|
||||||
|
|
||||||
|
except asyncio.CancelledError:
|
||||||
|
break
|
||||||
|
except Exception as e:
|
||||||
|
log.error(f"Worker {worker_id} error", error=str(e))
|
||||||
|
await asyncio.sleep(5)
|
||||||
|
|
||||||
|
log.info(f"Worker {worker_id} stopped")
|
||||||
|
|
||||||
|
async def _process_job(self, job: dict):
|
||||||
|
"""Process a single transcription job."""
|
||||||
|
job_id = job["id"]
|
||||||
|
meeting_id = job["meeting_id"]
|
||||||
|
audio_path = job.get("audio_path")
|
||||||
|
video_path = job.get("video_path")
|
||||||
|
enable_diarization = job.get("enable_diarization", True)
|
||||||
|
language = job.get("language")
|
||||||
|
|
||||||
|
# Update status to processing
|
||||||
|
await self.db.update_job_status(job_id, "processing")
|
||||||
|
await self.db.update_meeting_status(meeting_id, "transcribing")
|
||||||
|
|
||||||
|
# Step 1: Extract audio if we have video
|
||||||
|
if video_path and not audio_path:
|
||||||
|
log.info("Extracting audio from video", video_path=video_path)
|
||||||
|
await self.db.update_job_status(job_id, "processing", progress=0.1)
|
||||||
|
|
||||||
|
audio_path = await self._extract_audio(video_path, meeting_id)
|
||||||
|
await self.db.update_job_audio_path(job_id, audio_path)
|
||||||
|
|
||||||
|
if not audio_path or not os.path.exists(audio_path):
|
||||||
|
raise RuntimeError(f"Audio file not found: {audio_path}")
|
||||||
|
|
||||||
|
# Step 2: Transcribe
|
||||||
|
log.info("Starting transcription", audio_path=audio_path)
|
||||||
|
await self.db.update_job_status(job_id, "processing", progress=0.3)
|
||||||
|
|
||||||
|
transcription = await asyncio.get_event_loop().run_in_executor(
|
||||||
|
None,
|
||||||
|
lambda: self.transcriber.transcribe(audio_path, language)
|
||||||
|
)
|
||||||
|
|
||||||
|
log.info(
|
||||||
|
"Transcription complete",
|
||||||
|
segments=len(transcription.segments),
|
||||||
|
duration=transcription.duration
|
||||||
|
)
|
||||||
|
|
||||||
|
# Step 3: Speaker diarization
|
||||||
|
speaker_segments = []
|
||||||
|
if enable_diarization and len(transcription.segments) > 0:
|
||||||
|
log.info("Starting speaker diarization")
|
||||||
|
await self.db.update_job_status(job_id, "processing", progress=0.6)
|
||||||
|
await self.db.update_meeting_status(meeting_id, "diarizing")
|
||||||
|
|
||||||
|
# Convert transcript segments to dicts for diarizer
|
||||||
|
transcript_dicts = [
|
||||||
|
{"start": s.start, "end": s.end, "text": s.text}
|
||||||
|
for s in transcription.segments
|
||||||
|
]
|
||||||
|
|
||||||
|
speaker_segments = await asyncio.get_event_loop().run_in_executor(
|
||||||
|
None,
|
||||||
|
lambda: self.diarizer.diarize(
|
||||||
|
audio_path,
|
||||||
|
transcript_segments=transcript_dicts
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
log.info(
|
||||||
|
"Diarization complete",
|
||||||
|
num_segments=len(speaker_segments),
|
||||||
|
num_speakers=len(set(s.speaker_id for s in speaker_segments))
|
||||||
|
)
|
||||||
|
|
||||||
|
# Step 4: Store results
|
||||||
|
log.info("Storing transcript in database")
|
||||||
|
await self.db.update_job_status(job_id, "processing", progress=0.9)
|
||||||
|
|
||||||
|
await self._store_transcript(
|
||||||
|
meeting_id,
|
||||||
|
transcription,
|
||||||
|
speaker_segments
|
||||||
|
)
|
||||||
|
|
||||||
|
# Mark job complete
|
||||||
|
await self.db.update_job_status(
|
||||||
|
job_id,
|
||||||
|
"completed",
|
||||||
|
result={
|
||||||
|
"segments": len(transcription.segments),
|
||||||
|
"duration": transcription.duration,
|
||||||
|
"language": transcription.language,
|
||||||
|
"speakers": len(set(s.speaker_id for s in speaker_segments)) if speaker_segments else 0
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
# Update meeting status - ready for summarization
|
||||||
|
await self.db.update_meeting_status(meeting_id, "summarizing")
|
||||||
|
|
||||||
|
log.info("Job completed successfully", job_id=job_id)
|
||||||
|
|
||||||
|
async def _extract_audio(self, video_path: str, meeting_id: str) -> str:
|
||||||
|
"""Extract audio from video file using ffmpeg."""
|
||||||
|
output_dir = os.path.join(settings.audio_output_path, meeting_id)
|
||||||
|
os.makedirs(output_dir, exist_ok=True)
|
||||||
|
|
||||||
|
audio_path = os.path.join(output_dir, "audio.wav")
|
||||||
|
|
||||||
|
cmd = [
|
||||||
|
"ffmpeg",
|
||||||
|
"-i", video_path,
|
||||||
|
"-vn", # No video
|
||||||
|
"-acodec", "pcm_s16le", # PCM 16-bit
|
||||||
|
"-ar", str(settings.audio_sample_rate), # Sample rate
|
||||||
|
"-ac", str(settings.audio_channels), # Mono
|
||||||
|
"-y", # Overwrite
|
||||||
|
audio_path
|
||||||
|
]
|
||||||
|
|
||||||
|
log.debug("Running ffmpeg", cmd=" ".join(cmd))
|
||||||
|
|
||||||
|
process = await asyncio.create_subprocess_exec(
|
||||||
|
*cmd,
|
||||||
|
stdout=asyncio.subprocess.PIPE,
|
||||||
|
stderr=asyncio.subprocess.PIPE
|
||||||
|
)
|
||||||
|
|
||||||
|
_, stderr = await process.communicate()
|
||||||
|
|
||||||
|
if process.returncode != 0:
|
||||||
|
raise RuntimeError(f"FFmpeg failed: {stderr.decode()}")
|
||||||
|
|
||||||
|
log.info("Audio extracted", output=audio_path)
|
||||||
|
return audio_path
|
||||||
|
|
||||||
|
async def _store_transcript(
|
||||||
|
self,
|
||||||
|
meeting_id: str,
|
||||||
|
transcription: TranscriptionResult,
|
||||||
|
speaker_segments: list
|
||||||
|
):
|
||||||
|
"""Store transcript segments in database."""
|
||||||
|
# Create a map from time ranges to speakers
|
||||||
|
speaker_map = {}
|
||||||
|
for seg in speaker_segments:
|
||||||
|
speaker_map[(seg.start, seg.end)] = (seg.speaker_id, seg.speaker_label)
|
||||||
|
|
||||||
|
# Store each transcript segment
|
||||||
|
for i, segment in enumerate(transcription.segments):
|
||||||
|
# Find matching speaker
|
||||||
|
speaker_id = None
|
||||||
|
speaker_label = None
|
||||||
|
|
||||||
|
for (start, end), (sid, slabel) in speaker_map.items():
|
||||||
|
if segment.start >= start and segment.end <= end:
|
||||||
|
speaker_id = sid
|
||||||
|
speaker_label = slabel
|
||||||
|
break
|
||||||
|
|
||||||
|
# If no exact match, find closest overlap
|
||||||
|
if speaker_id is None:
|
||||||
|
for seg in speaker_segments:
|
||||||
|
if segment.start < seg.end and segment.end > seg.start:
|
||||||
|
speaker_id = seg.speaker_id
|
||||||
|
speaker_label = seg.speaker_label
|
||||||
|
break
|
||||||
|
|
||||||
|
await self.db.insert_transcript_segment(
|
||||||
|
meeting_id=meeting_id,
|
||||||
|
segment_index=i,
|
||||||
|
start_time=segment.start,
|
||||||
|
end_time=segment.end,
|
||||||
|
text=segment.text,
|
||||||
|
speaker_id=speaker_id,
|
||||||
|
speaker_label=speaker_label,
|
||||||
|
confidence=segment.confidence,
|
||||||
|
language=transcription.language
|
||||||
|
)
|
||||||
|
|
@ -0,0 +1,211 @@
|
||||||
|
"""
|
||||||
|
Whisper.cpp transcription wrapper.
|
||||||
|
|
||||||
|
Uses the whisper CLI to transcribe audio files.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
import subprocess
|
||||||
|
import tempfile
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from typing import List, Optional
|
||||||
|
|
||||||
|
import structlog
|
||||||
|
|
||||||
|
log = structlog.get_logger()
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class TranscriptSegment:
|
||||||
|
"""A single transcript segment."""
|
||||||
|
start: float
|
||||||
|
end: float
|
||||||
|
text: str
|
||||||
|
confidence: Optional[float] = None
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class TranscriptionResult:
|
||||||
|
"""Result of a transcription job."""
|
||||||
|
segments: List[TranscriptSegment]
|
||||||
|
language: str
|
||||||
|
duration: float
|
||||||
|
text: str
|
||||||
|
|
||||||
|
|
||||||
|
class WhisperTranscriber:
|
||||||
|
"""Wrapper for whisper.cpp transcription."""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
model_path: str = "/models/ggml-small.bin",
|
||||||
|
threads: int = 8,
|
||||||
|
language: str = "en"
|
||||||
|
):
|
||||||
|
self.model_path = model_path
|
||||||
|
self.threads = threads
|
||||||
|
self.language = language
|
||||||
|
self.whisper_bin = "/usr/local/bin/whisper"
|
||||||
|
|
||||||
|
# Verify whisper binary exists
|
||||||
|
if not os.path.exists(self.whisper_bin):
|
||||||
|
raise RuntimeError(f"Whisper binary not found at {self.whisper_bin}")
|
||||||
|
|
||||||
|
# Verify model exists
|
||||||
|
if not os.path.exists(model_path):
|
||||||
|
raise RuntimeError(f"Whisper model not found at {model_path}")
|
||||||
|
|
||||||
|
log.info(
|
||||||
|
"WhisperTranscriber initialized",
|
||||||
|
model=model_path,
|
||||||
|
threads=threads,
|
||||||
|
language=language
|
||||||
|
)
|
||||||
|
|
||||||
|
def transcribe(
|
||||||
|
self,
|
||||||
|
audio_path: str,
|
||||||
|
language: Optional[str] = None,
|
||||||
|
translate: bool = False
|
||||||
|
) -> TranscriptionResult:
|
||||||
|
"""
|
||||||
|
Transcribe an audio file.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
audio_path: Path to the audio file (WAV format, 16kHz mono)
|
||||||
|
language: Language code (e.g., 'en', 'es', 'fr') or None for auto-detect
|
||||||
|
translate: If True, translate to English
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
TranscriptionResult with segments and full text
|
||||||
|
"""
|
||||||
|
if not os.path.exists(audio_path):
|
||||||
|
raise FileNotFoundError(f"Audio file not found: {audio_path}")
|
||||||
|
|
||||||
|
log.info("Starting transcription", audio_path=audio_path, language=language)
|
||||||
|
|
||||||
|
# Create temp file for JSON output
|
||||||
|
with tempfile.NamedTemporaryFile(suffix=".json", delete=False) as tmp:
|
||||||
|
output_json = tmp.name
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Build whisper command
|
||||||
|
cmd = [
|
||||||
|
self.whisper_bin,
|
||||||
|
"-m", self.model_path,
|
||||||
|
"-f", audio_path,
|
||||||
|
"-t", str(self.threads),
|
||||||
|
"-oj", # Output JSON
|
||||||
|
"-of", output_json.replace(".json", ""), # Output file prefix
|
||||||
|
"--print-progress",
|
||||||
|
]
|
||||||
|
|
||||||
|
# Add language if specified
|
||||||
|
if language:
|
||||||
|
cmd.extend(["-l", language])
|
||||||
|
else:
|
||||||
|
cmd.extend(["-l", self.language])
|
||||||
|
|
||||||
|
# Add translate flag if needed
|
||||||
|
if translate:
|
||||||
|
cmd.append("--translate")
|
||||||
|
|
||||||
|
log.debug("Running whisper command", cmd=" ".join(cmd))
|
||||||
|
|
||||||
|
# Run whisper
|
||||||
|
result = subprocess.run(
|
||||||
|
cmd,
|
||||||
|
capture_output=True,
|
||||||
|
text=True,
|
||||||
|
timeout=7200 # 2 hour timeout
|
||||||
|
)
|
||||||
|
|
||||||
|
if result.returncode != 0:
|
||||||
|
log.error(
|
||||||
|
"Whisper transcription failed",
|
||||||
|
returncode=result.returncode,
|
||||||
|
stderr=result.stderr
|
||||||
|
)
|
||||||
|
raise RuntimeError(f"Whisper failed: {result.stderr}")
|
||||||
|
|
||||||
|
# Parse JSON output
|
||||||
|
with open(output_json, "r") as f:
|
||||||
|
whisper_output = json.load(f)
|
||||||
|
|
||||||
|
# Extract segments
|
||||||
|
segments = []
|
||||||
|
full_text_parts = []
|
||||||
|
|
||||||
|
for item in whisper_output.get("transcription", []):
|
||||||
|
segment = TranscriptSegment(
|
||||||
|
start=item["offsets"]["from"] / 1000.0, # Convert ms to seconds
|
||||||
|
end=item["offsets"]["to"] / 1000.0,
|
||||||
|
text=item["text"].strip(),
|
||||||
|
confidence=item.get("confidence")
|
||||||
|
)
|
||||||
|
segments.append(segment)
|
||||||
|
full_text_parts.append(segment.text)
|
||||||
|
|
||||||
|
# Get detected language
|
||||||
|
detected_language = whisper_output.get("result", {}).get("language", language or self.language)
|
||||||
|
|
||||||
|
# Calculate total duration
|
||||||
|
duration = segments[-1].end if segments else 0.0
|
||||||
|
|
||||||
|
log.info(
|
||||||
|
"Transcription complete",
|
||||||
|
segments=len(segments),
|
||||||
|
duration=duration,
|
||||||
|
language=detected_language
|
||||||
|
)
|
||||||
|
|
||||||
|
return TranscriptionResult(
|
||||||
|
segments=segments,
|
||||||
|
language=detected_language,
|
||||||
|
duration=duration,
|
||||||
|
text=" ".join(full_text_parts)
|
||||||
|
)
|
||||||
|
|
||||||
|
finally:
|
||||||
|
# Clean up temp files
|
||||||
|
for ext in [".json", ".txt", ".vtt", ".srt"]:
|
||||||
|
tmp_file = output_json.replace(".json", ext)
|
||||||
|
if os.path.exists(tmp_file):
|
||||||
|
os.remove(tmp_file)
|
||||||
|
|
||||||
|
def transcribe_with_timestamps(
|
||||||
|
self,
|
||||||
|
audio_path: str,
|
||||||
|
language: Optional[str] = None
|
||||||
|
) -> List[dict]:
|
||||||
|
"""
|
||||||
|
Transcribe with word-level timestamps.
|
||||||
|
|
||||||
|
Returns list of dicts with word, start, end, confidence.
|
||||||
|
"""
|
||||||
|
result = self.transcribe(audio_path, language)
|
||||||
|
|
||||||
|
# Convert segments to word-level format
|
||||||
|
# Note: whisper.cpp provides segment-level timestamps by default
|
||||||
|
# For true word-level, we'd need the --max-len 1 flag but it's slower
|
||||||
|
|
||||||
|
words = []
|
||||||
|
for segment in result.segments:
|
||||||
|
# Estimate word timestamps within segment
|
||||||
|
segment_words = segment.text.split()
|
||||||
|
if not segment_words:
|
||||||
|
continue
|
||||||
|
|
||||||
|
duration = segment.end - segment.start
|
||||||
|
word_duration = duration / len(segment_words)
|
||||||
|
|
||||||
|
for i, word in enumerate(segment_words):
|
||||||
|
words.append({
|
||||||
|
"word": word,
|
||||||
|
"start": segment.start + (i * word_duration),
|
||||||
|
"end": segment.start + ((i + 1) * word_duration),
|
||||||
|
"confidence": segment.confidence
|
||||||
|
})
|
||||||
|
|
||||||
|
return words
|
||||||
|
|
@ -0,0 +1,41 @@
|
||||||
|
# Transcription Service Dependencies
|
||||||
|
|
||||||
|
# Web framework
|
||||||
|
fastapi==0.109.2
|
||||||
|
uvicorn[standard]==0.27.1
|
||||||
|
python-multipart==0.0.9
|
||||||
|
|
||||||
|
# Job queue
|
||||||
|
redis==5.0.1
|
||||||
|
rq==1.16.0
|
||||||
|
|
||||||
|
# Database
|
||||||
|
asyncpg==0.29.0
|
||||||
|
sqlalchemy[asyncio]==2.0.25
|
||||||
|
psycopg2-binary==2.9.9
|
||||||
|
|
||||||
|
# Audio processing
|
||||||
|
pydub==0.25.1
|
||||||
|
soundfile==0.12.1
|
||||||
|
librosa==0.10.1
|
||||||
|
numpy==1.26.4
|
||||||
|
|
||||||
|
# Speaker diarization
|
||||||
|
resemblyzer==0.1.3
|
||||||
|
torch==2.2.0
|
||||||
|
torchaudio==2.2.0
|
||||||
|
scipy==1.12.0
|
||||||
|
scikit-learn==1.4.0
|
||||||
|
|
||||||
|
# Sentence embeddings (for semantic search)
|
||||||
|
sentence-transformers==2.3.1
|
||||||
|
|
||||||
|
# Utilities
|
||||||
|
pydantic==2.6.1
|
||||||
|
pydantic-settings==2.1.0
|
||||||
|
python-dotenv==1.0.1
|
||||||
|
httpx==0.26.0
|
||||||
|
tenacity==8.2.3
|
||||||
|
|
||||||
|
# Logging & monitoring
|
||||||
|
structlog==24.1.0
|
||||||
Loading…
Reference in New Issue