---
id: task-4
title: 'Phase 3: AI Integration Shapes'
status: To Do
assignee: []
created_date: '2026-01-02 15:54'
labels:
- migration
- shapes
- ai
dependencies: []
priority: medium
---
## Description
<!-- SECTION:DESCRIPTION:BEGIN -->
Port AI-powered shapes using existing MCP servers and APIs:
1. **folk-image-gen** - Image generation (fal.ai Flux)
   - Prompt input, image history thread
   - Loading states, error handling
   - Uses: mcp__fal-ai__fal_generate_image
2. **folk-video-gen** - Video generation (WAN 2.1)
   - Image-to-video, text-to-video
   - Duration control, queue polling
   - Uses: mcp__fal-ai__fal_generate_video
3. **folk-prompt** - LLM prompt executor
   - Agent binding, multiple personalities
   - Output streaming
   - Uses: mcp__gemini__gemini_generate or direct Anthropic API
4. **folk-transcription** - Audio transcription (Whisper)
   - Real-time transcription, pause/resume
   - Speaker diarization
   - Uses: Web Speech API fallback + Whisper API
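
The queue polling mentioned for folk-video-gen could be sketched roughly as below. This is a minimal illustration, not the project's implementation: `JobStatus`, `pollUntilDone`, and the state names are hypothetical, and the status check is injected as a function because the actual response shape of the video tool is not specified in this task.

```typescript
// Hypothetical job states; the real queue may report different ones.
type JobStatus<T> =
  | { state: "queued" | "running" }
  | { state: "done"; result: T }
  | { state: "failed"; error: string };

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls checkStatus repeatedly until the job is done, fails, or the
// attempt budget is exhausted. checkStatus is supplied by the caller.
async function pollUntilDone<T>(
  checkStatus: () => Promise<JobStatus<T>>,
  intervalMs = 1000,
  maxAttempts = 60,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.state === "done") return status.result;
    if (status.state === "failed") throw new Error(status.error);
    await sleep(intervalMs); // still queued or running; wait and retry
  }
  throw new Error(`Job did not finish after ${maxAttempts} attempts`);
}
```

Injecting `checkStatus` keeps the loop independent of any particular client, so the same helper could back either an MCP tool call or a direct HTTP request.
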

Simplifications:
- Use MCP tools directly instead of custom API clients
- Simplify loading states to CSS classes
- Replace complex React hooks with async/await patterns
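
As one possible reading of the last two points, a shape could wrap any async operation in a helper that toggles CSS classes instead of tracking loading state in hooks. The sketch below is an assumption, not existing code: `withLoadingState`, `HasClassList`, and the class names `loading` and `error` are hypothetical. The parameter type is a structural subset of the DOM `Element` interface, so a real element can be passed in.

```typescript
// Minimal structural type; any DOM Element satisfies it via classList.
interface HasClassList {
  classList: {
    add(...tokens: string[]): void;
    remove(...tokens: string[]): void;
  };
}

// Runs task while the element carries a "loading" class. On failure an
// "error" class is added and the error is rethrown to the caller.
async function withLoadingState<T>(
  el: HasClassList,
  task: () => Promise<T>,
): Promise<T> {
  el.classList.remove("error"); // clear any previous failure
  el.classList.add("loading");
  try {
    return await task();
  } catch (err) {
    el.classList.add("error");
    throw err;
  } finally {
    el.classList.remove("loading"); // always cleared, success or failure
  }
}
```

A shape could then call `await withLoadingState(this, () => generate(prompt))`, where `generate` stands in for whichever tool call the shape uses, and style the `.loading` and `.error` states in its stylesheet.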
<!-- SECTION:DESCRIPTION:END -->
## Acceptance Criteria
<!-- AC:BEGIN -->
- [ ] #1 folk-image-gen with fal.ai integration
- [ ] #2 folk-video-gen with video generation
- [ ] #3 folk-prompt with LLM streaming
- [ ] #4 folk-transcription with Whisper
<!-- AC:END -->