| id | title | status | assignee | created_date | labels | dependencies | priority |
|---|---|---|---|---|---|---|---|
| task-4 | Phase 3: AI Integration Shapes | Done | | 2026-01-02 15:54 | | | medium |
Description
Port AI-powered shapes using existing MCP servers and APIs:
- folk-image-gen - Image generation (fal.ai Flux)
  - Prompt input, image history thread
  - Loading states, error handling
  - Uses: mcp__fal-ai__fal_generate_image
- folk-video-gen - Video generation (WAN 2.1)
  - Image-to-video, text-to-video
  - Duration control, queue polling
  - Uses: mcp__fal-ai__fal_generate_video
- folk-prompt - LLM prompt executor
  - Agent binding, multiple personalities
  - Output streaming
  - Uses: mcp__gemini__gemini_generate or direct Anthropic API
- folk-transcription - Audio transcription (Whisper)
  - Real-time transcription, pause/resume
  - Speaker diarization
  - Uses: Web Speech API fallback + Whisper API
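The queue polling listed for folk-video-gen could look roughly like the sketch below. It is only an illustration: the `JobStatus` shape, the `pollUntilDone` name, and the default interval and attempt limit are assumptions, not part of the fal.ai API or of the actual shape code.

```typescript
// Hypothetical job status shape; the real queue response may differ.
type JobStatus<T> =
  | { state: "pending" }
  | { state: "done"; result: T }
  | { state: "failed"; error: string };

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls checkStatus until the job is done, has failed,
// or maxAttempts checks have been made.
async function pollUntilDone<T>(
  checkStatus: () => Promise<JobStatus<T>>,
  intervalMs = 2000, // assumed default, tune for the real backend
  maxAttempts = 60, // assumed default
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.state === "done") return status.result;
    if (status.state === "failed") throw new Error(status.error);
    await sleep(intervalMs);
  }
  throw new Error(`Job still pending after ${maxAttempts} attempts`);
}
```

Passing the status check in as a function keeps the loop independent of how the job is queried, so the same helper would work whether the status comes from an MCP tool call or a backend endpoint.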
Simplifications:
- Use MCP tools directly instead of custom API clients
- Simplify loading states to CSS classes
- Remove complex React hooks, use async/await patterns
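A minimal sketch of the last two simplifications, assuming the loading state is nothing more than a CSS class toggled around an awaited call. The helper name `withLoadingClass` and the class names `loading` and `error` are illustrative and not taken from the shapes themselves.

```typescript
// Minimal structural type so the helper accepts any DOM element
// (or a test double) that exposes classList.add/remove.
interface HasClassList {
  classList: { add(name: string): void; remove(name: string): void };
}

// Adds "loading" while the task runs and always removes it afterwards.
// On failure it adds "error" and rethrows so the caller can react.
async function withLoadingClass<T>(
  el: HasClassList,
  task: () => Promise<T>,
): Promise<T> {
  el.classList.remove("error"); // clear any previous failure state
  el.classList.add("loading");
  try {
    return await task();
  } catch (err) {
    el.classList.add("error");
    throw err;
  } finally {
    el.classList.remove("loading");
  }
}
```

A shape could wrap its generate call as `await withLoadingClass(this, () => generate(prompt))`, where `generate` stands in for whatever async call the shape makes, and leave spinners and error styling entirely to the stylesheet.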
Acceptance Criteria
- #1 folk-image-gen with fal.ai integration (API endpoint placeholder)
- #2 folk-video-gen with video generation (I2V and T2V modes)
- #3 folk-prompt with LLM chat interface
- #4 folk-transcription with Web Speech API
Implementation Notes
Created four AI integration shapes:
- lib/folk-image-gen.ts: Image generation UI with prompt, style selector, loading states
- lib/folk-video-gen.ts: Video generation with I2V/T2V mode tabs, image upload, duration control
- lib/folk-prompt.ts: Chat interface with model selection, message history, markdown formatting
- lib/folk-transcription.ts: Real-time transcription with Web Speech API, pause/resume, copy/clear
All shapes call placeholder API endpoints (/api/image-gen, /api/video-gen, /api/prompt) that need to be implemented in the backend. The transcription component uses the browser's native Web Speech API.
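Because the backend contract is not defined yet, the following is only a sketch of how a shape might call the `/api/image-gen` placeholder. The request body (`{ prompt }`), the response field `imageUrl`, and the names `requestImage` and `FetchLike` are all assumptions made for illustration.

```typescript
// Hypothetical response shape; the task leaves the backend contract open.
interface ImageGenResponse {
  imageUrl: string;
}

// Narrow fetch-like signature so the function can be exercised without a browser.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Posts the prompt as JSON and returns the generated image URL.
async function requestImage(
  prompt: string,
  fetchImpl: FetchLike,
): Promise<string> {
  const res = await fetchImpl("/api/image-gen", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }), // assumed request body
  });
  if (!res.ok) {
    throw new Error(`Image generation failed with HTTP ${res.status}`);
  }
  const data = (await res.json()) as Partial<ImageGenResponse>;
  if (typeof data.imageUrl !== "string") {
    throw new Error("Response is missing imageUrl");
  }
  return data.imageUrl;
}
```

In the browser the second argument could be `(url, init) => fetch(url, init)`. Injecting it keeps the function testable until the real endpoint exists.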
Integrated into canvas.html with toolbar buttons (Image, Video, AI, Transcribe).