rspace-online/task-4 - Phase-3-AI-Integration-Shapes.md at a6d2cdcf86c28df745e9590ab6fd3205846d81d5 - rspace-online


id: task-4
title: Phase 3: AI Integration Shapes
status: Done
assignee:
created_date: 2026-01-02 15:54
labels: migration, shapes
dependencies:
priority: medium

Description

Port AI-powered shapes using existing MCP servers and APIs:

- folk-image-gen - Image generation (fal.ai Flux)
  - Prompt input, image history thread
  - Loading states, error handling
  - Uses: mcp__fal-ai__fal_generate_image
- folk-video-gen - Video generation (WAN 2.1)
  - Image-to-video, text-to-video
  - Duration control, queue polling
  - Uses: mcp__fal-ai__fal_generate_video
- folk-prompt - LLM prompt executor
  - Agent binding, multiple personalities
  - Output streaming
  - Uses: mcp__gemini__gemini_generate or direct Anthropic API
- folk-transcription - Audio transcription (Whisper)
  - Real-time transcription, pause/resume
  - Speaker diarization
  - Uses: Web Speech API fallback + Whisper API
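
The queue polling listed for folk-video-gen could look roughly like the sketch below. It is not taken from the repository: `pollUntilDone`, the `JobStatus` shape, and the interval and attempt defaults are all hypothetical, and the status check is injected so the sketch assumes nothing about the actual fal.ai or backend queue API.

```typescript
// Hypothetical job status shape; the real queue response is not specified in this task.
type JobStatus =
  | { state: "queued" | "running" }
  | { state: "done"; videoUrl: string }
  | { state: "failed"; error: string };

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Polls `check` until the job finishes, fails, or the attempt budget runs out.
async function pollUntilDone(
  check: () => Promise<JobStatus>,
  intervalMs = 2000,
  maxAttempts = 150,
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await check();
    if (status.state === "done") return status.videoUrl;
    if (status.state === "failed") throw new Error(status.error);
    await sleep(intervalMs);
  }
  throw new Error(`Video job still pending after ${maxAttempts} checks`);
}
```

Bounding the number of attempts keeps a stuck job from polling forever; the shape can surface the thrown error through its normal error state.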

Simplifications:

- Use MCP tools directly instead of custom API clients
- Simplify loading states to CSS classes
- Remove complex React hooks, use async/await patterns
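
As an illustration of the last two simplifications, a loading state driven by CSS classes and plain async/await might be sketched as follows. The helper name `withLoadingState` and the class names `loading` and `error` are assumptions made for this example, not names confirmed by the shapes themselves.

```typescript
// Structural subset of HTMLElement, so the helper also works with a test double.
interface HasClassList {
  classList: { add(name: string): void; remove(name: string): void };
}

// Runs `task` while the element carries a "loading" class. On failure it adds
// an "error" class and rethrows. Both class names are illustrative.
async function withLoadingState<T>(
  el: HasClassList,
  task: () => Promise<T>,
): Promise<T> {
  el.classList.remove("error");
  el.classList.add("loading");
  try {
    return await task();
  } catch (err) {
    el.classList.add("error");
    throw err;
  } finally {
    // Always clear the loading class, whether the task succeeded or failed.
    el.classList.remove("loading");
  }
}
```

A shape could wrap each request in this helper and leave the spinner and error styling entirely to the stylesheet, which is the kind of state a React hook would otherwise have tracked.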

Acceptance Criteria

#1 folk-image-gen with fal.ai integration (API endpoint placeholder)
#2 folk-video-gen with video generation (I2V and T2V modes)
#3 folk-prompt with LLM chat interface
#4 folk-transcription with Web Speech API

Implementation Notes

Created four AI integration shapes:

- lib/folk-image-gen.ts: Image generation UI with prompt, style selector, loading states
- lib/folk-video-gen.ts: Video generation with I2V/T2V mode tabs, image upload, duration control
- lib/folk-prompt.ts: Chat interface with model selection, message history, markdown formatting
- lib/folk-transcription.ts: Real-time transcription with Web Speech API, pause/resume, copy/clear
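
For the transcription shape, the handling of Web Speech API results can be sketched as a small pure function. The `resultIndex`, `results`, `isFinal`, and `transcript` fields mirror the browser's `SpeechRecognitionEvent`; the `TranscriptState` type and the `applyResults` name are hypothetical and not taken from lib/folk-transcription.ts.

```typescript
// Structural subset of the Web Speech API's SpeechRecognitionEvent.
interface ResultLike {
  isFinal: boolean;
  [index: number]: { transcript: string };
}
interface RecognitionEventLike {
  resultIndex: number;
  results: { length: number; [index: number]: ResultLike };
}

// Hypothetical state kept by the shape between result events.
interface TranscriptState {
  finalText: string;
  interimText: string;
}

// Folds one result event into the transcript: finalized phrases are appended,
// while interim text is rebuilt from scratch on every event.
function applyResults(
  state: TranscriptState,
  event: RecognitionEventLike,
): TranscriptState {
  let finalText = state.finalText;
  let interimText = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    const phrase = result[0].transcript; // first alternative of this result
    if (result.isFinal) finalText += phrase.trim() + " ";
    else interimText += phrase;
  }
  return { finalText, interimText };
}
```

Because the accumulated text lives outside the recognizer, pausing and resuming recognition would not lose what has already been finalized, and copy/clear only need to read or reset this state.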

The image, video, and prompt shapes call placeholder API endpoints (/api/image-gen, /api/video-gen, /api/prompt) that still need to be implemented in the backend. The transcription shape uses the browser's native Web Speech API instead.
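
Since the backend contract is not yet defined, the following is only one possible sketch of how a shape might call the image endpoint. Only the `/api/image-gen` path comes from the notes above; the request and response fields, the `PostJson` transport type, and the `generateImage` name are assumptions made for illustration.

```typescript
// Assumed wire format; only the endpoint path comes from the implementation notes.
interface ImageGenRequest {
  prompt: string;
  style?: string;
}
interface ImageGenResponse {
  imageUrl: string;
}

// Minimal transport signature so the sketch does not depend on fetch or DOM typings.
type PostJson = (url: string, body: unknown) => Promise<unknown>;

// Runtime check, since a placeholder backend may return anything.
function isImageGenResponse(value: unknown): value is ImageGenResponse {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { imageUrl?: unknown }).imageUrl === "string"
  );
}

// Validates the prompt, posts it to the placeholder endpoint, and returns the image URL.
async function generateImage(
  postJson: PostJson,
  request: ImageGenRequest,
): Promise<string> {
  const prompt = request.prompt.trim();
  if (prompt === "") throw new Error("Prompt must not be empty");
  const reply = await postJson("/api/image-gen", { ...request, prompt });
  if (!isImageGenResponse(reply)) {
    throw new Error("Unexpected response from /api/image-gen");
  }
  return reply.imageUrl;
}
```

In the browser, `postJson` would typically wrap `fetch` with a POST method and a JSON body. Keeping the transport injectable also lets the shape be exercised against a stub until the real endpoint exists.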

The shapes are integrated into canvas.html with toolbar buttons (Image, Video, AI, Transcribe).
