feat: add RunPod AI integration with image generation and enhanced LLM support

Add comprehensive RunPod AI API integration including:
- New runpodApi.ts client for RunPod endpoint communication
- Image generation tool and shape utilities for AI-generated images
- Enhanced LLM utilities with RunPod support for text generation
- Updated Whisper transcription with improved error handling
- UI components for image generation tool
- Setup and testing documentation

This commit preserves work-in-progress RunPod integration before switching branches.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Jeff Emmett 2025-11-16 16:14:39 -07:00
parent 5878579980
commit 080e5a3b87
13 changed files with 2038 additions and 94 deletions

RUNPOD_SETUP.md (new file, 255 lines)

@@ -0,0 +1,255 @@
# RunPod WhisperX Integration Setup
This guide explains how to set up and use the RunPod WhisperX endpoint for transcription in the canvas website.
## Overview
The transcription system can now use a hosted WhisperX endpoint on RunPod instead of running the Whisper model locally in the browser. This provides:
- Better accuracy with WhisperX's advanced features
- Faster processing (no model download needed)
- Reduced client-side resource usage
- Support for longer audio files
## Prerequisites
1. A RunPod account with an active WhisperX endpoint
2. Your RunPod API key
3. Your RunPod endpoint ID
## Configuration
### Environment Variables
Add the following environment variables to your `.env.local` file (or your deployment environment):
```bash
# RunPod Configuration
VITE_RUNPOD_API_KEY=your_runpod_api_key_here
VITE_RUNPOD_ENDPOINT_ID=your_endpoint_id_here
```
Or if using Next.js:
```bash
NEXT_PUBLIC_RUNPOD_API_KEY=your_runpod_api_key_here
NEXT_PUBLIC_RUNPOD_ENDPOINT_ID=your_endpoint_id_here
```
### Getting Your RunPod Credentials
1. **API Key**:
   - Go to [RunPod Settings](https://www.runpod.io/console/user/settings)
   - Navigate to the API Keys section
   - Create a new API key or copy an existing one
2. **Endpoint ID**:
   - Go to [RunPod Serverless Endpoints](https://www.runpod.io/console/serverless)
   - Find your WhisperX endpoint
   - Copy the endpoint ID from the URL or the endpoint details
   - Example: if your endpoint URL is `https://api.runpod.ai/v2/lrtisuv8ixbtub/run`, then `lrtisuv8ixbtub` is your endpoint ID
## Usage
### Automatic Detection
The transcription hook automatically detects if RunPod is configured and uses it instead of the local Whisper model. No code changes are needed!
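Under the hood this is a single check (the `shouldUseRunPod` logic from the hook, visible later in this commit's diff):

```typescript
// Explicit option wins; otherwise auto-detect from environment configuration
const shouldUseRunPod = useRunPod !== undefined ? useRunPod : isRunPodConfigured()
```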
### Manual Override
If you want to explicitly control which transcription method to use:
```typescript
import { useWhisperTranscription } from '@/hooks/useWhisperTranscriptionSimple'
const {
  isRecording,
  transcript,
  startRecording,
  stopRecording
} = useWhisperTranscription({
  useRunPod: true, // Force RunPod usage
  language: 'en',
  onTranscriptUpdate: (text) => {
    console.log('New transcript:', text)
  }
})
```
Or to force local model:
```typescript
useWhisperTranscription({
  useRunPod: false, // Force local Whisper model
  // ... other options
})
```
## API Format
The integration sends audio data to your RunPod endpoint in the following format:
```json
{
"input": {
"audio": "base64_encoded_audio_data",
"audio_format": "audio/wav",
"language": "en",
"task": "transcribe"
}
}
```
### Expected Response Format
The endpoint should return one of these formats:
**Direct Response:**
```json
{
"output": {
"text": "Transcribed text here"
}
}
```
**Or with segments:**
```json
{
"output": {
"segments": [
{
"start": 0.0,
"end": 2.5,
"text": "Transcribed text here"
}
]
}
}
```
**Async Job Pattern:**
```json
{
"id": "job-id-123",
"status": "IN_QUEUE"
}
```
The integration automatically handles async jobs by polling the status endpoint until completion.
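For reference, here is a simplified sketch of that polling loop (the full implementation is in `src/lib/runpodApi.ts`, part of this commit):

```typescript
// Simplified sketch of the status-polling loop in src/lib/runpodApi.ts
async function waitForJob(endpointId: string, jobId: string, apiKey: string): Promise<string> {
  for (let attempt = 0; attempt < 120; attempt++) {
    const res = await fetch(`https://api.runpod.ai/v2/${endpointId}/status/${jobId}`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    })
    const data = await res.json()
    if (data.status === 'COMPLETED') return data.output?.text ?? ''
    if (data.status === 'FAILED') throw new Error(data.error ?? 'Transcription job failed')
    await new Promise(resolve => setTimeout(resolve, 1000)) // wait 1s between checks
  }
  throw new Error('Job polling timed out')
}
```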
## Customizing the API Request
If your WhisperX endpoint expects a different request format, you can modify `src/lib/runpodApi.ts`:
```typescript
// In transcribeWithRunPod function
const requestBody = {
  input: {
    // Adjust these fields based on your endpoint
    audio: audioBase64,
    // Add or modify fields as needed
  }
}
```
## Troubleshooting
### "RunPod API key or endpoint ID not configured"
- Ensure environment variables are set correctly
- Restart your development server after adding environment variables
- Check that variable names match exactly (case-sensitive)
### "RunPod API error: 401"
- Verify your API key is correct
- Check that your API key has not expired
- Ensure you're using the correct API key format
### "RunPod API error: 404"
- Verify your endpoint ID is correct
- Check that your endpoint is active in the RunPod console
- Ensure the endpoint URL format matches: `https://api.runpod.ai/v2/{ENDPOINT_ID}/run`
### "No transcription text found in RunPod response"
- Check your endpoint's response format matches the expected format
- Verify your WhisperX endpoint is configured correctly
- Check the browser console for detailed error messages
### "Failed to return job results" (400 Bad Request)
This error occurs on the **server side** when your WhisperX endpoint tries to return results. This typically means:
1. **Response format mismatch**: Your endpoint's response doesn't match RunPod's expected format
   - Ensure your endpoint returns `{"output": {"text": "..."}}` or `{"output": {"segments": [...]}}`
   - The response must be valid JSON
   - Check your endpoint handler code to ensure it's returning the correct structure
2. **Response size limits**: The response might be too large
   - Try with shorter audio files first
   - Check RunPod's response size limits
3. **Timeout issues**: The endpoint might be taking too long to process
   - Check your endpoint logs for processing time
   - Consider optimizing your WhisperX model configuration
4. **Check the endpoint handler**: Review your WhisperX endpoint's `handler.py` or equivalent:
```python
# Example correct format
def handler(event):
    # ... process audio ...
    return {
        "output": {
            "text": transcription_text
        }
    }
```
### Transcription not working
- Check browser console for errors
- Verify your endpoint is active and responding
- Test your endpoint directly using curl or Postman
- Ensure audio format is supported (WAV format is recommended)
- Check RunPod endpoint logs for server-side errors
## Testing Your Endpoint
You can test your RunPod endpoint directly:
```bash
curl -X POST https://api.runpod.ai/v2/YOUR_ENDPOINT_ID/run \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"input": {
"audio": "base64_audio_data_here",
"audio_format": "audio/wav",
"language": "en"
}
}'
```
## Fallback Behavior
If RunPod is not configured or fails, the system behaves as follows (sketched in code after this list):
1. Try to use RunPod if configured
2. Fall back to local Whisper model if RunPod fails or is not configured
3. Show error messages if both methods fail
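A minimal sketch of that order, assuming a hypothetical `transcribeLocally` helper that wraps the local Whisper pipeline:

```typescript
import { isRunPodConfigured } from '../lib/clientConfig'
import { transcribeWithRunPod } from '../lib/runpodApi'

// Hypothetical wrapper around the local Whisper model (not part of this commit)
declare function transcribeLocally(blob: Blob, language: string): Promise<string>

async function transcribe(audioBlob: Blob, language: string): Promise<string> {
  if (isRunPodConfigured()) {
    try {
      return await transcribeWithRunPod(audioBlob, language) // 1. try RunPod first
    } catch (err) {
      console.warn('RunPod failed, falling back to local model:', err) // 2. fall back
    }
  }
  return transcribeLocally(audioBlob, language) // 3. errors here surface to the caller
}
```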
## Performance Considerations
- **RunPod**: Better for longer audio files and higher accuracy, but requires network connection
- **Local Model**: Works offline, but requires model download and uses more client resources
## Support
For issues specific to:
- **RunPod API**: Check [RunPod Documentation](https://docs.runpod.io)
- **WhisperX**: Check your WhisperX endpoint configuration
- **Integration**: Check browser console for detailed error messages

TEST_RUNPOD_AI.md (new file, 139 lines)

@@ -0,0 +1,139 @@
# Testing RunPod AI Integration
This guide explains how to test the RunPod AI API integration in development.
## Quick Setup
1. **Add RunPod environment variables to `.env.local`:**

   ```bash
   # Add these lines to your .env.local file
   VITE_RUNPOD_API_KEY=your_runpod_api_key_here
   VITE_RUNPOD_ENDPOINT_ID=your_endpoint_id_here
   ```

   **Important:** Replace `your_runpod_api_key_here` and `your_endpoint_id_here` with your actual RunPod credentials.

2. **Get your RunPod credentials:**
   - **API Key**: Go to [RunPod Settings](https://www.runpod.io/console/user/settings) → API Keys section
   - **Endpoint ID**: Go to [RunPod Serverless Endpoints](https://www.runpod.io/console/serverless) → Find your endpoint → Copy the ID from the URL
     - Example: If URL is `https://api.runpod.ai/v2/jqd16o7stu29vq/run`, then `jqd16o7stu29vq` is your endpoint ID

3. **Restart the dev server:**

   ```bash
   npm run dev
   ```
## Testing the Integration
### Method 1: Using Prompt Shapes
1. Open the canvas website in your browser
2. Select the **Prompt** tool from the toolbar (or press the keyboard shortcut)
3. Click on the canvas to create a prompt shape
4. Type a prompt like "Write a hello world program in Python"
5. Press Enter or click the send button
6. The AI response should appear in the prompt shape
### Method 2: Using Arrow LLM Action
1. Create an arrow shape pointing from one shape to another
2. Add text to the arrow (this becomes the prompt)
3. Select the arrow
4. Press **Alt+G** (or use the action menu)
5. The AI will process the prompt and fill the target shape with the response
### Method 3: Using Command Palette
1. Press **Cmd+J** (Mac) or **Ctrl+J** (Windows/Linux) to open the LLM view
2. Type your prompt
3. Press Enter
4. The response should appear
## Verifying RunPod is Being Used
1. **Open the browser console** (F12 or Cmd+Option+I)
2. Look for these log messages:
   - `🔑 Found RunPod configuration from environment variables - using as primary AI provider`
   - `🔍 Found X available AI providers: runpod (default)`
   - `🔄 Attempting to use runpod API (default)...`
3. **Check the Network tab:**
   - Look for requests to `https://api.runpod.ai/v2/{endpointId}/run`
   - The request should carry an `Authorization: Bearer {your_api_key}` header
## Expected Behavior
- **With RunPod configured**: RunPod will be used FIRST (priority over user API keys; see the sketch after this list)
- **Without RunPod**: System will fall back to user-configured API keys (OpenAI, Anthropic, etc.)
- **If both fail**: You'll see an error message
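This priority comes from the order providers are collected in the LLM utilities (see the `llm.ts` diff later in this commit): when the environment variables are present, RunPod is pushed onto the provider list first, and providers are tried in array order. A simplified view:

```typescript
import { getRunPodConfig } from '@/lib/clientConfig'

// Simplified provider ordering: RunPod (from env vars) goes first,
// user-configured keys are appended after and only tried if RunPod fails
const providers: Array<{ provider: string; apiKey: string; endpointId?: string }> = []
const runpod = getRunPodConfig()
if (runpod) {
  providers.push({ provider: 'runpod', apiKey: runpod.apiKey, endpointId: runpod.endpointId })
}
// ...user OpenAI/Anthropic keys are appended here...
```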
## Troubleshooting
### "No valid API key found for any provider"
- Check that `.env.local` has the correct variable names (`VITE_RUNPOD_API_KEY` and `VITE_RUNPOD_ENDPOINT_ID`)
- Restart the dev server after adding environment variables
- Check browser console for detailed error messages
### "RunPod API error: 401"
- Verify your API key is correct
- Check that your API key hasn't expired
- Ensure you're using the correct API key format
### "RunPod API error: 404"
- Verify your endpoint ID is correct
- Check that your endpoint is active in RunPod console
- Ensure the endpoint URL format matches: `https://api.runpod.ai/v2/{ENDPOINT_ID}/run`
### RunPod not being used
- Check browser console for `🔑 Found RunPod configuration` message
- Verify environment variables are loaded (check `import.meta.env.VITE_RUNPOD_API_KEY`, e.g. with the snippet below)
- Make sure you restarted the dev server after adding environment variables
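For example, you can add a temporary check to your entry module (the exact file is up to you, e.g. `src/main.tsx`):

```typescript
// Temporary diagnostic: confirm Vite injected the RunPod variables at build time
console.log('RunPod key set:', Boolean(import.meta.env.VITE_RUNPOD_API_KEY))
console.log('RunPod endpoint set:', Boolean(import.meta.env.VITE_RUNPOD_ENDPOINT_ID))
```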
## Testing Different Scenarios
### Test 1: RunPod Only (No User Keys)
1. Remove or clear any user API keys from localStorage
2. Set RunPod environment variables
3. Run an AI command
4. Should use RunPod automatically
### Test 2: RunPod Priority (With User Keys)
1. Set RunPod environment variables
2. Also configure user API keys in settings
3. Run an AI command
4. Should use RunPod FIRST, then fall back to user keys if RunPod fails
### Test 3: Fallback Behavior
1. Set RunPod environment variables with invalid credentials
2. Configure valid user API keys
3. Run an AI command
4. Should try RunPod first, fail, then use user keys
## API Request Format
The integration sends requests in this format:
```json
{
"input": {
"prompt": "Your prompt text here"
}
}
```
The system prompt and user prompt are combined into a single prompt string.
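A sketch of what that combination might look like (the exact separator is an implementation detail of the LLM utilities and is not confirmed here):

```typescript
// Assumed merge: system prompt prepended to the user prompt, then wrapped
// in the { input: { prompt } } body shown above
function buildRunPodPrompt(userPrompt: string, systemPrompt?: string) {
  const prompt = systemPrompt ? `${systemPrompt}\n\n${userPrompt}` : userPrompt
  return { input: { prompt } }
}
```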
## Response Handling
The integration handles multiple response formats (a sketch of the extraction logic follows this list):
- Direct text response: `{ "output": "text" }`
- Object with text: `{ "output": { "text": "..." } }`
- Object with response: `{ "output": { "response": "..." } }`
- Async jobs: Polls until completion
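A minimal sketch of that extraction, covering only the shapes listed above:

```typescript
// Returns the response text for the known RunPod output shapes, or null if none match
function extractText(output: unknown): string | null {
  if (typeof output === 'string') return output
  if (output && typeof output === 'object') {
    const o = output as { text?: string; response?: string }
    if (typeof o.text === 'string') return o.text
    if (typeof o.response === 'string') return o.response
  }
  return null
}
```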
## Next Steps
Once testing is successful:
1. Verify RunPod responses are working correctly
2. Test with different prompt types
3. Monitor RunPod usage and costs
4. Consider adding rate limiting if needed

src/hooks/useWhisperTranscriptionSimple.ts

@@ -1,5 +1,7 @@
import { useCallback, useEffect, useRef, useState } from 'react'
import { pipeline, env } from '@xenova/transformers'
import { transcribeWithRunPod } from '../lib/runpodApi'
import { isRunPodConfigured } from '../lib/clientConfig'
// Configure the transformers library
env.allowRemoteModels = true
@@ -48,6 +50,44 @@ function detectAudioFormat(blob: Blob): Promise<string> {
})
}
// Convert Float32Array audio data to a WAV blob (16-bit PCM, mono)
async function createWavBlob(audioData: Float32Array, sampleRate: number): Promise<Blob> {
  const length = audioData.length
  const buffer = new ArrayBuffer(44 + length * 2)
  const view = new DataView(buffer)

  // WAV header
  const writeString = (offset: number, string: string) => {
    for (let i = 0; i < string.length; i++) {
      view.setUint8(offset + i, string.charCodeAt(i))
    }
  }
  writeString(0, 'RIFF')
  view.setUint32(4, 36 + length * 2, true) // RIFF chunk size = file size - 8
  writeString(8, 'WAVE')
  writeString(12, 'fmt ')
  view.setUint32(16, 16, true) // fmt chunk size
  view.setUint16(20, 1, true) // audio format: PCM
  view.setUint16(22, 1, true) // channels: mono
  view.setUint32(24, sampleRate, true)
  view.setUint32(28, sampleRate * 2, true) // byte rate = sampleRate * blockAlign
  view.setUint16(32, 2, true) // block align = channels * bytesPerSample
  view.setUint16(34, 16, true) // bits per sample
  writeString(36, 'data')
  view.setUint32(40, length * 2, true) // data chunk size

  // Convert float samples to 16-bit PCM
  let offset = 44
  for (let i = 0; i < length; i++) {
    const sample = Math.max(-1, Math.min(1, audioData[i]))
    view.setInt16(offset, sample < 0 ? sample * 0x8000 : sample * 0x7FFF, true)
    offset += 2
  }

  return new Blob([buffer], { type: 'audio/wav' })
}
// Simple resampling function for audio data
function resampleAudio(audioData: Float32Array, fromSampleRate: number, toSampleRate: number): Float32Array {
if (fromSampleRate === toSampleRate) {
@@ -103,6 +143,7 @@ interface UseWhisperTranscriptionOptions {
enableAdvancedErrorHandling?: boolean
modelOptions?: ModelOption[]
autoInitialize?: boolean // If false, model will only load when startRecording is called
useRunPod?: boolean // If true, use RunPod WhisperX endpoint instead of local model (defaults to checking if RunPod is configured)
}
export const useWhisperTranscription = ({
@@ -112,8 +153,11 @@ export const useWhisperTranscription = ({
enableStreaming = false,
enableAdvancedErrorHandling = false,
modelOptions,
autoInitialize = true // Default to true for backward compatibility
autoInitialize = true, // Default to true for backward compatibility
useRunPod = undefined // If undefined, auto-detect based on configuration
}: UseWhisperTranscriptionOptions = {}) => {
// Auto-detect RunPod usage if not explicitly set
const shouldUseRunPod = useRunPod !== undefined ? useRunPod : isRunPodConfigured()
const [isRecording, setIsRecording] = useState(false)
const [isTranscribing, setIsTranscribing] = useState(false)
const [isSpeaking, setIsSpeaking] = useState(false)
@@ -161,6 +205,13 @@ export const useWhisperTranscription = ({
// Initialize transcriber with optional advanced error handling
const initializeTranscriber = useCallback(async () => {
// Skip model loading if using RunPod
if (shouldUseRunPod) {
console.log('🚀 Using RunPod WhisperX endpoint - skipping local model loading')
setModelLoaded(true) // Mark as "loaded" since we don't need a local model
return null
}
if (transcriberRef.current) return transcriberRef.current
try {
@@ -432,19 +483,33 @@ export const useWhisperTranscription = ({
console.log(`🎵 Real-time audio: ${processedAudioData.length} samples (${(processedAudioData.length / 16000).toFixed(2)}s)`)
// Transcribe with parameters optimized for real-time processing
const result = await transcriberRef.current(processedAudioData, {
language: language,
task: 'transcribe',
return_timestamps: false,
chunk_length_s: 5, // Longer chunks for better context
stride_length_s: 2, // Larger stride for better coverage
no_speech_threshold: 0.3, // Higher threshold to reduce noise
logprob_threshold: -0.8, // More sensitive detection
compression_ratio_threshold: 2.0 // More permissive for real-time
})
let transcriptionText = ''
const transcriptionText = result?.text || ''
// Use RunPod if configured, otherwise use local model
if (shouldUseRunPod) {
console.log('🚀 Using RunPod WhisperX API for real-time transcription...')
// Convert processed audio data back to blob for RunPod
const wavBlob = await createWavBlob(processedAudioData, 16000)
transcriptionText = await transcribeWithRunPod(wavBlob, language)
} else {
// Use local Whisper model
if (!transcriberRef.current) {
console.log('⚠️ Transcriber not available for real-time processing')
return
}
const result = await transcriberRef.current(processedAudioData, {
language: language,
task: 'transcribe',
return_timestamps: false,
chunk_length_s: 5, // Longer chunks for better context
stride_length_s: 2, // Larger stride for better coverage
no_speech_threshold: 0.3, // Higher threshold to reduce noise
logprob_threshold: -0.8, // More sensitive detection
compression_ratio_threshold: 2.0 // More permissive for real-time
})
transcriptionText = result?.text || ''
}
if (transcriptionText.trim()) {
lastTranscriptionTimeRef.current = Date.now()
console.log(`✅ Real-time transcript: "${transcriptionText.trim()}"`)
@@ -453,53 +518,63 @@ export const useWhisperTranscription = ({
} else {
console.log('⚠️ No real-time transcription text produced, trying fallback parameters...')
// Try with more permissive parameters for real-time processing
try {
const fallbackResult = await transcriberRef.current(processedAudioData, {
task: 'transcribe',
return_timestamps: false,
chunk_length_s: 3, // Shorter chunks for fallback
stride_length_s: 1, // Smaller stride for fallback
no_speech_threshold: 0.1, // Very low threshold for fallback
logprob_threshold: -1.2, // Very sensitive for fallback
compression_ratio_threshold: 2.5 // Very permissive for fallback
})
const fallbackText = fallbackResult?.text || ''
if (fallbackText.trim()) {
console.log(`✅ Fallback real-time transcript: "${fallbackText.trim()}"`)
lastTranscriptionTimeRef.current = Date.now()
handleStreamingTranscriptUpdate(fallbackText.trim())
} else {
console.log('⚠️ Fallback transcription also produced no text')
// Try with more permissive parameters for real-time processing (only for local model)
if (!shouldUseRunPod && transcriberRef.current) {
try {
const fallbackResult = await transcriberRef.current(processedAudioData, {
task: 'transcribe',
return_timestamps: false,
chunk_length_s: 3, // Shorter chunks for fallback
stride_length_s: 1, // Smaller stride for fallback
no_speech_threshold: 0.1, // Very low threshold for fallback
logprob_threshold: -1.2, // Very sensitive for fallback
compression_ratio_threshold: 2.5 // Very permissive for fallback
})
const fallbackText = fallbackResult?.text || ''
if (fallbackText.trim()) {
console.log(`✅ Fallback real-time transcript: "${fallbackText.trim()}"`)
lastTranscriptionTimeRef.current = Date.now()
handleStreamingTranscriptUpdate(fallbackText.trim())
} else {
console.log('⚠️ Fallback transcription also produced no text')
}
} catch (fallbackError) {
console.log('⚠️ Fallback transcription failed:', fallbackError)
}
} catch (fallbackError) {
console.log('⚠️ Fallback transcription failed:', fallbackError)
}
}
} catch (error) {
console.error('❌ Error processing accumulated audio chunks:', error)
}
}, [handleStreamingTranscriptUpdate, language])
}, [handleStreamingTranscriptUpdate, language, shouldUseRunPod])
// Process recorded audio chunks (final processing)
const processAudioChunks = useCallback(async () => {
if (!transcriberRef.current || audioChunksRef.current.length === 0) {
console.log('⚠️ No transcriber or audio chunks to process')
if (audioChunksRef.current.length === 0) {
console.log('⚠️ No audio chunks to process')
return
}
// Ensure model is loaded
if (!modelLoaded) {
console.log('⚠️ Model not loaded yet, waiting...')
try {
await initializeTranscriber()
} catch (error) {
console.error('❌ Failed to initialize transcriber:', error)
onError?.(error as Error)
// For local model, ensure transcriber is loaded
if (!shouldUseRunPod) {
if (!transcriberRef.current) {
console.log('⚠️ No transcriber available')
return
}
// Ensure model is loaded
if (!modelLoaded) {
console.log('⚠️ Model not loaded yet, waiting...')
try {
await initializeTranscriber()
} catch (error) {
console.error('❌ Failed to initialize transcriber:', error)
onError?.(error as Error)
return
}
}
}
try {
@@ -588,24 +663,32 @@ export const useWhisperTranscription = ({
console.log(`🎵 Processing audio: ${processedAudioData.length} samples (${(processedAudioData.length / 16000).toFixed(2)}s)`)
// Check if transcriber is available
if (!transcriberRef.current) {
console.error('❌ Transcriber not available for processing')
throw new Error('Transcriber not initialized')
console.log('🔄 Starting transcription...')
let newText = ''
// Use RunPod if configured, otherwise use local model
if (shouldUseRunPod) {
console.log('🚀 Using RunPod WhisperX API...')
// Convert processed audio data back to blob for RunPod
// Create a WAV blob from the Float32Array
const wavBlob = await createWavBlob(processedAudioData, 16000)
newText = await transcribeWithRunPod(wavBlob, language)
console.log('✅ RunPod transcription result:', newText)
} else {
// Use local Whisper model
if (!transcriberRef.current) {
throw new Error('Transcriber not initialized')
}
const result = await transcriberRef.current(processedAudioData, {
language: language,
task: 'transcribe',
return_timestamps: false
})
console.log('🔍 Transcription result:', result)
newText = result?.text?.trim() || ''
}
console.log('🔄 Starting transcription with Whisper model...')
// Transcribe the audio
const result = await transcriberRef.current(processedAudioData, {
language: language,
task: 'transcribe',
return_timestamps: false
})
console.log('🔍 Transcription result:', result)
const newText = result?.text?.trim() || ''
if (newText) {
const processedText = processTranscript(newText, enableStreaming)
@@ -633,16 +716,17 @@ export const useWhisperTranscription = ({
console.log('⚠️ No transcription text produced')
console.log('🔍 Full transcription result object:', result)
// Try alternative transcription parameters
console.log('🔄 Trying alternative transcription parameters...')
try {
const altResult = await transcriberRef.current(processedAudioData, {
task: 'transcribe',
return_timestamps: false
})
console.log('🔍 Alternative transcription result:', altResult)
if (altResult?.text?.trim()) {
// Try alternative transcription parameters (only for local model)
if (!shouldUseRunPod && transcriberRef.current) {
console.log('🔄 Trying alternative transcription parameters...')
try {
const altResult = await transcriberRef.current(processedAudioData, {
task: 'transcribe',
return_timestamps: false
})
console.log('🔍 Alternative transcription result:', altResult)
if (altResult?.text?.trim()) {
const processedAltText = processTranscript(altResult.text, enableStreaming)
console.log('✅ Alternative transcription successful:', processedAltText)
const currentTranscript = transcriptRef.current
@@ -658,8 +742,9 @@ export const useWhisperTranscription = ({
previousTranscriptLengthRef.current = updatedTranscript.length
}
}
} catch (altError) {
console.log('⚠️ Alternative transcription also failed:', altError)
} catch (altError) {
console.log('⚠️ Alternative transcription also failed:', altError)
}
}
}
@@ -672,7 +757,7 @@ export const useWhisperTranscription = ({
} finally {
setIsTranscribing(false)
}
}, [transcriberRef, language, onTranscriptUpdate, onError, enableStreaming, handleStreamingTranscriptUpdate, modelLoaded, initializeTranscriber])
}, [transcriberRef, language, onTranscriptUpdate, onError, enableStreaming, handleStreamingTranscriptUpdate, modelLoaded, initializeTranscriber, shouldUseRunPod])
// Start recording
const startRecording = useCallback(async () => {
@@ -680,10 +765,13 @@ export const useWhisperTranscription = ({
console.log('🎤 Starting recording...')
console.log('🔍 enableStreaming in startRecording:', enableStreaming)
// Ensure model is loaded before starting
if (!modelLoaded) {
// Ensure model is loaded before starting (skip for RunPod)
if (!shouldUseRunPod && !modelLoaded) {
console.log('🔄 Model not loaded, initializing...')
await initializeTranscriber()
} else if (shouldUseRunPod) {
// For RunPod, just mark as ready
setModelLoaded(true)
}
// Don't reset transcripts for continuous transcription - keep existing content
@@ -803,7 +891,7 @@ export const useWhisperTranscription = ({
console.error('❌ Error starting recording:', error)
onError?.(error as Error)
}
}, [processAudioChunks, processAccumulatedAudioChunks, onError, enableStreaming, modelLoaded, initializeTranscriber])
}, [processAudioChunks, processAccumulatedAudioChunks, onError, enableStreaming, modelLoaded, initializeTranscriber, shouldUseRunPod])
// Stop recording
const stopRecording = useCallback(async () => {
@@ -892,9 +980,11 @@ export const useWhisperTranscription = ({
periodicTranscriptionRef.current = null
}
// Initialize the model if not already loaded
if (!modelLoaded) {
// Initialize the model if not already loaded (skip for RunPod)
if (!shouldUseRunPod && !modelLoaded) {
await initializeTranscriber()
} else if (shouldUseRunPod) {
setModelLoaded(true)
}
await startRecording()
@@ -933,7 +1023,7 @@ export const useWhisperTranscription = ({
if (autoInitialize) {
initializeTranscriber().catch(console.warn)
}
}, [initializeTranscriber, autoInitialize])
}, [initializeTranscriber, autoInitialize, shouldUseRunPod])
// Cleanup on unmount
useEffect(() => {

src/lib/clientConfig.ts

@@ -14,6 +14,8 @@ export interface ClientConfig {
webhookUrl?: string
webhookSecret?: string
openaiApiKey?: string
runpodApiKey?: string
runpodEndpointId?: string
}
/**
@@ -38,6 +40,8 @@ export function getClientConfig(): ClientConfig {
webhookUrl: import.meta.env.VITE_QUARTZ_WEBHOOK_URL || import.meta.env.NEXT_PUBLIC_QUARTZ_WEBHOOK_URL,
webhookSecret: import.meta.env.VITE_QUARTZ_WEBHOOK_SECRET || import.meta.env.NEXT_PUBLIC_QUARTZ_WEBHOOK_SECRET,
openaiApiKey: import.meta.env.VITE_OPENAI_API_KEY || import.meta.env.NEXT_PUBLIC_OPENAI_API_KEY,
runpodApiKey: import.meta.env.VITE_RUNPOD_API_KEY || import.meta.env.NEXT_PUBLIC_RUNPOD_API_KEY,
runpodEndpointId: import.meta.env.VITE_RUNPOD_ENDPOINT_ID || import.meta.env.NEXT_PUBLIC_RUNPOD_ENDPOINT_ID,
}
} else {
// Next.js environment
@@ -52,6 +56,8 @@ export function getClientConfig(): ClientConfig {
webhookUrl: (window as any).__NEXT_DATA__?.env?.NEXT_PUBLIC_QUARTZ_WEBHOOK_URL,
webhookSecret: (window as any).__NEXT_DATA__?.env?.NEXT_PUBLIC_QUARTZ_WEBHOOK_SECRET,
openaiApiKey: (window as any).__NEXT_DATA__?.env?.NEXT_PUBLIC_OPENAI_API_KEY,
runpodApiKey: (window as any).__NEXT_DATA__?.env?.NEXT_PUBLIC_RUNPOD_API_KEY,
runpodEndpointId: (window as any).__NEXT_DATA__?.env?.NEXT_PUBLIC_RUNPOD_ENDPOINT_ID,
}
}
} else {
@@ -66,10 +72,36 @@ export function getClientConfig(): ClientConfig {
quartzApiKey: process.env.VITE_QUARTZ_API_KEY || process.env.NEXT_PUBLIC_QUARTZ_API_KEY,
webhookUrl: process.env.VITE_QUARTZ_WEBHOOK_URL || process.env.NEXT_PUBLIC_QUARTZ_WEBHOOK_URL,
webhookSecret: process.env.VITE_QUARTZ_WEBHOOK_SECRET || process.env.NEXT_PUBLIC_QUARTZ_WEBHOOK_SECRET,
runpodApiKey: process.env.VITE_RUNPOD_API_KEY || process.env.NEXT_PUBLIC_RUNPOD_API_KEY,
runpodEndpointId: process.env.VITE_RUNPOD_ENDPOINT_ID || process.env.NEXT_PUBLIC_RUNPOD_ENDPOINT_ID,
}
}
}
/**
* Get RunPod configuration for API calls
*/
export function getRunPodConfig(): { apiKey: string; endpointId: string } | null {
const config = getClientConfig()
if (!config.runpodApiKey || !config.runpodEndpointId) {
return null
}
return {
apiKey: config.runpodApiKey,
endpointId: config.runpodEndpointId
}
}
/**
* Check if RunPod integration is configured
*/
export function isRunPodConfigured(): boolean {
const config = getClientConfig()
return !!(config.runpodApiKey && config.runpodEndpointId)
}
/**
* Check if GitHub integration is configured
*/

src/lib/runpodApi.ts (new file, 246 lines)

@@ -0,0 +1,246 @@
/**
* RunPod API utility functions
* Handles communication with RunPod WhisperX endpoints
*/
import { getRunPodConfig } from './clientConfig'
export interface RunPodTranscriptionResponse {
id?: string
status?: string
output?: {
text?: string
segments?: Array<{
start: number
end: number
text: string
}>
}
error?: string
}
/**
* Convert audio blob to base64 string
*/
export async function blobToBase64(blob: Blob): Promise<string> {
return new Promise((resolve, reject) => {
const reader = new FileReader()
reader.onloadend = () => {
if (typeof reader.result === 'string') {
// Remove data URL prefix (e.g., "data:audio/webm;base64,")
const base64 = reader.result.split(',')[1] || reader.result
resolve(base64)
} else {
reject(new Error('Failed to convert blob to base64'))
}
}
reader.onerror = reject
reader.readAsDataURL(blob)
})
}
/**
* Send transcription request to RunPod endpoint
* Handles both synchronous and asynchronous job patterns
*/
export async function transcribeWithRunPod(
audioBlob: Blob,
language?: string
): Promise<string> {
const config = getRunPodConfig()
if (!config) {
throw new Error('RunPod API key or endpoint ID not configured. Please set VITE_RUNPOD_API_KEY and VITE_RUNPOD_ENDPOINT_ID environment variables.')
}
// Check audio blob size (limit to ~10MB to prevent issues)
const maxSize = 10 * 1024 * 1024 // 10MB
if (audioBlob.size > maxSize) {
throw new Error(`Audio file too large: ${(audioBlob.size / 1024 / 1024).toFixed(2)}MB. Maximum size is ${(maxSize / 1024 / 1024).toFixed(2)}MB`)
}
// Convert audio blob to base64
const audioBase64 = await blobToBase64(audioBlob)
// Detect audio format from blob type
const audioFormat = audioBlob.type || 'audio/wav'
const url = `https://api.runpod.ai/v2/${config.endpointId}/run`
// Prepare the request payload
// WhisperX typically expects audio as base64 or file URL
// The exact format may vary based on your WhisperX endpoint implementation
const requestBody = {
input: {
audio: audioBase64,
audio_format: audioFormat,
language: language || 'en',
task: 'transcribe'
// Note: Some WhisperX endpoints may expect different field names
// Adjust the requestBody structure in this function if needed
}
}
try {
// Add timeout to prevent hanging requests (30 seconds for initial request)
const controller = new AbortController()
const timeoutId = setTimeout(() => controller.abort(), 30000)
const response = await fetch(url, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${config.apiKey}`
},
body: JSON.stringify(requestBody),
signal: controller.signal
})
clearTimeout(timeoutId)
if (!response.ok) {
const errorText = await response.text()
console.error('RunPod API error response:', {
status: response.status,
statusText: response.statusText,
body: errorText
})
throw new Error(`RunPod API error: ${response.status} - ${errorText}`)
}
const data: RunPodTranscriptionResponse = await response.json()
console.log('RunPod initial response:', data)
// Handle async job pattern (RunPod often returns job IDs)
if (data.id && (data.status === 'IN_QUEUE' || data.status === 'IN_PROGRESS')) {
console.log('Job is async, polling for results...', data.id)
return await pollRunPodJob(data.id, config.apiKey, config.endpointId)
}
// Handle direct response
if (data.output?.text) {
return data.output.text.trim()
}
// Handle error response
if (data.error) {
throw new Error(`RunPod transcription error: ${data.error}`)
}
// Fallback: try to extract text from segments
if (data.output?.segments && data.output.segments.length > 0) {
return data.output.segments.map(seg => seg.text).join(' ').trim()
}
// Check if response has unexpected structure
console.warn('Unexpected RunPod response structure:', data)
throw new Error('No transcription text found in RunPod response. Check endpoint response format.')
} catch (error: any) {
if (error.name === 'AbortError') {
throw new Error('RunPod request timed out after 30 seconds')
}
console.error('RunPod transcription error:', error)
throw error
}
}
/**
* Poll RunPod job status until completion
*/
async function pollRunPodJob(
jobId: string,
apiKey: string,
endpointId: string,
maxAttempts: number = 120, // Increased to 120 attempts (2 minutes at 1s intervals)
pollInterval: number = 1000
): Promise<string> {
const statusUrl = `https://api.runpod.ai/v2/${endpointId}/status/${jobId}`
console.log(`Polling job ${jobId} (max ${maxAttempts} attempts, ${pollInterval}ms interval)`)
for (let attempt = 0; attempt < maxAttempts; attempt++) {
try {
// Add timeout for each status check (5 seconds)
const controller = new AbortController()
const timeoutId = setTimeout(() => controller.abort(), 5000)
const response = await fetch(statusUrl, {
method: 'GET',
headers: {
'Authorization': `Bearer ${apiKey}`
},
signal: controller.signal
})
clearTimeout(timeoutId)
if (!response.ok) {
const errorText = await response.text()
console.error(`Job status check failed (attempt ${attempt + 1}/${maxAttempts}):`, {
status: response.status,
statusText: response.statusText,
body: errorText
})
// Don't fail immediately on 404 - job might still be processing
if (response.status === 404 && attempt < maxAttempts - 1) {
console.log('Job not found yet, continuing to poll...')
await new Promise(resolve => setTimeout(resolve, pollInterval))
continue
}
throw new Error(`Failed to check job status: ${response.status} - ${errorText}`)
}
const data: RunPodTranscriptionResponse = await response.json()
console.log(`Job status (attempt ${attempt + 1}/${maxAttempts}):`, data.status)
if (data.status === 'COMPLETED') {
console.log('Job completed, extracting transcription...')
if (data.output?.text) {
return data.output.text.trim()
}
if (data.output?.segments && data.output.segments.length > 0) {
return data.output.segments.map(seg => seg.text).join(' ').trim()
}
// Log the full response for debugging
console.error('Job completed but no transcription found. Full response:', JSON.stringify(data, null, 2))
throw new Error('Job completed but no transcription text found in response')
}
if (data.status === 'FAILED') {
const errorMsg = data.error || 'Unknown error'
console.error('Job failed:', errorMsg)
throw new Error(`Job failed: ${errorMsg}`)
}
// Job still in progress, wait and retry
if (attempt % 10 === 0) {
console.log(`Job still processing... (${attempt + 1}/${maxAttempts} attempts)`)
}
await new Promise(resolve => setTimeout(resolve, pollInterval))
} catch (error: any) {
if (error.name === 'AbortError') {
console.warn(`Status check timed out (attempt ${attempt + 1}/${maxAttempts})`)
if (attempt < maxAttempts - 1) {
await new Promise(resolve => setTimeout(resolve, pollInterval))
continue
}
throw new Error('Status check timed out multiple times')
}
if (attempt === maxAttempts - 1) {
throw error
}
// Wait before retrying
await new Promise(resolve => setTimeout(resolve, pollInterval))
}
}
throw new Error(`Job polling timeout after ${maxAttempts} attempts (${(maxAttempts * pollInterval / 1000).toFixed(0)} seconds)`)
}
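Typical call site for `transcribeWithRunPod` above, assuming you already hold a recorded audio `Blob` (e.g. from `MediaRecorder`):

```typescript
import { transcribeWithRunPod } from '@/lib/runpodApi'

async function example(audioBlob: Blob) {
  const text = await transcribeWithRunPod(audioBlob, 'en')
  console.log('Transcript:', text)
}
```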

@@ -42,6 +42,8 @@ import { HolonBrowserShape } from "@/shapes/HolonBrowserShapeUtil"
import { ObsidianBrowserShape } from "@/shapes/ObsidianBrowserShapeUtil"
import { FathomMeetingsBrowserShape } from "@/shapes/FathomMeetingsBrowserShapeUtil"
import { LocationShareShape } from "@/shapes/LocationShareShapeUtil"
import { ImageGenShape } from "@/shapes/ImageGenShapeUtil"
import { ImageGenTool } from "@/tools/ImageGenTool"
import {
lockElement,
unlockElement,
@@ -82,6 +84,7 @@ const customShapeUtils = [
ObsidianBrowserShape,
FathomMeetingsBrowserShape,
LocationShareShape,
ImageGenShape,
]
const customTools = [
ChatBoxTool,
@@ -96,6 +99,7 @@ const customTools = [
TranscriptionTool,
HolonTool,
FathomMeetingsTool,
ImageGenTool,
]
export function Board() {

src/shapes/ImageGenShapeUtil.tsx (new file, 730 lines)

@@ -0,0 +1,730 @@
import {
BaseBoxShapeUtil,
Geometry2d,
HTMLContainer,
Rectangle2d,
TLBaseShape,
} from "tldraw"
import React, { useState } from "react"
import { getRunPodConfig } from "@/lib/clientConfig"
// Feature flag: Set to false when RunPod API is ready for production
const USE_MOCK_API = true
// Type definition for RunPod API responses
interface RunPodJobResponse {
id?: string
status?: 'IN_QUEUE' | 'IN_PROGRESS' | 'STARTING' | 'COMPLETED' | 'FAILED' | 'CANCELLED'
output?: string | {
image?: string
url?: string
images?: Array<{ data?: string; url?: string; filename?: string; type?: string }>
result?: string
[key: string]: any
}
error?: string
image?: string
url?: string
result?: string | {
image?: string
url?: string
[key: string]: any
}
[key: string]: any
}
type IImageGen = TLBaseShape<
"ImageGen",
{
w: number
h: number
prompt: string
imageUrl: string | null
isLoading: boolean
error: string | null
endpointId?: string // Optional custom endpoint ID
}
>
// Helper function to poll RunPod job status until completion
async function pollRunPodJob(
jobId: string,
apiKey: string,
endpointId: string,
maxAttempts: number = 60,
pollInterval: number = 2000
): Promise<string> {
const statusUrl = `https://api.runpod.ai/v2/${endpointId}/status/${jobId}`
console.log('🔄 ImageGen: Polling job:', jobId)
for (let attempt = 0; attempt < maxAttempts; attempt++) {
try {
const response = await fetch(statusUrl, {
method: 'GET',
headers: {
'Authorization': `Bearer ${apiKey}`
}
})
if (!response.ok) {
const errorText = await response.text()
console.error(`❌ ImageGen: Poll error (attempt ${attempt + 1}/${maxAttempts}):`, response.status, errorText)
throw new Error(`Failed to check job status: ${response.status} - ${errorText}`)
}
const data = await response.json() as RunPodJobResponse
console.log(`🔄 ImageGen: Poll attempt ${attempt + 1}/${maxAttempts}, status:`, data.status)
console.log(`📋 ImageGen: Full response data:`, JSON.stringify(data, null, 2))
if (data.status === 'COMPLETED') {
console.log('✅ ImageGen: Job completed, processing output...')
// Extract image URL from various possible response formats
let imageUrl = ''
// Check if output exists at all
if (!data.output) {
// Only retry 2-3 times, then proceed to check alternatives
if (attempt < 3) {
console.log(`⏳ ImageGen: COMPLETED but no output yet, waiting briefly (attempt ${attempt + 1}/3)...`)
await new Promise(resolve => setTimeout(resolve, 500))
continue
}
// Try alternative ways to get the output - maybe it's at the top level
console.log('⚠️ ImageGen: No output field found, checking for alternative response formats...')
console.log('📋 ImageGen: All available fields:', Object.keys(data))
// Check if image data is at top level
if (data.image) {
imageUrl = data.image
console.log('✅ ImageGen: Found image at top level')
} else if (data.url) {
imageUrl = data.url
console.log('✅ ImageGen: Found url at top level')
} else if (data.result) {
// Some endpoints return result instead of output
if (typeof data.result === 'string') {
imageUrl = data.result
} else if (data.result.image) {
imageUrl = data.result.image
} else if (data.result.url) {
imageUrl = data.result.url
}
console.log('✅ ImageGen: Found result field')
} else {
// Last resort: try to fetch output via stream endpoint (some RunPod endpoints use this)
console.log('⚠️ ImageGen: Trying alternative endpoint to retrieve output...')
try {
const streamUrl = `https://api.runpod.ai/v2/${endpointId}/stream/${jobId}`
const streamResponse = await fetch(streamUrl, {
method: 'GET',
headers: {
'Authorization': `Bearer ${apiKey}`
}
})
if (streamResponse.ok) {
const streamData = await streamResponse.json() as RunPodJobResponse
console.log('📥 ImageGen: Stream endpoint response:', JSON.stringify(streamData, null, 2))
if (streamData.output) {
if (typeof streamData.output === 'string') {
imageUrl = streamData.output
} else if (streamData.output.image) {
imageUrl = streamData.output.image
} else if (streamData.output.url) {
imageUrl = streamData.output.url
} else if (Array.isArray(streamData.output.images) && streamData.output.images.length > 0) {
const firstImage = streamData.output.images[0]
if (firstImage.data) {
imageUrl = firstImage.data.startsWith('data:') ? firstImage.data : `data:image/${firstImage.type || 'png'};base64,${firstImage.data}`
} else if (firstImage.url) {
imageUrl = firstImage.url
}
}
if (imageUrl) {
console.log('✅ ImageGen: Found image URL via stream endpoint')
return imageUrl
}
}
}
} catch (streamError) {
console.log('⚠️ ImageGen: Stream endpoint not available or failed:', streamError)
}
console.error('❌ ImageGen: Job completed but no output field in response after retries:', JSON.stringify(data, null, 2))
throw new Error(
'Job completed but no output data found.\n\n' +
'Possible issues:\n' +
'1. The RunPod endpoint handler may not be returning output correctly\n' +
'2. Check the endpoint handler logs in RunPod console\n' +
'3. Verify the handler returns: { output: { image: "url" } } or { output: "url" }\n' +
'4. For ComfyUI workers, ensure output.images array is returned\n' +
'5. The endpoint may need to be reconfigured\n\n' +
'Response received: ' + JSON.stringify(data, null, 2)
)
}
} else {
// Extract image URL from various possible response formats
if (typeof data.output === 'string') {
imageUrl = data.output
} else if (data.output?.image) {
imageUrl = data.output.image
} else if (data.output?.url) {
imageUrl = data.output.url
} else if (data.output?.output) {
// Handle nested output structure
if (typeof data.output.output === 'string') {
imageUrl = data.output.output
} else if (data.output.output?.image) {
imageUrl = data.output.output.image
} else if (data.output.output?.url) {
imageUrl = data.output.output.url
}
} else if (Array.isArray(data.output) && data.output.length > 0) {
// Handle array responses
const firstItem = data.output[0]
if (typeof firstItem === 'string') {
imageUrl = firstItem
} else if (firstItem.image) {
imageUrl = firstItem.image
} else if (firstItem.url) {
imageUrl = firstItem.url
}
} else if (data.output?.result) {
// Some formats nest result inside output
if (typeof data.output.result === 'string') {
imageUrl = data.output.result
} else if (data.output.result?.image) {
imageUrl = data.output.result.image
} else if (data.output.result?.url) {
imageUrl = data.output.result.url
}
} else if (Array.isArray(data.output?.images) && data.output.images.length > 0) {
// ComfyUI worker format: { output: { images: [{ filename, type, data }] } }
const firstImage = data.output.images[0]
if (firstImage.data) {
// Base64 encoded image
if (firstImage.data.startsWith('data:image')) {
imageUrl = firstImage.data
} else if (firstImage.data.startsWith('http')) {
imageUrl = firstImage.data
} else {
// Assume base64 without prefix
imageUrl = `data:image/${firstImage.type || 'png'};base64,${firstImage.data}`
}
console.log('✅ ImageGen: Found image in ComfyUI format (images array)')
} else if (firstImage.url) {
imageUrl = firstImage.url
console.log('✅ ImageGen: Found image URL in ComfyUI format')
} else if (firstImage.filename) {
// Try to construct URL from filename (may need endpoint-specific handling)
console.log('⚠️ ImageGen: Found filename but no URL, filename:', firstImage.filename)
}
}
}
if (!imageUrl || imageUrl.trim() === '') {
console.error('❌ ImageGen: No image URL found in response:', JSON.stringify(data, null, 2))
throw new Error(
'Job completed but no image URL found in output.\n\n' +
'Expected formats:\n' +
'- { output: "https://..." }\n' +
'- { output: { image: "https://..." } }\n' +
'- { output: { url: "https://..." } }\n' +
'- { output: ["https://..."] }\n\n' +
'Received: ' + JSON.stringify(data, null, 2)
)
}
return imageUrl
}
if (data.status === 'FAILED') {
console.error('❌ ImageGen: Job failed:', data.error || 'Unknown error')
throw new Error(`Job failed: ${data.error || 'Unknown error'}`)
}
// Wait before next poll
await new Promise(resolve => setTimeout(resolve, pollInterval))
} catch (error) {
// If we get COMPLETED status without output, don't retry - fail immediately
const errorMessage = error instanceof Error ? error.message : String(error)
if (errorMessage.includes('no output') || errorMessage.includes('no image URL')) {
console.error('❌ ImageGen: Stopping polling due to missing output data')
throw error
}
// For other errors, retry up to maxAttempts
if (attempt === maxAttempts - 1) {
throw error
}
await new Promise(resolve => setTimeout(resolve, pollInterval))
}
}
throw new Error('Job polling timed out')
}
export class ImageGenShape extends BaseBoxShapeUtil<IImageGen> {
static override type = "ImageGen" as const
MIN_WIDTH = 300 as const
MIN_HEIGHT = 300 as const
DEFAULT_WIDTH = 400 as const
DEFAULT_HEIGHT = 400 as const
getDefaultProps(): IImageGen["props"] {
return {
w: this.DEFAULT_WIDTH,
h: this.DEFAULT_HEIGHT,
prompt: "",
imageUrl: null,
isLoading: false,
error: null,
}
}
getGeometry(shape: IImageGen): Geometry2d {
return new Rectangle2d({
width: shape.props.w,
height: shape.props.h,
isFilled: true,
})
}
component(shape: IImageGen) {
const [isHovering, setIsHovering] = useState(false)
const isSelected = this.editor.getSelectedShapeIds().includes(shape.id)
const generateImage = async (prompt: string) => {
console.log("🎨 ImageGen: Generating image with prompt:", prompt)
// Clear any previous errors
this.editor.updateShape<IImageGen>({
id: shape.id,
type: "ImageGen",
props: {
error: null,
isLoading: true,
imageUrl: null
},
})
try {
// Get RunPod configuration
const runpodConfig = getRunPodConfig()
const endpointId = shape.props.endpointId || runpodConfig?.endpointId || "tzf1j3sc3zufsy"
const apiKey = runpodConfig?.apiKey
// Mock API mode: Return placeholder image without calling RunPod
if (USE_MOCK_API) {
console.log("🎭 ImageGen: Using MOCK API mode (no real RunPod call)")
console.log("🎨 ImageGen: Mock prompt:", prompt)
// Simulate API delay
await new Promise(resolve => setTimeout(resolve, 1500))
// Use a placeholder image service
const mockImageUrl = `https://via.placeholder.com/512x512/4F46E5/FFFFFF?text=${encodeURIComponent(prompt.substring(0, 30))}`
console.log("✅ ImageGen: Mock image generated:", mockImageUrl)
this.editor.updateShape<IImageGen>({
id: shape.id,
type: "ImageGen",
props: {
imageUrl: mockImageUrl,
isLoading: false,
error: null
},
})
return
}
// Real API mode: Use RunPod
if (!apiKey) {
throw new Error("RunPod API key not configured. Please set VITE_RUNPOD_API_KEY environment variable.")
}
const url = `https://api.runpod.ai/v2/${endpointId}/run`
console.log("📤 ImageGen: Sending request to:", url)
const response = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
"Authorization": `Bearer ${apiKey}`
},
body: JSON.stringify({
input: {
prompt: prompt
}
})
})
if (!response.ok) {
const errorText = await response.text()
console.error("❌ ImageGen: Error response:", errorText)
throw new Error(`HTTP error! status: ${response.status} - ${errorText}`)
}
const data = await response.json() as RunPodJobResponse
console.log("📥 ImageGen: Response data:", JSON.stringify(data, null, 2))
// Handle async job pattern (RunPod often returns job IDs)
if (data.id && (data.status === 'IN_QUEUE' || data.status === 'IN_PROGRESS' || data.status === 'STARTING')) {
console.log("⏳ ImageGen: Job queued/in progress, polling job ID:", data.id)
const imageUrl = await pollRunPodJob(data.id, apiKey, endpointId)
console.log("✅ ImageGen: Job completed, image URL:", imageUrl)
this.editor.updateShape<IImageGen>({
id: shape.id,
type: "ImageGen",
props: {
imageUrl: imageUrl,
isLoading: false,
error: null
},
})
} else if (data.output) {
// Handle direct response
let imageUrl = ''
if (typeof data.output === 'string') {
imageUrl = data.output
} else if (data.output.image) {
imageUrl = data.output.image
} else if (data.output.url) {
imageUrl = data.output.url
} else if (Array.isArray(data.output) && data.output.length > 0) {
const firstItem = data.output[0]
if (typeof firstItem === 'string') {
imageUrl = firstItem
} else if (firstItem.image) {
imageUrl = firstItem.image
} else if (firstItem.url) {
imageUrl = firstItem.url
}
}
if (imageUrl) {
this.editor.updateShape<IImageGen>({
id: shape.id,
type: "ImageGen",
props: {
imageUrl: imageUrl,
isLoading: false,
error: null
},
})
} else {
throw new Error("No image URL found in response")
}
} else if (data.error) {
throw new Error(`RunPod API error: ${data.error}`)
} else {
throw new Error("No valid response from RunPod API")
}
} catch (error) {
const errorMessage = error instanceof Error ? error.message : String(error)
console.error("❌ ImageGen: Error:", errorMessage)
let userFriendlyError = ''
if (errorMessage.includes('API key not configured')) {
userFriendlyError = '❌ RunPod API key not configured. Please set VITE_RUNPOD_API_KEY environment variable.'
} else if (errorMessage.includes('401') || errorMessage.includes('403') || errorMessage.includes('Unauthorized')) {
userFriendlyError = '❌ API key authentication failed. Please check your RunPod API key.'
} else if (errorMessage.includes('404')) {
userFriendlyError = '❌ Endpoint not found. Please check your endpoint ID.'
} else if (errorMessage.includes('no output data found') || errorMessage.includes('no image URL found')) {
// For multi-line error messages, show a concise version in the UI
// The full details are already in the console
userFriendlyError = '❌ Image generation completed but no image data was returned.\n\n' +
'This usually means the RunPod endpoint handler is not configured correctly.\n\n' +
'Please check:\n' +
'1. RunPod endpoint handler logs\n' +
'2. Handler returns: { output: { image: "url" } }\n' +
'3. See browser console for full details'
} else {
// Truncate very long error messages for UI display
const maxLength = 500
if (errorMessage.length > maxLength) {
userFriendlyError = `❌ Error: ${errorMessage.substring(0, maxLength)}...\n\n(Full error in console)`
} else {
userFriendlyError = `❌ Error: ${errorMessage}`
}
}
this.editor.updateShape<IImageGen>({
id: shape.id,
type: "ImageGen",
props: {
isLoading: false,
error: userFriendlyError
},
})
}
}
const handleGenerate = () => {
if (shape.props.prompt.trim() && !shape.props.isLoading) {
generateImage(shape.props.prompt)
this.editor.updateShape<IImageGen>({
id: shape.id,
type: "ImageGen",
props: { prompt: "" },
})
}
}
return (
<HTMLContainer
style={{
borderRadius: 6,
border: "1px solid lightgrey",
padding: 8,
height: shape.props.h,
width: shape.props.w,
pointerEvents: isSelected || isHovering ? "all" : "none",
backgroundColor: "#ffffff",
overflow: "hidden",
display: "flex",
flexDirection: "column",
gap: 8,
}}
onPointerEnter={() => setIsHovering(true)}
onPointerLeave={() => setIsHovering(false)}
>
{/* Error Display */}
{shape.props.error && (
<div
style={{
padding: "12px 16px",
backgroundColor: "#fee",
border: "1px solid #fcc",
borderRadius: "8px",
color: "#c33",
fontSize: "13px",
display: "flex",
alignItems: "flex-start",
gap: "8px",
whiteSpace: "pre-wrap",
wordBreak: "break-word",
}}
>
<span style={{ fontSize: "18px", flexShrink: 0 }}></span>
<span style={{ flex: 1, lineHeight: "1.5" }}>{shape.props.error}</span>
<button
onClick={() => {
this.editor.updateShape<IImageGen>({
id: shape.id,
type: "ImageGen",
props: { error: null },
})
}}
style={{
padding: "4px 8px",
backgroundColor: "#fcc",
border: "1px solid #c99",
borderRadius: "4px",
cursor: "pointer",
fontSize: "11px",
flexShrink: 0,
}}
>
Dismiss
</button>
</div>
)}
{/* Image Display */}
{shape.props.imageUrl && !shape.props.isLoading && (
<div
style={{
flex: 1,
display: "flex",
alignItems: "center",
justifyContent: "center",
backgroundColor: "#f5f5f5",
borderRadius: "4px",
overflow: "hidden",
minHeight: 0,
}}
>
<img
src={shape.props.imageUrl}
alt={shape.props.prompt || "Generated image"}
style={{
maxWidth: "100%",
maxHeight: "100%",
objectFit: "contain",
}}
onError={(_e) => {
console.error("❌ ImageGen: Failed to load image:", shape.props.imageUrl)
this.editor.updateShape<IImageGen>({
id: shape.id,
type: "ImageGen",
props: {
error: "Failed to load generated image",
imageUrl: null
},
})
}}
/>
</div>
)}
{/* Loading State */}
{shape.props.isLoading && (
<div
style={{
flex: 1,
display: "flex",
flexDirection: "column",
alignItems: "center",
justifyContent: "center",
backgroundColor: "#f5f5f5",
borderRadius: "4px",
gap: 12,
}}
>
<div
style={{
width: 40,
height: 40,
border: "4px solid #f3f3f3",
borderTop: "4px solid #007AFF",
borderRadius: "50%",
animation: "spin 1s linear infinite",
}}
/>
<span style={{ color: "#666", fontSize: "14px" }}>
Generating image...
</span>
</div>
)}
{/* Empty State */}
{!shape.props.imageUrl && !shape.props.isLoading && (
<div
style={{
flex: 1,
display: "flex",
alignItems: "center",
justifyContent: "center",
backgroundColor: "#f5f5f5",
borderRadius: "4px",
color: "#999",
fontSize: "14px",
}}
>
Generated image will appear here
</div>
)}
{/* Input Section */}
<div
style={{
display: "flex",
gap: 8,
pointerEvents: isSelected || isHovering ? "all" : "none",
}}
>
<input
style={{
flex: 1,
height: "36px",
backgroundColor: "rgba(0, 0, 0, 0.05)",
border: "1px solid rgba(0, 0, 0, 0.1)",
borderRadius: "4px",
fontSize: 14,
padding: "0 8px",
}}
type="text"
placeholder="Enter image prompt..."
value={shape.props.prompt}
onChange={(e) => {
this.editor.updateShape<IImageGen>({
id: shape.id,
type: "ImageGen",
props: { prompt: e.target.value },
})
}}
onKeyDown={(e) => {
e.stopPropagation()
if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault()
if (shape.props.prompt.trim() && !shape.props.isLoading) {
handleGenerate()
}
}
}}
onPointerDown={(e) => {
e.stopPropagation()
}}
onClick={(e) => {
e.stopPropagation()
}}
disabled={shape.props.isLoading}
/>
<button
style={{
height: "36px",
padding: "0 16px",
pointerEvents: "all",
cursor: shape.props.prompt.trim() && !shape.props.isLoading ? "pointer" : "not-allowed",
backgroundColor: shape.props.prompt.trim() && !shape.props.isLoading ? "#007AFF" : "#ccc",
color: "white",
border: "none",
borderRadius: "4px",
fontWeight: "500",
fontSize: "14px",
opacity: shape.props.prompt.trim() && !shape.props.isLoading ? 1 : 0.6,
}}
onPointerDown={(e) => {
e.stopPropagation()
e.preventDefault()
if (shape.props.prompt.trim() && !shape.props.isLoading) {
handleGenerate()
}
}}
onClick={(e) => {
e.preventDefault()
e.stopPropagation()
if (shape.props.prompt.trim() && !shape.props.isLoading) {
handleGenerate()
}
}}
disabled={shape.props.isLoading || !shape.props.prompt.trim()}
>
Generate
</button>
</div>
{/* Add CSS for spinner animation */}
<style>{`
@keyframes spin {
0% { transform: rotate(0deg); }
100% { transform: rotate(360deg); }
}
`}</style>
</HTMLContainer>
)
}
override indicator(shape: IImageGen) {
return (
<rect
width={shape.props.w}
height={shape.props.h}
rx={6}
/>
)
}
}

src/tools/ImageGenTool.ts (new file, 14 lines)

@@ -0,0 +1,14 @@
import { BaseBoxShapeTool, TLEventHandlers } from 'tldraw'

export class ImageGenTool extends BaseBoxShapeTool {
  static override id = 'ImageGen'
  static override initial = 'idle'
  override shapeType = 'ImageGen'

  // Return to the select tool once the shape has been placed
  override onComplete: TLEventHandlers["onComplete"] = () => {
    console.log('🎨 ImageGenTool: Shape creation completed')
    this.editor.setCurrentTool('select')
  }
}

@@ -238,6 +238,7 @@ export function CustomContextMenu(props: TLUiContextMenuProps) {
<TldrawUiMenuItem {...tools.Transcription} disabled={hasSelection} />
<TldrawUiMenuItem {...tools.FathomMeetings} disabled={hasSelection} />
<TldrawUiMenuItem {...tools.Holon} disabled={hasSelection} />
<TldrawUiMenuItem {...tools.ImageGen} disabled={hasSelection} />
</TldrawUiMenuGroup>
{/* Collections Group */}

@@ -29,7 +29,7 @@ export function CustomMainMenu() {
const validateAndNormalizeShapeType = (shape: any): string => {
if (!shape || !shape.type) return 'text'
const validCustomShapes = ['ObsNote', 'VideoChat', 'Transcription', 'Prompt', 'ChatBox', 'Embed', 'Markdown', 'MycrozineTemplate', 'Slide', 'Holon', 'ObsidianBrowser', 'HolonBrowser', 'FathomMeetingsBrowser', 'LocationShare']
const validCustomShapes = ['ObsNote', 'VideoChat', 'Transcription', 'Prompt', 'ChatBox', 'Embed', 'Markdown', 'MycrozineTemplate', 'Slide', 'Holon', 'ObsidianBrowser', 'HolonBrowser', 'FathomMeetingsBrowser', 'LocationShare', 'ImageGen']
const validDefaultShapes = ['arrow', 'bookmark', 'draw', 'embed', 'frame', 'geo', 'group', 'highlight', 'image', 'line', 'note', 'text', 'video']
const allValidShapes = [...validCustomShapes, ...validDefaultShapes]

@@ -33,6 +33,7 @@ export const components: TLComponents = {
tools["Transcription"],
tools["Holon"],
tools["FathomMeetings"],
tools["ImageGen"],
].filter(tool => tool && tool.kbd)
// Get all custom actions with keyboard shortcuts

@@ -196,6 +196,15 @@ export const overrides: TLUiOverrides = {
// Shape creation is handled manually in FathomMeetingsTool.onPointerDown
onSelect: () => editor.setCurrentTool("fathom-meetings"),
},
ImageGen: {
id: "ImageGen",
icon: "image",
label: "Image Generation",
kbd: "alt+i",
readonlyOk: true,
type: "ImageGen",
onSelect: () => editor.setCurrentTool("ImageGen"),
},
hand: {
...tools.hand,
onDoubleClick: (info: any) => {

@@ -1,6 +1,7 @@
import OpenAI from "openai";
import Anthropic from "@anthropic-ai/sdk";
import { makeRealSettings, AI_PERSONALITIES } from "@/lib/settings";
import { getRunPodConfig } from "@/lib/clientConfig";
export async function llm(
userPrompt: string,
@@ -59,7 +60,12 @@ export async function llm(
availableProviders.map(p => `${p.provider} (${p.model})`).join(', '));
if (availableProviders.length === 0) {
throw new Error("No valid API key found for any provider")
const runpodConfig = getRunPodConfig();
if (runpodConfig && runpodConfig.apiKey && runpodConfig.endpointId) {
// RunPod is configured, so getAvailableProviders() should already have added it;
// reaching this branch means it was skipped upstream
console.log('⚠️ No user API keys found, but RunPod is configured - this should not happen');
}
throw new Error("No valid API key found for any provider. Please configure API keys in settings or set up RunPod environment variables (VITE_RUNPOD_API_KEY and VITE_RUNPOD_ENDPOINT_ID).")
}
// Try each provider/key combination in order until one succeeds
@@ -76,13 +82,14 @@
'claude-3-haiku-20240307',
];
for (const { provider, apiKey, model } of availableProviders) {
for (const providerInfo of availableProviders) {
const { provider, apiKey, model, endpointId } = providerInfo as any;
try {
console.log(`🔄 Attempting to use ${provider} API (${model})...`);
attemptedProviders.push(`${provider} (${model})`);
// Add retry logic for temporary failures
await callProviderAPIWithRetry(provider, apiKey, model, userPrompt, onToken, settings);
await callProviderAPIWithRetry(provider, apiKey, model, userPrompt, onToken, settings, endpointId);
console.log(`✅ Successfully used ${provider} API (${model})`);
return; // Success, exit the function
} catch (error) {
@@ -100,7 +107,9 @@
try {
console.log(`🔄 Trying fallback model: ${fallbackModel}...`);
attemptedProviders.push(`${provider} (${fallbackModel})`);
await callProviderAPIWithRetry(provider, apiKey, fallbackModel, userPrompt, onToken, settings);
const providerInfo = availableProviders.find(p => p.provider === provider);
const endpointId = (providerInfo as any)?.endpointId;
await callProviderAPIWithRetry(provider, apiKey, fallbackModel, userPrompt, onToken, settings, endpointId);
console.log(`✅ Successfully used ${provider} API with fallback model ${fallbackModel}`);
fallbackSucceeded = true;
return; // Success, exit the function
@@ -142,13 +151,17 @@ function getAvailableProviders(availableKeys: Record<string, string>, settings:
const providers = [];
// Helper to add a provider key if valid
const addProviderKey = (provider: string, apiKey: string, model?: string) => {
const addProviderKey = (provider: string, apiKey: string, model?: string, endpointId?: string) => {
if (isValidApiKey(provider, apiKey) && !isApiKeyInvalid(provider, apiKey)) {
providers.push({
const providerInfo: any = {
provider: provider,
apiKey: apiKey,
model: model || settings.models[provider] || getDefaultModel(provider)
});
};
if (endpointId) {
providerInfo.endpointId = endpointId;
}
providers.push(providerInfo);
return true;
} else if (isApiKeyInvalid(provider, apiKey)) {
console.log(`⏭️ Skipping ${provider} API key (marked as invalid)`);
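The next hunk calls `getRunPodConfig()` from `@/lib/clientConfig`, which is not included in this section. A minimal sketch consistent with the environment variables named in the error message above (the real implementation may differ):

```typescript
// Hypothetical sketch of src/lib/clientConfig.ts - only the env var names
// (VITE_RUNPOD_API_KEY, VITE_RUNPOD_ENDPOINT_ID) are taken from this diff
export interface RunPodConfig {
  apiKey: string
  endpointId: string
}

export function getRunPodConfig(): RunPodConfig | null {
  const apiKey = import.meta.env.VITE_RUNPOD_API_KEY
  const endpointId = import.meta.env.VITE_RUNPOD_ENDPOINT_ID
  // Treat missing or empty values as "not configured"
  return apiKey && endpointId ? { apiKey, endpointId } : null
}
```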
@@ -156,6 +169,20 @@ function getAvailableProviders(availableKeys: Record<string, string>, settings:
return false;
};
// PRIORITY 1: Check for RunPod configuration from environment variables FIRST
// RunPod takes priority over user-configured keys
const runpodConfig = getRunPodConfig();
if (runpodConfig && runpodConfig.apiKey && runpodConfig.endpointId) {
console.log('🔑 Found RunPod configuration from environment variables - using as primary AI provider');
providers.push({
provider: 'runpod',
apiKey: runpodConfig.apiKey,
endpointId: runpodConfig.endpointId,
model: 'default' // RunPod doesn't use model selection in the same way
});
}
// PRIORITY 2: Then add user-configured keys (they will be tried after RunPod)
// First, try the preferred provider - support multiple keys if stored as comma-separated
if (settings.provider && availableKeys[settings.provider]) {
const keyValue = availableKeys[settings.provider];
@@ -239,8 +266,10 @@ function getAvailableProviders(availableKeys: Record<string, string>, settings:
}
// Additional fallback: Check for user-specific API keys from profile dashboard
if (providers.length === 0) {
providers.push(...getUserSpecificApiKeys());
// These will be tried after RunPod (if RunPod was added)
const userSpecificKeys = getUserSpecificApiKeys();
if (userSpecificKeys.length > 0) {
providers.push(...userSpecificKeys);
}
return providers;
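With the RunPod env vars set and one user key configured, the returned list is ordered RunPod-first, for example (illustrative values):

```typescript
// Illustrative output of getAvailableProviders()
const providers = [
  { provider: 'runpod', apiKey: 'rp_...', endpointId: 'abc123xyz', model: 'default' },
  { provider: 'anthropic', apiKey: 'sk-ant-...', model: 'claude-3-haiku-20240307' },
]
```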
@@ -372,13 +401,14 @@ async function callProviderAPIWithRetry(
userPrompt: string,
onToken: (partialResponse: string, done?: boolean) => void,
settings?: any,
endpointId?: string,
maxRetries: number = 2
) {
let lastError: Error | null = null;
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
await callProviderAPI(provider, apiKey, model, userPrompt, onToken, settings);
await callProviderAPI(provider, apiKey, model, userPrompt, onToken, settings, endpointId);
return; // Success
} catch (error) {
lastError = error as Error;
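A hypothetical call site for the extended signature, with `endpointId` threaded through as the new optional parameter:

```typescript
// Illustrative values; argument order matches the signature above
await callProviderAPIWithRetry(
  'runpod',     // provider
  apiKey,       // RunPod API key
  'default',    // model (not meaningful for RunPod endpoints here)
  userPrompt,
  onToken,
  settings,
  endpointId,   // new optional parameter
  3,            // maxRetries (default is 2)
)
```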
@@ -471,12 +501,226 @@ async function callProviderAPI(
model: string,
userPrompt: string,
onToken: (partialResponse: string, done?: boolean) => void,
settings?: any
settings?: any,
endpointId?: string
) {
let partial = "";
const systemPrompt = settings ? getSystemPrompt(settings) : 'You are a helpful assistant.';
if (provider === 'openai') {
if (provider === 'runpod') {
// RunPod API integration - uses environment variables for automatic setup
// Get endpointId from parameter or from config
let runpodEndpointId = endpointId;
if (!runpodEndpointId) {
const runpodConfig = getRunPodConfig();
if (runpodConfig) {
runpodEndpointId = runpodConfig.endpointId;
}
}
if (!runpodEndpointId) {
throw new Error('RunPod endpoint ID not configured');
}
// Try /runsync first for synchronous execution (returns output immediately)
// Fall back to /run + polling if /runsync is not available
const syncUrl = `https://api.runpod.ai/v2/${runpodEndpointId}/runsync`;
const asyncUrl = `https://api.runpod.ai/v2/${runpodEndpointId}/run`;
// vLLM endpoints typically expect OpenAI-compatible format with messages array
// But some endpoints might accept simple prompt format
// Try OpenAI-compatible format first, as it's more standard for vLLM
const messages = [];
if (systemPrompt) {
messages.push({ role: 'system', content: systemPrompt });
}
messages.push({ role: 'user', content: userPrompt });
// Simple prompt-format fallback (system + user concatenated) for endpoints
// that reject the messages array; currently unused by the request below
const fullPrompt = systemPrompt ? `${systemPrompt}\n\nUser: ${userPrompt}` : userPrompt;
const requestBody = {
input: {
messages: messages,
stream: false // vLLM can handle streaming, but we'll process it synchronously for now
}
};
console.log('📤 RunPod API: Trying synchronous endpoint first:', syncUrl);
console.log('📤 RunPod API: Using OpenAI-compatible messages format');
try {
// First, try synchronous endpoint (/runsync) - this returns output immediately
try {
const syncResponse = await fetch(syncUrl, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${apiKey}`
},
body: JSON.stringify(requestBody)
});
if (syncResponse.ok) {
const syncData = await syncResponse.json();
console.log('📥 RunPod API: Synchronous response:', JSON.stringify(syncData, null, 2));
// Check if we got output directly
if (syncData.output) {
let responseText = '';
if (syncData.output.choices && Array.isArray(syncData.output.choices)) {
const choice = syncData.output.choices[0];
if (choice && choice.message && choice.message.content) {
responseText = choice.message.content;
}
} else if (typeof syncData.output === 'string') {
responseText = syncData.output;
} else if (syncData.output.text) {
responseText = syncData.output.text;
} else if (syncData.output.response) {
responseText = syncData.output.response;
}
if (responseText) {
console.log('✅ RunPod API: Got output from synchronous endpoint, length:', responseText.length);
// Stream the response character by character to simulate streaming
for (let i = 0; i < responseText.length; i++) {
partial += responseText[i];
onToken(partial, false);
await new Promise(resolve => setTimeout(resolve, 10));
}
onToken(partial, true);
return;
}
}
// If sync endpoint returned a job ID, fall through to async polling
if (syncData.id && (syncData.status === 'IN_QUEUE' || syncData.status === 'IN_PROGRESS')) {
console.log('⏳ RunPod API: Sync endpoint returned job ID, polling:', syncData.id);
const result = await pollRunPodJob(syncData.id, apiKey, runpodEndpointId);
console.log('✅ RunPod API: Job completed, result length:', result.length);
partial = result;
onToken(partial, true);
return;
}
}
} catch (syncError) {
console.log('⚠️ RunPod API: Synchronous endpoint not available, trying async:', syncError);
}
// Fall back to async endpoint (/run) if sync didn't work
console.log('📤 RunPod API: Using async endpoint:', asyncUrl);
const response = await fetch(asyncUrl, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${apiKey}`
},
body: JSON.stringify(requestBody)
});
console.log('📥 RunPod API: Response status:', response.status, response.statusText);
if (!response.ok) {
const errorText = await response.text();
console.error('❌ RunPod API: Error response:', errorText);
throw new Error(`RunPod API error: ${response.status} - ${errorText}`);
}
const data = await response.json();
console.log('📥 RunPod API: Response data:', JSON.stringify(data, null, 2));
// Handle async job pattern (RunPod often returns job IDs)
if (data.id && (data.status === 'IN_QUEUE' || data.status === 'IN_PROGRESS')) {
console.log('⏳ RunPod API: Job queued/in progress, polling job ID:', data.id);
const result = await pollRunPodJob(data.id, apiKey, runpodEndpointId);
console.log('✅ RunPod API: Job completed, result length:', result.length);
partial = result;
onToken(partial, true);
return;
}
// Handle OpenAI-compatible response format (vLLM endpoints)
if (data.output && data.output.choices && Array.isArray(data.output.choices)) {
console.log('📥 RunPod API: Detected OpenAI-compatible response format');
const choice = data.output.choices[0];
if (choice && choice.message && choice.message.content) {
const responseText = choice.message.content;
console.log('✅ RunPod API: Extracted content from OpenAI-compatible format, length:', responseText.length);
// Stream the response character by character to simulate streaming
for (let i = 0; i < responseText.length; i++) {
partial += responseText[i];
onToken(partial, false);
// Small delay to simulate streaming
await new Promise(resolve => setTimeout(resolve, 10));
}
onToken(partial, true);
return;
}
}
// Handle direct response
if (data.output) {
console.log('📥 RunPod API: Processing output:', typeof data.output, Array.isArray(data.output) ? 'array' : 'object');
// Try to extract text from various possible response formats
let responseText = '';
if (typeof data.output === 'string') {
responseText = data.output;
console.log('✅ RunPod API: Extracted string output, length:', responseText.length);
} else if (data.output.text) {
responseText = data.output.text;
console.log('✅ RunPod API: Extracted text from output.text, length:', responseText.length);
} else if (data.output.response) {
responseText = data.output.response;
console.log('✅ RunPod API: Extracted response from output.response, length:', responseText.length);
} else if (data.output.content) {
responseText = data.output.content;
console.log('✅ RunPod API: Extracted content from output.content, length:', responseText.length);
} else if (Array.isArray(data.output.segments)) {
responseText = data.output.segments.map((seg: any) => seg.text || seg).join(' ');
console.log('✅ RunPod API: Extracted text from segments, length:', responseText.length);
} else {
// Fallback: stringify the output
console.warn('⚠️ RunPod API: Unknown output format, stringifying:', Object.keys(data.output));
responseText = JSON.stringify(data.output);
}
// Stream the response character by character to simulate streaming
for (let i = 0; i < responseText.length; i++) {
partial += responseText[i];
onToken(partial, false);
// Small delay to simulate streaming
await new Promise(resolve => setTimeout(resolve, 10));
}
onToken(partial, true);
return;
}
// Handle error response
if (data.error) {
console.error('❌ RunPod API: Error in response:', data.error);
throw new Error(`RunPod API error: ${data.error}`);
}
// Check for status messages that might indicate endpoint is starting up
if (data.status) {
console.log('📥 RunPod API: Response status:', data.status);
if (data.status === 'STARTING' || data.status === 'PENDING') {
console.log('⏳ RunPod API: Endpoint appears to be starting up, this may take a moment...');
// Brief pause, then surface a retryable error - callProviderAPIWithRetry will try again
await new Promise(resolve => setTimeout(resolve, 2000));
throw new Error('RunPod endpoint is starting up. Please wait a moment and try again.');
}
}
console.error('❌ RunPod API: No valid response format detected. Full response:', JSON.stringify(data, null, 2));
throw new Error('No valid response from RunPod API');
} catch (error) {
console.error('❌ RunPod API error:', error);
throw error;
}
} else if (provider === 'openai') {
const openai = new OpenAI({
apiKey,
dangerouslyAllowBrowser: true,
@@ -556,6 +800,185 @@ async function callProviderAPI(
onToken(partial, true);
}
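For reference, this is the round trip the `runpod` branch above performs, with the response shapes it probes for (actual fields vary by worker image; these are the formats handled in the code):

```typescript
// Request body sent to /runsync (and to /run as fallback)
const body = {
  input: {
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Hello' },
    ],
    stream: false,
  },
}

// Response shapes the code unpacks (illustrative):
// 1. OpenAI-compatible (vLLM):
//    { "output": { "choices": [{ "message": { "content": "Hi!" } }] } }
// 2. Plain variants:
//    { "output": "Hi!" }  or  { "output": { "text": "Hi!" } }
// 3. Queued job, resolved by polling /status/{id}:
//    { "id": "abc123", "status": "IN_QUEUE" }
```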
// Helper function to poll RunPod job status until completion
async function pollRunPodJob(
jobId: string,
apiKey: string,
endpointId: string,
maxAttempts: number = 60,
pollInterval: number = 1000
): Promise<string> {
const statusUrl = `https://api.runpod.ai/v2/${endpointId}/status/${jobId}`;
console.log('🔄 RunPod API: Starting to poll job:', jobId);
for (let attempt = 0; attempt < maxAttempts; attempt++) {
try {
const response = await fetch(statusUrl, {
method: 'GET',
headers: {
'Authorization': `Bearer ${apiKey}`
}
});
if (!response.ok) {
const errorText = await response.text();
console.error(`❌ RunPod API: Poll error (attempt ${attempt + 1}/${maxAttempts}):`, response.status, errorText);
throw new Error(`Failed to check job status: ${response.status} - ${errorText}`);
}
const data = await response.json();
console.log(`🔄 RunPod API: Poll attempt ${attempt + 1}/${maxAttempts}, status:`, data.status);
console.log(`📥 RunPod API: Full poll response:`, JSON.stringify(data, null, 2));
if (data.status === 'COMPLETED') {
console.log('✅ RunPod API: Job completed, processing output...');
console.log('📥 RunPod API: Output structure:', typeof data.output, data.output ? Object.keys(data.output) : 'null');
console.log('📥 RunPod API: Full data object keys:', Object.keys(data));
// If no output after a couple of retries, try the stream endpoint as fallback
if (!data.output) {
if (attempt < 3) {
// Only retry 2-3 times, then try stream endpoint
console.log(`⏳ RunPod API: COMPLETED but no output yet, waiting briefly (attempt ${attempt + 1}/3)...`);
await new Promise(resolve => setTimeout(resolve, 500));
continue;
}
// After a few retries, try the stream endpoint as fallback
console.log('⚠️ RunPod API: Status endpoint not returning output, trying stream endpoint...');
try {
const streamUrl = `https://api.runpod.ai/v2/${endpointId}/stream/${jobId}`;
const streamResponse = await fetch(streamUrl, {
method: 'GET',
headers: {
'Authorization': `Bearer ${apiKey}`
}
});
if (streamResponse.ok) {
const streamData = await streamResponse.json();
console.log('📥 RunPod API: Stream endpoint response:', JSON.stringify(streamData, null, 2));
if (streamData.output) {
// Use stream endpoint output
data.output = streamData.output;
console.log('✅ RunPod API: Found output via stream endpoint');
} else if (streamData.choices && Array.isArray(streamData.choices)) {
// Handle OpenAI-compatible format from stream endpoint
data.output = { choices: streamData.choices };
console.log('✅ RunPod API: Found choices via stream endpoint');
}
} else {
console.log(`⚠️ RunPod API: Stream endpoint returned ${streamResponse.status}`);
}
} catch (streamError) {
console.log('⚠️ RunPod API: Stream endpoint not available or failed:', streamError);
}
}
// Extract text from various possible response formats
let result = '';
if (typeof data.output === 'string') {
result = data.output;
console.log('✅ RunPod API: Extracted string output from job, length:', result.length);
} else if (data.output?.text) {
result = data.output.text;
console.log('✅ RunPod API: Extracted text from output.text, length:', result.length);
} else if (data.output?.response) {
result = data.output.response;
console.log('✅ RunPod API: Extracted response from output.response, length:', result.length);
} else if (data.output?.content) {
result = data.output.content;
console.log('✅ RunPod API: Extracted content from output.content, length:', result.length);
} else if (data.output?.choices && Array.isArray(data.output.choices)) {
// Handle OpenAI-compatible response format (vLLM endpoints)
const choice = data.output.choices[0];
if (choice && choice.message && choice.message.content) {
result = choice.message.content;
console.log('✅ RunPod API: Extracted content from OpenAI-compatible format, length:', result.length);
}
} else if (data.output?.segments && Array.isArray(data.output.segments)) {
result = data.output.segments.map((seg: any) => seg.text || seg).join(' ');
console.log('✅ RunPod API: Extracted text from segments, length:', result.length);
} else if (Array.isArray(data.output)) {
// Handle array responses (some vLLM endpoints return arrays)
result = data.output.map((item: any) => {
if (typeof item === 'string') return item;
if (item.text) return item.text;
if (item.response) return item.response;
return JSON.stringify(item);
}).join('\n');
console.log('✅ RunPod API: Extracted text from array output, length:', result.length);
} else if (!data.output) {
// No output field - check alternative structures or return empty
console.warn('⚠️ RunPod API: No output field found, checking alternative structures...');
console.log('📥 RunPod API: Full data structure:', JSON.stringify(data, null, 2));
// Try checking if output is directly in data (not data.output)
if (typeof data === 'string') {
result = data;
console.log('✅ RunPod API: Data itself is a string, length:', result.length);
} else if (data.text) {
result = data.text;
console.log('✅ RunPod API: Found text at top level, length:', result.length);
} else if (data.response) {
result = data.response;
console.log('✅ RunPod API: Found response at top level, length:', result.length);
} else if (data.content) {
result = data.content;
console.log('✅ RunPod API: Found content at top level, length:', result.length);
} else {
// Stream endpoint was already tried above; just log that we couldn't find output
if (attempt >= 3) {
console.warn('⚠️ RunPod API: Could not find output in status or stream endpoint after multiple attempts');
}
// If still no result, return empty string instead of throwing error
// This allows the UI to render something instead of failing
if (!result) {
console.warn('⚠️ RunPod API: No output found in response. Returning empty result.');
console.log('📥 RunPod API: Available fields:', Object.keys(data));
result = ''; // Return empty string so UI can render
}
}
}
// result is always a string here (possibly empty); return it rather than
// polling again, so the UI can render something instead of looping forever
console.log('✅ RunPod API: Returning result (may be empty):', result ? `length ${result.length}` : 'empty');
return result;
}
if (data.status === 'FAILED') {
console.error('❌ RunPod API: Job failed:', data.error || 'Unknown error');
throw new Error(`Job failed: ${data.error || 'Unknown error'}`);
}
// Check for starting/pending status
if (data.status === 'STARTING' || data.status === 'PENDING') {
console.log(`⏳ RunPod API: Endpoint still starting (attempt ${attempt + 1}/${maxAttempts})...`);
}
// Job still in progress, wait and retry
await new Promise(resolve => setTimeout(resolve, pollInterval));
} catch (error) {
if (attempt === maxAttempts - 1) {
throw error;
}
// Wait before retrying
await new Promise(resolve => setTimeout(resolve, pollInterval));
}
}
throw new Error('Job polling timeout - job did not complete in time');
}
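With the defaults, polling gives up after roughly a minute (60 attempts × 1000 ms). A hypothetical call site that allows a cold-started endpoint up to five minutes:

```typescript
// jobId, apiKey and endpointId come from the caller
const text = await pollRunPodJob(jobId, apiKey, endpointId, 300, 1000)
```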
// Auto-migration function that runs automatically
async function autoMigrateAPIKeys() {
try {