diff --git a/RUNPOD_SETUP.md b/RUNPOD_SETUP.md new file mode 100644 index 0000000..da788c5 --- /dev/null +++ b/RUNPOD_SETUP.md @@ -0,0 +1,255 @@ +# RunPod WhisperX Integration Setup + +This guide explains how to set up and use the RunPod WhisperX endpoint for transcription in the canvas website. + +## Overview + +The transcription system can now use a hosted WhisperX endpoint on RunPod instead of running the Whisper model locally in the browser. This provides: +- Better accuracy with WhisperX's advanced features +- Faster processing (no model download needed) +- Reduced client-side resource usage +- Support for longer audio files + +## Prerequisites + +1. A RunPod account with an active WhisperX endpoint +2. Your RunPod API key +3. Your RunPod endpoint ID + +## Configuration + +### Environment Variables + +Add the following environment variables to your `.env.local` file (or your deployment environment): + +```bash +# RunPod Configuration +VITE_RUNPOD_API_KEY=your_runpod_api_key_here +VITE_RUNPOD_ENDPOINT_ID=your_endpoint_id_here +``` + +Or if using Next.js: + +```bash +NEXT_PUBLIC_RUNPOD_API_KEY=your_runpod_api_key_here +NEXT_PUBLIC_RUNPOD_ENDPOINT_ID=your_endpoint_id_here +``` + +### Getting Your RunPod Credentials + +1. **API Key**: + - Go to [RunPod Settings](https://www.runpod.io/console/user/settings) + - Navigate to API Keys section + - Create a new API key or copy an existing one + +2. **Endpoint ID**: + - Go to [RunPod Serverless Endpoints](https://www.runpod.io/console/serverless) + - Find your WhisperX endpoint + - Copy the endpoint ID from the URL or endpoint details + - Example: If your endpoint URL is `https://api.runpod.ai/v2/lrtisuv8ixbtub/run`, then `lrtisuv8ixbtub` is your endpoint ID + +## Usage + +### Automatic Detection + +The transcription hook automatically detects if RunPod is configured and uses it instead of the local Whisper model. No code changes are needed! + +### Manual Override + +If you want to explicitly control which transcription method to use: + +```typescript +import { useWhisperTranscription } from '@/hooks/useWhisperTranscriptionSimple' + +const { + isRecording, + transcript, + startRecording, + stopRecording +} = useWhisperTranscription({ + useRunPod: true, // Force RunPod usage + language: 'en', + onTranscriptUpdate: (text) => { + console.log('New transcript:', text) + } +}) +``` + +Or to force local model: + +```typescript +useWhisperTranscription({ + useRunPod: false, // Force local Whisper model + // ... other options +}) +``` + +## API Format + +The integration sends audio data to your RunPod endpoint in the following format: + +```json +{ + "input": { + "audio": "base64_encoded_audio_data", + "audio_format": "audio/wav", + "language": "en", + "task": "transcribe" + } +} +``` + +### Expected Response Format + +The endpoint should return one of these formats: + +**Direct Response:** +```json +{ + "output": { + "text": "Transcribed text here" + } +} +``` + +**Or with segments:** +```json +{ + "output": { + "segments": [ + { + "start": 0.0, + "end": 2.5, + "text": "Transcribed text here" + } + ] + } +} +``` + +**Async Job Pattern:** +```json +{ + "id": "job-id-123", + "status": "IN_QUEUE" +} +``` + +The integration automatically handles async jobs by polling the status endpoint until completion. 
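+The sketch below shows, in TypeScript, roughly what that polling flow does. It is for illustration only — the bundled `transcribeWithRunPod` helper in `src/lib/runpodApi.ts` already implements this (plus request timeouts and retry handling), so you normally do not write it yourself:
+
+```typescript
+// Minimal polling sketch for a RunPod async job (illustrative; see src/lib/runpodApi.ts).
+async function pollJob(jobId: string, apiKey: string, endpointId: string): Promise<string> {
+  const statusUrl = `https://api.runpod.ai/v2/${endpointId}/status/${jobId}`
+  for (let attempt = 0; attempt < 120; attempt++) {
+    const res = await fetch(statusUrl, { headers: { Authorization: `Bearer ${apiKey}` } })
+    const data = await res.json()
+    if (data.status === 'COMPLETED') return (data.output?.text ?? '').trim()
+    if (data.status === 'FAILED') throw new Error(data.error ?? 'RunPod job failed')
+    await new Promise((resolve) => setTimeout(resolve, 1000)) // still IN_QUEUE / IN_PROGRESS
+  }
+  throw new Error('Polling timed out')
+}
+```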
+ +## Customizing the API Request + +If your WhisperX endpoint expects a different request format, you can modify `src/lib/runpodApi.ts`: + +```typescript +// In transcribeWithRunPod function +const requestBody = { + input: { + // Adjust these fields based on your endpoint + audio: audioBase64, + // Add or modify fields as needed + } +} +``` + +## Troubleshooting + +### "RunPod API key or endpoint ID not configured" + +- Ensure environment variables are set correctly +- Restart your development server after adding environment variables +- Check that variable names match exactly (case-sensitive) + +### "RunPod API error: 401" + +- Verify your API key is correct +- Check that your API key has not expired +- Ensure you're using the correct API key format + +### "RunPod API error: 404" + +- Verify your endpoint ID is correct +- Check that your endpoint is active in the RunPod console +- Ensure the endpoint URL format matches: `https://api.runpod.ai/v2/{ENDPOINT_ID}/run` + +### "No transcription text found in RunPod response" + +- Check your endpoint's response format matches the expected format +- Verify your WhisperX endpoint is configured correctly +- Check the browser console for detailed error messages + +### "Failed to return job results" (400 Bad Request) + +This error occurs on the **server side** when your WhisperX endpoint tries to return results. This typically means: + +1. **Response format mismatch**: Your endpoint's response doesn't match RunPod's expected format + - Ensure your endpoint returns: `{"output": {"text": "..."}}` or `{"output": {"segments": [...]}}` + - The response must be valid JSON + - Check your endpoint handler code to ensure it's returning the correct structure + +2. **Response size limits**: The response might be too large + - Try with shorter audio files first + - Check RunPod's response size limits + +3. **Timeout issues**: The endpoint might be taking too long to process + - Check your endpoint logs for processing time + - Consider optimizing your WhisperX model configuration + +4. **Check endpoint handler**: Review your WhisperX endpoint's `handler.py` or equivalent: + ```python + # Example correct format + def handler(event): + # ... process audio ... + return { + "output": { + "text": transcription_text + } + } + ``` + +### Transcription not working + +- Check browser console for errors +- Verify your endpoint is active and responding +- Test your endpoint directly using curl or Postman +- Ensure audio format is supported (WAV format is recommended) +- Check RunPod endpoint logs for server-side errors + +## Testing Your Endpoint + +You can test your RunPod endpoint directly: + +```bash +curl -X POST https://api.runpod.ai/v2/YOUR_ENDPOINT_ID/run \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer YOUR_API_KEY" \ + -d '{ + "input": { + "audio": "base64_audio_data_here", + "audio_format": "audio/wav", + "language": "en" + } + }' +``` + +## Fallback Behavior + +If RunPod is not configured or fails, the system will: +1. Try to use RunPod if configured +2. Fall back to local Whisper model if RunPod fails or is not configured +3. 
Show error messages if both methods fail + +## Performance Considerations + +- **RunPod**: Better for longer audio files and higher accuracy, but requires network connection +- **Local Model**: Works offline, but requires model download and uses more client resources + +## Support + +For issues specific to: +- **RunPod API**: Check [RunPod Documentation](https://docs.runpod.io) +- **WhisperX**: Check your WhisperX endpoint configuration +- **Integration**: Check browser console for detailed error messages + + + diff --git a/TEST_RUNPOD_AI.md b/TEST_RUNPOD_AI.md new file mode 100644 index 0000000..63d8164 --- /dev/null +++ b/TEST_RUNPOD_AI.md @@ -0,0 +1,139 @@ +# Testing RunPod AI Integration + +This guide explains how to test the RunPod AI API integration in development. + +## Quick Setup + +1. **Add RunPod environment variables to `.env.local`:** + +```bash +# Add these lines to your .env.local file +VITE_RUNPOD_API_KEY=your_runpod_api_key_here +VITE_RUNPOD_ENDPOINT_ID=your_endpoint_id_here +``` + +**Important:** Replace `your_runpod_api_key_here` and `your_endpoint_id_here` with your actual RunPod credentials. + +2. **Get your RunPod credentials:** + - **API Key**: Go to [RunPod Settings](https://www.runpod.io/console/user/settings) → API Keys section + - **Endpoint ID**: Go to [RunPod Serverless Endpoints](https://www.runpod.io/console/serverless) → Find your endpoint → Copy the ID from the URL + - Example: If URL is `https://api.runpod.ai/v2/jqd16o7stu29vq/run`, then `jqd16o7stu29vq` is your endpoint ID + +3. **Restart the dev server:** + ```bash + npm run dev + ``` + +## Testing the Integration + +### Method 1: Using Prompt Shapes +1. Open the canvas website in your browser +2. Select the **Prompt** tool from the toolbar (or press the keyboard shortcut) +3. Click on the canvas to create a prompt shape +4. Type a prompt like "Write a hello world program in Python" +5. Press Enter or click the send button +6. The AI response should appear in the prompt shape + +### Method 2: Using Arrow LLM Action +1. Create an arrow shape pointing from one shape to another +2. Add text to the arrow (this becomes the prompt) +3. Select the arrow +4. Press **Alt+G** (or use the action menu) +5. The AI will process the prompt and fill the target shape with the response + +### Method 3: Using Command Palette +1. Press **Cmd+J** (Mac) or **Ctrl+J** (Windows/Linux) to open the LLM view +2. Type your prompt +3. Press Enter +4. The response should appear + +## Verifying RunPod is Being Used + +1. **Open browser console** (F12 or Cmd+Option+I) +2. Look for these log messages: + - `🔑 Found RunPod configuration from environment variables - using as primary AI provider` + - `🔍 Found X available AI providers: runpod (default)` + - `🔄 Attempting to use runpod API (default)...` + +3. **Check Network tab:** + - Look for requests to `https://api.runpod.ai/v2/{endpointId}/run` + - The request should have `Authorization: Bearer {your_api_key}` header + +## Expected Behavior + +- **With RunPod configured**: RunPod will be used FIRST (priority over user API keys) +- **Without RunPod**: System will fall back to user-configured API keys (OpenAI, Anthropic, etc.) 
+- **If both fail**: You'll see an error message + +## Troubleshooting + +### "No valid API key found for any provider" +- Check that `.env.local` has the correct variable names (`VITE_RUNPOD_API_KEY` and `VITE_RUNPOD_ENDPOINT_ID`) +- Restart the dev server after adding environment variables +- Check browser console for detailed error messages + +### "RunPod API error: 401" +- Verify your API key is correct +- Check that your API key hasn't expired +- Ensure you're using the correct API key format + +### "RunPod API error: 404" +- Verify your endpoint ID is correct +- Check that your endpoint is active in RunPod console +- Ensure the endpoint URL format matches: `https://api.runpod.ai/v2/{ENDPOINT_ID}/run` + +### RunPod not being used +- Check browser console for `🔑 Found RunPod configuration` message +- Verify environment variables are loaded (check `import.meta.env.VITE_RUNPOD_API_KEY` in console) +- Make sure you restarted the dev server after adding environment variables + +## Testing Different Scenarios + +### Test 1: RunPod Only (No User Keys) +1. Remove or clear any user API keys from localStorage +2. Set RunPod environment variables +3. Run an AI command +4. Should use RunPod automatically + +### Test 2: RunPod Priority (With User Keys) +1. Set RunPod environment variables +2. Also configure user API keys in settings +3. Run an AI command +4. Should use RunPod FIRST, then fall back to user keys if RunPod fails + +### Test 3: Fallback Behavior +1. Set RunPod environment variables with invalid credentials +2. Configure valid user API keys +3. Run an AI command +4. Should try RunPod first, fail, then use user keys + +## API Request Format + +The integration sends requests in this format: + +```json +{ + "input": { + "prompt": "Your prompt text here" + } +} +``` + +The system prompt and user prompt are combined into a single prompt string. + +## Response Handling + +The integration handles multiple response formats: +- Direct text response: `{ "output": "text" }` +- Object with text: `{ "output": { "text": "..." } }` +- Object with response: `{ "output": { "response": "..." } }` +- Async jobs: Polls until completion + +## Next Steps + +Once testing is successful: +1. Verify RunPod responses are working correctly +2. Test with different prompt types +3. Monitor RunPod usage and costs +4. 
Consider adding rate limiting if needed
+
diff --git a/src/hooks/useWhisperTranscriptionSimple.ts b/src/hooks/useWhisperTranscriptionSimple.ts
index 1be6b7c..17bee76 100644
--- a/src/hooks/useWhisperTranscriptionSimple.ts
+++ b/src/hooks/useWhisperTranscriptionSimple.ts
@@ -1,5 +1,7 @@
 import { useCallback, useEffect, useRef, useState } from 'react'
 import { pipeline, env } from '@xenova/transformers'
+import { transcribeWithRunPod } from '../lib/runpodApi'
+import { isRunPodConfigured } from '../lib/clientConfig'
 
 // Configure the transformers library
 env.allowRemoteModels = true
@@ -48,6 +50,44 @@ function detectAudioFormat(blob: Blob): Promise<string> {
   })
 }
 
+// Convert Float32Array audio data to WAV blob
+async function createWavBlob(audioData: Float32Array, sampleRate: number): Promise<Blob> {
+  const length = audioData.length
+  const buffer = new ArrayBuffer(44 + length * 2)
+  const view = new DataView(buffer)
+
+  // WAV header
+  const writeString = (offset: number, string: string) => {
+    for (let i = 0; i < string.length; i++) {
+      view.setUint8(offset + i, string.charCodeAt(i))
+    }
+  }
+
+  writeString(0, 'RIFF')
+  view.setUint32(4, 36 + length * 2, true)
+  writeString(8, 'WAVE')
+  writeString(12, 'fmt ')
+  view.setUint32(16, 16, true)
+  view.setUint16(20, 1, true)
+  view.setUint16(22, 1, true)
+  view.setUint32(24, sampleRate, true)
+  view.setUint32(28, sampleRate * 2, true)
+  view.setUint16(32, 2, true)
+  view.setUint16(34, 16, true)
+  writeString(36, 'data')
+  view.setUint32(40, length * 2, true)
+
+  // Convert float samples to 16-bit PCM
+  let offset = 44
+  for (let i = 0; i < length; i++) {
+    const sample = Math.max(-1, Math.min(1, audioData[i]))
+    view.setInt16(offset, sample < 0 ? sample * 0x8000 : sample * 0x7FFF, true)
+    offset += 2
+  }
+
+  return new Blob([buffer], { type: 'audio/wav' })
+}
+
 // Simple resampling function for audio data
 function resampleAudio(audioData: Float32Array, fromSampleRate: number, toSampleRate: number): Float32Array {
   if (fromSampleRate === toSampleRate) {
@@ -103,6 +143,7 @@ interface UseWhisperTranscriptionOptions {
   enableAdvancedErrorHandling?: boolean
   modelOptions?: ModelOption[]
   autoInitialize?: boolean // If false, model will only load when startRecording is called
+  useRunPod?: boolean // If true, use RunPod WhisperX endpoint instead of local model (defaults to checking if RunPod is configured)
 }
 
 export const useWhisperTranscription = ({
@@ -112,8 +153,11 @@ export const useWhisperTranscription = ({
   enableStreaming = false,
   enableAdvancedErrorHandling = false,
   modelOptions,
-  autoInitialize = true // Default to true for backward compatibility
+  autoInitialize = true, // Default to true for backward compatibility
+  useRunPod = undefined // If undefined, auto-detect based on configuration
 }: UseWhisperTranscriptionOptions = {}) => {
+  // Auto-detect RunPod usage if not explicitly set
+  const shouldUseRunPod = useRunPod !== undefined ? 
useRunPod : isRunPodConfigured() const [isRecording, setIsRecording] = useState(false) const [isTranscribing, setIsTranscribing] = useState(false) const [isSpeaking, setIsSpeaking] = useState(false) @@ -161,6 +205,13 @@ export const useWhisperTranscription = ({ // Initialize transcriber with optional advanced error handling const initializeTranscriber = useCallback(async () => { + // Skip model loading if using RunPod + if (shouldUseRunPod) { + console.log('🚀 Using RunPod WhisperX endpoint - skipping local model loading') + setModelLoaded(true) // Mark as "loaded" since we don't need a local model + return null + } + if (transcriberRef.current) return transcriberRef.current try { @@ -432,19 +483,33 @@ export const useWhisperTranscription = ({ console.log(`đŸŽĩ Real-time audio: ${processedAudioData.length} samples (${(processedAudioData.length / 16000).toFixed(2)}s)`) - // Transcribe with parameters optimized for real-time processing - const result = await transcriberRef.current(processedAudioData, { - language: language, - task: 'transcribe', - return_timestamps: false, - chunk_length_s: 5, // Longer chunks for better context - stride_length_s: 2, // Larger stride for better coverage - no_speech_threshold: 0.3, // Higher threshold to reduce noise - logprob_threshold: -0.8, // More sensitive detection - compression_ratio_threshold: 2.0 // More permissive for real-time - }) + let transcriptionText = '' - const transcriptionText = result?.text || '' + // Use RunPod if configured, otherwise use local model + if (shouldUseRunPod) { + console.log('🚀 Using RunPod WhisperX API for real-time transcription...') + // Convert processed audio data back to blob for RunPod + const wavBlob = await createWavBlob(processedAudioData, 16000) + transcriptionText = await transcribeWithRunPod(wavBlob, language) + } else { + // Use local Whisper model + if (!transcriberRef.current) { + console.log('âš ī¸ Transcriber not available for real-time processing') + return + } + const result = await transcriberRef.current(processedAudioData, { + language: language, + task: 'transcribe', + return_timestamps: false, + chunk_length_s: 5, // Longer chunks for better context + stride_length_s: 2, // Larger stride for better coverage + no_speech_threshold: 0.3, // Higher threshold to reduce noise + logprob_threshold: -0.8, // More sensitive detection + compression_ratio_threshold: 2.0 // More permissive for real-time + }) + + transcriptionText = result?.text || '' + } if (transcriptionText.trim()) { lastTranscriptionTimeRef.current = Date.now() console.log(`✅ Real-time transcript: "${transcriptionText.trim()}"`) @@ -453,53 +518,63 @@ export const useWhisperTranscription = ({ } else { console.log('âš ī¸ No real-time transcription text produced, trying fallback parameters...') - // Try with more permissive parameters for real-time processing - try { - const fallbackResult = await transcriberRef.current(processedAudioData, { - task: 'transcribe', - return_timestamps: false, - chunk_length_s: 3, // Shorter chunks for fallback - stride_length_s: 1, // Smaller stride for fallback - no_speech_threshold: 0.1, // Very low threshold for fallback - logprob_threshold: -1.2, // Very sensitive for fallback - compression_ratio_threshold: 2.5 // Very permissive for fallback - }) - - const fallbackText = fallbackResult?.text || '' - if (fallbackText.trim()) { - console.log(`✅ Fallback real-time transcript: "${fallbackText.trim()}"`) - lastTranscriptionTimeRef.current = Date.now() - handleStreamingTranscriptUpdate(fallbackText.trim()) - } 
else { - console.log('âš ī¸ Fallback transcription also produced no text') + // Try with more permissive parameters for real-time processing (only for local model) + if (!shouldUseRunPod && transcriberRef.current) { + try { + const fallbackResult = await transcriberRef.current(processedAudioData, { + task: 'transcribe', + return_timestamps: false, + chunk_length_s: 3, // Shorter chunks for fallback + stride_length_s: 1, // Smaller stride for fallback + no_speech_threshold: 0.1, // Very low threshold for fallback + logprob_threshold: -1.2, // Very sensitive for fallback + compression_ratio_threshold: 2.5 // Very permissive for fallback + }) + + const fallbackText = fallbackResult?.text || '' + if (fallbackText.trim()) { + console.log(`✅ Fallback real-time transcript: "${fallbackText.trim()}"`) + lastTranscriptionTimeRef.current = Date.now() + handleStreamingTranscriptUpdate(fallbackText.trim()) + } else { + console.log('âš ī¸ Fallback transcription also produced no text') + } + } catch (fallbackError) { + console.log('âš ī¸ Fallback transcription failed:', fallbackError) } - } catch (fallbackError) { - console.log('âš ī¸ Fallback transcription failed:', fallbackError) } } } catch (error) { console.error('❌ Error processing accumulated audio chunks:', error) } - }, [handleStreamingTranscriptUpdate, language]) + }, [handleStreamingTranscriptUpdate, language, shouldUseRunPod]) // Process recorded audio chunks (final processing) const processAudioChunks = useCallback(async () => { - if (!transcriberRef.current || audioChunksRef.current.length === 0) { - console.log('âš ī¸ No transcriber or audio chunks to process') + if (audioChunksRef.current.length === 0) { + console.log('âš ī¸ No audio chunks to process') return } - // Ensure model is loaded - if (!modelLoaded) { - console.log('âš ī¸ Model not loaded yet, waiting...') - try { - await initializeTranscriber() - } catch (error) { - console.error('❌ Failed to initialize transcriber:', error) - onError?.(error as Error) + // For local model, ensure transcriber is loaded + if (!shouldUseRunPod) { + if (!transcriberRef.current) { + console.log('âš ī¸ No transcriber available') return } + + // Ensure model is loaded + if (!modelLoaded) { + console.log('âš ī¸ Model not loaded yet, waiting...') + try { + await initializeTranscriber() + } catch (error) { + console.error('❌ Failed to initialize transcriber:', error) + onError?.(error as Error) + return + } + } } try { @@ -588,24 +663,32 @@ export const useWhisperTranscription = ({ console.log(`đŸŽĩ Processing audio: ${processedAudioData.length} samples (${(processedAudioData.length / 16000).toFixed(2)}s)`) - // Check if transcriber is available - if (!transcriberRef.current) { - console.error('❌ Transcriber not available for processing') - throw new Error('Transcriber not initialized') + console.log('🔄 Starting transcription...') + + let newText = '' + + // Use RunPod if configured, otherwise use local model + if (shouldUseRunPod) { + console.log('🚀 Using RunPod WhisperX API...') + // Convert processed audio data back to blob for RunPod + // Create a WAV blob from the Float32Array + const wavBlob = await createWavBlob(processedAudioData, 16000) + newText = await transcribeWithRunPod(wavBlob, language) + console.log('✅ RunPod transcription result:', newText) + } else { + // Use local Whisper model + if (!transcriberRef.current) { + throw new Error('Transcriber not initialized') + } + const result = await transcriberRef.current(processedAudioData, { + language: language, + task: 'transcribe', + 
return_timestamps: false + }) + + console.log('🔍 Transcription result:', result) + newText = result?.text?.trim() || '' } - - console.log('🔄 Starting transcription with Whisper model...') - - // Transcribe the audio - const result = await transcriberRef.current(processedAudioData, { - language: language, - task: 'transcribe', - return_timestamps: false - }) - - console.log('🔍 Transcription result:', result) - - const newText = result?.text?.trim() || '' if (newText) { const processedText = processTranscript(newText, enableStreaming) @@ -633,16 +716,17 @@ export const useWhisperTranscription = ({ console.log('âš ī¸ No transcription text produced') console.log('🔍 Full transcription result object:', result) - // Try alternative transcription parameters - console.log('🔄 Trying alternative transcription parameters...') - try { - const altResult = await transcriberRef.current(processedAudioData, { - task: 'transcribe', - return_timestamps: false - }) - console.log('🔍 Alternative transcription result:', altResult) - - if (altResult?.text?.trim()) { + // Try alternative transcription parameters (only for local model) + if (!shouldUseRunPod && transcriberRef.current) { + console.log('🔄 Trying alternative transcription parameters...') + try { + const altResult = await transcriberRef.current(processedAudioData, { + task: 'transcribe', + return_timestamps: false + }) + console.log('🔍 Alternative transcription result:', altResult) + + if (altResult?.text?.trim()) { const processedAltText = processTranscript(altResult.text, enableStreaming) console.log('✅ Alternative transcription successful:', processedAltText) const currentTranscript = transcriptRef.current @@ -658,8 +742,9 @@ export const useWhisperTranscription = ({ previousTranscriptLengthRef.current = updatedTranscript.length } } - } catch (altError) { - console.log('âš ī¸ Alternative transcription also failed:', altError) + } catch (altError) { + console.log('âš ī¸ Alternative transcription also failed:', altError) + } } } @@ -672,7 +757,7 @@ export const useWhisperTranscription = ({ } finally { setIsTranscribing(false) } - }, [transcriberRef, language, onTranscriptUpdate, onError, enableStreaming, handleStreamingTranscriptUpdate, modelLoaded, initializeTranscriber]) + }, [transcriberRef, language, onTranscriptUpdate, onError, enableStreaming, handleStreamingTranscriptUpdate, modelLoaded, initializeTranscriber, shouldUseRunPod]) // Start recording const startRecording = useCallback(async () => { @@ -680,10 +765,13 @@ export const useWhisperTranscription = ({ console.log('🎤 Starting recording...') console.log('🔍 enableStreaming in startRecording:', enableStreaming) - // Ensure model is loaded before starting - if (!modelLoaded) { + // Ensure model is loaded before starting (skip for RunPod) + if (!shouldUseRunPod && !modelLoaded) { console.log('🔄 Model not loaded, initializing...') await initializeTranscriber() + } else if (shouldUseRunPod) { + // For RunPod, just mark as ready + setModelLoaded(true) } // Don't reset transcripts for continuous transcription - keep existing content @@ -803,7 +891,7 @@ export const useWhisperTranscription = ({ console.error('❌ Error starting recording:', error) onError?.(error as Error) } - }, [processAudioChunks, processAccumulatedAudioChunks, onError, enableStreaming, modelLoaded, initializeTranscriber]) + }, [processAudioChunks, processAccumulatedAudioChunks, onError, enableStreaming, modelLoaded, initializeTranscriber, shouldUseRunPod]) // Stop recording const stopRecording = useCallback(async () => { @@ 
-892,9 +980,11 @@ export const useWhisperTranscription = ({ periodicTranscriptionRef.current = null } - // Initialize the model if not already loaded - if (!modelLoaded) { + // Initialize the model if not already loaded (skip for RunPod) + if (!shouldUseRunPod && !modelLoaded) { await initializeTranscriber() + } else if (shouldUseRunPod) { + setModelLoaded(true) } await startRecording() @@ -933,7 +1023,7 @@ export const useWhisperTranscription = ({ if (autoInitialize) { initializeTranscriber().catch(console.warn) } - }, [initializeTranscriber, autoInitialize]) + }, [initializeTranscriber, autoInitialize, shouldUseRunPod]) // Cleanup on unmount useEffect(() => { diff --git a/src/lib/clientConfig.ts b/src/lib/clientConfig.ts index ca95734..914fa35 100644 --- a/src/lib/clientConfig.ts +++ b/src/lib/clientConfig.ts @@ -14,6 +14,8 @@ export interface ClientConfig { webhookUrl?: string webhookSecret?: string openaiApiKey?: string + runpodApiKey?: string + runpodEndpointId?: string } /** @@ -38,6 +40,8 @@ export function getClientConfig(): ClientConfig { webhookUrl: import.meta.env.VITE_QUARTZ_WEBHOOK_URL || import.meta.env.NEXT_PUBLIC_QUARTZ_WEBHOOK_URL, webhookSecret: import.meta.env.VITE_QUARTZ_WEBHOOK_SECRET || import.meta.env.NEXT_PUBLIC_QUARTZ_WEBHOOK_SECRET, openaiApiKey: import.meta.env.VITE_OPENAI_API_KEY || import.meta.env.NEXT_PUBLIC_OPENAI_API_KEY, + runpodApiKey: import.meta.env.VITE_RUNPOD_API_KEY || import.meta.env.NEXT_PUBLIC_RUNPOD_API_KEY, + runpodEndpointId: import.meta.env.VITE_RUNPOD_ENDPOINT_ID || import.meta.env.NEXT_PUBLIC_RUNPOD_ENDPOINT_ID, } } else { // Next.js environment @@ -52,6 +56,8 @@ export function getClientConfig(): ClientConfig { webhookUrl: (window as any).__NEXT_DATA__?.env?.NEXT_PUBLIC_QUARTZ_WEBHOOK_URL, webhookSecret: (window as any).__NEXT_DATA__?.env?.NEXT_PUBLIC_QUARTZ_WEBHOOK_SECRET, openaiApiKey: (window as any).__NEXT_DATA__?.env?.NEXT_PUBLIC_OPENAI_API_KEY, + runpodApiKey: (window as any).__NEXT_DATA__?.env?.NEXT_PUBLIC_RUNPOD_API_KEY, + runpodEndpointId: (window as any).__NEXT_DATA__?.env?.NEXT_PUBLIC_RUNPOD_ENDPOINT_ID, } } } else { @@ -66,10 +72,36 @@ export function getClientConfig(): ClientConfig { quartzApiKey: process.env.VITE_QUARTZ_API_KEY || process.env.NEXT_PUBLIC_QUARTZ_API_KEY, webhookUrl: process.env.VITE_QUARTZ_WEBHOOK_URL || process.env.NEXT_PUBLIC_QUARTZ_WEBHOOK_URL, webhookSecret: process.env.VITE_QUARTZ_WEBHOOK_SECRET || process.env.NEXT_PUBLIC_QUARTZ_WEBHOOK_SECRET, + runpodApiKey: process.env.VITE_RUNPOD_API_KEY || process.env.NEXT_PUBLIC_RUNPOD_API_KEY, + runpodEndpointId: process.env.VITE_RUNPOD_ENDPOINT_ID || process.env.NEXT_PUBLIC_RUNPOD_ENDPOINT_ID, } } } +/** + * Get RunPod configuration for API calls + */ +export function getRunPodConfig(): { apiKey: string; endpointId: string } | null { + const config = getClientConfig() + + if (!config.runpodApiKey || !config.runpodEndpointId) { + return null + } + + return { + apiKey: config.runpodApiKey, + endpointId: config.runpodEndpointId + } +} + +/** + * Check if RunPod integration is configured + */ +export function isRunPodConfigured(): boolean { + const config = getClientConfig() + return !!(config.runpodApiKey && config.runpodEndpointId) +} + /** * Check if GitHub integration is configured */ diff --git a/src/lib/runpodApi.ts b/src/lib/runpodApi.ts new file mode 100644 index 0000000..cad2f9e --- /dev/null +++ b/src/lib/runpodApi.ts @@ -0,0 +1,246 @@ +/** + * RunPod API utility functions + * Handles communication with RunPod WhisperX endpoints + */ + +import { 
getRunPodConfig } from './clientConfig'
+
+export interface RunPodTranscriptionResponse {
+  id?: string
+  status?: string
+  output?: {
+    text?: string
+    segments?: Array<{
+      start: number
+      end: number
+      text: string
+    }>
+  }
+  error?: string
+}
+
+/**
+ * Convert audio blob to base64 string
+ */
+export async function blobToBase64(blob: Blob): Promise<string> {
+  return new Promise((resolve, reject) => {
+    const reader = new FileReader()
+    reader.onloadend = () => {
+      if (typeof reader.result === 'string') {
+        // Remove data URL prefix (e.g., "data:audio/webm;base64,")
+        const base64 = reader.result.split(',')[1] || reader.result
+        resolve(base64)
+      } else {
+        reject(new Error('Failed to convert blob to base64'))
+      }
+    }
+    reader.onerror = reject
+    reader.readAsDataURL(blob)
+  })
+}
+
+/**
+ * Send transcription request to RunPod endpoint
+ * Handles both synchronous and asynchronous job patterns
+ */
+export async function transcribeWithRunPod(
+  audioBlob: Blob,
+  language?: string
+): Promise<string> {
+  const config = getRunPodConfig()
+
+  if (!config) {
+    throw new Error('RunPod API key or endpoint ID not configured. Please set VITE_RUNPOD_API_KEY and VITE_RUNPOD_ENDPOINT_ID environment variables.')
+  }
+
+  // Check audio blob size (limit to ~10MB to prevent issues)
+  const maxSize = 10 * 1024 * 1024 // 10MB
+  if (audioBlob.size > maxSize) {
+    throw new Error(`Audio file too large: ${(audioBlob.size / 1024 / 1024).toFixed(2)}MB. Maximum size is ${(maxSize / 1024 / 1024).toFixed(2)}MB`)
+  }
+
+  // Convert audio blob to base64
+  const audioBase64 = await blobToBase64(audioBlob)
+
+  // Detect audio format from blob type
+  const audioFormat = audioBlob.type || 'audio/wav'
+
+  const url = `https://api.runpod.ai/v2/${config.endpointId}/run`
+
+  // Prepare the request payload
+  // WhisperX typically expects audio as base64 or file URL
+  // The exact format may vary based on your WhisperX endpoint implementation
+  const requestBody = {
+    input: {
+      audio: audioBase64,
+      audio_format: audioFormat,
+      language: language || 'en',
+      task: 'transcribe'
+      // Note: Some WhisperX endpoints may expect different field names
+      // Adjust the requestBody structure in this function if needed
+    }
+  }
+
+  try {
+    // Add timeout to prevent hanging requests (30 seconds for initial request)
+    const controller = new AbortController()
+    const timeoutId = setTimeout(() => controller.abort(), 30000)
+
+    const response = await fetch(url, {
+      method: 'POST',
+      headers: {
+        'Content-Type': 'application/json',
+        'Authorization': `Bearer ${config.apiKey}`
+      },
+      body: JSON.stringify(requestBody),
+      signal: controller.signal
+    })
+
+    clearTimeout(timeoutId)
+
+    if (!response.ok) {
+      const errorText = await response.text()
+      console.error('RunPod API error response:', {
+        status: response.status,
+        statusText: response.statusText,
+        body: errorText
+      })
+      throw new Error(`RunPod API error: ${response.status} - ${errorText}`)
+    }
+
+    const data: RunPodTranscriptionResponse = await response.json()
+
+    console.log('RunPod initial response:', data)
+
+    // Handle async job pattern (RunPod often returns job IDs)
+    if (data.id && (data.status === 'IN_QUEUE' || data.status === 'IN_PROGRESS')) {
+      console.log('Job is async, polling for results...', data.id)
+      return await pollRunPodJob(data.id, config.apiKey, config.endpointId)
+    }
+
+    // Handle direct response
+    if (data.output?.text) {
+      return data.output.text.trim()
+    }
+
+    // Handle error response
+    if (data.error) {
+      throw new Error(`RunPod transcription error: ${data.error}`)
+    }
+
+    // Fallback: try to extract text from segments
+    if (data.output?.segments && data.output.segments.length > 0) {
+      return data.output.segments.map(seg => seg.text).join(' ').trim()
+    }
+
+    // Check if response has unexpected structure
+    console.warn('Unexpected RunPod response structure:', data)
+    throw new Error('No transcription text found in RunPod response. Check endpoint response format.')
+  } catch (error: any) {
+    if (error.name === 'AbortError') {
+      throw new Error('RunPod request timed out after 30 seconds')
+    }
+    console.error('RunPod transcription error:', error)
+    throw error
+  }
+}
+
+/**
+ * Poll RunPod job status until completion
+ */
+async function pollRunPodJob(
+  jobId: string,
+  apiKey: string,
+  endpointId: string,
+  maxAttempts: number = 120, // Increased to 120 attempts (2 minutes at 1s intervals)
+  pollInterval: number = 1000
+): Promise<string> {
+  const statusUrl = `https://api.runpod.ai/v2/${endpointId}/status/${jobId}`
+
+  console.log(`Polling job ${jobId} (max ${maxAttempts} attempts, ${pollInterval}ms interval)`)
+
+  for (let attempt = 0; attempt < maxAttempts; attempt++) {
+    try {
+      // Add timeout for each status check (5 seconds)
+      const controller = new AbortController()
+      const timeoutId = setTimeout(() => controller.abort(), 5000)
+
+      const response = await fetch(statusUrl, {
+        method: 'GET',
+        headers: {
+          'Authorization': `Bearer ${apiKey}`
+        },
+        signal: controller.signal
+      })
+
+      clearTimeout(timeoutId)
+
+      if (!response.ok) {
+        const errorText = await response.text()
+        console.error(`Job status check failed (attempt ${attempt + 1}/${maxAttempts}):`, {
+          status: response.status,
+          statusText: response.statusText,
+          body: errorText
+        })
+
+        // Don't fail immediately on 404 - job might still be processing
+        if (response.status === 404 && attempt < maxAttempts - 1) {
+          console.log('Job not found yet, continuing to poll...')
+          await new Promise(resolve => setTimeout(resolve, pollInterval))
+          continue
+        }
+
+        throw new Error(`Failed to check job status: ${response.status} - ${errorText}`)
+      }
+
+      const data: RunPodTranscriptionResponse = await response.json()
+
+      console.log(`Job status (attempt ${attempt + 1}/${maxAttempts}):`, data.status)
+
+      if (data.status === 'COMPLETED') {
+        console.log('Job completed, extracting transcription...')
+
+        if (data.output?.text) {
+          return data.output.text.trim()
+        }
+        if (data.output?.segments && data.output.segments.length > 0) {
+          return data.output.segments.map(seg => seg.text).join(' ').trim()
+        }
+
+        // Log the full response for debugging
+        console.error('Job completed but no transcription found. Full response:', JSON.stringify(data, null, 2))
+        throw new Error('Job completed but no transcription text found in response')
+      }
+
+      if (data.status === 'FAILED') {
+        const errorMsg = data.error || 'Unknown error'
+        console.error('Job failed:', errorMsg)
+        throw new Error(`Job failed: ${errorMsg}`)
+      }
+
+      // Job still in progress, wait and retry
+      if (attempt % 10 === 0) {
+        console.log(`Job still processing... (${attempt + 1}/${maxAttempts} attempts)`)
+      }
+      await new Promise(resolve => setTimeout(resolve, pollInterval))
+    } catch (error: any) {
+      if (error.name === 'AbortError') {
+        console.warn(`Status check timed out (attempt ${attempt + 1}/${maxAttempts})`)
+        if (attempt < maxAttempts - 1) {
+          await new Promise(resolve => setTimeout(resolve, pollInterval))
+          continue
+        }
+        throw new Error('Status check timed out multiple times')
+      }
+
+      if (attempt === maxAttempts - 1) {
+        throw error
+      }
+      // Wait before retrying
+      await new Promise(resolve => setTimeout(resolve, pollInterval))
+    }
+  }
+
+  throw new Error(`Job polling timeout after ${maxAttempts} attempts (${(maxAttempts * pollInterval / 1000).toFixed(0)} seconds)`)
+}
+
diff --git a/src/routes/Board.tsx b/src/routes/Board.tsx
index f0fea4b..c65a734 100644
--- a/src/routes/Board.tsx
+++ b/src/routes/Board.tsx
@@ -42,6 +42,8 @@ import { HolonBrowserShape } from "@/shapes/HolonBrowserShapeUtil"
 import { ObsidianBrowserShape } from "@/shapes/ObsidianBrowserShapeUtil"
 import { FathomMeetingsBrowserShape } from "@/shapes/FathomMeetingsBrowserShapeUtil"
 import { LocationShareShape } from "@/shapes/LocationShareShapeUtil"
+import { ImageGenShape } from "@/shapes/ImageGenShapeUtil"
+import { ImageGenTool } from "@/tools/ImageGenTool"
 import {
   lockElement,
   unlockElement,
@@ -82,6 +84,7 @@ const customShapeUtils = [
   ObsidianBrowserShape,
   FathomMeetingsBrowserShape,
   LocationShareShape,
+  ImageGenShape,
 ]
 const customTools = [
   ChatBoxTool,
@@ -96,6 +99,7 @@ const customTools = [
   TranscriptionTool,
   HolonTool,
   FathomMeetingsTool,
+  ImageGenTool,
 ]
 
 export function Board() {
diff --git a/src/shapes/ImageGenShapeUtil.tsx b/src/shapes/ImageGenShapeUtil.tsx
new file mode 100644
index 0000000..7929df4
--- /dev/null
+++ b/src/shapes/ImageGenShapeUtil.tsx
@@ -0,0 +1,730 @@
+import {
+  BaseBoxShapeUtil,
+  Geometry2d,
+  HTMLContainer,
+  Rectangle2d,
+  TLBaseShape,
+} from "tldraw"
+import React, { useState } from "react"
+import { getRunPodConfig } from "@/lib/clientConfig"
+
+// Feature flag: Set to false when RunPod API is ready for production
+const USE_MOCK_API = true
+
+// Type definition for RunPod API responses
+interface RunPodJobResponse {
+  id?: string
+  status?: 'IN_QUEUE' | 'IN_PROGRESS' | 'STARTING' | 'COMPLETED' | 'FAILED' | 'CANCELLED'
+  output?: string | {
+    image?: string
+    url?: string
+    images?: Array<{ data?: string; url?: string; filename?: string; type?: string }>
+    result?: string
+    [key: string]: any
+  }
+  error?: string
+  image?: string
+  url?: string
+  result?: string | {
+    image?: string
+    url?: string
+    [key: string]: any
+  }
+  [key: string]: any
+}
+
+type IImageGen = TLBaseShape<
+  "ImageGen",
+  {
+    w: number
+    h: number
+    prompt: string
+    imageUrl: string | null
+    isLoading: boolean
+    error: string | null
+    endpointId?: string // Optional custom endpoint ID
+  }
+>
+
+// Helper function to poll RunPod job status until completion
+async function pollRunPodJob(
+  jobId: string,
+  apiKey: string,
+  endpointId: string,
+  maxAttempts: number = 60,
+  pollInterval: number = 2000
+): Promise<string> {
+  const statusUrl = `https://api.runpod.ai/v2/${endpointId}/status/${jobId}`
+  console.log('🔄 ImageGen: Polling job:', jobId)
+
+  for (let attempt = 0; attempt < maxAttempts; attempt++) {
+    try {
+      const response = await fetch(statusUrl, {
+        method: 'GET',
+        headers: {
+          'Authorization': `Bearer ${apiKey}`
+        }
+      })
+
+      if (!response.ok) {
+        const errorText = await response.text()
+        console.error(`❌ ImageGen: Poll error (attempt ${attempt + 
1}/${maxAttempts}):`, response.status, errorText) + throw new Error(`Failed to check job status: ${response.status} - ${errorText}`) + } + + const data = await response.json() as RunPodJobResponse + console.log(`🔄 ImageGen: Poll attempt ${attempt + 1}/${maxAttempts}, status:`, data.status) + console.log(`📋 ImageGen: Full response data:`, JSON.stringify(data, null, 2)) + + if (data.status === 'COMPLETED') { + console.log('✅ ImageGen: Job completed, processing output...') + + // Extract image URL from various possible response formats + let imageUrl = '' + + // Check if output exists at all + if (!data.output) { + // Only retry 2-3 times, then proceed to check alternatives + if (attempt < 3) { + console.log(`âŗ ImageGen: COMPLETED but no output yet, waiting briefly (attempt ${attempt + 1}/3)...`) + await new Promise(resolve => setTimeout(resolve, 500)) + continue + } + + // Try alternative ways to get the output - maybe it's at the top level + console.log('âš ī¸ ImageGen: No output field found, checking for alternative response formats...') + console.log('📋 ImageGen: All available fields:', Object.keys(data)) + + // Check if image data is at top level + if (data.image) { + imageUrl = data.image + console.log('✅ ImageGen: Found image at top level') + } else if (data.url) { + imageUrl = data.url + console.log('✅ ImageGen: Found url at top level') + } else if (data.result) { + // Some endpoints return result instead of output + if (typeof data.result === 'string') { + imageUrl = data.result + } else if (data.result.image) { + imageUrl = data.result.image + } else if (data.result.url) { + imageUrl = data.result.url + } + console.log('✅ ImageGen: Found result field') + } else { + // Last resort: try to fetch output via stream endpoint (some RunPod endpoints use this) + console.log('âš ī¸ ImageGen: Trying alternative endpoint to retrieve output...') + try { + const streamUrl = `https://api.runpod.ai/v2/${endpointId}/stream/${jobId}` + const streamResponse = await fetch(streamUrl, { + method: 'GET', + headers: { + 'Authorization': `Bearer ${apiKey}` + } + }) + + if (streamResponse.ok) { + const streamData = await streamResponse.json() as RunPodJobResponse + console.log('đŸ“Ĩ ImageGen: Stream endpoint response:', JSON.stringify(streamData, null, 2)) + + if (streamData.output) { + if (typeof streamData.output === 'string') { + imageUrl = streamData.output + } else if (streamData.output.image) { + imageUrl = streamData.output.image + } else if (streamData.output.url) { + imageUrl = streamData.output.url + } else if (Array.isArray(streamData.output.images) && streamData.output.images.length > 0) { + const firstImage = streamData.output.images[0] + if (firstImage.data) { + imageUrl = firstImage.data.startsWith('data:') ? firstImage.data : `data:image/${firstImage.type || 'png'};base64,${firstImage.data}` + } else if (firstImage.url) { + imageUrl = firstImage.url + } + } + + if (imageUrl) { + console.log('✅ ImageGen: Found image URL via stream endpoint') + return imageUrl + } + } + } + } catch (streamError) { + console.log('âš ī¸ ImageGen: Stream endpoint not available or failed:', streamError) + } + + console.error('❌ ImageGen: Job completed but no output field in response after retries:', JSON.stringify(data, null, 2)) + throw new Error( + 'Job completed but no output data found.\n\n' + + 'Possible issues:\n' + + '1. The RunPod endpoint handler may not be returning output correctly\n' + + '2. Check the endpoint handler logs in RunPod console\n' + + '3. 
Verify the handler returns: { output: { image: "url" } } or { output: "url" }\n' + + '4. For ComfyUI workers, ensure output.images array is returned\n' + + '5. The endpoint may need to be reconfigured\n\n' + + 'Response received: ' + JSON.stringify(data, null, 2) + ) + } + } else { + // Extract image URL from various possible response formats + if (typeof data.output === 'string') { + imageUrl = data.output + } else if (data.output?.image) { + imageUrl = data.output.image + } else if (data.output?.url) { + imageUrl = data.output.url + } else if (data.output?.output) { + // Handle nested output structure + if (typeof data.output.output === 'string') { + imageUrl = data.output.output + } else if (data.output.output?.image) { + imageUrl = data.output.output.image + } else if (data.output.output?.url) { + imageUrl = data.output.output.url + } + } else if (Array.isArray(data.output) && data.output.length > 0) { + // Handle array responses + const firstItem = data.output[0] + if (typeof firstItem === 'string') { + imageUrl = firstItem + } else if (firstItem.image) { + imageUrl = firstItem.image + } else if (firstItem.url) { + imageUrl = firstItem.url + } + } else if (data.output?.result) { + // Some formats nest result inside output + if (typeof data.output.result === 'string') { + imageUrl = data.output.result + } else if (data.output.result?.image) { + imageUrl = data.output.result.image + } else if (data.output.result?.url) { + imageUrl = data.output.result.url + } + } else if (Array.isArray(data.output?.images) && data.output.images.length > 0) { + // ComfyUI worker format: { output: { images: [{ filename, type, data }] } } + const firstImage = data.output.images[0] + if (firstImage.data) { + // Base64 encoded image + if (firstImage.data.startsWith('data:image')) { + imageUrl = firstImage.data + } else if (firstImage.data.startsWith('http')) { + imageUrl = firstImage.data + } else { + // Assume base64 without prefix + imageUrl = `data:image/${firstImage.type || 'png'};base64,${firstImage.data}` + } + console.log('✅ ImageGen: Found image in ComfyUI format (images array)') + } else if (firstImage.url) { + imageUrl = firstImage.url + console.log('✅ ImageGen: Found image URL in ComfyUI format') + } else if (firstImage.filename) { + // Try to construct URL from filename (may need endpoint-specific handling) + console.log('âš ī¸ ImageGen: Found filename but no URL, filename:', firstImage.filename) + } + } + } + + if (!imageUrl || imageUrl.trim() === '') { + console.error('❌ ImageGen: No image URL found in response:', JSON.stringify(data, null, 2)) + throw new Error( + 'Job completed but no image URL found in output.\n\n' + + 'Expected formats:\n' + + '- { output: "https://..." }\n' + + '- { output: { image: "https://..." } }\n' + + '- { output: { url: "https://..." } }\n' + + '- { output: ["https://..."] }\n\n' + + 'Received: ' + JSON.stringify(data, null, 2) + ) + } + + return imageUrl + } + + if (data.status === 'FAILED') { + console.error('❌ ImageGen: Job failed:', data.error || 'Unknown error') + throw new Error(`Job failed: ${data.error || 'Unknown error'}`) + } + + // Wait before next poll + await new Promise(resolve => setTimeout(resolve, pollInterval)) + } catch (error) { + // If we get COMPLETED status without output, don't retry - fail immediately + const errorMessage = error instanceof Error ? 
error.message : String(error)
+      if (errorMessage.includes('no output') || errorMessage.includes('no image URL')) {
+        console.error('❌ ImageGen: Stopping polling due to missing output data')
+        throw error
+      }
+
+      // For other errors, retry up to maxAttempts
+      if (attempt === maxAttempts - 1) {
+        throw error
+      }
+      await new Promise(resolve => setTimeout(resolve, pollInterval))
+    }
+  }
+
+  throw new Error('Job polling timed out')
+}
+
+export class ImageGenShape extends BaseBoxShapeUtil<IImageGen> {
+  static override type = "ImageGen" as const
+
+  MIN_WIDTH = 300 as const
+  MIN_HEIGHT = 300 as const
+  DEFAULT_WIDTH = 400 as const
+  DEFAULT_HEIGHT = 400 as const
+
+  getDefaultProps(): IImageGen["props"] {
+    return {
+      w: this.DEFAULT_WIDTH,
+      h: this.DEFAULT_HEIGHT,
+      prompt: "",
+      imageUrl: null,
+      isLoading: false,
+      error: null,
+    }
+  }
+
+  getGeometry(shape: IImageGen): Geometry2d {
+    return new Rectangle2d({
+      width: shape.props.w,
+      height: shape.props.h,
+      isFilled: true,
+    })
+  }
+
+  component(shape: IImageGen) {
+    const [isHovering, setIsHovering] = useState(false)
+    const isSelected = this.editor.getSelectedShapeIds().includes(shape.id)
+
+    const generateImage = async (prompt: string) => {
+      console.log("🎨 ImageGen: Generating image with prompt:", prompt)
+
+      // Clear any previous errors
+      this.editor.updateShape({
+        id: shape.id,
+        type: "ImageGen",
+        props: {
+          error: null,
+          isLoading: true,
+          imageUrl: null
+        },
+      })
+
+      try {
+        // Get RunPod configuration
+        const runpodConfig = getRunPodConfig()
+        const endpointId = shape.props.endpointId || runpodConfig?.endpointId || "tzf1j3sc3zufsy"
+        const apiKey = runpodConfig?.apiKey
+
+        // Mock API mode: Return placeholder image without calling RunPod
+        if (USE_MOCK_API) {
+          console.log("🎭 ImageGen: Using MOCK API mode (no real RunPod call)")
+          console.log("🎨 ImageGen: Mock prompt:", prompt)
+
+          // Simulate API delay
+          await new Promise(resolve => setTimeout(resolve, 1500))
+
+          // Use a placeholder image service
+          const mockImageUrl = `https://via.placeholder.com/512x512/4F46E5/FFFFFF?text=${encodeURIComponent(prompt.substring(0, 30))}`
+
+          console.log("✅ ImageGen: Mock image generated:", mockImageUrl)
+
+          this.editor.updateShape({
+            id: shape.id,
+            type: "ImageGen",
+            props: {
+              imageUrl: mockImageUrl,
+              isLoading: false,
+              error: null
+            },
+          })
+
+          return
+        }
+
+        // Real API mode: Use RunPod
+        if (!apiKey) {
+          throw new Error("RunPod API key not configured. Please set VITE_RUNPOD_API_KEY environment variable.")
+        }
+
+        const url = `https://api.runpod.ai/v2/${endpointId}/run`
+
+        console.log("📤 ImageGen: Sending request to:", url)
+
+        const response = await fetch(url, {
+          method: "POST",
+          headers: {
+            "Content-Type": "application/json",
+            "Authorization": `Bearer ${apiKey}`
+          },
+          body: JSON.stringify({
+            input: {
+              prompt: prompt
+            }
+          })
+        })
+
+        if (!response.ok) {
+          const errorText = await response.text()
+          console.error("❌ ImageGen: Error response:", errorText)
+          throw new Error(`HTTP error! 
status: ${response.status} - ${errorText}`) + } + + const data = await response.json() as RunPodJobResponse + console.log("đŸ“Ĩ ImageGen: Response data:", JSON.stringify(data, null, 2)) + + // Handle async job pattern (RunPod often returns job IDs) + if (data.id && (data.status === 'IN_QUEUE' || data.status === 'IN_PROGRESS' || data.status === 'STARTING')) { + console.log("âŗ ImageGen: Job queued/in progress, polling job ID:", data.id) + const imageUrl = await pollRunPodJob(data.id, apiKey, endpointId) + console.log("✅ ImageGen: Job completed, image URL:", imageUrl) + + this.editor.updateShape({ + id: shape.id, + type: "ImageGen", + props: { + imageUrl: imageUrl, + isLoading: false, + error: null + }, + }) + } else if (data.output) { + // Handle direct response + let imageUrl = '' + if (typeof data.output === 'string') { + imageUrl = data.output + } else if (data.output.image) { + imageUrl = data.output.image + } else if (data.output.url) { + imageUrl = data.output.url + } else if (Array.isArray(data.output) && data.output.length > 0) { + const firstItem = data.output[0] + if (typeof firstItem === 'string') { + imageUrl = firstItem + } else if (firstItem.image) { + imageUrl = firstItem.image + } else if (firstItem.url) { + imageUrl = firstItem.url + } + } + + if (imageUrl) { + this.editor.updateShape({ + id: shape.id, + type: "ImageGen", + props: { + imageUrl: imageUrl, + isLoading: false, + error: null + }, + }) + } else { + throw new Error("No image URL found in response") + } + } else if (data.error) { + throw new Error(`RunPod API error: ${data.error}`) + } else { + throw new Error("No valid response from RunPod API") + } + } catch (error) { + const errorMessage = error instanceof Error ? error.message : String(error) + console.error("❌ ImageGen: Error:", errorMessage) + + let userFriendlyError = '' + + if (errorMessage.includes('API key not configured')) { + userFriendlyError = '❌ RunPod API key not configured. Please set VITE_RUNPOD_API_KEY environment variable.' + } else if (errorMessage.includes('401') || errorMessage.includes('403') || errorMessage.includes('Unauthorized')) { + userFriendlyError = '❌ API key authentication failed. Please check your RunPod API key.' + } else if (errorMessage.includes('404')) { + userFriendlyError = '❌ Endpoint not found. Please check your endpoint ID.' + } else if (errorMessage.includes('no output data found') || errorMessage.includes('no image URL found')) { + // For multi-line error messages, show a concise version in the UI + // The full details are already in the console + userFriendlyError = '❌ Image generation completed but no image data was returned.\n\n' + + 'This usually means the RunPod endpoint handler is not configured correctly.\n\n' + + 'Please check:\n' + + '1. RunPod endpoint handler logs\n' + + '2. Handler returns: { output: { image: "url" } }\n' + + '3. 
See browser console for full details' + } else { + // Truncate very long error messages for UI display + const maxLength = 500 + if (errorMessage.length > maxLength) { + userFriendlyError = `❌ Error: ${errorMessage.substring(0, maxLength)}...\n\n(Full error in console)` + } else { + userFriendlyError = `❌ Error: ${errorMessage}` + } + } + + this.editor.updateShape({ + id: shape.id, + type: "ImageGen", + props: { + isLoading: false, + error: userFriendlyError + }, + }) + } + } + + const handleGenerate = () => { + if (shape.props.prompt.trim() && !shape.props.isLoading) { + generateImage(shape.props.prompt) + this.editor.updateShape({ + id: shape.id, + type: "ImageGen", + props: { prompt: "" }, + }) + } + } + + return ( + setIsHovering(true)} + onPointerLeave={() => setIsHovering(false)} + > + {/* Error Display */} + {shape.props.error && ( +
+ âš ī¸ + {shape.props.error} + +
+ )} + + {/* Image Display */} + {shape.props.imageUrl && !shape.props.isLoading && ( +
+ {shape.props.prompt { + console.error("❌ ImageGen: Failed to load image:", shape.props.imageUrl) + this.editor.updateShape({ + id: shape.id, + type: "ImageGen", + props: { + error: "Failed to load generated image", + imageUrl: null + }, + }) + }} + /> +
+ )} + + {/* Loading State */} + {shape.props.isLoading && ( +
+
+ + Generating image... + +
+ )} + + {/* Empty State */} + {!shape.props.imageUrl && !shape.props.isLoading && ( +
+ Generated image will appear here +
+ )} + + {/* Input Section */} +
+ { + this.editor.updateShape({ + id: shape.id, + type: "ImageGen", + props: { prompt: e.target.value }, + }) + }} + onKeyDown={(e) => { + e.stopPropagation() + if (e.key === 'Enter' && !e.shiftKey) { + e.preventDefault() + if (shape.props.prompt.trim() && !shape.props.isLoading) { + handleGenerate() + } + } + }} + onPointerDown={(e) => { + e.stopPropagation() + }} + onClick={(e) => { + e.stopPropagation() + }} + disabled={shape.props.isLoading} + /> + +
+ + {/* Add CSS for spinner animation */} + + + ) + } + + override indicator(shape: IImageGen) { + return ( + + ) + } +} + diff --git a/src/tools/ImageGenTool.ts b/src/tools/ImageGenTool.ts new file mode 100644 index 0000000..7248a14 --- /dev/null +++ b/src/tools/ImageGenTool.ts @@ -0,0 +1,14 @@ +import { BaseBoxShapeTool, TLEventHandlers } from 'tldraw' + +export class ImageGenTool extends BaseBoxShapeTool { + static override id = 'ImageGen' + static override initial = 'idle' + override shapeType = 'ImageGen' + + override onComplete: TLEventHandlers["onComplete"] = () => { + console.log('🎨 ImageGenTool: Shape creation completed') + this.editor.setCurrentTool('select') + } +} + + diff --git a/src/ui/CustomContextMenu.tsx b/src/ui/CustomContextMenu.tsx index b636ba5..a223d60 100644 --- a/src/ui/CustomContextMenu.tsx +++ b/src/ui/CustomContextMenu.tsx @@ -238,6 +238,7 @@ export function CustomContextMenu(props: TLUiContextMenuProps) { + {/* Collections Group */} diff --git a/src/ui/CustomMainMenu.tsx b/src/ui/CustomMainMenu.tsx index 899254b..2f0bd1b 100644 --- a/src/ui/CustomMainMenu.tsx +++ b/src/ui/CustomMainMenu.tsx @@ -29,7 +29,7 @@ export function CustomMainMenu() { const validateAndNormalizeShapeType = (shape: any): string => { if (!shape || !shape.type) return 'text' - const validCustomShapes = ['ObsNote', 'VideoChat', 'Transcription', 'Prompt', 'ChatBox', 'Embed', 'Markdown', 'MycrozineTemplate', 'Slide', 'Holon', 'ObsidianBrowser', 'HolonBrowser', 'FathomMeetingsBrowser', 'LocationShare'] + const validCustomShapes = ['ObsNote', 'VideoChat', 'Transcription', 'Prompt', 'ChatBox', 'Embed', 'Markdown', 'MycrozineTemplate', 'Slide', 'Holon', 'ObsidianBrowser', 'HolonBrowser', 'FathomMeetingsBrowser', 'LocationShare', 'ImageGen'] const validDefaultShapes = ['arrow', 'bookmark', 'draw', 'embed', 'frame', 'geo', 'group', 'highlight', 'image', 'line', 'note', 'text', 'video'] const allValidShapes = [...validCustomShapes, ...validDefaultShapes] diff --git a/src/ui/components.tsx b/src/ui/components.tsx index 04c9cf1..c09460c 100644 --- a/src/ui/components.tsx +++ b/src/ui/components.tsx @@ -33,6 +33,7 @@ export const components: TLComponents = { tools["Transcription"], tools["Holon"], tools["FathomMeetings"], + tools["ImageGen"], ].filter(tool => tool && tool.kbd) // Get all custom actions with keyboard shortcuts diff --git a/src/ui/overrides.tsx b/src/ui/overrides.tsx index 185fc2f..57bbaee 100644 --- a/src/ui/overrides.tsx +++ b/src/ui/overrides.tsx @@ -196,6 +196,15 @@ export const overrides: TLUiOverrides = { // Shape creation is handled manually in FathomMeetingsTool.onPointerDown onSelect: () => editor.setCurrentTool("fathom-meetings"), }, + ImageGen: { + id: "ImageGen", + icon: "image", + label: "Image Generation", + kbd: "alt+i", + readonlyOk: true, + type: "ImageGen", + onSelect: () => editor.setCurrentTool("ImageGen"), + }, hand: { ...tools.hand, onDoubleClick: (info: any) => { diff --git a/src/utils/llmUtils.ts b/src/utils/llmUtils.ts index 2533e39..56b0fef 100644 --- a/src/utils/llmUtils.ts +++ b/src/utils/llmUtils.ts @@ -1,6 +1,7 @@ import OpenAI from "openai"; import Anthropic from "@anthropic-ai/sdk"; import { makeRealSettings, AI_PERSONALITIES } from "@/lib/settings"; +import { getRunPodConfig } from "@/lib/clientConfig"; export async function llm( userPrompt: string, @@ -59,7 +60,12 @@ export async function llm( availableProviders.map(p => `${p.provider} (${p.model})`).join(', ')); if (availableProviders.length === 0) { - throw new Error("No valid API key found for any 
provider") + const runpodConfig = getRunPodConfig(); + if (runpodConfig && runpodConfig.apiKey && runpodConfig.endpointId) { + // RunPod should have been added, but if not, try one more time + console.log('âš ī¸ No user API keys found, but RunPod is configured - this should not happen'); + } + throw new Error("No valid API key found for any provider. Please configure API keys in settings or set up RunPod environment variables (VITE_RUNPOD_API_KEY and VITE_RUNPOD_ENDPOINT_ID).") } // Try each provider/key combination in order until one succeeds @@ -76,13 +82,14 @@ export async function llm( 'claude-3-haiku-20240307', ]; - for (const { provider, apiKey, model } of availableProviders) { + for (const providerInfo of availableProviders) { + const { provider, apiKey, model, endpointId } = providerInfo as any; try { console.log(`🔄 Attempting to use ${provider} API (${model})...`); attemptedProviders.push(`${provider} (${model})`); // Add retry logic for temporary failures - await callProviderAPIWithRetry(provider, apiKey, model, userPrompt, onToken, settings); + await callProviderAPIWithRetry(provider, apiKey, model, userPrompt, onToken, settings, endpointId); console.log(`✅ Successfully used ${provider} API (${model})`); return; // Success, exit the function } catch (error) { @@ -100,7 +107,9 @@ export async function llm( try { console.log(`🔄 Trying fallback model: ${fallbackModel}...`); attemptedProviders.push(`${provider} (${fallbackModel})`); - await callProviderAPIWithRetry(provider, apiKey, fallbackModel, userPrompt, onToken, settings); + const providerInfo = availableProviders.find(p => p.provider === provider); + const endpointId = (providerInfo as any)?.endpointId; + await callProviderAPIWithRetry(provider, apiKey, fallbackModel, userPrompt, onToken, settings, endpointId); console.log(`✅ Successfully used ${provider} API with fallback model ${fallbackModel}`); fallbackSucceeded = true; return; // Success, exit the function @@ -142,13 +151,17 @@ function getAvailableProviders(availableKeys: Record, settings: const providers = []; // Helper to add a provider key if valid - const addProviderKey = (provider: string, apiKey: string, model?: string) => { + const addProviderKey = (provider: string, apiKey: string, model?: string, endpointId?: string) => { if (isValidApiKey(provider, apiKey) && !isApiKeyInvalid(provider, apiKey)) { - providers.push({ + const providerInfo: any = { provider: provider, apiKey: apiKey, model: model || settings.models[provider] || getDefaultModel(provider) - }); + }; + if (endpointId) { + providerInfo.endpointId = endpointId; + } + providers.push(providerInfo); return true; } else if (isApiKeyInvalid(provider, apiKey)) { console.log(`â­ī¸ Skipping ${provider} API key (marked as invalid)`); @@ -156,6 +169,20 @@ function getAvailableProviders(availableKeys: Record, settings: return false; }; + // PRIORITY 1: Check for RunPod configuration from environment variables FIRST + // RunPod takes priority over user-configured keys + const runpodConfig = getRunPodConfig(); + if (runpodConfig && runpodConfig.apiKey && runpodConfig.endpointId) { + console.log('🔑 Found RunPod configuration from environment variables - using as primary AI provider'); + providers.push({ + provider: 'runpod', + apiKey: runpodConfig.apiKey, + endpointId: runpodConfig.endpointId, + model: 'default' // RunPod doesn't use model selection in the same way + }); + } + + // PRIORITY 2: Then add user-configured keys (they will be tried after RunPod) // First, try the preferred provider - support multiple 
+
+  // PRIORITY 2: Then add user-configured keys (they will be tried after RunPod)
   // First, try the preferred provider - support multiple keys if stored as comma-separated
   if (settings.provider && availableKeys[settings.provider]) {
     const keyValue = availableKeys[settings.provider];
@@ -239,8 +266,10 @@ function getAvailableProviders(availableKeys: Record<string, string>, settings:
   }
 
   // Additional fallback: Check for user-specific API keys from profile dashboard
-  if (providers.length === 0) {
-    providers.push(...getUserSpecificApiKeys());
+  // These will be tried after RunPod (if RunPod was added)
+  const userSpecificKeys = getUserSpecificApiKeys();
+  if (userSpecificKeys.length > 0) {
+    providers.push(...userSpecificKeys);
   }
 
   return providers;
@@ -372,13 +401,14 @@ async function callProviderAPIWithRetry(
   userPrompt: string,
   onToken: (partialResponse: string, done?: boolean) => void,
   settings?: any,
+  endpointId?: string,
   maxRetries: number = 2
 ) {
   let lastError: Error | null = null;
 
   for (let attempt = 1; attempt <= maxRetries; attempt++) {
     try {
-      await callProviderAPI(provider, apiKey, model, userPrompt, onToken, settings);
+      await callProviderAPI(provider, apiKey, model, userPrompt, onToken, settings, endpointId);
       return; // Success
     } catch (error) {
       lastError = error as Error;
@@ -471,12 +501,226 @@ async function callProviderAPI(
   model: string,
   userPrompt: string,
   onToken: (partialResponse: string, done?: boolean) => void,
-  settings?: any
+  settings?: any,
+  endpointId?: string
 ) {
   let partial = "";
   const systemPrompt = settings ? getSystemPrompt(settings) : 'You are a helpful assistant.';
 
-  if (provider === 'openai') {
+  if (provider === 'runpod') {
+    // RunPod API integration - uses environment variables for automatic setup
+    // Get endpointId from the parameter or from config
+    let runpodEndpointId = endpointId;
+    if (!runpodEndpointId) {
+      const runpodConfig = getRunPodConfig();
+      if (runpodConfig) {
+        runpodEndpointId = runpodConfig.endpointId;
+      }
+    }
+
+    if (!runpodEndpointId) {
+      throw new Error('RunPod endpoint ID not configured');
+    }
+
+    // Try /runsync first for synchronous execution (returns output immediately)
+    // Fall back to /run + polling if /runsync is not available
+    const syncUrl = `https://api.runpod.ai/v2/${runpodEndpointId}/runsync`;
+    const asyncUrl = `https://api.runpod.ai/v2/${runpodEndpointId}/run`;
+
+    // vLLM endpoints typically expect the OpenAI-compatible format with a messages array,
+    // but some endpoints accept a simple prompt string instead.
+    // Try the OpenAI-compatible format first, as it is more standard for vLLM.
+    const messages = [];
+    if (systemPrompt) {
+      messages.push({ role: 'system', content: systemPrompt });
+    }
+    messages.push({ role: 'user', content: userPrompt });
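+
+    // Shape of the resulting request body (assuming a vLLM-style serverless worker;
+    // other workers may expect e.g. { input: { prompt: "..." } } instead):
+    //   { "input": { "messages": [ { "role": "system", "content": "..." },
+    //                              { "role": "user",   "content": "..." } ],
+    //                "stream": false } }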
+
+    // Combine system prompt and user prompt for the simple prompt format (fallback)
+    const fullPrompt = systemPrompt ? `${systemPrompt}\n\nUser: ${userPrompt}` : userPrompt;
+
+    const requestBody = {
+      input: {
+        messages: messages,
+        stream: false // vLLM can handle streaming, but we process the response synchronously for now
+      }
+    };
+
+    console.log('📤 RunPod API: Trying synchronous endpoint first:', syncUrl);
+    console.log('📤 RunPod API: Using OpenAI-compatible messages format');
+
+    try {
+      // First, try the synchronous endpoint (/runsync) - this returns output immediately
+      try {
+        const syncResponse = await fetch(syncUrl, {
+          method: 'POST',
+          headers: {
+            'Content-Type': 'application/json',
+            'Authorization': `Bearer ${apiKey}`
+          },
+          body: JSON.stringify(requestBody)
+        });
+
+        if (syncResponse.ok) {
+          const syncData = await syncResponse.json();
+          console.log('📥 RunPod API: Synchronous response:', JSON.stringify(syncData, null, 2));
+
+          // Check if we got output directly
+          if (syncData.output) {
+            let responseText = '';
+            if (syncData.output.choices && Array.isArray(syncData.output.choices)) {
+              const choice = syncData.output.choices[0];
+              if (choice && choice.message && choice.message.content) {
+                responseText = choice.message.content;
+              }
+            } else if (typeof syncData.output === 'string') {
+              responseText = syncData.output;
+            } else if (syncData.output.text) {
+              responseText = syncData.output.text;
+            } else if (syncData.output.response) {
+              responseText = syncData.output.response;
+            }
+
+            if (responseText) {
+              console.log('✅ RunPod API: Got output from synchronous endpoint, length:', responseText.length);
+              // Stream the response character by character to simulate streaming
+              for (let i = 0; i < responseText.length; i++) {
+                partial += responseText[i];
+                onToken(partial, false);
+                await new Promise(resolve => setTimeout(resolve, 10));
+              }
+              onToken(partial, true);
+              return;
+            }
+          }
+
+          // If the sync endpoint returned a job ID, fall through to async polling
+          if (syncData.id && (syncData.status === 'IN_QUEUE' || syncData.status === 'IN_PROGRESS')) {
+            console.log('⏳ RunPod API: Sync endpoint returned job ID, polling:', syncData.id);
+            const result = await pollRunPodJob(syncData.id, apiKey, runpodEndpointId);
+            console.log('✅ RunPod API: Job completed, result length:', result.length);
+            partial = result;
+            onToken(partial, true);
+            return;
+          }
+        }
+      } catch (syncError) {
+        console.log('⚠️ RunPod API: Synchronous endpoint not available, trying async:', syncError);
+      }
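+      // /run replies immediately with { "id": "...", "status": "IN_QUEUE" }; the actual
+      // output must then be fetched by polling GET /v2/{endpointId}/status/{jobId}
+      // (see pollRunPodJob below).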
+      // Fall back to the async endpoint (/run) if sync didn't work
+      console.log('📤 RunPod API: Using async endpoint:', asyncUrl);
+      const response = await fetch(asyncUrl, {
+        method: 'POST',
+        headers: {
+          'Content-Type': 'application/json',
+          'Authorization': `Bearer ${apiKey}`
+        },
+        body: JSON.stringify(requestBody)
+      });
+
+      console.log('📥 RunPod API: Response status:', response.status, response.statusText);
+
+      if (!response.ok) {
+        const errorText = await response.text();
+        console.error('❌ RunPod API: Error response:', errorText);
+        throw new Error(`RunPod API error: ${response.status} - ${errorText}`);
+      }
+
+      const data = await response.json();
+      console.log('📥 RunPod API: Response data:', JSON.stringify(data, null, 2));
+
+      // Handle the async job pattern (RunPod often returns job IDs)
+      if (data.id && (data.status === 'IN_QUEUE' || data.status === 'IN_PROGRESS')) {
+        console.log('⏳ RunPod API: Job queued/in progress, polling job ID:', data.id);
+        const result = await pollRunPodJob(data.id, apiKey, runpodEndpointId);
+        console.log('✅ RunPod API: Job completed, result length:', result.length);
+        partial = result;
+        onToken(partial, true);
+        return;
+      }
+
+      // Handle OpenAI-compatible response format (vLLM endpoints)
+      if (data.output && data.output.choices && Array.isArray(data.output.choices)) {
+        console.log('📥 RunPod API: Detected OpenAI-compatible response format');
+        const choice = data.output.choices[0];
+        if (choice && choice.message && choice.message.content) {
+          const responseText = choice.message.content;
+          console.log('✅ RunPod API: Extracted content from OpenAI-compatible format, length:', responseText.length);
+
+          // Stream the response character by character to simulate streaming
+          for (let i = 0; i < responseText.length; i++) {
+            partial += responseText[i];
+            onToken(partial, false);
+            // Small delay to simulate streaming
+            await new Promise(resolve => setTimeout(resolve, 10));
+          }
+          onToken(partial, true);
+          return;
+        }
+      }
+
+      // Handle a direct response
+      if (data.output) {
+        console.log('📥 RunPod API: Processing output:', typeof data.output, Array.isArray(data.output) ? 'array' : 'object');
+        // Try to extract text from the various possible response formats
+        let responseText = '';
+        if (typeof data.output === 'string') {
+          responseText = data.output;
+          console.log('✅ RunPod API: Extracted string output, length:', responseText.length);
+        } else if (data.output.text) {
+          responseText = data.output.text;
+          console.log('✅ RunPod API: Extracted text from output.text, length:', responseText.length);
+        } else if (data.output.response) {
+          responseText = data.output.response;
+          console.log('✅ RunPod API: Extracted response from output.response, length:', responseText.length);
+        } else if (data.output.content) {
+          responseText = data.output.content;
+          console.log('✅ RunPod API: Extracted content from output.content, length:', responseText.length);
+        } else if (Array.isArray(data.output.segments)) {
+          responseText = data.output.segments.map((seg: any) => seg.text || seg).join(' ');
+          console.log('✅ RunPod API: Extracted text from segments, length:', responseText.length);
+        } else {
+          // Fallback: stringify the output
+          console.warn('⚠️ RunPod API: Unknown output format, stringifying:', Object.keys(data.output));
+          responseText = JSON.stringify(data.output);
+        }
+
+        // Stream the response character by character to simulate streaming
+        for (let i = 0; i < responseText.length; i++) {
+          partial += responseText[i];
+          onToken(partial, false);
+          // Small delay to simulate streaming
+          await new Promise(resolve => setTimeout(resolve, 10));
+        }
+        onToken(partial, true);
+        return;
+      }
+
+      // Handle an error response
+      if (data.error) {
+        console.error('❌ RunPod API: Error in response:', data.error);
+        throw new Error(`RunPod API error: ${data.error}`);
+      }
+
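+      // Serverless workers scale to zero when idle, so the first request can arrive while
+      // a worker is still booting; treat that as a retryable condition rather than a failure.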
+      // Check for status messages that might indicate the endpoint is starting up
+      if (data.status) {
+        console.log('ℹ️ RunPod API: Response status:', data.status);
+        if (data.status === 'STARTING' || data.status === 'PENDING') {
+          console.log('⏳ RunPod API: Endpoint appears to be starting up, this may take a moment...');
+          // Wait a bit and retry
+          await new Promise(resolve => setTimeout(resolve, 2000));
+          throw new Error('RunPod endpoint is starting up. Please wait a moment and try again.');
+        }
+      }
+
+      console.error('❌ RunPod API: No valid response format detected. Full response:', JSON.stringify(data, null, 2));
+      throw new Error('No valid response from RunPod API');
+    } catch (error) {
+      console.error('❌ RunPod API error:', error);
+      throw error;
+    }
+  } else if (provider === 'openai') {
     const openai = new OpenAI({
       apiKey,
       dangerouslyAllowBrowser: true,
@@ -556,6 +800,185 @@ async function callProviderAPI(
   onToken(partial, true);
 }
 
+// Helper function to poll RunPod job status until completion
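+// Job lifecycle: IN_QUEUE -> IN_PROGRESS -> COMPLETED | FAILED. With the defaults below
+// (60 attempts at a 1000 ms interval) polling gives up after roughly one minute.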
+async function pollRunPodJob(
+  jobId: string,
+  apiKey: string,
+  endpointId: string,
+  maxAttempts: number = 60,
+  pollInterval: number = 1000
+): Promise<string> {
+  const statusUrl = `https://api.runpod.ai/v2/${endpointId}/status/${jobId}`;
+  console.log('🔄 RunPod API: Starting to poll job:', jobId);
+
+  for (let attempt = 0; attempt < maxAttempts; attempt++) {
+    try {
+      const response = await fetch(statusUrl, {
+        method: 'GET',
+        headers: {
+          'Authorization': `Bearer ${apiKey}`
+        }
+      });
+
+      if (!response.ok) {
+        const errorText = await response.text();
+        console.error(`❌ RunPod API: Poll error (attempt ${attempt + 1}/${maxAttempts}):`, response.status, errorText);
+        throw new Error(`Failed to check job status: ${response.status} - ${errorText}`);
+      }
+
+      const data = await response.json();
+      console.log(`🔄 RunPod API: Poll attempt ${attempt + 1}/${maxAttempts}, status:`, data.status);
+      console.log(`📥 RunPod API: Full poll response:`, JSON.stringify(data, null, 2));
+
+      if (data.status === 'COMPLETED') {
+        console.log('✅ RunPod API: Job completed, processing output...');
+        console.log('📥 RunPod API: Output structure:', typeof data.output, data.output ? Object.keys(data.output) : 'null');
+        console.log('📥 RunPod API: Full data object keys:', Object.keys(data));
+
+        // If there is no output after a couple of retries, try the stream endpoint as a fallback
+        if (!data.output) {
+          if (attempt < 3) {
+            // Only retry 2-3 times, then try the stream endpoint
+            console.log(`⏳ RunPod API: COMPLETED but no output yet, waiting briefly (attempt ${attempt + 1}/3)...`);
+            await new Promise(resolve => setTimeout(resolve, 500));
+            continue;
+          }
+
+          // After a few retries, try the stream endpoint as a fallback
+          console.log('⚠️ RunPod API: Status endpoint not returning output, trying stream endpoint...');
+          try {
+            const streamUrl = `https://api.runpod.ai/v2/${endpointId}/stream/${jobId}`;
+            const streamResponse = await fetch(streamUrl, {
+              method: 'GET',
+              headers: {
+                'Authorization': `Bearer ${apiKey}`
+              }
+            });
+
+            if (streamResponse.ok) {
+              const streamData = await streamResponse.json();
+              console.log('📥 RunPod API: Stream endpoint response:', JSON.stringify(streamData, null, 2));
+
+              if (streamData.output) {
+                // Use the stream endpoint output
+                data.output = streamData.output;
+                console.log('✅ RunPod API: Found output via stream endpoint');
+              } else if (streamData.choices && Array.isArray(streamData.choices)) {
+                // Handle OpenAI-compatible format from the stream endpoint
+                data.output = { choices: streamData.choices };
+                console.log('✅ RunPod API: Found choices via stream endpoint');
+              }
+            } else {
+              console.log(`⚠️ RunPod API: Stream endpoint returned ${streamResponse.status}`);
+            }
+          } catch (streamError) {
+            console.log('⚠️ RunPod API: Stream endpoint not available or failed:', streamError);
+          }
+        }
+
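+        // Different workers wrap their result differently; the chain below accepts, in order:
+        // a bare string, { text }, { response }, { content }, OpenAI-style { choices: [...] },
+        // WhisperX-style { segments: [...] }, and arrays of any of these.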
+        // Extract text from the various possible response formats
+        let result = '';
+        if (typeof data.output === 'string') {
+          result = data.output;
+          console.log('✅ RunPod API: Extracted string output from job, length:', result.length);
+        } else if (data.output?.text) {
+          result = data.output.text;
+          console.log('✅ RunPod API: Extracted text from output.text, length:', result.length);
+        } else if (data.output?.response) {
+          result = data.output.response;
+          console.log('✅ RunPod API: Extracted response from output.response, length:', result.length);
+        } else if (data.output?.content) {
+          result = data.output.content;
+          console.log('✅ RunPod API: Extracted content from output.content, length:', result.length);
+        } else if (data.output?.choices && Array.isArray(data.output.choices)) {
+          // Handle OpenAI-compatible response format (vLLM endpoints)
+          const choice = data.output.choices[0];
+          if (choice && choice.message && choice.message.content) {
+            result = choice.message.content;
+            console.log('✅ RunPod API: Extracted content from OpenAI-compatible format, length:', result.length);
+          }
+        } else if (data.output?.segments && Array.isArray(data.output.segments)) {
+          result = data.output.segments.map((seg: any) => seg.text || seg).join(' ');
+          console.log('✅ RunPod API: Extracted text from segments, length:', result.length);
+        } else if (Array.isArray(data.output)) {
+          // Handle array responses (some vLLM endpoints return arrays)
+          result = data.output.map((item: any) => {
+            if (typeof item === 'string') return item;
+            if (item.text) return item.text;
+            if (item.response) return item.response;
+            return JSON.stringify(item);
+          }).join('\n');
+          console.log('✅ RunPod API: Extracted text from array output, length:', result.length);
+        } else if (!data.output) {
+          // No output field - check alternative structures or return empty
+          console.warn('⚠️ RunPod API: No output field found, checking alternative structures...');
+          console.log('📥 RunPod API: Full data structure:', JSON.stringify(data, null, 2));
+
+          // Try checking whether the result sits directly on data (not data.output)
+          if (typeof data === 'string') {
+            result = data;
+            console.log('✅ RunPod API: Data itself is a string, length:', result.length);
+          } else if (data.text) {
+            result = data.text;
+            console.log('✅ RunPod API: Found text at top level, length:', result.length);
+          } else if (data.response) {
+            result = data.response;
+            console.log('✅ RunPod API: Found response at top level, length:', result.length);
+          } else if (data.content) {
+            result = data.content;
+            console.log('✅ RunPod API: Found content at top level, length:', result.length);
+          } else {
+            // The stream endpoint was already tried above; just log that no output was found
+            if (attempt >= 3) {
+              console.warn('⚠️ RunPod API: Could not find output in status or stream endpoint after multiple attempts');
+            }
+
+            // If there is still no result, return an empty string instead of throwing an error,
+            // so the UI can render something instead of failing
+            if (!result) {
+              console.warn('⚠️ RunPod API: No output found in response. Returning empty result.');
+              console.log('📥 RunPod API: Available fields:', Object.keys(data));
+              result = ''; // Return empty string so the UI can render
+            }
+          }
+        }
+
+        // Return the result even if empty - don't loop forever (an empty string still lets the UI render)
+        console.log('✅ RunPod API: Returning result (may be empty):', result ? `length ${result.length}` : 'empty');
+        return result || '';
+      }
+
+      if (data.status === 'FAILED') {
+        console.error('❌ RunPod API: Job failed:', data.error || 'Unknown error');
+        throw new Error(`Job failed: ${data.error || 'Unknown error'}`);
+      }
+
+      // Check for starting/pending status
+      if (data.status === 'STARTING' || data.status === 'PENDING') {
+        console.log(`⏳ RunPod API: Endpoint still starting (attempt ${attempt + 1}/${maxAttempts})...`);
+      }
+
+      // Job still in progress, wait and retry
+      await new Promise(resolve => setTimeout(resolve, pollInterval));
+    } catch (error) {
+      if (attempt === maxAttempts - 1) {
+        throw error;
+      }
+      // Wait before retrying
+      await new Promise(resolve => setTimeout(resolve, pollInterval));
+    }
+  }
+
+  throw new Error('Job polling timeout - job did not complete in time');
+}
+
 // Auto-migration function that runs automatically
 async function autoMigrateAPIKeys() {
   try {