transcription with webspeechAPI almost there (sync problem)

parent 391e13c350
commit 122a2a1682

@@ -0,0 +1,171 @@
# Transcription Tool for Canvas

The Transcription Tool lets you transcribe audio from participants in your Canvas sessions using the Web Speech API. It provides real-time speech-to-text conversion, making it easy to capture and document conversations, presentations, and discussions.

## Features

### 🎤 Real-time Transcription
- Live speech-to-text conversion using the Web Speech API
- Support for multiple languages including English, Spanish, French, German, and more
- Continuous recording with interim and final results

### 🌐 Multi-language Support
- **English (US/UK)**: Primary language support
- **European Languages**: Spanish, French, German, Italian, Portuguese
- **Asian Languages**: Japanese, Korean, Chinese (Simplified)
- Easy language switching during recording sessions

### 👥 Participant Management
- Automatic participant detection and tracking
- Individual transcript tracking for each speaker
- Visual indicators for speaking status

### 📝 Transcript Management
- Real-time transcript display with auto-scroll
- Clear transcript functionality
- Download transcripts as text files
- Persistent storage within the Canvas session

### ⚙️ Advanced Controls
- Auto-scroll toggle for better reading experience
- Recording start/stop controls
- Error handling and status indicators
- Microphone permission management
## How to Use

### 1. Adding the Tool to Your Canvas

1. In your Canvas session, look for the **Transcribe** tool in the toolbar
2. Click on the Transcribe tool icon
3. Click and drag on the canvas to create a transcription widget
4. The widget will appear with default dimensions (400x300 pixels)

### 2. Starting a Recording Session

1. **Select Language**: Choose your preferred language from the dropdown menu
2. **Enable Auto-scroll**: Check the auto-scroll checkbox for automatic scrolling
3. **Start Recording**: Click the "🎤 Start Recording" button
4. **Grant Permissions**: Allow microphone access when prompted by your browser (see the sketch after this list for what the widget does under the hood)
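When recording starts, the widget asks the browser for microphone access before wiring up speech recognition. The snippet below is a minimal sketch of that step, mirroring the component added in this commit; the helper name is illustrative.

```typescript
// Sketch: request microphone access before starting recognition.
// The browser shows its permission prompt on the first call per origin.
async function requestMicrophone(): Promise<MediaStream | null> {
  try {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
    return stream // keep the stream so its tracks can be stopped on cleanup
  } catch (err) {
    console.error('Unable to access microphone. Please check permissions.', err)
    return null
  }
}
```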
### 3. During Recording

- **Live Transcription**: See real-time text as people speak
- **Participant Tracking**: Monitor who is speaking
- **Status Indicators**: Red dot shows active recording
- **Auto-scroll**: Transcript automatically scrolls to show latest content

### 4. Managing Your Transcript

- **Stop Recording**: Click "⏹️ Stop Recording" to end the session
- **Clear Transcript**: Use "🗑️ Clear" to reset the transcript
- **Download**: Click "💾 Download" to save as a text file (a sketch of how this can be implemented follows this list)
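The component in this commit currently offers copy-to-clipboard rather than download, so the following is only a sketch of how saving the transcript as a text file could work in the browser; the function name is illustrative.

```typescript
// Sketch: save the current transcript string as a plain-text file.
function downloadTranscript(transcript: string, filename = 'transcript.txt') {
  const blob = new Blob([transcript], { type: 'text/plain' })
  const url = URL.createObjectURL(blob)
  const link = document.createElement('a')
  link.href = url
  link.download = filename
  document.body.appendChild(link)
  link.click()
  document.body.removeChild(link)
  URL.revokeObjectURL(url)
}
```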
## Browser Compatibility

### ✅ Supported Browsers
- **Chrome/Chromium**: Full support with `webkitSpeechRecognition`
- **Edge (Chromium)**: Full support
- **Safari**: Limited support (may require additional setup)

### ❌ Unsupported Browsers
- **Firefox**: No native support for the Web Speech API
- **Internet Explorer**: No support

### 🔧 Recommended Setup
For the best experience, use **Chrome** or **Chromium-based browsers** with:
- Microphone access enabled
- HTTPS connection (required for microphone access)
- Stable internet connection
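Support can be detected at runtime before showing the recording controls. The sketch below mirrors the check performed by the component in this commit.

```typescript
// Sketch: detect a SpeechRecognition constructor, preferring the
// prefixed `webkitSpeechRecognition` variant used by Chromium browsers.
const SpeechRecognitionCtor =
  (window as any).webkitSpeechRecognition ?? (window as any).SpeechRecognition

if (!SpeechRecognitionCtor) {
  console.warn('Speech recognition not supported in this browser')
}
```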
## Technical Details

### Web Speech API Integration
The tool uses the Web Speech API's `SpeechRecognition` interface:
- **Continuous Mode**: Enables ongoing transcription
- **Interim Results**: Shows partial results in real-time
- **Language Detection**: Automatically adjusts to selected language
- **Error Handling**: Graceful fallback for unsupported features

A condensed sketch of this configuration is shown below.
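This sketch condenses how the component in this commit configures recognition and separates final from interim results; it reuses the `SpeechRecognitionCtor` from the detection sketch above.

```typescript
// Sketch: continuous recognition with interim results, splitting
// incoming results into final and interim text.
const recognition = new SpeechRecognitionCtor()
recognition.continuous = true      // keep listening across pauses
recognition.interimResults = true  // surface partial results as they arrive
recognition.lang = 'en-US'         // currently fixed to English in this commit

recognition.onresult = (event: any) => {
  let finalTranscript = ''
  let interimTranscript = ''
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const text = event.results[i][0].transcript
    if (event.results[i].isFinal) finalTranscript += text
    else interimTranscript += text
  }
  // finalTranscript is appended to the stored transcript;
  // interimTranscript is only used for live display.
}

recognition.start()
```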
### Audio Processing
- **Microphone Access**: Secure microphone permission handling
- **Audio Stream Management**: Proper cleanup of audio resources
- **Quality Optimization**: Optimized for voice recognition

### Data Persistence
- **Session Storage**: Transcripts persist during the Canvas session
- **Shape Properties**: All settings and data stored in the Canvas shape
- **Real-time Updates**: Changes sync across all participants (see the sketch after this list)
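Persistence works by writing the transcript back into the Transcribe shape's props, which the multiplayer sync layer then distributes to all participants. The sketch below follows the update call used by the component in this commit; `editor`, `shape`, `newTranscript`, and `participants` are the surrounding component values.

```typescript
// Sketch: store the latest transcript and participant list on the shape
// so they persist in the session and sync to other clients.
editor.updateShape({
  id: shape.id,
  type: shape.type,
  props: {
    ...shape.props,
    transcript: newTranscript,
    participants,
  },
})
```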
## Troubleshooting

### Common Issues

#### "Speech recognition not supported in this browser"
- **Solution**: Use Chrome or a Chromium-based browser
- **Alternative**: Check if you're using the latest browser version

#### "Unable to access microphone"
- **Solution**: Check browser permissions for microphone access
- **Alternative**: Ensure you're on an HTTPS connection

#### Poor transcription quality
- **Solutions**:
  - Speak clearly and at a moderate pace
  - Reduce background noise
  - Ensure good microphone positioning
  - Check internet connection stability

#### Language not working correctly
- **Solution**: Verify the selected language matches the spoken language
- **Alternative**: Try restarting the recording session
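For developers debugging these messages: the error strings above are surfaced through the recognition object's `onerror` callback. This is a minimal sketch of how the component in this commit reports them, where `setError` and `setIsRecording` are its React state setters.

```typescript
// Sketch: surface recognition failures (e.g. "not-allowed", "network",
// "no-speech") to the user instead of failing silently.
recognition.onerror = (event: any) => {
  console.error('Speech recognition error:', event.error)
  setError(`Recognition error: ${event.error}`)
  setIsRecording(false)
}
```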
### Performance Tips

1. **Close unnecessary tabs** to free up system resources
2. **Use a good quality microphone** for better accuracy
3. **Minimize background noise** in your environment
4. **Speak at a natural pace** - not too fast or slow
5. **Ensure stable internet connection** for optimal performance

## Future Enhancements

### Planned Features
- **Speaker Identification**: Advanced voice recognition for multiple speakers
- **Export Formats**: Support for PDF, Word, and other document formats
- **Real-time Translation**: Multi-language translation capabilities
- **Voice Commands**: Canvas control through voice commands
- **Cloud Storage**: Automatic transcript backup and sharing

### Integration Possibilities
- **Daily.co Integration**: Enhanced participant detection from video sessions
- **AI Enhancement**: Improved accuracy using machine learning
- **Collaborative Editing**: Real-time transcript editing by multiple users
- **Search and Indexing**: Full-text search within transcripts

## Support and Feedback

If you encounter issues or have suggestions for improvements:

1. **Check Browser Compatibility**: Ensure you're using a supported browser
2. **Review Permissions**: Verify microphone access is granted
3. **Check Network**: Ensure stable internet connection
4. **Report Issues**: Contact the development team with detailed error information

## Privacy and Security

### Data Handling
- **In-browser Capture**: Recognition is handled by the browser's Web Speech API; depending on the browser, audio may be processed by the browser vendor's speech service
- **No Cloud Storage**: Transcripts are not automatically uploaded to external services by this tool
- **Session Privacy**: Transcript data is only shared within your Canvas session
- **User Control**: You control when and what to record

### Best Practices
- **Inform Participants**: Let others know when recording
- **Respect Privacy**: Don't record sensitive or confidential information
- **Secure Sharing**: Be careful when sharing transcript files
- **Regular Cleanup**: Clear transcripts when no longer needed

---

*The Transcription Tool is designed to enhance collaboration and documentation in Canvas sessions. Use it responsibly and respect the privacy of all participants.*
@@ -6,7 +6,7 @@
   "scripts": {
     "dev": "concurrently --kill-others --names client,worker --prefix-colors blue,red \"npm run dev:client\" \"npm run dev:worker\"",
     "dev:client": "vite --host --port 5173",
-    "dev:worker": "wrangler dev --remote --port 5172 --ip 0.0.0.0",
+    "dev:worker": "wrangler dev --local --port 5172 --ip 0.0.0.0",
     "build": "tsc && vite build",
     "preview": "vite preview",
     "deploy": "tsc && vite build && vercel deploy --prod && wrangler deploy",
@ -0,0 +1,518 @@
|
|||
import React, { useEffect, useRef, useState, useCallback } from 'react'
|
||||
import { ITranscribeShape, TranscribeShapeUtil } from '../shapes/TranscribeShapeUtil'
|
||||
import { useEditor } from '@tldraw/tldraw'
|
||||
|
||||
interface TranscribeComponentProps {
|
||||
shape: ITranscribeShape
|
||||
util: TranscribeShapeUtil
|
||||
}
|
||||
|
||||
interface Participant {
|
||||
id: string
|
||||
name: string
|
||||
isSpeaking: boolean
|
||||
lastSpoken: string
|
||||
transcript: string
|
||||
}
|
||||
|
||||
export function TranscribeComponent({ shape }: TranscribeComponentProps) {
|
||||
const editor = useEditor()
|
||||
const [isRecording, setIsRecording] = useState(shape.props.isRecording)
|
||||
const [transcript, setTranscript] = useState(shape.props.transcript)
|
||||
const [participants, setParticipants] = useState<Participant[]>(() =>
|
||||
shape.props.participants.map(p => ({
|
||||
id: p.id,
|
||||
name: p.name,
|
||||
isSpeaking: p.isSpeaking,
|
||||
lastSpoken: p.lastSpoken,
|
||||
transcript: ''
|
||||
}))
|
||||
)
|
||||
const [isPaused, setIsPaused] = useState(false)
|
||||
const [userHasScrolled, setUserHasScrolled] = useState(false)
|
||||
const [error, setError] = useState<string | null>(null)
|
||||
const [isSupported, setIsSupported] = useState(false)
|
||||
|
||||
const transcriptRef = useRef<HTMLDivElement>(null)
|
||||
const recognitionRef = useRef<any>(null)
|
||||
const mediaStreamRef = useRef<MediaStream | null>(null)
|
||||
const localTranscriptRef = useRef<string>('')
|
||||
|
||||
// Immediate update for critical state changes (recording start/stop)
|
||||
const updateShapePropsImmediate = useCallback((updates: Partial<ITranscribeShape['props']>) => {
|
||||
try {
|
||||
// Only update if the editor is still valid and the shape exists
|
||||
const currentShape = editor.getShape(shape.id)
|
||||
if (currentShape) {
|
||||
console.log('🔄 Updating shape props immediately:', updates)
|
||||
editor.updateShape({
|
||||
id: shape.id,
|
||||
type: shape.type,
|
||||
props: {
|
||||
...shape.props,
|
||||
...updates
|
||||
}
|
||||
})
|
||||
console.log('✅ Shape props updated successfully')
|
||||
} else {
|
||||
console.log('⚠️ Shape no longer exists, skipping immediate update')
|
||||
}
|
||||
} catch (error) {
|
||||
console.error('❌ Error in immediate update:', error)
|
||||
console.error('❌ Update data:', updates)
|
||||
console.error('❌ Shape data:', shape)
|
||||
}
|
||||
}, [editor, shape])
|
||||
|
||||
// Simple transcript update strategy like other shapes use
|
||||
const updateTranscriptLocal = useCallback((newTranscript: string) => {
|
||||
console.log('📝 Updating transcript:', newTranscript.length, 'chars')
|
||||
|
||||
// Always update local state immediately for responsive UI
|
||||
localTranscriptRef.current = newTranscript
|
||||
|
||||
// Use requestAnimationFrame for smooth updates like PromptShape does
|
||||
requestAnimationFrame(() => {
|
||||
try {
|
||||
const currentShape = editor.getShape(shape.id)
|
||||
if (currentShape) {
|
||||
console.log('🔄 Updating transcript in shape:', {
|
||||
transcriptLength: newTranscript.length,
|
||||
participantsCount: participants.length,
|
||||
shapeId: shape.id
|
||||
})
|
||||
editor.updateShape({
|
||||
id: shape.id,
|
||||
type: shape.type,
|
||||
props: {
|
||||
...shape.props,
|
||||
transcript: newTranscript,
|
||||
participants: participants
|
||||
}
|
||||
})
|
||||
console.log('✅ Transcript updated successfully')
|
||||
} else {
|
||||
console.log('⚠️ Shape not found for transcript update')
|
||||
}
|
||||
} catch (error) {
|
||||
console.error('❌ Error updating transcript:', error)
|
||||
console.error('❌ Transcript data:', newTranscript.slice(0, 100) + '...')
|
||||
console.error('❌ Participants data:', participants)
|
||||
}
|
||||
})
|
||||
}, [editor, shape, participants])
|
||||
|
||||
// Check if Web Speech API is supported
|
||||
useEffect(() => {
|
||||
const checkSupport = () => {
|
||||
if ('webkitSpeechRecognition' in window) {
|
||||
setIsSupported(true)
|
||||
return (window as any).webkitSpeechRecognition
|
||||
} else if ('SpeechRecognition' in window) {
|
||||
setIsSupported(true)
|
||||
return (window as any).SpeechRecognition
|
||||
} else {
|
||||
setIsSupported(false)
|
||||
setError('Speech recognition not supported in this browser')
|
||||
return null
|
||||
}
|
||||
}
|
||||
|
||||
const SpeechRecognition = checkSupport()
|
||||
if (SpeechRecognition) {
|
||||
recognitionRef.current = new SpeechRecognition()
|
||||
setupSpeechRecognition()
|
||||
}
|
||||
}, [])
|
||||
|
||||
|
||||
|
||||
const setupSpeechRecognition = useCallback(() => {
|
||||
console.log('🔧 Setting up speech recognition...')
|
||||
if (!recognitionRef.current) {
|
||||
console.log('❌ No recognition ref available')
|
||||
return
|
||||
}
|
||||
|
||||
const recognition = recognitionRef.current
|
||||
console.log('✅ Recognition ref found, configuring...')
|
||||
|
||||
recognition.continuous = true
|
||||
recognition.interimResults = true
|
||||
recognition.lang = 'en-US' // Fixed to English
|
||||
|
||||
console.log('🔧 Recognition configured:', {
|
||||
continuous: recognition.continuous,
|
||||
interimResults: recognition.interimResults,
|
||||
lang: recognition.lang
|
||||
})
|
||||
|
||||
recognition.onstart = () => {
|
||||
console.log('🎯 Speech recognition onstart event fired')
|
||||
console.log('Setting isRecording to true')
|
||||
setIsRecording(true)
|
||||
updateShapePropsImmediate({ isRecording: true })
|
||||
console.log('✅ Recording state updated')
|
||||
}
|
||||
|
||||
recognition.onresult = (event: any) => {
|
||||
console.log('🎤 Speech recognition onresult event fired', event)
|
||||
console.log('Event details:', {
|
||||
resultIndex: event.resultIndex,
|
||||
resultsLength: event.results.length,
|
||||
hasResults: !!event.results
|
||||
})
|
||||
|
||||
let finalTranscript = ''
|
||||
let interimTranscript = ''
|
||||
|
||||
for (let i = event.resultIndex; i < event.results.length; i++) {
|
||||
const transcript = event.results[i][0].transcript
|
||||
console.log(`📝 Result ${i}: "${transcript}" (final: ${event.results[i].isFinal})`)
|
||||
if (event.results[i].isFinal) {
|
||||
finalTranscript += transcript
|
||||
} else {
|
||||
interimTranscript += transcript
|
||||
}
|
||||
}
|
||||
|
||||
if (finalTranscript) {
|
||||
console.log('✅ Final transcript:', finalTranscript)
|
||||
// Use functional update to avoid dependency on current transcript state
|
||||
setTranscript(prevTranscript => {
|
||||
const newTranscript = prevTranscript + finalTranscript + '\n'
|
||||
console.log('📝 Updating transcript:', {
|
||||
prevLength: prevTranscript.length,
|
||||
newLength: newTranscript.length,
|
||||
prevText: prevTranscript.slice(-50), // Last 50 chars
|
||||
newText: newTranscript.slice(-50) // Last 50 chars
|
||||
})
|
||||
// Update shape props with the new transcript using local-first update
|
||||
updateTranscriptLocal(newTranscript)
|
||||
return newTranscript
|
||||
})
|
||||
|
||||
// Add to participants if we can identify who's speaking
|
||||
addParticipantTranscript('Speaker', finalTranscript)
|
||||
}
|
||||
|
||||
if (interimTranscript) {
|
||||
console.log('⏳ Interim transcript:', interimTranscript)
|
||||
}
|
||||
|
||||
// Smart auto-scroll: only scroll if user hasn't manually scrolled away
|
||||
if (!userHasScrolled && transcriptRef.current) {
|
||||
transcriptRef.current.scrollTop = transcriptRef.current.scrollHeight
|
||||
console.log('📜 Auto-scrolled to bottom')
|
||||
}
|
||||
}
|
||||
|
||||
recognition.onerror = (event: any) => {
|
||||
console.error('❌ Speech recognition error:', event.error)
|
||||
setError(`Recognition error: ${event.error}`)
|
||||
setIsRecording(false)
|
||||
updateShapePropsImmediate({ isRecording: false })
|
||||
}
|
||||
|
||||
recognition.onend = () => {
|
||||
console.log('🛑 Speech recognition ended')
|
||||
setIsRecording(false)
|
||||
updateShapePropsImmediate({ isRecording: false })
|
||||
}
|
||||
}, [updateShapePropsImmediate])
|
||||
|
||||
const startRecording = useCallback(async () => {
|
||||
try {
|
||||
console.log('🎤 Starting recording...')
|
||||
console.log('Recognition ref exists:', !!recognitionRef.current)
|
||||
console.log('Current recognition state:', recognitionRef.current?.state || 'unknown')
|
||||
|
||||
// Request microphone permission
|
||||
const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
|
||||
mediaStreamRef.current = stream
|
||||
console.log('✅ Microphone access granted')
|
||||
|
||||
if (recognitionRef.current) {
|
||||
console.log('🎯 Starting speech recognition...')
|
||||
console.log('Recognition settings:', {
|
||||
continuous: recognitionRef.current.continuous,
|
||||
interimResults: recognitionRef.current.interimResults,
|
||||
lang: recognitionRef.current.lang
|
||||
})
|
||||
recognitionRef.current.start()
|
||||
console.log('✅ Speech recognition start() called')
|
||||
} else {
|
||||
console.error('❌ Recognition ref is null')
|
||||
setError('Speech recognition not initialized')
|
||||
}
|
||||
} catch (err) {
|
||||
console.error('❌ Error accessing microphone:', err)
|
||||
setError('Unable to access microphone. Please check permissions.')
|
||||
}
|
||||
}, [])
|
||||
|
||||
const pauseRecording = useCallback(() => {
|
||||
if (recognitionRef.current && isRecording) {
|
||||
console.log('⏸️ Pausing transcription...')
|
||||
recognitionRef.current.stop()
|
||||
setIsPaused(true)
|
||||
}
|
||||
}, [isRecording])
|
||||
|
||||
const resumeRecording = useCallback(async () => {
|
||||
if (recognitionRef.current && isPaused) {
|
||||
console.log('▶️ Resuming transcription...')
|
||||
try {
|
||||
recognitionRef.current.start()
|
||||
setIsPaused(false)
|
||||
} catch (err) {
|
||||
console.error('❌ Error resuming transcription:', err)
|
||||
setError('Unable to resume transcription')
|
||||
}
|
||||
}
|
||||
}, [isPaused])
|
||||
|
||||
// Auto-start transcription if isRecording is true from the beginning
|
||||
useEffect(() => {
|
||||
console.log('🔍 Auto-start useEffect triggered:', {
|
||||
isSupported,
|
||||
hasRecognition: !!recognitionRef.current,
|
||||
shapeIsRecording: shape.props.isRecording,
|
||||
componentIsRecording: isRecording
|
||||
})
|
||||
|
||||
if (isSupported && recognitionRef.current && shape.props.isRecording && !isRecording) {
|
||||
console.log('🚀 Auto-starting transcription from shape props...')
|
||||
setTimeout(() => {
|
||||
startRecording()
|
||||
}, 1000) // Small delay to ensure everything is set up
|
||||
}
|
||||
}, [isSupported, startRecording, shape.props.isRecording, isRecording])
|
||||
|
||||
// Add global error handler for sync errors
|
||||
useEffect(() => {
|
||||
const handleGlobalError = (event: ErrorEvent) => {
|
||||
if (event.message && event.message.includes('INVALID_RECORD')) {
|
||||
console.error('🚨 INVALID_RECORD sync error detected:', event.message)
|
||||
console.error('🚨 Error details:', event.error)
|
||||
setError('Sync error detected. Please refresh the page.')
|
||||
}
|
||||
}
|
||||
|
||||
window.addEventListener('error', handleGlobalError)
|
||||
return () => window.removeEventListener('error', handleGlobalError)
|
||||
}, [])
|
||||
|
||||
const addParticipantTranscript = useCallback((speakerName: string, text: string) => {
|
||||
setParticipants(prev => {
|
||||
const existing = prev.find(p => p.name === speakerName)
|
||||
const newParticipants = existing
|
||||
? prev.map(p =>
|
||||
p.name === speakerName
|
||||
? { ...p, lastSpoken: text, transcript: p.transcript + '\n' + text }
|
||||
: p
|
||||
)
|
||||
: [...prev, {
|
||||
id: Date.now().toString(),
|
||||
name: speakerName,
|
||||
isSpeaking: false,
|
||||
lastSpoken: text,
|
||||
transcript: text
|
||||
}]
|
||||
|
||||
// Don't update shape props for participants immediately - let it batch with transcript
|
||||
// This reduces the number of shape updates
|
||||
|
||||
return newParticipants
|
||||
})
|
||||
}, [])
|
||||
|
||||
const clearTranscript = useCallback(() => {
|
||||
setTranscript('')
|
||||
setParticipants([])
|
||||
editor.updateShape({
|
||||
id: shape.id,
|
||||
type: shape.type,
|
||||
props: {
|
||||
...shape.props,
|
||||
transcript: '',
|
||||
participants: []
|
||||
}
|
||||
})
|
||||
}, [editor, shape])
|
||||
|
||||
const copyTranscript = useCallback(async () => {
|
||||
try {
|
||||
await navigator.clipboard.writeText(transcript)
|
||||
console.log('✅ Transcript copied to clipboard')
|
||||
// You could add a temporary "Copied!" message here if desired
|
||||
} catch (err) {
|
||||
console.error('❌ Failed to copy transcript:', err)
|
||||
// Fallback for older browsers
|
||||
const textArea = document.createElement('textarea')
|
||||
textArea.value = transcript
|
||||
document.body.appendChild(textArea)
|
||||
textArea.select()
|
||||
document.execCommand('copy')
|
||||
document.body.removeChild(textArea)
|
||||
console.log('✅ Transcript copied using fallback method')
|
||||
}
|
||||
}, [transcript])
|
||||
|
||||
// Cleanup on unmount
|
||||
useEffect(() => {
|
||||
return () => {
|
||||
if (recognitionRef.current) {
|
||||
recognitionRef.current.stop()
|
||||
}
|
||||
if (mediaStreamRef.current) {
|
||||
mediaStreamRef.current.getTracks().forEach(track => track.stop())
|
||||
}
|
||||
// Cleanup completed
|
||||
// Ensure final transcript is saved
|
||||
if (localTranscriptRef.current) {
|
||||
editor.updateShape({
|
||||
id: shape.id,
|
||||
type: shape.type,
|
||||
props: {
|
||||
...shape.props,
|
||||
transcript: localTranscriptRef.current
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
}, [editor, shape])
|
||||
|
||||
// Handle scroll events to detect user scrolling
|
||||
const handleScroll = useCallback(() => {
|
||||
if (transcriptRef.current) {
|
||||
const { scrollTop, scrollHeight, clientHeight } = transcriptRef.current
|
||||
const isAtBottom = scrollTop + clientHeight >= scrollHeight - 10 // 10px threshold
|
||||
|
||||
if (isAtBottom) {
|
||||
setUserHasScrolled(false) // User is back at bottom, re-enable auto-scroll
|
||||
} else {
|
||||
setUserHasScrolled(true) // User has scrolled away, disable auto-scroll
|
||||
}
|
||||
}
|
||||
}, [])
|
||||
|
||||
if (!isSupported) {
|
||||
return (
|
||||
<div className="transcribe-container" style={{ width: shape.props.w, height: shape.props.h }}>
|
||||
<div className="transcribe-error">
|
||||
<p>Speech recognition not supported in this browser.</p>
|
||||
<p>Please use Chrome or a WebKit-based browser.</p>
|
||||
</div>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
||||
return (
|
||||
<div className="transcribe-container" style={{ width: shape.props.w, height: shape.props.h }}>
|
||||
{/* Header */}
|
||||
<div className="transcribe-header">
|
||||
<h3>Live Transcription</h3>
|
||||
</div>
|
||||
|
||||
{/* Recording Controls - Simplified */}
|
||||
<div className="transcribe-controls">
|
||||
{!isRecording && !isPaused ? (
|
||||
<button
|
||||
onClick={startRecording}
|
||||
className="transcribe-btn start-btn"
|
||||
>
|
||||
🎤 Start Recording
|
||||
</button>
|
||||
) : isPaused ? (
|
||||
<button
|
||||
onClick={resumeRecording}
|
||||
className="transcribe-btn resume-btn"
|
||||
>
|
||||
▶️ Resume
|
||||
</button>
|
||||
) : (
|
||||
<button
|
||||
onClick={pauseRecording}
|
||||
className="transcribe-btn pause-btn"
|
||||
>
|
||||
⏸️ Pause
|
||||
</button>
|
||||
)}
|
||||
|
||||
<button
|
||||
onClick={copyTranscript}
|
||||
className="transcribe-btn copy-btn"
|
||||
disabled={!transcript}
|
||||
>
|
||||
📋 Copy
|
||||
</button>
|
||||
</div>
|
||||
|
||||
{/* Status */}
|
||||
<div className="transcribe-status">
|
||||
{isRecording && !isPaused && (
|
||||
<div className="recording-indicator">
|
||||
<span className="pulse">🔴</span> Recording...
|
||||
</div>
|
||||
)}
|
||||
{isPaused && (
|
||||
<div className="paused-indicator">
|
||||
<span>⏸️</span> Paused
|
||||
</div>
|
||||
)}
|
||||
{error && (
|
||||
<div className="error-message">
|
||||
⚠️ {error}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Participants */}
|
||||
{participants.length > 0 && (
|
||||
<div className="participants-section">
|
||||
<h4>Participants ({participants.length})</h4>
|
||||
<div className="participants-list">
|
||||
{participants.map(participant => (
|
||||
<div key={participant.id} className="participant">
|
||||
<span className="participant-name">{participant.name}</span>
|
||||
<span className="participant-status">
|
||||
{participant.isSpeaking ? '🔊 Speaking' : '🔇 Silent'}
|
||||
</span>
|
||||
</div>
|
||||
))}
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Transcript */}
|
||||
<div className="transcript-section">
|
||||
<h4>Live Transcript</h4>
|
||||
<div
|
||||
ref={transcriptRef}
|
||||
className="transcript-content"
|
||||
onScroll={handleScroll}
|
||||
style={{
|
||||
height: Math.max(100, shape.props.h - 200),
|
||||
overflowY: 'auto'
|
||||
}}
|
||||
>
|
||||
{(transcript || localTranscriptRef.current) ? (
|
||||
<pre className="transcript-text">
|
||||
{transcript || localTranscriptRef.current}
|
||||
{/* Debug info */}
|
||||
<div style={{ fontSize: '10px', color: '#666', marginTop: '10px' }}>
|
||||
Debug: {transcript.length} chars (local: {localTranscriptRef.current.length}),
|
||||
isRecording: {isRecording.toString()}
|
||||
</div>
|
||||
</pre>
|
||||
) : (
|
||||
<p className="transcript-placeholder">
|
||||
Start recording to see live transcription... (Debug: transcript length = {transcript.length})
|
||||
</p>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
|
@ -0,0 +1,261 @@
|
|||
/* Transcription Component Styles */
|
||||
.transcribe-container {
|
||||
background: #ffffff;
|
||||
border: 2px solid #e2e8f0;
|
||||
border-radius: 12px;
|
||||
padding: 16px;
|
||||
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
|
||||
box-shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.1);
|
||||
overflow: hidden;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 12px;
|
||||
}
|
||||
|
||||
.transcribe-header {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
border-bottom: 1px solid #e2e8f0;
|
||||
padding-bottom: 12px;
|
||||
margin-bottom: 12px;
|
||||
}
|
||||
|
||||
.transcribe-header h3 {
|
||||
margin: 0;
|
||||
color: #1f2937;
|
||||
font-size: 18px;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.transcribe-controls {
|
||||
display: flex;
|
||||
gap: 8px;
|
||||
align-items: center;
|
||||
flex-wrap: wrap;
|
||||
}
|
||||
|
||||
.transcribe-controls select {
|
||||
padding: 6px 8px;
|
||||
border: 1px solid #d1d5db;
|
||||
border-radius: 6px;
|
||||
background: #ffffff;
|
||||
font-size: 14px;
|
||||
color: #374151;
|
||||
}
|
||||
|
||||
.transcribe-controls label {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 4px;
|
||||
font-size: 14px;
|
||||
color: #6b7280;
|
||||
cursor: pointer;
|
||||
}
|
||||
|
||||
.transcribe-controls input[type="checkbox"] {
|
||||
margin: 0;
|
||||
cursor: pointer;
|
||||
}
|
||||
|
||||
.transcribe-btn {
|
||||
padding: 8px 16px;
|
||||
border: none;
|
||||
border-radius: 6px;
|
||||
font-size: 14px;
|
||||
font-weight: 500;
|
||||
cursor: pointer;
|
||||
transition: all 0.2s ease;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 6px;
|
||||
}
|
||||
|
||||
.transcribe-btn:disabled {
|
||||
opacity: 0.5;
|
||||
cursor: not-allowed;
|
||||
}
|
||||
|
||||
.start-btn {
|
||||
background: #10b981;
|
||||
color: white;
|
||||
}
|
||||
|
||||
.start-btn:hover:not(:disabled) {
|
||||
background: #059669;
|
||||
transform: translateY(-1px);
|
||||
}
|
||||
|
||||
.stop-btn {
|
||||
background: #ef4444;
|
||||
color: white;
|
||||
}
|
||||
|
||||
.stop-btn:hover:not(:disabled) {
|
||||
background: #dc2626;
|
||||
transform: translateY(-1px);
|
||||
}
|
||||
|
||||
.clear-btn {
|
||||
background: #6b7280;
|
||||
color: white;
|
||||
}
|
||||
|
||||
.clear-btn:hover:not(:disabled) {
|
||||
background: #4b5563;
|
||||
transform: translateY(-1px);
|
||||
}
|
||||
|
||||
.download-btn {
|
||||
background: #3b82f6;
|
||||
color: white;
|
||||
}
|
||||
|
||||
.download-btn:hover:not(:disabled) {
|
||||
background: #2563eb;
|
||||
transform: translateY(-1px);
|
||||
}
|
||||
|
||||
.transcribe-status {
|
||||
display: flex;
|
||||
gap: 12px;
|
||||
align-items: center;
|
||||
font-size: 14px;
|
||||
}
|
||||
|
||||
.recording-indicator {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 6px;
|
||||
color: #dc2626;
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
.pulse {
|
||||
animation: pulse 2s infinite;
|
||||
}
|
||||
|
||||
@keyframes pulse {
|
||||
0% { opacity: 1; }
|
||||
50% { opacity: 0.5; }
|
||||
100% { opacity: 1; }
|
||||
}
|
||||
|
||||
.error-message {
|
||||
color: #dc2626;
|
||||
background: #fef2f2;
|
||||
padding: 8px 12px;
|
||||
border-radius: 6px;
|
||||
border: 1px solid #fecaca;
|
||||
}
|
||||
|
||||
.participants-section {
|
||||
border-top: 1px solid #e2e8f0;
|
||||
padding-top: 12px;
|
||||
}
|
||||
|
||||
.participants-section h4 {
|
||||
margin: 0 0 8px 0;
|
||||
color: #374151;
|
||||
font-size: 16px;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.participants-list {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 6px;
|
||||
}
|
||||
|
||||
.participant {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
padding: 6px 8px;
|
||||
background: #f9fafb;
|
||||
border-radius: 4px;
|
||||
font-size: 14px;
|
||||
}
|
||||
|
||||
.participant-name {
|
||||
font-weight: 500;
|
||||
color: #374151;
|
||||
}
|
||||
|
||||
.participant-status {
|
||||
font-size: 12px;
|
||||
color: #6b7280;
|
||||
}
|
||||
|
||||
.transcript-section {
|
||||
flex: 1;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
min-height: 0;
|
||||
}
|
||||
|
||||
.transcript-section h4 {
|
||||
margin: 0 0 8px 0;
|
||||
color: #374151;
|
||||
font-size: 16px;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.transcript-content {
|
||||
background: #f9fafb;
|
||||
border: 1px solid #e2e8f0;
|
||||
border-radius: 6px;
|
||||
padding: 12px;
|
||||
overflow-y: auto;
|
||||
flex: 1;
|
||||
}
|
||||
|
||||
.transcript-text {
|
||||
margin: 0;
|
||||
white-space: pre-wrap;
|
||||
font-family: 'Monaco', 'Menlo', 'Ubuntu Mono', monospace;
|
||||
font-size: 14px;
|
||||
line-height: 1.5;
|
||||
color: #374151;
|
||||
}
|
||||
|
||||
.transcript-placeholder {
|
||||
margin: 0;
|
||||
color: #9ca3af;
|
||||
font-style: italic;
|
||||
text-align: center;
|
||||
padding: 20px;
|
||||
}
|
||||
|
||||
.transcribe-error {
|
||||
text-align: center;
|
||||
padding: 20px;
|
||||
color: #6b7280;
|
||||
}
|
||||
|
||||
.transcribe-error p {
|
||||
margin: 8px 0;
|
||||
}
|
||||
|
||||
/* Responsive adjustments */
|
||||
@media (max-width: 480px) {
|
||||
.transcribe-container {
|
||||
padding: 12px;
|
||||
gap: 8px;
|
||||
}
|
||||
|
||||
.transcribe-header {
|
||||
flex-direction: column;
|
||||
align-items: flex-start;
|
||||
gap: 8px;
|
||||
}
|
||||
|
||||
.transcribe-controls {
|
||||
justify-content: flex-start;
|
||||
}
|
||||
|
||||
.transcribe-btn {
|
||||
padding: 6px 12px;
|
||||
font-size: 13px;
|
||||
}
|
||||
}
|
||||
|
|
@@ -6,6 +6,8 @@ import { ChatBoxTool } from "@/tools/ChatBoxTool"
 import { ChatBoxShape } from "@/shapes/ChatBoxShapeUtil"
 import { VideoChatTool } from "@/tools/VideoChatTool"
 import { VideoChatShape } from "@/shapes/VideoChatShapeUtil"
+import { TranscribeTool } from "@/tools/TranscribeTool"
+import { TranscribeShapeUtil } from "@/shapes/TranscribeShapeUtil"
 import { multiplayerAssetStore } from "../utils/multiplayerAssetStore"
 import { EmbedShape } from "@/shapes/EmbedShapeUtil"
 import { EmbedTool } from "@/tools/EmbedTool"

@@ -46,6 +48,7 @@ import { CmdK } from "@/CmdK"

 import "react-cmdk/dist/cmdk.css"
 import "@/css/style.css"
+import "@/css/transcribe.css"

 const collections: Collection[] = [GraphLayoutCollection]
 import { useAuth } from "../context/AuthContext"

@@ -64,6 +67,7 @@ const customShapeUtils = [
   MarkdownShape,
   PromptShape,
   SharedPianoShape,
+  TranscribeShapeUtil,
 ]
 const customTools = [
   ChatBoxTool,

@@ -75,6 +79,7 @@ const customTools = [
   PromptShapeTool,
   SharedPianoTool,
   GestureTool,
+  TranscribeTool,
 ]

 export function Board() {
|
|||
|
|
@ -18,6 +18,28 @@ export function Presentations() {
|
|||
</div>
|
||||
|
||||
<div className="presentations-grid">
|
||||
<div className="presentation-card">
|
||||
<h3>Psilocybernetics: The Emergence of Institutional Neuroplasticity</h3>
|
||||
<p>Exploring the intersection of mycelium and cybernetic institutional design</p>
|
||||
<div className="presentation-embed">
|
||||
<div style={{position: "relative", paddingTop: "max(60%, 324px)", width: "100%", height: 0}}>
|
||||
<iframe
|
||||
style={{position: "absolute", border: "none", width: "100%", height: "100%", left: 0, top: 0}}
|
||||
src="https://online.fliphtml5.com/phqos/pnlz/"
|
||||
seamless={true}
|
||||
scrolling="no"
|
||||
frameBorder="0"
|
||||
allowTransparency={true}
|
||||
allowFullScreen={true}
|
||||
title="Psilocybernetics Presentation"
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
<div className="presentation-meta">
|
||||
<span>Presentation at the Crypto Commons Gathering 5</span>
|
||||
<span>Video coming soon</span>
|
||||
</div>
|
||||
</div>
|
||||
<div className="presentation-card">
|
||||
<h3>Osmotic Governance</h3>
|
||||
<p>Exploring the intersection of mycelium and emancipatory technologies</p>
|
||||
|
|
@ -118,28 +140,7 @@ export function Presentations() {
|
|||
</div>
|
||||
</div>
|
||||
|
||||
<div className="presentation-card">
|
||||
<h3>Psilocybernetics: The Emergence of Institutional Neuroplasticity</h3>
|
||||
<p>Exploring the intersection of mycelium and cybernetic institutional design</p>
|
||||
<div className="presentation-embed">
|
||||
<div style={{position: "relative", paddingTop: "max(60%, 324px)", width: "100%", height: 0}}>
|
||||
<iframe
|
||||
style={{position: "absolute", border: "none", width: "100%", height: "100%", left: 0, top: 0}}
|
||||
src="https://online.fliphtml5.com/phqos/pnlz/"
|
||||
seamless={true}
|
||||
scrolling="no"
|
||||
frameBorder="0"
|
||||
allowTransparency={true}
|
||||
allowFullScreen={true}
|
||||
title="Psilocybernetics Presentation"
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
<div className="presentation-meta">
|
||||
<span>Presentation at the General Forum for Ethereum Localism</span>
|
||||
<span>Video coming soon</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
|
||||
<div className="presentation-card">
|
||||
<h3>Move Slow & Fix Things: The Commons Stack Design Pattern</h3>
|
||||
|
|
|
|||
|
|
@@ -0,0 +1,47 @@
import { BaseBoxShapeUtil, TLBaseShape } from "tldraw"
import { TranscribeComponent } from "../components/TranscribeComponent"

export type ITranscribeShape = TLBaseShape<
  "Transcribe",
  {
    w: number
    h: number
    isRecording: boolean
    transcript: string
    participants: Array<{
      id: string
      name: string
      isSpeaking: boolean
      lastSpoken: string
    }>
    language: string
  }
>

export class TranscribeShapeUtil extends BaseBoxShapeUtil<ITranscribeShape> {
  static override type = "Transcribe"

  override getDefaultProps(): ITranscribeShape["props"] {
    return {
      w: 400,
      h: 300,
      isRecording: false,
      transcript: "",
      participants: [],
      language: "en-US",
    }
  }

  override indicator(shape: ITranscribeShape) {
    return <rect x={0} y={0} width={shape.props.w} height={shape.props.h} />
  }

  override component(shape: ITranscribeShape) {
    return (
      <TranscribeComponent
        shape={shape}
        util={this}
      />
    )
  }
}
@ -272,56 +272,14 @@ export class VideoChatShape extends BaseBoxShapeUtil<IVideoChatShape> {
|
|||
}
|
||||
}
|
||||
|
||||
async startTranscription(shape: IVideoChatShape) {
|
||||
console.log('🎤 startTranscription method called');
|
||||
console.log('Shape props:', shape.props);
|
||||
console.log('Room URL:', shape.props.roomUrl);
|
||||
console.log('Is owner:', shape.props.isOwner);
|
||||
|
||||
if (!shape.props.roomUrl || !shape.props.isOwner) {
|
||||
console.log('❌ Early return - missing roomUrl or not owner');
|
||||
console.log('roomUrl exists:', !!shape.props.roomUrl);
|
||||
console.log('isOwner:', shape.props.isOwner);
|
||||
return;
|
||||
}
|
||||
async startTranscription(shape: IVideoChatShape): Promise<boolean> {
|
||||
console.log('🎤 Starting Web Speech API transcription...');
|
||||
|
||||
try {
|
||||
const workerUrl = WORKER_URL;
|
||||
const apiKey = import.meta.env.VITE_DAILY_API_KEY;
|
||||
// Request microphone permission for transcription
|
||||
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
|
||||
|
||||
console.log('🔧 Environment variables:');
|
||||
console.log('Worker URL:', workerUrl);
|
||||
console.log('API Key exists:', !!apiKey);
|
||||
|
||||
// Extract room name from URL
|
||||
const roomName = shape.props.roomUrl.split('/').pop();
|
||||
console.log('📝 Extracted room name:', roomName);
|
||||
|
||||
if (!roomName) {
|
||||
throw new Error('Could not extract room name from URL');
|
||||
}
|
||||
|
||||
console.log('🌐 Making API request to start transcription...');
|
||||
console.log('Request URL:', `${workerUrl}/daily/rooms/${roomName}/start-transcription`);
|
||||
|
||||
const response = await fetch(`${workerUrl}/daily/rooms/${roomName}/start-transcription`, {
|
||||
method: 'POST',
|
||||
headers: {
|
||||
'Content-Type': 'application/json',
|
||||
'Authorization': `Bearer ${apiKey}`
|
||||
}
|
||||
});
|
||||
|
||||
console.log('📡 Response status:', response.status);
|
||||
console.log('📡 Response ok:', response.ok);
|
||||
|
||||
if (!response.ok) {
|
||||
const error = await response.json();
|
||||
console.error('❌ API error response:', error);
|
||||
throw new Error(`Failed to start transcription: ${JSON.stringify(error)}`);
|
||||
}
|
||||
|
||||
console.log('✅ API call successful, updating shape...');
|
||||
// Update shape to indicate transcription is active
|
||||
await this.editor.updateShape<IVideoChatShape>({
|
||||
id: shape.id,
|
||||
type: shape.type,
|
||||
|
|
@ -330,52 +288,20 @@ export class VideoChatShape extends BaseBoxShapeUtil<IVideoChatShape> {
|
|||
isTranscribing: true,
|
||||
}
|
||||
});
|
||||
console.log('✅ Shape updated with isTranscribing: true');
|
||||
|
||||
console.log('✅ Web Speech API transcription started');
|
||||
return true;
|
||||
} catch (error) {
|
||||
console.error('❌ Error starting transcription:', error);
|
||||
throw error;
|
||||
console.error('❌ Error starting Web Speech API transcription:', error);
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
async stopTranscription(shape: IVideoChatShape) {
|
||||
console.log('🛑 stopTranscription method called');
|
||||
console.log('Shape props:', shape.props);
|
||||
|
||||
if (!shape.props.roomUrl || !shape.props.isOwner) {
|
||||
console.log('❌ Early return - missing roomUrl or not owner');
|
||||
return;
|
||||
}
|
||||
console.log('🛑 Stopping Web Speech API transcription...');
|
||||
|
||||
try {
|
||||
const workerUrl = WORKER_URL;
|
||||
const apiKey = import.meta.env.VITE_DAILY_API_KEY;
|
||||
|
||||
// Extract room name from URL
|
||||
const roomName = shape.props.roomUrl.split('/').pop();
|
||||
console.log('📝 Extracted room name:', roomName);
|
||||
|
||||
if (!roomName) {
|
||||
throw new Error('Could not extract room name from URL');
|
||||
}
|
||||
|
||||
console.log('🌐 Making API request to stop transcription...');
|
||||
const response = await fetch(`${workerUrl}/daily/rooms/${roomName}/stop-transcription`, {
|
||||
method: 'POST',
|
||||
headers: {
|
||||
'Content-Type': 'application/json',
|
||||
'Authorization': `Bearer ${apiKey}`
|
||||
}
|
||||
});
|
||||
|
||||
console.log('📡 Response status:', response.status);
|
||||
|
||||
if (!response.ok) {
|
||||
const error = await response.json();
|
||||
console.error('❌ API error response:', error);
|
||||
throw new Error(`Failed to stop transcription: ${JSON.stringify(error)}`);
|
||||
}
|
||||
|
||||
console.log('✅ API call successful, updating shape...');
|
||||
// Update shape to indicate transcription is stopped
|
||||
await this.editor.updateShape<IVideoChatShape>({
|
||||
id: shape.id,
|
||||
type: shape.type,
|
||||
|
|
@ -384,10 +310,10 @@ export class VideoChatShape extends BaseBoxShapeUtil<IVideoChatShape> {
|
|||
isTranscribing: false,
|
||||
}
|
||||
});
|
||||
console.log('✅ Shape updated with isTranscribing: false');
|
||||
|
||||
console.log('✅ Web Speech API transcription stopped');
|
||||
} catch (error) {
|
||||
console.error('❌ Error stopping transcription:', error);
|
||||
throw error;
|
||||
console.error('❌ Error stopping Web Speech API transcription:', error);
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -417,6 +343,37 @@ export class VideoChatShape extends BaseBoxShapeUtil<IVideoChatShape> {
|
|||
console.log('✅ Transcription message added to shape');
|
||||
}
|
||||
|
||||
createTranscriptionTool(shape: IVideoChatShape) {
|
||||
console.log('🎤 Creating transcription tool element...');
|
||||
|
||||
// Position the transcribe tool beneath the video chat shape
|
||||
const videoShape = this.editor.getShape(shape.id) as IVideoChatShape;
|
||||
if (!videoShape) return;
|
||||
|
||||
// Calculate position beneath the video
|
||||
const x = videoShape.x; // Same x position as video
|
||||
const y = videoShape.y + videoShape.props.h + 20; // Below video with 20px gap
|
||||
const width = videoShape.props.w; // Same width as video
|
||||
|
||||
// Create transcription tool shape
|
||||
this.editor.createShape({
|
||||
type: 'Transcribe',
|
||||
x: x,
|
||||
y: y,
|
||||
props: {
|
||||
w: width,
|
||||
h: 200, // Fixed height for transcript box
|
||||
isRecording: true, // Auto-start recording
|
||||
transcript: "",
|
||||
participants: [],
|
||||
language: "en-US",
|
||||
autoScroll: true,
|
||||
}
|
||||
});
|
||||
|
||||
console.log('✅ Transcription tool created successfully beneath video');
|
||||
}
|
||||
|
||||
component(shape: IVideoChatShape) {
|
||||
const [hasPermissions, setHasPermissions] = useState(false)
|
||||
const [error, setError] = useState<Error | null>(null)
|
||||
|
|
@ -525,176 +482,217 @@ export class VideoChatShape extends BaseBoxShapeUtil<IVideoChatShape> {
|
|||
style={{
|
||||
width: `${shape.props.w}px`,
|
||||
height: `${shape.props.h}px`,
|
||||
position: "relative",
|
||||
display: "flex",
|
||||
flexDirection: "column",
|
||||
pointerEvents: "all",
|
||||
overflow: "hidden",
|
||||
}}
|
||||
>
|
||||
<iframe
|
||||
src={roomUrlWithParams.toString()}
|
||||
width="100%"
|
||||
height="100%"
|
||||
style={{
|
||||
border: "none",
|
||||
position: "absolute",
|
||||
top: 0,
|
||||
left: 0,
|
||||
right: 0,
|
||||
bottom: 0,
|
||||
}}
|
||||
allow={`camera ${shape.props.allowCamera ? "self" : ""}; microphone ${
|
||||
shape.props.allowMicrophone ? "self" : ""
|
||||
}`}
|
||||
></iframe>
|
||||
|
||||
{/* Recording Button */}
|
||||
{shape.props.enableRecording && (
|
||||
<button
|
||||
onClick={async () => {
|
||||
try {
|
||||
if (shape.props.recordingId) {
|
||||
await this.stopRecording(shape);
|
||||
} else {
|
||||
await this.startRecording(shape);
|
||||
}
|
||||
} catch (err) {
|
||||
console.error('Recording error:', err);
|
||||
}
|
||||
}}
|
||||
style={{
|
||||
position: "absolute",
|
||||
top: "8px",
|
||||
right: "8px",
|
||||
padding: "4px 8px",
|
||||
background: shape.props.recordingId ? "#ff4444" : "#ffffff",
|
||||
border: "1px solid #ccc",
|
||||
borderRadius: "4px",
|
||||
cursor: "pointer",
|
||||
zIndex: 1,
|
||||
}}
|
||||
>
|
||||
{shape.props.recordingId ? "Stop Recording" : "Start Recording"}
|
||||
</button>
|
||||
)}
|
||||
|
||||
{/* Test Button - Always visible for debugging */}
|
||||
{/* Transcription Button - Above video */}
|
||||
<button
|
||||
onClick={() => {
|
||||
console.log('🧪 Test button clicked!');
|
||||
onClick={async (e) => {
|
||||
e.preventDefault();
|
||||
e.stopPropagation();
|
||||
console.log('🚀 Transcription button clicked!');
|
||||
console.log('Current transcription state:', shape.props.isTranscribing);
|
||||
console.log('Shape props:', shape.props);
|
||||
alert('Test button clicked! Check console for details.');
|
||||
|
||||
try {
|
||||
if (shape.props.isTranscribing) {
|
||||
console.log('🛑 Stopping transcription...');
|
||||
await this.stopTranscription(shape);
|
||||
console.log('✅ Transcription stopped successfully');
|
||||
} else {
|
||||
console.log('🎤 Starting transcription...');
|
||||
const success = await this.startTranscription(shape);
|
||||
if (success) {
|
||||
// Create the transcription tool for Web Speech API
|
||||
this.createTranscriptionTool(shape);
|
||||
console.log('✅ Transcription tool created');
|
||||
} else {
|
||||
console.log('❌ Failed to start transcription');
|
||||
}
|
||||
}
|
||||
} catch (err) {
|
||||
console.error('❌ Transcription error:', err);
|
||||
}
|
||||
}}
|
||||
onPointerDown={(e) => e.stopPropagation()}
|
||||
onMouseDown={(e) => e.stopPropagation()}
|
||||
style={{
|
||||
position: "absolute",
|
||||
top: "8px",
|
||||
left: "8px",
|
||||
padding: "4px 8px",
|
||||
background: "#ffff00",
|
||||
border: "1px solid #000",
|
||||
padding: "8px 16px",
|
||||
background: shape.props.isTranscribing ? "#44ff44" : "#ffffff",
|
||||
border: "1px solid #ccc",
|
||||
borderRadius: "4px",
|
||||
cursor: "pointer",
|
||||
marginBottom: "8px",
|
||||
fontSize: "14px",
|
||||
fontWeight: "500",
|
||||
pointerEvents: "all",
|
||||
zIndex: 1000,
|
||||
fontSize: "10px",
|
||||
position: "relative",
|
||||
}}
|
||||
>
|
||||
TEST
|
||||
{shape.props.isTranscribing ? "Stop Transcription" : "Start Transcription"}
|
||||
</button>
|
||||
|
||||
{/* Transcription Button - Only for owners */}
|
||||
{(() => {
|
||||
console.log('🔍 Checking transcription button conditions:');
|
||||
console.log('enableTranscription:', shape.props.enableTranscription);
|
||||
console.log('isOwner:', shape.props.isOwner);
|
||||
console.log('Button should render:', shape.props.enableTranscription && shape.props.isOwner);
|
||||
return shape.props.enableTranscription && shape.props.isOwner;
|
||||
})() && (
|
||||
<button
|
||||
onClick={async () => {
|
||||
console.log('🚀 Transcription button clicked!');
|
||||
console.log('Current transcription state:', shape.props.isTranscribing);
|
||||
console.log('Shape props:', shape.props);
|
||||
{/* Video Container */}
|
||||
<div
|
||||
style={{
|
||||
flex: 1,
|
||||
position: "relative",
|
||||
overflow: "hidden",
|
||||
}}
|
||||
>
|
||||
<iframe
|
||||
src={roomUrlWithParams.toString()}
|
||||
width="100%"
|
||||
height="100%"
|
||||
style={{
|
||||
border: "none",
|
||||
position: "absolute",
|
||||
top: 0,
|
||||
left: 0,
|
||||
right: 0,
|
||||
bottom: 0,
|
||||
}}
|
||||
allow={`camera ${shape.props.allowCamera ? "self" : ""}; microphone ${
|
||||
shape.props.allowMicrophone ? "self" : ""
|
||||
}`}
|
||||
></iframe>
|
||||
|
||||
try {
|
||||
if (shape.props.isTranscribing) {
|
||||
console.log('🛑 Stopping transcription...');
|
||||
await this.stopTranscription(shape);
|
||||
console.log('✅ Transcription stopped successfully');
|
||||
} else {
|
||||
console.log('🎤 Starting transcription...');
|
||||
await this.startTranscription(shape);
|
||||
console.log('✅ Transcription started successfully');
|
||||
}
|
||||
} catch (err) {
|
||||
console.error('❌ Transcription error:', err);
|
||||
}
|
||||
{/* Test Button - Always visible for debugging */}
|
||||
<button
|
||||
onClick={() => {
|
||||
console.log('🧪 Test button clicked!');
|
||||
console.log('Shape props:', shape.props);
|
||||
alert('Test button clicked! Check console for details.');
|
||||
}}
|
||||
style={{
|
||||
position: "absolute",
|
||||
top: "8px",
|
||||
right: shape.props.enableRecording ? "120px" : "8px",
|
||||
left: "8px",
|
||||
padding: "4px 8px",
|
||||
background: shape.props.isTranscribing ? "#44ff44" : "#ffffff",
|
||||
border: "1px solid #ccc",
|
||||
background: "#ffff00",
|
||||
border: "1px solid #000",
|
||||
borderRadius: "4px",
|
||||
cursor: "pointer",
|
||||
zIndex: 1,
|
||||
zIndex: 1000,
|
||||
fontSize: "10px",
|
||||
}}
|
||||
>
|
||||
{shape.props.isTranscribing ? "Stop Transcription" : "Start Transcription"}
|
||||
TEST
|
||||
</button>
|
||||
)}
|
||||
|
||||
{/* Transcription History */}
|
||||
{shape.props.transcriptionHistory.length > 0 && (
|
||||
{/* Transcription History */}
|
||||
{shape.props.transcriptionHistory.length > 0 && (
|
||||
<div
|
||||
style={{
|
||||
position: "absolute",
|
||||
bottom: "40px",
|
||||
left: "8px",
|
||||
right: "8px",
|
||||
maxHeight: "200px",
|
||||
overflowY: "auto",
|
||||
background: "rgba(255, 255, 255, 0.95)",
|
||||
borderRadius: "4px",
|
||||
padding: "8px",
|
||||
fontSize: "12px",
|
||||
zIndex: 1,
|
||||
border: "1px solid #ccc",
|
||||
}}
|
||||
>
|
||||
<div style={{ fontWeight: "bold", marginBottom: "4px" }}>
|
||||
Live Transcription:
|
||||
</div>
|
||||
{shape.props.transcriptionHistory.slice(-10).map((msg) => (
|
||||
<div key={msg.id} style={{ marginBottom: "2px" }}>
|
||||
<span style={{ fontWeight: "bold", color: "#666" }}>
|
||||
{msg.sender}:
|
||||
</span>{" "}
|
||||
<span>{msg.message}</span>
|
||||
</div>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* URL Link - Below video */}
|
||||
<div style={{ position: "relative" }}>
|
||||
<p
|
||||
onClick={async (e) => {
|
||||
e.preventDefault();
|
||||
e.stopPropagation();
|
||||
if (roomUrl) {
|
||||
try {
|
||||
await navigator.clipboard.writeText(roomUrl);
|
||||
console.log('✅ Link copied to clipboard:', roomUrl);
|
||||
|
||||
// Show temporary "link copied" message
|
||||
const messageEl = document.getElementById(`copy-message-${shape.id}`);
|
||||
if (messageEl) {
|
||||
messageEl.style.opacity = "1";
|
||||
setTimeout(() => {
|
||||
messageEl.style.opacity = "0";
|
||||
}, 2000);
|
||||
}
|
||||
} catch (err) {
|
||||
console.error('❌ Failed to copy link:', err);
|
||||
// Fallback for older browsers
|
||||
const textArea = document.createElement('textarea');
|
||||
textArea.value = roomUrl;
|
||||
document.body.appendChild(textArea);
|
||||
textArea.select();
|
||||
document.execCommand('copy');
|
||||
document.body.removeChild(textArea);
|
||||
}
|
||||
}
|
||||
}}
|
||||
style={{
|
||||
margin: "8px 0 0 0",
|
||||
padding: "4px 8px",
|
||||
background: "rgba(255, 255, 255, 0.9)",
|
||||
borderRadius: "4px",
|
||||
fontSize: "12px",
|
||||
pointerEvents: "all",
|
||||
cursor: "pointer",
|
||||
userSelect: "none",
|
||||
border: "1px solid #e0e0e0",
|
||||
transition: "background-color 0.2s ease",
|
||||
}}
|
||||
onMouseEnter={(e) => {
|
||||
e.currentTarget.style.backgroundColor = "rgba(240, 240, 240, 0.9)";
|
||||
}}
|
||||
onMouseLeave={(e) => {
|
||||
e.currentTarget.style.backgroundColor = "rgba(255, 255, 255, 0.9)";
|
||||
}}
|
||||
>
|
||||
url: {roomUrl}
|
||||
{shape.props.isOwner && " (Owner)"}
|
||||
</p>
|
||||
|
||||
{/* "Link Copied" message */}
|
||||
<div
|
||||
id={`copy-message-${shape.id}`}
|
||||
style={{
|
||||
position: "absolute",
|
||||
bottom: "40px",
|
||||
left: "8px",
|
||||
right: "8px",
|
||||
maxHeight: "200px",
|
||||
overflowY: "auto",
|
||||
background: "rgba(255, 255, 255, 0.95)",
|
||||
bottom: "0",
|
||||
right: "0",
|
||||
background: "#4CAF50",
|
||||
color: "white",
|
||||
padding: "4px 8px",
|
||||
borderRadius: "4px",
|
||||
padding: "8px",
|
||||
fontSize: "12px",
|
||||
zIndex: 1,
|
||||
border: "1px solid #ccc",
|
||||
fontSize: "11px",
|
||||
fontWeight: "500",
|
||||
opacity: "0",
|
||||
transition: "opacity 0.3s ease",
|
||||
pointerEvents: "none",
|
||||
zIndex: 1001,
|
||||
}}
|
||||
>
|
||||
<div style={{ fontWeight: "bold", marginBottom: "4px" }}>
|
||||
Live Transcription:
|
||||
</div>
|
||||
{shape.props.transcriptionHistory.slice(-10).map((msg) => (
|
||||
<div key={msg.id} style={{ marginBottom: "2px" }}>
|
||||
<span style={{ fontWeight: "bold", color: "#666" }}>
|
||||
{msg.sender}:
|
||||
</span>{" "}
|
||||
<span>{msg.message}</span>
|
||||
</div>
|
||||
))}
|
||||
Link Copied!
|
||||
</div>
|
||||
)}
|
||||
|
||||
<p
|
||||
style={{
|
||||
position: "absolute",
|
||||
bottom: 0,
|
||||
left: 0,
|
||||
margin: "8px",
|
||||
padding: "4px 8px",
|
||||
background: "rgba(255, 255, 255, 0.9)",
|
||||
borderRadius: "4px",
|
||||
fontSize: "12px",
|
||||
pointerEvents: "all",
|
||||
cursor: "text",
|
||||
userSelect: "text",
|
||||
zIndex: 1,
|
||||
}}
|
||||
>
|
||||
url: {roomUrl}
|
||||
{shape.props.isOwner && " (Owner)"}
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
|
|
|||
|
|
@@ -0,0 +1,7 @@
import { BaseBoxShapeTool } from "tldraw"

export class TranscribeTool extends BaseBoxShapeTool {
  static override id = "Transcribe"
  shapeType = "Transcribe"
  override initial = "idle"
}
@@ -0,0 +1,56 @@
// Web Speech API TypeScript declarations
interface SpeechRecognition extends EventTarget {
  continuous: boolean
  interimResults: boolean
  lang: string
  start(): void
  stop(): void
  abort(): void
  onstart: ((this: SpeechRecognition, ev: Event) => any) | null
  onresult: ((this: SpeechRecognition, ev: SpeechRecognitionEvent) => any) | null
  onerror: ((this: SpeechRecognition, ev: SpeechRecognitionErrorEvent) => any) | null
  onend: ((this: SpeechRecognition, ev: Event) => any) | null
}

interface SpeechRecognitionEvent extends Event {
  resultIndex: number
  results: SpeechRecognitionResultList
}

interface SpeechRecognitionErrorEvent extends Event {
  error: string
  message: string
}

interface SpeechRecognitionResultList {
  length: number
  item(index: number): SpeechRecognitionResult
  [index: number]: SpeechRecognitionResult
}

interface SpeechRecognitionResult {
  length: number
  item(index: number): SpeechRecognitionAlternative
  [index: number]: SpeechRecognitionAlternative
  isFinal: boolean
}

interface SpeechRecognitionAlternative {
  transcript: string
  confidence: number
}

declare var SpeechRecognition: {
  prototype: SpeechRecognition
  new(): SpeechRecognition
}

declare var webkitSpeechRecognition: {
  prototype: SpeechRecognition
  new(): SpeechRecognition
}

interface Window {
  SpeechRecognition: typeof SpeechRecognition
  webkitSpeechRecognition: typeof webkitSpeechRecognition
}
@@ -12,6 +12,7 @@ import { MycrozineTemplateShape } from "./shapes/MycrozineTemplateShapeUtil"
 import { SlideShape } from "./shapes/SlideShapeUtil"
 import { PromptShape } from "./shapes/PromptShapeUtil"
 import { SharedPianoShape } from "./shapes/SharedPianoShapeUtil"
+import { TranscribeShape } from "./shapes/TranscribeShapeUtil"

 // Lazy load TLDraw dependencies to avoid startup timeouts
 let customSchema: any = null

@@ -56,6 +57,10 @@ async function getTldrawDependencies() {
       props: SharedPianoShape.props,
       migrations: SharedPianoShape.migrations,
     },
+    Transcribe: {
+      props: TranscribeShape.props,
+      migrations: TranscribeShape.migrations,
+    },
   },
   bindings: defaultBindingSchemas,
 })
@@ -0,0 +1,41 @@
import { BaseBoxShapeUtil, TLBaseShape } from "tldraw"

export type ITranscribeShape = TLBaseShape<
  "Transcribe",
  {
    w: number
    h: number
    isRecording: boolean
    transcript: string
    participants: Array<{
      id: string
      name: string
      isSpeaking: boolean
      lastSpoken: string
    }>
    language: string
  }
>

export class TranscribeShape extends BaseBoxShapeUtil<ITranscribeShape> {
  static override type = "Transcribe"

  override getDefaultProps(): ITranscribeShape["props"] {
    return {
      w: 400,
      h: 300,
      isRecording: false,
      transcript: "",
      participants: [],
      language: "en-US",
    }
  }

  override indicator(_shape: ITranscribeShape) {
    return null // Simplified for worker
  }

  override component(_shape: ITranscribeShape) {
    return null // No React components in worker
  }
}