Voice Command - Native Android App
A fully integrated Android app for voice-to-text transcription with on-device Whisper processing. No server required, no Termux, no additional apps needed.
Features
- 100% On-Device Transcription - Uses sherpa-onnx with Whisper models
- Privacy-First - All processing happens locally, no data leaves your device
- Multiple Trigger Methods:
  - Floating button overlay (always accessible)
  - Volume button combo (press both volume buttons together)
  - Quick Settings tile (notification shade)
- Smart Routing:
  - Copy to clipboard
  - Share via any app
  - Save as markdown note
  - Create task (Backlog.md compatible)
- Intent Detection - Automatically suggests best action based on content
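This README does not describe how intent detection is implemented. As a purely illustrative sketch (written in Python, not taken from the app's code), a simple keyword heuristic could map a transcript to one of the four Smart Routing actions. The function name, the keyword lists, and the action labels are all hypothetical.

```python
# Hypothetical keyword heuristic. The app's real detection logic is not
# documented in this README; the hint lists below are made up for illustration.
TASK_HINTS = ("todo", "to do", "remind me", "need to", "task")
SHARE_HINTS = ("send", "share", "email")
NOTE_HINTS = ("note", "idea", "remember that")


def suggest_action(transcript: str) -> str:
    """Return a suggested action: 'task', 'share', 'note', or 'clipboard'."""
    text = transcript.lower()
    if any(hint in text for hint in TASK_HINTS):
        return "task"
    if any(hint in text for hint in SHARE_HINTS):
        return "share"
    if any(hint in text for hint in NOTE_HINTS):
        return "note"
    # Fall back to the least intrusive action when nothing matches.
    return "clipboard"
```

Substring matching like this is crude (it would also match "note" inside "denote"), so treat it only as a way to picture the feature.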
Requirements
- Android 10 (API 29) or higher
- ~40-250MB storage for the Whisper model, depending on the model selected
- Microphone permission
Installation
From APK (Recommended)
- Download the latest APK from releases
- Enable "Install from unknown sources" if prompted
- Install and open Voice Command
- Grant microphone permission
- Wait for model download (~40-250MB depending on selected model)
Build from Source
```bash
# Clone the repository
git clone https://gitea.jeffemmett.com/jeffemmett/voice-command.git
cd voice-command/android-native

# Build debug APK
./gradlew assembleDebug

# Build release APK (requires signing config)
./gradlew assembleRelease
```
The APK is written under app/build/outputs/apk/ (debug/ for debug builds, release/ for release builds).
Usage
Quick Start
- Open the app and grant microphone permission
- Tap the big mic button to start recording
- Speak your note or task
- Tap again to stop - transcription happens automatically
- Choose an action from the menu
Trigger Methods
Floating Button
- Enable in Settings
- Drag to reposition
- Tap to start/stop recording
- Works over any app
Volume Buttons
- Enable Accessibility Service in Settings
- Press Volume Up + Volume Down simultaneously
- Vibration confirms recording start/stop
Quick Settings Tile
- Swipe down to open the notification shade
- Add the "Voice Note" tile
- Tap the tile to toggle recording
Models
| Model | Size | Languages | Quality |
|---|---|---|---|
| Tiny English | ~40MB | English only | Good for quick notes |
| Base English | ~75MB | English only | Better accuracy |
| Small English | ~250MB | English only | Best accuracy |
| Tiny | ~40MB | Multilingual | Basic quality |
| Base | ~75MB | Multilingual | Good quality |
| Small | ~250MB | Multilingual | Best quality |
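To make the size trade-off concrete, here is a small Python sketch that picks the largest model that fits in the available storage. The sizes come from the table above; the function name and the 50MB headroom are assumptions for illustration, not app behavior.

```python
from typing import Optional

# Approximate download sizes in MB, taken from the Models table.
# The English-only and multilingual variants are listed with the same sizes.
MODEL_SIZES_MB = {"tiny": 40, "base": 75, "small": 250}


def largest_model_that_fits(free_mb: float, headroom_mb: float = 50) -> Optional[str]:
    """Return the largest model whose size plus headroom fits in free_mb.

    headroom_mb is an assumed safety margin, not a documented requirement.
    Returns None when even the smallest model does not fit.
    """
    fitting = [
        (size, name)
        for name, size in MODEL_SIZES_MB.items()
        if size + headroom_mb <= free_mb
    ]
    return max(fitting)[1] if fitting else None
```

For example, with 150MB free this returns "base": Small would need 300MB including headroom, while Base needs 125MB.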
Architecture
```
┌─────────────────────────────────────────────────────┐
│                  Voice Command App                  │
├─────────────────────────────────────────────────────┤
│  UI Layer (Jetpack Compose)                         │
│  ├── MainActivity (main interface)                  │
│  ├── RecordingScreen (recording controls)           │
│  └── TranscriptionResultActivity (result dialog)    │
├─────────────────────────────────────────────────────┤
│  Service Layer                                      │
│  ├── FloatingButtonService (overlay)                │
│  ├── VolumeButtonAccessibilityService (vol combo)   │
│  └── VoiceCommandTileService (Quick Settings)       │
├─────────────────────────────────────────────────────┤
│  Core Layer                                         │
│  ├── AudioRecorder (16kHz PCM capture)              │
│  ├── SherpaTranscriptionEngine (Whisper wrapper)    │
│  └── ActionRouter (clipboard, files, share)         │
├─────────────────────────────────────────────────────┤
│  Native Layer (sherpa-onnx)                         │
│  └── Whisper ONNX models + ONNX Runtime             │
└─────────────────────────────────────────────────────┘
```
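The diagram lists 16kHz PCM capture for AudioRecorder. As a rough worked example in Python, the raw size of a recording follows directly from the sample rate. The 16-bit sample width and mono channel count are assumptions; the README states only the 16kHz rate.

```python
SAMPLE_RATE_HZ = 16_000  # from the architecture diagram
BYTES_PER_SAMPLE = 2     # assumption: 16-bit PCM
CHANNELS = 1             # assumption: mono capture


def pcm_bytes(seconds: float) -> int:
    """Raw (uncompressed) PCM size in bytes for a recording of this length."""
    return int(seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS)
```

Under these assumptions, one second of audio is 32,000 bytes and a one-minute voice note is about 1.9MB before transcription.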
Permissions
| Permission | Purpose |
|---|---|
| `RECORD_AUDIO` | Voice recording |
| `SYSTEM_ALERT_WINDOW` | Floating button overlay |
| `FOREGROUND_SERVICE` | Background recording |
| `POST_NOTIFICATIONS` | Service notifications |
| `VIBRATE` | Recording feedback |
Output Formats
Notes (Markdown)
```markdown
# Voice Note Title
Your transcribed text here...
---
Created: 2025-12-06 14:30
Source: voice
```
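For scripts that need to produce or recognize this note layout, a minimal Python sketch of the format might look like the following. The function name is hypothetical and this is not the app's code. The blank lines around the body are also an assumption: they are added here so that the `---` separator is not read as a Markdown setext heading underline.

```python
from datetime import datetime


def render_note(title: str, body: str, created: datetime) -> str:
    """Render a voice note in the Markdown layout shown above."""
    return (
        f"# {title}\n"
        "\n"
        f"{body}\n"
        "\n"
        "---\n"
        f"Created: {created:%Y-%m-%d %H:%M}\n"
        "Source: voice\n"
    )
```

Calling `render_note("Groceries", "Buy milk", datetime(2025, 12, 6, 14, 30))` yields a note whose footer reads `Created: 2025-12-06 14:30`, matching the timestamp style in the sample.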
Tasks (Backlog.md Compatible)
```markdown
---
title: Task Title
status: To Do
priority: medium
created: 2025-12-06T14:30:00
source: voice
---
# Task Title
Your transcribed text here...
```
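If you want to post-process saved tasks, the front matter can be read with a few lines of Python. This is a sketch that handles only the flat `key: value` fields shown in the sample; it is not a full YAML parser, and the function name is hypothetical.

```python
from typing import Dict, Tuple


def parse_task(text: str) -> Tuple[Dict[str, str], str]:
    """Split a task file into (front-matter fields, Markdown body)."""
    lines = [line.rstrip() for line in text.splitlines()]
    if not lines or lines[0] != "---":
        raise ValueError("task file does not start with front matter")
    # Find the closing '---'; raises ValueError if it is missing.
    end = lines.index("---", 1)
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        # Split on the first colon only, so timestamps keep their colons.
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


SAMPLE_TASK = """---
title: Task Title
status: To Do
priority: medium
created: 2025-12-06T14:30:00
source: voice
---
# Task Title
Your transcribed text here...
"""

meta, body = parse_task(SAMPLE_TASK)
```

Here `meta["status"]` is `"To Do"` and `body` starts with the `# Task Title` heading.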
Troubleshooting
Model won't load
- Ensure sufficient storage (~250MB free)
- Check internet connection for initial download
- Try a smaller model (Tiny instead of Small)
Recording not working
- Check that microphone permission is granted
- Ensure no other app is using the microphone
- Try restarting the app
Volume buttons not detected
- Enable Accessibility Service in Android Settings
- Grant all requested permissions
- Some custom ROMs may block this feature
Floating button not appearing
- Enable "Display over other apps" permission
- Check for the "Floating Button Active" notification
- Some launchers may hide overlays
Privacy
- All transcription happens on-device
- No audio or text is sent to any server
- No analytics or tracking
- Notes/tasks saved only to local storage
Credits
- sherpa-onnx - On-device speech recognition
- OpenAI Whisper - Original Whisper model
- Jetpack Compose - Modern Android UI
License
MIT