# Voice Command - Native Android App

A fully integrated Android app for voice-to-text transcription with on-device Whisper processing. No server required, no Termux, no additional apps needed.

## Features

- **100% On-Device Transcription** - Uses sherpa-onnx with Whisper models
- **Privacy-First** - All processing happens locally, no data leaves your device
- **Multiple Trigger Methods**:
  - Floating button overlay (always accessible)
  - Volume button combo (press Volume Up and Volume Down together)
  - Quick Settings tile (notification shade)
- **Smart Routing**:
  - Copy to clipboard
  - Share via any app
  - Save as markdown note
  - Create task (Backlog.md compatible)
- **Intent Detection** - Automatically suggests the best action based on the transcribed content
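
To give a feel for how intent detection could feed the routing options, here is a minimal keyword-based sketch. It is an illustration only: the `SuggestedAction` enum, the cue lists, and `suggestAction` are hypothetical names, and the app's actual heuristics may differ.

```kotlin
// Hypothetical action set mirroring the Smart Routing options listed above.
enum class SuggestedAction { COPY, SHARE, SAVE_NOTE, CREATE_TASK }

// Assumed cue phrases; these are illustrative, not the app's real rules.
val taskCues = listOf("todo", "to do", "remind me", "need to", "task")
val shareCues = listOf("send to", "share with")
val noteCues = listOf("note", "idea", "remember that")

// Returns the first action whose cue appears in the transcript,
// falling back to copying the text to the clipboard.
fun suggestAction(transcript: String): SuggestedAction {
    val text = transcript.lowercase()
    return when {
        taskCues.any { it in text } -> SuggestedAction.CREATE_TASK
        shareCues.any { it in text } -> SuggestedAction.SHARE
        noteCues.any { it in text } -> SuggestedAction.SAVE_NOTE
        else -> SuggestedAction.COPY
    }
}
```

With these cue lists, `suggestAction("Remind me to renew the domain")` yields `CREATE_TASK`, while a transcript with no cue falls back to `COPY`. The result is only a suggestion; the user still picks the final action from the menu.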

## Requirements

- Android 10 (API 29) or higher
- ~40-250MB of free storage for the Whisper model, depending on the model selected
- Microphone permission

## Installation

### From APK (Recommended)

1. Download the latest APK from releases
2. Enable "Install from unknown sources" if prompted
3. Install and open Voice Command
4. Grant microphone permission
5. Wait for the one-time model download (~40-250MB depending on the selected model)

### Build from Source

```bash
# Clone the repository
git clone https://gitea.jeffemmett.com/jeffemmett/voice-command.git
cd voice-command/android-native

# Build debug APK
./gradlew assembleDebug

# Build release APK (requires signing config)
./gradlew assembleRelease
```

The APKs are written under `app/build/outputs/apk/` (with the default Gradle layout, in `debug/` and `release/` subdirectories).

## Usage

### Quick Start

1. **Open the app** and grant microphone permission
2. **Tap the big mic button** to start recording
3. **Speak your note or task**
4. **Tap again to stop** - transcription happens automatically
5. **Choose an action** from the menu

### Trigger Methods

#### Floating Button
- Enable in Settings
- Drag to reposition
- Tap to start/stop recording
- Works over any app

#### Volume Buttons
- Enable Accessibility Service in Settings
- Press Volume Up + Volume Down simultaneously
- Vibration confirms recording start/stop
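
The combo logic itself can be kept separate from Android. The sketch below assumes the accessibility service forwards key events (for example from `AccessibilityService.onKeyEvent`, which requires the service to request key event filtering) into a small detector. `VolumeComboDetector` and its 300 ms window are hypothetical; the key codes match Android's `KeyEvent.KEYCODE_VOLUME_UP` and `KeyEvent.KEYCODE_VOLUME_DOWN`.

```kotlin
import kotlin.math.abs

// Values of KeyEvent.KEYCODE_VOLUME_UP / KEYCODE_VOLUME_DOWN on Android.
const val KEYCODE_VOLUME_UP = 24
const val KEYCODE_VOLUME_DOWN = 25

// Hypothetical helper: reports a combo when both volume keys are held
// and their press times are at most windowMs apart.
class VolumeComboDetector(private val windowMs: Long = 300) {
    private var upPressedAt: Long? = null
    private var downPressedAt: Long? = null
    private var fired = false

    /** Returns true exactly once per combo press. */
    fun onKey(keyCode: Int, pressed: Boolean, timeMs: Long): Boolean {
        when (keyCode) {
            // Keep the first press time so key-repeat events do not reset it.
            KEYCODE_VOLUME_UP -> upPressedAt = if (pressed) (upPressedAt ?: timeMs) else null
            KEYCODE_VOLUME_DOWN -> downPressedAt = if (pressed) (downPressedAt ?: timeMs) else null
            else -> return false
        }
        val up = upPressedAt
        val down = downPressedAt
        if (up == null || down == null) {
            fired = false // a key was released; allow the next combo
            return false
        }
        if (!fired && abs(up - down) <= windowMs) {
            fired = true
            return true
        }
        return false
    }
}
```

A caller would toggle recording and trigger the vibration whenever `onKey` returns `true`.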

#### Quick Settings Tile
- Swipe down to open the notification shade
- Edit the tiles and add the "Voice Note" tile
- Tap tile to toggle recording

## Models

| Model | Size | Languages | Quality |
|-------|------|-----------|---------|
| Tiny English | ~40MB | English only | Good for quick notes |
| Base English | ~75MB | English only | Better accuracy |
| Small English | ~250MB | English only | Best accuracy |
| Tiny | ~40MB | Multilingual | Basic quality |
| Base | ~75MB | Multilingual | Good quality |
| Small | ~250MB | Multilingual | Best quality |
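
As a rough illustration of choosing a model by available storage, the sketch below picks the largest model from the table that fits. The `WhisperModel` enum, `pickModel`, and the 50 MB headroom are assumptions made for this example, not part of the app's API; the sizes are the approximate figures from the table.

```kotlin
// Approximate download sizes in MB, taken from the table above.
enum class WhisperModel(val approxSizeMb: Int, val multilingual: Boolean) {
    TINY_EN(40, false), BASE_EN(75, false), SMALL_EN(250, false),
    TINY(40, true), BASE(75, true), SMALL(250, true)
}

// Hypothetical helper: largest model that fits in free storage while
// leaving some headroom. Returns null if even the Tiny model does not fit.
fun pickModel(freeMb: Long, multilingual: Boolean, headroomMb: Int = 50): WhisperModel? =
    WhisperModel.values()
        .filter { it.multilingual == multilingual }
        .filter { (it.approxSizeMb + headroomMb).toLong() <= freeMb }
        .maxByOrNull { it.approxSizeMb }
```

For example, with 200 MB free and English-only transcription this selects `BASE_EN`, since `SMALL_EN` plus headroom would not fit.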

## Architecture

```
┌─────────────────────────────────────────────────────┐
│                   Voice Command App                  │
├─────────────────────────────────────────────────────┤
│  UI Layer (Jetpack Compose)                         │
│  ├── MainActivity (main interface)                  │
│  ├── RecordingScreen (recording controls)           │
│  └── TranscriptionResultActivity (result dialog)    │
├─────────────────────────────────────────────────────┤
│  Service Layer                                      │
│  ├── FloatingButtonService (overlay)                │
│  ├── VolumeButtonAccessibilityService (vol combo)   │
│  └── VoiceCommandTileService (Quick Settings)       │
├─────────────────────────────────────────────────────┤
│  Core Layer                                         │
│  ├── AudioRecorder (16kHz PCM capture)              │
│  ├── SherpaTranscriptionEngine (Whisper wrapper)    │
│  └── ActionRouter (clipboard, files, share)         │
├─────────────────────────────────────────────────────┤
│  Native Layer (sherpa-onnx)                         │
│  └── Whisper ONNX models + ONNX Runtime             │
└─────────────────────────────────────────────────────┘
```
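
One step between the recorder and the transcription engine is worth spelling out: 16-bit PCM bytes usually have to be converted to normalized float samples before a recognizer can consume them. The function below is a sketch of that conversion under the assumption that the capture is little-endian, mono, 16-bit PCM; `pcm16ToFloat` is a hypothetical name, not the app's actual code.

```kotlin
// Converts little-endian 16-bit PCM bytes into floats in the range [-1, 1).
fun pcm16ToFloat(pcm: ByteArray): FloatArray {
    require(pcm.size % 2 == 0) { "PCM16 data must have an even byte count" }
    return FloatArray(pcm.size / 2) { i ->
        val lo = pcm[2 * i].toInt() and 0xFF   // low byte, treated as unsigned
        val hi = pcm[2 * i + 1].toInt()        // high byte, sign-extended
        ((hi shl 8) or lo) / 32768f
    }
}
```

Keeping the conversion in a pure function makes it easy to unit test without a device or microphone.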

## Permissions

| Permission | Purpose |
|------------|---------|
| `RECORD_AUDIO` | Voice recording |
| `SYSTEM_ALERT_WINDOW` | Floating button overlay |
| `FOREGROUND_SERVICE` | Background recording |
| `POST_NOTIFICATIONS` | Service notifications |
| `VIBRATE` | Recording feedback |

## Output Formats

### Notes (Markdown)
```markdown
# Voice Note Title

Your transcribed text here...

---
Created: 2025-12-06 14:30
Source: voice
```
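
A note in this layout could be produced by a small formatter along these lines. This is a sketch: `formatNote` and its parameters are hypothetical, and how the app derives the title is not covered here.

```kotlin
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Renders a note in the Markdown layout shown above.
fun formatNote(title: String, body: String, created: LocalDateTime): String {
    val stamp = created.format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm"))
    return listOf(
        "# $title",
        "",
        body,
        "",
        "---",
        "Created: $stamp",
        "Source: voice"
    ).joinToString("\n", postfix = "\n")
}
```

Passing the timestamp in as a parameter, rather than calling `LocalDateTime.now()` inside, keeps the output deterministic and testable.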

### Tasks (Backlog.md Compatible)
```markdown
---
title: Task Title
status: To Do
priority: medium
created: 2025-12-06T14:30:00
source: voice
---

# Task Title

Your transcribed text here...
```
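
The task layout can be generated the same way, with the metadata placed in a front matter block before the body. The sketch below assumes a hypothetical `formatTask` function; it does not escape special characters, so a title containing a colon would need quoting before being written into the front matter.

```kotlin
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Renders a task with front matter in the layout shown above.
fun formatTask(
    title: String,
    body: String,
    created: LocalDateTime,
    priority: String = "medium"
): String {
    val stamp = created.format(DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss"))
    return listOf(
        "---",
        "title: $title",
        "status: To Do",
        "priority: $priority",
        "created: $stamp",
        "source: voice",
        "---",
        "",
        "# $title",
        "",
        body
    ).joinToString("\n", postfix = "\n")
}
```

The explicit date pattern is used so that the seconds field is always written, even when it is zero.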

## Troubleshooting

### Model won't load
- Ensure sufficient free storage (up to ~250MB for the Small models)
- Check your internet connection for the initial model download
- Try a smaller model (Tiny instead of Small)

### Recording not working
- Check that the microphone permission is granted
- Ensure no other app is using the microphone
- Try restarting the app

### Volume buttons not detected
- Enable Accessibility Service in Android Settings
- Grant all requested permissions
- Some custom ROMs may block this feature

### Floating button not appearing
- Enable "Display over other apps" permission
- Check notification for "Floating Button Active"
- Some launchers may hide overlays

## Privacy

- **All transcription happens on-device**
- No audio or text is sent to any server
- No analytics or tracking
- Notes/tasks saved only to local storage

## Credits

- [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) - On-device speech recognition
- [OpenAI Whisper](https://openai.com/research/whisper) - Original Whisper model
- [Jetpack Compose](https://developer.android.com/compose) - Modern Android UI

## License

MIT