Initial commit: Native Android voice transcription app

Features:
- On-device Whisper transcription via sherpa-onnx
- Kotlin + Jetpack Compose UI
- Multiple trigger methods:
  - Floating button overlay
  - Volume button combo (Accessibility Service)
  - Quick Settings tile
- Smart action routing:
  - Copy to clipboard
  - Share via apps
  - Save as markdown note
  - Create task (Backlog.md compatible)
- Intent detection for suggested actions

Requires Android 10+ (API 29)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
commit 0fef18eb06
Author: Jeff Emmett
Date: 2025-12-06 22:46:45 -08:00

30 changed files with 2945 additions and 0 deletions

.gitignore (vendored, new file, 29 lines)
# Android/Gradle
*.iml
.gradle/
local.properties
.idea/
*.hprof
build/
captures/
.externalNativeBuild/
.cxx/
*.apk
*.aab
*.ap_
*.dex
# Kotlin
*.class
# OS
.DS_Store
Thumbs.db
# Models (downloaded separately)
app/src/main/assets/models/*.onnx
# Signing
*.jks
*.keystore
keystore.properties

README.md (new file, 192 lines)
# Voice Command - Native Android App
A fully integrated Android app for voice-to-text transcription with on-device Whisper processing. No server required, no Termux, no additional apps needed.
## Features
- **100% On-Device Transcription** - Uses sherpa-onnx with Whisper models
- **Privacy-First** - All processing happens locally, no data leaves your device
- **Multiple Trigger Methods**:
- Floating button overlay (always accessible)
- Volume button combo (press both volume buttons)
- Quick Settings tile (notification shade)
- **Smart Routing**:
- Copy to clipboard
- Share via any app
- Save as markdown note
- Create task (Backlog.md compatible)
- **Intent Detection** - Automatically suggests the best action based on content (see the sketch below)
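Intent detection boils down to keyword heuristics in `ActionRouter.analyzeIntent` (included in this commit). A minimal sketch of calling it; `demoIntentDetection` is an illustrative caller, not part of the app:

```kotlin
import android.content.Context
import com.jeffemmett.voicecommand.action.ActionRouter

// Sketch only: exercises ActionRouter.analyzeIntent from this commit.
fun demoIntentDetection(context: Context) {
    val analysis = ActionRouter(context)
        .analyzeIntent("Remind me to renew the domain ASAP")

    // "remind me" -> TASK, "asap" -> high priority, TASK -> Create Task
    check(analysis.intent == ActionRouter.Intent.TASK)
    check(analysis.priority == "high")
    check(analysis.suggestedAction == ActionRouter.Action.CreateTask)
}
```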
## Requirements
- Android 10 (API 29) or higher
- ~40-250MB storage for the Whisper model (depending on the model chosen)
- Microphone permission
## Installation
### From APK (Recommended)
1. Download the latest APK from releases
2. Enable "Install from unknown sources" if prompted
3. Install and open Voice Command
4. Grant microphone permission
5. Wait for model download (~40-250MB depending on selected model)
### Build from Source
```bash
# Clone the repository
git clone https://gitea.jeffemmett.com/jeffemmett/voice-command.git
cd voice-command/android-native
# Build debug APK
./gradlew assembleDebug
# Build release APK (requires signing config)
./gradlew assembleRelease
```
The APK will be in `app/build/outputs/apk/`.
## Usage
### Quick Start
1. **Open the app** and grant microphone permission
2. **Tap the big mic button** to start recording
3. **Speak your note or task**
4. **Tap again to stop** - transcription happens automatically
5. **Choose an action** from the menu
### Trigger Methods
#### Floating Button
- Enable in Settings
- Drag to reposition
- Tap to start/stop recording
- Works over any app
#### Volume Buttons
- Enable Accessibility Service in Settings
- Press Volume Up + Volume Down simultaneously
- Vibration confirms recording start/stop
#### Quick Settings Tile
- Swipe down notification shade
- Add "Voice Note" tile
- Tap tile to toggle recording
## Models
| Model | Size | Languages | Quality |
|-------|------|-----------|---------|
| Tiny English | ~40MB | English only | Good for quick notes |
| Base English | ~75MB | English only | Better accuracy |
| Small English | ~250MB | English only | Best accuracy |
| Tiny | ~40MB | Multilingual | Basic quality |
| Base | ~75MB | Multilingual | Good quality |
| Small | ~250MB | Multilingual | Best quality |
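The engine defaults to Tiny English; a different model can be requested when initializing. A minimal sketch using the `SherpaTranscriptionEngine` API from this commit (`loadBaseEnglishModel` is an illustrative name):

```kotlin
import com.jeffemmett.voicecommand.VoiceCommandApp
import com.jeffemmett.voicecommand.stt.SherpaTranscriptionEngine

// Sketch: request the Base English model instead of the TINY_EN default.
// initialize() fetches/copies the model files if they are missing.
suspend fun loadBaseEnglishModel(): Boolean {
    val engine = VoiceCommandApp.getInstance().transcriptionEngine
    return engine.initialize(SherpaTranscriptionEngine.WhisperModel.BASE_EN)
}
```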
## Architecture
```
┌─────────────────────────────────────────────────────┐
│ Voice Command App │
├─────────────────────────────────────────────────────┤
│ UI Layer (Jetpack Compose) │
│ ├── MainActivity (main interface) │
│ ├── RecordingScreen (recording controls) │
│ └── TranscriptionResultActivity (result dialog) │
├─────────────────────────────────────────────────────┤
│ Service Layer │
│ ├── FloatingButtonService (overlay) │
│ ├── VolumeButtonAccessibilityService (vol combo) │
│ └── VoiceCommandTileService (Quick Settings) │
├─────────────────────────────────────────────────────┤
│ Core Layer │
│ ├── AudioRecorder (16kHz PCM capture) │
│ ├── SherpaTranscriptionEngine (Whisper wrapper) │
│ └── ActionRouter (clipboard, files, share) │
├─────────────────────────────────────────────────────┤
│ Native Layer (sherpa-onnx) │
│ └── Whisper ONNX models + ONNX Runtime │
└─────────────────────────────────────────────────────┘
```
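A single pass through these layers, from capture to routing, looks roughly like this. This is a sketch built from the classes in this commit; `finishAndRoute` itself is illustrative:

```kotlin
import android.content.Context
import com.jeffemmett.voicecommand.VoiceCommandApp
import com.jeffemmett.voicecommand.action.ActionRouter
import com.jeffemmett.voicecommand.audio.AudioRecorder

// Sketch: assumes recording was started earlier via recorder.startRecording().
suspend fun finishAndRoute(context: Context, recorder: AudioRecorder) {
    val samples = recorder.stopRecording()          // Core: 16kHz float PCM
    val text = VoiceCommandApp.getInstance()
        .transcriptionEngine
        .transcribe(samples) ?: return              // Native: Whisper via ONNX
    val router = ActionRouter(context)              // Core: action routing
    when (router.analyzeIntent(text).suggestedAction) {
        ActionRouter.Action.CreateTask -> router.createTask(text)
        ActionRouter.Action.SaveNote -> router.saveAsNote(text)
        else -> router.copyToClipboard(text)
    }
    recorder.setIdle()
}
```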
## Permissions
| Permission | Purpose |
|------------|---------|
| `RECORD_AUDIO` | Voice recording |
| `SYSTEM_ALERT_WINDOW` | Floating button overlay |
| `FOREGROUND_SERVICE` | Background recording |
| `POST_NOTIFICATIONS` | Service notifications |
| `VIBRATE` | Recording feedback |
## Output Formats
### Notes (Markdown)
```markdown
# Voice Note Title

Your transcribed text here...

---
Created: 2025-12-06 14:30
Source: voice
```
### Tasks (Backlog.md Compatible)
```markdown
---
title: Task Title
status: To Do
priority: medium
created: 2025-12-06T14:30:00
source: voice
---

# Task Title

Your transcribed text here...
```
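Both formats are produced by `ActionRouter` (shown below). For example, a task file in this format can be written with (`saveExampleTask` is an illustrative wrapper):

```kotlin
import android.content.Context
import com.jeffemmett.voicecommand.action.ActionRouter

// Sketch: createTask writes the front-matter format shown above to
// <app external files>/Documents/VoiceTasks/<timestamp>-<title>.md.
fun saveExampleTask(context: Context) {
    val file = ActionRouter(context).createTask(
        text = "Call the dentist to reschedule",
        priority = "medium"
    )
    println(file?.absolutePath)
}
```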
## Troubleshooting
### Model won't load
- Ensure sufficient storage (~250MB free)
- Check internet connection for initial download
- Try a smaller model (Tiny instead of Small)
### Recording not working
- Check microphone permission is granted
- Ensure no other app is using microphone
- Try restarting the app
### Volume buttons not detected
- Enable Accessibility Service in Android Settings
- Grant all requested permissions
- Some custom ROMs may block this feature
### Floating button not appearing
- Enable "Display over other apps" permission
- Check notification for "Floating Button Active"
- Some launchers may hide overlays
## Privacy
- **All transcription happens on-device**
- No audio or text is sent to any server
- No analytics or tracking
- Notes/tasks saved only to local storage
## Credits
- [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) - On-device speech recognition
- [OpenAI Whisper](https://openai.com/research/whisper) - Original Whisper model
- [Jetpack Compose](https://developer.android.com/compose) - Modern Android UI
## License
MIT

app/build.gradle.kts (new file, 84 lines)
plugins {
alias(libs.plugins.android.application)
alias(libs.plugins.kotlin.android)
alias(libs.plugins.kotlin.compose)
}
android {
namespace = "com.jeffemmett.voicecommand"
compileSdk = 35
defaultConfig {
applicationId = "com.jeffemmett.voicecommand"
minSdk = 29 // Android 10
targetSdk = 35
versionCode = 1
versionName = "1.0.0"
testInstrumentationRunner = "androidx.test.runner.AndroidJUnitRunner"
// Package native libraries for these ABIs (required by sherpa-onnx)
ndk {
abiFilters += listOf("arm64-v8a", "armeabi-v7a", "x86_64")
}
}
buildTypes {
release {
isMinifyEnabled = true
isShrinkResources = true
proguardFiles(
getDefaultProguardFile("proguard-android-optimize.txt"),
"proguard-rules.pro"
)
}
debug {
isMinifyEnabled = false
}
}
compileOptions {
sourceCompatibility = JavaVersion.VERSION_17
targetCompatibility = JavaVersion.VERSION_17
}
kotlinOptions {
jvmTarget = "17"
}
buildFeatures {
compose = true
}
packaging {
resources {
excludes += "/META-INF/{AL2.0,LGPL2.1}"
}
// Extract native libs at install time so sherpa-onnx can load them
jniLibs {
useLegacyPackaging = true
}
}
}
dependencies {
implementation(libs.androidx.core.ktx)
implementation(libs.androidx.lifecycle.runtime.ktx)
implementation(libs.androidx.lifecycle.viewmodel.compose)
implementation(libs.androidx.activity.compose)
implementation(libs.androidx.datastore.preferences)
implementation(libs.kotlinx.coroutines.android)
// Compose
implementation(platform(libs.androidx.compose.bom))
implementation(libs.androidx.ui)
implementation(libs.androidx.ui.graphics)
implementation(libs.androidx.ui.tooling.preview)
implementation(libs.androidx.material3)
implementation(libs.androidx.material.icons.extended)
// Sherpa-ONNX for on-device speech recognition
implementation(libs.sherpa.onnx)
debugImplementation(libs.androidx.ui.tooling)
}

app/proguard-rules.pro (vendored, new file, 21 lines)
# Voice Command ProGuard Rules
# Keep sherpa-onnx native methods
-keep class com.k2fsa.sherpa.onnx.** { *; }
-keepclassmembers class com.k2fsa.sherpa.onnx.** { *; }
# Keep native methods
-keepclasseswithmembernames class * {
native <methods>;
}
# Keep Kotlin metadata for reflection
-keepattributes *Annotation*
-keepattributes RuntimeVisibleAnnotations
# Keep coroutines
-keepnames class kotlinx.coroutines.** { *; }
# Keep Compose
-keep class androidx.compose.** { *; }
-keepclassmembers class androidx.compose.** { *; }

app/src/main/AndroidManifest.xml (new file, 109 lines)
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:tools="http://schemas.android.com/tools">
<!-- Audio recording -->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<!-- Foreground service for overlay and recording -->
<uses-permission android:name="android.permission.FOREGROUND_SERVICE" />
<uses-permission android:name="android.permission.FOREGROUND_SERVICE_SPECIAL_USE" />
<uses-permission android:name="android.permission.POST_NOTIFICATIONS" />
<!-- Overlay permission for floating button -->
<uses-permission android:name="android.permission.SYSTEM_ALERT_WINDOW" />
<!-- Vibration feedback -->
<uses-permission android:name="android.permission.VIBRATE" />
<!-- Wake lock for recording -->
<uses-permission android:name="android.permission.WAKE_LOCK" />
<application
android:name=".VoiceCommandApp"
android:allowBackup="true"
android:icon="@mipmap/ic_launcher"
android:label="@string/app_name"
android:roundIcon="@mipmap/ic_launcher_round"
android:supportsRtl="true"
android:theme="@style/Theme.VoiceCommand"
android:largeHeap="true"
tools:targetApi="35">
<!-- Main Activity -->
<activity
android:name=".MainActivity"
android:exported="true"
android:theme="@style/Theme.VoiceCommand"
android:launchMode="singleTask">
<intent-filter>
<action android:name="android.intent.action.MAIN" />
<category android:name="android.intent.category.LAUNCHER" />
</intent-filter>
</activity>
<!-- Transcription Result Activity (for showing results) -->
<activity
android:name=".ui.TranscriptionResultActivity"
android:exported="false"
android:theme="@style/Theme.VoiceCommand.Dialog"
android:excludeFromRecents="true"
android:taskAffinity="" />
<!-- Floating Button Overlay Service -->
<service
android:name=".service.FloatingButtonService"
android:exported="false"
android:foregroundServiceType="specialUse">
<property
android:name="android.app.PROPERTY_SPECIAL_USE_FGS_SUBTYPE"
android:value="voice_recording_overlay" />
</service>
<!-- Voice Recording Service -->
<service
android:name=".service.VoiceRecordingService"
android:exported="false"
android:foregroundServiceType="microphone" />
<!-- Accessibility Service for Volume Button Detection -->
<service
android:name=".service.VolumeButtonAccessibilityService"
android:exported="true"
android:permission="android.permission.BIND_ACCESSIBILITY_SERVICE">
<intent-filter>
<action android:name="android.accessibilityservice.AccessibilityService" />
</intent-filter>
<meta-data
android:name="android.accessibilityservice"
android:resource="@xml/accessibility_service_config" />
</service>
<!-- Quick Settings Tile -->
<service
android:name=".service.VoiceCommandTileService"
android:exported="true"
android:icon="@drawable/ic_mic"
android:label="@string/tile_label"
android:permission="android.permission.BIND_QUICK_SETTINGS_TILE">
<intent-filter>
<action android:name="android.service.quicksettings.action.QS_TILE" />
</intent-filter>
<meta-data
android:name="android.service.quicksettings.ACTIVE_TILE"
android:value="true" />
</service>
<!-- Broadcast Receiver for boot and triggers -->
<receiver
android:name=".service.BootReceiver"
android:exported="true">
<intent-filter>
<action android:name="android.intent.action.BOOT_COMPLETED" />
<action android:name="android.intent.action.QUICKBOOT_POWERON" />
</intent-filter>
</receiver>
</application>
</manifest>

MainActivity.kt (new file, 319 lines)
package com.jeffemmett.voicecommand
import android.Manifest
import android.content.Intent
import android.net.Uri
import android.os.Bundle
import android.provider.Settings
import androidx.activity.ComponentActivity
import androidx.activity.compose.setContent
import androidx.activity.enableEdgeToEdge
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.foundation.layout.*
import androidx.compose.foundation.rememberScrollState
import androidx.compose.foundation.verticalScroll
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.*
import androidx.compose.material3.*
import androidx.compose.runtime.*
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.text.style.TextAlign
import androidx.compose.ui.unit.dp
import androidx.lifecycle.compose.collectAsStateWithLifecycle
import com.jeffemmett.voicecommand.audio.AudioRecorder
import com.jeffemmett.voicecommand.service.FloatingButtonService
import com.jeffemmett.voicecommand.stt.SherpaTranscriptionEngine
import com.jeffemmett.voicecommand.ui.RecordingScreen
import com.jeffemmett.voicecommand.ui.theme.VoiceCommandTheme
import kotlinx.coroutines.launch
class MainActivity : ComponentActivity() {
private val requestPermissionLauncher = registerForActivityResult(
ActivityResultContracts.RequestPermission()
) { isGranted ->
// Permission result handled in compose state
}
override fun onCreate(savedInstanceState: Bundle?) {
super.onCreate(savedInstanceState)
enableEdgeToEdge()
setContent {
VoiceCommandTheme {
Surface(
modifier = Modifier.fillMaxSize(),
color = MaterialTheme.colorScheme.background
) {
MainScreen(
onRequestMicPermission = {
requestPermissionLauncher.launch(Manifest.permission.RECORD_AUDIO)
},
onRequestOverlayPermission = {
val intent = Intent(
Settings.ACTION_MANAGE_OVERLAY_PERMISSION,
Uri.parse("package:$packageName")
)
startActivity(intent)
},
onOpenAccessibilitySettings = {
startActivity(Intent(Settings.ACTION_ACCESSIBILITY_SETTINGS))
}
)
}
}
}
}
}
@OptIn(ExperimentalMaterial3Api::class)
@Composable
fun MainScreen(
onRequestMicPermission: () -> Unit,
onRequestOverlayPermission: () -> Unit,
onOpenAccessibilitySettings: () -> Unit
) {
val context = LocalContext.current
val app = VoiceCommandApp.getInstance()
val scope = rememberCoroutineScope()
val audioRecorder = remember { AudioRecorder(context) }
val recordingState by audioRecorder.state.collectAsStateWithLifecycle()
val engineState by app.transcriptionEngine.state.collectAsStateWithLifecycle()
var showSettings by remember { mutableStateOf(false) }
var floatingButtonEnabled by remember { mutableStateOf(false) }
// Check permissions
val hasMicPermission = audioRecorder.hasPermission()
val hasOverlayPermission = Settings.canDrawOverlays(context)
// Initialize engine on first launch
LaunchedEffect(Unit) {
if (engineState is SherpaTranscriptionEngine.EngineState.NotInitialized) {
app.transcriptionEngine.initialize()
}
}
Scaffold(
topBar = {
TopAppBar(
title = { Text("Voice Command") },
actions = {
IconButton(onClick = { showSettings = !showSettings }) {
Icon(Icons.Default.Settings, contentDescription = "Settings")
}
}
)
}
) { paddingValues ->
Column(
modifier = Modifier
.fillMaxSize()
.padding(paddingValues)
.verticalScroll(rememberScrollState())
.padding(16.dp),
horizontalAlignment = Alignment.CenterHorizontally
) {
// Status Card
StatusCard(
engineState = engineState,
hasMicPermission = hasMicPermission,
hasOverlayPermission = hasOverlayPermission,
onRequestMicPermission = onRequestMicPermission,
onRequestOverlayPermission = onRequestOverlayPermission
)
Spacer(modifier = Modifier.height(24.dp))
// Recording Section
if (hasMicPermission && app.transcriptionEngine.isReady()) {
RecordingScreen(
audioRecorder = audioRecorder,
transcriptionEngine = app.transcriptionEngine
)
}
Spacer(modifier = Modifier.height(24.dp))
// Trigger Options
TriggerOptionsCard(
floatingButtonEnabled = floatingButtonEnabled,
onFloatingButtonToggle = { enabled ->
floatingButtonEnabled = enabled
if (enabled && hasOverlayPermission) {
context.startService(Intent(context, FloatingButtonService::class.java))
} else {
context.stopService(Intent(context, FloatingButtonService::class.java))
}
},
hasOverlayPermission = hasOverlayPermission,
onRequestOverlayPermission = onRequestOverlayPermission,
onOpenAccessibilitySettings = onOpenAccessibilitySettings
)
}
}
}
@Composable
fun StatusCard(
engineState: SherpaTranscriptionEngine.EngineState,
hasMicPermission: Boolean,
hasOverlayPermission: Boolean,
onRequestMicPermission: () -> Unit,
onRequestOverlayPermission: () -> Unit
) {
Card(
modifier = Modifier.fillMaxWidth(),
colors = CardDefaults.cardColors(
containerColor = when (engineState) {
is SherpaTranscriptionEngine.EngineState.Ready -> MaterialTheme.colorScheme.primaryContainer
is SherpaTranscriptionEngine.EngineState.Error -> MaterialTheme.colorScheme.errorContainer
else -> MaterialTheme.colorScheme.surfaceVariant
}
)
) {
Column(
modifier = Modifier.padding(16.dp),
horizontalAlignment = Alignment.CenterHorizontally
) {
// Engine status
Row(
verticalAlignment = Alignment.CenterVertically,
horizontalArrangement = Arrangement.Center
) {
when (engineState) {
is SherpaTranscriptionEngine.EngineState.Ready -> {
Icon(Icons.Default.CheckCircle, "Ready", tint = MaterialTheme.colorScheme.primary)
Spacer(Modifier.width(8.dp))
Text("Transcription Engine Ready")
}
is SherpaTranscriptionEngine.EngineState.Initializing -> {
CircularProgressIndicator(modifier = Modifier.size(20.dp))
Spacer(Modifier.width(8.dp))
Text("Loading model...")
}
is SherpaTranscriptionEngine.EngineState.Downloading -> {
CircularProgressIndicator(
progress = { engineState.progress },
modifier = Modifier.size(20.dp)
)
Spacer(Modifier.width(8.dp))
Text("Downloading ${engineState.modelName}...")
}
is SherpaTranscriptionEngine.EngineState.Error -> {
Icon(Icons.Default.Error, "Error", tint = MaterialTheme.colorScheme.error)
Spacer(Modifier.width(8.dp))
Text(engineState.message, color = MaterialTheme.colorScheme.error)
}
is SherpaTranscriptionEngine.EngineState.NotInitialized -> {
Icon(Icons.Default.HourglassEmpty, "Waiting")
Spacer(Modifier.width(8.dp))
Text("Initializing...")
}
}
}
// Permission warnings
if (!hasMicPermission) {
Spacer(Modifier.height(12.dp))
OutlinedButton(onClick = onRequestMicPermission) {
Icon(Icons.Default.Mic, null)
Spacer(Modifier.width(8.dp))
Text("Grant Microphone Permission")
}
}
}
}
}
@Composable
fun TriggerOptionsCard(
floatingButtonEnabled: Boolean,
onFloatingButtonToggle: (Boolean) -> Unit,
hasOverlayPermission: Boolean,
onRequestOverlayPermission: () -> Unit,
onOpenAccessibilitySettings: () -> Unit
) {
Card(modifier = Modifier.fillMaxWidth()) {
Column(modifier = Modifier.padding(16.dp)) {
Text(
"Recording Triggers",
style = MaterialTheme.typography.titleMedium,
modifier = Modifier.padding(bottom = 12.dp)
)
// Floating Button
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween,
verticalAlignment = Alignment.CenterVertically
) {
Column(modifier = Modifier.weight(1f)) {
Text("Floating Button")
Text(
"Show always-visible mic button",
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
}
if (hasOverlayPermission) {
Switch(
checked = floatingButtonEnabled,
onCheckedChange = onFloatingButtonToggle
)
} else {
TextButton(onClick = onRequestOverlayPermission) {
Text("Enable")
}
}
}
HorizontalDivider(modifier = Modifier.padding(vertical = 12.dp))
// Volume Buttons
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween,
verticalAlignment = Alignment.CenterVertically
) {
Column(modifier = Modifier.weight(1f)) {
Text("Volume Button Trigger")
Text(
"Press both volume buttons to record",
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
}
TextButton(onClick = onOpenAccessibilitySettings) {
Text("Setup")
}
}
HorizontalDivider(modifier = Modifier.padding(vertical = 12.dp))
// Quick Settings Tile
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween,
verticalAlignment = Alignment.CenterVertically
) {
Column(modifier = Modifier.weight(1f)) {
Text("Quick Settings Tile")
Text(
"Add tile to notification shade",
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
}
Icon(
Icons.Default.CheckCircle,
"Available",
tint = MaterialTheme.colorScheme.primary
)
}
}
}
}

VoiceCommandApp.kt (new file, 62 lines)
package com.jeffemmett.voicecommand
import android.app.Application
import android.app.NotificationChannel
import android.app.NotificationManager
import android.content.Context
import com.jeffemmett.voicecommand.stt.SherpaTranscriptionEngine
class VoiceCommandApp : Application() {
lateinit var transcriptionEngine: SherpaTranscriptionEngine
private set
override fun onCreate() {
super.onCreate()
instance = this
createNotificationChannels()
initializeTranscriptionEngine()
}
private fun createNotificationChannels() {
val notificationManager = getSystemService(NotificationManager::class.java)
// Recording channel
val recordingChannel = NotificationChannel(
CHANNEL_RECORDING,
getString(R.string.notification_channel_recording),
NotificationManager.IMPORTANCE_LOW
).apply {
description = "Shows when voice recording is active"
setShowBadge(false)
}
// Overlay channel
val overlayChannel = NotificationChannel(
CHANNEL_OVERLAY,
getString(R.string.notification_channel_overlay),
NotificationManager.IMPORTANCE_MIN
).apply {
description = "Floating button service notification"
setShowBadge(false)
}
notificationManager.createNotificationChannels(listOf(recordingChannel, overlayChannel))
}
private fun initializeTranscriptionEngine() {
transcriptionEngine = SherpaTranscriptionEngine(this)
}
companion object {
const val CHANNEL_RECORDING = "voice_recording"
const val CHANNEL_OVERLAY = "floating_overlay"
private lateinit var instance: VoiceCommandApp
fun getInstance(): VoiceCommandApp = instance
fun getAppContext(): Context = instance.applicationContext
}
}

ActionRouter.kt (new file, 245 lines)
package com.jeffemmett.voicecommand.action
import android.content.ClipData
import android.content.ClipboardManager
import android.content.Context
import android.content.Intent
import android.os.Environment
import android.util.Log
import android.widget.Toast
import java.io.File
import java.text.SimpleDateFormat
import java.util.Date
import java.util.Locale
/**
* Routes transcription results to various destinations.
*/
class ActionRouter(private val context: Context) {
companion object {
private const val TAG = "ActionRouter"
}
sealed class Action(val displayName: String, val icon: String) {
data object Copy : Action("Copy to Clipboard", "content_copy")
data object Share : Action("Share", "share")
data object SaveNote : Action("Save as Note", "note_add")
data object CreateTask : Action("Create Task", "task_alt")
data object Dismiss : Action("Dismiss", "close")
}
/**
* Copy text to clipboard.
*/
fun copyToClipboard(text: String): Boolean {
return try {
val clipboard = context.getSystemService(Context.CLIPBOARD_SERVICE) as ClipboardManager
val clip = ClipData.newPlainText("Voice Transcription", text)
clipboard.setPrimaryClip(clip)
showToast("Copied to clipboard")
true
} catch (e: Exception) {
Log.e(TAG, "Failed to copy to clipboard", e)
false
}
}
/**
* Share text via Android share sheet.
*/
fun share(text: String, title: String = "Voice Note") {
val sendIntent = Intent().apply {
action = Intent.ACTION_SEND
putExtra(Intent.EXTRA_TEXT, text)
putExtra(Intent.EXTRA_SUBJECT, title)
type = "text/plain"
}
val shareIntent = Intent.createChooser(sendIntent, "Share voice note")
shareIntent.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK)
context.startActivity(shareIntent)
}
/**
* Save transcription as a markdown note.
*/
fun saveAsNote(text: String, title: String? = null): File? {
return try {
val notesDir = getNotesDirectory()
notesDir.mkdirs()
val timestamp = SimpleDateFormat("yyyy-MM-dd-HHmmss", Locale.US).format(Date())
val noteTitle = title ?: extractTitle(text)
val safeTitle = noteTitle.take(40).replace(Regex("[^a-zA-Z0-9 -]"), "")
val filename = "$timestamp-$safeTitle.md"
val file = File(notesDir, filename)
val content = buildNoteContent(text, noteTitle)
file.writeText(content)
showToast("Saved: $filename")
Log.i(TAG, "Note saved: ${file.absolutePath}")
file
} catch (e: Exception) {
Log.e(TAG, "Failed to save note", e)
showToast("Failed to save note")
null
}
}
/**
* Create a task file (compatible with Backlog.md format).
*/
fun createTask(text: String, title: String? = null, priority: String = "medium"): File? {
return try {
val tasksDir = getTasksDirectory()
tasksDir.mkdirs()
val timestamp = SimpleDateFormat("yyyy-MM-dd-HHmmss", Locale.US).format(Date())
val taskTitle = title ?: extractTitle(text)
val safeTitle = taskTitle.take(40).replace(Regex("[^a-zA-Z0-9 -]"), "")
val filename = "$timestamp-$safeTitle.md"
val file = File(tasksDir, filename)
val content = buildTaskContent(text, taskTitle, priority)
file.writeText(content)
showToast("Task created: $taskTitle")
Log.i(TAG, "Task saved: ${file.absolutePath}")
file
} catch (e: Exception) {
Log.e(TAG, "Failed to create task", e)
showToast("Failed to create task")
null
}
}
/**
* Analyze text to determine best routing.
*/
fun analyzeIntent(text: String): AnalysisResult {
val textLower = text.lowercase()
// Detect intent from keywords
val intent = when {
textLower.containsAny("task", "todo", "need to", "should", "must", "remind me") -> Intent.TASK
textLower.containsAny("?", "how", "what", "why", "when", "where", "can you", "help me") -> Intent.QUESTION
textLower.containsAny("idea", "thought", "maybe", "what if", "consider") -> Intent.IDEA
else -> Intent.NOTE
}
// Extract title
val title = extractTitle(text)
// Determine priority
val priority = when {
textLower.containsAny("urgent", "asap", "immediately", "critical") -> "high"
textLower.containsAny("when you get a chance", "eventually", "sometime") -> "low"
else -> "medium"
}
// Suggest action
val suggestedAction = when (intent) {
Intent.TASK -> Action.CreateTask
Intent.QUESTION -> Action.Share
Intent.IDEA, Intent.NOTE -> Action.SaveNote
}
return AnalysisResult(
intent = intent,
title = title,
cleanedText = text.trim(),
priority = priority,
suggestedAction = suggestedAction
)
}
private fun extractTitle(text: String): String {
// Get first sentence or first 60 chars
val firstSentence = text.split(Regex("[.?!]")).firstOrNull()?.trim() ?: text
val title = firstSentence.take(60)
// Clean up common voice prefixes
val prefixes = listOf(
"create a task to ",
"task to ",
"add task ",
"note ",
"remind me to ",
"please "
)
var cleanTitle = title.lowercase()
for (prefix in prefixes) {
if (cleanTitle.startsWith(prefix)) {
cleanTitle = title.substring(prefix.length)
break
}
}
return cleanTitle.trim().replaceFirstChar { it.uppercaseChar() }.ifEmpty { "Voice Note" }
}
private fun buildNoteContent(text: String, title: String): String {
val timestamp = SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.US).format(Date())
return """
|# $title
|
|$text
|
|---
|Created: $timestamp
|Source: voice
""".trimMargin()
}
private fun buildTaskContent(text: String, title: String, priority: String): String {
val timestamp = SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss", Locale.US).format(Date())
return """
|---
|title: $title
|status: To Do
|priority: $priority
|created: $timestamp
|source: voice
|---
|
|# $title
|
|$text
""".trimMargin()
}
private fun getNotesDirectory(): File {
// Try external storage first, fall back to app files
val externalDir = context.getExternalFilesDir(Environment.DIRECTORY_DOCUMENTS)
return File(externalDir ?: context.filesDir, "VoiceNotes")
}
private fun getTasksDirectory(): File {
val externalDir = context.getExternalFilesDir(Environment.DIRECTORY_DOCUMENTS)
return File(externalDir ?: context.filesDir, "VoiceTasks")
}
private fun showToast(message: String) {
Toast.makeText(context, message, Toast.LENGTH_SHORT).show()
}
private fun String.containsAny(vararg keywords: String): Boolean =
keywords.any { this.contains(it) }
enum class Intent {
TASK, NOTE, QUESTION, IDEA
}
data class AnalysisResult(
val intent: Intent,
val title: String,
val cleanedText: String,
val priority: String,
val suggestedAction: Action
)
}

AudioRecorder.kt (new file, 223 lines)
package com.jeffemmett.voicecommand.audio
import android.Manifest
import android.content.Context
import android.content.pm.PackageManager
import android.media.AudioFormat
import android.media.AudioRecord
import android.media.MediaRecorder
import android.util.Log
import androidx.core.content.ContextCompat
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.flowOn
import kotlinx.coroutines.isActive
import kotlinx.coroutines.withContext
import java.io.File
import java.io.FileOutputStream
import java.nio.ByteBuffer
import java.nio.ByteOrder
import kotlin.coroutines.coroutineContext
/**
* Audio recorder that captures audio for transcription.
* Uses AudioRecord for raw PCM data that can be fed directly to Whisper.
*/
class AudioRecorder(private val context: Context) {
companion object {
private const val TAG = "AudioRecorder"
// Whisper expects 16kHz mono audio
const val SAMPLE_RATE = 16000
const val CHANNEL_CONFIG = AudioFormat.CHANNEL_IN_MONO
const val AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT
val BUFFER_SIZE: Int = AudioRecord.getMinBufferSize(
SAMPLE_RATE,
CHANNEL_CONFIG,
AUDIO_FORMAT
).coerceAtLeast(SAMPLE_RATE * 2) // At least 1 second buffer
}
sealed class RecordingState {
data object Idle : RecordingState()
data object Recording : RecordingState()
data object Processing : RecordingState()
data class Error(val message: String) : RecordingState()
}
private val _state = MutableStateFlow<RecordingState>(RecordingState.Idle)
val state: StateFlow<RecordingState> = _state.asStateFlow()
private var audioRecord: AudioRecord? = null
private val audioBuffer = mutableListOf<Short>()
fun hasPermission(): Boolean {
return ContextCompat.checkSelfPermission(
context,
Manifest.permission.RECORD_AUDIO
) == PackageManager.PERMISSION_GRANTED
}
/**
* Start recording audio. Returns a Flow of audio samples for real-time processing.
*/
suspend fun startRecording(): Flow<ShortArray> = flow {
if (!hasPermission()) {
_state.value = RecordingState.Error("Microphone permission not granted")
return@flow
}
try {
audioRecord = AudioRecord(
MediaRecorder.AudioSource.MIC,
SAMPLE_RATE,
CHANNEL_CONFIG,
AUDIO_FORMAT,
BUFFER_SIZE
)
if (audioRecord?.state != AudioRecord.STATE_INITIALIZED) {
_state.value = RecordingState.Error("Failed to initialize AudioRecord")
return@flow
}
audioRecord?.startRecording()
_state.value = RecordingState.Recording
audioBuffer.clear()
Log.d(TAG, "Recording started")
val buffer = ShortArray(BUFFER_SIZE / 2)
while (coroutineContext.isActive && _state.value == RecordingState.Recording) {
val readCount = audioRecord?.read(buffer, 0, buffer.size) ?: -1
if (readCount > 0) {
val samples = buffer.copyOf(readCount)
audioBuffer.addAll(samples.toList())
emit(samples)
} else if (readCount < 0) {
Log.e(TAG, "AudioRecord read error: $readCount")
break
}
}
} catch (e: Exception) {
Log.e(TAG, "Recording error", e)
_state.value = RecordingState.Error(e.message ?: "Recording failed")
}
}.flowOn(Dispatchers.IO)
/**
* Stop recording and return the complete audio buffer as float samples.
*/
suspend fun stopRecording(): FloatArray = withContext(Dispatchers.IO) {
try {
audioRecord?.stop()
audioRecord?.release()
audioRecord = null
_state.value = RecordingState.Processing
Log.d(TAG, "Recording stopped, ${audioBuffer.size} samples captured")
// Convert short samples to float (-1.0 to 1.0)
val floatSamples = FloatArray(audioBuffer.size)
for (i in audioBuffer.indices) {
floatSamples[i] = audioBuffer[i].toFloat() / Short.MAX_VALUE
}
audioBuffer.clear()
floatSamples
} catch (e: Exception) {
Log.e(TAG, "Error stopping recording", e)
_state.value = RecordingState.Error(e.message ?: "Stop failed")
floatArrayOf()
}
}
/**
* Cancel recording without processing.
*/
fun cancel() {
try {
audioRecord?.stop()
audioRecord?.release()
audioRecord = null
audioBuffer.clear()
_state.value = RecordingState.Idle
} catch (e: Exception) {
Log.e(TAG, "Error canceling recording", e)
}
}
fun setIdle() {
_state.value = RecordingState.Idle
}
/**
* Save audio buffer to WAV file for debugging.
*/
suspend fun saveToWav(file: File): Boolean = withContext(Dispatchers.IO) {
try {
val samples = audioBuffer.toShortArray()
val byteBuffer = ByteBuffer.allocate(samples.size * 2)
byteBuffer.order(ByteOrder.LITTLE_ENDIAN)
samples.forEach { byteBuffer.putShort(it) }
FileOutputStream(file).use { fos ->
// Write WAV header
writeWavHeader(fos, samples.size * 2, SAMPLE_RATE, 1, 16)
// Write audio data
fos.write(byteBuffer.array())
}
true
} catch (e: Exception) {
Log.e(TAG, "Failed to save WAV", e)
false
}
}
private fun writeWavHeader(
out: FileOutputStream,
dataSize: Int,
sampleRate: Int,
channels: Int,
bitsPerSample: Int
) {
val totalSize = dataSize + 36
val byteRate = sampleRate * channels * bitsPerSample / 8
val blockAlign = channels * bitsPerSample / 8
val header = ByteBuffer.allocate(44)
header.order(ByteOrder.LITTLE_ENDIAN)
// RIFF header
header.put("RIFF".toByteArray())
header.putInt(totalSize)
header.put("WAVE".toByteArray())
// fmt subchunk
header.put("fmt ".toByteArray())
header.putInt(16) // Subchunk1Size
header.putShort(1) // AudioFormat (PCM)
header.putShort(channels.toShort())
header.putInt(sampleRate)
header.putInt(byteRate)
header.putShort(blockAlign.toShort())
header.putShort(bitsPerSample.toShort())
// data subchunk
header.put("data".toByteArray())
header.putInt(dataSize)
out.write(header.array())
}
}

BootReceiver.kt (new file, 33 lines)
package com.jeffemmett.voicecommand.service
import android.content.BroadcastReceiver
import android.content.Context
import android.content.Intent
import android.util.Log
import androidx.datastore.preferences.core.booleanPreferencesKey
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.launch
/**
* Receives boot completed broadcasts to restore floating button service if enabled.
*/
class BootReceiver : BroadcastReceiver() {
companion object {
private const val TAG = "BootReceiver"
}
override fun onReceive(context: Context, intent: Intent) {
if (intent.action == Intent.ACTION_BOOT_COMPLETED ||
intent.action == "android.intent.action.QUICKBOOT_POWERON"
) {
Log.i(TAG, "Boot completed, checking if floating button should be started")
// Check preferences and start service if enabled
// For now, we don't auto-start - user must enable manually
// This can be enhanced with DataStore preferences
}
}
}

FloatingButtonService.kt (new file, 206 lines)
package com.jeffemmett.voicecommand.service
import android.app.Notification
import android.app.PendingIntent
import android.app.Service
import android.content.Intent
import android.graphics.PixelFormat
import android.os.Build
import android.os.IBinder
import android.view.Gravity
import android.view.MotionEvent
import android.view.WindowManager
import androidx.compose.foundation.background
import androidx.compose.foundation.gestures.detectDragGestures
import androidx.compose.foundation.gestures.detectTapGestures
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.shape.CircleShape
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.Mic
import androidx.compose.material.icons.filled.Stop
import androidx.compose.material3.Icon
import androidx.compose.runtime.*
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.input.pointer.pointerInput
import androidx.compose.ui.platform.ComposeView
import androidx.compose.ui.unit.dp
import androidx.core.app.NotificationCompat
import androidx.lifecycle.Lifecycle
import androidx.lifecycle.LifecycleOwner
import androidx.lifecycle.LifecycleRegistry
import androidx.lifecycle.setViewTreeLifecycleOwner
import androidx.savedstate.SavedStateRegistry
import androidx.savedstate.SavedStateRegistryController
import androidx.savedstate.SavedStateRegistryOwner
import androidx.savedstate.setViewTreeSavedStateRegistryOwner
import com.jeffemmett.voicecommand.MainActivity
import com.jeffemmett.voicecommand.R
import com.jeffemmett.voicecommand.VoiceCommandApp
import com.jeffemmett.voicecommand.audio.AudioRecorder
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.collect
/**
* Service that displays a floating button overlay for quick voice recording.
*/
class FloatingButtonService : Service(), LifecycleOwner, SavedStateRegistryOwner {
private lateinit var windowManager: WindowManager
private var floatingView: ComposeView? = null
private val serviceScope = CoroutineScope(Dispatchers.Main + SupervisorJob())
private lateinit var audioRecorder: AudioRecorder
private var isRecording = false
// Lifecycle management for Compose
private val lifecycleRegistry = LifecycleRegistry(this)
private val savedStateRegistryController = SavedStateRegistryController.create(this)
override val lifecycle: Lifecycle get() = lifecycleRegistry
override val savedStateRegistry: SavedStateRegistry
get() = savedStateRegistryController.savedStateRegistry
override fun onCreate() {
super.onCreate()
savedStateRegistryController.performRestore(null)
lifecycleRegistry.handleLifecycleEvent(Lifecycle.Event.ON_CREATE)
windowManager = getSystemService(WINDOW_SERVICE) as WindowManager
audioRecorder = AudioRecorder(this)
createFloatingButton()
startForeground(NOTIFICATION_ID, createNotification())
lifecycleRegistry.handleLifecycleEvent(Lifecycle.Event.ON_START)
lifecycleRegistry.handleLifecycleEvent(Lifecycle.Event.ON_RESUME)
}
private fun createFloatingButton() {
val params = WindowManager.LayoutParams(
WindowManager.LayoutParams.WRAP_CONTENT,
WindowManager.LayoutParams.WRAP_CONTENT,
WindowManager.LayoutParams.TYPE_APPLICATION_OVERLAY,
WindowManager.LayoutParams.FLAG_NOT_FOCUSABLE or
WindowManager.LayoutParams.FLAG_LAYOUT_IN_SCREEN,
PixelFormat.TRANSLUCENT
).apply {
gravity = Gravity.TOP or Gravity.START
x = 50
y = 300
}
floatingView = ComposeView(this).apply {
setViewTreeLifecycleOwner(this@FloatingButtonService)
setViewTreeSavedStateRegistryOwner(this@FloatingButtonService)
setContent {
var recording by remember { mutableStateOf(false) }
var offsetX by remember { mutableFloatStateOf(0f) }
var offsetY by remember { mutableFloatStateOf(0f) }
Box(
modifier = Modifier
.size(56.dp)
.clip(CircleShape)
.background(if (recording) Color.Red else Color(0xFF6200EE))
.pointerInput(Unit) {
detectTapGestures(
onTap = {
recording = !recording
isRecording = recording
if (recording) {
startRecording()
} else {
stopRecording()
}
}
)
}
.pointerInput(Unit) {
detectDragGestures { change, dragAmount ->
change.consume()
params.x += dragAmount.x.toInt()
params.y += dragAmount.y.toInt()
windowManager.updateViewLayout(floatingView, params)
}
},
contentAlignment = Alignment.Center
) {
Icon(
imageVector = if (recording) Icons.Default.Stop else Icons.Default.Mic,
contentDescription = if (recording) "Stop" else "Record",
tint = Color.White,
modifier = Modifier.size(24.dp)
)
}
}
}
windowManager.addView(floatingView, params)
}
private fun startRecording() {
serviceScope.launch {
audioRecorder.startRecording().collect { /* streaming samples */ }
}
}
private fun stopRecording() {
serviceScope.launch {
val samples = audioRecorder.stopRecording()
if (samples.isNotEmpty()) {
val engine = VoiceCommandApp.getInstance().transcriptionEngine
val result = engine.transcribe(samples)
if (result != null) {
// Launch result activity
val intent = Intent(
this@FloatingButtonService,
com.jeffemmett.voicecommand.ui.TranscriptionResultActivity::class.java
).apply {
addFlags(Intent.FLAG_ACTIVITY_NEW_TASK)
putExtra("transcription", result)
}
startActivity(intent)
}
}
audioRecorder.setIdle()
isRecording = false
}
}
private fun createNotification(): Notification {
val intent = Intent(this, MainActivity::class.java)
val pendingIntent = PendingIntent.getActivity(
this, 0, intent,
PendingIntent.FLAG_IMMUTABLE
)
return NotificationCompat.Builder(this, VoiceCommandApp.CHANNEL_OVERLAY)
.setContentTitle(getString(R.string.notification_overlay_title))
.setContentText(getString(R.string.notification_overlay_text))
.setSmallIcon(R.drawable.ic_mic)
.setContentIntent(pendingIntent)
.setOngoing(true)
.build()
}
override fun onBind(intent: Intent?): IBinder? = null
override fun onDestroy() {
lifecycleRegistry.handleLifecycleEvent(Lifecycle.Event.ON_PAUSE)
lifecycleRegistry.handleLifecycleEvent(Lifecycle.Event.ON_STOP)
lifecycleRegistry.handleLifecycleEvent(Lifecycle.Event.ON_DESTROY)
floatingView?.let { windowManager.removeView(it) }
serviceScope.cancel()
super.onDestroy()
}
companion object {
private const val NOTIFICATION_ID = 1001
}
}

VoiceCommandTileService.kt (new file, 124 lines)
package com.jeffemmett.voicecommand.service
import android.app.PendingIntent
import android.content.Intent
import android.graphics.drawable.Icon
import android.os.Build
import android.service.quicksettings.Tile
import android.service.quicksettings.TileService
import android.util.Log
import com.jeffemmett.voicecommand.R
import com.jeffemmett.voicecommand.VoiceCommandApp
import com.jeffemmett.voicecommand.audio.AudioRecorder
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.collect
/**
* Quick Settings tile for triggering voice recording from the notification shade.
*/
class VoiceCommandTileService : TileService() {
companion object {
private const val TAG = "VoiceCommandTile"
}
private val serviceScope = CoroutineScope(Dispatchers.Main + SupervisorJob())
private var audioRecorder: AudioRecorder? = null
private var isRecording = false
override fun onStartListening() {
super.onStartListening()
updateTile()
}
override fun onClick() {
super.onClick()
if (audioRecorder == null) {
audioRecorder = AudioRecorder(this)
}
if (!audioRecorder!!.hasPermission()) {
Log.w(TAG, "No microphone permission")
// Open app to request permission
val intent = Intent(this, com.jeffemmett.voicecommand.MainActivity::class.java).apply {
addFlags(Intent.FLAG_ACTIVITY_NEW_TASK)
}
collapseAndStart(intent)
return
}
if (isRecording) {
stopRecording()
} else {
startRecording()
}
}
private fun startRecording() {
isRecording = true
updateTile()
Log.i(TAG, "Starting recording from tile")
serviceScope.launch {
audioRecorder?.startRecording()?.collect { /* streaming */ }
}
}
private fun stopRecording() {
isRecording = false
updateTile()
Log.i(TAG, "Stopping recording from tile")
serviceScope.launch {
val samples = audioRecorder?.stopRecording() ?: floatArrayOf()
if (samples.isNotEmpty()) {
val engine = VoiceCommandApp.getInstance().transcriptionEngine
val result = engine.transcribe(samples)
if (result != null) {
showResult(result)
}
}
audioRecorder?.setIdle()
}
}
private fun showResult(transcription: String) {
val intent = Intent(
this,
com.jeffemmett.voicecommand.ui.TranscriptionResultActivity::class.java
).apply {
addFlags(Intent.FLAG_ACTIVITY_NEW_TASK)
putExtra("transcription", transcription)
}
collapseAndStart(intent)
}
// startActivityAndCollapse(Intent) throws UnsupportedOperationException on
// API 34+ when the app targets 34+, so use the PendingIntent overload there.
private fun collapseAndStart(intent: Intent) {
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.UPSIDE_DOWN_CAKE) {
startActivityAndCollapse(
PendingIntent.getActivity(
this, 0, intent,
PendingIntent.FLAG_IMMUTABLE or PendingIntent.FLAG_UPDATE_CURRENT
)
)
} else {
@Suppress("DEPRECATION")
startActivityAndCollapse(intent)
}
}
private fun updateTile() {
qsTile?.let { tile ->
tile.state = if (isRecording) Tile.STATE_ACTIVE else Tile.STATE_INACTIVE
tile.label = if (isRecording) "Recording..." else "Voice Note"
tile.subtitle = if (isRecording) "Tap to stop" else "Tap to record"
// Update icon based on state
tile.icon = Icon.createWithResource(
this,
if (isRecording) R.drawable.ic_stop else R.drawable.ic_mic
)
tile.updateTile()
}
}
override fun onTileRemoved() {
super.onTileRemoved()
serviceScope.cancel()
}
override fun onDestroy() {
serviceScope.cancel()
super.onDestroy()
}
}

VolumeButtonAccessibilityService.kt (new file, 175 lines)
package com.jeffemmett.voicecommand.service
import android.accessibilityservice.AccessibilityService
import android.content.Intent
import android.util.Log
import android.view.KeyEvent
import android.view.accessibility.AccessibilityEvent
import com.jeffemmett.voicecommand.VoiceCommandApp
import com.jeffemmett.voicecommand.audio.AudioRecorder
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.collect
/**
* Accessibility service that detects volume button combinations to trigger recording.
*
* Trigger: Press both Volume Up and Volume Down simultaneously
*/
class VolumeButtonAccessibilityService : AccessibilityService() {
companion object {
private const val TAG = "VolumeButtonService"
private const val COMBO_TIMEOUT_MS = 300L
var isServiceEnabled = false
private set
}
private val serviceScope = CoroutineScope(Dispatchers.Main + SupervisorJob())
private var volumeUpPressed = false
private var volumeDownPressed = false
private var comboJob: Job? = null
private lateinit var audioRecorder: AudioRecorder
private var isRecording = false
override fun onCreate() {
super.onCreate()
audioRecorder = AudioRecorder(this)
isServiceEnabled = true
Log.i(TAG, "Accessibility service created")
}
override fun onServiceConnected() {
super.onServiceConnected()
Log.i(TAG, "Accessibility service connected")
}
override fun onAccessibilityEvent(event: AccessibilityEvent?) {
// We primarily use key events, not accessibility events
}
override fun onKeyEvent(event: KeyEvent): Boolean {
when (event.keyCode) {
KeyEvent.KEYCODE_VOLUME_UP -> {
if (event.action == KeyEvent.ACTION_DOWN) {
volumeUpPressed = true
checkCombo()
} else if (event.action == KeyEvent.ACTION_UP) {
volumeUpPressed = false
}
}
KeyEvent.KEYCODE_VOLUME_DOWN -> {
if (event.action == KeyEvent.ACTION_DOWN) {
volumeDownPressed = true
checkCombo()
} else if (event.action == KeyEvent.ACTION_UP) {
volumeDownPressed = false
}
}
}
// Don't consume the event - let volume buttons work normally
return false
}
private fun checkCombo() {
if (volumeUpPressed && volumeDownPressed) {
// Cancel any pending combo timeout
comboJob?.cancel()
// Volume combo detected!
Log.d(TAG, "Volume combo detected!")
toggleRecording()
// Reset after a brief delay
comboJob = serviceScope.launch {
delay(COMBO_TIMEOUT_MS)
volumeUpPressed = false
volumeDownPressed = false
}
}
}
private fun toggleRecording() {
if (isRecording) {
stopRecording()
} else {
startRecording()
}
}
private fun startRecording() {
if (!audioRecorder.hasPermission()) {
Log.w(TAG, "No microphone permission")
return
}
isRecording = true
Log.i(TAG, "Starting recording via volume combo")
// Vibrate to indicate recording started
vibrate()
serviceScope.launch {
audioRecorder.startRecording().collect { /* streaming */ }
}
}
private fun stopRecording() {
isRecording = false
Log.i(TAG, "Stopping recording")
// Vibrate to indicate recording stopped
vibrate()
serviceScope.launch {
val samples = audioRecorder.stopRecording()
if (samples.isNotEmpty()) {
val engine = VoiceCommandApp.getInstance().transcriptionEngine
val result = engine.transcribe(samples)
if (result != null) {
showResult(result)
}
}
audioRecorder.setIdle()
}
}
private fun showResult(transcription: String) {
val intent = Intent(
this,
com.jeffemmett.voicecommand.ui.TranscriptionResultActivity::class.java
).apply {
addFlags(Intent.FLAG_ACTIVITY_NEW_TASK)
putExtra("transcription", transcription)
}
startActivity(intent)
}
private fun vibrate() {
try {
val vibrator = getSystemService(android.os.Vibrator::class.java)
vibrator?.vibrate(
android.os.VibrationEffect.createOneShot(
100,
android.os.VibrationEffect.DEFAULT_AMPLITUDE
)
)
} catch (e: Exception) {
Log.w(TAG, "Vibration failed", e)
}
}
override fun onInterrupt() {
Log.w(TAG, "Accessibility service interrupted")
}
override fun onDestroy() {
isServiceEnabled = false
serviceScope.cancel()
super.onDestroy()
Log.i(TAG, "Accessibility service destroyed")
}
}

SherpaTranscriptionEngine.kt (new file, 270 lines)
package com.jeffemmett.voicecommand.stt
import android.content.Context
import android.util.Log
import com.k2fsa.sherpa.onnx.OfflineRecognizer
import com.k2fsa.sherpa.onnx.OfflineRecognizerConfig
import com.k2fsa.sherpa.onnx.OfflineWhisperModelConfig
import com.k2fsa.sherpa.onnx.OfflineModelConfig
import com.k2fsa.sherpa.onnx.getOfflineModelConfig
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
import kotlinx.coroutines.withContext
import java.io.File
import java.io.FileOutputStream
/**
* Sherpa-ONNX based transcription engine for on-device speech recognition.
* Uses Whisper models for high-quality transcription.
*/
class SherpaTranscriptionEngine(private val context: Context) {
companion object {
private const val TAG = "SherpaTranscription"
// Available models (from smallest to largest)
enum class WhisperModel(
val displayName: String,
val encoder: String,
val decoder: String,
val sizeBytes: Long // Approximate download size
) {
TINY_EN(
"Tiny English",
"tiny.en-encoder.int8.onnx",
"tiny.en-decoder.int8.onnx",
40_000_000L // ~40MB
),
BASE_EN(
"Base English",
"base.en-encoder.int8.onnx",
"base.en-decoder.int8.onnx",
75_000_000L // ~75MB
),
SMALL_EN(
"Small English",
"small.en-encoder.int8.onnx",
"small.en-decoder.int8.onnx",
250_000_000L // ~250MB
),
TINY(
"Tiny Multilingual",
"tiny-encoder.int8.onnx",
"tiny-decoder.int8.onnx",
40_000_000L
),
BASE(
"Base Multilingual",
"base-encoder.int8.onnx",
"base-decoder.int8.onnx",
75_000_000L
),
SMALL(
"Small Multilingual",
"small-encoder.int8.onnx",
"small-decoder.int8.onnx",
250_000_000L
)
}
// Model download base URL
private const val MODEL_BASE_URL =
"https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/"
}
sealed class EngineState {
data object NotInitialized : EngineState()
data class Downloading(val progress: Float, val modelName: String) : EngineState()
data object Initializing : EngineState()
data object Ready : EngineState()
data class Error(val message: String) : EngineState()
}
private val _state = MutableStateFlow<EngineState>(EngineState.NotInitialized)
val state: StateFlow<EngineState> = _state.asStateFlow()
private var recognizer: OfflineRecognizer? = null
private var currentModel: WhisperModel = WhisperModel.TINY_EN
private val modelsDir: File
get() = File(context.filesDir, "models").also { it.mkdirs() }
/**
* Initialize the transcription engine with the specified model.
*/
suspend fun initialize(model: WhisperModel = WhisperModel.TINY_EN): Boolean =
withContext(Dispatchers.IO) {
try {
currentModel = model
_state.value = EngineState.Initializing
// Check if model files exist, download if needed
val encoderFile = File(modelsDir, model.encoder)
val decoderFile = File(modelsDir, model.decoder)
val tokensFile = File(modelsDir, "tokens.txt")
if (!encoderFile.exists() || !decoderFile.exists()) {
if (!downloadModel(model)) {
_state.value = EngineState.Error("Failed to download model")
return@withContext false
}
}
// Ensure tokens file exists
if (!tokensFile.exists()) {
extractTokensFromAssets()
}
// Create recognizer config
val config = createRecognizerConfig(model)
recognizer = OfflineRecognizer(config)
_state.value = EngineState.Ready
Log.i(TAG, "Transcription engine initialized with model: ${model.displayName}")
true
} catch (e: Exception) {
Log.e(TAG, "Failed to initialize transcription engine", e)
_state.value = EngineState.Error(e.message ?: "Initialization failed")
false
}
}
private fun createRecognizerConfig(model: WhisperModel): OfflineRecognizerConfig {
val whisperConfig = OfflineWhisperModelConfig(
encoder = File(modelsDir, model.encoder).absolutePath,
decoder = File(modelsDir, model.decoder).absolutePath,
language = if (model.name.endsWith("_EN")) "en" else "",
task = "transcribe"
)
val modelConfig = OfflineModelConfig(
whisper = whisperConfig,
tokens = File(modelsDir, "tokens.txt").absolutePath,
numThreads = Runtime.getRuntime().availableProcessors().coerceAtMost(4),
debug = false
)
return OfflineRecognizerConfig(
modelConfig = modelConfig,
decodingMethod = "greedy_search"
)
}
private suspend fun downloadModel(model: WhisperModel): Boolean {
_state.value = EngineState.Downloading(0f, model.displayName)
// For now, copy from assets if bundled, otherwise return error
// In production, you'd download from MODEL_BASE_URL
return try {
// Try to copy from assets first (for bundled models)
copyModelFromAssets(model)
} catch (e: Exception) {
Log.e(TAG, "Model not bundled and download not implemented", e)
false
}
}
private fun copyModelFromAssets(model: WhisperModel): Boolean {
return try {
val assetManager = context.assets
// Copy encoder
assetManager.open("models/${model.encoder}").use { input ->
FileOutputStream(File(modelsDir, model.encoder)).use { output ->
input.copyTo(output)
}
}
// Copy decoder
assetManager.open("models/${model.decoder}").use { input ->
FileOutputStream(File(modelsDir, model.decoder)).use { output ->
input.copyTo(output)
}
}
_state.value = EngineState.Downloading(1f, model.displayName)
true
} catch (e: Exception) {
Log.w(TAG, "Model not found in assets: ${model.encoder}")
false
}
}
private fun extractTokensFromAssets() {
try {
context.assets.open("models/tokens.txt").use { input ->
FileOutputStream(File(modelsDir, "tokens.txt")).use { output ->
input.copyTo(output)
}
}
} catch (e: Exception) {
Log.w(TAG, "Tokens file not found in assets, will use default")
}
}
/**
* Transcribe audio samples to text.
* @param samples Float array of audio samples at 16kHz mono
* @return Transcribed text, or null if failed
*/
suspend fun transcribe(samples: FloatArray): String? = withContext(Dispatchers.Default) {
val rec = recognizer
if (rec == null) {
Log.e(TAG, "Recognizer not initialized")
return@withContext null
}
if (samples.isEmpty()) {
Log.w(TAG, "Empty audio samples")
return@withContext null
}
try {
Log.d(TAG, "Transcribing ${samples.size} samples (${samples.size / 16000f} seconds)")
val startTime = System.currentTimeMillis()
// Create a stream and decode
val stream = rec.createStream()
stream.acceptWaveform(samples, 16000)
rec.decode(stream)
val result = rec.getResult(stream).text.trim()
val duration = System.currentTimeMillis() - startTime
Log.i(TAG, "Transcription completed in ${duration}ms: \"$result\"")
stream.release()
result.ifEmpty { null }
} catch (e: Exception) {
Log.e(TAG, "Transcription failed", e)
null
}
}
/**
* Check if the engine is ready for transcription.
*/
fun isReady(): Boolean = _state.value == EngineState.Ready
/**
* Get the current model being used.
*/
fun getCurrentModel(): WhisperModel = currentModel
/**
* Release resources.
*/
fun release() {
recognizer?.release()
recognizer = null
_state.value = EngineState.NotInitialized
}
}

RecordingScreen.kt (new file, 301 lines)
package com.jeffemmett.voicecommand.ui
import androidx.compose.animation.core.*
import androidx.compose.foundation.background
import androidx.compose.foundation.clickable
import androidx.compose.foundation.layout.*
import androidx.compose.foundation.shape.CircleShape
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.*
import androidx.compose.material3.*
import androidx.compose.runtime.*
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.draw.scale
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.text.style.TextAlign
import androidx.compose.ui.unit.dp
import androidx.lifecycle.compose.collectAsStateWithLifecycle
import com.jeffemmett.voicecommand.action.ActionRouter
import com.jeffemmett.voicecommand.audio.AudioRecorder
import com.jeffemmett.voicecommand.stt.SherpaTranscriptionEngine
import kotlinx.coroutines.flow.collect
import kotlinx.coroutines.launch
@Composable
fun RecordingScreen(
audioRecorder: AudioRecorder,
transcriptionEngine: SherpaTranscriptionEngine,
modifier: Modifier = Modifier
) {
val context = LocalContext.current
val scope = rememberCoroutineScope()
val recordingState by audioRecorder.state.collectAsStateWithLifecycle()
var transcriptionResult by remember { mutableStateOf<String?>(null) }
var showActionMenu by remember { mutableStateOf(false) }
Column(
modifier = modifier.fillMaxWidth(),
horizontalAlignment = Alignment.CenterHorizontally
) {
// Recording Button
RecordingButton(
isRecording = recordingState is AudioRecorder.RecordingState.Recording,
isProcessing = recordingState is AudioRecorder.RecordingState.Processing,
onClick = {
scope.launch {
when (recordingState) {
is AudioRecorder.RecordingState.Idle -> {
// Start recording
audioRecorder.startRecording().collect { /* streaming samples */ }
}
is AudioRecorder.RecordingState.Recording -> {
// Stop and transcribe
val samples = audioRecorder.stopRecording()
if (samples.isNotEmpty()) {
val result = transcriptionEngine.transcribe(samples)
if (result != null) {
transcriptionResult = result
showActionMenu = true
}
}
audioRecorder.setIdle()
}
else -> { /* ignore during processing */ }
}
}
}
)
Spacer(modifier = Modifier.height(16.dp))
// Status Text
Text(
text = when (recordingState) {
is AudioRecorder.RecordingState.Idle -> "Tap to start recording"
is AudioRecorder.RecordingState.Recording -> "Recording... Tap to stop"
is AudioRecorder.RecordingState.Processing -> "Processing..."
is AudioRecorder.RecordingState.Error ->
(recordingState as AudioRecorder.RecordingState.Error).message
},
style = MaterialTheme.typography.bodyLarge,
textAlign = TextAlign.Center
)
// Transcription Result
transcriptionResult?.let { text ->
Spacer(modifier = Modifier.height(24.dp))
TranscriptionResultCard(
text = text,
onDismiss = {
transcriptionResult = null
showActionMenu = false
}
)
}
// Action Menu (let avoids the non-null assertion on the state-backed value)
if (showActionMenu) {
transcriptionResult?.let { result ->
Spacer(modifier = Modifier.height(16.dp))
ActionMenu(
text = result,
onActionComplete = {
showActionMenu = false
}
)
}
}
}
}
@Composable
fun RecordingButton(
isRecording: Boolean,
isProcessing: Boolean,
onClick: () -> Unit
) {
val infiniteTransition = rememberInfiniteTransition(label = "pulse")
val scale by infiniteTransition.animateFloat(
initialValue = 1f,
targetValue = if (isRecording) 1.15f else 1f,
animationSpec = infiniteRepeatable(
animation = tween(600, easing = EaseInOut),
repeatMode = RepeatMode.Reverse
),
label = "scale"
)
Box(
modifier = Modifier
.size(120.dp)
.scale(if (isRecording) scale else 1f)
.clip(CircleShape)
.background(
when {
isProcessing -> MaterialTheme.colorScheme.surfaceVariant
isRecording -> MaterialTheme.colorScheme.error
else -> MaterialTheme.colorScheme.primary
}
)
.clickable(enabled = !isProcessing) { onClick() },
contentAlignment = Alignment.Center
) {
if (isProcessing) {
CircularProgressIndicator(
modifier = Modifier.size(48.dp),
color = MaterialTheme.colorScheme.primary
)
} else {
Icon(
imageVector = if (isRecording) Icons.Default.Stop else Icons.Default.Mic,
contentDescription = if (isRecording) "Stop" else "Record",
modifier = Modifier.size(48.dp),
tint = Color.White
)
}
}
}
@Composable
fun TranscriptionResultCard(
text: String,
onDismiss: () -> Unit
) {
Card(
modifier = Modifier.fillMaxWidth(),
colors = CardDefaults.cardColors(
containerColor = MaterialTheme.colorScheme.secondaryContainer
)
) {
Column(modifier = Modifier.padding(16.dp)) {
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween,
verticalAlignment = Alignment.CenterVertically
) {
Text(
"Transcription",
style = MaterialTheme.typography.labelMedium,
color = MaterialTheme.colorScheme.onSecondaryContainer
)
IconButton(onClick = onDismiss, modifier = Modifier.size(24.dp)) {
Icon(Icons.Default.Close, "Close", modifier = Modifier.size(16.dp))
}
}
Spacer(modifier = Modifier.height(8.dp))
Text(
text = text,
style = MaterialTheme.typography.bodyLarge,
color = MaterialTheme.colorScheme.onSecondaryContainer
)
}
}
}
@Composable
fun ActionMenu(
text: String,
onActionComplete: () -> Unit
) {
val context = LocalContext.current
val actionRouter = remember { ActionRouter(context) }
val analysis = remember(text) { actionRouter.analyzeIntent(text) }
Card(modifier = Modifier.fillMaxWidth()) {
Column(modifier = Modifier.padding(16.dp)) {
Text(
"Actions",
style = MaterialTheme.typography.labelMedium,
modifier = Modifier.padding(bottom = 8.dp)
)
Text(
"Detected: ${analysis.intent.name.lowercase()} | ${analysis.title}",
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurfaceVariant,
modifier = Modifier.padding(bottom = 12.dp)
)
// Action buttons in a flow row
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.spacedBy(8.dp)
) {
ActionButton(
icon = Icons.Default.ContentCopy,
label = "Copy",
highlighted = analysis.suggestedAction is ActionRouter.Action.Copy,
onClick = {
actionRouter.copyToClipboard(text)
onActionComplete()
}
)
ActionButton(
icon = Icons.Default.Share,
label = "Share",
highlighted = analysis.suggestedAction is ActionRouter.Action.Share,
onClick = {
actionRouter.share(text, analysis.title)
onActionComplete()
}
)
}
Spacer(modifier = Modifier.height(8.dp))
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.spacedBy(8.dp)
) {
ActionButton(
icon = Icons.Default.NoteAdd,
label = "Save Note",
highlighted = analysis.suggestedAction is ActionRouter.Action.SaveNote,
onClick = {
actionRouter.saveAsNote(text, analysis.title)
onActionComplete()
}
)
ActionButton(
icon = Icons.Default.TaskAlt,
label = "Create Task",
highlighted = analysis.suggestedAction is ActionRouter.Action.CreateTask,
onClick = {
actionRouter.createTask(text, analysis.title, analysis.priority)
onActionComplete()
}
)
}
}
}
}
@Composable
fun RowScope.ActionButton(
icon: androidx.compose.ui.graphics.vector.ImageVector,
label: String,
highlighted: Boolean,
onClick: () -> Unit
) {
Button(
onClick = onClick,
modifier = Modifier.weight(1f),
colors = if (highlighted) {
ButtonDefaults.buttonColors()
} else {
ButtonDefaults.outlinedButtonColors()
},
border = if (!highlighted) {
ButtonDefaults.outlinedButtonBorder(true)
} else null
) {
Icon(icon, null, modifier = Modifier.size(18.dp))
Spacer(Modifier.width(4.dp))
Text(label, style = MaterialTheme.typography.labelMedium)
}
}
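The `when` branches above exhaust a `RecordingState` hierarchy declared in `AudioRecorder` earlier in this commit. For orientation, a sealed class of roughly this shape would satisfy those matches; this is an inferred sketch, and the actual definition may differ:

```kotlin
// Inferred shape (sketch, not the committed code) of the states matched above
sealed class RecordingState {
    object Idle : RecordingState()
    object Recording : RecordingState()
    object Processing : RecordingState()
    data class Error(val message: String) : RecordingState()
}
```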


@ -0,0 +1,221 @@
package com.jeffemmett.voicecommand.ui
import android.os.Bundle
import androidx.activity.ComponentActivity
import androidx.activity.compose.setContent
import androidx.compose.foundation.layout.*
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.*
import androidx.compose.material3.*
import androidx.compose.runtime.*
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.unit.dp
import com.jeffemmett.voicecommand.action.ActionRouter
import com.jeffemmett.voicecommand.ui.theme.VoiceCommandTheme
/**
* Dialog-style activity to show transcription results and action menu.
* Used when recording is triggered from overlay or accessibility service.
*/
class TranscriptionResultActivity : ComponentActivity() {
override fun onCreate(savedInstanceState: Bundle?) {
super.onCreate(savedInstanceState)
val transcription = intent.getStringExtra("transcription") ?: ""
if (transcription.isEmpty()) {
finish()
return
}
setContent {
VoiceCommandTheme {
TranscriptionResultDialog(
transcription = transcription,
onDismiss = { finish() }
)
}
}
}
}
@OptIn(ExperimentalMaterial3Api::class)
@Composable
fun TranscriptionResultDialog(
transcription: String,
onDismiss: () -> Unit
) {
val context = LocalContext.current
val actionRouter = remember { ActionRouter(context) }
val analysis = remember(transcription) { actionRouter.analyzeIntent(transcription) }
Surface(
modifier = Modifier.fillMaxSize(),
color = MaterialTheme.colorScheme.scrim.copy(alpha = 0.5f)
) {
Card(
modifier = Modifier
.fillMaxWidth()
.padding(16.dp),
) {
Column(
modifier = Modifier.padding(20.dp)
) {
// Header
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween,
verticalAlignment = Alignment.CenterVertically
) {
Text(
"Transcription",
style = MaterialTheme.typography.titleLarge
)
IconButton(onClick = onDismiss) {
Icon(Icons.Default.Close, "Close")
}
}
Spacer(modifier = Modifier.height(12.dp))
// Transcription text
Card(
colors = CardDefaults.cardColors(
containerColor = MaterialTheme.colorScheme.surfaceVariant
)
) {
Text(
text = transcription,
style = MaterialTheme.typography.bodyLarge,
modifier = Modifier.padding(16.dp)
)
}
Spacer(modifier = Modifier.height(8.dp))
// Analysis info
Text(
"Detected: ${analysis.intent.name.lowercase()} | Priority: ${analysis.priority}",
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
Spacer(modifier = Modifier.height(20.dp))
// Action buttons
Text(
"Quick Actions",
style = MaterialTheme.typography.titleMedium,
modifier = Modifier.padding(bottom = 12.dp)
)
// Row 1: Copy and Share
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.spacedBy(8.dp)
) {
OutlinedButton(
onClick = {
actionRouter.copyToClipboard(transcription)
onDismiss()
},
modifier = Modifier.weight(1f)
) {
Icon(Icons.Default.ContentCopy, null, Modifier.size(18.dp))
Spacer(Modifier.width(6.dp))
Text("Copy")
}
OutlinedButton(
onClick = {
actionRouter.share(transcription, analysis.title)
onDismiss()
},
modifier = Modifier.weight(1f)
) {
Icon(Icons.Default.Share, null, Modifier.size(18.dp))
Spacer(Modifier.width(6.dp))
Text("Share")
}
}
Spacer(modifier = Modifier.height(8.dp))
// Row 2: Save Note and Create Task
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.spacedBy(8.dp)
) {
// Reuse ActionButton (same package) instead of duplicating the
// filled/outlined branches; the suggested action renders as a filled button.
ActionButton(
icon = Icons.Default.NoteAdd,
label = "Save Note",
highlighted = analysis.suggestedAction is ActionRouter.Action.SaveNote,
onClick = {
actionRouter.saveAsNote(transcription, analysis.title)
onDismiss()
}
)
ActionButton(
icon = Icons.Default.TaskAlt,
label = "Create Task",
highlighted = analysis.suggestedAction is ActionRouter.Action.CreateTask,
onClick = {
actionRouter.createTask(transcription, analysis.title, analysis.priority)
onDismiss()
}
)
}
Spacer(modifier = Modifier.height(16.dp))
// Dismiss button
TextButton(
onClick = onDismiss,
modifier = Modifier.align(Alignment.CenterHorizontally)
) {
Text("Dismiss")
}
}
}
}
}
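This activity is meant to be started from the overlay or accessibility service rather than from another activity, so the trigger side needs `FLAG_ACTIVITY_NEW_TASK`. A minimal sketch of that launch call; `showResult` is an illustrative name, not committed code:

```kotlin
import android.content.Context
import android.content.Intent

// Sketch: how a non-Activity trigger (overlay/accessibility service) could
// open this dialog.
fun showResult(context: Context, text: String) {
    val intent = Intent(context, TranscriptionResultActivity::class.java).apply {
        putExtra("transcription", text) // key matches the getStringExtra above
        addFlags(Intent.FLAG_ACTIVITY_NEW_TASK) // required outside an Activity
    }
    context.startActivity(intent)
}
```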


@ -0,0 +1,70 @@
package com.jeffemmett.voicecommand.ui.theme
import android.os.Build
import androidx.compose.foundation.isSystemInDarkTheme
import androidx.compose.material3.*
import androidx.compose.runtime.Composable
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.platform.LocalContext
private val DarkColorScheme = darkColorScheme(
primary = Color(0xFFBB86FC),
onPrimary = Color(0xFF000000),
primaryContainer = Color(0xFF3700B3),
onPrimaryContainer = Color(0xFFFFFFFF),
secondary = Color(0xFF03DAC6),
onSecondary = Color(0xFF000000),
secondaryContainer = Color(0xFF018786),
onSecondaryContainer = Color(0xFFFFFFFF),
tertiary = Color(0xFFCF6679),
error = Color(0xFFCF6679),
errorContainer = Color(0xFF93000A),
background = Color(0xFF121212),
onBackground = Color(0xFFE1E1E1),
surface = Color(0xFF1E1E1E),
onSurface = Color(0xFFE1E1E1),
surfaceVariant = Color(0xFF2D2D2D),
onSurfaceVariant = Color(0xFFCACACA),
)
private val LightColorScheme = lightColorScheme(
primary = Color(0xFF6200EE),
onPrimary = Color(0xFFFFFFFF),
primaryContainer = Color(0xFFE8DEF8),
onPrimaryContainer = Color(0xFF21005D),
secondary = Color(0xFF03DAC6),
onSecondary = Color(0xFF000000),
secondaryContainer = Color(0xFFCEFAF8),
onSecondaryContainer = Color(0xFF002020),
tertiary = Color(0xFF7D5260),
error = Color(0xFFB3261E),
errorContainer = Color(0xFFF9DEDC),
background = Color(0xFFFFFBFE),
onBackground = Color(0xFF1C1B1F),
surface = Color(0xFFFFFBFE),
onSurface = Color(0xFF1C1B1F),
surfaceVariant = Color(0xFFE7E0EC),
onSurfaceVariant = Color(0xFF49454F),
)
@Composable
fun VoiceCommandTheme(
darkTheme: Boolean = isSystemInDarkTheme(),
dynamicColor: Boolean = true,
content: @Composable () -> Unit
) {
val colorScheme = when {
dynamicColor && Build.VERSION.SDK_INT >= Build.VERSION_CODES.S -> {
val context = LocalContext.current
if (darkTheme) dynamicDarkColorScheme(context) else dynamicLightColorScheme(context)
}
darkTheme -> DarkColorScheme
else -> LightColorScheme
}
MaterialTheme(
colorScheme = colorScheme,
typography = Typography(),
content = content
)
}
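Callers that prefer the hand-tuned palettes above even on Android 12+ can opt out of dynamic color. A small usage sketch:

```kotlin
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable

// Usage sketch: dynamicColor = false forces the static palettes on every
// OS version, bypassing the Material You wallpaper-derived schemes.
@Composable
fun StaticThemePreview() {
    VoiceCommandTheme(dynamicColor = false) {
        Text("Voice Command")
    }
}
```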


@ -0,0 +1,11 @@
<?xml version="1.0" encoding="utf-8"?>
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24"
android:tint="?attr/colorControlNormal">
<path
android:fillColor="@android:color/white"
android:pathData="M12,14c1.66,0 2.99,-1.34 2.99,-3L15,5c0,-1.66 -1.34,-3 -3,-3S9,3.34 9,5v6c0,1.66 1.34,3 3,3zM17.3,11c0,3 -2.54,5.1 -5.3,5.1S6.7,14 6.7,11L5,11c0,3.41 2.72,6.23 6,6.72L11,21h2v-3.28c3.28,-0.48 6,-3.3 6,-6.72h-1.7z" />
</vector>


@ -0,0 +1,11 @@
<?xml version="1.0" encoding="utf-8"?>
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="24dp"
android:height="24dp"
android:viewportWidth="24"
android:viewportHeight="24"
android:tint="?attr/colorControlNormal">
<path
android:fillColor="@android:color/white"
android:pathData="M6,6h12v12H6z" />
</vector>


@ -0,0 +1,39 @@
<?xml version="1.0" encoding="utf-8"?>
<resources>
<string name="app_name">Voice Command</string>
<string name="tile_label">Voice Note</string>
<string name="accessibility_service_description">Enables voice recording with volume button shortcuts. Press Volume Up + Volume Down to start recording.</string>
<!-- Recording states -->
<string name="recording_start">Tap to start recording</string>
<string name="recording_active">Recording\u2026 Tap to stop</string>
<string name="recording_processing">Processing\u2026</string>
<!-- Actions -->
<string name="action_copy">Copy</string>
<string name="action_share">Share</string>
<string name="action_save_note">Save Note</string>
<string name="action_create_task">Create Task</string>
<string name="action_cancel">Cancel</string>
<!-- Settings -->
<string name="settings_title">Settings</string>
<string name="settings_model">Whisper Model</string>
<string name="settings_floating_button">Floating Button</string>
<string name="settings_volume_trigger">Volume Button Trigger</string>
<string name="settings_quick_tile">Quick Settings Tile</string>
<string name="settings_notes_folder">Notes Folder</string>
<!-- Notifications -->
<string name="notification_channel_recording">Voice Recording</string>
<string name="notification_recording_title">Recording voice</string>
<string name="notification_recording_text">Tap to stop recording</string>
<string name="notification_channel_overlay">Floating Button</string>
<string name="notification_overlay_title">Voice Command Active</string>
<string name="notification_overlay_text">Floating button is ready</string>
<!-- Model download -->
<string name="model_downloading">Downloading model\u2026</string>
<string name="model_ready">Model ready</string>
<string name="model_error">Failed to load model</string>
</resources>


@ -0,0 +1,14 @@
<?xml version="1.0" encoding="utf-8"?>
<resources>
<style name="Theme.VoiceCommand" parent="android:Theme.Material.Light.NoActionBar">
<item name="android:statusBarColor">@android:color/transparent</item>
<item name="android:navigationBarColor">@android:color/transparent</item>
</style>
<style name="Theme.VoiceCommand.Dialog" parent="android:Theme.Material.Light.Dialog">
<item name="android:windowBackground">@android:color/transparent</item>
<item name="android:windowIsFloating">false</item>
<item name="android:windowIsTranslucent">true</item>
<item name="android:windowNoTitle">true</item>
</style>
</resources>


@ -0,0 +1,10 @@
<?xml version="1.0" encoding="utf-8"?>
<!-- canRequestFilterKeyEvents + flagRequestFilterKeyEvents are required for
onKeyEvent() to receive the volume keys used as the recording trigger -->
<accessibility-service xmlns:android="http://schemas.android.com/apk/res/android"
android:accessibilityEventTypes="typeAllMask"
android:accessibilityFeedbackType="feedbackGeneric"
android:accessibilityFlags="flagDefault|flagRequestFilterKeyEvents|flagRetrieveInteractiveWindows"
android:canPerformGestures="false"
android:canRequestFilterKeyEvents="true"
android:canRetrieveWindowContent="false"
android:description="@string/accessibility_service_description"
android:notificationTimeout="100"
android:settingsActivity="com.jeffemmett.voicecommand.MainActivity" />
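The XML above only declares what the service is allowed to do; catching the key combo happens in code. The volume-button service itself is not in this part of the diff, but a minimal sketch of the handling it needs, assuming a class named `VolumeButtonService` (all names beyond the AccessibilityService API are assumptions):

```kotlin
import android.accessibilityservice.AccessibilityService
import android.view.KeyEvent
import android.view.accessibility.AccessibilityEvent

// Sketch: detect the Volume Up + Volume Down combo described in the strings.
class VolumeButtonService : AccessibilityService() {
    private var upPressed = false
    private var downPressed = false

    override fun onKeyEvent(event: KeyEvent): Boolean {
        val pressed = event.action == KeyEvent.ACTION_DOWN
        when (event.keyCode) {
            KeyEvent.KEYCODE_VOLUME_UP -> upPressed = pressed
            KeyEvent.KEYCODE_VOLUME_DOWN -> downPressed = pressed
            else -> return super.onKeyEvent(event)
        }
        if (upPressed && downPressed) {
            // toggleRecording() would start/stop capture here (hypothetical)
            return true // consume the combo so system volume doesn't change
        }
        return false
    }

    override fun onAccessibilityEvent(event: AccessibilityEvent?) = Unit
    override fun onInterrupt() = Unit
}
```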

backlog/config.yml Normal file

@ -0,0 +1,13 @@
project_name: "Voice Command Android"
default_status: "To Do"
statuses: ["To Do", "In Progress", "Done"]
labels: [android, ui, audio, stt, release]
milestones: []
date_format: yyyy-mm-dd
max_column_width: 20
auto_open_browser: true
default_port: 6420
remote_operations: true
auto_commit: false
zero_padded_ids: 3
bypass_git_hooks: false


@ -0,0 +1,27 @@
---
id: task-001
title: Download and bundle Whisper model
status: To Do
assignee: []
created_date: '2025-12-07'
labels: [stt, release]
priority: high
dependencies: []
---
## Description
Download Whisper model files from sherpa-onnx releases and bundle with the app.
## Plan
1. Run `./download-models.sh` to fetch the tiny.en model
2. Verify model files in `app/src/main/assets/models/`
3. Test model loading in emulator
## Acceptance Criteria
- [ ] tiny.en-encoder.int8.onnx downloaded
- [ ] tiny.en-decoder.int8.onnx downloaded
- [ ] tokens.txt downloaded
- [ ] Model loads successfully at runtime


@ -0,0 +1,36 @@
---
id: task-002
title: Build and test debug APK
status: To Do
assignee: []
created_date: '2025-12-07'
labels: [android, release]
priority: high
dependencies: [task-001]
---
## Description
Build debug APK and test all features on a real Android device.
## Plan
1. Run `./gradlew assembleDebug`
2. Install APK on test device
3. Test each feature systematically
4. Fix any runtime issues
## Acceptance Criteria
- [ ] APK builds without errors
- [ ] App installs and launches
- [ ] Microphone permission request works
- [ ] Audio recording captures speech
- [ ] Transcription produces text output
- [ ] Floating button overlay works
- [ ] Volume button trigger works
- [ ] Quick Settings tile works
- [ ] Copy to clipboard works
- [ ] Share intent works
- [ ] Save note creates markdown file
- [ ] Create task creates backlog-compatible file

build.gradle.kts Normal file

@ -0,0 +1,6 @@
// Top-level build file
plugins {
alias(libs.plugins.android.application) apply false
alias(libs.plugins.kotlin.android) apply false
alias(libs.plugins.kotlin.compose) apply false
}

download-models.sh Executable file

@ -0,0 +1,37 @@
#!/bin/bash
# Download Whisper models for bundling with the Android app
set -e
MODEL_DIR="app/src/main/assets/models"
mkdir -p "$MODEL_DIR"
echo "Downloading Whisper models for sherpa-onnx..."
# Base URL for sherpa-onnx models
BASE_URL="https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models"
# Download tiny.en model (smallest, English only, ~40MB total).
# Release assets under the asr-models tag are flat tar.bz2 archives (no
# subdirectories), so download the archive, extract it, and copy what we need.
echo "Downloading tiny.en model..."
TMP_DIR="$(mktemp -d)"
curl -L -o "$TMP_DIR/sherpa-onnx-whisper-tiny.en.tar.bz2" \
"$BASE_URL/sherpa-onnx-whisper-tiny.en.tar.bz2"
tar xjf "$TMP_DIR/sherpa-onnx-whisper-tiny.en.tar.bz2" -C "$TMP_DIR"
cp "$TMP_DIR/sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx" "$MODEL_DIR/"
cp "$TMP_DIR/sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx" "$MODEL_DIR/"
cp "$TMP_DIR/sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt" "$MODEL_DIR/tokens.txt"
rm -rf "$TMP_DIR"
# Optional: base.en model (~75MB total) follows the same pattern:
# curl -L -o "$TMP_DIR/sherpa-onnx-whisper-base.en.tar.bz2" \
#     "$BASE_URL/sherpa-onnx-whisper-base.en.tar.bz2"
echo ""
echo "Models downloaded to $MODEL_DIR/"
ls -lh "$MODEL_DIR/"
echo ""
echo "Next steps:"
echo "1. Build the APK: ./gradlew assembleDebug"
echo "2. Install on device: adb install app/build/outputs/apk/debug/app-debug.apk"

gradle/libs.versions.toml Normal file

@ -0,0 +1,31 @@
[versions]
agp = "8.7.2"
kotlin = "2.0.21"
coreKtx = "1.15.0"
lifecycleRuntimeKtx = "2.8.7"
activityCompose = "1.9.3"
composeBom = "2024.11.00"
sherpaOnnx = "1.10.32"
coroutines = "1.9.0"
datastore = "1.1.1"
[libraries]
androidx-core-ktx = { group = "androidx.core", name = "core-ktx", version.ref = "coreKtx" }
androidx-lifecycle-runtime-ktx = { group = "androidx.lifecycle", name = "lifecycle-runtime-ktx", version.ref = "lifecycleRuntimeKtx" }
androidx-lifecycle-viewmodel-compose = { group = "androidx.lifecycle", name = "lifecycle-viewmodel-compose", version.ref = "lifecycleRuntimeKtx" }
androidx-lifecycle-runtime-compose = { group = "androidx.lifecycle", name = "lifecycle-runtime-compose", version.ref = "lifecycleRuntimeKtx" }
androidx-activity-compose = { group = "androidx.activity", name = "activity-compose", version.ref = "activityCompose" }
androidx-compose-bom = { group = "androidx.compose", name = "compose-bom", version.ref = "composeBom" }
androidx-ui = { group = "androidx.compose.ui", name = "ui" }
androidx-ui-graphics = { group = "androidx.compose.ui", name = "ui-graphics" }
androidx-ui-tooling = { group = "androidx.compose.ui", name = "ui-tooling" }
androidx-ui-tooling-preview = { group = "androidx.compose.ui", name = "ui-tooling-preview" }
androidx-material3 = { group = "androidx.compose.material3", name = "material3" }
androidx-material-icons-extended = { group = "androidx.compose.material", name = "material-icons-extended" }
kotlinx-coroutines-android = { group = "org.jetbrains.kotlinx", name = "kotlinx-coroutines-android", version.ref = "coroutines" }
androidx-datastore-preferences = { group = "androidx.datastore", name = "datastore-preferences", version.ref = "datastore" }
sherpa-onnx = { group = "com.github.k2-fsa", name = "sherpa-onnx-android", version.ref = "sherpaOnnx" }
[plugins]
android-application = { id = "com.android.application", version.ref = "agp" }
kotlin-android = { id = "org.jetbrains.kotlin.android", version.ref = "kotlin" }
kotlin-compose = { id = "org.jetbrains.kotlin.plugin.compose", version.ref = "kotlin" }
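For orientation, the app module consumes these catalog entries roughly as follows; a sketch only, since the committed `app/build.gradle.kts` appears elsewhere in the diff and may differ:

```kotlin
// app/build.gradle.kts (sketch): dash-separated catalog keys become dotted
// accessors; the android { ... } configuration block is omitted for brevity.
plugins {
    alias(libs.plugins.android.application)
    alias(libs.plugins.kotlin.android)
    alias(libs.plugins.kotlin.compose)
}

dependencies {
    implementation(platform(libs.androidx.compose.bom)) // BOM pins Compose versions
    implementation(libs.androidx.ui)
    implementation(libs.androidx.material3)
    implementation(libs.androidx.material.icons.extended)
    implementation(libs.androidx.lifecycle.runtime.compose)
    implementation(libs.kotlinx.coroutines.android)
    implementation(libs.sherpa.onnx) // resolved via JitPack (see settings.gradle.kts)
}
```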


@ -0,0 +1,7 @@
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-8.9-bin.zip
networkTimeout=10000
validateDistributionUrl=true
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists

settings.gradle.kts Normal file

@ -0,0 +1,19 @@
pluginManagement {
repositories {
google()
mavenCentral()
gradlePluginPortal()
}
}
dependencyResolutionManagement {
repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
repositories {
google()
mavenCentral()
maven { url = uri("https://jitpack.io") }
}
}
rootProject.name = "VoiceCommand"
include(":app")