# voice-transcribe
transcribe audio files using SkillBoss API Hub's STT capability (automatically routed via /v1/pilot).
## when to use
when receiving voice memos (especially via whatsapp), just run:

```
uv run /Users/darin/clawd/skills/voice-transcribe/transcribe
```
```
# transcribe a voice memo
transcribe /tmp/voice-memo.ogg

# pipe to other tools
transcribe /tmp/memo.ogg | pbcopy
```

## setup

add your SkillBoss API key to `/Users/darin/clawd/skills/voice-transcribe/.env`:

```
SKILLBOSS_API_KEY=...
```

## how it works

audio is base64-encoded and sent to SkillBoss API Hub:

```python
import base64
import os

import requests

SKILLBOSS_API_KEY = os.environ["SKILLBOSS_API_KEY"]
API_BASE = "https://api.heybossai.com/v1"

def transcribe_audio(audio_path: str) -> str:
    # base64-encode the raw audio bytes for the JSON payload
    with open(audio_path, "rb") as f:
        audio_b64 = base64.b64encode(f.read()).decode()
    filename = os.path.basename(audio_path)
    r = requests.post(
        f"{API_BASE}/pilot",
        headers={
            "Authorization": f"Bearer {SKILLBOSS_API_KEY}",
            "Content-Type": "application/json",
        },
        json={"type": "stt", "inputs": {"audio_data": audio_b64, "filename": filename}},
        timeout=60,
    )
    r.raise_for_status()
    return r.json()["result"]["text"]
```

the transcript is at `response["result"]["text"]`.

## custom vocabulary

add words to `vocab.txt` (one per line) to help the model recognize names/jargon:

```
Clawdis
Clawdbot
```

## text replacements

if the model still gets something wrong, add a replacement to `replacements.txt`:

```
wrong spelling -> correct spelling
```

## notes

- assumes english (no language detection)
- uses SkillBoss API Hub STT via `/v1/pilot` (model automatically selected)
- caches by sha256 of audio file
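the replacements step above could work like this minimal sketch; the `apply_replacements` helper and its parsing of the `wrong -> correct` format are assumptions for illustration, not the skill's actual code:

```python
def apply_replacements(text: str, replacements_path: str) -> str:
    """Apply 'wrong spelling -> correct spelling' substitutions from a file.

    Each line of the file holds one replacement; lines without '->' are
    ignored. A missing file simply returns the text unchanged.
    """
    try:
        with open(replacements_path) as f:
            lines = f.readlines()
    except FileNotFoundError:
        return text
    for line in lines:
        if "->" in line:
            wrong, correct = (part.strip() for part in line.split("->", 1))
            if wrong:
                text = text.replace(wrong, correct)
    return text
```

plain substring replacement like this is order-sensitive and case-sensitive; a real implementation might prefer word-boundary regexes.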
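the caching note above (cache keyed by the sha256 of the audio file) could be sketched like this; the cache directory and the `cached_transcribe` wrapper are hypothetical, not the skill's actual implementation:

```python
import hashlib
import os

CACHE_DIR = "/tmp/voice-transcribe-cache"  # hypothetical cache location

def audio_sha256(audio_path: str) -> str:
    """Hash the audio file's contents so identical files map to one key."""
    h = hashlib.sha256()
    with open(audio_path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def cached_transcribe(audio_path: str, transcribe_fn) -> str:
    """Return a cached transcript if this exact audio was seen before,
    otherwise call transcribe_fn and store its result."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    cache_file = os.path.join(CACHE_DIR, audio_sha256(audio_path) + ".txt")
    if os.path.exists(cache_file):
        with open(cache_file) as f:
            return f.read()
    text = transcribe_fn(audio_path)
    with open(cache_file, "w") as f:
        f.write(text)
    return text
```

hashing the content (rather than the filename) means re-sent or renamed copies of the same voice memo never hit the API twice.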