Vibing: The Intelligent Voice Input Method 智能语音输入法

Just Speak It
More than a voice input method, Vibing understands your intention and types what you actually mean.
不只是语音输入法 Vibing理解你的意图 为你打出真正想说的话。
Powered by VibeVoice

Quick Start 快速开始

Hold key to talk, release to insert text. 按住说话,松开即输入文本。

① Press & Hold 按住 ② Speaking 说话 ③ Release 松开 ④ Auto-pasted 自动粘贴
Windows: Ctrl + Win
Mac: Right ⌥ Option
More input modes → 更多输入模式 →
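The hold-to-talk flow above (press, speak, release, auto-paste) can be pictured as a small state machine. The following is only a minimal sketch under stated assumptions: the class name `HoldToTalk`, the `transcribe` and `paste` callbacks, and the three states are all hypothetical and are not taken from Vibing's implementation.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECORDING = auto()
    PROCESSING = auto()


class HoldToTalk:
    """Hypothetical hold-to-talk controller (illustration only)."""

    def __init__(self, transcribe, paste):
        # transcribe: bytes -> str, paste: str -> None (both are assumed callbacks)
        self.state = State.IDLE
        self._transcribe = transcribe
        self._paste = paste
        self._chunks = []

    def key_down(self):
        # Step 1: pressing the hotkey starts a fresh recording.
        if self.state is State.IDLE:
            self._chunks = []
            self.state = State.RECORDING

    def audio_chunk(self, chunk: bytes):
        # Step 2: audio is only collected while the key is held.
        if self.state is State.RECORDING:
            self._chunks.append(chunk)

    def key_up(self):
        # Steps 3 and 4: releasing the key processes the audio and pastes the text.
        if self.state is not State.RECORDING:
            return
        self.state = State.PROCESSING
        text = self._transcribe(b"".join(self._chunks))
        self._paste(text)
        self.state = State.IDLE
```

In a real app the `key_down` and `key_up` calls would come from a global hotkey listener, and `paste` would write into the focused text field; both are left out here.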

What Can Vibing Do? Vibing 能做什么?

LLM-Powered Rewriting
What you said
Vibing can do three things, um, first it removes filler words like um and uh, second it handles auto-corrections, no, self-corrections, and uh, third it like auto-formats lists
What Vibing types
Vibing can do three things:
1. Remove filler words
2. Handle self-corrections
3. Auto-format lists
你说的
Vibing 能做三件事,,第一个是去掉嗯啊之类的填充词,第二个是处理口误自我更—不对自我纠正第三个就是自动把列表格式化
Vibing 输出
Vibing 能做三件事:
1. 去除填充词
2. 处理自我纠正
3. 自动格式化列表
Context-Aware Hotwords
Chat
Priyatham Reddy: "The new build of Kaleido is ready for testing."
You:
Other voice tools
Got it pretty thumb, I'll test the collide-o build this afternoon.
Vibing
Got it Priyatham, I'll test the Kaleido build this afternoon.
Chat
Priyatham Reddy: "Kaleido 的新版本已经可以测试了。"
你:
其他语音输入
收到普利亚森,我下午测一下卡雷多的新版本。
Vibing
收到 Priyatham,我下午测一下 Kaleido 的新版本。
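To make the idea of a hotword list concrete, here is a minimal sketch that snaps near-miss spellings onto a custom vocabulary using plain string similarity from Python's standard `difflib` module. This is an assumption-laden illustration, not Vibing's method: the function name `apply_hotwords` and the `cutoff` value are hypothetical, and string similarity alone would not recover a phonetic mishearing such as "pretty thumb" in the example above.

```python
import difflib
import re


def apply_hotwords(text: str, hotwords: list[str], cutoff: float = 0.8) -> str:
    """Replace words that closely resemble a hotword with the hotword's exact spelling."""
    # Map lowercase spellings back to the canonical form the user registered.
    lookup = {h.lower(): h for h in hotwords}

    def replace(match: re.Match) -> str:
        word = match.group(0)
        close = difflib.get_close_matches(word.lower(), list(lookup), n=1, cutoff=cutoff)
        return lookup[close[0]] if close else word

    # Only alphabetic runs are considered; punctuation and spacing are left untouched.
    return re.sub(r"[A-Za-z]+", replace, text)


hotwords = ["Priyatham", "Kaleido"]
print(apply_hotwords("Got it priyatam, I'll test the kaleidoo build.", hotwords))
```

A lower `cutoff` catches more misspellings but also raises the chance of rewriting an ordinary word, so the threshold is a trade-off rather than a fixed constant.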
AutoFormat
"Ugh this bug is driving me crazy, I've been debugging for two hours and it turns out it was just a typo in the config file, I can't believe it"
Vibing
💬 Chat → Keep the vibe Teams
Ugh this bug is driving me crazy, I've been debugging for two hours and it turns out it was just a typo in the config file, I can't believe it 😩
📄 Document → Just the facts Notion
Spent two hours debugging; root cause was a typo in the config file.
"我靠这个bug搞了我两小时,结果就是配置文件里一个拼写错误,我真的服了"
Vibing
💬 聊天 → 保留情绪 Teams
我靠这个bug搞了我两小时,结果就是配置文件里一个拼写错误,我真的服了😩
📄 文档 → 只留事实 Notion
排查两小时,根本原因是配置文件中的拼写错误。
Same words, different output. Vibing reads your screen — casual and emotional in chat, clean and factual in docs. 同样的话,不同的输出。Vibing 读取屏幕——聊天保留情绪和风格,文档只留事实。
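One way to picture "same words, different output" is a lookup from the detected app to a rewriting style. The sketch below is purely illustrative: the page says Vibing reads the screen, while this example assumes the app name is already known, and the tables `STYLE_BY_APP` and `INSTRUCTIONS` along with both function names are hypothetical.

```python
# Hypothetical mapping from app name to output style (an assumption, not Vibing's logic).
STYLE_BY_APP = {
    "teams": "chat",
    "notion": "document",
    "outlook": "email",
}

# Hypothetical rewriting instructions that could be handed to a language model.
INSTRUCTIONS = {
    "chat": "Keep the speaker's tone and emotion; only remove filler words.",
    "document": "Keep only the facts; write concise, neutral sentences.",
    "email": "Write a polite, complete reply with a greeting and sign-off.",
}


def pick_style(app_name: str, default: str = "document") -> str:
    """Return the output style for an app, falling back to a neutral default."""
    return STYLE_BY_APP.get(app_name.strip().lower(), default)


def build_instruction(app_name: str) -> str:
    """Return the rewriting instruction that matches the app's style."""
    return INSTRUCTIONS[pick_style(app_name)]


print(pick_style("Teams"))          # chat
print(build_instruction("Notion"))  # Keep only the facts; write concise, neutral sentences.
```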
Mixed Language
Voice input
Yesterday we discussed three issues on Teams: the UI polish, 模型那边的 latency optimization, and the prompt tuning
Output
Yesterday we discussed three issues on Teams: UI polish, latency optimization on the model side, and prompt tuning.
语音输入
昨天在 Teams 里我们讨论了三个 issue,一个是 UI 怎么做得更漂亮,一个是 latency 怎么从 model 层面去解决,还有一个是 prompt 的优化
Vibing 输出
昨天在 Teams 里我们讨论了三个 issue:一个是 UI 怎么做得更漂亮、更吸引人;一个是 latency,怎么从 model 层面去解决;还有一个是 prompt 的优化。
Switch between languages freely. Technical terms stay exactly as you said them — no unwanted "corrections". 中英自由切换,技术术语保留英文——"latency" 不会变成"延迟","prompt" 不会变成"提示词"。
Long-Form Dictation
Claude Code — Terminal
$ claude
Voice input
OK so basically what I want is, um, a voice input app, right, and it needs to like work system-wide on both Windows and Mac, and uh the key thing is it should use an LLM to clean up what you say, you know, remove filler words and fix grammar, and also it should be context-aware, like it reads your screen to figure out if you're in an email or a chat or a doc, and then format the output accordingly, oh and also it needs to support hotwords for technical terms, and um mixed language input like Chinese and English together, and I guess one more thing, it should be really fast, like the latency needs to be low enough that it feels real-time
Output
Build a system-wide voice input app (Windows + Mac) with the following requirements:

1. LLM-powered text cleanup — remove fillers, fix grammar, keep intent
2. Context-aware formatting — read screen to detect app context (email / chat / doc) and adapt output
3. Hotword support for technical terms and proper nouns
4. Mixed-language input (e.g., Chinese + English)
5. Low-latency pipeline — output should feel real-time
Claude Code — Terminal
$ claude
语音输入
我想做一个语音输入法,就是,它要能在Windows和Mac上全局使用,,然后最关键的是要用大模型来整理你说的话,就是那种,去掉语气词、修正语法,还要能感知上下文,比如说,它会截屏来判断你是在写邮件还是聊天还是写文档,然后自动调整输出格式,对了还有,要支持热词来识别专业术语,,还有中英文混合输入,我想想,还有一点,延迟要低,要像实时一样
Vibing 输出
开发一个全局语音输入法(Windows + Mac),具体需求如下:

1. 大模型文本清理——去除语气词、修正语法、保留核心意图
2. 上下文感知格式化——截屏识别当前应用(邮件/聊天/文档),自动调整输出格式
3. 热词支持:识别专有名词和技术术语
4. 中英文混合输入
5. 低延迟管线:输出需要达到实时体验
You can pause to think, correct yourself, and speak as long as you need. A 30-second stream of thought becomes a structured, actionable spec. 你可以中途停下来思考、修改之前说的话、想说多久就说多久。30 秒的口语需求变成结构清晰的技术规格。
Translation Mode
Reply — Outlook
From: IT Support <it@company.com>
Re: Hardware Request
Voice input (casual)
Yeah so basically my project needs a Mac for development, I'm building a new voice input experience, I'd like to borrow one for about three months
Output (formal email)
Hi IT Support,

My project requires a Mac for development — I'm building a new voice input experience. I'd like to borrow one for approximately three months.

Thanks,
[Your Name]
Casual Speech → Formal Email
Reply — Outlook
From: IT Support <it@company.com>
Re: Hardware Request
🎤 语音输入(中文)
我的项目是基于语音识别来做一个全新的输入法体验,需要用 Mac 来开发,希望能借一台 Mac Studio,大概三个月
✍️ Output (English email)
Hi IT Support,

My project is building a new voice input experience based on speech recognition, and I need a Mac for development. I'd like to borrow a Mac Studio for 3 months.

Thanks,
[Your Name]
🎤 中文 → ✍️ English
Speak in your language, output in another. Vibing detects the email context and formats your speech as a proper reply. 用母语说,用目标语言输出。Vibing 检测到邮件上下文,自动将中文语音转为英文邮件回复。

Key Features 核心特性

🧠

Context-Aware Intent 上下文意图理解

Understands what you mean, not just what you say. 理解你的意图,而非仅仅转录你说的话。

LLM-Powered Rewriting AI 智能润色

AI rewrites your speech into polished, context-appropriate text. AI 将语音润色为合适的文字,匹配当前上下文。

🔀

Mixed-Language Input 混合语言输入

Switch between languages freely within a single sentence. 一句话中自由切换语言,技术术语自动保留。

🌐

Translation 实时翻译

Real-time voice translation across languages. 实时语音翻译,跨语言沟通。

🏷️

Personalized Hotwords 个性化热词

Custom vocabulary for names, jargon, and domain-specific terms. 自定义热词,精准识别人名、术语和专业词汇。

🗣️

Multilingual 多语言支持

Speak in any of 50+ languages with automatic detection. 支持 50+ 语言,自动检测语种。

⏱️

Long-Form Voice Input 长篇语音输入

Up to 10 minutes of continuous speech in a single recording. 单次录音最长支持 10 分钟的连续语音输入。

Installation Guide 安装指南

Windows

Microsoft Store (Recommended 推荐)

Automatic updates, no SmartScreen warnings. 自动更新,无 SmartScreen 弹窗。

Get it from Microsoft Store 从 Microsoft Store 获取

EXE Installer

Standard installation. May trigger Windows SmartScreen — click "More info" then "Run anyway". 标准安装包。可能触发 SmartScreen 弹窗——点击"更多信息"后选择"仍要运行"。

Download EXE

ZIP Portable

No installation needed — extract and run. May trigger SmartScreen — click "More info" then "Run anyway". 免安装便携版——解压即用。可能触发 SmartScreen 弹窗——点击"更多信息"后选择"仍要运行"。

Download ZIP

Mac

DMG Installer

macOS requires Accessibility and Screen Recording permissions during setup. See the detailed Mac Setup Guide for step-by-step instructions. macOS 需要在安装时授予辅助功能和屏幕录制权限。请参阅 Mac 安装指南获取详细步骤。

Download DMG Mac Setup Guide → Mac 安装指南 →

Video Introduction 视频介绍

Privacy 隐私

Vibing sends audio and a screenshot of your active window to the cloud for processing. Both are deleted immediately after — nothing is stored, nothing is used for training. Vibing 将音频和当前窗口截图发送到云端处理。处理完成后立即删除——不存储,不用于训练。

Screenshots are used solely for context-aware formatting — helping Vibing understand whether you're in an email, a document, or a chat. 截图仅用于上下文感知格式化——帮助 Vibing 理解你是在邮件、文档还是聊天中。

FAQ

Built-in dictation gives raw speech-to-text. Vibing understands context — the app, language, format — and outputs text that fits. 系统自带语音输入只做逐字转录。Vibing 理解上下文——当前应用、语言、格式——输出匹配场景的文字。
No. Internet connection required for speech recognition and AI rewriting. 不可以。语音识别和 AI 润色需要联网。
50+ languages with automatic detection. Mixed-language input fully supported. 50+ 语言,自动检测语种,完全支持中英混合输入。
Audio and screenshots are deleted immediately after processing. Nothing is stored or used for training. 音频和截图在处理完成后立即删除,不会存储或用于训练。
Yes. Configurable in Settings. Toggle and hold-to-record modes both supported. 可以。在设置中配置。支持切换模式和按住模式。
Yes. Any app that accepts text input — editors, browsers, chat, terminals, email. 是的。任何接受文字输入的应用——编辑器、浏览器、聊天、终端、邮件。

Ready to Vibe? 准备好了吗?

Download Vibing and start speaking. 下载 Vibing,开口即用。