feat(docs): 更新文档、添加 macOS 平台适配指南

This commit is contained in:
himeditator mac
2025-07-08 22:44:11 +08:00
parent cbbaaa95a3
commit 3c9138f115
15 changed files with 463 additions and 244 deletions

View File

@@ -1,12 +1,14 @@
# Auto Caption User Manual
Corresponding Version: v0.2.0
Corresponding Version: v0.3.0
## Software Introduction
Auto Caption is a cross-platform caption display software that can real-time capture system audio input (recording) or output (playback) streaming data and use an audio-to-text model to generate captions for the corresponding audio. The default caption engine provided by the software (using Alibaba Cloud Gummy model) supports recognition and translation in nine languages (Chinese, English, Japanese, Korean, German, French, Russian, Spanish, Italian).
Currently, the default caption engine only has full functionality on the Windows platform. On the Linux platform, it can only generate captions for audio input (microphone) and does not support generating captions for audio output (playback).
Currently, the default caption engine of the software only has full functionality on Windows and macOS platforms. Additional configuration is required to capture system audio output on macOS.
On Linux platforms, it can only generate captions for audio input (microphone), and currently does not support generating captions for audio output (playback).
![](../../assets/media/main_en.png)
@@ -14,6 +16,8 @@ Currently, the default caption engine only has full functionality on the Windows
To use the default caption service, you need to obtain an API KEY from Alibaba Cloud.
Additional configuration is required to capture audio output on macOS platform.
The software is built using Electron, so the software size is inevitably large.
## Software Usage
@@ -29,6 +33,22 @@ Alibaba Cloud provides detailed tutorials for this:
- [Obtain API KEY (Chinese)](https://help.aliyun.com/zh/model-studio/get-api-key)
- [Configure API Key in Environment Variables (Chinese)](https://help.aliyun.com/zh/model-studio/configure-api-key-through-environment-variables)
### Capturing System Audio Output on macOS
The caption engine cannot directly capture system audio output on macOS platform and requires additional driver installation. The current caption engine uses [BlackHole](https://github.com/ExistentialAudio/BlackHole). First open Terminal and execute one of the following commands (recommended to choose the first one):
```bash
brew install blackhole-2ch
brew install blackhole-16ch
brew install blackhole-64ch
```
After installation completes, open `Audio MIDI Setup` (searchable via `cmd + space`). Check if BlackHole appears in the device list - if not, restart your computer.
Once BlackHole is confirmed installed, in the `Audio MIDI Setup` page, click the plus (+) button at bottom left and select "Create Multi-Output Device". Include both BlackHole and your desired audio output destination in the outputs. Finally, set this multi-output device as your default audio output device.
Now the caption engine can capture system audio output and generate captions.
### Modifying Settings
Caption settings can be divided into three categories: general settings, caption engine settings, and caption style settings. Note that changes to general settings take effect immediately. For the other two categories, after making changes, you need to click the "Apply" option in the upper right corner of the corresponding settings module for the changes to take effect. If you click "Cancel Changes," the current modifications will not be saved and will revert to the previous state.