mirror of
https://github.com/HiMeditator/auto-caption.git
synced 2026-03-16 03:17:34 +08:00
release v0.5.0
- 更新了发行说明和用户手册 - 优化了界面显示和功能 - 过滤 Gummy 字幕引擎输出的不完整字幕
This commit is contained in:
@@ -1,14 +1,21 @@
|
||||
# Auto Caption User Manual
|
||||
|
||||
Corresponding Version: v0.4.0
|
||||
Corresponding Version: v0.5.0
|
||||
|
||||
## Software Introduction
|
||||
|
||||
Auto Caption is a cross-platform caption display software that can real-time capture system audio input (recording) or output (playback) streaming data and use an audio-to-text model to generate captions for the corresponding audio. The default caption engine provided by the software (using Alibaba Cloud Gummy model) supports recognition and translation in nine languages (Chinese, English, Japanese, Korean, German, French, Russian, Spanish, Italian).
|
||||
|
||||
Currently, the default caption engine of the software only has full functionality on Windows and macOS platforms. Additional configuration is required to capture system audio output on macOS.
|
||||
The default caption engine currently has full functionality on Windows, macOS, and Linux platforms. Additional configuration is required to capture system audio output on macOS.
|
||||
|
||||
On Linux platforms, it can only generate captions for audio input (microphone), and currently does not support generating captions for audio output (playback).
|
||||
The following operating system versions have been tested and confirmed to work properly. The software cannot guarantee normal operation on untested OS versions.
|
||||
|
||||
| OS Version | Architecture | Audio Input Capture | Audio Output Capture |
|
||||
| ------------------ | ------------ | ------------------- | -------------------- |
|
||||
| Windows 11 24H2 | x64 | ✅ | ✅ |
|
||||
| macOS Sequoia 15.5 | arm64 | ✅ Additional config required | ✅ |
|
||||
| Ubuntu 24.04.2 | x64 | ✅ | ✅ |
|
||||
| Kali Linux 2022.3 | x64 | ✅ | ✅ |
|
||||
|
||||

|
||||
|
||||
@@ -63,28 +70,28 @@ Now the caption engine can capture system audio output and generate captions.
|
||||
|
||||
## Getting System Audio Output on Linux
|
||||
|
||||
Execute the following commands to install `pulseaudio` and `pavucontrol`:
|
||||
|
||||
```bash
|
||||
# For Debian or Ubuntu, etc.
|
||||
sudo apt install pulseaudio pavucontrol
|
||||
# For CentOS, etc.
|
||||
sudo yum install pulseaudio pavucontrol
|
||||
```
|
||||
|
||||
Then execute:
|
||||
First execute in the terminal:
|
||||
|
||||
```bash
|
||||
pactl list short sources
|
||||
```
|
||||
|
||||
If you see output similar to the following, the configuration was successful:
|
||||
If you see output similar to the following, no additional configuration is needed:
|
||||
|
||||
```bash
|
||||
220 alsa_output.pci-0000_02_02.0.3.analog-stereo.monitor PipeWire s16le 2ch 48000Hz SUSPENDED
|
||||
221 alsa_input.pci-0000_02_02.0.3.analog-stereo PipeWire s16le 2ch 48000Hz SUSPENDED
|
||||
```
|
||||
|
||||
Otherwise, install `pulseaudio` and `pavucontrol` using the following commands:
|
||||
|
||||
```bash
|
||||
# For Debian/Ubuntu etc.
|
||||
sudo apt install pulseaudio pavucontrol
|
||||
# For CentOS etc.
|
||||
sudo yum install pulseaudio pavucontrol
|
||||
```
|
||||
|
||||
## Software Usage
|
||||
|
||||
### Modifying Settings
|
||||
@@ -103,7 +110,7 @@ The following image shows the caption display window, which displays the latest
|
||||
|
||||
### Exporting Caption Records
|
||||
|
||||
In the caption control window, you can see the records of all collected captions. Click the "Export Caption Records" button to export the caption records as a JSON file.
|
||||
In the caption control window, you can see the records of all collected captions. Click the "Export Log" button to export the caption records as a JSON or SRT file.
|
||||
|
||||
## Caption Engine
|
||||
|
||||
|
||||
Reference in New Issue
Block a user