diff --git a/README.md b/README.md
index 051617b..94da6f0 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@
Auto Caption 是一个跨平台的实时字幕显示软件。
-
+
@@ -18,7 +18,7 @@
| English
| 日本語 |
-v0.5.1 版本已经发布。目前 Vosk 本地字幕引擎效果较差,且不含翻译,更优秀的字幕引擎正在尝试开发中...
+v0.6.0 版本已经发布,对字幕引擎代码进行了大重构,提升了代码的可扩展性。更多的字幕引擎正在尝试开发中...
@@ -33,7 +33,9 @@
[字幕引擎说明文档](./docs/engine-manual/zh.md)
-[项目 API 文档](./docs/api-docs/electron-ipc.md)
+[项目 API 文档](./docs/api-docs/)
+
+[更新日志](./docs/CHANGELOG.md)
## 📖 基本使用
@@ -122,7 +124,7 @@ npm install
### 构建字幕引擎
-首先进入 `engine` 文件夹,执行如下指令创建虚拟环境(需要使用大于等于 Python 3.10 的 Python 运行环境):
+首先进入 `engine` 文件夹,执行如下指令创建虚拟环境(需要使用大于等于 Python 3.10 的 Python 运行环境,建议使用 Python 3.12):
```bash
# in ./engine folder
@@ -140,7 +142,7 @@ subenv/Scripts/activate
source subenv/bin/activate
```
-然后安装依赖(这一步可能会报错,一般是因为构建失败,需要根据报错信息安装对应的工具包):
+然后安装依赖(这一步在 macOS 和 Linux 可能会报错,一般是因为构建失败,需要根据报错信息进行处理):
```bash
# Windows
@@ -151,7 +153,7 @@ pip install -r requirements_darwin.txt
pip install -r requirements_linux.txt
```
-如果在 Linux 系统上安装 samplerate 模块报错,可以尝试使用以下命令单独安装:
+如果在 Linux 系统上安装 `samplerate` 模块报错,可以尝试使用以下命令单独安装:
```bash
pip install samplerate --only-binary=:all:
@@ -163,7 +165,7 @@ pip install samplerate --only-binary=:all:
pyinstaller ./main.spec
```
-注意 `main.spec` 文件中 `vosk` 库的路径可能不正确,需要根据实际状况配置。
+注意 `main.spec` 文件中 `vosk` 库的路径可能不正确,需要根据实际状况配置(与 Python 环境的版本相关)。
```
# Windows
diff --git a/README_en.md b/README_en.md
index 3bffe2b..b51f799 100644
--- a/README_en.md
+++ b/README_en.md
@@ -4,7 +4,7 @@
Auto Caption is a cross-platform real-time caption display software.
-
+
@@ -18,7 +18,7 @@
| English
| 日本語 |
-Version v0.5.1 has been released. The current Vosk local caption engine performs poorly and does not include translation. A better caption engine is under development...
+Version v0.6.0 has been released, featuring a major refactor of the caption engine code to improve extensibility. More caption engines are under development...
@@ -33,7 +33,9 @@
[Caption Engine Documentation](./docs/engine-manual/en.md)
-[Project API Documentation (Chinese)](./docs/api-docs/electron-ipc.md)
+[Project API Documentation (Chinese)](./docs/api-docs/)
+
+[Changelog](./docs/CHANGELOG.md)
## 📖 Basic Usage
@@ -122,7 +124,7 @@ npm install
### Build Subtitle Engine
-First enter the `engine` folder and execute the following commands to create a virtual environment:
+First enter the `engine` folder and execute the following commands to create a virtual environment (requires Python 3.10 or higher, with Python 3.12 recommended):
```bash
# in ./engine folder
@@ -140,7 +142,7 @@ subenv/Scripts/activate
source subenv/bin/activate
```
-Then install dependencies (this step may fail, usually due to build failures - you'll need to install the corresponding tool packages based on the error messages):
+Then install dependencies (this step might result in errors on macOS and Linux, usually due to build failures, and you need to handle them based on the error messages):
```bash
# Windows
@@ -160,11 +162,10 @@ pip install samplerate --only-binary=:all:
Then use `pyinstaller` to build the project:
```bash
-pyinstaller ./main-gummy.spec
-pyinstaller ./main-vosk.spec
+pyinstaller ./main.spec
```
-Note that the path to the `vosk` library in `main-vosk.spec` might be incorrect and needs to be configured according to the actual situation.
+Note that the path to the `vosk` library in `main.spec` might be incorrect and needs to be configured according to the actual situation (related to the version of the Python environment).
```
# Windows
@@ -197,13 +198,9 @@ Note: You need to modify the configuration content in the `electron-builder.yml`
```yml
extraResources:
  # For Windows
-  - from: ./engine/dist/main-gummy.exe
-    to: ./engine/main-gummy.exe
-  - from: ./engine/dist/main-vosk.exe
-    to: ./engine/main-vosk.exe
+  - from: ./engine/dist/main.exe
+    to: ./engine/main.exe
  # For macOS and Linux
-  # - from: ./engine/dist/main-gummy
-  #   to: ./engine/main-gummy
-  # - from: ./engine/dist/main-vosk
-  #   to: ./engine/main-vosk
-```
+  # - from: ./engine/dist/main
+  #   to: ./engine/main
+```
\ No newline at end of file
diff --git a/README_ja.md b/README_ja.md
index b438ddb..d046119 100644
--- a/README_ja.md
+++ b/README_ja.md
@@ -4,7 +4,7 @@
Auto Caption はクロスプラットフォームのリアルタイム字幕表示ソフトウェアです。
-
+
@@ -18,7 +18,7 @@
| English
| 日本語 |
-バージョン v0.5.1 がリリースされました。現在の Vosk ローカル字幕エンジンは性能が低く、翻訳機能も含まれていません。より優れた字幕エンジンを開発中です...
+v0.6.0 バージョンがリリースされ、字幕エンジンコードが大規模にリファクタリングされ、コードの拡張性が向上しました。より多くの字幕エンジンの開発が試みられています...
@@ -33,7 +33,9 @@
[字幕エンジン説明ドキュメント](./docs/engine-manual/ja.md)
-[プロジェクト API ドキュメント(中国語)](./docs/api-docs/electron-ipc.md)
+[プロジェクト API ドキュメント(中国語)](./docs/api-docs/)
+
+[更新履歴](./docs/CHANGELOG.md)
## 📖 基本使い方
@@ -122,7 +124,7 @@ npm install
### 字幕エンジンの構築
-まず `engine` フォルダに入り、以下のコマンドを実行して仮想環境を作成します:
+まず `engine` フォルダに入り、以下のコマンドを実行して仮想環境を作成します(Python 3.10 以上が必要で、Python 3.12 が推奨されます):
```bash
# ./engine フォルダ内
@@ -140,7 +142,7 @@ subenv/Scripts/activate
source subenv/bin/activate
```
-次に依存関係をインストールします(このステップは失敗する可能性があります、通常はビルド失敗が原因です - エラーメッセージに基づいて対応するツールパッケージをインストールする必要があります):
+次に依存関係をインストールします(このステップでは macOS と Linux でエラーが発生する可能性があります。通常はビルド失敗によるもので、エラーメッセージに基づいて対処する必要があります):
```bash
# Windows
@@ -151,7 +153,7 @@ pip install -r requirements_darwin.txt
pip install -r requirements_linux.txt
```
-Linuxシステムで`samplerate`モジュールのインストールに問題が発生した場合、以下のコマンドで個別にインストールを試すことができます:
+Linux システムで `samplerate` モジュールのインストールに問題が発生した場合、以下のコマンドで個別にインストールを試すことができます:
```bash
pip install samplerate --only-binary=:all:
@@ -160,16 +162,15 @@ pip install samplerate --only-binary=:all:
その後、`pyinstaller` を使用してプロジェクトをビルドします:
```bash
-pyinstaller ./main-gummy.spec
-pyinstaller ./main-vosk.spec
+pyinstaller ./main.spec
```
-`main-vosk.spec` ファイル内の `vosk` ライブラリのパスが正しくない可能性があるため、実際の状況に応じて設定する必要があります。
+`main.spec` ファイル内の `vosk` ライブラリのパスが正しくない可能性があるため、実際の状況(Python 環境のバージョンに関連)に応じて設定する必要があります。
```
# Windows
vosk_path = str(Path('./subenv/Lib/site-packages/vosk').resolve())
-# LinuxまたはmacOS
+# Linux または macOS
vosk_path = str(Path('./subenv/lib/python3.x/site-packages/vosk').resolve())
```
@@ -196,14 +197,10 @@ npm run build:linux
```yml
extraResources:
-  # Windows用
-  - from: ./engine/dist/main-gummy.exe
-    to: ./engine/main-gummy.exe
-  - from: ./engine/dist/main-vosk.exe
-    to: ./engine/main-vosk.exe
-  # macOSとLinux用
-  # - from: ./engine/dist/main-gummy
-  #   to: ./engine/main-gummy
-  # - from: ./engine/dist/main-vosk
-  #   to: ./engine/main-vosk
+  # Windows 用
+  - from: ./engine/dist/main.exe
+    to: ./engine/main.exe
+  # macOS と Linux 用
+  # - from: ./engine/dist/main
+  #   to: ./engine/main
```
diff --git a/assets/media/main_en.png b/assets/media/main_en.png
index 2959fcf..cb3bd13 100644
Binary files a/assets/media/main_en.png and b/assets/media/main_en.png differ
diff --git a/assets/media/main_ja.png b/assets/media/main_ja.png
index 8582f9c..eb4a2f4 100644
Binary files a/assets/media/main_ja.png and b/assets/media/main_ja.png differ
diff --git a/assets/media/main_zh.png b/assets/media/main_zh.png
index 4a6c3aa..d8fd27c 100644
Binary files a/assets/media/main_zh.png and b/assets/media/main_zh.png differ
diff --git a/assets/media/vosk_en.png b/assets/media/vosk_en.png
index aca13f1..ea991b9 100644
Binary files a/assets/media/vosk_en.png and b/assets/media/vosk_en.png differ
diff --git a/assets/media/vosk_ja.png b/assets/media/vosk_ja.png
index 6c483d3..e09051c 100644
Binary files a/assets/media/vosk_ja.png and b/assets/media/vosk_ja.png differ
diff --git a/assets/media/vosk_zh.png b/assets/media/vosk_zh.png
index b20e837..72e2111 100644
Binary files a/assets/media/vosk_zh.png and b/assets/media/vosk_zh.png differ
diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md
index 0f51d25..e6de0aa 100644
@@ -117,7 +117,7 @@
## v0.6.0
-2025-07-xx
+2025-07-29
### 新增功能
@@ -125,11 +125,15 @@
### 优化体验
-- 交换窗口界面信息和错误提示弹窗的位置,防止提示信息挡住操作
+- 减小了软件安装包的体积
+- 微调字幕引擎设置界面布局
+- 交换窗口界面信息弹窗和错误弹窗的位置,防止提示信息挡住操作
+- 提高程序健壮性,完全避免字幕引擎进程成为孤儿进程
+- 修改字幕引擎文档,添加更详细的开发说明
### 项目优化
- 重构字幕引擎,提升字幕引擎代码的可扩展性和可读性
-- 合并 Gummy 和 Vosk 引擎为单个可执行文件,减小软件体积
-- 字幕引擎和主程序添加 WebScoket 通信,完全避免字幕引擎成为孤儿进程
+- 合并 Gummy 和 Vosk 引擎为单个可执行文件
+- 字幕引擎和主程序添加 Socket 通信,完全避免字幕引擎成为孤儿进程
diff --git a/docs/TODO.md b/docs/TODO.md
index a725c93..3a14d6b 100644
--- a/docs/TODO.md
+++ b/docs/TODO.md
@@ -17,10 +17,10 @@
- [x] 可以获取字幕引擎的系统资源消耗情况 *2025/07/15*
- [x] 添加字幕记录按时间降序排列选择 *2025/07/26*
- [x] 重构字幕引擎 *2025/07/28*
+- [x] 优化前端界面提示消息 *2025/07/29*
## 待完成
-- [ ] 优化前端界面提示消息
- [ ] 验证 / 添加基于 sherpa-onnx 的字幕引擎
## 后续计划
diff --git a/docs/api-docs/caption-engine.md b/docs/api-docs/caption-engine.md
index 52799b5..873d286 100644
@@ -6,7 +6,7 @@
本项目的 Python 进程通过标准输出向 Electron 主进程发送数据。Python 进程标准输出 (`sys.stdout`) 的内容一定为一行一行的字符串。且每行字符串均可以解释为一个 JSON 对象。每个 JSON 对象一定有 `command` 参数。
-Electron 主进程通过 WebSocket 向 Python 进程发送数据。发送的数据均是转化为字符串的对象,对象格式一定为:
+Electron 主进程通过 TCP Socket 向 Python 进程发送数据。发送的数据均是转化为字符串的对象,对象格式一定为:
```js
{
  command: string,
  content: string
}
```
@@ -30,7 +30,7 @@
}
```
-字幕引擎 WebSocket 服务已经准备好,命令 Electron 主进程连接字幕引擎 WebSocket 服务
+字幕引擎 TCP Socket 服务已经准备好,命令 Electron 主进程连接字幕引擎 Socket 服务
### `kill`
@@ -91,7 +91,7 @@
Python 端打印的提示信息,比起 `print`,该信息更希望 Electron 主进程用弹窗的形式显示。
### `usage`
Gummy 字幕引擎结束时打印计费消耗信息。
-## WebSocket
+## TCP Socket
> 数据传递方向:Electron 主进程 => 字幕引擎进程
@@ -99,4 +99,11 @@
### `stop`
+```js
+{
+  command: "stop",
+  content: ""
+}
+
+```
命令当前字幕引擎停止监听并结束任务。
\ No newline at end of file
diff --git a/docs/engine-manual/en.md b/docs/engine-manual/en.md
index 78d6e27..aa9afd0 100644
--- a/docs/engine-manual/en.md
+++ b/docs/engine-manual/en.md
@@ -1,203 +1,199 @@
-# Caption Engine Documentation
+# Caption Engine Documentation
-Corresponding Version: v0.5.1
+Corresponding Version: v0.6.0
-**Note: Due to limited personal resources, the English and Japanese documentation files for this project (except for the README document) will no longer be maintained. The content of this document may not be consistent with the latest version of the project. If you are willing to help with translation, please submit relevant Pull Requests.**
+
-
+## Introduction to the Caption Engine
-## Introduction to the Caption Engine
+The so-called caption engine is essentially a subprogram that continuously captures real-time streaming data from the system's audio input (microphone) or output (speakers) and invokes an audio-to-text model to generate corresponding captions for the audio. The generated captions are converted into JSON-formatted string data and passed to the main program via standard output (ensuring the string can be correctly interpreted as a JSON object by the main program). The main program reads and interprets the caption data, processes it, and displays it in the window.
-The so-called caption engine is actually a subprogram that captures real-time streaming data from the system's audio input (recording) or output (playing sound) and calls an audio-to-text model to generate captions for the corresponding audio. The generated captions are converted into a JSON-formatted string and passed to the main program through standard output (it must be ensured that the string read by the main program can be correctly interpreted as a JSON object). The main program reads and interprets the caption data, processes it, and then displays it on the window.
+**The communication standard between the caption engine process and the Electron main process is: [caption engine api-doc](../api-docs/caption-engine.md).**
-## Functions Required by the Caption Engine
+## Workflow
-### Audio Acquisition
+The communication flow between the main process and the caption engine:
-First, your caption engine needs to capture streaming data from the system's audio input (recording) or output (playing sound). If using Python for development, you can use the PyAudio library to obtain microphone audio input data (cross-platform). Use the PyAudioWPatch library to get system audio output (Windows platform only).
+### Starting the Engine
-Generally, the captured audio stream data consists of short audio chunks, and the size of these chunks should be adjusted according to the model. For example, Alibaba Cloud's Gummy model performs better with 0.05-second audio chunks compared to 0.2-second ones.
+- **Main Process**: Uses `child_process.spawn()` to launch the caption engine process.
+- **Caption Engine Process**: Creates a TCP Socket server thread. After creation, it outputs a JSON object string via standard output, containing a `command` field with the value `connect`.
+- **Main Process**: Monitors the standard output of the caption engine process, attempts to split it line by line, parses it into a JSON object, and checks whether the `command` field value is `connect`. If so, it connects to the TCP Socket server.
-### Audio Processing
+### Caption Recognition
-The acquired audio stream may need preprocessing before being converted to text. For instance, Alibaba Cloud's Gummy model can only recognize single-channel audio streams, while the collected audio streams are typically dual-channel, thus requiring conversion from dual-channel to single-channel. Channel conversion can be achieved using methods in the NumPy library.
+- **Caption Engine Process**: The main thread monitors system audio output, sends audio data chunks to the speech-recognition model for parsing, and outputs the parsed caption data as object strings via standard output.
+- **Main Process**: Continues to monitor the standard output of the caption engine and performs different operations based on the `command` field of each parsed object.
-You can directly use the audio acquisition (`engine/sysaudio`) and audio processing (`engine/audioprcs`) modules I have developed.
+### Closing the Engine
-### Audio to Text Conversion
+- **Main Process**: When the user closes the caption engine via the frontend, the main process sends a JSON object string with the `command` field set to `stop` to the caption engine process via Socket communication.
+- **Caption Engine Process**: Receives the object string, parses it, and, if the `command` field is `stop`, sets the global variable `thread_data.status` to `stop`.
+- **Caption Engine Process**: The main thread's loop for monitoring system audio output ends once `thread_data.status` is no longer `running`, releases resources, and terminates.
+- **Main Process**: Detects the termination of the caption engine process, performs the corresponding cleanup, and reports back to the frontend.
-After obtaining the appropriate audio stream, you can convert it into text. This is generally done using various models based on your requirements.
+## Implemented Features
-A nearly complete implementation of a caption engine is as follows:
+The following features are already implemented and can be reused directly.
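+Putting the three phases above together, the entry point of an engine can be summarized in the following minimal sketch. This is an illustration only: the port number and the `EchoModel` placeholder are assumptions, while `start_server`, `thread_data`, `stdout_cmd`, and `AudioStream` are the reusable helpers documented below.
+
+```python
+from utils import start_server, stdout_cmd, thread_data
+from sysaudio import AudioStream
+
+class EchoModel:
+    """Placeholder for a real model wrapper (see "Audio-to-Text Conversion" below)."""
+    def start(self): pass
+    def send_audio_frame(self, data: bytes): pass
+    def stop(self): pass
+
+PORT = 8080                       # assumed port, pick any free one
+start_server(PORT)                # spawn the TCP Socket server thread
+stdout_cmd("connect", str(PORT))  # tell the Electron main process to connect
+
+stream = AudioStream(0, 20)       # system output audio, 20 chunks per second
+model = EchoModel()
+stream.open_stream()
+model.start()
+while thread_data.status == 'running':  # set to "stop" by the server thread
+    model.send_audio_frame(stream.read_chunk())
+stream.close_stream()             # loop ended, release audio resources
+model.stop()
+```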
-```python
-import sys
-import argparse
+### Standard Output
-# Import system audio acquisition module
-if sys.platform == 'win32':
-    from sysaudio.win import AudioStream
-elif sys.platform == 'darwin':
-    from sysaudio.darwin import AudioStream
-elif sys.platform == 'linux':
-    from sysaudio.linux import AudioStream
-else:
-    raise NotImplementedError(f"Unsupported platform: {sys.platform}")
+Supports printing general information, commands, and error messages.
-# Import audio processing functions
-from audioprcs import mergeChunkChannels
-# Import audio-to-text module
-from audio2text import InvalidParameter, GummyTranslator
+Example:
+```python
+from utils import stdout, stdout_cmd, stdout_obj, stderr
+stdout("Hello") # {"command": "print", "content": "Hello"}\n
+stdout_cmd("connect", "8080") # {"command": "connect", "content": "8080"}\n
+stdout_obj({"command": "print", "content": "Hello"})
+stderr("Error Info")
+```
-def convert_audio_to_text(s_lang, t_lang, audio_type, chunk_rate, api_key):
-    # Set standard output to line buffering
-    sys.stdout.reconfigure(line_buffering=True) # type: ignore
+### Creating a Socket Service
-    # Create instances for audio acquisition and speech-to-text
-    stream = AudioStream(audio_type, chunk_rate)
-    if t_lang == 'none':
-        gummy = GummyTranslator(stream.RATE, s_lang, None, api_key)
-    else:
-        gummy = GummyTranslator(stream.RATE, s_lang, t_lang, api_key)
+This Socket service listens on a specified port, parses content sent by the Electron main program, and may modify the value of `thread_data.status`.
-    # Start instances
-    stream.openStream()
-    gummy.start()
+Example:
-    while True:
-        try:
-            # Read audio stream data
-            chunk = stream.read_chunk()
-            chunk_mono = mergeChunkChannels(chunk, stream.CHANNELS)
-            try:
-                # Call the model for translation
-                gummy.send_audio_frame(chunk_mono)
-            except InvalidParameter:
-                gummy.start()
-                gummy.send_audio_frame(chunk_mono)
-        except KeyboardInterrupt:
-            stream.closeStream()
-            gummy.stop()
-            break
-```
+```python
+from utils import start_server
+from utils import thread_data
+port = 8080
+start_server(port)
+while thread_data.status == 'running':
+    # do something
+    pass
+```
-### Caption Translation
+### Audio Capture
-Some speech-to-text models don't provide translation functionality, requiring an additional translation module. This part can use either cloud-based translation APIs or local translation models.
+The `AudioStream` class captures audio data and is cross-platform, supporting Windows, Linux, and macOS. Its initialization includes two parameters:
-### Data Transmission
+- `audio_type`: The type of audio to capture. `0` for system output audio (speakers), `1` for system input audio (microphone).
+- `chunk_rate`: The frequency of audio data capture, i.e., the number of audio chunks captured per second.
-After obtaining the text of the current audio stream, it needs to be transmitted to the main program. The caption engine process passes the caption data to the Electron main process through standard output.
+The class includes three methods:
-The content transmitted must be a JSON string, where the JSON object must contain the following parameters:
+- `open_stream()`: Starts audio capture.
+- `read_chunk() -> bytes`: Reads an audio chunk.
+- `close_stream()`: Stops audio capture.
-```typescript
-export interface CaptionItem {
-    index: number, // Caption sequence number
-    time_s: string, // Caption start time
-    time_t: string, // Caption end time
-    text: string, // Caption content
-    translation: string // Caption translation
-}
-```
+Example:
-**It is essential to ensure that each time we output caption JSON data, the buffer is flushed, ensuring that the string received by the Electron main process can always be interpreted as a JSON object.**
+```python
+from sysaudio import AudioStream
+audio_type = 0
+chunk_rate = 20
+stream = AudioStream(audio_type, chunk_rate)
+stream.open_stream()
+while True:
+    data = stream.read_chunk()
+    # do something with data
+    pass
+stream.close_stream()
+```
-If using Python, you can refer to the following method to pass data to the main program:
+### Audio Processing
-```python
-# engine\main-gummy.py
-sys.stdout.reconfigure(line_buffering=True)
+The captured audio stream may require preprocessing before conversion to text. Typically, multi-channel audio needs to be converted to mono, and resampling may be necessary. This project provides three audio processing functions:
-# engine\audio2text\gummy.py
-...
-    def send_to_node(self, data):
-        """
-        Send data to the Node.js process
-        """
-        try:
-            json_data = json.dumps(data) + '\n'
-            sys.stdout.write(json_data)
-            sys.stdout.flush()
-        except Exception as e:
-            print(f"Error sending data to Node.js: {e}", file=sys.stderr)
-...
-```
+- `merge_chunk_channels(chunk: bytes, channels: int) -> bytes`: Converts a multi-channel audio chunk to mono.
+- `resample_chunk_mono(chunk: bytes, channels: int, orig_sr: int, target_sr: int, mode="sinc_best") -> bytes`: Converts a multi-channel audio chunk to mono and resamples it.
+- `resample_mono_chunk(chunk: bytes, orig_sr: int, target_sr: int, mode="sinc_best") -> bytes`: Resamples a mono audio chunk.
-Data receiver code is as follows:
+## Features to Be Implemented in the Caption Engine
+### Audio-to-Text Conversion
-```typescript
-// src\main\utils\engine.ts
-...
-    this.process.stdout.on('data', (data) => {
-        const lines = data.toString().split('\n');
-        lines.forEach((line: string) => {
-            if (line.trim()) {
-                try {
-                    const caption = JSON.parse(line);
-                    addCaptionLog(caption);
-                } catch (e) {
-                    controlWindow.sendErrorMessage('Unable to parse the output from the caption engine as a JSON object: ' + e)
-                    console.error('[ERROR] Error parsing JSON:', e);
-                }
-            }
-        });
-    });
+After obtaining a suitable audio stream, it needs to be converted to text. Typically, various models (cloud-based or local) are used for this purpose. Choose the appropriate model based on requirements.
-    this.process.stderr.on('data', (data) => {
-        controlWindow.sendErrorMessage('Caption engine error: ' + data)
-        console.error(`[ERROR] Subprocess Error: ${data}`);
-    });
-...
-```
+It is recommended to encapsulate this part as a class with three methods:
-## Usage of Caption Engine
+- `start(self)`: Starts the model.
+- `send_audio_frame(self, data: bytes)`: Processes the current audio chunk data. **The generated caption data is sent to the Electron main process via standard output.**
+- `stop(self)`: Stops the model.
-### Command Line Parameter Specification
+Complete caption engine examples:
-The custom caption engine settings are specified via command line parameters. Common required parameters are as follows:
+- [gummy.py](../../engine/audio2text/gummy.py)
+- [vosk.py](../../engine/audio2text/vosk.py)
-```python
-import argparse
+### Caption Translation
-...
+Some speech-to-text models do not provide translation. If needed, a translation module must be added.
-if __name__ == "__main__":
-    parser = argparse.ArgumentParser(description='Convert system audio stream to text')
-    parser.add_argument('-s', '--source_language', default='en', help='Source language code')
-    parser.add_argument('-t', '--target_language', default='zh', help='Target language code')
-    parser.add_argument('-a', '--audio_type', default=0, help='Audio stream source: 0 for output audio stream, 1 for input audio stream')
-    parser.add_argument('-c', '--chunk_rate', default=20, help='The number of audio stream chunks collected per second.')
-    parser.add_argument('-k', '--api_key', default='', help='API KEY for Gummy model')
-    args = parser.parse_args()
-    convert_audio_to_text(
-        args.source_language,
-        args.target_language,
-        int(args.audio_type),
-        int(args.chunk_rate),
-        args.api_key
-    )
-```
+### Sending Caption Data
-For example, to specify Japanese as source language, Chinese as target language, capture system audio output, and collect 0.1s audio chunks, use the following command:
+After obtaining the text for the current audio stream, it must be sent to the main program. The caption engine process passes caption data to the Electron main process via standard output.
-```bash
-python main-gummy.py -s ja -t zh -a 0 -c 10 -k
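+As a hedged sketch of this step, the documented `stdout_obj` helper can emit one caption record per recognition result. The `"caption"` command name and the payload layout below are assumptions for illustration (the fields mirror the old `CaptionItem` interface); the authoritative format is defined in the [API documentation](../api-docs/caption-engine.md).
+
+```python
+from utils import stdout_obj
+
+# Hypothetical caption record, flushed as a single JSON line on stdout
+stdout_obj({
+    "command": "caption",       # assumed command name, see the API doc
+    "content": {
+        "index": 1,             # caption sequence number
+        "time_s": "00:00:01",   # caption start time
+        "time_t": "00:00:03",   # caption end time
+        "text": "hello world",  # recognized text
+        "translation": ""       # empty when the model provides no translation
+    }
+})
+```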
+
-
-