feat(linux): support system audio output on Linux

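- Added support for system audio output on Linux
- Updated the platform compatibility information in the README and user manual
- Modified the AudioStream class to support the Linux platform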
2026-02-04 04:06:09 +08:00 · 2025-07-13 23:28:40 +08:00
parent 7f8766b13e
commit 665c47d24f
17 changed files with 213 additions and 138 deletions
--- a/.npmrc
+++ b/.npmrc
@@ -1,2 +1,2 @@
-# electron_mirror=https://npmmirror.com/mirrors/electron/
-# electron_builder_binaries_mirror=https://npmmirror.com/mirrors/electron-builder-binaries/
+electron_mirror=https://npmmirror.com/mirrors/electron/
+electron_builder_binaries_mirror=https://npmmirror.com/mirrors/electron-builder-binaries/
--- a/README.md
+++ b/README.md
@@ -39,7 +39,15 @@

 ## 📖 基本使用

-目前提供了 Windows 和 macOS 平台的可安装版本。
+软件已经适配了 Windows、macOS 和 Linux 平台。测试过的平台信息如下：
+
+| 操作系统版本        | 处理器架构 | 获取系统音频输入 | 获取系统音频输出 |
+| ------------------ | ---------- | ---------------- | ---------------- |
+| Windows 11 24H2    | x64        | ✅                | ✅                |
+| macOS Sequoia 15.5 | arm64      | ✅                | ✅需要额外配置    |
+| Ubuntu 24.04.2     | x64        | ✅                | ✅需要额外配置    |
+
+macOS 平台和 Linux 平台获取系统音频输出需要进行额外设置，详见[Auto Caption 用户手册](./docs/user-manual/zh.md)。

 > 国际版的阿里云服务并没有提供 Gummy 模型，因此目前非中国用户无法使用 Gummy 字幕引擎。

@@ -54,7 +62,6 @@

 ![](./assets/media/vosk_zh.png)

-
 **如果你觉得上述字幕引擎不能满足你的需求，而且你会 Python，那么你可以考虑开发自己的字幕引擎。详细说明请参考[字幕引擎说明文档](./docs/engine-manual/zh.md)。**

 ## ✨ 特性
@@ -66,10 +73,6 @@
 - 字幕记录展示与导出
 - 生成音频输出或麦克风输入的字幕

-说明：
- Windows 和 macOS 平台支持生成音频输出和麦克风输入的字幕，但是 **macOS 平台获取系统音频输出需要进行设置，详见[Auto Caption 用户手册](./docs/user-manual/zh.md)**
- Linux 平台目前无法获取系统音频输出，仅支持生成麦克风输入的字幕
-
 ## ⚙️ 自带字幕引擎说明

 目前软件自带 2 个字幕引擎，正在规划 1 个新的引擎。它们的详细信息如下。
@@ -137,12 +140,21 @@ subenv/Scripts/activate
 source subenv/bin/activate
 ```

-然后安装依赖（注意如果是 Linux 或 macOS 环境，需要注释掉 `requirements.txt` 中的 `PyAudioWPatch`，该模块仅适用于 Windows 环境）。
-
-> 这一步可能会报错，一般是因为构建失败，需要根据报错信息安装对应的构建工具包。
+然后安装依赖（这一步可能会报错，一般是因为构建失败，需要根据报错信息安装对应的工具包）：

 ```bash
-pip install -r requirements.txt
+# Windows
+pip install -r requirements_win.txt
+# macOS
+pip install -r requirements_darwin.txt
+# Linux
+pip install -r requirements_linux.txt
+```
+
+如果在 Linux 系统上安装 samplerate 模块报错，可以尝试使用以下命令单独安装：
+
+```bash
+pip install samplerate --only-binary=:all:
 ```

 然后使用 `pyinstaller` 构建项目：
@@ -168,6 +180,7 @@ vosk_path = str(Path('./subenv/lib/python3.x/site-packages/vosk').resolve())
 ```bash
 npm run dev
 ```
+
 ### 构建项目

 注意目前软件只在 Windows 和 macOS 平台上进行了构建和测试，无法保证软件在 Linux 平台下的正确性。
--- a/README_en.md
+++ b/README_en.md
@@ -39,7 +39,15 @@

 ## 📖 Basic Usage

-Currently, installable versions are available for Windows and macOS platforms.
+The software has been adapted for Windows, macOS, and Linux platforms. The tested platform information is as follows:
+
+| OS Version         | Architecture | System Audio Input | System Audio Output |
+| ------------------ | ------------ | ------------------ | ------------------- |
+| Windows 11 24H2    | x64          | ✅                 | ✅                  |
+| macOS Sequoia 15.5 | arm64        | ✅                 | ✅ Additional config required |
+| Ubuntu 24.04.2     | x64          | ✅                 | ✅ Additional config required |
+
+Additional configuration is required to capture system audio output on macOS and Linux platforms. See [Auto Caption User Manual](./docs/user-manual/en.md) for details.

 > The international version of Alibaba Cloud services does not provide the Gummy model, so non-Chinese users currently cannot use the Gummy caption engine.

@@ -65,10 +73,6 @@ To use the Vosk local caption engine, first download your required model from [V
 - Caption recording display and export
 - Generate captions for audio output or microphone input

-Notes:
- Windows and macOS platforms support generating captions for both audio output and microphone input, but **macOS requires additional setup to capture system audio output. See [Auto Caption User Manual](./docs/user-manual/en.md) for details.**
- Linux platform currently cannot capture system audio output, only supports generating subtitles for microphone input.
-
 ## ⚙️ Built-in Subtitle Engines

 Currently, the software comes with 2 subtitle engines, with 1 new engine planned. Details are as follows.
@@ -136,12 +140,21 @@ subenv/Scripts/activate
 source subenv/bin/activate
 ```

-Then install dependencies (note: for Linux or macOS environments, you need to comment out `PyAudioWPatch` in `requirements.txt`, as this module is only for Windows environments).
-
-> This step may report errors, usually due to build failures. You need to install corresponding build tools based on the error messages.
+Then install dependencies (this step may fail, usually because a package fails to build; if so, install the build tools named in the error message):

 ```bash
-pip install -r requirements.txt
+# Windows
+pip install -r requirements_win.txt
+# macOS
+pip install -r requirements_darwin.txt
+# Linux
+pip install -r requirements_linux.txt
+```
+
+If you encounter errors when installing the `samplerate` module on Linux systems, you can try installing it separately with this command:
+
+```bash
+pip install samplerate --only-binary=:all:
 ```

 Then use `pyinstaller` to build the project:
--- a/README_ja.md
+++ b/README_ja.md
@@ -39,7 +39,15 @@

 ## 📖 基本使い方

-現在、Windows と macOS プラットフォーム向けのインストール可能なバージョンを提供しています。
+このソフトウェアはWindows、macOS、Linuxプラットフォームに対応しています。テスト済みのプラットフォーム情報は以下の通りです：
+
+| OS バージョン | アーキテクチャ | システムオーディオ入力 | システムオーディオ出力 |
+| ------------------ | ------------ | ------------------ | ------------------- |
+| Windows 11 24H2    | x64          | ✅                 | ✅                  |
+| macOS Sequoia 15.5 | arm64        | ✅                 | ✅ 追加設定が必要  |
+| Ubuntu 24.04.2     | x64          | ✅                 | ✅ 追加設定が必要  |
+
+macOSおよびLinuxプラットフォームでシステムオーディオ出力を取得するには追加設定が必要です。詳細は[Auto Captionユーザーマニュアル](./docs/user-manual/ja.md)をご覧ください。

 > 阿里雲の国際版サービスでは Gummy モデルを提供していないため、現在中国以外のユーザーは Gummy 字幕エンジンを使用できません。

@@ -65,10 +73,6 @@ Vosk ローカル字幕エンジンを使用するには、まず [Vosk Models](
 - 字幕記録の表示とエクスポート
 - オーディオ出力またはマイク入力からの字幕生成

-注記：
- Windows と macOS プラットフォームはオーディオ出力とマイク入力の両方からの字幕生成をサポートしていますが、**macOS プラットフォームでシステムオーディオ出力を取得するには設定が必要です。詳細は[Auto Caption ユーザーマニュアル](./docs/user-manual/ja.md)をご覧ください。**
- Linux プラットフォームは現在システムオーディオ出力を取得できず、マイク入力からの字幕生成のみをサポートしています。
-
 ## ⚙️ 字幕エンジン説明

 現在ソフトウェアには2つの字幕エンジンが組み込まれており、1つの新しいエンジンを計画中です。詳細は以下の通りです。
@@ -136,12 +140,21 @@ subenv/Scripts/activate
 source subenv/bin/activate
 ```

-その後、依存関係をインストールします（Linux または macOS 環境の場合、`requirements.txt` 内の `PyAudioWPatch` をコメントアウトする必要があります。このモジュールは Windows 環境専用です）。
-
-> このステップでエラーが発生する場合があります。一般的にはビルド失敗が原因で、エラーメッセージに基づいて対応するビルドツールパッケージをインストールする必要があります。
+次に依存関係をインストールします（このステップは失敗する可能性があります、通常はビルド失敗が原因です - エラーメッセージに基づいて対応するツールパッケージをインストールする必要があります）：

 ```bash
-pip install -r requirements.txt
+# Windows
+pip install -r requirements_win.txt
+# macOS
+pip install -r requirements_darwin.txt
+# Linux
+pip install -r requirements_linux.txt
+```
+
+Linuxシステムで`samplerate`モジュールのインストールに問題が発生した場合、以下のコマンドで個別にインストールを試すことができます：
+
+```bash
+pip install samplerate --only-binary=:all:
 ```

 その後、`pyinstaller` を使用してプロジェクトをビルドします：
--- a/caption-engine/main-vosk.spec
+++ b/caption-engine/main-vosk.spec
@@ -1,8 +1,12 @@
 # -*- mode: python ; coding: utf-8 -*-

 from pathlib import Path
+import sys

-vosk_path = str(Path('./subenv/Lib/site-packages/vosk').resolve())
+if sys.platform == 'win32':
+    vosk_path = str(Path('./subenv/Lib/site-packages/vosk').resolve())
+else:
+    vosk_path = str(Path('./subenv/lib/python3.12/site-packages/vosk').resolve())

 a = Analysis(
    ['main-vosk.py'],
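
The `else` branch above pins the virtual-environment path to `python3.12`, so the spec would need editing for any other interpreter version. As a hedged alternative sketch (not part of this commit), the package directory could be looked up from the active interpreter instead. `package_dir` is a hypothetical helper name, and the sketch assumes the spec is evaluated by the interpreter of the environment in which `vosk` is installed.

```python
import importlib.util
from pathlib import Path


def package_dir(name: str) -> str:
    """Return the installation directory of an importable package."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"package {name!r} not found in this environment")
    # spec.origin points at the package's __init__.py; its parent is the package directory.
    return str(Path(spec.origin).resolve().parent)


if __name__ == "__main__":
    # "json" stands in for "vosk" so the sketch runs without vosk installed.
    print(package_dir("json"))
```

Under that assumption, `vosk_path = package_dir("vosk")` would cover both the Windows and the POSIX layout without a platform check.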
--- a/caption-engine/requirements_darwin.txt
+++ b/caption-engine/requirements_darwin.txt
@@ -2,6 +2,5 @@ dashscope
 numpy
 samplerate
 PyAudio
-PyAudioWPatch # Windows only
 vosk
 pyinstaller
--- a/caption-engine/requirements_linux.txt
+++ b/caption-engine/requirements_linux.txt
@@ -0,0 +1,5 @@
+dashscope
+numpy
+vosk
+pyinstaller
+samplerate # pip install samplerate --only-binary=:all:
--- a/caption-engine/requirements_win.txt
+++ b/caption-engine/requirements_win.txt
@@ -0,0 +1,7 @@
+dashscope
+numpy
+samplerate
+PyAudio
+PyAudioWPatch
+vosk
+pyinstaller
--- a/caption-engine/sysaudio/linux.py
+++ b/caption-engine/sysaudio/linux.py
@@ -1,7 +1,34 @@
 """获取 Linux 系统音频输入流"""

-import pyaudio
+import subprocess

+def findMonitorSource():
+    result = subprocess.run(
+        ["pactl", "list", "short", "sources"],
+        stdout=subprocess.PIPE, text=True
+    )
+    lines = result.stdout.splitlines()
+
+    for line in lines:
+        parts = line.split('\t')
+        if len(parts) >= 2 and ".monitor" in parts[1]:
+            return parts[1]
+
+    raise RuntimeError("System output monitor device not found")
+
+def findInputSource():
+    result = subprocess.run(
+        ["pactl", "list", "short", "sources"],
+        stdout=subprocess.PIPE, text=True
+    )
+    lines = result.stdout.splitlines()
+
+    for line in lines:
+        parts = line.split('\t')
+        # Skip malformed lines; the first non-monitor source is used as the microphone
+        if len(parts) >= 2 and ".monitor" not in parts[1]:
+            return parts[1]
+    raise RuntimeError("Microphone input device not found")

 class AudioStream:
    """
@@ -13,26 +40,26 @@ class AudioStream:
    """
    def __init__(self, audio_type=1,  chunk_rate=20):
        self.audio_type = audio_type
-        self.mic = pyaudio.PyAudio()
-        self.device = self.mic.get_default_input_device_info()
-        self.stream = None
-        self.SAMP_WIDTH = pyaudio.get_sample_size(pyaudio.paInt16)
-        self.FORMAT = pyaudio.paInt16
-        self.CHANNELS = self.device["maxInputChannels"]
-        self.RATE = int(self.device["defaultSampleRate"])
+
+        if self.audio_type == 0:
+            self.source = findMonitorSource()
+        else:
+            self.source = findInputSource()
+
+        self.process = None
+
+        self.SAMP_WIDTH = 2
+        self.FORMAT = 16
+        self.CHANNELS = 2
+        self.RATE = 48000
        self.CHUNK = self.RATE // chunk_rate
-        self.INDEX = self.device["index"]

    def printInfo(self):
        dev_info = f"""
-        采样输入设备：
-            - 设备类型：{ "音频输入（Linux平台目前仅支持该项）" }
-            - 序号：{self.device['index']}
-            - 名称：{self.device['name']}
-            - 最大输入通道数：{self.device['maxInputChannels']}
-            - 默认低输入延迟：{self.device['defaultLowInputLatency']}s
-            - 默认高输入延迟：{self.device['defaultHighInputLatency']}s
-            - 默认采样率：{self.device['defaultSampleRate']}Hz
+        音频捕获进程：
+            - 捕获类型：{"音频输出" if self.audio_type == 0 else "音频输入"}
+            - 设备源：{self.source}
+            - 捕获进程PID：{self.process.pid if self.process else "None"}

        音频样本块大小：{self.CHUNK}
        样本位宽：{self.SAMP_WIDTH}
@@ -44,30 +71,24 @@ class AudioStream:

    def openStream(self):
        """
-        打开并返回系统音频输出流
+        启动音频捕获进程
        """
-        if self.stream: return self.stream
-        self.stream = self.mic.open(
-            format = self.FORMAT,
-            channels = int(self.CHANNELS),
-            rate = self.RATE,
-            input = True,
-            input_device_index = int(self.INDEX)
+        self.process = subprocess.Popen(
+            ["parec", "-d", self.source, "--format=s16le", "--rate=48000", "--channels=2"],
+            stdout=subprocess.PIPE
        )
-        return self.stream

    def read_chunk(self):
        """
        读取音频数据
        """
-        if not self.stream: return None
-        return self.stream.read(self.CHUNK)
+        if self.process:
+            return self.process.stdout.read(self.CHUNK)
+        return None

    def closeStream(self):
        """
-        关闭系统音频输出流
+        关闭系统音频捕获进程
        """
-        if self.stream is None: return
-        self.stream.stop_stream()
-        self.stream.close()
-        self.stream = None
+        if self.process:
+            self.process.terminate()
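
`read_chunk` passes `self.CHUNK` to `stdout.read`, which counts bytes, while `CHUNK` is derived from the sample rate. The following is a minimal sketch of the arithmetic, assuming `CHUNK` is meant to be frames per chunk (as it is when passed to PyAudio's `read`): each s16le stereo frame then occupies `CHANNELS * SAMP_WIDTH` bytes. `chunk_bytes` and `read_exact` are hypothetical helpers, not part of this commit.

```python
import io


def chunk_bytes(rate: int = 48000, chunk_rate: int = 20,
                channels: int = 2, samp_width: int = 2) -> int:
    """Bytes needed for one chunk of `rate // chunk_rate` frames."""
    frames = rate // chunk_rate
    return frames * channels * samp_width


def read_exact(stream, size: int) -> bytes:
    """Read up to `size` bytes, looping because a raw stream may return fewer."""
    buf = bytearray()
    while len(buf) < size:
        data = stream.read(size - len(buf))
        if not data:  # end of stream, e.g. the capture process exited
            break
        buf.extend(data)
    return bytes(buf)


if __name__ == "__main__":
    size = chunk_bytes()
    fake_pcm = io.BytesIO(bytes(size * 2))  # stands in for the parec stdout pipe
    first = read_exact(fake_pcm, size)
    print(size, len(first))
```

If the downstream engines expect `CHUNK` frames per call, reading `chunk_bytes()` bytes from the `parec` pipe would match the Windows and macOS streams; if they only need a steady byte stream, the current behavior simply yields smaller chunks.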
--- a/docs/user-manual/en.md
+++ b/docs/user-manual/en.md
@@ -61,6 +61,30 @@ Once BlackHole is confirmed installed, in the `Audio MIDI Setup` page, click the

 Now the caption engine can capture system audio output and generate captions.

+## Getting System Audio Output on Linux
+
+Execute the following commands to install `pulseaudio` and `pavucontrol`:
+
+```bash
+# For Debian or Ubuntu, etc.
+sudo apt install pulseaudio pavucontrol
+# For CentOS, etc.
+sudo yum install pulseaudio pavucontrol
+```
+
+Then execute:
+
+```bash
+pactl list short sources
+```
+
+If you see output similar to the following, the configuration was successful:
+
+```bash
+220     alsa_output.pci-0000_02_02.0.3.analog-stereo.monitor    PipeWire        s16le 2ch 48000Hz       SUSPENDED
+221     alsa_input.pci-0000_02_02.0.3.analog-stereo     PipeWire        s16le 2ch 48000Hz       SUSPENDED
+```
+
 ## Software Usage

 ### Modifying Settings
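
To illustrate how the sample `pactl list short sources` output in the section above can be interpreted, here is a small sketch that splits each line on tabs and separates monitor sources (system audio output) from the remaining sources. `split_sources` is a hypothetical helper; it assumes the tab-separated layout with the source name in the second column, and it uses the same `".monitor"` substring check as `caption-engine/sysaudio/linux.py`.

```python
from __future__ import annotations


def split_sources(pactl_output: str) -> tuple[list[str], list[str]]:
    """Split `pactl list short sources` output into (monitor, input) source names."""
    monitors: list[str] = []
    inputs: list[str] = []
    for line in pactl_output.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # ignore blank or malformed lines
        name = parts[1]
        if ".monitor" in name:
            monitors.append(name)
        else:
            inputs.append(name)
    return monitors, inputs


# Sample lines copied from the manual, with explicit tab separators.
SAMPLE = (
    "220\talsa_output.pci-0000_02_02.0.3.analog-stereo.monitor\tPipeWire\ts16le 2ch 48000Hz\tSUSPENDED\n"
    "221\talsa_input.pci-0000_02_02.0.3.analog-stereo\tPipeWire\ts16le 2ch 48000Hz\tSUSPENDED\n"
)

if __name__ == "__main__":
    monitors, inputs = split_sources(SAMPLE)
    print("monitor sources:", monitors)
    print("input sources:", inputs)
```

A non-empty monitor list is the condition the manual describes as a successful configuration.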
--- a/docs/user-manual/ja.md
+++ b/docs/user-manual/ja.md
@@ -64,6 +64,30 @@ BlackHoleのインストールが確認できたら、`オーディオ MIDI 設

 これで字幕エンジンがシステムオーディオ出力をキャプチャし、字幕を生成できるようになります。

+## Linux でシステムオーディオ出力を取得する
+
+以下のコマンドを実行して `pulseaudio` と `pavucontrol` をインストールします:
+
+```bash
+# Debian や Ubuntu など
+sudo apt install pulseaudio pavucontrol
+# CentOS など
+sudo yum install pulseaudio pavucontrol
+```
+
+次に実行:
+
+```bash
+pactl list short sources
+```
+
+以下のような出力があれば設定は成功です:
+
+```bash
+220     alsa_output.pci-0000_02_02.0.3.analog-stereo.monitor    PipeWire        s16le 2ch 48000Hz       SUSPENDED
+221     alsa_input.pci-0000_02_02.0.3.analog-stereo     PipeWire        s16le 2ch 48000Hz       SUSPENDED
+```
+
 ## ソフトウェアの使い方

 ### 設定の変更
--- a/docs/user-manual/zh.md
+++ b/docs/user-manual/zh.md
@@ -29,7 +29,6 @@ Auto Caption 是一个跨平台的字幕显示软件，能够实时获取系统
 这部分阿里云提供了详细的教程，可参考：

 - [获取 API KEY](https://help.aliyun.com/zh/model-studio/get-api-key)
-
 - [将 API Key 配置到环境变量](https://help.aliyun.com/zh/model-studio/configure-api-key-through-environment-variables)

 ## Vosk 引擎使用前准备
@@ -62,6 +61,30 @@ brew install blackhole-64ch

 现在字幕引擎就能捕获系统的音频输出并生成字幕了。

+## Linux 获取系统音频输出
+
+执行以下命令安装 `pulseaudio` 和 `pavucontrol`：
+
+```bash
+# Debian or Ubuntu, etc.
+sudo apt install pulseaudio pavucontrol
+# CentOS, etc.
+sudo yum install pulseaudio pavucontrol
+```
+
+然后执行：
+
+```bash
+pactl list short sources
+```
+
+如果有以下类似的输出内容则配置成功：
+
+```bash
+220     alsa_output.pci-0000_02_02.0.3.analog-stereo.monitor    PipeWire        s16le 2ch 48000Hz       SUSPENDED
+221     alsa_input.pci-0000_02_02.0.3.analog-stereo     PipeWire        s16le 2ch 48000Hz       SUSPENDED
+```
+
 ## 软件使用

 ### 修改设置
--- a/engine-test/trans.ipynb
+++ b/engine-test/trans.ipynb
@@ -1,64 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "id": "440d4a07",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "d:\\Projects\\auto-caption\\caption-engine\\subenv\\Lib\\site-packages\\tqdm\\auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
-      "  from .autonotebook import tqdm as notebook_tqdm\n",
-      "None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.\n"
-     ]
-    },
-    {
-     "ename": "ImportError",
-     "evalue": "\nMarianTokenizer requires the SentencePiece library but it was not found in your environment. Check out the instructions on the\ninstallation page of its repo: https://github.com/google/sentencepiece#installation and follow the ones\nthat match your environment. Please note that you may need to restart your runtime after installation.\n",
-     "output_type": "error",
-     "traceback": [
-      "\u001b[31m---------------------------------------------------------------------------\u001b[39m",
-      "\u001b[31mImportError\u001b[39m                               Traceback (most recent call last)",
-      "\u001b[36mCell\u001b[39m\u001b[36m \u001b[39m\u001b[32mIn[1]\u001b[39m\u001b[32m, line 3\u001b[39m\n\u001b[32m      1\u001b[39m \u001b[38;5;28;01mfrom\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[34;01mtransformers\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;28;01mimport\u001b[39;00m MarianMTModel, MarianTokenizer\n\u001b[32m----> \u001b[39m\u001b[32m3\u001b[39m tokenizer = \u001b[43mMarianTokenizer\u001b[49m\u001b[43m.\u001b[49m\u001b[43mfrom_pretrained\u001b[49m(\u001b[33m\"\u001b[39m\u001b[33mHelsinki-NLP/opus-mt-en-zh\u001b[39m\u001b[33m\"\u001b[39m)\n\u001b[32m      4\u001b[39m model = MarianMTModel.from_pretrained(\u001b[33m\"\u001b[39m\u001b[33mHelsinki-NLP/opus-mt-en-zh\u001b[39m\u001b[33m\"\u001b[39m)\n\u001b[32m      6\u001b[39m tokenizer.save_pretrained(\u001b[33m\"\u001b[39m\u001b[33m./model_en_zh\u001b[39m\u001b[33m\"\u001b[39m)\n",
-      "\u001b[36mFile \u001b[39m\u001b[32md:\\Projects\\auto-caption\\caption-engine\\subenv\\Lib\\site-packages\\transformers\\utils\\import_utils.py:1994\u001b[39m, in \u001b[36mDummyObject.__getattribute__\u001b[39m\u001b[34m(cls, key)\u001b[39m\n\u001b[32m   1992\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m (key.startswith(\u001b[33m\"\u001b[39m\u001b[33m_\u001b[39m\u001b[33m\"\u001b[39m) \u001b[38;5;129;01mand\u001b[39;00m key != \u001b[33m\"\u001b[39m\u001b[33m_from_config\u001b[39m\u001b[33m\"\u001b[39m) \u001b[38;5;129;01mor\u001b[39;00m key == \u001b[33m\"\u001b[39m\u001b[33mis_dummy\u001b[39m\u001b[33m\"\u001b[39m \u001b[38;5;129;01mor\u001b[39;00m key == \u001b[33m\"\u001b[39m\u001b[33mmro\u001b[39m\u001b[33m\"\u001b[39m \u001b[38;5;129;01mor\u001b[39;00m key == \u001b[33m\"\u001b[39m\u001b[33mcall\u001b[39m\u001b[33m\"\u001b[39m:\n\u001b[32m   1993\u001b[39m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28msuper\u001b[39m().\u001b[34m__getattribute__\u001b[39m(key)\n\u001b[32m-> \u001b[39m\u001b[32m1994\u001b[39m \u001b[43mrequires_backends\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mcls\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43mcls\u001b[39;49m\u001b[43m.\u001b[49m\u001b[43m_backends\u001b[49m\u001b[43m)\u001b[49m\n",
-      "\u001b[36mFile \u001b[39m\u001b[32md:\\Projects\\auto-caption\\caption-engine\\subenv\\Lib\\site-packages\\transformers\\utils\\import_utils.py:1980\u001b[39m, in \u001b[36mrequires_backends\u001b[39m\u001b[34m(obj, backends)\u001b[39m\n\u001b[32m   1977\u001b[39m         failed.append(msg.format(name))\n\u001b[32m   1979\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m failed:\n\u001b[32m-> \u001b[39m\u001b[32m1980\u001b[39m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mImportError\u001b[39;00m(\u001b[33m\"\u001b[39m\u001b[33m\"\u001b[39m.join(failed))\n",
-      "\u001b[31mImportError\u001b[39m: \nMarianTokenizer requires the SentencePiece library but it was not found in your environment. Check out the instructions on the\ninstallation page of its repo: https://github.com/google/sentencepiece#installation and follow the ones\nthat match your environment. Please note that you may need to restart your runtime after installation.\n"
-     ]
-    }
-   ],
-   "source": [
-    "from transformers import MarianMTModel, MarianTokenizer\n",
-    "\n",
-    "tokenizer = MarianTokenizer.from_pretrained(\"Helsinki-NLP/opus-mt-en-zh\")\n",
-    "model = MarianMTModel.from_pretrained(\"Helsinki-NLP/opus-mt-en-zh\")\n",
-    "\n",
-    "tokenizer.save_pretrained(\"./model_en_zh\")\n",
-    "model.save_pretrained(\"./model_en_zh\")\n"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "subenv",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.12.1"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
--- a/src/main/utils/AllConfig.ts
+++ b/src/main/utils/AllConfig.ts
@@ -60,7 +60,6 @@ class AllConfig {
      if(config.uiTheme) this.uiTheme = config.uiTheme
      if(config.leftBarWidth) this.leftBarWidth = config.leftBarWidth
      if(config.styles) this.setStyles(config.styles)
-      if(process.platform !== 'win32' && process.platform !== 'darwin') config.controls.audio = 1
      if(config.controls) this.setControls(config.controls)
      console.log('[INFO] Read Config from:', configPath)
    }
--- a/src/main/utils/CaptionEngine.ts
+++ b/src/main/utils/CaptionEngine.ts
@@ -118,6 +118,7 @@ export class CaptionEngine {
    });

    this.process.stderr.on('data', (data) => {
+      if(this.processStatus === 'stopping') return
      controlWindow.sendErrorMessage(i18n('engine.error') + data)
      console.error(`[ERROR] Subprocess Error: ${data}`);
    });
--- a/src/renderer/src/components/EngineControl.vue
+++ b/src/renderer/src/components/EngineControl.vue
@@ -33,7 +33,6 @@
    <div class="input-item">
      <span class="input-label">{{ $t('engine.audioType') }}</span>
      <a-select
-        :disabled="platform !== 'win32' && platform !== 'darwin'"
        class="input-area"
        v-model:value="currentAudio"
        :options="audioType"
--- a/src/renderer/src/stores/engineControl.ts
+++ b/src/renderer/src/stores/engineControl.ts
@@ -104,12 +104,6 @@ export const useEngineControlStore = defineStore('engineControl', () => {
    });
  })

-  watch(platform, (newValue) => {
-    if(newValue !== 'win32' && newValue !== 'darwin') {
-      audio.value = 1
-    }
-  })
-
  return {
    platform,           // 系统平台
    captionEngine,      // 字幕引擎列表