10 Commits

Author SHA1 Message Date
himeditator
213426dace release v0.2.0
- Update and add documentation
- Add new images
- Improve documentation structure and content
2025-07-07 22:54:30 +08:00
himeditator
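50ea9c5e4c refactor(caption): restructure the caption engine and fix its idle-time error (#2)
- Fix the error raised by the gummy caption engine after a long idle period
- Rename the python-subprocess folder to caption-engine
- Remove unused prototype code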
2025-07-07 22:53:35 +08:00
himeditator
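22cfb75d2c feat(renderer): add hiding of long captions (#1)
- Fix the display color of some content in the dark theme
- Add hiding of long caption content
- Improve the caption style preview so it dynamically shows the latest caption content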
2025-07-07 22:52:49 +08:00
himeditator
f29e15cde5 feat(theme): add dark theme support
- Add a dark theme option and automatic adaptation to the system theme
- Adjust some styles to fit the dark theme
2025-07-05 00:54:12 +08:00
himeditator
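14e7a7bce4 feat: complete multi-language support, improve the user experience
- Finish translating the remaining multi-language content
- Refactor configuration management so frontend pages load configuration faster
- Add stricter state restrictions to the caption engine to prevent zombie processes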
2025-07-04 22:27:43 +08:00
himeditator
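0b279dedbf docs(api): modify some communication interfaces, update the API documentation
- Redefine the naming rules and semantics of communication commands
- Modify several communication interfaces between the frontend and backend
- Add internationalization for model information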
2025-07-04 18:38:56 +08:00
himeditator
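0a10068b38 feat(i18n): implement frontend internationalization
- Add English, Japanese, and Chinese translation files
- Add language switching
- Update the text of each component to support internationalization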
2025-07-03 23:29:10 +08:00
himeditator
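d608bf59c7 feat(i18n): add backend internationalization support, improve the frontend UI
- Add and implement internationalization support in the backend
- Introduce the vue-i18n module in the frontend (internationalization logic not yet added)
- Improve UI styles, unify input box and label styles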
2025-07-03 20:36:09 +08:00
himeditator
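3dcba07b6e refactor(renderer): refactor the project frontend
- Split the CaptionData and ControlPage components
- Rename some pages and variables
- Refactor and improve state management, add new state management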
2025-07-02 20:56:21 +08:00
himeditator
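e77779b72a refactor: refactor the project backend
- Remove the mirror configuration from .npmrc
- Remove unused dependencies from package.json
- Substantially refactor the backend code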
2025-07-01 21:50:33 +08:00
83 changed files with 2891 additions and 1385 deletions

View File

@@ -6,4 +6,7 @@ indent_style = space
indent_size = 2
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
trim_trailing_whitespace = true
[*.py]
indent_size = 4

2
.gitignore vendored
View File

@@ -6,4 +6,4 @@ out
*.log*
__pycache__
subenv
python-subprocess/build
caption-engine/build

4
.npmrc
View File

@@ -1,2 +1,2 @@
electron_mirror=https://npmmirror.com/mirrors/electron/
electron_builder_binaries_mirror=https://npmmirror.com/mirrors/electron-builder-binaries/
# electron_mirror=https://npmmirror.com/mirrors/electron/
# electron_builder_binaries_mirror=https://npmmirror.com/mirrors/electron-builder-binaries/

View File

@@ -1,22 +0,0 @@
## v0.0.1
2025-06-22
发布第一版软件。
## v0.1.0
2025-06-26
### 新增功能
- 添加错误通知
- 添加默认引擎的环境变量检查
- 添加配置数据文件保存和载入
- 添加字幕样式恢复默认的选项
- 添加项目关于信息
### 新增文档
- 添加用户说明文档
- 添加字幕引擎说明文档

View File

@@ -4,40 +4,55 @@
<p>Auto Caption 是一个跨平台的实时字幕显示软件。</p>
<p>
| <b>简体中文</b>
| <a href="https://github.com/HiMeditator/auto-caption/blob/main/README_en.md">English</a> |
| <a href="./README_en.md">English</a>
| <a href="./README_ja.md">日本語</a> |
</p>
<p><i>v0.2.0版本已经发布。预计将添加本地字幕引擎的v1.0.0版本正在开发中...</i></p>
</div>
![](./assets/media/main.png)
![](./assets/media/main_zh.png)
## 📥 下载
[GitHub Releases](https://github.com/HiMeditator/auto-caption/releases)
## 📚 用户手册
## 📚 相关文档
[Auto Caption 用户手册](./assets/user-manual_zh.md)
[Auto Caption 用户手册](./docs/user-manual/zh.md)
[字幕引擎说明文档](./assets/engine-manual_zh.md)
[字幕引擎说明文档](./docs/engine-manual/zh.md)
[项目 API 文档](./docs/api-docs/electron-ipc.md)
### 基本使用
目前仅提供了 Windows 平台的可安装版本。如果使用默认的 Gummy 字幕引擎,需要获取阿里云百炼平台的 API KEY 并配置到环境变量中才能正常使用该模型。相关教程:[获取 API KEY](https://help.aliyun.com/zh/model-studio/get-api-key)、[将 API Key 配置到环境变量](https://help.aliyun.com/zh/model-studio/configure-api-key-through-environment-variables)。
目前仅提供了 Windows 平台的可安装版本。如果使用默认的 Gummy 字幕引擎,首先需要获取阿里云百炼平台的 API KEY 并配置到环境变量中,这样才能正常使用该模型。
对于开发者,可以自己开发新的字幕引擎,自定义字幕引擎的开发请参考[字幕引擎说明文档](./assets/engine-manual_zh.md)。
**国际版的阿里云服务并没有提供 Gummy 模型,因此目前非中国用户无法使用默认字幕引擎。我正在开发新的本地字幕引擎,以确保所有用户都有默认字幕引擎可以使用。**
相关教程:
- [获取 API KEY](https://help.aliyun.com/zh/model-studio/get-api-key)
- [将 API Key 配置到环境变量](https://help.aliyun.com/zh/model-studio/configure-api-key-through-environment-variables)。
如果你想了解字幕引擎的工作原理,或者你想开发自己的字幕引擎,请参考[字幕引擎说明文档](./docs/engine-manual/zh.md)。
## ✨ 特性
- 多界面语言支持
- 丰富的字幕样式设置
- 灵活的字幕引擎选择
- 多语言识别与翻译
- 字幕记录展示与导出
- 生成音频输出和麦克风输入的字幕
说明:Windows 平台支持生成音频输出和麦克风输入的字幕,Linux 平台仅支持生成麦克风输入的字幕。
说明:
- Windows 平台支持生成音频输出和麦克风输入的字幕
- Linux 平台目前仅支持生成麦克风输入的字幕
- 目前还没有适配 macOS 平台
## 🚀 项目运行
![](./assets/media/structure.png)
![](./assets/media/structure_zh.png)
### 安装依赖
@@ -47,17 +62,7 @@ npm install
### 构建字幕引擎
> #### 背景介绍
>
> 如果你是开发者,想开发自定义字幕引擎,请查看[字幕引擎说明文档](./assets/engine-manual_zh.md)。
>
> 所谓的字幕引擎实际上是一个子程序,它会实时获取系统音频输入(录音)或输出(播放声音)的流式数据,并调用音频转文字的模型生成对应音频的字幕。生成的字幕通过 IPC 输出为转换为字符串的 JSON 数据,并返回给主程序。主程序读取字幕数据,处理后显示在窗口上。
>
>目前项目默认使用[阿里云 Gummy 模型](https://help.aliyun.com/zh/model-studio/gummy-speech-recognition-translation/),需要获取阿里云百炼平台的 API KEY 并配置到环境变量中才能正常使用该模型。
>
> 本项目的 gummy 字幕引擎是一个 python 子程序,通过 pyinstaller 打包为可执行文件。 运行字幕引擎子程序的代码在 `src\main\utils\engine.ts` 文件中。
首先进入 `python-subprocess` 文件夹,执行如下指令创建虚拟环境:
首先进入 `caption-engine` 文件夹,执行如下指令创建虚拟环境:
```bash
python -m venv subenv
@@ -72,7 +77,7 @@ subenv/Scripts/activate
source subenv/bin/activate
```
然后安装依赖(注意如果是 Linux 环境,需要注释 `requirements.txt` 中的 `PyAudioWPatch`,该模块仅适用于 Windows 环境):
然后安装依赖(注意如果是 Linux 环境,需要注释 `requirements.txt` 中的 `PyAudioWPatch`,该模块仅适用于 Windows 环境):
```bash
pip install -r requirements.txt
@@ -84,7 +89,7 @@ pip install -r requirements.txt
pyinstaller --onefile main-gummy.py
```
此时项目构建完成,在进入 `python-subprocess/dist` 文件夹可见对应的可执行文件。即可进行后续操作。
此时项目构建完成,在进入 `caption-engine/dist` 文件夹可见对应的可执行文件。即可进行后续操作。
### 运行项目
@@ -98,7 +103,7 @@ npm run dev
```bash
# For windows
npm run build:win
# For macOS
# For macOS, not available yet
npm run build:mac
# For Linux
npm run build:linux

View File

@@ -1,49 +1,57 @@
<div align="center" >
<img src="./resources/icon.png" width="100px" height="100px"/>
<h1 align="center">auto-caption</h1>
<p>Auto Caption is a cross-platform real-time subtitle display software.</p>
<p>Auto Caption is a cross-platform real-time caption display software.</p>
<p>
| <a href="https://github.com/HiMeditator/auto-caption/blob/main/README.md">简体中文</a>
| <b>English</b> |
| <a href="./README.md">Chinese</a>
| <b>English</b>
| <a href="./README_ja.md">Japanese</a> |
</p>
<p><i>Version v0.2.0 has been released. Version v1.0.0, which is expected to add a local caption engine, is under development...</i></p>
</div>
![](./assets/media/main.png)
## ⚠️ Attention
**The current software interface language is Chinese. English adaptation has not been done yet.**
![](./assets/media/main_en.png)
## 📥 Download
[GitHub Releases](https://github.com/HiMeditator/auto-caption/releases)
## 📚 User Manual
## 📚 Related Documentation
[Auto Caption User Manual (Chinese)](./assets/user-manual_en.md)
[Auto Caption User Manual](./docs/user-manual/en.md)
[Caption Engine Documentation (Chinese)](./assets/engine-manual_en.md)
[Caption Engine Explanation Document](./docs/engine-manual/en.md)
[Project API Documentation (Chinese)](./docs/api-docs/electron-ipc.md)
### Basic Usage
Currently, only an installable version for the Windows platform is provided. If using the default Gummy subtitle engine, you need to obtain an API KEY from Alibaba Cloud's Bailian platform and configure it in the environment variables to use the model properly. Related tutorials: [Get API KEY](https://help.aliyun.com/zh/model-studio/get-api-key), [Configure API Key through Environment Variables](https://help.aliyun.com/zh/model-studio/configure-api-key-through-environment-variables).
Currently, only an installable version for the Windows platform is provided. If you want to use the default Gummy caption engine, you first need to obtain an API KEY from the Alibaba Cloud Model Studio and configure it in the environment variables. This is necessary to use the model properly.
For developers, you can create a new subtitle engine. For instructions on customizing the subtitle engine, please refer to the [Caption Engine Documentation (Chinese)](./assets/engine-manual_zh.md).
**The international version of Alibaba Cloud does not provide the Gummy model, so non-Chinese users currently cannot use the default caption engine. I am trying to develop a new local caption engine to ensure that all users have access to a default caption engine.**
Relevant tutorials:
- [Obtain API KEY (Chinese)](https://help.aliyun.com/zh/model-studio/get-api-key)
- [Configure API Key in Environment Variables (Chinese)](https://help.aliyun.com/zh/model-studio/configure-api-key-through-environment-variables).
If you want to understand how the caption engine works or if you want to develop your own caption engine, please refer to the [Caption Engine Explanation Document](./docs/engine-manual/en.md).
## ✨ Features
- Rich subtitle style settings
- Flexible subtitle engine selection
- Multi-language interface support
- Rich caption style settings
- Flexible caption engine selection
- Multi-language recognition and translation
- Subtitle record display and export
- Generate subtitles for audio output and microphone input
- Caption record display and export
- Generate captions for audio output and microphone input
Note: The Windows platform supports generating subtitles for both audio output and microphone input, while the Linux platform only supports generating subtitles for microphone input.
Notes:
- The Windows platform supports generating captions for both audio output and microphone input.
- The Linux platform currently only supports generating captions for microphone input.
- The macOS platform is not yet supported.
## 🚀 Project Execution
![](./assets/media/structure.png)
![](./assets/media/structure_en.png)
### Install Dependencies
@@ -51,19 +59,9 @@ Note: The Windows platform supports generating subtitles for both audio output a
npm install
```
### Build Subtitle Engine
### Build Caption Engine
> #### Background
>
> If you are a developer and want to develop a custom subtitle engine, please refer to the [Caption Engine Documentation (Chinese)](./assets/engine-manual_zh.md).
>
> The so-called subtitle engine is actually a subprocess that will real-time acquire streaming data from system audio input (recording) or output (playing sound) and call an audio-to-text model to generate corresponding subtitles for the audio. The generated subtitles are output as JSON data converted to strings via IPC and returned to the main program. The main program reads the subtitle data, processes it, and displays it on the window.
>
> Currently, the project uses the [Alibaba Cloud Gummy Model](https://help.aliyun.com/zh/model-studio/gummy-speech-recognition-translation/) by default, which requires obtaining an API KEY from Alibaba Cloud's Bailian platform and configuring it in the environment variables to function properly. Related tutorials: [Get API KEY](https://help.aliyun.com/zh/model-studio/get-api-key), [Configure API Key through Environment Variables](https://help.aliyun.com/zh/model-studio/configure-api-key-through-environment-variables).
>
> The gummy subtitle engine in this project is a Python subprocess, packaged into an executable file using pyinstaller. The code for running the subtitle engine subprocess is in the `src\main\utils\engine.ts` file.
First, enter the `python-subprocess` folder and execute the following command to create a virtual environment:
First, navigate to the `caption-engine` folder and execute the following command to create a virtual environment:
```bash
python -m venv subenv
@@ -78,7 +76,7 @@ subenv/Scripts/activate
source subenv/bin/activate
```
Then install the dependencies (note that if you are in a Linux environment, you need to comment out `PyAudioWPatch` in `requirements.txt`, as this module is only applicable to the Windows environment):
Next, install the dependencies (note that if you are in a Linux environment, you should comment out `PyAudioWPatch` in `requirements.txt`, as this module is only applicable to the Windows environment):
```bash
pip install -r requirements.txt
@@ -90,24 +88,22 @@ Then build the project using `pyinstaller`:
pyinstaller --onefile main-gummy.py
```
At this point, the project is built. You can find the corresponding executable file in the `python-subprocess/dist` folder. You can proceed with further operations.
At this point, the project is built. You can find the executable file in the `caption-engine/dist` folder and proceed with further operations.
### Run the Project
```bash
npm run dev
```
### Build the Project
Please note that the software is currently not compatible with the macOS platform. Use Windows or Linux systems for building, with Windows being more recommended as it implements the full set of features.
Note that the software does not yet support the macOS platform. Build on Windows or Linux; Windows is recommended because it supports the full feature set.
```bash
# For Windows
npm run build:win
# For macOS
# For macOS, not available yet
npm run build:mac
# For Linux
npm run build:linux
```
```

109
README_ja.md Normal file
View File

@@ -0,0 +1,109 @@
<div align="center" >
<img src="./resources/icon.png" width="100px" height="100px"/>
<h1 align="center">auto-caption</h1>
<p>Auto Caption はクロスプラットフォームのリアルタイム字幕表示ソフトウェアです。</p>
<p>
| <a href="./README.md">簡体中文</a>
| <a href="./README_en.md">英語</a>
| <b>日本語</b> |
</p>
<p><i>v0.2.0 バージョンがリリースされました。ローカル字幕エンジンを追加予定の v1.0.0 バージョンが開発中...</i></p>
</div>
![](./assets/media/main_ja.png)
## 📥 ダウンロード
[GitHub Releases](https://github.com/HiMeditator/auto-caption/releases)
## 📚 関連ドキュメント
[Auto Caption ユーザーマニュアル](./docs/user-manual/ja.md)
[字幕エンジン説明文書](./docs/engine-manual/ja.md)
[プロジェクト API ドキュメント(中国語)](./docs/api-docs/electron-ipc.md)
### 基本的な使用方法
現在、Windows プラットフォーム向けのインストール可能なバージョンのみ提供されています。デフォルトの Gummy 字幕エンジンを使用する場合、まず Alibaba Cloud 百煉プラットフォームの API キーを取得し、環境変数に設定する必要があります。これによりモデルが正常に動作します。
**アリババクラウドの国際版には Gummy モデルが提供されていないため、中国以外のユーザーは現在、デフォルトの字幕エンジンを使用できません。すべてのユーザーが利用できるように、新しいローカルの字幕エンジンを開発中です。**
関連チュートリアル:
- [API キーの取得(中国語)](https://help.aliyun.com/zh/model-studio/get-api-key)
- [環境変数への API キーの設定(中国語)](https://help.aliyun.com/zh/model-studio/configure-api-key-through-environment-variables)。
字幕エンジンの仕組みを理解したい場合、または独自の字幕エンジンを開発したい場合は、[字幕エンジン説明文書](./docs/engine-manual/ja.md)を参照してください。
## ✨ 特徴
- 複数言語のインターフェースサポート
- 豊富な字幕スタイル設定
- 柔軟な字幕エンジン選択
- 複数言語の認識と翻訳
- 字幕記録の表示とエクスポート
- オーディオ出力とマイク入力の字幕生成
注意事項:
- Windows プラットフォームでは、オーディオ出力とマイク入力の両方の字幕生成がサポートされています。
- Linux プラットフォームでは、現在マイク入力の字幕生成のみがサポートされています。
- 現在、macOS プラットフォームには対応していません。
## 🚀 プロジェクトの実行
![](./assets/media/structure_ja.png)
### 依存関係のインストール
```bash
npm install
```
### 字幕エンジンのビルド
まず、`caption-engine` フォルダに移動し、以下のコマンドを実行して仮想環境を作成します:
```bash
python -m venv subenv
```
次に、仮想環境をアクティブ化します:
```bash
# Windows
subenv/Scripts/activate
# Linux
source subenv/bin/activate
```
次に、依存関係をインストールします(Linux 環境の場合、`requirements.txt` 内の `PyAudioWPatch` をコメントアウトする必要があります。このモジュールは Windows 環境でのみ適用されます):
```bash
pip install -r requirements.txt
```
次に、`pyinstaller` を使用してプロジェクトをビルドします:
```bash
pyinstaller --onefile main-gummy.py
```
この時点でプロジェクトのビルドが完了し、`caption-engine/dist` フォルダで対応する実行ファイルを見つけることができます。その後、必要な操作を行ってください。
### プロジェクトの実行
```bash
npm run dev
```
### プロジェクトのビルド
現在、ソフトウェアは macOS プラットフォームに対応していません。Windows または Linux システムを使用してビルドしてください。完全な機能を備えた Windows プラットフォームが推奨されます。
```bash
# For Windows
npm run build:win
# For macOS, not available yet
npm run build:mac
# For Linux
npm run build:linux
```

Binary file not shown (before: 72 KiB).
Binary file not shown (before: 332 KiB).
assets/media/main_en.png Normal file, binary, not shown (after: 373 KiB).
assets/media/main_ja.png Normal file, binary, not shown (after: 333 KiB).
assets/media/main_zh.png Normal file, binary, not shown (after: 384 KiB).
Binary file not shown (before: 321 KiB).
Binary file not shown (after: 321 KiB).
Binary file not shown (after: 324 KiB).
Binary file not shown (after: 323 KiB).
assets/structure.pptx Normal file, binary, not shown.

View File

@@ -2,7 +2,7 @@ from dashscope.audio.asr import (
TranslationRecognizerCallback,
TranscriptionResult,
TranslationResult,
TranslationRecognizerRealtime
TranslationRecognizerRealtime
)
from datetime import datetime
import json
@@ -17,11 +17,13 @@ class Callback(TranslationRecognizerCallback):
self.usage = 0
self.cur_id = -1
self.time_str = ''
def on_open(self) -> None:
# print("on_open")
pass
def on_close(self) -> None:
# print("on_close")
pass
def on_event(
@@ -44,11 +46,11 @@ class Callback(TranslationRecognizerCallback):
caption['time_s'] = self.time_str
caption['time_t'] = datetime.now().strftime('%H:%M:%S')
caption['translation'] = ""
if translation_result is not None:
lang = translation_result.get_language_list()[0]
caption['translation'] = translation_result.get_translation(lang).text
if usage:
self.usage += usage['duration']

View File

@@ -12,7 +12,7 @@ import sys
import argparse
def convert_audio_to_text(s_lang, t_lang, audio_type):
sys.stdout.reconfigure(line_buffering=True)
sys.stdout.reconfigure(line_buffering=True) # type: ignore
stream = AudioStream(audio_type)
stream.openStream()
@@ -27,7 +27,11 @@ def convert_audio_to_text(s_lang, t_lang, audio_type):
if not stream.stream: continue
data = stream.stream.read(stream.CHUNK)
data = mergeStreamChannels(data, stream.CHANNELS)
gummy.translator.send_audio_frame(data)
try:
gummy.translator.send_audio_frame(data)
except:
gummy.translator.start()
gummy.translator.send_audio_frame(data)
except KeyboardInterrupt:
stream.closeStream()
gummy.translator.stop()
@@ -45,4 +49,3 @@ if __name__ == "__main__":
args.target_language,
0 if args.audio_type == '0' else 1
)

View File

@@ -0,0 +1,5 @@
dashscope==1.23.5
numpy==2.2.6
PyAudio==0.2.14
PyAudioWPatch==0.2.12.7 # Windows only
pyinstaller==6.14.1

View File

@@ -35,7 +35,7 @@ def getDefaultLoopbackDevice(mic: pyaudio.PyAudio, info = True)->dict:
print("Run `python -m pyaudiowpatch` to check available devices.")
print("Exiting...")
exit()
if(info): print(f"Output Stream Device: #{default_speaker['index']} {default_speaker['name']}")
return default_speaker
@@ -64,7 +64,7 @@ def mergeStreamChannels(data, channels):
class AudioStream:
"""
获取系统音频流
参数
audio_type: 默认0-系统音频输出流1-系统音频输入流
"""
@@ -116,7 +116,7 @@ class AudioStream:
input_device_index = self.INDEX
)
return self.stream
def closeStream(self):
"""
关闭系统音频输出流
@@ -124,4 +124,4 @@ class AudioStream:
if self.stream is None: return
self.stream.stop_stream()
self.stream.close()
self.stream = None
self.stream = None

49
docs/CHANGELOG.md Normal file
View File

@@ -0,0 +1,49 @@
## v0.0.1
2025-06-22
发布第一版软件。
## v0.1.0
2025-06-26
### 新增功能
- 添加错误通知
- 添加默认引擎的环境变量检查
- 添加配置数据文件保存和载入
- 添加字幕样式恢复默认的选项
- 添加项目关于信息
### 新增文档
- 添加用户说明文档
- 添加字幕引擎说明文档
## v0.2.0
2025-07-05
对项目进行了重构,修复了 bug,添加了新功能。本版本为正式版。
### 新增功能
- 添加多界面语言支持(中文、英语、日语)
- 添加暗色主题
### 提升体验
- 优化界面布局
- 添加更多可保存和载入的配置项
- 为字幕引擎添加更严格的状态限制,防止出现僵尸进程
### 修复bug
- 修复字幕引擎长时间空置后报错的问题
### 新增文档
- 新增日语说明文档
- 新增英语、日语字幕引擎说明文档和用户手册
- 新增 electron ipc api 文档

6
docs/TODO.md Normal file
View File

@@ -0,0 +1,6 @@
- [x] 添加英语和日语语言支持 *2025/07/04*
- [x] 添加暗色主题 *2025/07/04*
- [x] 优化长字幕显示效果 *2025/07/05*
- [x] 修复字幕引擎空置报错的问题 *2025/07/05*
- [ ] 添加更多字幕引擎
- [ ] 减小软件体积

View File

@@ -0,0 +1,289 @@
# electron ipc api-doc
本文档主要记录主进程和渲染进程的通信约定。
## 命名方式
本项目渲染进程包含两个:字幕窗口和控制窗口,主进程需要分别和两者进行通信。通信命令的命名规则如下:
1. 命令一般由三个关键字组成,由点号隔开。
2. 第一个关键字表示通信发送目标:
- `config` 表示控制窗口类实例(后端)或控制窗口(前端)
- `engine` 表示字幕窗口类实例(后端)或字幕窗口(前端)
- `both` 表示上述对象都有可能成为目标
3. 第二个关键字表示需要修改的对象 / 发生改变的对象,采用小驼峰命名
4. 第三个关键字一般是动词,表示通信发生时对应动作 / 需要进行的操作
根据上面的描述可以看出通信命令一般有两种语义,一种表示要求执行的操作,另一种表示当前发生的事件。
## 前端 <=> 后端
### `both.window.mounted`
**介绍:**前端窗口挂载完毕,请求最新的配置数据
**发起方:**前端
**接收方:**后端
**数据类型:**
- 发送:无数据
- 接收:`FullConfig`
### `control.nativeTheme.get`
**介绍:**前端获取系统当前的主题
**发起方:**前端控制窗口
**接收方:**后端控制窗口实例
**数据类型:**
- 发送:无数据
- 接收:`string`
## 前端 ==> 后端
### `control.uiLanguage.change`
**介绍:**前端修改字界面语言,将修改同步给后端
**发起方:**前端控制窗口
**接收方:**后端控制窗口实例
**数据类型:**`UILanguage`
### `control.uiTheme.change`
**介绍:**前端修改字界面主题,将修改同步给后端
**发起方:**前端控制窗口
**接收方:**后端控制窗口实例
**数据类型:**`UITheme`
### `control.leftBarWidth.change`
**介绍:**前端修改边栏宽度,将修改同步给后端
**发起方:**前端控制窗口
**接收方:**后端控制窗口实例
**数据类型:**`number`
### `control.captionLog.clear`
**介绍:**清空字幕记录
**发起方:**前端控制窗口
**接收方:**后端控制窗口实例
**数据类型:**无数据
### `control.styles.change`
**介绍:**前端修改字幕样式,将修改同步给后端
**发起方:**前端控制窗口
**接收方:**后端控制窗口实例
**数据类型:**`Styles`
### `control.styles.reset`
**介绍:**将字幕样式恢复为默认
**发起方:**前端控制窗口
**接收方:**后端控制窗口实例
**数据类型:**无数据
### `control.controls.change`
**介绍:**前端修改了字幕引擎配置,将最新配置发送给后端
**发起方:**前端控制窗口
**接收方:**后端控制窗口实例
**数据类型:**`Controls`
### `control.captionWindow.activate`
**介绍:**激活字幕窗口
**发起方:**前端控制窗口
**接收方:**后端控制窗口实例
**数据类型:**无数据
### `control.engine.start`
**介绍:**启动字幕引擎
**发起方:**前端控制窗口
**接收方:**后端控制窗口实例
**数据类型:**无数据
### `control.engine.stop`
**介绍:**关闭字幕引擎
**发起方:**前端控制窗口
**接收方:**后端控制窗口实例
**数据类型:**无数据
### `caption.windowHeight.change`
**介绍:**字幕窗口宽度发生改变
**发起方:**前端字幕窗口
**接收方:**后端字幕窗口实例
**数据类型:**`number`
### `caption.pin.set`
**介绍:**是否将窗口置顶
**发起方:**前端字幕窗口
**接收方:**后端字幕窗口实例
**数据类型:**`boolean`
### `caption.controlWindow.activate`
**介绍:**激活控制窗口
**发起方:**前端字幕窗口
**接收方:**后端字幕窗口实例
**数据类型:**无数据
### `caption.window.close`
**介绍:**关闭字幕窗口
**发起方:**前端字幕窗口
**接收方:**后端字幕窗口实例
**数据类型:**无数据
## 后端 ==> 前端
### `control.uiLanguage.set`
**介绍:**后端将最新界面语言发送给前端,前端进行设置
**发起方:**后端
**接收方:**字幕窗口
**数据类型:**`UILanguage`
### `control.nativeTheme.change`
**介绍:**系统主题发生改变
**发起方:**后端
**接收方:**前端控制窗口
**数据类型:**`string`
### `control.engine.started`
**介绍:**引擎启动成功
**发起方:**后端
**接收方:**前端控制窗口
**数据类型:**无数据
### `control.engine.stopped`
**介绍:**引擎关闭
**发起方:**后端
**接收方:**前端控制窗口
**数据类型:**无数据
### `control.error.occurred`
**介绍:**发送错误
**发起方:**后端
**接收方:**前端控制窗口
**数据类型:**`string`
### `control.controls.set`
**介绍:**后端将最新字幕引擎配置发送给前端,前端进行设置
**发起方:**后端
**接收方:**前端控制窗口
**数据类型:**`Controls`
### `both.styles.set`
**介绍:**后端将最新字幕样式发送给前端,前端进行设置
**发起方:**后端
**接收方:**前端
**数据类型:**`Styles`
### `both.captionLog.add`
**介绍:**添加一条新的字幕数据
**发起方:**后端
**接收方:**前端
**数据类型:**`CaptionItem`
### `both.captionLog.upd`
**介绍:**更新最后一条字幕数据
**发起方:**后端
**接收方:**前端
**数据类型:**`CaptionItem`
### `both.captionLog.set`
**介绍:**设置全部的字幕数据
**发起方:**后端
**接收方:**前端
**数据类型:**`CaptionItem[]`

116
docs/engine-manual/en.md Normal file
View File

@@ -0,0 +1,116 @@
# Caption Engine Documentation
![](../../assets/media/structure_en.png)
## Introduction to the Caption Engine
A caption engine is a subprocess that captures streaming audio in real time from the system audio input (recording) or output (playback) and calls an audio-to-text model to generate captions for that audio. The generated captions are serialized as JSON strings and passed to the main program through standard output, so every string the main program reads must parse correctly as a JSON object. The main program reads and parses the caption data, processes it, and displays it in the window.
## Features the Caption Engine Needs to Implement
### Audio Acquisition
First, your caption engine needs to acquire streaming audio data from the system audio input (recording) or output (playback). If you develop in Python, the PyAudio library can capture microphone input (cross-platform), and the PyAudioWPatch library can capture system audio output (Windows only).
The acquired audio stream data is usually in the form of short audio chunks, and the size of these chunks should be adjusted according to the model. For example, Alibaba Cloud's Gummy model performs better with 0.05-second audio chunks than with 0.2-second audio chunks.
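As a rough illustration of chunk sizing, the sketch below converts a chunk duration into a frame count and a byte size. The 16 kHz sample rate, the 16-bit sample width, and the helper names `chunk_frames` and `chunk_bytes` are assumptions made for illustration, not values taken from this project.

```python
def chunk_frames(sample_rate: int, chunk_seconds: float) -> int:
    """Number of audio frames in one chunk of the given duration."""
    return int(round(sample_rate * chunk_seconds))


def chunk_bytes(frames: int, channels: int, sample_width: int = 2) -> int:
    """Size in bytes of one chunk (sample_width=2 means 16-bit samples)."""
    return frames * channels * sample_width


# Assumed 16 kHz capture; 0.05 s and 0.2 s are the chunk lengths compared above.
short_chunk = chunk_frames(16000, 0.05)
long_chunk = chunk_frames(16000, 0.2)
print(short_chunk, long_chunk, chunk_bytes(short_chunk, channels=2))
```

The frame count is what a stream's read call typically expects; the byte size is only useful for estimating buffer and bandwidth needs.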
### Audio Processing
The acquired audio stream may need preprocessing before being converted to text. For instance, Alibaba Cloud's Gummy model can only recognize single-channel audio streams, while the collected audio streams are generally dual-channel, so you need to convert the dual-channel audio stream to a single channel. The conversion of channels can be achieved using methods from the NumPy library.
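A minimal sketch of the channel merge, assuming 16-bit interleaved PCM and averaging the channels of each frame. To stay dependency-free it uses the standard-library `array` module rather than NumPy, and `merge_channels` is a hypothetical helper, not the project's `mergeStreamChannels`.

```python
from array import array


def merge_channels(data: bytes, channels: int) -> bytes:
    """Average interleaved 16-bit PCM channels into a single channel."""
    samples = array('h')          # 'h' = signed 16-bit integers
    samples.frombytes(data)
    mono = array('h')
    # Walk the buffer one frame (one sample per channel) at a time.
    for i in range(0, len(samples) - channels + 1, channels):
        frame = samples[i:i + channels]
        mono.append(sum(frame) // channels)
    return mono.tobytes()


# Two stereo frames, (100, 300) and (-200, 0), become two mono samples.
stereo = array('h', [100, 300, -200, 0]).tobytes()
print(list(array('h', merge_channels(stereo, 2))))
```

Averaging keeps the result within the 16-bit range; summing the channels without dividing could overflow.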
You can directly use the audio acquisition and processing modules I've developed (path: `caption-engine/sysaudio`):
```python
if sys.platform == 'win32':
    from sysaudio.win import AudioStream, mergeStreamChannels
elif sys.platform == 'linux':
    from sysaudio.linux import AudioStream, mergeStreamChannels
else:
    raise NotImplementedError(f"Unsupported platform: {sys.platform}")
# Create an instance of the audio stream object
stream = AudioStream(audio_type)
# Open the audio stream
stream.openStream()
while True:  # Loop to read audio data
    # Read audio data
    data = stream.stream.read(stream.CHUNK)
    # Convert dual-channel audio data to single-channel
    data = mergeStreamChannels(data, stream.CHANNELS)
    # Call the audio-to-text model
    # ... ...
```
### Audio to Text Conversion
Once you have the appropriate audio stream, you can convert it to text. Various models are typically used to achieve this. You can choose the model based on your requirements.
### Data Transmission
After obtaining the text for the current audio stream, you need to pass the text to the main program. The caption engine process passes the caption data to the Electron main process through standard output.
The content transmitted must be a JSON string, where the JSON object should include the following parameters:
```typescript
export interface CaptionItem {
  index: number, // Caption sequence number
  time_s: string, // Start time of the current caption
  time_t: string, // End time of the current caption
  text: string, // Caption content
  translation: string // Caption translation
}
```
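For a Python engine, one caption could be assembled and serialized roughly as in the sketch below. The field names follow the `CaptionItem` interface above; the helpers `make_caption` and `to_json_line` and the `%H:%M:%S` time format are assumptions made for illustration.

```python
import json
from datetime import datetime
from typing import Optional


def make_caption(index: int, text: str, translation: str = "",
                 start: Optional[str] = None) -> dict:
    """Build a dict with the five fields of CaptionItem."""
    now = datetime.now().strftime('%H:%M:%S')
    return {
        'index': index,
        'time_s': start or now,   # start time of the caption
        'time_t': now,            # end time of the caption
        'text': text,
        'translation': translation,
    }


def to_json_line(caption: dict) -> str:
    """Serialize one caption as a single newline-terminated JSON line."""
    return json.dumps(caption) + '\n'


line = to_json_line(make_caption(0, 'hello', 'bonjour', start='12:00:00'))
print(line, end='')
```

Keeping each caption on exactly one line lets the receiver split the stream on newlines and parse each piece independently.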
**Every time a caption JSON object is output, the buffer must be flushed, so that each string the Electron main process receives can be parsed as a JSON object.**
If using Python, you can refer to the following method to pass data to the main program:
```python
# caption-engine\main-gummy.py
sys.stdout.reconfigure(line_buffering=True)
# caption-engine\audio2text\gummy.py
...
    def send_to_node(self, data):
        """
        Send data to the Node.js process
        """
        try:
            json_data = json.dumps(data) + '\n'
            sys.stdout.write(json_data)
            sys.stdout.flush()
        except Exception as e:
            print(f"Error sending data to Node.js: {e}", file=sys.stderr)
...
```
The code for the data receiving end is as follows:
```typescript
// src\main\utils\engine.ts
...
this.process.stdout.on('data', (data) => {
  const lines = data.toString().split('\n');
  lines.forEach((line: string) => {
    if (line.trim()) {
      try {
        const caption = JSON.parse(line);
        addCaptionLog(caption);
      } catch (e) {
        controlWindow.sendErrorMessage('Cannot parse caption engine output as JSON object: ' + e)
        console.error('[ERROR] Error parsing JSON:', e);
      }
    }
  });
});
this.process.stderr.on('data', (data) => {
  controlWindow.sendErrorMessage('Caption engine error: ' + data)
  console.error(`[ERROR] Subprocess Error: ${data}`);
});
...
```
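One detail of newline-delimited JSON is worth illustrating: a read from a pipe can, in principle, end in the middle of a line, so a receiver may want to keep the unfinished tail and prepend it to the next chunk. The sketch below shows that idea in Python (the receiver above is TypeScript); the `LineBuffer` class is hypothetical and not part of this project.

```python
import json


class LineBuffer:
    """Collect stream chunks and parse only complete JSON lines."""

    def __init__(self) -> None:
        self._tail = ''

    def feed(self, chunk: str) -> list:
        """Return the objects parsed from every line completed by this chunk."""
        # Everything after the last newline is an unfinished line; keep it.
        *lines, self._tail = (self._tail + chunk).split('\n')
        return [json.loads(line) for line in lines if line.strip()]


buf = LineBuffer()
first = buf.feed('{"index": 0, "te')             # incomplete line, nothing parsed
second = buf.feed('xt": "hi"}\n{"index": 1}\n')  # completes it, plus one more
print(first, second)
```

Flushing after every caption on the sending side makes split lines unlikely, and buffering on the receiving side covers the remaining cases.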
## Code Reference
The default caption engine entry point code is located in the `main-gummy.py` file under the `caption-engine` folder of this project. The `src\main\utils\engine.ts` file contains the server-side code for acquiring and processing caption engine data. You can read and understand the implementation details and the complete runtime process of the caption engine as needed.

118
docs/engine-manual/ja.md Normal file
View File

@@ -0,0 +1,118 @@
# キャプションエンジンの説明文書
![](../../assets/media/structure_ja.png)
この文書は大規模モデルを使用して翻訳されていますので、内容に正確でない部分があるかもしれません。
## キャプションエンジンの紹介
キャプションエンジンとは、実際にはサブプログラムであり、システムの音声入力(録音)または出力(音声再生)のストリーミングデータをリアルタイムで取得し、音声をテキストに変換するモデルを呼び出して対応する音声のキャプションを生成します。生成されたキャプションはJSON形式の文字列データに変換され、標準出力を通じてメインプログラムに渡されます(メインプログラムが読み取った文字列がJSONオブジェクトとして正しく解釈できるようにする必要があります)。メインプログラムはキャプションデータを読み取り、解釈し、処理してウィンドウ上に表示します。
## キャプションエンジンが必要とする機能
### 音声の取得
まず、あなたのキャプションエンジンはシステムの音声入力(録音)または出力(音声再生)のストリーミングデータを取得する必要があります。Pythonを使用して開発する場合、PyAudioライブラリを使用してマイクからの音声入力データを取得できます(全プラットフォーム対応)。PyAudioWPatchライブラリを使用してシステムの音声出力を取得することができます(Windowsプラットフォームのみ対応)。
一般的に取得される音声ストリームデータは、比較的短い時間の音声ブロックで構成されています。モデルに合わせて音声ブロックのサイズを調整する必要があります。例えば、アリババクラウドのGummyモデルでは、0.05秒の音声ブロックを使用した認識精度が0.2秒の音声ブロックよりも優れています。
### 音声の処理
取得した音声ストリームは、テキストに変換する前に前処理を行う必要があるかもしれません。例えば、アリババクラウドのGummyモデルは単一チャンネルの音声ストリームしか認識できませんが、収集された音声ストリームは通常二重チャンネルです。そのため、二重チャンネルの音声ストリームを単一チャンネルに変換する必要があります。チャンネル数の変換はNumPyライブラリのメソッドを使用して行うことができます。
既に開発済みの音声取得と音声処理モジュール(パス:`caption-engine/sysaudio`)を使用することもできます:
```python
if sys.platform == 'win32':
    from sysaudio.win import AudioStream, mergeStreamChannels
elif sys.platform == 'linux':
    from sysaudio.linux import AudioStream, mergeStreamChannels
else:
    raise NotImplementedError(f"サポートされていないプラットフォーム: {sys.platform}")
# 音声ストリームオブジェクトのインスタンスを作成
stream = AudioStream(audio_type)
# 音声ストリームを開く
stream.openStream()
while True:  # 音声データを繰り返し読み込む
    # 音声データを読み込む
    data = stream.stream.read(stream.CHUNK)
    # 二重チャンネルの音声データを単一チャンネルに変換
    data = mergeStreamChannels(data, stream.CHANNELS)
    # 音声をテキストに変換するモデルを呼び出す
    # ... ...
```
### 音声からテキストへの変換
適切な音声ストリームを得た後、それをテキストに変換することができます。通常、様々なモデルを使用してこの変換を行います。必要に応じてモデルを選択してください。
### データの伝送
現在の音声ストリームのテキストを取得したら、それをメインプログラムに伝送する必要があります。キャプションエンジンプロセスは標準出力を通じてキャプションデータをElectronのメインプロセスに伝送します。
伝送する内容はJSON文字列でなければならず、JSONオブジェクトには以下のパラメータを含める必要があります
```typescript
export interface CaptionItem {
  index: number, // キャプション番号
  time_s: string, // 現在のキャプションの開始時間
  time_t: string, // 現在のキャプションの終了時間
  text: string, // キャプションの内容
  translation: string // キャプションの翻訳
}
```
**注意:キャプションJSONデータを出力するたびに必ずバッファをフラッシュし、Electronのメインプロセスが受け取る文字列が常にJSONオブジェクトとして解釈できるようにする必要があります。**
Pythonを使用する場合、以下のようにデータをメインプログラムに伝送できます:
```python
# caption-engine\main-gummy.py
sys.stdout.reconfigure(line_buffering=True)
# caption-engine\audio2text\gummy.py
...
    def send_to_node(self, data):
        """
        データをNode.jsプロセスに送信
        """
        try:
            json_data = json.dumps(data) + '\n'
            sys.stdout.write(json_data)
            sys.stdout.flush()
        except Exception as e:
            print(f"Node.jsへのデータ送信エラー: {e}", file=sys.stderr)
...
```
データ受信側のコードは以下の通りです:
```typescript
// src\main\utils\engine.ts
...
this.process.stdout.on('data', (data) => {
  const lines = data.toString().split('\n');
  lines.forEach((line: string) => {
    if (line.trim()) {
      try {
        const caption = JSON.parse(line);
        addCaptionLog(caption);
      } catch (e) {
        controlWindow.sendErrorMessage('キャプションエンジンの出力内容がJSONオブジェクトとして解析できません: ' + e)
        console.error('[ERROR] JSON解析エラー:', e);
      }
    }
  });
});
this.process.stderr.on('data', (data) => {
  controlWindow.sendErrorMessage('キャプションエンジンエラー: ' + data)
  console.error(`[ERROR] サブプロセスエラー: ${data}`);
});
...
```
## 参考コード
本プロジェクトの `caption-engine` フォルダにある `main-gummy.py` ファイルは、デフォルトのキャプションエンジンのエントリポイントコードです。`src\main\utils\engine.ts` はサーバーサイドでキャプションエンジンのデータを取得および処理するためのコードです。必要に応じて、キャプションエンジンの実装詳細と完全な実行プロセスを理解するために読み込むことができます。

View File

@@ -1,10 +1,10 @@
# 字幕引擎说明文档
![](./media/structure.png)
![](../../assets/media/structure_zh.png)
## 字幕引擎介绍
所谓的字幕引擎实际上是一个子程序,它会实时获取系统音频输入(录音)或输出(播放声音)的流式数据,并调用音频转文字的模型生成对应音频的字幕。生成的字幕通过 IPC 输出为转换为 JSON 格式的字符串数据,并返回给主程序。主程序读取字幕数据,处理后显示在窗口上。
所谓的字幕引擎实际上是一个子程序,它会实时获取系统音频输入(录音)或输出(播放声音)的流式数据,并调用音频转文字的模型生成对应音频的字幕。生成的字幕转换为 JSON 格式的字符串数据,并通过标准输出传递给主程序(需要保证主程序读取到的字符串可以被正确解释为 JSON 对象)。主程序读取并解释字幕数据,处理后显示在窗口上。
## 字幕引擎需要实现的功能
@@ -18,13 +18,38 @@
获取到的音频流在转文字之前可能需要进行预处理。比如阿里云的 Gummy 模型只能识别单通道的音频流,而收集的音频流一般是双通道的,因此要将双通道音频流转换为单通道。通道数的转换可以使用 NumPy 库中的方法实现。
你可以直接使用我开发好的音频获取和音频处理模块(路径:`caption-engine/sysaudio`):
```python
if sys.platform == 'win32':
    from sysaudio.win import AudioStream, mergeStreamChannels
elif sys.platform == 'linux':
    from sysaudio.linux import AudioStream, mergeStreamChannels
else:
    raise NotImplementedError(f"Unsupported platform: {sys.platform}")
# 创建音频流对象实例
stream = AudioStream(audio_type)
# 打开音频流
stream.openStream()
while True:  # 循环读取音频数据
    # 读取音频数据
    data = stream.stream.read(stream.CHUNK)
    # 将双通道音频数据转换为单通道
    data = mergeStreamChannels(data, stream.CHANNELS)
    # 调用音频转文字模型
    # ... ...
```
### 音频转文字
在得到了合适的音频流后,就可以将音频流转换为文字了。一般使用各种模型来实现音频流转文字。可根据需求自行选择模型。
### 数据传递
在获取到当前音频流的文字后,需要将文字传递给主程序。使用进程间通信(IPC)的方式,比如通过标准输入输出流或者命名管道来实现。传递的内容必须是 JSON 字符串,其中 JSON 对象需要包含的参数如下:
在获取到当前音频流的文字后,需要将文字传递给主程序。字幕引擎进程通过标准输出将字幕数据传递给 electron 主进程。
传递的内容必须是 JSON 字符串,其中 JSON 对象需要包含的参数如下:
```typescript
export interface CaptionItem {
@@ -36,10 +61,15 @@ export interface CaptionItem {
}
```
**注意:必须确保每输出一次字幕 JSON 数据就刷新缓冲区,确保 electron 主进程每次接收到的字符串都可以被解释为 JSON 对象。**
如果使用 python 语言,可以参考以下方式将数据传递给主程序:
```python
# python-subprocess\audio2text\gummy.py
# caption-engine\main-gummy.py
sys.stdout.reconfigure(line_buffering=True)
# caption-engine\audio2text\gummy.py
...
def send_to_node(self, data):
"""
@@ -84,4 +114,4 @@ export interface CaptionItem {
## 参考代码
本项目 `python-subprocess` 文件夹下的 `main-gummy.py` 文件为默认字幕引擎的入口代码。`src\main\utils\engine.ts` 为服务端获取字幕引擎数据和进行处理的代码。可以根据需要阅读了解字幕引擎的实现细节和完整运行过程。
本项目 `caption-engine` 文件夹下的 `main-gummy.py` 文件为默认字幕引擎的入口代码。`src\main\utils\engine.ts` 为服务端获取字幕引擎数据和进行处理的代码。可以根据需要阅读了解字幕引擎的实现细节和完整运行过程。


Binary file not shown (image modified; 26 KiB before, 26 KiB after).

docs/img/02_en.png Normal file
Binary file not shown (image added; 61 KiB).

docs/img/02_ja.png Normal file
Binary file not shown (image added; 66 KiB).

docs/img/02_zh.png Normal file
Binary file not shown (image added; 68 KiB).

60
docs/user-manual/en.md Normal file

@@ -0,0 +1,60 @@
# Auto Caption User Manual
Corresponding Version: v0.2.0
## Software Introduction
Auto Caption is a cross-platform caption display program that captures streaming data from the system's audio input (recording) or output (playback) in real time and uses an audio-to-text model to generate captions for that audio. The default caption engine provided by the software (which uses the Alibaba Cloud Gummy model) supports recognition and translation in nine languages (Chinese, English, Japanese, Korean, German, French, Russian, Spanish, Italian).
Currently, the default caption engine only has full functionality on the Windows platform. On the Linux platform, it can only generate captions for audio input (microphone) and does not support generating captions for audio output (playback).
![](../../assets/media/main_en.png)
### Software Limitations
To use the default caption service, you need to obtain an API KEY from Alibaba Cloud.
The software is built with Electron, so its size is inevitably large.
## Software Usage
### Preparing the Alibaba Cloud Model Studio API KEY
To use the default caption engine (Alibaba Cloud Gummy), you need to obtain an API KEY from the Alibaba Cloud Model Studio and configure it in your local environment variables.
**The international version of Alibaba Cloud does not provide the Gummy model, so non-Chinese users currently cannot use the default caption engine. I am trying to develop a new local caption engine to ensure that all users have access to a default caption engine.**
Alibaba Cloud provides detailed tutorials for this:
- [Obtain API KEY (Chinese)](https://help.aliyun.com/zh/model-studio/get-api-key)
- [Configure API Key in Environment Variables (Chinese)](https://help.aliyun.com/zh/model-studio/configure-api-key-through-environment-variables)
### Modifying Settings
Caption settings can be divided into three categories: general settings, caption engine settings, and caption style settings. Note that changes to general settings take effect immediately. For the other two categories, after making changes, you need to click the "Apply" option in the upper right corner of the corresponding settings module for the changes to take effect. If you click "Cancel Changes," the current modifications will not be saved and will revert to the previous state.
### Starting and Stopping Captions
After completing all configurations, click the "Start Caption Engine" button on the interface to start the captions. If you need a separate caption display window, click the "Open Caption Window" button to activate the independent caption display window. To pause caption recognition, click the "Stop Caption Engine" button.
### Adjusting the Caption Display Window
The following image shows the caption display window, which displays the latest captions in real-time. The three buttons in the upper right corner of the window have the following functions: pin the window to the front, open the caption control window, and close the caption display window. The width of the window can be adjusted by moving the mouse to the left or right edge of the window and dragging the mouse.
![](../img/01.png)
### Exporting Caption Records
In the caption control window, you can see the records of all collected captions. Click the "Export Caption Records" button to export the caption records as a JSON file.
## Caption Engine
The caption engine is a subprocess that captures streaming data from the system's audio input (recording) or output (playback) in real time and uses an audio-to-text model to generate captions for that audio. The generated captions are serialized as JSON strings and returned to the main program. The main program reads the caption data, processes it, and displays it in the window.
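As a rough illustration of the engine's output side, the sketch below writes each caption as one JSON line to standard output and flushes right away, so the main program does not receive a partial object. This is an example under stated assumptions, not the bundled engine's code: the helper name `emit_caption` and the `index` and `text` fields are hypothetical, and only `translation` is taken from the `CaptionItem` interface.

```python
import json
import sys


def emit_caption(item: dict, out=sys.stdout) -> None:
    """Write one caption as a single JSON line and flush the stream (hypothetical helper)."""
    # json.dumps never emits a raw newline, so one line always holds one object
    out.write(json.dumps(item) + "\n")
    # Flush so the reader on the other end of the pipe sees the line immediately
    out.flush()


if __name__ == "__main__":
    # Field names other than "translation" are assumptions made for this example
    emit_caption({"index": 1, "text": "hello", "translation": "你好"})
```

Leaving `json.dumps` at its default `ensure_ascii=True` escapes non-ASCII characters, which sidesteps encoding mismatches on the pipe; the object parsed on the receiving side is the same.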
The software provides a default caption engine. If you need other caption engines, you can call them by enabling the custom engine option (other engines need to be developed specifically for this software). The engine path is the path to the custom caption engine on your computer, and the engine command is the runtime parameters for the custom caption engine, which need to be filled out according to the rules of the specific caption engine.
![](../img/02_en.png)
Note that when a custom caption engine is used, none of the caption engine settings above take effect; the custom caption engine is configured entirely through the engine command.
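Because a custom engine receives all of its configuration through the engine command, it has to parse command-line arguments itself. A minimal sketch using Python's `argparse` might look like the following; the flag names are hypothetical (each custom engine defines its own rules), the meaning of the two `--audio-type` values is left to the engine, and the defaults simply mirror the `defaultControls` values that appear in this changeset.

```python
import argparse


def parse_engine_args(argv=None) -> argparse.Namespace:
    """Parse the engine command of a hypothetical custom caption engine."""
    parser = argparse.ArgumentParser(description="Hypothetical custom caption engine")
    # Source and target languages; defaults mirror defaultControls in this changeset
    parser.add_argument("--source-lang", default="en")
    parser.add_argument("--target-lang", default="zh")
    # Audio selector, restricted to 0 or 1 like the `audio` field of Controls
    parser.add_argument("--audio-type", type=int, choices=[0, 1], default=0)
    # Translation is on unless explicitly disabled
    parser.add_argument("--no-translation", dest="translation", action="store_false")
    return parser.parse_args(argv)
```

With this sketch, an engine command such as `--source-lang ja --audio-type 1` would yield `source_lang == "ja"` and `audio_type == 1`, with the remaining options left at their defaults.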
If you are a developer and want to develop a custom caption engine, please refer to the [Caption Engine Explanation Document](../engine-manual/en.md).

62
docs/user-manual/ja.md Normal file

@@ -0,0 +1,62 @@
# Auto Caption ユーザーマニュアル
対応バージョン:v0.2.0
この文書は大規模モデルを使用して翻訳されていますので、内容に正確でない部分があるかもしれません。
## ソフトウェアの概要
Auto Caption は、クロスプラットフォームの字幕表示ソフトウェアで、システムの音声入力(録音)または出力(音声再生)のストリーミングデータをリアルタイムで取得し、音声からテキストに変換するモデルを利用して対応する音声の字幕を生成します。このソフトウェアが提供するデフォルトの字幕エンジン(アリババクラウド Gummy モデルを使用)は、9つの言語(中国語、英語、日本語、韓国語、ドイツ語、フランス語、ロシア語、スペイン語、イタリア語)の認識と翻訳をサポートしています。
現在、デフォルトの字幕エンジンは Windows プラットフォームでのみ完全な機能を利用できます。Linux プラットフォームでは、音声入力(マイク)からの字幕生成のみがサポートされており、音声出力(音声再生)からの字幕生成はまだサポートされていません。
![](../../assets/media/main_ja.png)
### ソフトウェアの欠点
デフォルトの字幕サービスを使用するには、アリババクラウドの API KEY を取得する必要があります。
ソフトウェアは Electron で構築されているため、そのサイズは避けられないほど大きいです。
## ソフトウェアの使用方法
### アリババクラウド百炼プラットフォームの API KEY の準備
ソフトウェアが提供するデフォルトの字幕エンジン(アリババクラウド Gummy)を使用するには、アリババクラウド百炼プラットフォームから API KEY を取得し、ローカル環境変数に設定する必要があります。
**アリババクラウドの国際版には Gummy モデルが提供されていないため、中国以外のユーザーは現在、デフォルトの字幕エンジンを使用できません。すべてのユーザーが利用できるように、新しいローカルの字幕エンジンを開発中です。**
アリババクラウドは詳細なチュートリアルを提供していますので、以下のリンクを参照してください:
- [API KEY の取得(中国語)](https://help.aliyun.com/zh/model-studio/get-api-key)
- [環境変数を通じて API Key を設定する(中国語)](https://help.aliyun.com/zh/model-studio/configure-api-key-through-environment-variables)
### 設定の変更
字幕の設定は3つのカテゴリーに分かれます:一般的な設定、字幕エンジンの設定、字幕スタイルの設定。注意すべき点として、一般的な設定の変更は即座に適用されます。しかし、他の2つの設定については、変更後に該当する設定モジュール右上の「適用」オプションをクリックすることで初めて変更が有効になります。「変更を取り消す」を選択すると、現在の変更は保存されず、前回の状態に戻ります。
### 字幕の開始と停止
すべての設定を完了したら、インターフェースの「字幕エンジンを開始」ボタンをクリックして字幕を開始できます。独立した字幕表示ウィンドウが必要な場合は、インターフェースの「字幕ウィンドウを開く」ボタンをクリックして独立した字幕表示ウィンドウをアクティブ化します。字幕認識を一時停止する必要がある場合は、「字幕エンジンを停止」ボタンをクリックします。
### 字幕表示ウィンドウの調整
下の図は字幕表示ウィンドウです。このウィンドウは現在の最新の字幕をリアルタイムで表示します。ウィンドウの右上にある3つのボタンの機能はそれぞれ次の通りです:ウィンドウを最前面に固定する、字幕制御ウィンドウを開く、字幕表示ウィンドウを閉じる。このウィンドウの幅は調整可能です。マウスをウィンドウの左右の端に移動し、ドラッグして幅を調整します。
![](../img/01.png)
### 字幕記録のエクスポート
字幕制御ウィンドウでは、現在収集されたすべての字幕の記録を見ることができます。「字幕記録をエクスポート」ボタンをクリックすると、字幕記録をJSONファイルとしてエクスポートできます。
## 字幕エンジン
字幕エンジンとは、実際にはサブプログラムであり、システムの音声入力(録音)または出力(音声再生)のストリーミングデータをリアルタイムで取得し、音声からテキストに変換するモデルを利用して対応する音声の字幕を生成します。生成された字幕はIPC経由で文字列に変換されたJSONデータとして出力され、メインプログラムに返されます。メインプログラムは字幕データを読み取り、処理してウィンドウ上に表示します。
ソフトウェアはデフォルトの字幕エンジンを提供しており、他の字幕エンジンが必要な場合は、カスタムエンジンオプションを開いて他の字幕エンジンを呼び出すことができます(他のエンジンはこのソフトウェアに対して開発する必要があります)。エンジンパスは、あなたのコンピュータ上のカスタム字幕エンジンのパスであり、エンジンコマンドはカスタム字幕エンジンの実行パラメータです。これらの部分は、その字幕エンジンの規則に従って記入する必要があります。
![](../img/02_ja.png)
カスタム字幕エンジンを使用する場合、前の字幕エンジンの設定はすべて無効になります。カスタム字幕エンジンの設定は完全にエンジンコマンドによって行われます。
開発者の方で、カスタム字幕エンジンを開発したい場合は、[字幕エンジン説明文書](../engine-manual/ja.md)をご覧ください。


@@ -1,14 +1,14 @@
# Auto Caption 用户手册
对应版本:v0.1.0
对应版本:v0.2.0
## 软件简介
Auto Caption 是一个跨平台的字幕显示软件,能够实时获取系统音频输入(录音)或输出(播放声音)的流式数据,并调用音频转文字的模型生成对应音频的字幕。软件提供的默认字幕引擎(使用阿里云 Gummy 模型)支持九种语言(中英日韩德法俄西意)的识别与翻译。
Auto Caption 是一个跨平台的字幕显示软件,能够实时获取系统音频输入(录音)或输出(播放声音)的流式数据,并调用音频转文字的模型生成对应音频的字幕。软件提供的默认字幕引擎(使用阿里云 Gummy 模型)支持九种语言(中、英、日、韩、德、法、俄、西、意)的识别与翻译。
目前软件默认字幕引擎只有在 Windows 平台下才拥有完整功能。在 Linux 平台下只能生成音频输入(麦克风)的字幕,暂不支持音频输出(播放声音)的字幕生成。
![](./media/main.png)
![](../../assets/media/main_zh.png)
### 软件缺点
@@ -22,15 +22,17 @@ Auto Caption 是一个跨平台的字幕显示软件,能够实时获取系统
要使用软件提供的默认字幕引擎(阿里云 Gummy)需要从阿里云百炼平台获取 API KEY 并在本机环境变量中配置。
**国际版的阿里云服务并没有提供 Gummy 模型,因此目前非中国用户无法使用默认字幕引擎。我正在开发新的本地字幕引擎,以确保所有用户都有默认字幕引擎可以使用。**
这部分阿里云提供了详细的教程,可参考:
- [获取 API KEY](https://help.aliyun.com/zh/model-studio/get-api-key)
- [将 API Key 配置到环境变量](https://help.aliyun.com/zh/model-studio/configure-api-key-through-environment-variables)
- [将 API Key 配置到环境变量](https://help.aliyun.com/zh/model-studio/configure-api-key-through-environment-variables)
### 修改字幕设置
### 修改设置
字幕设置可以分为两类:修改字幕引擎设置、修改字幕样式设置。需要注意的是,在调整的设置的参数后,需要点击每个设置模块右上角的“更改设置”(字幕引擎设置)或“应用样式”(字幕样式设置),更改才会真正生效。如果点击“取消更改”那么当前设置将不会被保存,而是回到上次修改的状态。
字幕设置可以分为三类:通用设置、字幕引擎设置、字幕样式设置。需要注意的是,修改通用设置是立即生效的。但是对于其他两类设置,修改后需要点击对应设置模块右上角的“应用”选项,更改才会真正生效。如果点击“取消更改”那么当前修改将不会被保存,而是回退到上次修改的状态。
### 启动和关闭字幕
@@ -40,7 +42,7 @@ Auto Caption 是一个跨平台的字幕显示软件,能够实时获取系统
如下图为字幕展示窗口,该窗口实时展示当前最新字幕。窗口右上角三个按钮的功能分别是:将窗口固定在最前面、打开字幕控制窗口、关闭字幕展示窗口。该窗口宽度可以调整,将鼠标移动至窗口的左右边缘,拖动鼠标即可调整宽度。
![](./img/01.png)
![](../img/01.png)
### 字幕记录的导出
@@ -50,10 +52,10 @@ Auto Caption 是一个跨平台的字幕显示软件,能够实时获取系统
所谓的字幕引擎实际上是一个子程序,它会实时获取系统音频输入(录音)或输出(播放声音)的流式数据,并调用音频转文字的模型生成对应音频的字幕。生成的字幕通过 IPC 输出为转换为字符串的 JSON 数据,并返回给主程序。主程序读取字幕数据,处理后显示在窗口上。
软件提供了一个默认的字幕引擎,如果你需要其他的字幕引擎,可以通过打开自定义引擎选项来调用其他字幕引擎(其他引擎需要针对进行开发)。其中引擎路径是自定义字幕引擎在你的电脑上的路径,引擎指令是自定义字幕引擎的运行参数,这部分需要按该字幕引擎的规则进行填写。
软件提供了一个默认的字幕引擎,如果你需要其他的字幕引擎,可以通过打开自定义引擎选项来调用其他字幕引擎(其他引擎需要针对该软件进行开发)。其中引擎路径是自定义字幕引擎在你的电脑上的路径,引擎指令是自定义字幕引擎的运行参数,这部分需要按该字幕引擎的规则进行填写。
![](./img/02.png)
![](../img/02_zh.png)
注意使用自定义字幕引擎时,前面的字幕引擎的设置将全部不起作用,自定义字幕引擎的配置完全通过引擎指令进行配置。
如果你是开发者,想开发自定义字幕引擎,请查看[字幕引擎说明文档](./engine-manual_zh.md)。
如果你是开发者,想开发自定义字幕引擎,请查看[字幕引擎说明文档](../engine-manual/zh.md)。


@@ -10,8 +10,8 @@ files:
- '!{.env,.env.*,.npmrc,pnpm-lock.yaml}'
- '!{tsconfig.json,tsconfig.node.json,tsconfig.web.json}'
extraResources:
from: ./python-subprocess/dist/main-gummy.exe
to: ./python-subprocess/dist/main-gummy.exe
from: ./caption-engine/dist/main-gummy.exe
to: ./caption-engine/dist/main-gummy.exe
asarUnpack:
- resources/**
win:

110
package-lock.json generated

@@ -1,27 +1,26 @@
{
"name": "auto-caption",
"version": "0.0.1",
"version": "0.1.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "auto-caption",
"version": "0.0.1",
"version": "0.1.0",
"hasInstallScript": true,
"dependencies": {
"@electron-toolkit/preload": "^3.0.1",
"@electron-toolkit/utils": "^4.0.0",
"ant-design-vue": "^4.2.6",
"pinia": "^3.0.2",
"vue-router": "^4.5.1",
"ws": "^8.18.2"
"vue-i18n": "^11.1.9",
"vue-router": "^4.5.1"
},
"devDependencies": {
"@electron-toolkit/eslint-config-prettier": "3.0.0",
"@electron-toolkit/eslint-config-ts": "^3.0.0",
"@electron-toolkit/tsconfig": "^1.0.1",
"@types/node": "^22.14.1",
"@types/ws": "^8.18.1",
"@vitejs/plugin-vue": "^5.2.3",
"electron": "^35.1.5",
"electron-builder": "^25.1.8",
@@ -1492,6 +1491,50 @@
"url": "https://github.com/sponsors/nzakas"
}
},
"node_modules/@intlify/core-base": {
"version": "11.1.9",
"resolved": "https://registry.npmmirror.com/@intlify/core-base/-/core-base-11.1.9.tgz",
"integrity": "sha512-Lrdi4wp3XnGhWmB/mMD/XtfGUw1Jt+PGpZI/M63X1ZqhTDjNHRVCs/i8vv8U1cwaj1A9fb0bkCQHLSL0SK+pIQ==",
"license": "MIT",
"dependencies": {
"@intlify/message-compiler": "11.1.9",
"@intlify/shared": "11.1.9"
},
"engines": {
"node": ">= 16"
},
"funding": {
"url": "https://github.com/sponsors/kazupon"
}
},
"node_modules/@intlify/message-compiler": {
"version": "11.1.9",
"resolved": "https://registry.npmmirror.com/@intlify/message-compiler/-/message-compiler-11.1.9.tgz",
"integrity": "sha512-84SNs3Ikjg0rD1bOuchzb3iK1vR2/8nxrkyccIl5DjFTeMzE/Fxv6X+A7RN5ZXjEWelc1p5D4kHA6HEOhlKL5Q==",
"license": "MIT",
"dependencies": {
"@intlify/shared": "11.1.9",
"source-map-js": "^1.0.2"
},
"engines": {
"node": ">= 16"
},
"funding": {
"url": "https://github.com/sponsors/kazupon"
}
},
"node_modules/@intlify/shared": {
"version": "11.1.9",
"resolved": "https://registry.npmmirror.com/@intlify/shared/-/shared-11.1.9.tgz",
"integrity": "sha512-H/83xgU1l8ox+qG305p6ucmoy93qyjIPnvxGWRA7YdOoHe1tIiW9IlEu4lTdsOR7cfP1ecrwyflQSqXdXBacXA==",
"license": "MIT",
"engines": {
"node": ">= 16"
},
"funding": {
"url": "https://github.com/sponsors/kazupon"
}
},
"node_modules/@isaacs/cliui": {
"version": "8.0.2",
"resolved": "https://registry.npmmirror.com/@isaacs/cliui/-/cliui-8.0.2.tgz",
@@ -2281,16 +2324,6 @@
"license": "MIT",
"optional": true
},
"node_modules/@types/ws": {
"version": "8.18.1",
"resolved": "https://registry.npmmirror.com/@types/ws/-/ws-8.18.1.tgz",
"integrity": "sha512-ThVF6DCVhA8kUGy+aazFQ4kXQ7E1Ty7A3ypFOe0IcJV8O/M511G99AW24irKrW56Wt44yG9+ij8FaqoBGkuBXg==",
"dev": true,
"license": "MIT",
"dependencies": {
"@types/node": "*"
}
},
"node_modules/@types/yauzl": {
"version": "2.10.3",
"resolved": "https://registry.npmmirror.com/@types/yauzl/-/yauzl-2.10.3.tgz",
@@ -9430,6 +9463,32 @@
"node": ">=10"
}
},
"node_modules/vue-i18n": {
"version": "11.1.9",
"resolved": "https://registry.npmmirror.com/vue-i18n/-/vue-i18n-11.1.9.tgz",
"integrity": "sha512-N9ZTsXdRmX38AwS9F6Rh93RtPkvZTkSy/zNv63FTIwZCUbLwwrpqlKz9YQuzFLdlvRdZTnWAUE5jMxr8exdl7g==",
"license": "MIT",
"dependencies": {
"@intlify/core-base": "11.1.9",
"@intlify/shared": "11.1.9",
"@vue/devtools-api": "^6.5.0"
},
"engines": {
"node": ">= 16"
},
"funding": {
"url": "https://github.com/sponsors/kazupon"
},
"peerDependencies": {
"vue": "^3.0.0"
}
},
"node_modules/vue-i18n/node_modules/@vue/devtools-api": {
"version": "6.6.4",
"resolved": "https://registry.npmmirror.com/@vue/devtools-api/-/devtools-api-6.6.4.tgz",
"integrity": "sha512-sGhTPMuXqZ1rVOk32RylztWkfXTRhuS7vgAKv0zjqk8gbsHkJ7xfFf+jbySxt7tWObEJwyKaHMikV/WGDiQm8g==",
"license": "MIT"
},
"node_modules/vue-router": {
"version": "4.5.1",
"resolved": "https://registry.npmmirror.com/vue-router/-/vue-router-4.5.1.tgz",
@@ -9581,27 +9640,6 @@
"integrity": "sha512-l4Sp/DRseor9wL6EvV2+TuQn63dMkPjZ/sp9XkghTEbV9KlPS1xUsZ3u7/IQO4wxtcFB4bgpQPRcR3QCvezPcQ==",
"license": "ISC"
},
"node_modules/ws": {
"version": "8.18.2",
"resolved": "https://registry.npmmirror.com/ws/-/ws-8.18.2.tgz",
"integrity": "sha512-DMricUmwGZUVr++AEAe2uiVM7UoO9MAVZMDu05UQOaUII0lp+zOzLLU4Xqh/JvTqklB1T4uELaaPBKyjE1r4fQ==",
"license": "MIT",
"engines": {
"node": ">=10.0.0"
},
"peerDependencies": {
"bufferutil": "^4.0.1",
"utf-8-validate": ">=5.0.2"
},
"peerDependenciesMeta": {
"bufferutil": {
"optional": true
},
"utf-8-validate": {
"optional": true
}
}
},
"node_modules/xml-name-validator": {
"version": "4.0.0",
"resolved": "https://registry.npmmirror.com/xml-name-validator/-/xml-name-validator-4.0.0.tgz",


@@ -1,6 +1,6 @@
{
"name": "auto-caption",
"version": "0.1.0",
"version": "0.2.0",
"description": "A cross-platform subtitle display software.",
"main": "./out/main/index.js",
"author": "himeditator",
@@ -25,15 +25,14 @@
"@electron-toolkit/utils": "^4.0.0",
"ant-design-vue": "^4.2.6",
"pinia": "^3.0.2",
"vue-router": "^4.5.1",
"ws": "^8.18.2"
"vue-i18n": "^11.1.9",
"vue-router": "^4.5.1"
},
"devDependencies": {
"@electron-toolkit/eslint-config-prettier": "3.0.0",
"@electron-toolkit/eslint-config-ts": "^3.0.0",
"@electron-toolkit/tsconfig": "^1.0.1",
"@types/node": "^22.14.1",
"@types/ws": "^8.18.1",
"@vitejs/plugin-vue": "^5.2.3",
"electron": "^35.1.5",
"electron-builder": "^25.1.8",


@@ -1,221 +0,0 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from dashscope.audio.asr import *\n",
"import pyaudiowpatch as pyaudio\n",
"import numpy as np\n",
"\n",
"\n",
"def getDefaultSpeakers(mic: pyaudio.PyAudio, info = True):\n",
" \"\"\"\n",
" 获取默认的系统音频输出的回环设备\n",
" Args:\n",
" mic (pyaudio.PyAudio): pyaudio对象\n",
" info (bool, optional): 是否打印设备信息. Defaults to True.\n",
"\n",
" Returns:\n",
" dict: 统音频输出的回环设备\n",
" \"\"\"\n",
" try:\n",
" WASAPI_info = mic.get_host_api_info_by_type(pyaudio.paWASAPI)\n",
" except OSError:\n",
" print(\"Looks like WASAPI is not available on the system. Exiting...\")\n",
" exit()\n",
"\n",
" default_speaker = mic.get_device_info_by_index(WASAPI_info[\"defaultOutputDevice\"])\n",
" if(info): print(\"wasapi_info:\\n\", WASAPI_info, \"\\n\")\n",
" if(info): print(\"default_speaker:\\n\", default_speaker, \"\\n\")\n",
"\n",
" if not default_speaker[\"isLoopbackDevice\"]:\n",
" for loopback in mic.get_loopback_device_info_generator():\n",
" if default_speaker[\"name\"] in loopback[\"name\"]:\n",
" default_speaker = loopback\n",
" if(info): print(\"Using loopback device:\\n\", default_speaker, \"\\n\")\n",
" break\n",
" else:\n",
" print(\"Default loopback output device not found.\")\n",
" print(\"Run `python -m pyaudiowpatch` to check available devices.\")\n",
" print(\"Exiting...\")\n",
" exit()\n",
" \n",
" if(info): print(f\"Recording Device: #{default_speaker['index']} {default_speaker['name']}\")\n",
" return default_speaker\n",
"\n",
"\n",
"class Callback(TranslationRecognizerCallback):\n",
" \"\"\"\n",
" 语音大模型流式传输回调对象\n",
" \"\"\"\n",
" def __init__(self):\n",
" super().__init__()\n",
" self.usage = 0\n",
" self.sentences = []\n",
" self.translations = []\n",
" \n",
" def on_open(self) -> None:\n",
" print(\"\\n流式翻译开始...\\n\")\n",
"\n",
" def on_close(self) -> None:\n",
" print(f\"\\nTokens消耗{self.usage}\")\n",
" print(f\"流式翻译结束...\\n\")\n",
" for i in range(len(self.sentences)):\n",
" print(f\"\\n{self.sentences[i]}\\n{self.translations[i]}\\n\")\n",
"\n",
" def on_event(\n",
" self,\n",
" request_id,\n",
" transcription_result: TranscriptionResult,\n",
" translation_result: TranslationResult,\n",
" usage\n",
" ) -> None:\n",
" if transcription_result is not None:\n",
" id = transcription_result.sentence_id\n",
" text = transcription_result.text\n",
" if transcription_result.stash is not None:\n",
" stash = transcription_result.stash.text\n",
" else:\n",
" stash = \"\"\n",
" print(f\"#{id}: {text}{stash}\")\n",
" if usage: self.sentences.append(text)\n",
" \n",
" if translation_result is not None:\n",
" lang = translation_result.get_language_list()[0]\n",
" text = translation_result.get_translation(lang).text\n",
" if translation_result.get_translation(lang).stash is not None:\n",
" stash = translation_result.get_translation(lang).stash.text\n",
" else:\n",
" stash = \"\"\n",
" print(f\"#{lang}: {text}{stash}\")\n",
" if usage: self.translations.append(text)\n",
" \n",
" if usage: self.usage += usage['duration']"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"采样输入设备:\n",
" - 序号:37\n",
" - 名称:耳机 (HUAWEI FreeLace 活力版) [Loopback]\n",
" - 最大输入通道数:2\n",
" - 默认低输入延迟:0.003s\n",
" - 默认高输入延迟:0.01s\n",
" - 默认采样率:44100.0Hz\n",
" - 是否回环设备:True\n",
"\n",
"音频样本块大小:4410\n",
"样本位宽:2\n",
"音频数据格式:8\n",
"音频通道数:2\n",
"音频采样率:44100\n",
"\n"
]
}
],
"source": [
"mic = pyaudio.PyAudio()\n",
"default_speaker = getDefaultSpeakers(mic, False)\n",
"\n",
"SAMP_WIDTH = pyaudio.get_sample_size(pyaudio.paInt16)\n",
"FORMAT = pyaudio.paInt16\n",
"CHANNELS = default_speaker[\"maxInputChannels\"]\n",
"RATE = int(default_speaker[\"defaultSampleRate\"])\n",
"CHUNK = RATE // 10\n",
"INDEX = default_speaker[\"index\"]\n",
"\n",
"dev_info = f\"\"\"\n",
"采样输入设备:\n",
" - 序号:{default_speaker['index']}\n",
" - 名称:{default_speaker['name']}\n",
" - 最大输入通道数:{default_speaker['maxInputChannels']}\n",
" - 默认低输入延迟:{default_speaker['defaultLowInputLatency']}s\n",
" - 默认高输入延迟:{default_speaker['defaultHighInputLatency']}s\n",
" - 默认采样率:{default_speaker['defaultSampleRate']}Hz\n",
" - 是否回环设备:{default_speaker['isLoopbackDevice']}\n",
"\n",
"音频样本块大小:{CHUNK}\n",
"样本位宽:{SAMP_WIDTH}\n",
"音频数据格式:{FORMAT}\n",
"音频通道数:{CHANNELS}\n",
"音频采样率:{RATE}\n",
"\"\"\"\n",
"print(dev_info)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"RECORD_SECONDS = 20 # 监听时长(s)\n",
"\n",
"stream = mic.open(\n",
" format = FORMAT,\n",
" channels = CHANNELS,\n",
" rate = RATE,\n",
" input = True,\n",
" input_device_index = INDEX\n",
")\n",
"translator = TranslationRecognizerRealtime(\n",
" model = \"gummy-realtime-v1\",\n",
" format = \"pcm\",\n",
" sample_rate = RATE,\n",
" transcription_enabled = True,\n",
" translation_enabled = True,\n",
" source_language = \"ja\",\n",
" translation_target_languages = [\"zh\"],\n",
" callback = Callback()\n",
")\n",
"translator.start()\n",
"\n",
"for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):\n",
" data = stream.read(CHUNK)\n",
" data_np = np.frombuffer(data, dtype=np.int16)\n",
" data_np_r = data_np.reshape(-1, CHANNELS)\n",
" print(data_np_r.shape)\n",
" mono_data = np.mean(data_np_r.astype(np.float32), axis=1)\n",
" mono_data = mono_data.astype(np.int16)\n",
" mono_data_bytes = mono_data.tobytes()\n",
" translator.send_audio_frame(mono_data_bytes)\n",
"\n",
"translator.stop()\n",
"stream.stop_stream()\n",
"stream.close()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "mystd",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}


@@ -1,4 +0,0 @@
numpy
dashscope
pyaudio
pyaudiowpatch

Binary file not shown.


@@ -2,13 +2,12 @@ import { shell, BrowserWindow, ipcMain } from 'electron'
import path from 'path'
import { is } from '@electron-toolkit/utils'
import icon from '../../resources/icon.png?asset'
import { controlWindow } from './control'
import { sendStyles, sendCaptionLog } from './utils/config'
import { controlWindow } from './ControlWindow'
class CaptionWindow {
class CaptionWindow {
window: BrowserWindow | undefined;
public createWindow(): void {
public createWindow(): void {
this.window = new BrowserWindow({
icon: icon,
width: 900,
@@ -26,13 +25,6 @@ class CaptionWindow {
sandbox: false
}
})
setTimeout(() => {
if (this.window) {
sendStyles(this.window);
sendCaptionLog(this.window, 'set');
}
}, 1000);
this.window.on('ready-to-show', () => {
this.window?.show()
@@ -46,7 +38,7 @@ class CaptionWindow {
shell.openExternal(details.url)
return { action: 'deny' }
})
if (is.dev && process.env['ELECTRON_RENDERER_URL']) {
this.window.loadURL(`${process.env['ELECTRON_RENDERER_URL']}/#/caption`)
} else {
@@ -57,7 +49,6 @@ class CaptionWindow {
}
public handleMessage() {
// 字幕窗口请求创建控制窗口
ipcMain.on('caption.controlWindow.activate', () => {
if(!controlWindow.window){
controlWindow.createWindow()
@@ -66,19 +57,19 @@ class CaptionWindow {
controlWindow.window.show()
}
})
// 字幕窗口高度发生变化
ipcMain.on('caption.windowHeight.change', (_, height) => {
if(this.window){
this.window.setSize(this.window.getSize()[0], height)
this.window.setSize(this.window.getSize()[0], height)
}
})
// 关闭字幕窗口
ipcMain.on('caption.window.close', () => {
if(this.window){
this.window.close()
}
})
// 是否固定在最前面
ipcMain.on('caption.pin.set', (_, pinned) => {
if(this.window){
this.window.setAlwaysOnTop(pinned)

136
src/main/ControlWindow.ts Normal file

@@ -0,0 +1,136 @@
import { shell, BrowserWindow, ipcMain, nativeTheme } from 'electron'
import path from 'path'
import { is } from '@electron-toolkit/utils'
import icon from '../../resources/icon.png?asset'
import { captionWindow } from './CaptionWindow'
import { allConfig } from './utils/AllConfig'
import { captionEngine } from './utils/CaptionEngine'
class ControlWindow {
window: BrowserWindow | undefined;
public createWindow(): void {
this.window = new BrowserWindow({
icon: icon,
width: 1200,
height: 800,
minWidth: 750,
minHeight: 500,
show: false,
center: true,
autoHideMenuBar: true,
...(process.platform === 'linux' ? { icon } : {}),
webPreferences: {
preload: path.join(__dirname, '../preload/index.js'),
sandbox: false
}
})
allConfig.readConfig()
this.window.on('ready-to-show', () => {
this.window?.show()
})
this.window.on('closed', () => {
this.window = undefined
allConfig.writeConfig()
})
this.window.webContents.setWindowOpenHandler((details) => {
shell.openExternal(details.url)
return { action: 'deny' }
})
if (is.dev && process.env['ELECTRON_RENDERER_URL']) {
this.window.loadURL(process.env['ELECTRON_RENDERER_URL'])
} else {
this.window.loadFile(path.join(__dirname, '../renderer/index.html'))
}
}
public handleMessage() {
nativeTheme.on('updated', () => {
if(allConfig.uiTheme === 'system'){
if(nativeTheme.shouldUseDarkColors && this.window){
this.window.webContents.send('control.nativeTheme.change', 'dark')
}
else if(!nativeTheme.shouldUseDarkColors && this.window){
this.window.webContents.send('control.nativeTheme.change', 'light')
}
}
})
ipcMain.handle('both.window.mounted', () => {
return allConfig.getFullConfig()
})
ipcMain.handle('control.nativeTheme.get', () => {
if(nativeTheme.shouldUseDarkColors) return 'dark'
return 'light'
})
ipcMain.on('control.uiLanguage.change', (_, args) => {
allConfig.uiLanguage = args
if(captionWindow.window){
captionWindow.window.webContents.send('control.uiLanguage.set', args)
}
})
ipcMain.on('control.uiTheme.change', (_, args) => {
allConfig.uiTheme = args
})
ipcMain.on('control.leftBarWidth.change', (_, args) => {
allConfig.leftBarWidth = args
})
ipcMain.on('control.styles.change', (_, args) => {
allConfig.setStyles(args)
if(captionWindow.window){
allConfig.sendStyles(captionWindow.window)
}
})
ipcMain.on('control.styles.reset', () => {
allConfig.resetStyles()
if(this.window){
allConfig.sendStyles(this.window)
}
if(captionWindow.window){
allConfig.sendStyles(captionWindow.window)
}
})
ipcMain.on('control.captionWindow.activate', () => {
if(!captionWindow.window){
captionWindow.createWindow()
}
else {
captionWindow.window.show()
}
})
ipcMain.on('control.controls.change', (_, args) => {
allConfig.setControls(args)
})
ipcMain.on('control.engine.start', () => {
captionEngine.start()
})
ipcMain.on('control.engine.stop', () => {
captionEngine.stop()
})
ipcMain.on('control.captionLog.clear', () => {
allConfig.captionLog.splice(0)
})
}
public sendErrorMessage(message: string) {
this.window?.webContents.send('control.error.occurred', message)
}
}
export const controlWindow = new ControlWindow()


@@ -1,136 +0,0 @@
import { shell, BrowserWindow, ipcMain } from 'electron'
import path from 'path'
import { is } from '@electron-toolkit/utils'
import icon from '../../resources/icon.png?asset'
import { captionWindow } from './caption'
import {
captionEngine,
captionLog,
controls,
setStyles,
resetStyles,
sendStyles,
sendCaptionLog,
setControls,
sendControls,
readConfig,
writeConfig
} from './utils/config'
class ControlWindow {
window: BrowserWindow | undefined;
public createWindow(): void {
this.window = new BrowserWindow({
icon: icon,
width: 1200,
height: 800,
minWidth: 900,
minHeight: 600,
show: false,
center: true,
autoHideMenuBar: true,
...(process.platform === 'linux' ? { icon } : {}),
webPreferences: {
preload: path.join(__dirname, '../preload/index.js'),
sandbox: false
}
})
setTimeout(() => {
if (this.window) {
readConfig()
sendStyles(this.window) // 配置初始样式
sendCaptionLog(this.window, 'set') // 配置当前字幕记录
sendControls(this.window) // 配置字幕引擎配置
}
}, 1000);
this.window.on('ready-to-show', () => {
this.window?.show()
})
this.window.on('closed', () => {
this.window = undefined
writeConfig()
})
this.window.webContents.setWindowOpenHandler((details) => {
shell.openExternal(details.url)
return { action: 'deny' }
})
if (is.dev && process.env['ELECTRON_RENDERER_URL']) {
this.window.loadURL(process.env['ELECTRON_RENDERER_URL'])
} else {
this.window.loadFile(path.join(__dirname, '../renderer/index.html'))
}
}
public handleMessage() {
// 控制窗口样式更新
ipcMain.on('control.style.change', (_, args) => {
setStyles(args)
if(captionWindow.window){
sendStyles(captionWindow.window)
}
})
ipcMain.on('control.style.reset', () => {
resetStyles()
if(captionWindow.window){
sendStyles(captionWindow.window)
}
if(this.window){
sendStyles(this.window)
}
})
// 控制窗口请求创建字幕窗口
ipcMain.on('control.captionWindow.activate', () => {
if(!captionWindow.window){
captionWindow.createWindow()
}
else {
captionWindow.window.show()
}
})
// 字幕引擎控制配置更新并启动引擎
ipcMain.on('control.control.change', (_, args) => {
setControls(args)
})
// 启动字幕引擎
ipcMain.on('control.engine.start', () => {
if(controls.engineEnabled){
this.window?.webContents.send('control.engine.already')
}
else {
if(
process.env.DASHSCOPE_API_KEY ||
(controls.customized && controls.customizedApp)
) {
if(this.window){
captionEngine.start(this.window)
}
}
else {
this.sendErrorMessage('没有检测到 DASHSCOPE_API_KEY 环境变量,如果要使用 gummy 引擎,需要在阿里云百炼平台获取 API Key 并添加到本机环境变量')
}
}
})
// 停止字幕引擎
ipcMain.on('control.engine.stop', () => {
captionEngine.stop()
this.window?.webContents.send('control.engine.stopped')
})
// 清空字幕记录
ipcMain.on('control.caption.clear', () => {
captionLog.splice(0)
})
}
public sendErrorMessage(message: string) {
this.window?.webContents.send('control.error.send', message)
}
}
export const controlWindow = new ControlWindow()

11
src/main/i18n/index.ts Normal file

@@ -0,0 +1,11 @@
import zh from './lang/zh'
import en from './lang/en'
import ja from './lang/ja'
import { allConfig } from '../utils/AllConfig'
export function i18n(key: string): string{
if(allConfig.uiLanguage === 'zh') return zh[key] || key
else if(allConfig.uiLanguage === 'en') return en[key] || key
else if(allConfig.uiLanguage === 'ja') return ja[key] || key
else return key
}

8
src/main/i18n/lang/en.ts Normal file

@@ -0,0 +1,8 @@
export default {
"gummy.env.missing": "DASHSCOPE_API_KEY environment variable not detected. To use the gummy engine, you need to obtain an API Key from Alibaba Cloud's Bailian platform and add it to your local environment variables.",
"platform.unsupported": "Unsupported platform: ",
"engine.start.error": "Caption engine failed to start: ",
"engine.output.parse.error": "Unable to parse caption engine output as a JSON object: ",
"engine.error": "Caption engine error: ",
"engine.shutdown.error": "Failed to shut down the caption engine process: "
}

8
src/main/i18n/lang/ja.ts Normal file

@@ -0,0 +1,8 @@
export default {
"gummy.env.missing": "DASHSCOPE_API_KEY 環境変数が検出されませんでした。Gummy エンジンを使用するには、Alibaba Cloud の百煉プラットフォームから API Key を取得し、ローカル環境変数に追加する必要があります。",
"platform.unsupported": "サポートされていないプラットフォーム: ",
"engine.start.error": "字幕エンジンの起動に失敗しました: ",
"engine.output.parse.error": "字幕エンジンの出力を JSON オブジェクトとして解析できませんでした: ",
"engine.error": "字幕エンジンエラー: ",
"engine.shutdown.error": "字幕エンジンプロセスの終了に失敗しました: "
}

8
src/main/i18n/lang/zh.ts Normal file

@@ -0,0 +1,8 @@
export default {
"gummy.env.missing": "没有检测到 DASHSCOPE_API_KEY 环境变量,如果要使用 gummy 引擎,需要在阿里云百炼平台获取 API Key 并添加到本机环境变量",
"platform.unsupported": "不支持的平台:",
"engine.start.error": "字幕引擎启动失败:",
"engine.output.parse.error": "字幕引擎输出内容无法解析为 JSON 对象:",
"engine.error": "字幕引擎错误:",
"engine.shutdown.error": "字幕引擎进程关闭失败:"
}


@@ -1,8 +1,9 @@
import { app, BrowserWindow } from 'electron'
import { electronApp, optimizer } from '@electron-toolkit/utils'
import { controlWindow } from './control'
import { captionWindow } from './caption'
import { captionEngine, writeConfig } from './utils/config'
import { controlWindow } from './ControlWindow'
import { captionWindow } from './CaptionWindow'
import { allConfig } from './utils/AllConfig'
import { captionEngine } from './utils/CaptionEngine'
app.whenReady().then(() => {
electronApp.setAppUserModelId('com.himeditator.autocaption')
@@ -23,9 +24,9 @@ app.whenReady().then(() => {
})
})
app.on('will-quit', async () => {
app.on('will-quit', async () => {
captionEngine.stop()
writeConfig()
allConfig.writeConfig()
});
app.on('window-all-closed', () => {


@@ -1,9 +1,27 @@
export type UILanguage = "zh" | "en" | "ja"
export type UITheme = "light" | "dark" | "system"
export interface Controls {
engineEnabled: boolean,
sourceLang: string,
targetLang: string,
engine: 'gummy',
audio: 0 | 1,
translation: boolean,
customized: boolean,
customizedApp: string,
customizedCommand: string
}
export interface Styles {
lineBreak: number,
fontFamily: string,
fontSize: number,
fontColor: string,
background: string,
opacity: number,
showPreview: boolean,
transDisplay: boolean,
transFontFamily: string,
transFontSize: number,
@@ -18,14 +36,11 @@ export interface CaptionItem {
translation: string
}
export interface Controls {
engineEnabled: boolean,
sourceLang: string,
targetLang: string,
engine: string,
audio: 0 | 1,
translation: boolean,
customized: boolean,
customizedApp: string,
customizedCommand: string
}
export interface FullConfig {
uiLanguage: UILanguage,
uiTheme: UITheme,
leftBarWidth: number,
styles: Styles,
controls: Controls,
captionLog: CaptionItem[]
}
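
`UILanguage` and `UITheme` are compile-time unions, so a value read from `config.json` is not checked against them at runtime; `readConfig` in `AllConfig.ts` assigns whatever truthy value it finds. A possible runtime guard is sketched below. The `isUILanguage` and `isUITheme` helpers are hypothetical and not part of this diff; the fallbacks mirror the `'zh'` and `'system'` defaults of `AllConfig`.

```typescript
type UILanguage = "zh" | "en" | "ja"
type UITheme = "light" | "dark" | "system"

const UI_LANGUAGES: readonly string[] = ["zh", "en", "ja"]
const UI_THEMES: readonly string[] = ["light", "dark", "system"]

// Hypothetical type guards: narrow an unknown value to the union types.
function isUILanguage(value: unknown): value is UILanguage {
  return typeof value === "string" && UI_LANGUAGES.indexOf(value) !== -1
}
function isUITheme(value: unknown): value is UITheme {
  return typeof value === "string" && UI_THEMES.indexOf(value) !== -1
}

// Values as they might come out of JSON.parse on a hand-edited config.json.
const raw: Record<string, unknown> = JSON.parse('{"uiLanguage":"fr","uiTheme":"dark"}')
const langValue = raw["uiLanguage"]
const themeValue = raw["uiTheme"]

// Invalid values fall back to the defaults instead of leaking into the UI.
const uiLanguage: UILanguage = isUILanguage(langValue) ? langValue : "zh"
const uiTheme: UITheme = isUITheme(themeValue) ? themeValue : "system"
console.log(uiLanguage, uiTheme)
```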

148
src/main/utils/AllConfig.ts Normal file

@@ -0,0 +1,148 @@
import {
UILanguage, UITheme, Styles, Controls,
CaptionItem, FullConfig
} from '../types'
import { app, BrowserWindow } from 'electron'
import * as path from 'path'
import * as fs from 'fs'
const defaultStyles: Styles = {
lineBreak: 1,
fontFamily: 'sans-serif',
fontSize: 24,
fontColor: '#000000',
background: '#dbe2ef',
opacity: 80,
showPreview: true,
transDisplay: true,
transFontFamily: 'sans-serif',
transFontSize: 24,
transFontColor: '#000000'
};
const defaultControls: Controls = {
sourceLang: 'en',
targetLang: 'zh',
engine: 'gummy',
audio: 0,
engineEnabled: false,
translation: true,
customized: false,
customizedApp: '',
customizedCommand: ''
};
class AllConfig {
uiLanguage: UILanguage = 'zh';
leftBarWidth: number = 8;
uiTheme: UITheme = 'system';
styles: Styles = {...defaultStyles};
controls: Controls = {...defaultControls};
captionLog: CaptionItem[] = [];
constructor() {}
public readConfig() {
const configPath = path.join(app.getPath('userData'), 'config.json')
if(fs.existsSync(configPath)){
const config = JSON.parse(fs.readFileSync(configPath, 'utf-8'))
if(config.uiLanguage) this.uiLanguage = config.uiLanguage
if(config.uiTheme) this.uiTheme = config.uiTheme
if(config.leftBarWidth) this.leftBarWidth = config.leftBarWidth
if(config.styles) this.setStyles(config.styles)
if(config.controls) this.setControls(config.controls)
console.log('[INFO] Read Config from:', configPath)
}
}
public writeConfig() {
const config = {
uiLanguage: this.uiLanguage,
uiTheme: this.uiTheme,
leftBarWidth: this.leftBarWidth,
controls: this.controls,
styles: this.styles
}
const configPath = path.join(app.getPath('userData'), 'config.json')
fs.writeFileSync(configPath, JSON.stringify(config, null, 2))
console.log('[INFO] Write Config to:', configPath)
}
public getFullConfig(): FullConfig {
return {
uiLanguage: this.uiLanguage,
uiTheme: this.uiTheme,
leftBarWidth: this.leftBarWidth,
styles: this.styles,
controls: this.controls,
captionLog: this.captionLog
}
}
public setStyles(args: Styles) {
for(let key in this.styles) {
if(key in args) {
this.styles[key] = args[key]
}
}
console.log('[INFO] Set Styles:', this.styles)
}
public resetStyles() {
this.setStyles(defaultStyles)
}
public sendStyles(window: BrowserWindow) {
window.webContents.send('both.styles.set', this.styles)
console.log(`[INFO] Send Styles to #${window.id}:`, this.styles)
}
public setControls(args: Controls) {
const engineEnabled = this.controls.engineEnabled
for(let key in this.controls){
if(key in args) {
this.controls[key] = args[key]
}
}
this.controls.engineEnabled = engineEnabled
console.log('[INFO] Set Controls:', this.controls)
}
public sendControls(window: BrowserWindow) {
window.webContents.send('control.controls.set', this.controls)
console.log(`[INFO] Send Controls to #${window.id}:`, this.controls)
}
public updateCaptionLog(log: CaptionItem) {
let command: 'add' | 'upd' = 'add'
if(
this.captionLog.length &&
this.captionLog[this.captionLog.length - 1].index === log.index &&
this.captionLog[this.captionLog.length - 1].time_s === log.time_s
) {
this.captionLog.splice(this.captionLog.length - 1, 1, log)
command = 'upd'
}
else {
this.captionLog.push(log)
}
for(const window of BrowserWindow.getAllWindows()){
this.sendCaptionLog(window, command)
}
}
public sendCaptionLog(window: BrowserWindow, command: 'add' | 'upd' | 'set') {
if(command === 'add'){
window.webContents.send(`both.captionLog.add`, this.captionLog[this.captionLog.length - 1])
}
else if(command === 'upd'){
window.webContents.send(`both.captionLog.upd`, this.captionLog[this.captionLog.length - 1])
}
else if(command === 'set'){
window.webContents.send(`both.captionLog.set`, this.captionLog)
}
}
}
export const allConfig = new AllConfig()
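
`setStyles` and `setControls` copy only the keys that already exist on the stored object, so extra fields in the incoming data are ignored and missing fields keep their current value. The same idea can be written as one reusable helper. This is a sketch under assumptions: `mergeKnownKeys` and the reduced `PreviewStyles` type are hypothetical names used for illustration only.

```typescript
// Copies values from `source` into `target`, but only for keys `target` already has.
function mergeKnownKeys<T extends Record<string, unknown>>(
  target: T,
  source: Record<string, unknown>
): T {
  const out: Record<string, unknown> = target // same object, looser view for writing
  for (const key of Object.keys(target)) {
    if (key in source) {
      out[key] = source[key]
    }
  }
  return target
}

// Reduced stand-in for the Styles interface (three of its fields).
type PreviewStyles = { fontSize: number; fontColor: string; opacity: number }

const defaults: PreviewStyles = { fontSize: 24, fontColor: "#000000", opacity: 80 }
const styles: PreviewStyles = { ...defaults }

// 'legacyField' is not a known key and is dropped; 'fontColor' and 'opacity' are untouched.
mergeKnownKeys(styles, { fontSize: 32, legacyField: true })
console.log(styles)
```

Like the loop in the diff, this does not validate value types; a string written into `fontSize` would be copied as-is.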


@@ -0,0 +1,143 @@
import { spawn, exec } from 'child_process'
import { app } from 'electron'
import { is } from '@electron-toolkit/utils'
import path from 'path'
import { controlWindow } from '../ControlWindow'
import { allConfig } from './AllConfig'
import { i18n } from '../i18n'
export class CaptionEngine {
appPath: string = ''
command: string[] = []
process: any | undefined
processStatus: 'running' | 'stopping' | 'stopped' = 'stopped'
private getApp(): boolean {
if (allConfig.controls.customized && allConfig.controls.customizedApp) {
this.appPath = allConfig.controls.customizedApp
this.command = [allConfig.controls.customizedCommand]
}
else if (allConfig.controls.engine === 'gummy') {
allConfig.controls.customized = false
if(!process.env.DASHSCOPE_API_KEY) {
controlWindow.sendErrorMessage(i18n('gummy.env.missing'))
return false
}
let gummyName = ''
if (process.platform === 'win32') {
gummyName = 'main-gummy.exe'
}
else if (process.platform === 'linux') {
gummyName = 'main-gummy'
}
else {
controlWindow.sendErrorMessage(i18n('platform.unsupported') + process.platform)
throw new Error(i18n('platform.unsupported') + process.platform)
}
if (is.dev) {
this.appPath = path.join(
app.getAppPath(),
'caption-engine', 'dist', gummyName
)
}
else {
this.appPath = path.join(
process.resourcesPath,
'caption-engine', 'dist', gummyName
)
}
this.command = []
this.command.push('-s', allConfig.controls.sourceLang)
this.command.push(
'-t', allConfig.controls.translation ?
allConfig.controls.targetLang : 'none'
)
this.command.push('-a', allConfig.controls.audio ? '1' : '0')
console.log('[INFO] Engine Path:', this.appPath)
console.log('[INFO] Engine Command:', this.command)
}
return true
}
public start() {
if (this.processStatus !== 'stopped') {
return
}
if(!this.getApp()){ return }
try {
this.process = spawn(this.appPath, this.command)
}
catch (e) {
controlWindow.sendErrorMessage(i18n('engine.start.error') + e)
console.error('[ERROR] Error starting subprocess:', e)
return
}
this.processStatus = 'running'
console.log('[INFO] Caption Engine Started, PID:', this.process.pid)
allConfig.controls.engineEnabled = true
if(controlWindow.window){
allConfig.sendControls(controlWindow.window)
controlWindow.window.webContents.send(
'control.engine.started',
this.process.pid
)
}
this.process.stdout.on('data', (data: any) => {
const lines = data.toString().split('\n');
lines.forEach((line: string) => {
if (line.trim()) {
try {
const caption = JSON.parse(line);
allConfig.updateCaptionLog(caption);
} catch (e) {
controlWindow.sendErrorMessage(i18n('engine.output.parse.error') + e)
console.error('[ERROR] Error parsing JSON:', e);
}
}
});
});
this.process.stderr.on('data', (data: any) => {
controlWindow.sendErrorMessage(i18n('engine.error') + data)
console.error(`[ERROR] Subprocess Error: ${data}`);
});
this.process.on('close', (code: any) => {
console.log(`[INFO] Subprocess exited with code ${code}`);
this.process = undefined;
allConfig.controls.engineEnabled = false
if(controlWindow.window){
allConfig.sendControls(controlWindow.window)
controlWindow.window.webContents.send('control.engine.stopped')
}
this.processStatus = 'stopped'
console.log('[INFO] Caption engine process stopped')
});
}
public stop() {
if(this.processStatus !== 'running') return
if (this.process) {
console.log('[INFO] Trying to stop process, PID:', this.process.pid)
if (process.platform === "win32" && this.process.pid) {
exec(`taskkill /pid ${this.process.pid} /t /f`, (error) => {
if (error) {
controlWindow.sendErrorMessage(i18n('engine.shutdown.error') + error)
console.error(`[ERROR] Failed to kill process: ${error}`)
}
});
} else {
this.process.kill('SIGKILL');
}
}
this.processStatus = 'stopping'
console.log('[INFO] Caption engine process stopping')
}
}
export const captionEngine = new CaptionEngine()
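
The `stdout` handler above splits each chunk on `'\n'` and parses every non-empty piece as JSON. A stream chunk is not guaranteed to end on a line boundary, so a caption that is split across two chunks would fail to parse. Whether this happens with the gummy engine in practice is not stated in the diff. A minimal sketch of a line buffer that holds back the incomplete tail follows; `LineBuffer` is a hypothetical helper, not part of this change.

```typescript
class LineBuffer {
  private pending = ""

  // Appends a chunk and returns the complete, non-empty lines it finished.
  push(chunk: string): string[] {
    this.pending += chunk
    const parts = this.pending.split("\n")
    // The last piece is either "" (chunk ended on a newline) or an unfinished line.
    this.pending = parts.pop() ?? ""
    return parts.map((line) => line.trim()).filter((line) => line.length > 0)
  }
}

const buffer = new LineBuffer()
const captions: unknown[] = []

// Simulated chunks: both JSON lines are cut in the middle.
const chunks = ['{"index":1,"te', 'xt":"hello"}\n{"index":2,', '"text":"world"}\n']
for (const chunk of chunks) {
  for (const line of buffer.push(chunk)) {
    captions.push(JSON.parse(line))
  }
}
console.log(captions.length)
```

In the handler this would mean calling `push(data.toString())` and passing each returned line to `JSON.parse`, with the existing `try`/`catch` kept around the parse.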


@@ -1,123 +0,0 @@
import { Styles, CaptionItem, Controls } from '../types'
import { app, BrowserWindow } from 'electron'
import { CaptionEngine } from './engine'
import * as path from 'path'
import * as fs from 'fs'
export const captionEngine = new CaptionEngine()
export const styles: Styles = {
fontFamily: 'sans-serif',
fontSize: 24,
fontColor: '#000000',
background: '#dbe2ef',
opacity: 80,
transDisplay: true,
transFontFamily: 'sans-serif',
transFontSize: 24,
transFontColor: '#000000'
}
export const captionLog: CaptionItem[] = []
export const controls: Controls = {
sourceLang: 'en',
targetLang: 'zh',
engine: 'gummy',
audio: 0,
engineEnabled: false,
translation: true,
customized: false,
customizedApp: '',
customizedCommand: ''
}
export function setStyles(args: any) {
styles.fontFamily = args.fontFamily
styles.fontSize = args.fontSize
styles.fontColor = args.fontColor
styles.background = args.background
styles.opacity = args.opacity
styles.transDisplay = args.transDisplay
styles.transFontFamily = args.transFontFamily
styles.transFontSize = args.transFontSize
styles.transFontColor = args.transFontColor
console.log('[INFO] Set Styles:', styles)
}
export function resetStyles() {
setStyles({
fontFamily: 'sans-serif',
fontSize: 24,
fontColor: '#000000',
background: '#dbe2ef',
opacity: 80,
transDisplay: true,
transFontFamily: 'sans-serif',
transFontSize: 24,
transFontColor: '#000000'
})
}
export function sendStyles(window: BrowserWindow) {
window.webContents.send('caption.style.set', styles)
console.log(`[INFO] Send Styles to #${window.id}:`, styles)
}
export function sendCaptionLog(window: BrowserWindow, command: string) {
if(command === 'add'){
window.webContents.send(`both.log.add`, captionLog[captionLog.length - 1])
}
else if(command === 'set'){
window.webContents.send(`both.log.${command}`, captionLog)
}
}
export function addCaptionLog(log: CaptionItem) {
if(captionLog.length && captionLog[captionLog.length - 1].index === log.index) {
captionLog.splice(captionLog.length - 1, 1, log)
}
else {
captionLog.push(log)
}
for(const window of BrowserWindow.getAllWindows()){
sendCaptionLog(window, 'add')
}
}
export function setControls(args: any) {
controls.sourceLang = args.sourceLang
controls.targetLang = args.targetLang
controls.engine = args.engine
controls.audio = args.audio
controls.translation = args.translation
controls.customized = args.customized
controls.customizedApp = args.customizedApp
controls.customizedCommand = args.customizedCommand
console.log('[INFO] Set Controls:', controls)
}
export function sendControls(window: BrowserWindow) {
window.webContents.send('control.control.set', controls)
console.log(`[INFO] Send Controls to #${window.id}:`, controls)
}
export function readConfig() {
const configPath = path.join(app.getPath('userData'), 'config.json')
if(fs.existsSync(configPath)){
const config = JSON.parse(fs.readFileSync(configPath, 'utf-8'))
setStyles(config.styles)
setControls(config.controls)
console.log('[INFO] Read Config from:', configPath)
}
}
export function writeConfig() {
const config = {
controls: controls,
styles: styles
}
const configPath = path.join(app.getPath('userData'), 'config.json')
fs.writeFileSync(configPath, JSON.stringify(config, null, 2))
console.log('[INFO] Write Config to:', configPath)
}
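
The removed `addCaptionLog` replaced the last entry whenever the `index` matched and always broadcast `'add'`. Its replacement, `updateCaptionLog` in `AllConfig.ts`, also compares `time_s` and reports `'upd'` for an in-place update. The decision logic on its own can be sketched as below; `upsertCaption` is a hypothetical name, and the field types of `CaptionItem` (in particular `time_s` as a string) are assumptions because the full interface is not shown in this diff.

```typescript
// Reduced stand-in for CaptionItem; field types are assumed.
interface CaptionItem {
  index: number
  time_s: string
  text: string
  translation: string
}

// Replaces the last entry if it is the same caption (same index and start time),
// otherwise appends. Returns which command should be broadcast to the windows.
function upsertCaption(log: CaptionItem[], item: CaptionItem): "add" | "upd" {
  const last = log[log.length - 1]
  if (last !== undefined && last.index === item.index && last.time_s === item.time_s) {
    log[log.length - 1] = item
    return "upd"
  }
  log.push(item)
  return "add"
}

const log: CaptionItem[] = []
const commands = [
  upsertCaption(log, { index: 1, time_s: "00:00:01", text: "Hel", translation: "" }),
  upsertCaption(log, { index: 1, time_s: "00:00:01", text: "Hello", translation: "你好" }),
  // Same index but a different start time is treated as a new caption.
  upsertCaption(log, { index: 1, time_s: "00:05:00", text: "Again", translation: "" })
]
console.log(commands, log.length)
```

Under the old index-only check, the third call would have overwritten the second entry.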


@@ -1,3 +0,0 @@
class configSave {
}


@@ -1,122 +0,0 @@
import { spawn, exec } from 'child_process'
import { app, BrowserWindow } from 'electron'
import { is } from '@electron-toolkit/utils'
import path from 'path'
import { addCaptionLog, controls, sendControls } from './config'
import { controlWindow } from '../control'
export class CaptionEngine {
appPath: string = ''
command: string[] = []
process: any | undefined
private getApp() {
if (controls.customized && controls.customizedApp) {
this.appPath = controls.customizedApp
this.command = [controls.customizedCommand]
}
else if (controls.engine === 'gummy') {
controls.customized = false
let gummyName = ''
if (process.platform === 'win32') {
gummyName = 'main-gummy.exe'
}
else if (process.platform === 'linux') {
gummyName = 'main-gummy'
}
else {
controlWindow.sendErrorMessage('不支持的操作系统平台:' + process.platform)
throw new Error('Unsupported platform')
}
if (is.dev) {
this.appPath = path.join(
app.getAppPath(),
'python-subprocess', 'dist', gummyName
)
}
else {
this.appPath = path.join(
process.resourcesPath,
'python-subprocess', 'dist', gummyName
)
}
this.command = []
this.command.push('-s', controls.sourceLang)
this.command.push('-t', controls.translation ? controls.targetLang : 'none')
this.command.push('-a', controls.audio ? '1' : '0')
console.log('[INFO] Engine Path:', this.appPath)
console.log('[INFO] Engine Command:', this.command)
}
}
public start(window: BrowserWindow) {
if (this.process) {
this.stop();
}
this.getApp()
try {
this.process = spawn(this.appPath, this.command)
}
catch (e) {
controlWindow.sendErrorMessage('字幕引擎启动失败:' + e)
console.error('[ERROR] Error starting subprocess:', e)
return
}
console.log('[INFO] Caption Engine Started: ', {
appPath: this.appPath,
command: this.command
})
controls.engineEnabled = true
sendControls(window)
window.webContents.send('control.engine.started')
this.process.stdout.on('data', (data) => {
const lines = data.toString().split('\n');
lines.forEach((line: string) => {
if (line.trim()) {
try {
const caption = JSON.parse(line);
addCaptionLog(caption);
} catch (e) {
controlWindow.sendErrorMessage('字幕引擎输出内容无法解析为 JSON 对象:' + e)
console.error('[ERROR] Error parsing JSON:', e);
}
}
});
});
this.process.stderr.on('data', (data) => {
controlWindow.sendErrorMessage('字幕引擎错误:' + data)
console.error(`[ERROR] Subprocess Error: ${data}`);
});
this.process.on('close', (code: any) => {
console.log(`[INFO] Subprocess exited with code ${code}`);
this.process = undefined;
controls.engineEnabled = false
sendControls(window)
});
}
public stop() {
if (this.process) {
if (process.platform === "win32" && this.process.pid) {
exec(`taskkill /pid ${this.process.pid} /t /f`, (error) => {
if (error) {
controlWindow.sendErrorMessage('字幕引擎进程关闭失败:' + error)
console.error(`[ERROR] Failed to kill process: ${error}`);
}
});
} else {
this.process.kill('SIGKILL');
}
}
this.process = undefined;
controls.engineEnabled = false;
console.log('[INFO] Caption engine process stopped');
if(controlWindow.window) sendControls(controlWindow.window);
}
}
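
The removed engine class tracked only whether `this.process` was set: `start()` called `stop()` and spawned again immediately, and `stop()` cleared the reference before the child had exited. The new `CaptionEngine` gates both calls on `processStatus` and returns to `'stopped'` only in the `close` handler. The transitions in isolation look roughly like this; `EngineLifecycle` is a hypothetical class that models only the status field, without spawning anything.

```typescript
type ProcessStatus = "running" | "stopping" | "stopped"

class EngineLifecycle {
  status: ProcessStatus = "stopped"

  // A start is only accepted when nothing is running or shutting down.
  start(): boolean {
    if (this.status !== "stopped") return false
    this.status = "running"
    return true
  }

  // A stop is only accepted while running; the process is not gone yet.
  stop(): boolean {
    if (this.status !== "running") return false
    this.status = "stopping"
    return true
  }

  // Called from the child's 'close' event: now a new start is allowed.
  onClose(): void {
    this.status = "stopped"
  }
}

const lifecycle = new EngineLifecycle()
const results = [
  lifecycle.start(), // accepted
  lifecycle.start(), // rejected: already running
  lifecycle.stop(),  // accepted
  lifecycle.start()  // rejected: still stopping
]
lifecycle.onClose()
results.push(lifecycle.start()) // accepted again after 'close'
console.log(results, lifecycle.status)
```

Rejecting `start()` during `'stopping'` is what prevents a second child from being spawned while the first one is still shutting down.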


@@ -3,4 +3,21 @@
</template>
<script setup lang="ts">
import { onMounted } from 'vue'
import { FullConfig } from './types'
import { useCaptionLogStore } from './stores/captionLog'
import { useCaptionStyleStore } from './stores/captionStyle'
import { useEngineControlStore } from './stores/engineControl'
import { useGeneralSettingStore } from './stores/generalSetting'
onMounted(() => {
window.electron.ipcRenderer.invoke('both.window.mounted').then((data: FullConfig) => {
useGeneralSettingStore().uiLanguage = data.uiLanguage
useGeneralSettingStore().uiTheme = data.uiTheme
useGeneralSettingStore().leftBarWidth = data.leftBarWidth
useCaptionStyleStore().setStyles(data.styles)
useEngineControlStore().setControls(data.controls)
useCaptionLogStore().captionData = data.captionLog
})
})
</script>


@@ -0,0 +1,27 @@
.input-item {
margin: 10px 0;
}
.input-label {
display: inline-block;
width: 80px;
text-align: right;
margin-right: 10px;
}
.switch-label {
display: inline-block;
margin-right: 10px;
}
.input-area {
width: calc(100% - 100px);
min-width: 100px;
}
.input-item-value {
width: 80px;
text-align: right;
font-size: 12px;
color: var(--tag-color)
}


@@ -0,0 +1,12 @@
:root {
--control-background: #fff;
--tag-color: rgba(0, 0, 0, 0.45);
--icon-color: rgba(0, 0, 0, 0.88);
}
body {
margin: 0;
padding: 0;
height: 100vh;
overflow: hidden;
}


@@ -1,6 +0,0 @@
body {
margin: 0;
padding: 0;
height: 100vh;
overflow: hidden;
}


@@ -1,165 +0,0 @@
<template>
<div style="height: 20px;"></div>
<a-card size="small" title="字幕控制">
<template #extra>
<a @click="applyChange">更改设置</a> |
<a @click="cancelChange">取消更改</a>
</template>
<div class="control-item">
<span class="control-label">源语言</span>
<a-select
class="control-input"
v-model:value="currentSourceLang"
:options="langList"
></a-select>
</div>
<div class="control-item">
<span class="control-label">翻译语言</span>
<a-select
class="control-input"
v-model:value="currentTargetLang"
:options="langList.filter((item) => item.value !== 'auto')"
></a-select>
</div>
<div class="control-item">
<span class="control-label">字幕引擎</span>
<a-select
class="control-input"
v-model:value="currentEngine"
:options="captionEngine"
></a-select>
</div>
<div class="control-item">
<span class="control-label">音频选择</span>
<a-select
class="control-input"
v-model:value="currentAudio"
:options="audioType"
></a-select>
</div>
<div class="control-item">
<span class="control-label">启用翻译</span>
<a-switch v-model:checked="currentTranslation" />
<span class="control-label">自定义引擎</span>
<a-switch v-model:checked="currentCustomized" />
</div>
<div v-show="currentCustomized">
<a-card size="small" title="自定义字幕引擎">
<p class="customize-note">说明允许用户使用自定义字幕引擎提供字幕提供的引擎要能通过 <code>child_process.spawn()</code> 进行启动且需要通过 IPC 与项目 node.js 后端进行通信具体通信接口见后端实现</p>
<div class="control-item">
<span class="control-label">引擎路径</span>
<a-input
class="control-input"
v-model:value="currentCustomizedApp"
></a-input>
</div>
<div class="control-item">
<span class="control-label">引擎指令</span>
<a-input
class="control-input"
v-model:value="currentCustomizedCommand"
></a-input>
</div>
</a-card>
</div>
</a-card>
<div style="height: 20px;"></div>
</template>
<script setup lang="ts">
import { ref, computed, watch } from 'vue'
import { storeToRefs } from 'pinia'
import { useCaptionControlStore } from '@renderer/stores/captionControl'
import { notification } from 'ant-design-vue'
const captionControl = useCaptionControlStore()
const { captionEngine, audioType, changeSignal } = storeToRefs(captionControl)
const currentSourceLang = ref('auto')
const currentTargetLang = ref('zh')
const currentEngine = ref('gummy')
const currentAudio = ref<0 | 1>(0)
const currentTranslation = ref<boolean>(false)
const currentCustomized = ref<boolean>(false)
const currentCustomizedApp = ref('')
const currentCustomizedCommand = ref('')
const langList = computed(() => {
for(let item of captionEngine.value){
if(item.value === currentEngine.value) {
return item.languages
}
}
return []
})
function applyChange(){
captionControl.sourceLang = currentSourceLang.value
captionControl.targetLang = currentTargetLang.value
captionControl.engine = currentEngine.value
captionControl.audio = currentAudio.value
captionControl.translation = currentTranslation.value
captionControl.customized = currentCustomized.value
captionControl.customizedApp = currentCustomizedApp.value
captionControl.customizedCommand = currentCustomizedCommand.value
captionControl.sendControlChange()
notification.open({
message: '字幕控制已更改',
description: '如果字幕引擎已经启动,需要关闭后重启才会生效'
});
}
function cancelChange(){
currentSourceLang.value = captionControl.sourceLang
currentTargetLang.value = captionControl.targetLang
currentEngine.value = captionControl.engine
currentAudio.value = captionControl.audio
currentTranslation.value = captionControl.translation
currentCustomized.value = captionControl.customized
currentCustomizedApp.value = captionControl.customizedApp
currentCustomizedCommand.value = captionControl.customizedCommand
}
watch(changeSignal, (val) => {
if(val == true) {
cancelChange();
captionControl.changeSignal = false;
}
})
</script>
<style scoped>
.control-item {
margin: 10px 0;
}
.control-label {
display: inline-block;
width: 80px;
text-align: right;
margin-right: 10px;
}
.customize-note {
padding: 0 20px;
color: red;
font-size: 12px;
}
.control-input {
width: calc(100% - 100px);
min-width: 100px;
}
.control-item-value {
width: 80px;
text-align: right;
font-size: 12px;
color: #666
}
</style>


@@ -1,50 +1,22 @@
<template>
<div class="caption-stat">
<a-row>
<a-col :span="6">
<a-statistic title="字幕引擎" :value="(customized && customizedApp)?'自定义':engine" />
</a-col>
<a-col :span="6">
<a-statistic title="字幕引擎状态" :value="engineEnabled?'已启动':'未启动'" />
</a-col>
<a-col :span="6">
<a-statistic title="已记录字幕" :value="captionData.length" />
</a-col>
</a-row>
</div>
<div class="caption-control">
<a-button
type="primary"
class="control-button"
@click="openCaptionWindow"
>打开字幕窗口</a-button>
<a-button
class="control-button"
@click="captionControl.startEngine"
>启动字幕引擎</a-button>
<a-button
danger class="control-button"
@click="captionControl.stopEngine"
>关闭字幕引擎</a-button>
</div>
<div class="caption-list">
<div class="caption-title">
<span style="margin-right: 30px;">字幕记录</span>
<div>
<a-app class="caption-title">
<span style="margin-right: 30px;">{{ $t('log.title') }}</span>
</a-app>
<a-button
type="primary"
style="margin-right: 20px;"
@click="exportCaptions"
:disabled="captionData.length === 0"
>
导出字幕记录
{{ $t('log.export') }}
</a-button>
<a-button
danger
@click="clearCaptions"
>
清空字幕记录
{{ $t('log.clear') }}
</a-button>
</div>
<a-table
@@ -77,17 +49,14 @@
import { ref } from 'vue'
import { storeToRefs } from 'pinia'
import { useCaptionLogStore } from '@renderer/stores/captionLog'
import { useCaptionControlStore } from '@renderer/stores/captionControl'
const captionLog = useCaptionLogStore()
const { captionData } = storeToRefs(captionLog)
const captionControl = useCaptionControlStore()
const { engineEnabled, engine, customized, customizedApp } = storeToRefs(captionControl)
const pagination = ref({
current: 1,
pageSize: 10,
showSizeChanger: true,
pageSizeOptions: ['10', '20', '50'],
showTotal: (total: number) => ` ${total} 条记录`,
showTotal: (total: number) => `Total: ${total}`,
onChange: (page: number, pageSize: number) => {
pagination.value.current = page
pagination.value.pageSize = pageSize
@@ -100,28 +69,24 @@ const pagination = ref({
const columns = [
{
title: '序号',
title: 'index',
dataIndex: 'index',
key: 'index',
width: 80,
},
{
title: '时间',
title: 'time',
dataIndex: 'time',
key: 'time',
width: 160,
},
{
title: '字幕内容',
title: 'content',
dataIndex: 'content',
key: 'content',
},
]
function openCaptionWindow() {
window.electron.ipcRenderer.send('control.captionWindow.activate')
}
function exportCaptions() {
const jsonData = JSON.stringify(captionData.value, null, 2)
const blob = new Blob([jsonData], { type: 'application/json' })
@@ -142,27 +107,14 @@ function clearCaptions() {
</script>
<style scoped>
.caption-control {
display: flex;
flex-wrap: wrap;
justify-content: center;
margin: 30px;
}
.control-button {
height: 40px;
margin: 20px;
font-size: 16px;
}
.caption-list {
background: #fff;
padding: 20px;
border-radius: 8px;
box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
}
.caption-title {
display: inline-block;
font-size: 24px;
font-weight: bold;
margin-bottom: 10px;
@@ -189,14 +141,12 @@ function clearCaptions() {
.caption-text {
font-size: 16px;
color: #333;
margin-bottom: 4px;
}
.caption-translation {
font-size: 14px;
color: #666;
padding-left: 16px;
border-left: 3px solid #1890ff;
}
</style>
</style>


@@ -1,124 +1,141 @@
<template>
<a-card size="small" title="字幕样式设置">
<a-card size="small" :title="$t('style.title')">
<template #extra>
<a @click="applyStyle">应用样式</a> |
<a @click="backStyle">取消更改</a> |
<a @click="resetStyle">恢复默认</a>
<a @click="applyStyle">{{ $t('style.applyStyle') }}</a> |
<a @click="backStyle">{{ $t('style.cancelChange') }}</a> |
<a @click="resetStyle">{{ $t('style.resetStyle') }}</a>
</template>
<div class="style-item">
<span class="style-label">字体族</span>
<a-input
class="style-input"
v-model:value="currentFontFamily"
/>
<div class="input-item">
<span class="input-label">{{ $t('style.longCaption') }}</span>
<a-select
class="input-area"
v-model:value="currentLineBreak"
:options="captionStyle.iBreakOptions"
></a-select>
</div>
<div class="style-item">
<span class="style-label">字体颜色</span>
<div class="input-item">
<span class="input-label">{{ $t('style.fontFamily') }}</span>
<a-input
class="style-input"
class="input-area"
v-model:value="currentFontFamily"
/>
</div>
<div class="input-item">
<span class="input-label">{{ $t('style.fontColor') }}</span>
<a-input
class="input-area"
type="color"
v-model:value="currentFontColor"
/>
<div class="style-item-value">{{ currentFontColor }}</div>
<div class="input-item-value">{{ currentFontColor }}</div>
</div>
<div class="style-item">
<span class="style-label">字体大小</span>
<div class="input-item">
<span class="input-label">{{ $t('style.fontSize') }}</span>
<a-input
class="style-input"
class="input-area"
type="range"
min="0" max="64"
v-model:value="currentFontSize"
/>
<div class="style-item-value">{{ currentFontSize }}px</div>
/>
<div class="input-item-value">{{ currentFontSize }}px</div>
</div>
<div class="style-item">
<span class="style-label">背景颜色</span>
<div class="input-item">
<span class="input-label">{{ $t('style.background') }}</span>
<a-input
class="style-input"
class="input-area"
type="color"
v-model:value="currentBackground"
/>
<div class="style-item-value">{{ currentBackground }}</div>
<div class="input-item-value">{{ currentBackground }}</div>
</div>
<div class="style-item">
<span class="style-label">背景透明度</span>
<div class="input-item">
<span class="input-label">{{ $t('style.opacity') }}</span>
<a-input
class="style-input"
class="input-area"
type="range"
min="0"
max="100"
v-model:value="currentOpacity"
/>
<div class="style-item-value">{{ currentOpacity }}</div>
<div class="input-item-value">{{ currentOpacity }}%</div>
</div>
<div class="style-item">
<span class="style-label">显示预览</span>
<a-switch v-model:checked="displayPreview" />
<span class="style-label">显示翻译</span>
<a-switch v-model:checked="currentTransDisplay" />
<div class="input-item">
<span class="input-label">{{ $t('style.preview') }}</span>
<a-switch v-model:checked="currentPreview" />
<span style="display:inline-block;width:20px;"></span>
<div style="display: inline-block;">
<span class="switch-label">{{ $t('style.translation') }}</span>
<a-switch v-model:checked="currentTransDisplay" />
</div>
</div>
<div v-show="currentTransDisplay">
<a-card size="small" title="翻译样式设置">
<a-card size="small" :title="$t('style.trans.title')">
<template #extra>
<a @click="useSameStyle">使用相同样式</a>
<a @click="useSameStyle">{{ $t('style.trans.useSame') }}</a>
</template>
<div class="style-item">
<span class="style-label">翻译字体</span>
<div class="input-item">
<span class="input-label">{{ $t('style.fontFamily') }}</span>
<a-input
class="style-input"
class="input-area"
v-model:value="currentTransFontFamily"
/>
/>
</div>
<div class="style-item">
<span class="style-label">翻译颜色</span>
<div class="input-item">
<span class="input-label">{{ $t('style.fontColor') }}</span>
<a-input
class="style-input"
class="input-area"
type="color"
v-model:value="currentTransFontColor"
/>
<div class="style-item-value">{{ currentTransFontColor }}</div>
<div class="input-item-value">{{ currentTransFontColor }}</div>
</div>
<div class="style-item">
<span class="style-label">翻译大小</span>
<div class="input-item">
<span class="input-label">{{ $t('style.fontSize') }}</span>
<a-input
class="style-input"
class="input-area"
type="range"
min="0" max="64"
v-model:value="currentTransFontSize"
/>
<div class="style-item-value">{{ currentTransFontSize }}px</div>
/>
<div class="input-item-value">{{ currentTransFontSize }}px</div>
</div>
</a-card>
</div>
</a-card>
<Teleport to="body">
<div
v-if="displayPreview"
v-if="currentPreview"
class="preview-container"
:style="{
backgroundColor: addOpicityToColor(currentBackground, currentOpacity)
}"
>
<p class="preview-caption"
<p :class="[captionStyle.lineBreak?'':'left-ellipsis']"
:style="{
fontFamily: currentFontFamily,
fontSize: currentFontSize + 'px',
color: currentFontColor
}">
{{ "This is a preview of subtitle styles." }}
}">
<span v-if="captionData.length">{{ captionData[captionData.length-1].text }}</span>
<span v-else>{{ $t('example.original') }}</span>
</p>
<p class="preview-translation" v-if="currentTransDisplay"
<p :class="[captionStyle.lineBreak?'':'left-ellipsis']"
v-if="currentTransDisplay"
:style="{
fontFamily: currentTransFontFamily,
fontSize: currentTransFontSize + 'px',
color: currentTransFontColor
color: currentTransFontColor
}"
>这是字幕样式预览(翻译)</p>
</div>
>
<span v-if="captionData.length">{{ captionData[captionData.length-1].translation }}</span>
<span v-else>{{ $t('example.translation') }}</span>
</p>
</div>
</Teleport>
</template>
@@ -127,20 +144,29 @@
import { ref, watch } from 'vue'
import { useCaptionStyleStore } from '@renderer/stores/captionStyle'
import { storeToRefs } from 'pinia'
import { notification } from 'ant-design-vue'
import { useI18n } from 'vue-i18n'
import { useCaptionLogStore } from '@renderer/stores/captionLog';
const captionLog = useCaptionLogStore();
const { captionData } = storeToRefs(captionLog);
const { t } = useI18n()
const captionStyle = useCaptionStyleStore()
const { changeSignal } = storeToRefs(captionStyle)
const currentLineBreak = ref<number>(0)
const currentFontFamily = ref<string>('sans-serif')
const currentFontSize = ref<number>(24)
const currentFontColor = ref<string>('#000000')
const currentBackground = ref<string>('#dbe2ef')
const currentOpacity = ref<number>(50)
const currentPreview = ref<boolean>(true)
const currentTransDisplay = ref<boolean>(true)
const currentTransFontFamily = ref<string>('sans-serif')
const currentTransFontSize = ref<number>(24)
const currentTransFontColor = ref<string>('#000000')
const displayPreview = ref<boolean>(true)
function addOpicityToColor(color: string, opicity: number) {
const opicityValue = Math.round(opicity * 255 / 100);
@@ -154,36 +180,43 @@ function useSameStyle(){
currentTransFontColor.value = currentFontColor.value;
}
function applyStyle(){
function applyStyle(){
captionStyle.lineBreak = currentLineBreak.value;
captionStyle.fontFamily = currentFontFamily.value;
captionStyle.fontSize = currentFontSize.value;
captionStyle.fontColor = currentFontColor.value;
captionStyle.background = currentBackground.value;
captionStyle.opacity = currentOpacity.value;
captionStyle.showPreview = currentPreview.value;
captionStyle.transDisplay = currentTransDisplay.value;
captionStyle.transFontFamily = currentTransFontFamily.value;
captionStyle.transFontSize = currentTransFontSize.value;
captionStyle.transFontColor = currentTransFontColor.value;
captionStyle.sendStyleChange();
captionStyle.sendStylesChange();
notification.open({
message: t('noti.styleChange'),
description: t('noti.styleInfo')
});
}
function backStyle(){
currentLineBreak.value = captionStyle.lineBreak;
currentFontFamily.value = captionStyle.fontFamily;
currentFontSize.value = captionStyle.fontSize;
currentFontColor.value = captionStyle.fontColor;
currentBackground.value = captionStyle.background;
currentOpacity.value = captionStyle.opacity;
currentPreview.value = captionStyle.showPreview;
currentTransDisplay.value = captionStyle.transDisplay;
currentTransFontFamily.value = captionStyle.transFontFamily;
currentTransFontSize.value = captionStyle.transFontSize;
currentTransFontColor.value = captionStyle.transFontColor;
}
function resetStyle() {
captionStyle.sendStyleReset();
function resetStyle() {
captionStyle.sendStylesReset();
}
watch(changeSignal, (val) => {
@@ -195,33 +228,7 @@ watch(changeSignal, (val) => {
</script>
<style scoped>
.caption-button {
display: flex;
justify-content: center;
}
.style-item {
margin: 10px 0;
}
.style-label {
display: inline-block;
width: 80px;
text-align: right;
margin-right: 10px;
}
.style-input {
width: calc(100% - 100px);
min-width: 100px;
}
.style-item-value {
width: 80px;
text-align: right;
font-size: 12px;
color: #666
}
@import url(../assets/input.css);
.preview-container {
line-height: 2em;
@@ -236,7 +243,20 @@ watch(changeSignal, (val) => {
}
.preview-container p {
text-align: center;
margin: 0;
line-height: 1.5em;
}
</style>
.left-ellipsis {
white-space: nowrap;
overflow: hidden;
direction: rtl;
text-align: left;
}
.left-ellipsis > span {
direction: ltr;
display: inline-block;
}
</style>

@@ -0,0 +1,158 @@
<template>
<div style="height: 20px;"></div>
<a-card size="small" :title="$t('engine.title')">
<template #extra>
<a @click="applyChange">{{ $t('engine.applyChange') }}</a> |
<a @click="cancelChange">{{ $t('engine.cancelChange') }}</a>
</template>
<div class="input-item">
<span class="input-label">{{ $t('engine.sourceLang') }}</span>
<a-select
class="input-area"
v-model:value="currentSourceLang"
:options="langList"
></a-select>
</div>
<div class="input-item">
<span class="input-label">{{ $t('engine.transLang') }}</span>
<a-select
class="input-area"
v-model:value="currentTargetLang"
:options="langList.filter((item) => item.value !== 'auto')"
></a-select>
</div>
<div class="input-item">
<span class="input-label">{{ $t('engine.captionEngine') }}</span>
<a-select
class="input-area"
v-model:value="currentEngine"
:options="captionEngine"
></a-select>
</div>
<div class="input-item">
<span class="input-label">{{ $t('engine.audioType') }}</span>
<a-select
class="input-area"
v-model:value="currentAudio"
:options="audioType"
></a-select>
</div>
<div class="input-item">
<span class="input-label">{{ $t('engine.enableTranslation') }}</span>
<a-switch v-model:checked="currentTranslation" />
<span style="display:inline-block;width:20px;"></span>
<div style="display: inline-block;">
<span class="switch-label">{{ $t('engine.customEngine') }}</span>
<a-switch v-model:checked="currentCustomized" />
</div>
</div>
<div v-show="currentCustomized">
<a-card size="small" :title="$t('engine.custom.title')">
<template #extra>
<a-popover>
<template #content>
<p class="customize-note">{{ $t('engine.custom.note') }}</p>
</template>
<a><InfoCircleOutlined />{{ $t('engine.custom.attention') }}</a>
</a-popover>
</template>
<div class="input-item">
<span class="input-label">{{ $t('engine.custom.app') }}</span>
<a-input
class="input-area"
v-model:value="currentCustomizedApp"
></a-input>
</div>
<div class="input-item">
<span class="input-label">{{ $t('engine.custom.command') }}</span>
<a-input
class="input-area"
v-model:value="currentCustomizedCommand"
></a-input>
</div>
</a-card>
</div>
</a-card>
<div style="height: 20px;"></div>
</template>
<script setup lang="ts">
import { ref, computed, watch } from 'vue'
import { storeToRefs } from 'pinia'
import { useEngineControlStore } from '@renderer/stores/engineControl'
import { notification } from 'ant-design-vue'
import { InfoCircleOutlined } from '@ant-design/icons-vue';
import { useI18n } from 'vue-i18n'
const { t } = useI18n()
const engineControl = useEngineControlStore()
const { captionEngine, audioType, changeSignal } = storeToRefs(engineControl)
const currentSourceLang = ref('auto')
const currentTargetLang = ref('zh')
const currentEngine = ref<'gummy'>('gummy')
const currentAudio = ref<0 | 1>(0)
const currentTranslation = ref<boolean>(false)
const currentCustomized = ref<boolean>(false)
const currentCustomizedApp = ref('')
const currentCustomizedCommand = ref('')
const langList = computed(() => {
for(let item of captionEngine.value){
if(item.value === currentEngine.value) {
return item.languages
}
}
return []
})
function applyChange(){
engineControl.sourceLang = currentSourceLang.value
engineControl.targetLang = currentTargetLang.value
engineControl.engine = currentEngine.value
engineControl.audio = currentAudio.value
engineControl.translation = currentTranslation.value
engineControl.customized = currentCustomized.value
engineControl.customizedApp = currentCustomizedApp.value
engineControl.customizedCommand = currentCustomizedCommand.value
engineControl.sendControlsChange()
notification.open({
message: t('noti.engineChange'),
description: t('noti.changeInfo')
});
}
function cancelChange(){
currentSourceLang.value = engineControl.sourceLang
currentTargetLang.value = engineControl.targetLang
currentEngine.value = engineControl.engine
currentAudio.value = engineControl.audio
currentTranslation.value = engineControl.translation
currentCustomized.value = engineControl.customized
currentCustomizedApp.value = engineControl.customizedApp
currentCustomizedCommand.value = engineControl.customizedCommand
}
watch(changeSignal, (val) => {
if(val == true) {
cancelChange();
engineControl.changeSignal = false;
}
})
</script>
<style scoped>
@import url(../assets/input.css);
.customize-note {
padding: 10px 10px 0;
color: red;
max-width: min(40vw, 480px);
}
</style>
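The language lists in the component above can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `LangOption`, `EngineOption`, `languagesFor`, and `targetLanguagesFor` are hypothetical names, and the sample data is a shortened copy of the engine table. It shows the same two steps the component performs: look up the selected engine's languages, then drop `auto` for the translation target.

```typescript
// Hypothetical stand-in types; the real store shape may differ.
interface LangOption { value: string; label: string }
interface EngineOption { value: string; label: string; languages: LangOption[] }

// Shortened sample data modelled on the engine table in this commit.
const engineOptions: EngineOption[] = [
  {
    value: 'gummy',
    label: 'Cloud - Alibaba Cloud - Gummy',
    languages: [
      { value: 'auto', label: 'Auto Detect' },
      { value: 'en', label: 'English' },
      { value: 'zh', label: 'Chinese' },
    ],
  },
]

// Languages offered by the selected engine, or an empty list if it is unknown.
function languagesFor(engine: string, options: EngineOption[]): LangOption[] {
  for (const item of options) {
    if (item.value === engine) return item.languages
  }
  return []
}

// 'auto' only makes sense for the source language, so it is removed here.
function targetLanguagesFor(engine: string, options: EngineOption[]): LangOption[] {
  return languagesFor(engine, options).filter((item) => item.value !== 'auto')
}
```

Keeping the filter in a helper rather than inline in the template would make the rule easy to test, though the inline form used in the component behaves the same way.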

@@ -0,0 +1,176 @@
<template>
<div class="caption-stat">
<a-row>
<a-col :span="6">
<a-statistic
:title="$t('status.engine')"
:value="(customized && customizedApp)?$t('status.customized'):engine"
/>
</a-col>
<a-col :span="6">
<a-statistic
:title="$t('status.status')"
:value="engineEnabled?$t('status.started'):$t('status.stopped')"
/>
</a-col>
<a-col :span="6">
<a-statistic :title="$t('status.logNumber')" :value="captionData.length" />
</a-col>
<a-col :span="6">
<div class="about-tag">{{ $t('status.aboutProj') }}</div>
<GithubOutlined class="proj-info" @click="showAbout = true"/>
</a-col>
</a-row>
</div>
<div class="caption-control">
<a-button
type="primary"
class="control-button"
@click="openCaptionWindow"
>{{ $t('status.openCaption') }}</a-button>
<a-button
class="control-button"
:disabled="engineEnabled"
@click="startEngine"
>{{ $t('status.startEngine') }}</a-button>
<a-button
danger class="control-button"
:disabled="!engineEnabled"
@click="stopEngine"
>{{ $t('status.stopEngine') }}</a-button>
</div>
<a-modal v-model:open="showAbout" :title="$t('status.about.title')" :footer="null">
<div class="about-modal-content">
<h2 class="about-title">{{ $t('status.about.proj') }}</h2>
<p class="about-desc">{{ $t('status.about.desc') }}</p>
<a-divider />
<div class="about-info">
<p><b>{{ $t('status.about.version') }}</b><a-tag color="green">v0.2.0</a-tag></p>
<p>
<b>{{ $t('status.about.author') }}</b>
<a
href="https://github.com/HiMeditator"
target="_blank"
>
<a-tag color="blue">HiMeditator</a-tag>
</a>
</p>
<p>
<b>{{ $t('status.about.projLink') }}</b>
<a href="https://github.com/HiMeditator/auto-caption" target="_blank">
<a-tag color="blue">GitHub | auto-caption</a-tag>
</a>
</p>
<p>
<b>{{ $t('status.about.manual') }}</b>
<a
:href="`https://github.com/HiMeditator/auto-caption/tree/main/docs/user-manual/${$t('lang')}.md`"
target="_blank"
>
<a-tag color="blue">GitHub | user-manual/{{ $t('lang') }}.md</a-tag>
</a>
</p>
<p>
<b>{{ $t('status.about.engineDoc') }}</b>
<a
:href="`https://github.com/HiMeditator/auto-caption/tree/main/docs/engine-manual/${$t('lang')}.md`"
target="_blank"
>
<a-tag color="blue">GitHub | engine-manual/{{ $t('lang') }}.md</a-tag>
</a>
</p>
</div>
<div class="about-date">{{ $t('status.about.date') }}</div>
</div>
</a-modal>
</template>
<script setup lang="ts">
import { ref } from 'vue'
import { storeToRefs } from 'pinia'
import { useCaptionLogStore } from '@renderer/stores/captionLog'
import { useEngineControlStore } from '@renderer/stores/engineControl'
import { GithubOutlined } from '@ant-design/icons-vue';
const showAbout = ref(false)
const captionLog = useCaptionLogStore()
const { captionData } = storeToRefs(captionLog)
const engineControl = useEngineControlStore()
const { engineEnabled, engine, customized, customizedApp } = storeToRefs(engineControl)
function openCaptionWindow() {
window.electron.ipcRenderer.send('control.captionWindow.activate')
}
function startEngine() {
window.electron.ipcRenderer.send('control.engine.start')
}
function stopEngine() {
window.electron.ipcRenderer.send('control.engine.stop')
}
</script>
<style scoped>
.about-tag {
color: var(--tag-color);
margin-bottom: 16px;
}
.proj-info {
display: inline-block;
font-size: 24px;
cursor: pointer;
color: var(--icon-color);
}
.about-modal-content {
text-align: center;
padding: 8px 0 0 0;
}
.about-title {
font-size: 1.5em;
font-weight: bold;
margin-bottom: 0.2em;
}
.about-desc {
color: #666;
margin-bottom: 0.5em;
}
.about-info {
text-align: left;
display: inline-block;
margin: 0 auto;
font-size: 1em;
}
.about-info b {
margin-right: 1em;
}
.about-date {
margin-top: 1.5em;
color: #aaa;
font-size: 0.95em;
text-align: right;
}
.caption-control {
display: flex;
flex-wrap: wrap;
justify-content: center;
margin: 30px;
}
.control-button {
height: 40px;
margin: 20px;
font-size: 16px;
}
</style>

@@ -0,0 +1,63 @@
<template>
<a-card size="small" :title="$t('general.title')">
<template #extra>
<a-popover>
<template #content>
<p class="general-note">{{ $t('general.note') }}</p>
</template>
<a><InfoCircleOutlined /></a>
</a-popover>
</template>
<div>
<div class="input-item">
<span class="input-label">{{ $t('general.uiLanguage') }}</span>
<a-radio-group v-model:value="uiLanguage">
<a-radio-button value="zh">中文</a-radio-button>
<a-radio-button value="en">English</a-radio-button>
<a-radio-button value="ja">日本語</a-radio-button>
</a-radio-group>
</div>
<div class="input-item">
<span class="input-label">{{ $t('general.theme') }}</span>
<a-radio-group v-model:value="uiTheme">
<a-radio-button value="system">{{ $t('general.system') }}</a-radio-button>
<a-radio-button value="light">{{ $t('general.light') }}</a-radio-button>
<a-radio-button value="dark">{{ $t('general.dark') }}</a-radio-button>
</a-radio-group>
</div>
<div class="input-item">
<span class="input-label">{{ $t('general.barWidth') }}</span>
<a-input
type="range" class="span-input"
min="6" max="12" v-model:value="leftBarWidth"
/>
<div class="input-item-value">{{ (leftBarWidth * 100 / 24).toFixed(0) }}%</div>
</div>
</div>
</a-card>
</template>
<script setup lang="ts">
import { storeToRefs } from 'pinia'
import { useGeneralSettingStore } from '@renderer/stores/generalSetting'
import { InfoCircleOutlined } from '@ant-design/icons-vue';
const generalSettingStore = useGeneralSettingStore()
const { uiLanguage, uiTheme, leftBarWidth } = storeToRefs(generalSettingStore)
</script>
<style scoped>
@import url(../assets/input.css);
.span-input {
width: 100px;
}
.general-note {
padding: 10px 10px 0;
max-width: min(36vw, 400px);
}
</style>
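The percentage shown next to the width slider follows from a 24-column grid: the slider value is a column span, and the label is that span as a share of 24. A minimal sketch of the arithmetic, assuming the value may arrive as either a number or a string from the range input (`barWidthPercent` is a hypothetical helper, not part of the commit):

```typescript
// Hypothetical helper mirroring (leftBarWidth * 100 / 24).toFixed(0).
// Assumes a 24-column grid; a range input may hand back a string, so coerce first.
function barWidthPercent(span: number | string): string {
  const columns = Number(span)
  return (columns * 100 / 24).toFixed(0)
}
```

With the slider limits of 6 and 12 used in the template, the label therefore ranges from 25% to 50%.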

@@ -0,0 +1,32 @@
export const audioTypes = {
zh: [
{
value: 0,
label: '系统音频输出(扬声器)'
},
{
value: 1,
label: '系统音频输入(麦克风)'
}
],
en: [
{
value: 0,
label: 'System Audio Output (Speaker)'
},
{
value: 1,
label: 'System Audio Input (Microphone)'
}
],
ja: [
{
value: 0,
label: 'システム音声出力(スピーカー)'
},
{
value: 1,
label: 'システム音声入力(マイク)'
}
]
}

@@ -0,0 +1,57 @@
export const engines = {
zh: [
{
value: 'gummy',
label: '云端 - 阿里云 - Gummy',
languages: [
{ value: 'auto', label: '自动检测' },
{ value: 'en', label: '英语' },
{ value: 'zh', label: '中文' },
{ value: 'ja', label: '日语' },
{ value: 'ko', label: '韩语' },
{ value: 'de', label: '德语' },
{ value: 'fr', label: '法语' },
{ value: 'ru', label: '俄语' },
{ value: 'es', label: '西班牙语' },
{ value: 'it', label: '意大利语' },
]
},
],
en: [
{
value: 'gummy',
label: 'Cloud - Alibaba Cloud - Gummy',
languages: [
{ value: 'auto', label: 'Auto Detect' },
{ value: 'en', label: 'English' },
{ value: 'zh', label: 'Chinese' },
{ value: 'ja', label: 'Japanese' },
{ value: 'ko', label: 'Korean' },
{ value: 'de', label: 'German' },
{ value: 'fr', label: 'French' },
{ value: 'ru', label: 'Russian' },
{ value: 'es', label: 'Spanish' },
{ value: 'it', label: 'Italian' },
]
},
],
ja: [
{
value: 'gummy',
label: 'クラウド - アリババクラウド - Gummy',
languages: [
{ value: 'auto', label: '自動検出' },
{ value: 'en', label: '英語' },
{ value: 'zh', label: '中国語' },
{ value: 'ja', label: '日本語' },
{ value: 'ko', label: '韓国語' },
{ value: 'de', label: 'ドイツ語' },
{ value: 'fr', label: 'フランス語' },
{ value: 'ru', label: 'ロシア語' },
{ value: 'es', label: 'スペイン語' },
{ value: 'it', label: 'イタリア語' },
]
},
]
}

@@ -0,0 +1,32 @@
export const breakOptions = {
zh: [
{
value: 1,
label: '换行(可能造成字幕窗口高度增加)'
},
{
value: 0,
label: '不换行(省略掉超出字幕窗口宽度的内容)'
}
],
en: [
{
value: 1,
label: 'Wrap (may increase caption window height)'
},
{
value: 0,
label: 'Do not wrap (truncate content that exceeds caption window width)'
}
],
ja: [
{
value: 1,
label: '改行する(字幕ウィンドウの高さが増える可能性があります)'
},
{
value: 0,
label: '改行しない(字幕ウィンドウの幅を超える内容は省略します)'
}
]
}

@@ -0,0 +1,10 @@
import { theme } from 'ant-design-vue';
export const antDesignTheme = {
light: {
token: {}
},
dark: {
algorithm: theme.darkAlgorithm,
}
}

@@ -0,0 +1,20 @@
import { createI18n } from 'vue-i18n';
import zh from './lang/zh';
import en from './lang/en';
import ja from './lang/ja';
export const i18n = createI18n({
legacy: false,
locale: 'zh',
messages: {
zh,
en,
ja
}
});
export * from './config/engine'
export * from './config/audio'
export * from './config/theme'
export * from './config/linebreak'

@@ -0,0 +1,105 @@
export default {
lang: "en",
example: {
"original": "这是字幕样式预览。",
"translation": "(Translation) This is a preview of caption styles."
},
noti: {
"restarted": "Caption Engine Restarted Successfully",
"started": "Caption Engine Started Successfully",
"sLang": "Source language: ",
"trans": ", translation: ",
"engine": ", caption engine: ",
"audio": ", audio type: ",
"sysout": "system audio output (speaker)",
"sysin": "system audio input (microphone)",
"tLang": ", target language: ",
"custom": "Type: Custom engine, engine path: ",
"args": ", command arguments: ",
"pidInfo": ", caption engine process PID: ",
"stopped": "Caption Engine Stopped",
"stoppedInfo": "The caption engine has stopped. You can click the 'Start Caption Engine' button to restart it.",
"error": "An error occurred",
"engineChange": "Caption Engine Configuration Changed",
"changeInfo": "If the caption engine is already running, you need to restart it for the changes to take effect.",
"styleChange": "Caption Style Changed",
"styleInfo": "Caption style changes have been saved and applied."
},
general: {
"title": "General Settings",
"uiLanguage": "Language",
"barWidth": "Width",
"note": "General Settings take effect immediately. Please note that changes to the Caption Engine Settings and Caption Style Settings will only take effect after clicking Apply.",
"theme": "Theme",
"light": "Light",
"dark": "Dark",
"system": "System"
},
engine: {
"title": "Caption Engine Settings",
"applyChange": "Apply Changes",
"cancelChange": "Cancel Changes",
"sourceLang": "Source",
"transLang": "Translation",
"captionEngine": "Engine",
"audioType": "Audio Type",
"systemOutput": "System Audio Output (Speaker)",
"systemInput": "System Audio Input (Microphone)",
"enableTranslation": "Translation",
"customEngine": "Custom Engine",
custom: {
"title": "Custom Caption Engine",
"attention": "Attention",
"note": "Note: This lets you supply captions from a custom engine. The engine must be launchable from the command line and accept its parameters as command-line arguments, and it must communicate with the Node.js backend over standard output. For details, refer to the project documentation.",
"app": "Engine Path",
"command": "Command"
}
},
style: {
"title": "Caption Style Settings",
"applyStyle": "Apply",
"cancelChange": "Cancel",
"resetStyle": "Reset",
"longCaption": "Long Caption",
"fontFamily": "Font Family",
"fontColor": "Font Color",
"fontSize": "Font Size",
"background": "Background",
"opacity": "Opacity",
"preview": "Preview",
"translation": "Show Translation",
trans: {
"title": "Translation Style Settings",
"useSame": "Use Original Style"
}
},
status: {
"engine": "Caption Engine",
"customized": "Customized",
"status": "Engine Status",
"started": "Started",
"stopped": "Not Started",
"logNumber": "Caption Count",
"aboutProj": "About Project",
"openCaption": "Open Caption Window",
"startEngine": "Start Caption Engine",
"restartEngine": "Restart Caption Engine",
"stopEngine": "Stop Caption Engine",
about: {
"title": "About This Project",
"proj": "Auto Caption Project",
"desc": "A cross-platform real-time caption display software supporting multiple languages.",
"version": "Software Version",
"author": "Project Author",
"projLink": "Project Link",
"manual": "User Manual",
"engineDoc": "Caption Engine Manual",
"date": "July 5, 2025"
}
},
log: {
"title": "Caption Log",
"export": "Export Caption Log",
"clear": "Clear Caption Log"
}
}

@@ -0,0 +1,105 @@
export default {
lang: "ja",
example: {
"original": "这是字幕样式预览。",
"translation": "(翻訳)これは字幕のスタイルのプレビューです。"
},
noti: {
"restarted": "字幕エンジンが再起動しました",
"started": "字幕エンジンを開始しました",
"sLang": "ソース言語:",
"trans": "、翻訳する:",
"engine": "、字幕エンジン:",
"audio": "、オーディオタイプ:",
"sysout": "システムオーディオ出力(スピーカー)",
"sysin": "システムオーディオ入力(マイク)",
"tLang": "、翻訳先の言語:",
"custom": "タイプ:カスタムエンジン、エンジンパス:",
"args": "、コマンド引数:",
"pidInfo": "、字幕エンジンプロセス PID",
"stopped": "字幕エンジンが停止しました",
"stoppedInfo": "字幕エンジンが停止しました。再起動するには「字幕エンジンを開始」ボタンをクリックしてください。",
"error": "エラーが発生しました",
"engineChange": "字幕エンジンの設定が変更されました",
"changeInfo": "字幕エンジンがすでに起動している場合、変更を有効にするには再起動が必要です。",
"styleChange": "字幕のスタイルが変更されました",
"styleInfo": "字幕のスタイル変更が保存され、適用されました"
},
general: {
"title": "一般設定",
"uiLanguage": "言語設定",
"barWidth": "左側の幅",
"note": "一般設定はすぐに有効になります。字幕エンジンの設定と字幕スタイルの設定を変更した場合は、適用ボタンをクリックしてから有効になりますのでご注意ください。",
"theme": "テーマ",
"light": "明るい",
"dark": "暗い",
"system": "システム"
},
engine: {
"title": "字幕エンジン設定",
"applyChange": "変更を適用",
"cancelChange": "変更をキャンセル",
"sourceLang": "ソース言語",
"transLang": "翻訳言語",
"captionEngine": "エンジン",
"audioType": "オーディオ",
"systemOutput": "システムオーディオ出力(スピーカー)",
"systemInput": "システムオーディオ入力(マイク)",
"enableTranslation": "翻訳",
"customEngine": "カスタムエンジン",
custom: {
"title": "カスタムキャプションエンジン",
"attention": "注意事項",
"note": "注意:ユーザーがカスタムエンジンを使用して字幕を提供できるようにします。提供するエンジンは、コマンドラインから起動でき、パラメータをコマンドラインの指示で指定できる必要があります。エンジンは、標準出力を使用して node.js バックエンドと通信する必要があります。詳細については、プロジェクトドキュメントを参照してください。",
"app": "パス",
"command": "コマンド"
}
},
style: {
"title": "字幕スタイル設定",
"applyStyle": "適用",
"cancelChange": "キャンセル",
"resetStyle": "リセット",
"longCaption": "長い字幕",
"fontFamily": "フォント",
"fontColor": "カラー",
"fontSize": "サイズ",
"background": "背景色",
"opacity": "不透明度",
"preview": "プレビュー",
"translation": "翻訳表示",
trans: {
"title": "翻訳スタイル設定",
"useSame": "原文のスタイルを使用"
}
},
status: {
"engine": "字幕エンジン",
"customized": "カスタマイズ済み",
"status": "エンジン状態",
"started": "開始済み",
"stopped": "未開始",
"logNumber": "字幕数",
"aboutProj": "プロジェクト情報",
"openCaption": "字幕ウィンドウを開く",
"startEngine": "字幕エンジンを開始",
"restartEngine": "字幕エンジンを再起動",
"stopEngine": "字幕エンジンを停止",
about: {
"title": "このプロジェクトについて",
"proj": "Auto Caption プロジェクト",
"desc": "複数の言語をサポートするクロスプラットフォームのリアルタイム字幕表示ソフトウェア。",
"version": "ソフトウェアバージョン",
"author": "プロジェクト作者",
"projLink": "プロジェクトリンク",
"manual": "ユーザーマニュアル",
"engineDoc": "字幕エンジンマニュアル",
"date": "2025 年 7 月 5 日"
}
},
log: {
"title": "字幕ログ",
"export": "エクスポート",
"clear": "字幕ログをクリア"
}
}

@@ -0,0 +1,105 @@
export default {
lang: "zh",
example: {
"original": "This is a preview of caption styles. ",
"translation": "(翻译)这是字幕样式预览。"
},
noti: {
"restarted": "字幕引擎重启成功",
"started": "字幕引擎启动成功",
"sLang": "源语言:",
"trans": ",是否翻译:",
"engine": ",字幕引擎:",
"audio": ",音频类型:",
"sysout": "系统音频输出(扬声器)",
"sysin": "系统音频输入(麦克风)",
"tLang": ",翻译语言:",
"custom": "类型:自定义引擎,引擎路径:",
"args": ",命令参数:",
"pidInfo": ",字幕引擎进程 PID",
"stopped": "字幕引擎停止",
"stoppedInfo": "字幕引擎已经停止,可点击“启动字幕引擎”按钮重新启动",
"error": "发生错误",
"engineChange": "字幕引擎配置已更改",
"changeInfo": "如果字幕引擎已经启动,需要重启字幕引擎修改才会生效",
"styleChange": "字幕样式已修改",
"styleInfo": "字幕样式修改已经保存并生效"
},
general: {
"title": "通用设置",
"uiLanguage": "界面语言",
"barWidth": "左侧宽度",
"note": "通用设置修改后立即生效。注意字幕引擎设置和字幕样式的设置修改后需要点击应用后才会生效。",
"theme": "主题",
"light": "浅色",
"dark": "深色",
"system": "系统"
},
engine: {
"title": "字幕引擎设置",
"applyChange": "应用更改",
"cancelChange": "取消更改",
"sourceLang": "源语言",
"transLang": "翻译语言",
"captionEngine": "字幕引擎",
"audioType": "音频类型",
"systemOutput": "系统音频输出(扬声器)",
"systemInput": "系统音频输入(麦克风)",
"enableTranslation": "启用翻译",
"customEngine": "自定义引擎",
custom: {
"title": "自定义字幕引擎",
"attention": "注意事项",
"note": "说明:允许用户使用自定义引擎提供字幕。提供的引擎要能通过命令行启动,且可以提供命令行指令来指定参数。引擎需要使用标准输出与软件 node.js 后端进行通信。详细信息参考项目文档。",
"app": "引擎路径",
"command": "引擎指令"
}
},
style: {
"title": "字幕样式设置",
"applyStyle": "应用样式",
"cancelChange": "取消更改",
"resetStyle": "恢复默认",
"longCaption": "长字幕",
"fontFamily": "字体族",
"fontColor": "字体颜色",
"fontSize": "字体大小",
"background": "背景颜色",
"opacity": "不透明度",
"preview": "显示预览",
"translation": "显示翻译",
trans: {
"title": "翻译样式设置",
"useSame": "使用原文样式"
}
},
status: {
"engine": "字幕引擎",
"customized": "自定义",
"status": "引擎状态",
"started": "已启动",
"stopped": "未启动",
"logNumber": "字幕数量",
"aboutProj": "项目关于",
"openCaption": "打开字幕窗口",
"startEngine": "启动字幕引擎",
"restartEngine": "重启字幕引擎",
"stopEngine": "关闭字幕引擎",
about: {
"title": "关于本项目",
"proj": "Auto Caption 项目",
"desc": "一个跨平台的支持多种语言的实时字幕显示软件。",
"version": "软件版本",
"author": "项目作者",
"projLink": "项目链接",
"manual": "用户手册",
"engineDoc": "字幕引擎手册",
"date": "2025 年 7 月 5 日"
}
},
log: {
"title": "字幕记录",
"export": "导出字幕记录",
"clear": "清空字幕记录"
}
}

@@ -1,14 +1,17 @@
import './assets/reset.css'
import { createPinia } from 'pinia'
import './assets/main.css'
import { createApp } from 'vue'
import { createPinia } from 'pinia'
import App from './App.vue'
import router from './router'
import { i18n } from './i18n'
import Antd from 'ant-design-vue';
import 'ant-design-vue/dist/reset.css';
const app = createApp(App)
app.use(createPinia())
app.use(router)
app.use(i18n)
app.use(Antd)
app.mount('#app')
app.mount('#app')

@@ -1,140 +0,0 @@
import { ref } from 'vue'
import { defineStore } from 'pinia'
import { notification } from 'ant-design-vue'
import { ExclamationCircleOutlined } from '@ant-design/icons-vue'
import { h } from 'vue'
export const useCaptionControlStore = defineStore('captionControl', () => {
const captionEngine = ref([
{
value: 'gummy',
label: '云端-阿里云-Gummy',
languages: [
{ value: 'auto', label: '自动检测' },
{ value: 'en', label: '英语' },
{ value: 'zh', label: '中文' },
{ value: 'ja', label: '日语' },
{ value: 'ko', label: '韩语' },
{ value: 'de', label: '德语' },
{ value: 'fr', label: '法语' },
{ value: 'ru', label: '俄语' },
{ value: 'es', label: '西班牙语' },
{ value: 'it', label: '意大利语' },
]
},
])
const audioType = ref([
{
value: 0,
label: '系统音频输出(扬声器)'
},
{
value: 1,
label: '系统音频输入(麦克风)'
}
])
const engineEnabled = ref(false)
const sourceLang = ref<string>('en')
const targetLang = ref<string>('zh')
const engine = ref<string>('gummy')
const audio = ref<0 | 1>(0)
const translation = ref<boolean>(true)
const customized = ref<boolean>(false)
const customizedApp = ref<string>('')
const customizedCommand = ref<string>('')
const changeSignal = ref<boolean>(false)
function sendControlChange() {
const controls = {
engineEnabled: engineEnabled.value,
sourceLang: sourceLang.value,
targetLang: targetLang.value,
engine: engine.value,
audio: audio.value,
translation: translation.value,
customized: customized.value,
customizedApp: customizedApp.value,
customizedCommand: customizedCommand.value
}
window.electron.ipcRenderer.send('control.control.change', controls)
}
function startEngine() {
window.electron.ipcRenderer.send('control.engine.start')
}
function stopEngine() {
window.electron.ipcRenderer.send('control.engine.stop')
}
window.electron.ipcRenderer.on('control.control.set', (_, controls) => {
sourceLang.value = controls.sourceLang
targetLang.value = controls.targetLang
engine.value = controls.engine
audio.value = controls.audio
engineEnabled.value = controls.engineEnabled
translation.value = controls.translation
customized.value = controls.customized
customizedApp.value = controls.customizedApp
customizedCommand.value = controls.customizedCommand
changeSignal.value = true
})
window.electron.ipcRenderer.on('control.engine.already', () => {
notification.open({
message: '字幕引擎已经启动',
description: '字幕引擎已经启动,请勿重复启动'
});
})
window.electron.ipcRenderer.on('control.engine.started', () => {
const str0 =
`原语言:${sourceLang.value},是否翻译:${translation.value?'是':'否'}` +
`字幕引擎:${engine.value},音频类型:${audio.value ? '输入音频' : '输出音频'}` +
(translation.value ? `,翻译语言:${targetLang.value}` : '');
const str1 = `类型:自定义引擎,引擎路径:${customizedApp.value},命令参数:${customizedCommand.value}`;
notification.open({
message: '字幕引擎启动',
description: (customized.value && customizedApp.value) ? str1 : str0
});
})
window.electron.ipcRenderer.on('control.engine.stopped', () => {
notification.open({
message: '字幕引擎停止',
description: '可点击“启动字幕引擎”按钮重新启动'
});
})
window.electron.ipcRenderer.on('control.error.send', (_, message) => {
notification.open({
message: '发生错误',
description: message,
duration: null,
placement: 'topLeft',
icon: () => h(ExclamationCircleOutlined, { style: 'color: #ff4d4f' })
});
})
return {
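captionEngine, // available caption engines
audioType, // available audio types
engineEnabled, // whether the caption engine is running
sourceLang, // source language
targetLang, // target language
engine, // selected caption engine
audio, // selected audio source
translation, // whether translation is enabled
customized, // whether a custom caption engine is used
customizedApp, // executable of the custom caption engine
customizedCommand, // command for the custom caption engine
sendControlChange, // send the latest control settings to the backend
startEngine, // start the caption engine
stopEngine, // stop the caption engine
changeSignal, // configuration-change signal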
}
})

@@ -1,32 +1,24 @@
import { ref } from 'vue'
import { defineStore } from 'pinia'
interface CaptionItem {
index: number,
time_s: string,
time_t: string,
text: string,
translation: string
}
import { CaptionItem } from '../types'
export const useCaptionLogStore = defineStore('captionLog', () => {
const captionData = ref<CaptionItem[]>([])
window.electron.ipcRenderer.on('both.log.add', (_, log) => {
if(captionData.value.length && log.index === captionData.value[captionData.value.length - 1].index) {
captionData.value.splice(captionData.value.length - 1, 1, log)
}
else {
captionData.value.push(log)
}
})
function clear() {
captionData.value = []
window.electron.ipcRenderer.send('control.caption.clear')
window.electron.ipcRenderer.send('control.captionLog.clear')
}
window.electron.ipcRenderer.on('both.log.set', (_, logs) => {
window.electron.ipcRenderer.on('both.captionLog.add', (_, log) => {
captionData.value.push(log)
})
window.electron.ipcRenderer.on('both.captionLog.upd', (_, log) => {
captionData.value.splice(captionData.value.length - 1, 1, log)
})
window.electron.ipcRenderer.on('both.captionLog.set', (_, logs) => {
captionData.value = logs
})
@@ -34,4 +26,4 @@ export const useCaptionLogStore = defineStore('captionLog', () => {
captionData,
clear
}
})
})
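The three log channels in this store (`add`, `upd`, `set`) can be read as a small state transition over the caption list. The sketch below restates that logic as a pure function so it can be reasoned about without Electron or Pinia. `LogEvent` and `applyLogEvent` are hypothetical names introduced for illustration; the `CaptionItem` fields are copied from the interface removed in this diff.

```typescript
// Fields taken from the CaptionItem interface shown in the diff.
interface CaptionItem {
  index: number
  time_s: string
  time_t: string
  text: string
  translation: string
}

// Hypothetical event type standing in for the three IPC channels.
type LogEvent =
  | { kind: 'add'; log: CaptionItem }
  | { kind: 'upd'; log: CaptionItem }
  | { kind: 'set'; logs: CaptionItem[] }

// Returns the next caption list without mutating the input.
function applyLogEvent(data: CaptionItem[], event: LogEvent): CaptionItem[] {
  switch (event.kind) {
    case 'add':
      // A new caption line is appended.
      return [...data, event.log]
    case 'upd':
      // The last line is replaced; on an empty list this inserts the item,
      // which matches what splice(length - 1, 1, log) does on an empty array.
      return data.length ? [...data.slice(0, -1), event.log] : [event.log]
    case 'set':
      // The whole list is replaced.
      return [...event.logs]
  }
}
```

Compared with the earlier single `both.log.add` handler, which compared `index` values in the renderer, the split channels appear to move the add-versus-update decision to the sender.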

@@ -1,18 +1,22 @@
import { ref, computed } from 'vue'
import { defineStore } from 'pinia'
import { Styles } from '@renderer/types'
import { breakOptions } from '@renderer/i18n'
export const useCaptionStyleStore = defineStore('captionStyle', () => {
const lineBreak = ref<number>(1)
const fontFamily = ref<string>('sans-serif')
const fontSize = ref<number>(24)
const fontColor = ref<string>('#000000')
const background = ref<string>('#dbe2ef')
const opacity = ref<number>(80)
const showPreview = ref<boolean>(true)
const transDisplay = ref<boolean>(true)
const transFontFamily = ref<string>('sans-serif')
const transFontSize = ref<number>(24)
const transFontColor = ref<string>('#000000')
const iBreakOptions = ref(breakOptions['zh'])
const changeSignal = ref<boolean>(false)
function addOpicityToColor(color: string, opicity: number) {
@@ -25,51 +29,63 @@ export const useCaptionStyleStore = defineStore('captionStyle', () => {
return addOpicityToColor(background.value, opacity.value)
})
function sendStyleChange() {
const styles = {
function sendStylesChange() {
const styles: Styles = {
lineBreak: lineBreak.value,
fontFamily: fontFamily.value,
fontSize: fontSize.value,
fontColor: fontColor.value,
background: background.value,
opacity: opacity.value,
showPreview: showPreview.value,
transDisplay: transDisplay.value,
transFontFamily: transFontFamily.value,
transFontSize: transFontSize.value,
transFontColor: transFontColor.value
}
window.electron.ipcRenderer.send('control.style.change', styles)
window.electron.ipcRenderer.send('control.styles.change', styles)
}
function sendStyleReset() {
window.electron.ipcRenderer.send('control.style.reset')
function sendStylesReset() {
window.electron.ipcRenderer.send('control.styles.reset')
}
window.electron.ipcRenderer.on('caption.style.set', (_, args) => {
function setStyles(args: Styles){
lineBreak.value = args.lineBreak
fontFamily.value = args.fontFamily
fontSize.value = args.fontSize
fontColor.value = args.fontColor
background.value = args.background
opacity.value = args.opacity
showPreview.value = args.showPreview
transDisplay.value = args.transDisplay
transFontFamily.value = args.transFontFamily
transFontSize.value = args.transFontSize
transFontColor.value = args.transFontColor
changeSignal.value = true
}
window.electron.ipcRenderer.on('both.styles.set', (_, args: Styles) => {
setStyles(args)
})
return {
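lineBreak, // line-break mode
fontFamily, // font family
fontSize, // font size
fontColor, // font color
background, // background color
opacity, // background opacity
showPreview, // whether to show the preview
transDisplay, // whether to show the translation
transFontFamily, // translation font family
transFontSize, // translation font size
transFontColor, // translation font color
backgroundRGBA, // background color with opacity applied
sendStyleChange, // send style changes
sendStyleReset, // restore default styles
setStyles, // set styles
sendStylesChange, // send style changes
sendStylesReset, // restore default styles
iBreakOptions, // line-break options
changeSignal // style-change signal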
}
})
})
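The diff shows only the first line of the opacity helper, which scales a 0 to 100 opacity to a 0 to 255 alpha value. One plausible way to finish that conversion is to append the alpha as two hex digits to a `#rrggbb` color. The sketch below is an assumption about the rest of the function, not the project's actual body, and it uses the corrected spelling `addOpacityToColor` as a hypothetical name.

```typescript
// Hypothetical completion: only the Math.round(...) scaling step appears in the diff.
// Assumes `color` is a 7-character '#rrggbb' string and `opacity` is in [0, 100].
function addOpacityToColor(color: string, opacity: number): string {
  const alpha = Math.round(opacity * 255 / 100)
  // Left-pad to two hex digits so that small values such as 13 become '0d'.
  const alphaHex = (alpha < 16 ? '0' : '') + alpha.toString(16)
  return `${color}${alphaHex}`
}
```

With the store defaults shown above (`#dbe2ef` at opacity 80), this sketch would yield `#dbe2efcc`.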

@@ -0,0 +1,110 @@
import { ref } from 'vue'
import { defineStore } from 'pinia'
import { notification } from 'ant-design-vue'
import { ExclamationCircleOutlined } from '@ant-design/icons-vue'
import { h } from 'vue'
import { useI18n } from 'vue-i18n'
import { Controls } from '@renderer/types'
import { engines, audioTypes } from '@renderer/i18n'
import { useGeneralSettingStore } from './generalSetting'
export const useEngineControlStore = defineStore('engineControl', () => {
  const { t } = useI18n()
  const captionEngine = ref(engines[useGeneralSettingStore().uiLanguage])
  const audioType = ref(audioTypes[useGeneralSettingStore().uiLanguage])
  const engineEnabled = ref(false)
  const sourceLang = ref<string>('en')
  const targetLang = ref<string>('zh')
  const engine = ref<'gummy'>('gummy')
  const audio = ref<0 | 1>(0)
  const translation = ref<boolean>(true)
  const customized = ref<boolean>(false)
  const customizedApp = ref<string>('')
  const customizedCommand = ref<string>('')
  const changeSignal = ref<boolean>(false)
  function sendControlsChange() {
    const controls: Controls = {
      engineEnabled: engineEnabled.value,
      sourceLang: sourceLang.value,
      targetLang: targetLang.value,
      engine: engine.value,
      audio: audio.value,
      translation: translation.value,
      customized: customized.value,
      customizedApp: customizedApp.value,
      customizedCommand: customizedCommand.value
    }
    window.electron.ipcRenderer.send('control.controls.change', controls)
  }
  function setControls(controls: Controls) {
    sourceLang.value = controls.sourceLang
    targetLang.value = controls.targetLang
    engine.value = controls.engine
    audio.value = controls.audio
    engineEnabled.value = controls.engineEnabled
    translation.value = controls.translation
    customized.value = controls.customized
    customizedApp.value = controls.customizedApp
    customizedCommand.value = controls.customizedCommand
    changeSignal.value = true
  }
  window.electron.ipcRenderer.on('control.controls.set', (_, controls: Controls) => {
    setControls(controls)
  })
  window.electron.ipcRenderer.on('control.engine.started', (_, args) => {
    const str0 =
      `${t('noti.sLang')}${sourceLang.value}${t('noti.trans')}${translation.value?'yes':'no'}` +
      `${t('noti.engine')}${engine.value}${t('noti.audio')}${audio.value?t('noti.sysin'):t('noti.sysout')}` +
      (translation.value ? `${t('noti.tLang')}${targetLang.value}` : '');
    const str1 = `${t('noti.custom')}${customizedApp.value}${t('noti.args')}${customizedCommand.value}`;
    notification.open({
      message: t('noti.started'),
      description:
        ((customized.value && customizedApp.value) ? str1 : str0) +
        `${t('noti.pidInfo')}${args}`
    });
  })
  window.electron.ipcRenderer.on('control.engine.stopped', () => {
    notification.open({
      message: t('noti.stopped'),
      description: t('noti.stoppedInfo')
    });
  })
  window.electron.ipcRenderer.on('control.error.occurred', (_, message) => {
    notification.open({
      message: t('noti.error'),
      description: message,
      duration: null,
      placement: 'topLeft',
      icon: () => h(ExclamationCircleOutlined, { style: 'color: #ff4d4f' })
    });
  })
  return {
    captionEngine, // caption engine
    audioType, // audio type
    engineEnabled, // whether the caption engine is enabled
    sourceLang, // source language
    targetLang, // target language
    engine, // caption engine
    audio, // selected audio
    translation, // whether translation is enabled
    customized, // whether to use a custom caption engine
    customizedApp, // application of the custom caption engine
    customizedCommand, // command for the custom caption engine
    setControls, // set the engine configuration
    sendControlsChange, // send the latest control message to the backend
    changeSignal, // configuration-change signal
  }
})
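
The `control.engine.started` handler above builds its notification text inline from the store's refs. As a rough sketch only, the same string composition could be pulled into a pure function so it can be checked without Electron or Pinia. The names `describeEngineStart` and `Translate` are hypothetical and do not appear in this commit; `Translate` merely stands in for the `t` function returned by `useI18n()`.

```typescript
// Mirrors the Controls interface declared in the types file of this commit.
interface Controls {
  engineEnabled: boolean
  sourceLang: string
  targetLang: string
  engine: 'gummy'
  audio: 0 | 1
  translation: boolean
  customized: boolean
  customizedApp: string
  customizedCommand: string
}

// Hypothetical stand-in for the vue-i18n `t` lookup used in the store.
type Translate = (key: string) => string

// Hypothetical helper: composes the same description string as the handler.
function describeEngineStart(c: Controls, pid: number | string, t: Translate): string {
  // Description for the built-in engine; the target language is appended
  // only when translation is enabled.
  const builtIn =
    `${t('noti.sLang')}${c.sourceLang}${t('noti.trans')}${c.translation ? 'yes' : 'no'}` +
    `${t('noti.engine')}${c.engine}${t('noti.audio')}${c.audio ? t('noti.sysin') : t('noti.sysout')}` +
    (c.translation ? `${t('noti.tLang')}${c.targetLang}` : '')
  // Description for a custom engine: application plus its arguments.
  const custom = `${t('noti.custom')}${c.customizedApp}${t('noti.args')}${c.customizedCommand}`
  // The custom text is used only when the flag is set and an app is given.
  return (c.customized && c.customizedApp ? custom : builtIn) + `${t('noti.pidInfo')}${pid}`
}
```

With a helper like this, the handler would only need to call `notification.open` with the returned string, and the branching between the built-in and custom engine could be unit-tested on its own.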

@@ -0,0 +1,72 @@
import { ref, watch } from 'vue'
import { defineStore } from 'pinia'
import { i18n } from '../i18n'
import type { UILanguage, UITheme } from '../types'
import { engines, audioTypes, antDesignTheme, breakOptions } from '../i18n'
import { useEngineControlStore } from './engineControl'
import { useCaptionStyleStore } from './captionStyle'
export const useGeneralSettingStore = defineStore('generalSetting', () => {
  const uiLanguage = ref<UILanguage>('zh')
  const uiTheme = ref<UITheme>('system')
  const leftBarWidth = ref<number>(8)
  const antdTheme = ref<Object>(antDesignTheme['light'])
  watch(uiLanguage, (newValue) => {
    i18n.global.locale.value = newValue
    useEngineControlStore().captionEngine = engines[newValue]
    useEngineControlStore().audioType = audioTypes[newValue]
    useCaptionStyleStore().iBreakOptions = breakOptions[newValue]
    window.electron.ipcRenderer.send('control.uiLanguage.change', newValue)
  })
  watch(uiTheme, (newValue) => {
    window.electron.ipcRenderer.send('control.uiTheme.change', newValue)
    if(newValue === 'system'){
      window.electron.ipcRenderer.invoke('control.nativeTheme.get').then((theme) => {
        if(theme === 'light') setLightTheme()
        else if(theme === 'dark') setDarkTheme()
      })
    }
    else if(newValue === 'light') setLightTheme()
    else if(newValue === 'dark') setDarkTheme()
  })
  watch(leftBarWidth, (newValue) => {
    window.electron.ipcRenderer.send('control.leftBarWidth.change', newValue)
  })
  window.electron.ipcRenderer.on('control.uiLanguage.set', (_, args: UILanguage) => {
    uiLanguage.value = args
  })
  window.electron.ipcRenderer.on('control.nativeTheme.change', (_, args) => {
    if(args === 'light') setLightTheme()
    else if(args === 'dark') setDarkTheme()
  })
  function setLightTheme(){
    antdTheme.value = antDesignTheme.light
    const root = document.documentElement
    root.style.setProperty('--control-background', '#fff')
    root.style.setProperty('--tag-color', 'rgba(0, 0, 0, 0.45)')
    root.style.setProperty('--icon-color', 'rgba(0, 0, 0, 0.88)')
  }
  function setDarkTheme(){
    antdTheme.value = antDesignTheme.dark
    const root = document.documentElement
    root.style.setProperty('--control-background', '#000')
    root.style.setProperty('--tag-color', 'rgba(255, 255, 255, 0.45)')
    root.style.setProperty('--icon-color', 'rgba(255, 255, 255, 0.85)')
  }
  return {
    uiLanguage,
    uiTheme,
    leftBarWidth,
    antdTheme
  }
})
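
The theme watcher above makes two decisions: which effective theme a setting resolves to (`'system'` defers to the native theme), and which CSS custom properties that theme sets. A minimal sketch of that logic as data plus two small functions might look like the following. `EffectiveTheme`, `themeVariables`, `resolveTheme` and `applyTheme` are hypothetical names introduced here for illustration; the variable names and colour values are copied from `setLightTheme` and `setDarkTheme`.

```typescript
type UITheme = 'light' | 'dark' | 'system'
// Hypothetical: the concrete theme after 'system' has been resolved.
type EffectiveTheme = 'light' | 'dark'

// CSS custom properties per theme, taken from setLightTheme / setDarkTheme.
const themeVariables: Record<EffectiveTheme, Record<string, string>> = {
  light: {
    '--control-background': '#fff',
    '--tag-color': 'rgba(0, 0, 0, 0.45)',
    '--icon-color': 'rgba(0, 0, 0, 0.88)'
  },
  dark: {
    '--control-background': '#000',
    '--tag-color': 'rgba(255, 255, 255, 0.45)',
    '--icon-color': 'rgba(255, 255, 255, 0.85)'
  }
}

// 'system' follows the native theme; explicit choices win otherwise.
function resolveTheme(setting: UITheme, nativeTheme: EffectiveTheme): EffectiveTheme {
  return setting === 'system' ? nativeTheme : setting
}

// Writes every variable of the chosen theme through the supplied setter,
// so the function itself does not depend on the DOM.
function applyTheme(
  theme: EffectiveTheme,
  setProperty: (name: string, value: string) => void
): void {
  for (const [name, value] of Object.entries(themeVariables[theme])) {
    setProperty(name, value)
  }
}
```

In the renderer, the setter would presumably be `(name, value) => document.documentElement.style.setProperty(name, value)`; passing it in keeps the sketch testable outside a browser.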

@@ -0,0 +1,46 @@
export type UILanguage = "zh" | "en" | "ja"
export type UITheme = "light" | "dark" | "system"
export interface Controls {
  engineEnabled: boolean,
  sourceLang: string,
  targetLang: string,
  engine: 'gummy',
  audio: 0 | 1,
  translation: boolean,
  customized: boolean,
  customizedApp: string,
  customizedCommand: string
}
export interface Styles {
  lineBreak: number,
  fontFamily: string,
  fontSize: number,
  fontColor: string,
  background: string,
  opacity: number,
  showPreview: boolean,
  transDisplay: boolean,
  transFontFamily: string,
  transFontSize: number,
  transFontColor: string
}
export interface CaptionItem {
  index: number,
  time_s: string,
  time_t: string,
  text: string,
  translation: string
}
export interface FullConfig {
  uiLanguage: UILanguage,
  uiTheme: UITheme,
  leftBarWidth: number,
  styles: Styles,
  controls: Controls,
  captionLog: CaptionItem[]
}
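
These interfaces only exist at compile time, while `Controls` objects arrive over IPC as plain data. One way to check such a payload at runtime is a type guard, sketched below. The function `isControls` is hypothetical and not part of this commit; it simply tests each field against the `Controls` declaration above.

```typescript
// Copy of the Controls interface from this file, repeated so the sketch is self-contained.
interface Controls {
  engineEnabled: boolean
  sourceLang: string
  targetLang: string
  engine: 'gummy'
  audio: 0 | 1
  translation: boolean
  customized: boolean
  customizedApp: string
  customizedCommand: string
}

// Hypothetical runtime guard: returns true only if every field has the declared type.
function isControls(value: unknown): value is Controls {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  return (
    typeof v.engineEnabled === 'boolean' &&
    typeof v.sourceLang === 'string' &&
    typeof v.targetLang === 'string' &&
    v.engine === 'gummy' &&
    (v.audio === 0 || v.audio === 1) &&
    typeof v.translation === 'boolean' &&
    typeof v.customized === 'boolean' &&
    typeof v.customizedApp === 'string' &&
    typeof v.customizedCommand === 'string'
  )
}
```

A listener such as the one for `control.controls.set` could call `isControls(payload)` before handing the object to `setControls`, and ignore or report anything that fails the check.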

@@ -20,21 +20,23 @@
</div>
</div>
<div class="caption-container">
<p class="preview-caption" :style="{
<p :class="[captionStyle.lineBreak?'':'left-ellipsis']" :style="{
fontFamily: captionStyle.fontFamily,
fontSize: captionStyle.fontSize + 'px',
color: captionStyle.fontColor
}">
<span v-if="captionData.length">{{ captionData[captionData.length-1].text }}</span>
<span v-else>{{ "This is a preview of subtitle styles." }}</span>
<span v-else>{{ $t('example.original') }}</span>
</p>
<p class="preview-translation" v-if="captionStyle.transDisplay" :style="{
<p :class="[captionStyle.lineBreak?'':'left-ellipsis']"
v-if="captionStyle.transDisplay"
:style="{
fontFamily: captionStyle.transFontFamily,
fontSize: captionStyle.transFontSize + 'px',
color: captionStyle.transFontColor
}">
<span v-if="captionData.length">{{ captionData[captionData.length-1].translation }}</span>
<span v-else>{{ "这是字幕样式预览(翻译)" }}</span>
<span v-else>{{ $t('example.translation') }}</span>
</p>
</div>
</div>
@@ -111,7 +113,7 @@ function closeCaptionWindow() {
background-color: #2221;
}
.caption-container {
.caption-container {
-webkit-app-region: drag;
}
@@ -121,4 +123,16 @@ function closeCaptionWindow() {
line-height: 1.5em;
padding: 0 10px 10px 10px;
}
</style>
.left-ellipsis {
white-space: nowrap;
overflow: hidden;
direction: rtl;
text-align: left;
}
.left-ellipsis > span {
direction: ltr;
display: inline-block;
}
</style>

@@ -1,112 +1,57 @@
<template>
<div>
<a-row>
<a-col :span="controlSpan">
<a-config-provider :theme="antdTheme">
<a-row class="control-container">
<a-col :span="leftBarWidth">
<div class="caption-control">
<a-card size="small" title="页面宽度">
<template #extra>
<a-button type="link" @click="showAbout = true">关于本项目</a-button>
</template>
<div>
<a-input type="range" class="span-input" min="6" max="18" v-model:value="controlSpan" />
</div>
</a-card>
<CaptionControl />
<GeneralSetting />
<EngineControl />
<CaptionStyle />
</div>
</a-col>
<a-col :span="24 - controlSpan">
<a-col :span="24 - leftBarWidth">
<div class="caption-data">
<CaptionData />
<EngineStatus />
<CaptionLog />
</div>
</a-col>
</a-row>
<a-modal v-model:open="showAbout" title="关于本项目" :footer="null">
<div class="about-modal-content">
<h2 class="about-title">Auto Caption 项目</h2>
<p class="about-desc">一个跨平台的实时字幕显示软件</p>
<a-divider />
<div class="about-info">
<p><b>作者</b>HiMeditator</p>
<p><b>版本</b>v0.1.0</p>
<p>
<b>项目地址</b>
<a href="https://github.com/HiMeditator/auto-caption" target="_blank">
GitHub | auto-caption
</a>
</p>
<p>
<b>用户手册</b>
<a
href="https://github.com/HiMeditator/auto-caption/blob/main/assets/user-manual_zh.md"
target="_blank"
>
GitHub | user-manual_zh.md
</a>
</p>
</div>
<div class="about-date">2026 6 26 </div>
</div>
</a-modal>
</div>
</a-config-provider>
</template>
<script setup lang="ts">
import GeneralSetting from '../components/GeneralSetting.vue'
import CaptionStyle from '../components/CaptionStyle.vue'
import CaptionControl from '../components/CaptionControl.vue';
import CaptionData from '../components/CaptionData.vue'
import { ref } from 'vue'
const controlSpan = ref(8)
const showAbout = ref(false)
import EngineControl from '../components/EngineControl.vue'
import EngineStatus from '@renderer/components/EngineStatus.vue'
import CaptionLog from '../components/CaptionLog.vue'
import { storeToRefs } from 'pinia'
import { useGeneralSettingStore } from '@renderer/stores/generalSetting'
const generalSettingStore = useGeneralSettingStore()
const { leftBarWidth, antdTheme } = storeToRefs(generalSettingStore)
</script>
<style scoped>
.control-container {
background-color: var(--control-background);
}
.caption-control {
height: 100vh;
border-right: 1px solid #7774;
border-right: 1px solid var(--tag-color);
padding: 20px;
overflow-y: auto;
scrollbar-width: thin;
}
.caption-data {
height: 100vh;
padding: 20px;
overflow-y: auto;
scrollbar-width: thin;
}
.span-input {
width: 100px;
.caption-control::-webkit-scrollbar,
.caption-data::-webkit-scrollbar {
display: none;
}
.about-modal-content {
text-align: center;
padding: 8px 0 0 0;
}
.about-title {
font-size: 1.5em;
font-weight: bold;
margin-bottom: 0.2em;
}
.about-desc {
color: #666;
margin-bottom: 0.5em;
}
.about-info {
text-align: left;
display: inline-block;
margin: 0 auto;
font-size: 1em;
}
.about-date {
margin-top: 1.5em;
color: #aaa;
font-size: 0.95em;
text-align: right;
}
</style>
</style>