Mirror of https://github.com/HiMeditator/auto-caption.git
Synced 2026-03-13 17:47:34 +08:00

Compare commits: f42458124e...sosv-model (5 commits)
| Author | SHA1 | Date |
| --- | --- | --- |
| | 6bff978b88 | |
| | eba2c5ca45 | |
| | 2b7ce06f04 | |
| | 14987cbfc5 | |
| | 56fdc348f8 | |
README.md (14 lines changed)
@@ -49,7 +49,7 @@
 | 操作系统版本 | 处理器架构 | 获取系统音频输入 | 获取系统音频输出 |
 | ------------------ | ---------- | ---------------- | ---------------- |
 | Windows 11 24H2 | x64 | ✅ | ✅ |
-| macOS Sequoia 15.5 | arm64 | ✅需要额外配置 | ✅ |
+| macOS Sequoia 15.5 | arm64 | ✅ [需要额外配置](./docs/user-manual/zh.md#macos-获取系统音频输出) | ✅ |
 | Ubuntu 24.04.2 | x64 | ✅ | ✅ |
 | Kali Linux 2022.3 | x64 | ✅ | ✅ |
 | Kylin Server V10 SP3 | x64 | ✅ | ✅ |
@@ -188,15 +188,3 @@ npm run build:mac
 # For Linux
 npm run build:linux
 ```
-
-注意,根据不同的平台需要修改项目根目录下 `electron-builder.yml` 文件中的配置内容:
-
-```yml
-extraResources:
-  # For Windows
-  - from: ./engine/dist/main.exe
-    to: ./engine/main.exe
-  # For macOS and Linux
-  # - from: ./engine/dist/main
-  #   to: ./engine/main
-```
README_en.md (14 lines changed)
@@ -49,7 +49,7 @@ The software has been adapted for Windows, macOS, and Linux platforms. The teste
 | OS Version | Architecture | System Audio Input | System Audio Output |
 | ------------------ | ------------ | ------------------ | ------------------- |
 | Windows 11 24H2 | x64 | ✅ | ✅ |
-| macOS Sequoia 15.5 | arm64 | ✅ Additional config required | ✅ |
+| macOS Sequoia 15.5 | arm64 | ✅ [Additional config required](./docs/user-manual/en.md#capturing-system-audio-output-on-macos) | ✅ |
 | Ubuntu 24.04.2 | x64 | ✅ | ✅ |
 | Kali Linux 2022.3 | x64 | ✅ | ✅ |
 | Kylin Server V10 SP3 | x64 | ✅ | ✅ |
@@ -188,15 +188,3 @@ npm run build:mac
 # For Linux
 npm run build:linux
 ```
-
-Note: You need to modify the configuration content in the `electron-builder.yml` file in the project root directory according to different platforms:
-
-```yml
-extraResources:
-  # For Windows
-  - from: ./engine/dist/main.exe
-    to: ./engine/main.exe
-  # For macOS and Linux
-  # - from: ./engine/dist/main
-  #   to: ./engine/main
-```
README_ja.md (14 lines changed)
@@ -49,7 +49,7 @@
 | OS バージョン | アーキテクチャ | システムオーディオ入力 | システムオーディオ出力 |
 | ------------------ | ------------ | ------------------ | ------------------- |
 | Windows 11 24H2 | x64 | ✅ | ✅ |
-| macOS Sequoia 15.5 | arm64 | ✅ 追加設定が必要 | ✅ |
+| macOS Sequoia 15.5 | arm64 | ✅ [追加設定が必要](./docs/user-manual/ja.md#macos-でのシステムオーディオ出力の取得方法) | ✅ |
 | Ubuntu 24.04.2 | x64 | ✅ | ✅ |
 | Kali Linux 2022.3 | x64 | ✅ | ✅ |
 | Kylin Server V10 SP3 | x64 | ✅ | ✅ |
@@ -188,15 +188,3 @@ npm run build:mac
 # Linux 用
 npm run build:linux
 ```
-
-注意: プラットフォームに応じて、プロジェクトルートディレクトリにある `electron-builder.yml` ファイルの設定内容を変更する必要があります:
-
-```yml
-extraResources:
-  # Windows 用
-  - from: ./engine/dist/main.exe
-    to: ./engine/main.exe
-  # macOS と Linux 用
-  # - from: ./engine/dist/main
-  #   to: ./engine/main
-```
@@ -153,4 +153,18 @@
 ### 优化体验
 
 - 优化软件用户界面的部分组件
 - 更清晰的日志输出
+
+## v0.8.0
+
+2025-09-??
+
+### 新增功能
+
+- 字幕引擎添加超时关闭功能:如果在规定时间字幕引擎没有启动成功会自动关闭、在字幕引擎启动过程中也可选择关闭字幕引擎
+- 添加非实时翻译功能:支持调用 Ollama 本地模型进行翻译、支持调用 Google 翻译 API 进行翻译
+
+### 优化体验
+
+- 带有额外信息的标签颜色改为与主题色一致
@@ -58,6 +58,18 @@ Electron 主进程通过 TCP Socket 向 Python 进程发送数据。发送的数
 
 Python 端监听到的音频流转换为的字幕数据。
 
+### `translation`
+
+```js
+{
+    command: "translation",
+    time_s: string,
+    translation: string
+}
+```
+
+语音识别的内容的翻译,可以根据起始时间确定对应的字幕。
+
 ### `print`
 
 ```js
@@ -67,7 +79,7 @@ Python 端监听到的音频流转换为的字幕数据。
 }
 ```
 
-输出 Python 端打印的内容。
+输出 Python 端打印的内容,不计入日志。
 
 ### `info`
 
@@ -78,7 +90,18 @@ Python 端监听到的音频流转换为的字幕数据。
 }
 ```
 
-Python 端打印的提示信息,比起 `print`,该信息更希望 Electron 端的关注。
+Python 端打印的提示信息,会计入日志。
+
+### `warn`
+
+```js
+{
+    command: "warn",
+    content: string
+}
+```
+
+Python 端打印的警告信息,会计入日志。
 
 ### `error`
 
@@ -89,7 +112,7 @@ Python 端打印的提示信息,比起 `print`,该信息更希望 Electron
 }
 ```
 
-Python 端打印的错误信息,该错误信息需要在前端弹窗显示。
+Python 端打印的错误信息,该错误信息会在前端弹窗显示。
 
 ### `usage`
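Each of the messages documented above travels as one JSON object per line on the engine's stdout. A minimal sketch of what helpers like `stdout_cmd` / `stdout_obj` (imported from `utils` throughout this diff) might look like; the actual implementations live in `engine/utils/sysout.py` and may differ:

```python
import json
import sys

def stdout_obj(obj: dict) -> None:
    # One JSON document per line: the Electron side splits the engine's
    # stdout on newlines and parses each line independently.
    sys.stdout.write(json.dumps(obj, ensure_ascii=False) + "\n")
    sys.stdout.flush()

def stdout_cmd(command: str, content: str = "") -> None:
    # Convenience wrapper for simple command messages such as
    # "info", "warn", "error" or "kill".
    msg = {"command": command}
    if content:
        msg["content"] = content
    stdout_obj(msg)
```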
electron-builder.yml

@@ -1,5 +1,5 @@
 appId: com.himeditator.autocaption
-productName: auto-caption
+productName: Auto Caption
 directories:
   buildResources: build
 files:
@@ -13,13 +13,15 @@ files:
   - '!engine/*'
   - '!docs/*'
   - '!assets/*'
+  - '!.repomap/*'
+  - '!.virtualme/*'
 extraResources:
   # For Windows
   - from: ./engine/dist/main.exe
     to: ./engine/main.exe
   # For macOS and Linux
-  # - from: ./engine/dist/main
-  #   to: ./engine/main
+  - from: ./engine/dist/main
+    to: ./engine/main
 win:
   executableName: auto-caption
   icon: build/icon.png
engine/audio2text/__init__.py

@@ -1,3 +1,3 @@
-from dashscope.common.error import InvalidParameter
 from .gummy import GummyRecognizer
 from .vosk import VoskRecognizer
+from .sosv import SosvRecognizer
engine/audio2text/gummy.py

@@ -5,9 +5,10 @@ from dashscope.audio.asr import (
     TranslationRecognizerRealtime
 )
 import dashscope
+from dashscope.common.error import InvalidParameter
 from datetime import datetime
-from utils import stdout_cmd, stdout_obj, stderr
+from utils import stdout_cmd, stdout_obj, stdout_err
+from utils import shared_data
 
 class Callback(TranslationRecognizerCallback):
     """
@@ -90,9 +91,23 @@ class GummyRecognizer:
         """启动 Gummy 引擎"""
         self.translator.start()
 
-    def send_audio_frame(self, data):
-        """发送音频帧,引擎将自动识别并将识别结果输出到标准输出中"""
-        self.translator.send_audio_frame(data)
+    def translate(self):
+        """持续读取共享数据中的音频帧,并进行语音识别,将识别结果输出到标准输出中"""
+        global shared_data
+        restart_count = 0
+        while shared_data.status == 'running':
+            chunk = shared_data.chunk_queue.get()
+            try:
+                self.translator.send_audio_frame(chunk)
+            except InvalidParameter as e:
+                restart_count += 1
+                if restart_count > 5:
+                    stdout_err(str(e))
+                    shared_data.status = "kill"
+                    stdout_cmd('kill')
+                    break
+                else:
+                    stdout_cmd('info', f'Gummy engine stopped, restart attempt: {restart_count}...')
 
     def stop(self):
         """停止 Gummy 引擎"""
engine/audio2text/sosv.py (new file, 176 lines)

@@ -0,0 +1,176 @@
"""
|
||||
Shepra-ONNX SenseVoice Model
|
||||
|
||||
This code file references the following:
|
||||
|
||||
https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/simulate-streaming-sense-voice-microphone.py
|
||||
"""
|
||||
|
||||
import time
|
||||
from datetime import datetime
|
||||
import sherpa_onnx
|
||||
import threading
|
||||
import numpy as np
|
||||
|
||||
from utils import shared_data
|
||||
from utils import stdout_cmd, stdout_obj
|
||||
from utils import google_translate, ollama_translate
|
||||
|
||||
|
||||
class SosvRecognizer:
|
||||
"""
|
||||
使用 Sense Voice 非流式模型处理流式音频数据,并在标准输出中输出 Auto Caption 软件可读取的 JSON 字符串数据
|
||||
|
||||
初始化参数:
|
||||
model_path: Shepra ONNX Sense Voice 识别模型路径
|
||||
vad_model: Silero VAD 模型路径
|
||||
source: 识别源语言(auto, zh, en, ja, ko, yue)
|
||||
target: 翻译目标语言
|
||||
trans_model: 翻译模型名称
|
||||
ollama_name: Ollama 模型名称
|
||||
"""
|
||||
def __init__(self, model_path: str, source: str, target: str | None, trans_model: str, ollama_name: str):
|
||||
if model_path.startswith('"'):
|
||||
model_path = model_path[1:]
|
||||
if model_path.endswith('"'):
|
||||
model_path = model_path[:-1]
|
||||
self.model_path = model_path
|
||||
self.ext = ""
|
||||
if self.model_path[-4:] == "int8":
|
||||
self.ext = ".int8"
|
||||
self.source = source
|
||||
self.target = target
|
||||
if trans_model == 'google':
|
||||
self.trans_func = google_translate
|
||||
else:
|
||||
self.trans_func = ollama_translate
|
||||
self.ollama_name = ollama_name
|
||||
self.time_str = ''
|
||||
self.cur_id = 0
|
||||
self.prev_content = ''
|
||||
|
||||
def start(self):
|
||||
"""启动 Sense Voice 模型"""
|
||||
self.recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(
|
||||
model=f"{self.model_path}/sensevoice/model{self.ext}.onnx",
|
||||
tokens=f"{self.model_path}/sensevoice/tokens.txt",
|
||||
language=self.source,
|
||||
num_threads = 2,
|
||||
)
|
||||
|
||||
vad_config = sherpa_onnx.VadModelConfig()
|
||||
vad_config.silero_vad.model = f"{self.model_path}/silero_vad.onnx"
|
||||
vad_config.silero_vad.threshold = 0.5
|
||||
vad_config.silero_vad.min_silence_duration = 0.1
|
||||
vad_config.silero_vad.min_speech_duration = 0.25
|
||||
vad_config.silero_vad.max_speech_duration = 8
|
||||
vad_config.sample_rate = 16000
|
||||
self.window_size = vad_config.silero_vad.window_size
|
||||
self.vad = sherpa_onnx.VoiceActivityDetector(vad_config, buffer_size_in_seconds=100)
|
||||
|
||||
if self.source == 'en':
|
||||
model_config = sherpa_onnx.OnlinePunctuationModelConfig(
|
||||
cnn_bilstm=f"{self.model_path}/punct-en/model{self.ext}.onnx",
|
||||
bpe_vocab=f"{self.model_path}/punct-en/bpe.vocab"
|
||||
)
|
||||
punct_config = sherpa_onnx.OnlinePunctuationConfig(
|
||||
model_config=model_config,
|
||||
)
|
||||
self.punct = sherpa_onnx.OnlinePunctuation(punct_config)
|
||||
else:
|
||||
punct_config = sherpa_onnx.OfflinePunctuationConfig(
|
||||
model=sherpa_onnx.OfflinePunctuationModelConfig(
|
||||
ct_transformer=f"{self.model_path}/punct/model{self.ext}.onnx"
|
||||
),
|
||||
)
|
||||
self.punct = sherpa_onnx.OfflinePunctuation(punct_config)
|
||||
|
||||
self.buffer = []
|
||||
self.offset = 0
|
||||
self.started = False
|
||||
self.started_time = .0
|
||||
self.time_str = datetime.now().strftime('%H:%M:%S.%f')[:-3]
|
||||
stdout_cmd('info', 'Shepra ONNX Sense Voice recognizer started.')
|
||||
|
||||
def send_audio_frame(self, data: bytes):
|
||||
"""
|
||||
发送音频帧给 SOSV 引擎,引擎将自动识别并将识别结果输出到标准输出中
|
||||
|
||||
Args:
|
||||
data: 音频帧数据,采样率必须为 16000Hz
|
||||
"""
|
||||
caption = {}
|
||||
caption['command'] = 'caption'
|
||||
caption['translation'] = ''
|
||||
|
||||
data_np = np.frombuffer(data, dtype=np.int16).astype(np.float32)
|
||||
self.buffer = np.concatenate([self.buffer, data_np])
|
||||
while self.offset + self.window_size < len(self.buffer):
|
||||
self.vad.accept_waveform(self.buffer[self.offset: self.offset + self.window_size])
|
||||
if not self.started and self.vad.is_speech_detected():
|
||||
self.started = True
|
||||
self.started_time = time.time()
|
||||
self.offset += self.window_size
|
||||
|
||||
if not self.started:
|
||||
if len(self.buffer) > 10 * self.window_size:
|
||||
self.offset -= len(self.buffer) - 10 * self.window_size
|
||||
self.buffer = self.buffer[-10 * self.window_size:]
|
||||
|
||||
if self.started and time.time() - self.started_time > 0.2:
|
||||
stream = self.recognizer.create_stream()
|
||||
stream.accept_waveform(16000, self.buffer)
|
||||
self.recognizer.decode_stream(stream)
|
||||
text = stream.result.text.strip()
|
||||
if text and self.prev_content != text:
|
||||
caption['index'] = self.cur_id
|
||||
caption['text'] = text
|
||||
caption['time_s'] = self.time_str
|
||||
caption['time_t'] = datetime.now().strftime('%H:%M:%S.%f')[:-3]
|
||||
self.prev_content = text
|
||||
stdout_obj(caption)
|
||||
self.started_time = time.time()
|
||||
|
||||
while not self.vad.empty():
|
||||
stream = self.recognizer.create_stream()
|
||||
stream.accept_waveform(16000, self.vad.front.samples)
|
||||
self.vad.pop()
|
||||
self.recognizer.decode_stream(stream)
|
||||
text = stream.result.text.strip()
|
||||
|
||||
if self.source == 'en':
|
||||
text_with_punct = self.punct.add_punctuation_with_case(text)
|
||||
else:
|
||||
text_with_punct = self.punct.add_punctuation(text)
|
||||
|
||||
caption['index'] = self.cur_id
|
||||
caption['text'] = text_with_punct
|
||||
caption['time_s'] = self.time_str
|
||||
caption['time_t'] = datetime.now().strftime('%H:%M:%S.%f')[:-3]
|
||||
if text:
|
||||
stdout_obj(caption)
|
||||
if self.target:
|
||||
th = threading.Thread(
|
||||
target=self.trans_func,
|
||||
args=(self.ollama_name, self.target, caption['text'], self.time_str),
|
||||
daemon=True
|
||||
)
|
||||
th.start()
|
||||
self.cur_id += 1
|
||||
self.prev_content = ''
|
||||
self.time_str = datetime.now().strftime('%H:%M:%S.%f')[:-3]
|
||||
self.buffer = []
|
||||
self.offset = 0
|
||||
self.started = False
|
||||
self.started_time = .0
|
||||
|
||||
def translate(self):
|
||||
"""持续读取共享数据中的音频帧,并进行语音识别,将识别结果输出到标准输出中"""
|
||||
global shared_data
|
||||
while shared_data.status == 'running':
|
||||
chunk = shared_data.chunk_queue.get()
|
||||
self.send_audio_frame(chunk)
|
||||
|
||||
def stop(self):
|
||||
"""停止 Sense Voice 模型"""
|
||||
stdout_cmd('info', 'Shepra ONNX Sense Voice recognizer closed.')
|
||||
engine/audio2text/vosk.py

@@ -1,8 +1,11 @@
 import json
+import threading
 import time
 from datetime import datetime
 
 from vosk import Model, KaldiRecognizer, SetLogLevel
-from utils import stdout_cmd, stdout_obj
+from utils import shared_data
+from utils import stdout_cmd, stdout_obj, google_translate, ollama_translate
 
 
 class VoskRecognizer:
@@ -11,14 +14,23 @@ class VoskRecognizer:
 
     初始化参数:
         model_path: Vosk 识别模型路径
+        target: 翻译目标语言
+        trans_model: 翻译模型名称
+        ollama_name: Ollama 模型名称
     """
-    def __init__(self, model_path: str):
+    def __init__(self, model_path: str, target: str | None, trans_model: str, ollama_name: str):
         SetLogLevel(-1)
         if model_path.startswith('"'):
             model_path = model_path[1:]
         if model_path.endswith('"'):
             model_path = model_path[:-1]
         self.model_path = model_path
+        self.target = target
+        if trans_model == 'google':
+            self.trans_func = google_translate
+        else:
+            self.trans_func = ollama_translate
+        self.ollama_name = ollama_name
         self.time_str = ''
         self.cur_id = 0
         self.prev_content = ''
@@ -48,7 +60,16 @@ class VoskRecognizer:
             caption['time_s'] = self.time_str
             caption['time_t'] = datetime.now().strftime('%H:%M:%S.%f')[:-3]
             self.prev_content = ''
             if content == '': return
             self.cur_id += 1
+
+            if self.target:
+                th = threading.Thread(
+                    target=self.trans_func,
+                    args=(self.ollama_name, self.target, caption['text'], self.time_str),
+                    daemon=True
+                )
+                th.start()
         else:
             content = json.loads(self.recognizer.PartialResult()).get('partial', '')
             if content == '' or content == self.prev_content:
@@ -63,6 +84,13 @@ class VoskRecognizer:
 
         stdout_obj(caption)
 
+    def translate(self):
+        """持续读取共享数据中的音频帧,并进行语音识别,将识别结果输出到标准输出中"""
+        global shared_data
+        while shared_data.status == 'running':
+            chunk = shared_data.chunk_queue.get()
+            self.send_audio_frame(chunk)
+
     def stop(self):
         """停止 Vosk 引擎"""
         stdout_cmd('info', 'Vosk recognizer closed.')
engine/main.py (178 lines changed)
@@ -1,90 +1,153 @@
 import wave
 import argparse
-from utils import stdout_cmd, stdout_err
-from utils import thread_data, start_server
+import threading
+from utils import stdout, stdout_cmd
+from utils import shared_data, start_server
 from utils import merge_chunk_channels, resample_chunk_mono
-from audio2text import InvalidParameter, GummyRecognizer
+from audio2text import GummyRecognizer
 from audio2text import VoskRecognizer
+from audio2text import SosvRecognizer
 from sysaudio import AudioStream
 
 
+def audio_recording(stream: AudioStream, resample: bool, save = False, path = ''):
+    global shared_data
+    stream.open_stream()
+    wf = None
+    if save:
+        if path != '':
+            path += '/'
+        wf = wave.open(f'{path}record.wav', 'wb')
+        wf.setnchannels(stream.CHANNELS)
+        wf.setsampwidth(stream.SAMP_WIDTH)
+        wf.setframerate(stream.CHUNK_RATE)
+    while shared_data.status == 'running':
+        raw_chunk = stream.read_chunk()
+        if save: wf.writeframes(raw_chunk) # type: ignore
+        if raw_chunk is None: continue
+        if resample:
+            chunk = resample_chunk_mono(raw_chunk, stream.CHANNELS, stream.RATE, 16000)
+        else:
+            chunk = merge_chunk_channels(raw_chunk, stream.CHANNELS)
+        shared_data.chunk_queue.put(chunk)
+    if save: wf.close() # type: ignore
+    stream.close_stream_signal()
+
+
 def main_gummy(s: str, t: str, a: int, c: int, k: str):
-    global thread_data
     """
     Parameters:
         s: Source language
         t: Target language
         k: Aliyun Bailian API key
     """
     stream = AudioStream(a, c)
     if t == 'none':
         engine = GummyRecognizer(stream.RATE, s, None, k)
     else:
         engine = GummyRecognizer(stream.RATE, s, t, k)
 
-    stream.open_stream()
     engine.start()
-    chunk_mono = bytes()
-
-    restart_count = 0
-    while thread_data.status == "running":
-        try:
-            chunk = stream.read_chunk()
-            if chunk is None: continue
-            chunk_mono = merge_chunk_channels(chunk, stream.CHANNELS)
-            try:
-                engine.send_audio_frame(chunk_mono)
-            except InvalidParameter as e:
-                restart_count += 1
-                if restart_count > 5:
-                    stdout_err(str(e))
-                    thread_data.status = "kill"
-                    stdout_cmd('kill')
-                    break
-                else:
-                    stdout_cmd('info', f'Gummy engine stopped, restart attempt: {restart_count}...')
-        except KeyboardInterrupt:
-            break
-
-    engine.send_audio_frame(chunk_mono)
-    stream.close_stream()
+    stream_thread = threading.Thread(
+        target=audio_recording,
+        args=(stream, False),
+        daemon=True
+    )
+    stream_thread.start()
+    try:
+        engine.translate()
+    except KeyboardInterrupt:
+        stdout("Keyboard interrupt detected. Exiting...")
+    engine.stop()
 
 
-def main_vosk(a: int, c: int, m: str):
-    global thread_data
+def main_vosk(a: int, c: int, vosk: str, t: str, tm: str, omn: str):
     """
     Parameters:
         a: Audio source: 0 for output, 1 for input
         c: Chunk number in 1 second
+        vosk: Vosk model path
+        t: Target language
+        tm: Translation model type, ollama or google
+        omn: Ollama model name
     """
     stream = AudioStream(a, c)
-    engine = VoskRecognizer(m)
+    if t == 'none':
+        engine = VoskRecognizer(vosk, None, tm, omn)
+    else:
+        engine = VoskRecognizer(vosk, t, tm, omn)
 
     stream.open_stream()
     engine.start()
-
-    while thread_data.status == "running":
-        try:
-            chunk = stream.read_chunk()
-            if chunk is None: continue
-            chunk_mono = resample_chunk_mono(chunk, stream.CHANNELS, stream.RATE, 16000)
-            engine.send_audio_frame(chunk_mono)
-        except KeyboardInterrupt:
-            break
-
-    stream.close_stream()
+    stream_thread = threading.Thread(
+        target=audio_recording,
+        args=(stream, True),
+        daemon=True
+    )
+    stream_thread.start()
+    try:
+        engine.translate()
+    except KeyboardInterrupt:
+        stdout("Keyboard interrupt detected. Exiting...")
+    engine.stop()
+
+
+def main_sosv(a: int, c: int, sosv: str, s: str, t: str, tm: str, omn: str):
+    """
+    Parameters:
+        a: Audio source: 0 for output, 1 for input
+        c: Chunk number in 1 second
+        sosv: Sherpa-ONNX SenseVoice model path
+        s: Source language
+        t: Target language
+        tm: Translation model type, ollama or google
+        omn: Ollama model name
+    """
+    stream = AudioStream(a, c)
+    if t == 'none':
+        engine = SosvRecognizer(sosv, s, None, tm, omn)
+    else:
+        engine = SosvRecognizer(sosv, s, t, tm, omn)
+
+    engine.start()
+    stream_thread = threading.Thread(
+        target=audio_recording,
+        args=(stream, True),
+        daemon=True
+    )
+    stream_thread.start()
+    try:
+        engine.translate()
+    except KeyboardInterrupt:
+        stdout("Keyboard interrupt detected. Exiting...")
+    engine.stop()
 
 
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description='Convert system audio stream to text')
-    # both
+    # all
     parser.add_argument('-e', '--caption_engine', default='gummy', help='Caption engine: gummy or vosk')
     parser.add_argument('-a', '--audio_type', default=0, help='Audio stream source: 0 for output, 1 for input')
     parser.add_argument('-c', '--chunk_rate', default=10, help='Number of audio stream chunks collected per second')
-    parser.add_argument('-p', '--port', default=8080, help='The port to run the server on, 0 for no server')
+    parser.add_argument('-p', '--port', default=0, help='The port to run the server on, 0 for no server')
+    parser.add_argument('-t', '--target_language', default='zh', help='Target language code, "none" for no translation')
+    # gummy and sosv
+    parser.add_argument('-s', '--source_language', default='auto', help='Source language code')
     # gummy only
-    parser.add_argument('-s', '--source_language', default='en', help='Source language code')
-    parser.add_argument('-t', '--target_language', default='zh', help='Target language code')
     parser.add_argument('-k', '--api_key', default='', help='API KEY for Gummy model')
+    # vosk and sosv
+    parser.add_argument('-tm', '--translation_model', default='ollama', help='Model for translation: ollama or google')
+    parser.add_argument('-omn', '--ollama_name', default='', help='Ollama model name for translation')
     # vosk only
-    parser.add_argument('-m', '--model_path', default='', help='The path to the vosk model.')
+    parser.add_argument('-vosk', '--vosk_model', default='', help='The path to the vosk model.')
+    # sosv only
+    parser.add_argument('-sosv', '--sosv_model', default=None, help='The SenseVoice model path')
 
     args = parser.parse_args()
     if int(args.port) == 0:
-        thread_data.status = "running"
+        shared_data.status = "running"
     else:
         start_server(int(args.port))
 
     if args.caption_engine == 'gummy':
         main_gummy(
             args.source_language,
@@ -97,10 +160,23 @@ if __name__ == "__main__":
         main_vosk(
             int(args.audio_type),
             int(args.chunk_rate),
-            args.model_path
+            args.vosk_model,
+            args.target_language,
+            args.translation_model,
+            args.ollama_name
         )
+    elif args.caption_engine == 'sosv':
+        main_sosv(
+            int(args.audio_type),
+            int(args.chunk_rate),
+            args.sosv_model,
+            args.source_language,
+            args.target_language,
+            args.translation_model,
+            args.ollama_name
+        )
     else:
         raise ValueError('Invalid caption engine specified.')
 
-    if thread_data.status == "kill":
+    if shared_data.status == "kill":
         stdout_cmd('kill')
requirements.txt

@@ -1,7 +1,10 @@
 dashscope
 numpy
-samplerate
+resampy
 vosk
 pyinstaller
 pyaudio; sys_platform == 'darwin'
 pyaudiowpatch; sys_platform == 'win32'
+googletrans
+ollama
+sherpa_onnx
@@ -37,14 +37,13 @@ class AudioStream:
         self.FORMAT = pyaudio.paInt16
         self.SAMP_WIDTH = pyaudio.get_sample_size(self.FORMAT)
         self.CHANNELS = int(self.device["maxInputChannels"])
-        self.RATE = int(self.device["defaultSampleRate"])
-        self.CHUNK = self.RATE // chunk_rate
-
-    def reset_chunk_size(self, chunk_size: int):
-        """
-        重新设置音频块大小
-        """
-        self.CHUNK = chunk_size
+        self.DEFAULT_RATE = int(self.device["defaultSampleRate"])
+        self.CHUNK_RATE = chunk_rate
+        self.RATE = 16000
+        self.CHUNK = self.RATE // self.CHUNK_RATE
+        self.open_stream()
+        self.close_stream()
 
     def get_info(self):
         dev_info = f"""
@@ -72,16 +71,27 @@ class AudioStream:
         打开并返回系统音频输出流
         """
         if self.stream: return self.stream
-        self.stream = self.mic.open(
-            format = self.FORMAT,
-            channels = int(self.CHANNELS),
-            rate = self.RATE,
-            input = True,
-            input_device_index = int(self.INDEX)
-        )
+        try:
+            self.stream = self.mic.open(
+                format = self.FORMAT,
+                channels = int(self.CHANNELS),
+                rate = self.RATE,
+                input = True,
+                input_device_index = int(self.INDEX)
+            )
+        except OSError:
+            self.RATE = self.DEFAULT_RATE
+            self.CHUNK = self.RATE // self.CHUNK_RATE
+            self.stream = self.mic.open(
+                format = self.FORMAT,
+                channels = int(self.CHANNELS),
+                rate = self.RATE,
+                input = True,
+                input_device_index = int(self.INDEX)
+            )
         return self.stream
 
-    def read_chunk(self):
+    def read_chunk(self) -> bytes | None:
         """
         读取音频数据
         """
@@ -55,15 +55,10 @@ class AudioStream:
         self.FORMAT = 16
         self.SAMP_WIDTH = 2
         self.CHANNELS = 2
-        self.RATE = 48000
+        self.RATE = 16000
         self.CHUNK_RATE = chunk_rate
         self.CHUNK = self.RATE // chunk_rate
-
-    def reset_chunk_size(self, chunk_size: int):
-        """
-        重新设置音频块大小
-        """
-        self.CHUNK = chunk_size
 
     def get_info(self):
         dev_info = f"""
         音频捕获进程:
@@ -84,7 +79,7 @@ class AudioStream:
         启动音频捕获进程
         """
         self.process = subprocess.Popen(
-            ["parec", "-d", self.source, "--format=s16le", "--rate=48000", "--channels=2"],
+            ["parec", "-d", self.source, "--format=s16le", "--rate=16000", "--channels=2"],
             stdout=subprocess.PIPE
         )
@@ -61,14 +61,13 @@ class AudioStream:
         self.FORMAT = pyaudio.paInt16
         self.SAMP_WIDTH = pyaudio.get_sample_size(self.FORMAT)
         self.CHANNELS = int(self.device["maxInputChannels"])
-        self.RATE = int(self.device["defaultSampleRate"])
-        self.CHUNK = self.RATE // chunk_rate
-
-    def reset_chunk_size(self, chunk_size: int):
-        """
-        重新设置音频块大小
-        """
-        self.CHUNK = chunk_size
+        self.DEFAULT_RATE = int(self.device["defaultSampleRate"])
+        self.CHUNK_RATE = chunk_rate
+        self.RATE = 16000
+        self.CHUNK = self.RATE // self.CHUNK_RATE
+        self.open_stream()
+        self.close_stream()
 
     def get_info(self):
         dev_info = f"""
@@ -96,13 +95,24 @@ class AudioStream:
         打开并返回系统音频输出流
         """
         if self.stream: return self.stream
-        self.stream = self.mic.open(
-            format = self.FORMAT,
-            channels = self.CHANNELS,
-            rate = self.RATE,
-            input = True,
-            input_device_index = self.INDEX
-        )
+        try:
+            self.stream = self.mic.open(
+                format = self.FORMAT,
+                channels = self.CHANNELS,
+                rate = self.RATE,
+                input = True,
+                input_device_index = self.INDEX
+            )
+        except OSError:
+            self.RATE = self.DEFAULT_RATE
+            self.CHUNK = self.RATE // self.CHUNK_RATE
+            self.stream = self.mic.open(
+                format = self.FORMAT,
+                channels = self.CHANNELS,
+                rate = self.RATE,
+                input = True,
+                input_device_index = self.INDEX
+            )
         return self.stream
 
     def read_chunk(self) -> bytes | None:
engine/utils/__init__.py

@@ -1,9 +1,5 @@
-from .audioprcs import (
-    merge_chunk_channels,
-    resample_chunk_mono,
-    resample_chunk_mono_np,
-    resample_mono_chunk
-)
+from .audioprcs import merge_chunk_channels, resample_chunk_mono
 from .sysout import stdout, stdout_err, stdout_cmd, stdout_obj, stderr
-from .thdata import thread_data
-from .server import start_server
+from .shared import shared_data
+from .server import start_server
+from .translation import ollama_translate, google_translate
@@ -1,4 +1,4 @@
|
||||
import samplerate
|
||||
import resampy
|
||||
import numpy as np
|
||||
import numpy.core.multiarray # do not remove
|
||||
|
||||
@@ -24,16 +24,15 @@ def merge_chunk_channels(chunk: bytes, channels: int) -> bytes:
|
||||
return chunk_mono.tobytes()
|
||||
|
||||
|
||||
def resample_chunk_mono(chunk: bytes, channels: int, orig_sr: int, target_sr: int, mode="sinc_best") -> bytes:
|
||||
def resample_chunk_mono(chunk: bytes, channels: int, orig_sr: int, target_sr: int) -> bytes:
|
||||
"""
|
||||
将当前多通道音频数据块转换成单通道音频数据块,然后进行重采样
|
||||
将当前多通道音频数据块转换成单通道音频数据块,并进行重采样
|
||||
|
||||
Args:
|
||||
chunk: 多通道音频数据块
|
||||
channels: 通道数
|
||||
orig_sr: 原始采样率
|
||||
target_sr: 目标采样率
|
||||
mode: 重采样模式,可选:'sinc_best' | 'sinc_medium' | 'sinc_fastest' | 'zero_order_hold' | 'linear'
|
||||
|
||||
Return:
|
||||
单通道音频数据块
|
||||
@@ -49,60 +48,17 @@ def resample_chunk_mono(chunk: bytes, channels: int, orig_sr: int, target_sr: in
|
||||
# (length,)
|
||||
chunk_mono = np.mean(chunk_np.astype(np.float32), axis=1)
|
||||
|
||||
ratio = target_sr / orig_sr
|
||||
    if orig_sr == target_sr:
        return chunk_mono.astype(np.int16).tobytes()

+    ratio = target_sr / orig_sr
-    chunk_mono_r = resampy.resample(chunk_mono, orig_sr, target_sr)
+    chunk_mono_r = samplerate.resample(chunk_mono, ratio, converter_type=mode)
    chunk_mono_r = np.round(chunk_mono_r).astype(np.int16)
+    real_len = round(chunk_mono.shape[0] * target_sr / orig_sr)
+    if chunk_mono_r.shape[0] != real_len:
+        print(chunk_mono_r.shape[0], real_len)
+        if chunk_mono_r.shape[0] > real_len:
+            chunk_mono_r = chunk_mono_r[:real_len]
+        else:
+            while chunk_mono_r.shape[0] < real_len:
+                chunk_mono_r = np.append(chunk_mono_r, chunk_mono_r[-1])
    return chunk_mono_r.tobytes()


def resample_chunk_mono_np(chunk: bytes, channels: int, orig_sr: int, target_sr: int, mode="sinc_best", dtype=np.float32) -> np.ndarray:
    """
    Down-mix the multi-channel audio chunk to mono, resample it, and return a NumPy array.

    Args:
        chunk: multi-channel audio chunk
        channels: number of channels
        orig_sr: original sample rate
        target_sr: target sample rate
        mode: resampling mode, one of: 'sinc_best' | 'sinc_medium' | 'sinc_fastest' | 'zero_order_hold' | 'linear'
        dtype: dtype of the returned NumPy array

    Return:
        mono audio chunk
    """
    if channels == 1:
        chunk_mono = np.frombuffer(chunk, dtype=np.int16)
        chunk_mono = chunk_mono.astype(np.float32)
    else:
        # (length * channels,)
        chunk_np = np.frombuffer(chunk, dtype=np.int16)
        # (length, channels)
        chunk_np = chunk_np.reshape(-1, channels)
        # (length,)
        chunk_mono = np.mean(chunk_np.astype(np.float32), axis=1)

    ratio = target_sr / orig_sr
    chunk_mono_r = samplerate.resample(chunk_mono, ratio, converter_type=mode)
    chunk_mono_r = chunk_mono_r.astype(dtype)
    return chunk_mono_r


def resample_mono_chunk(chunk: bytes, orig_sr: int, target_sr: int, mode="sinc_best") -> bytes:
    """
    Resample the mono audio chunk.

    Args:
        chunk: mono audio chunk
        orig_sr: original sample rate
        target_sr: target sample rate
        mode: resampling mode, one of: 'sinc_best' | 'sinc_medium' | 'sinc_fastest' | 'zero_order_hold' | 'linear'

    Return:
        mono audio chunk
    """
    chunk_np = np.frombuffer(chunk, dtype=np.int16)
    chunk_np = chunk_np.astype(np.float32)
    ratio = target_sr / orig_sr
    chunk_r = samplerate.resample(chunk_np, ratio, converter_type=mode)
    chunk_r = np.round(chunk_r).astype(np.int16)
    return chunk_r.tobytes()
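`resample_chunk_mono_np` averages the interleaved int16 channels into a single float32 channel before resampling. A minimal sketch of just the down-mix step (the helper name is illustrative, not part of the engine):

```python
import numpy as np

def downmix_to_mono(chunk: bytes, channels: int) -> np.ndarray:
    # Interleaved int16 samples -> (length, channels) -> mean over channels
    samples = np.frombuffer(chunk, dtype=np.int16).reshape(-1, channels)
    return samples.astype(np.float32).mean(axis=1)

# Stereo chunk: frames (L=0, R=200) and (L=100, R=300)
stereo = np.array([0, 200, 100, 300], dtype=np.int16).tobytes()
mono = downmix_to_mono(stereo, 2)
# mono -> [100.0, 200.0]
```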
@@ -1,13 +1,12 @@
 import socket
 import threading
 import json
-# import time
-from utils import thread_data, stdout_cmd, stderr
+from utils import shared_data, stdout_cmd, stderr


 def handle_client(client_socket):
-    global thread_data
-    while thread_data.status == 'running':
+    global shared_data
+    while shared_data.status == 'running':
         try:
             data = client_socket.recv(4096).decode('utf-8')
             if not data:
@@ -15,13 +14,13 @@ def handle_client(client_socket):
             data = json.loads(data)

             if data['command'] == 'stop':
-                thread_data.status = 'stop'
+                shared_data.status = 'stop'
                 break
         except Exception as e:
             stderr(f'Communication error: {e}')
             break

-    thread_data.status = 'stop'
+    shared_data.status = 'stop'
    client_socket.close()


@@ -34,7 +33,6 @@ def start_server(port: int):
         stderr(str(e))
         stdout_cmd('kill')
         return
-    # time.sleep(20)
    stdout_cmd('connect')

    client, addr = server.accept()
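The engine's control channel is a plain TCP socket carrying JSON commands; `handle_client` stops the engine when it receives `{"command": "stop"}`. A minimal client-side sketch of that protocol (the function name and localhost address are illustrative assumptions, not engine code):

```python
import json
import socket

def send_stop(port: int) -> None:
    # Connect to the engine's control socket and send the JSON stop command
    with socket.create_connection(('127.0.0.1', port)) as s:
        s.sendall(json.dumps({'command': 'stop'}).encode('utf-8'))
```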
engine/utils/shared.py (new file, +8)
@@ -0,0 +1,8 @@
+import queue
+
+class SharedData:
+    def __init__(self):
+        self.status = "running"
+        self.chunk_queue = queue.Queue()
+
+shared_data = SharedData()
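`SharedData` replaces the old `ThreadData`: besides the status flag it now carries a thread-safe queue for audio chunks. A sketch of the intended producer/consumer hand-off (the producer body is illustrative, not taken from the engine):

```python
import queue
import threading

class SharedData:
    def __init__(self):
        self.status = "running"
        self.chunk_queue = queue.Queue()

shared_data = SharedData()

def producer():
    # The capture thread pushes raw chunks; queue.Queue handles the locking
    shared_data.chunk_queue.put(b'\x00\x01\x02\x03')
    shared_data.status = "stop"

t = threading.Thread(target=producer)
t.start()
chunk = shared_data.chunk_queue.get()   # blocks until a chunk arrives
t.join()
```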
@@ -1,5 +0,0 @@
-class ThreadData:
-    def __init__(self):
-        self.status = "running"
-
-thread_data = ThreadData()
engine/utils/translation.py (new file, +49)
@@ -0,0 +1,49 @@
+from ollama import chat
+from ollama import ChatResponse
+import asyncio
+from googletrans import Translator
+from .sysout import stdout_cmd, stdout_obj
+
+lang_map = {
+    'en': 'English',
+    'es': 'Spanish',
+    'fr': 'French',
+    'de': 'German',
+    'it': 'Italian',
+    'ru': 'Russian',
+    'ja': 'Japanese',
+    'ko': 'Korean',
+    'zh': 'Chinese',
+    'zh-cn': 'Chinese'
+}
+
+def ollama_translate(model: str, target: str, text: str, time_s: str):
+    response: ChatResponse = chat(
+        model=model,
+        messages=[
+            {"role": "system", "content": f"/no_think Translate the following content into {lang_map[target]}, and do not output any additional information."},
+            {"role": "user", "content": text}
+        ]
+    )
+    content = response.message.content or ""
+    if content.startswith('<think>'):
+        index = content.find('</think>')
+        if index != -1:
+            content = content[index+8:]
+    stdout_obj({
+        "command": "translation",
+        "time_s": time_s,
+        "translation": content.strip()
+    })
+
+def google_translate(model: str, target: str, text: str, time_s: str):
+    translator = Translator()
+    try:
+        res = asyncio.run(translator.translate(text, dest=target))
+        stdout_obj({
+            "command": "translation",
+            "time_s": time_s,
+            "translation": res.text
+        })
+    except Exception as e:
+        stdout_cmd("warn", "Google translation request failed, please check your network connection...")
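`ollama_translate` strips a leading `<think>…</think>` block from reasoning-model output before emitting the translation; the `index+8` skips the 8-character `</think>` tag. The same logic in isolation (helper name illustrative):

```python
def strip_think(content: str) -> str:
    # Drop a leading <think>...</think> reasoning block, if present
    if content.startswith('<think>'):
        index = content.find('</think>')
        if index != -1:
            content = content[index + 8:]   # len('</think>') == 8
    return content.strip()

# strip_think('<think>reasoning...</think>\nHola mundo') -> 'Hola mundo'
```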
@@ -160,7 +160,7 @@ class ControlWindow {
        })

        ipcMain.on('control.engine.forceKill', () => {
-            captionEngine.forceKill()
+            captionEngine.kill()
        })

        ipcMain.on('control.captionLog.clear', () => {
@@ -6,6 +6,8 @@ export interface Controls {
    engineEnabled: boolean,
    sourceLang: string,
    targetLang: string,
+    transModel: string,
+    ollamaName: string,
    engine: string,
    audio: 0 | 1,
    translation: boolean,
@@ -7,6 +7,11 @@
 import { app, BrowserWindow } from 'electron'
 import * as path from 'path'
 import * as fs from 'fs'

+interface CaptionTranslation {
+    time_s: string,
+    translation: string
+}
+
 const defaultStyles: Styles = {
     lineBreak: 1,
     fontFamily: 'sans-serif',
@@ -31,6 +36,8 @@ const defaultStyles: Styles = {
 const defaultControls: Controls = {
     sourceLang: 'en',
     targetLang: 'zh',
+    transModel: 'ollama',
+    ollamaName: '',
     engine: 'gummy',
     audio: 0,
     engineEnabled: false,
@@ -158,12 +165,28 @@ class AllConfig {
        }
    }

-    public sendCaptionLog(window: BrowserWindow, command: 'add' | 'upd' | 'set') {
+    public updateCaptionTranslation(trans: CaptionTranslation){
+        for(let i = this.captionLog.length - 1; i >= 0; i--){
+            if(this.captionLog[i].time_s === trans.time_s){
+                this.captionLog[i].translation = trans.translation
+                for(const window of BrowserWindow.getAllWindows()){
+                    this.sendCaptionLog(window, 'upd', i)
+                }
+                break
+            }
+        }
+    }
+
+    public sendCaptionLog(
+        window: BrowserWindow,
+        command: 'add' | 'upd' | 'set',
+        index: number | undefined = undefined
+    ) {
        if(command === 'add'){
-            window.webContents.send(`both.captionLog.add`, this.captionLog[this.captionLog.length - 1])
+            window.webContents.send(`both.captionLog.add`, this.captionLog.at(-1))
        }
        else if(command === 'upd'){
-            window.webContents.send(`both.captionLog.upd`, this.captionLog[this.captionLog.length - 1])
+            if(index !== undefined) window.webContents.send(`both.captionLog.upd`, this.captionLog[index])
+            else window.webContents.send(`both.captionLog.upd`, this.captionLog.at(-1))
        }
        else if(command === 'set'){
            window.webContents.send(`both.captionLog.set`, this.captionLog)
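`updateCaptionTranslation` matches an incoming translation to a caption log entry by its `time_s` key, scanning from the end of the log because the affected entry is almost always recent. The same lookup as a standalone sketch (plain dicts stand in for the log entries; the function name is illustrative):

```python
def update_translation(caption_log, time_s, translation):
    # Scan backwards: translations arrive for recently added captions
    for i in range(len(caption_log) - 1, -1, -1):
        if caption_log[i]["time_s"] == time_s:
            caption_log[i]["translation"] = translation
            return i          # index of the updated entry
    return -1                 # no matching caption

log = [{"time_s": "0:01", "translation": ""},
       {"time_s": "0:04", "translation": ""}]
idx = update_translation(log, "0:04", "你好")
# idx -> 1, log[1]["translation"] -> "你好"
```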
@@ -67,22 +67,23 @@ export class CaptionEngine {
        this.command.push('-a', allConfig.controls.audio ? '1' : '0')
        this.port = Math.floor(Math.random() * (65535 - 1024 + 1)) + 1024
        this.command.push('-p', this.port.toString())
+        this.command.push(
+            '-t', allConfig.controls.translation ?
+            allConfig.controls.targetLang : 'none'
+        )

        if(allConfig.controls.engine === 'gummy') {
            this.command.push('-e', 'gummy')
            this.command.push('-s', allConfig.controls.sourceLang)
-            this.command.push(
-                '-t', allConfig.controls.translation ?
-                allConfig.controls.targetLang : 'none'
-            )
            if(allConfig.controls.API_KEY) {
                this.command.push('-k', allConfig.controls.API_KEY)
            }
        }
        else if(allConfig.controls.engine === 'vosk'){
            this.command.push('-e', 'vosk')
-            this.command.push('-m', `"${allConfig.controls.modelPath}"`)
+            this.command.push('-vosk', `"${allConfig.controls.modelPath}"`)
+            this.command.push('-tm', allConfig.controls.transModel)
+            this.command.push('-omn', allConfig.controls.ollamaName)
        }
    }
    Log.info('Engine Path:', this.appPath)
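The engine port above is picked uniformly from the unprivileged range with `Math.floor(Math.random() * (65535 - 1024 + 1)) + 1024`. The same formula in Python, for reference (function name illustrative):

```python
import random

def pick_port(lo: int = 1024, hi: int = 65535) -> int:
    # Same distribution as Math.floor(Math.random() * (hi - lo + 1)) + lo
    return int(random.random() * (hi - lo + 1)) + lo
```

Note that a random pick can collide with a port already in use; binding a socket to port 0 and reading back the kernel-assigned port avoids that, at the cost of passing the port to the child process differently.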
@@ -97,7 +98,6 @@ export class CaptionEngine {

    public connect() {
-        if(this.client) { Log.warn('Client already exists, ignoring...') }
        // Clear the startup timeout timer
        if (this.startTimeoutID) {
            clearTimeout(this.startTimeoutID)
            this.startTimeoutID = undefined
@@ -137,14 +137,13 @@ export class CaptionEngine {
        this.status = 'starting'
        Log.info('Caption Engine Starting, PID:', this.process.pid)

        // Set up the startup timeout mechanism
        const timeoutMs = allConfig.controls.startTimeoutSeconds * 1000
        this.startTimeoutID = setTimeout(() => {
            if (this.status === 'starting') {
                Log.warn(`Engine start timeout after ${allConfig.controls.startTimeoutSeconds} seconds, forcing kill...`)
                this.status = 'starting-timeout'
                controlWindow.sendErrorMessage(i18n('engine.start.timeout'))
-                this.forceKill()
+                this.kill()
            }
        }, timeoutMs)
@@ -182,7 +181,6 @@ export class CaptionEngine {
        }
        this.status = 'stopped'
        clearInterval(this.timerID)
        // Clean up the startup timeout timer
        if (this.startTimeoutID) {
            clearTimeout(this.startTimeoutID)
            this.startTimeoutID = undefined
@@ -194,7 +192,6 @@ export class CaptionEngine {
    public stop() {
        if(this.status !== 'running'){
            Log.warn('Trying to stop engine which is not running, current status:', this.status)
            return
        }
        this.sendCommand('stop')
        if(this.client){
@@ -210,27 +207,12 @@ export class CaptionEngine {
    }

    public kill(){
        if(!this.process || !this.process.pid) return
-        if(this.status !== 'running'){
-            Log.warn('Trying to kill engine which is not running, current status:', this.status)
-            return
-        }
-        this.sendCommand('stop')
-        if(this.client){
-            this.client.destroy()
-            this.client = undefined
-        }
-        this.status = 'stopping'
-        this.timerID = setTimeout(() => {
-            if(this.status !== 'stopping') return
-            Log.warn('Engine process still not stopped, trying to kill...')
-            this.forceKill()
-        }, 4000);
-    }
+        Log.warn('Killing engine process, PID:', this.process.pid)
-
-    public forceKill(){
-        if(!this.process || !this.process.pid) return
-        Log.warn('Force killing engine process, PID:', this.process.pid)
        // Clean up the startup timeout timer
        if (this.startTimeoutID) {
            clearTimeout(this.startTimeoutID)
            this.startTimeoutID = undefined
@@ -246,13 +228,12 @@ export class CaptionEngine {
        }
        exec(cmd, (error) => {
            if (error) {
-                Log.error('Failed to force kill process:', error)
+                Log.error('Failed to kill process:', error)
            } else {
-                Log.info('Process force killed successfully')
+                Log.info('Process killed successfully')
            }
        })
    }
        this.status = 'stopping'
    }
}
@@ -269,12 +250,18 @@ function handleEngineData(data: any) {
    else if(data.command === 'caption') {
        allConfig.updateCaptionLog(data);
    }
+    else if(data.command === 'translation') {
+        allConfig.updateCaptionTranslation(data);
+    }
    else if(data.command === 'print') {
-        Log.info('Engine Print:', data.content)
+        console.log(data.content)
    }
+    else if(data.command === 'info') {
+        Log.info('Engine Info:', data.content)
+    }
    else if(data.command === 'warn') {
        Log.warn('Engine Warn:', data.content)
    }
    else if(data.command === 'error') {
        Log.error('Engine Error:', data.content)
        controlWindow.sendErrorMessage(/*i18n('engine.error') +*/ data.content)
@@ -5,9 +5,18 @@
        <a @click="applyChange">{{ $t('engine.applyChange') }}</a> |
        <a @click="cancelChange">{{ $t('engine.cancelChange') }}</a>
    </template>
+    <div class="input-item">
+        <span class="input-label">{{ $t('engine.captionEngine') }}</span>
+        <a-select
+            class="input-area"
+            v-model:value="currentEngine"
+            :options="captionEngine"
+        ></a-select>
+    </div>
    <div class="input-item">
        <span class="input-label">{{ $t('engine.sourceLang') }}</span>
        <a-select
+            :disabled="currentEngine === 'vosk'"
            class="input-area"
            v-model:value="currentSourceLang"
            :options="langList"
@@ -16,20 +25,33 @@
    <div class="input-item">
        <span class="input-label">{{ $t('engine.transLang') }}</span>
        <a-select
+            :disabled="currentEngine === 'vosk'"
            class="input-area"
            v-model:value="currentTargetLang"
            :options="langList.filter((item) => item.value !== 'auto')"
        ></a-select>
    </div>
-    <div class="input-item">
-        <span class="input-label">{{ $t('engine.captionEngine') }}</span>
+    <div class="input-item" v-if="transModel">
+        <span class="input-label">{{ $t('engine.transModel') }}</span>
        <a-select
            class="input-area"
-            v-model:value="currentEngine"
-            :options="captionEngine"
+            v-model:value="currentTransModel"
+            :options="transModel"
        ></a-select>
    </div>
+    <div class="input-item" v-if="transModel && currentTransModel === 'ollama'">
+        <a-popover placement="right">
+            <template #content>
+                <p class="label-hover-info">{{ $t('engine.ollamaNote') }}</p>
+            </template>
+            <span class="input-label info-label"
+                :style="{color: uiColor}"
+            >{{ $t('engine.ollama') }}</span>
+        </a-popover>
+        <a-input
+            class="input-area"
+            v-model:value="currentOllamaName"
+        ></a-input>
+    </div>
    <div class="input-item">
        <span class="input-label">{{ $t('engine.audioType') }}</span>
        <a-select
@@ -80,11 +102,13 @@

    <a-card size="small" :title="$t('engine.showMore')" v-show="showMore" style="margin-top:10px;">
        <div class="input-item">
-            <a-popover>
+            <a-popover placement="right">
                <template #content>
                    <p class="label-hover-info">{{ $t('engine.apikeyInfo') }}</p>
                </template>
-                <span class="input-label info-label">{{ $t('engine.apikey') }}</span>
+                <span class="input-label info-label"
+                    :style="{color: uiColor}"
+                >{{ $t('engine.apikey') }}</span>
            </a-popover>
            <a-input
                class="input-area"
@@ -93,14 +117,17 @@
            />
        </div>
        <div class="input-item">
-            <a-popover>
+            <a-popover placement="right">
                <template #content>
                    <p class="label-hover-info">{{ $t('engine.modelPathInfo') }}</p>
                </template>
-                <span class="input-label info-label">{{ $t('engine.modelPath') }}</span>
+                <span class="input-label info-label"
+                    :style="{color: uiColor}"
+                >{{ $t('engine.modelPath') }}</span>
            </a-popover>
            <span
                class="input-folder"
                :style="{color: uiColor}"
                @click="selectFolderPath"
            ><span><FolderOpenOutlined /></span></span>
            <a-input
@@ -110,13 +137,13 @@
            />
        </div>
        <div class="input-item">
-            <a-popover>
+            <a-popover placement="right">
                <template #content>
                    <p class="label-hover-info">{{ $t('engine.startTimeoutInfo') }}</p>
                </template>
                <span
                    class="input-label info-label"
-                    style="vertical-align: middle;"
+                    :style="{color: uiColor, verticalAlign: 'middle'}"
                >{{ $t('engine.startTimeout') }}</span>
            </a-popover>
            <a-input-number
@@ -134,12 +161,12 @@
 </template>

 <script setup lang="ts">
-import { ref, computed, watch } from 'vue'
+import { ref, computed, watch, h } from 'vue'
 import { storeToRefs } from 'pinia'
 import { useGeneralSettingStore } from '@renderer/stores/generalSetting'
 import { useEngineControlStore } from '@renderer/stores/engineControl'
 import { notification } from 'ant-design-vue'
-import { FolderOpenOutlined ,InfoCircleOutlined } from '@ant-design/icons-vue';
+import { ExclamationCircleOutlined, FolderOpenOutlined ,InfoCircleOutlined } from '@ant-design/icons-vue';
 import { useI18n } from 'vue-i18n'

 const { t } = useI18n()
@@ -148,11 +175,16 @@ const showMore = ref(false)
 const engineControl = useEngineControlStore()
 const { captionEngine, audioType, changeSignal } = storeToRefs(engineControl)

+const generalSetting = useGeneralSettingStore()
+const { uiColor } = storeToRefs(generalSetting)
+
 const currentSourceLang = ref('auto')
 const currentTargetLang = ref('zh')
 const currentEngine = ref<string>('gummy')
 const currentAudio = ref<0 | 1>(0)
-const currentTranslation = ref<boolean>(false)
+const currentTranslation = ref<boolean>(true)
+const currentTransModel = ref('ollama')
+const currentOllamaName = ref('')
 const currentAPI_KEY = ref<string>('')
 const currentModelPath = ref<string>('')
 const currentCustomized = ref<boolean>(false)
@@ -169,9 +201,33 @@ const langList = computed(() => {
    return []
 })

+const transModel = computed(() => {
+    for(let item of captionEngine.value){
+        if(item.value === currentEngine.value) {
+            return item.transModel
+        }
+    }
+    return []
+})
+
 function applyChange(){
+    if(
+        currentTranslation.value && transModel.value &&
+        currentTransModel.value === 'ollama' && !currentOllamaName.value.trim()
+    ) {
+        notification.open({
+            message: t('noti.ollamaNameNull'),
+            description: t('noti.ollamaNameNullNote'),
+            duration: null,
+            icon: () => h(ExclamationCircleOutlined, { style: 'color: #ff4d4f' })
+        })
+        return
+    }
+
    engineControl.sourceLang = currentSourceLang.value
    engineControl.targetLang = currentTargetLang.value
+    engineControl.transModel = currentTransModel.value
+    engineControl.ollamaName = currentOllamaName.value
    engineControl.engine = currentEngine.value
    engineControl.audio = currentAudio.value
    engineControl.translation = currentTranslation.value
@@ -194,6 +250,8 @@ function applyChange(){
 function cancelChange(){
    currentSourceLang.value = engineControl.sourceLang
    currentTargetLang.value = engineControl.targetLang
+    currentTransModel.value = engineControl.transModel
+    currentOllamaName.value = engineControl.ollamaName
    currentEngine.value = engineControl.engine
    currentAudio.value = engineControl.audio
    currentTranslation.value = engineControl.translation
@@ -222,7 +280,10 @@ watch(changeSignal, (val) => {
 watch(currentEngine, (val) => {
    if(val == 'vosk'){
        currentSourceLang.value = 'auto'
-        currentTargetLang.value = ''
+        currentTargetLang.value = useGeneralSettingStore().uiLanguage
+        if(currentTargetLang.value === 'zh') {
+            currentTargetLang.value = 'zh-cn'
+        }
    }
    else if(val == 'gummy'){
        currentSourceLang.value = 'auto'
@@ -240,8 +301,8 @@ watch(currentEngine, (val) => {
 }

 .info-label {
-    color: #1677ff;
    cursor: pointer;
+    font-style: italic;
 }

 .input-folder {
@@ -252,20 +313,12 @@ watch(currentEngine, (val) => {
    transition: all 0.25s;
 }

-.input-folder>span {
-    padding: 0 2px;
-    border: 2px solid #1677ff;
-    color: #1677ff;
-    border-radius: 30%;
-}
-
 .input-folder:hover {
    transform: scale(1.1);
 }

 .customize-note {
    padding: 10px 10px 0;
    color: red;
    max-width: min(40vw, 480px);
 }
 </style>
@@ -21,6 +21,19 @@ export const engines = {
        label: '本地 - Vosk',
        languages: [
            { value: 'auto', label: '需要自行配置模型' },
+            { value: 'en', label: '英语' },
+            { value: 'zh-cn', label: '中文' },
+            { value: 'ja', label: '日语' },
+            { value: 'ko', label: '韩语' },
+            { value: 'de', label: '德语' },
+            { value: 'fr', label: '法语' },
+            { value: 'ru', label: '俄语' },
+            { value: 'es', label: '西班牙语' },
+            { value: 'it', label: '意大利语' },
        ],
+        transModel: [
+            { value: 'ollama', label: 'Ollama 本地模型' },
+            { value: 'google', label: 'Google API 调用' },
+        ]
    }
 ],
@@ -46,6 +59,19 @@ export const engines = {
        label: 'Local - Vosk',
        languages: [
            { value: 'auto', label: 'Model needs to be configured manually' },
+            { value: 'en', label: 'English' },
+            { value: 'zh-cn', label: 'Chinese' },
+            { value: 'ja', label: 'Japanese' },
+            { value: 'ko', label: 'Korean' },
+            { value: 'de', label: 'German' },
+            { value: 'fr', label: 'French' },
+            { value: 'ru', label: 'Russian' },
+            { value: 'es', label: 'Spanish' },
+            { value: 'it', label: 'Italian' },
        ],
+        transModel: [
+            { value: 'ollama', label: 'Ollama Local Model' },
+            { value: 'google', label: 'Google API Call' },
+        ]
    }
 ],
@@ -71,8 +97,20 @@ export const engines = {
        label: 'ローカル - Vosk',
        languages: [
            { value: 'auto', label: 'モデルを手動で設定する必要があります' },
+            { value: 'en', label: '英語' },
+            { value: 'zh-cn', label: '中国語' },
+            { value: 'ja', label: '日本語' },
+            { value: 'ko', label: '韓国語' },
+            { value: 'de', label: 'ドイツ語' },
+            { value: 'fr', label: 'フランス語' },
+            { value: 'ru', label: 'ロシア語' },
+            { value: 'es', label: 'スペイン語' },
+            { value: 'it', label: 'イタリア語' },
        ],
+        transModel: [
+            { value: 'ollama', label: 'Ollama ローカルモデル' },
+            { value: 'google', label: 'Google API 呼び出し' },
+        ]
    }
 ]
 }
@@ -28,7 +28,9 @@ export default {
    "changeInfo": "If the caption engine is already running, you need to restart it for the changes to take effect.",
    "styleChange": "Caption Style Changed",
    "styleInfo": "Caption style changes have been saved and applied.",
-    "engineStartTimeout": "Caption engine startup timeout, automatically force stopped"
+    "engineStartTimeout": "Caption engine startup timeout, automatically force stopped",
+    "ollamaNameNull": "'Ollama' Field is Empty",
+    "ollamaNameNullNote": "When the Ollama model is selected as the translation model, the 'Ollama' field cannot be empty; fill in the name of an Ollama model already configured locally."
 },
 general: {
    "title": "General Settings",
@@ -47,6 +49,9 @@ export default {
    "cancelChange": "Cancel Changes",
    "sourceLang": "Source",
    "transLang": "Translation",
+    "transModel": "Model",
+    "ollama": "Ollama",
+    "ollamaNote": "The name of the local Ollama model to use for translation; the service on Ollama's default port will be called. A non-reasoning model with fewer than 1B parameters is recommended.",
    "captionEngine": "Engine",
    "audioType": "Audio Type",
    "systemOutput": "System Audio Output (Speaker)",
@@ -28,7 +28,9 @@ export default {
    "changeInfo": "字幕エンジンがすでに起動している場合、変更を有効にするには再起動が必要です。",
    "styleChange": "字幕のスタイルが変更されました",
    "styleInfo": "字幕のスタイル変更が保存され、適用されました",
-    "engineStartTimeout": "字幕エンジンの起動がタイムアウトしました。自動的に強制停止しました"
+    "engineStartTimeout": "字幕エンジンの起動がタイムアウトしました。自動的に強制停止しました",
+    "ollamaNameNull": "Ollama フィールドが空です",
+    "ollamaNameNullNote": "Ollama モデルを翻訳モデルとして選択する場合、Ollama フィールドは空にできません。ローカルで設定された Ollama モデルの名前を入力してください。"
 },
 general: {
    "title": "一般設定",
@@ -47,6 +49,9 @@ export default {
    "cancelChange": "変更をキャンセル",
    "sourceLang": "ソース言語",
    "transLang": "翻訳言語",
+    "transModel": "翻訳モデル",
+    "ollama": "Ollama",
+    "ollamaNote": "翻訳に使用する、デフォルトポートでサービスを呼び出すローカルOllamaモデルの名前。1B 未満のパラメータを持つ非推論モデルの使用を推奨します。",
    "captionEngine": "エンジン",
    "audioType": "オーディオ",
    "systemOutput": "システムオーディオ出力(スピーカー)",
@@ -28,7 +28,9 @@ export default {
    "changeInfo": "如果字幕引擎已经启动,需要重启字幕引擎修改才会生效",
    "styleChange": "字幕样式已修改",
    "styleInfo": "字幕样式修改已经保存并生效",
-    "engineStartTimeout": "字幕引擎启动超时,已自动强制停止"
+    "engineStartTimeout": "字幕引擎启动超时,已自动强制停止",
+    "ollamaNameNull": "Ollama 字段为空",
+    "ollamaNameNullNote": "选择 Ollama 模型作为翻译模型时,Ollama 字段不能为空,需要填写本地已经配置好的 Ollama 模型的名称。"
 },
 general: {
    "title": "通用设置",
@@ -47,6 +49,9 @@ export default {
    "cancelChange": "取消更改",
    "sourceLang": "源语言",
    "transLang": "翻译语言",
+    "transModel": "翻译模型",
+    "ollama": "Ollama",
+    "ollamaNote": "要使用的进行翻译的本地 Ollama 模型的名称,将调用默认端口的服务,建议使用参数量小于 1B 的非推理模型。",
    "captionEngine": "字幕引擎",
    "audioType": "音频类型",
    "systemOutput": "系统音频输出(扬声器)",
@@ -15,7 +15,12 @@ export const useCaptionLogStore = defineStore('captionLog', () => {
    })

    window.electron.ipcRenderer.on('both.captionLog.upd', (_, log) => {
-        captionData.value.splice(captionData.value.length - 1, 1, log)
+        for(let i = captionData.value.length - 1; i >= 0; i--) {
+            if(captionData.value[i].time_s === log.time_s){
+                captionData.value.splice(i, 1, log)
+                break
+            }
+        }
    })

    window.electron.ipcRenderer.on('both.captionLog.set', (_, logs) => {
@@ -19,6 +19,8 @@ export const useEngineControlStore = defineStore('engineControl', () => {
    const engineEnabled = ref(false)
    const sourceLang = ref<string>('en')
    const targetLang = ref<string>('zh')
+    const transModel = ref<string>('ollama')
+    const ollamaName = ref<string>('')
    const engine = ref<string>('gummy')
    const audio = ref<0 | 1>(0)
    const translation = ref<boolean>(true)
@@ -37,6 +39,8 @@ export const useEngineControlStore = defineStore('engineControl', () => {
    engineEnabled: engineEnabled.value,
    sourceLang: sourceLang.value,
    targetLang: targetLang.value,
+    transModel: transModel.value,
+    ollamaName: ollamaName.value,
    engine: engine.value,
    audio: audio.value,
    translation: translation.value,
@@ -68,6 +72,8 @@ export const useEngineControlStore = defineStore('engineControl', () => {
    }
    sourceLang.value = controls.sourceLang
    targetLang.value = controls.targetLang
+    transModel.value = controls.transModel
+    ollamaName.value = controls.ollamaName
    engine.value = controls.engine
    audio.value = controls.audio
    engineEnabled.value = controls.engineEnabled
@@ -132,6 +138,8 @@ export const useEngineControlStore = defineStore('engineControl', () => {
    engineEnabled, // whether the caption engine is enabled
    sourceLang, // source language
    targetLang, // target language
+    transModel, // translation model
+    ollamaName, // Ollama model name
    engine, // caption engine
    audio, // selected audio source
    translation, // whether translation is enabled
@@ -6,6 +6,8 @@ export interface Controls {
    engineEnabled: boolean,
    sourceLang: string,
    targetLang: string,
+    transModel: string,
+    ollamaName: string,
    engine: string,
    audio: 0 | 1,
    translation: boolean,