refactor(caption-engine): 重构字幕引擎代码结构

- 重构 GummyTranslator 类,增加启动和停止方法
- 优化 AudioStream 类,添加读取音频数据方法
- 更新 main-gummy.py,使用新的 GummyTranslator 和 AudioStream 接口
- 更新文档和 TODO 列表
This commit is contained in:
himeditator
2025-07-06 22:46:46 +08:00
parent 213426dace
commit f2aa075e65
16 changed files with 517 additions and 174 deletions

View File

@@ -0,0 +1 @@
from .streamchnl import mergeStreamChannels

View File

@@ -0,0 +1,22 @@
import numpy as np
def mergeStreamChannels(data, channels):
"""
将当前多通道流数据合并为单通道流数据
Args:
data: 多通道数据
channels: 通道数
Returns:
mono_data_bytes: 单通道数据
"""
# (length * channels,)
data_np = np.frombuffer(data, dtype=np.int16)
# (length, channels)
data_np_r = data_np.reshape(-1, channels)
# (length,)
mono_data = np.mean(data_np_r.astype(np.float32), axis=1)
mono_data = mono_data.astype(np.int16)
mono_data_bytes = mono_data.tobytes()
return mono_data_bytes