mirror of
https://github.com/HiMeditator/auto-caption.git
synced 2026-02-04 04:14:42 +08:00
feat(engine): 重构字幕引擎,新增 Sherpa-ONNX SenseVoice 语音识别模型
- 重构字幕引擎,将音频采集改为在新线程上进行 - 重构 audio2text 中的类,调整运行逻辑 - 更新 main 函数,添加对 Sosv 模型的支持 - 修改 AudioStream 类,默认使用 16000Hz 采样率
This commit is contained in:
@@ -55,15 +55,10 @@ class AudioStream:
|
||||
self.FORMAT = 16
|
||||
self.SAMP_WIDTH = 2
|
||||
self.CHANNELS = 2
|
||||
self.RATE = 48000
|
||||
self.RATE = 16000
|
||||
self.CHUNK_RATE = chunk_rate
|
||||
self.CHUNK = self.RATE // chunk_rate
|
||||
|
||||
def reset_chunk_size(self, chunk_size: int):
|
||||
"""
|
||||
重新设置音频块大小
|
||||
"""
|
||||
self.CHUNK = chunk_size
|
||||
|
||||
def get_info(self):
|
||||
dev_info = f"""
|
||||
音频捕获进程:
|
||||
@@ -84,7 +79,7 @@ class AudioStream:
|
||||
启动音频捕获进程
|
||||
"""
|
||||
self.process = subprocess.Popen(
|
||||
["parec", "-d", self.source, "--format=s16le", "--rate=48000", "--channels=2"],
|
||||
["parec", "-d", self.source, "--format=s16le", "--rate=16000", "--channels=2"],
|
||||
stdout=subprocess.PIPE
|
||||
)
|
||||
|
||||
|
||||
Reference in New Issue
Block a user