GPU acceleration and batch-processing optimizations; update README

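- STTN Auto/Det: wrap inference in a single torch.no_grad block to reduce repeated context-switch overhead
- STTN Auto: add FramePrefetcher for frame prefetching; adjust batch size dynamically based on GPU VRAM
- Lama Inpaint: add _inpaint_batch for batched inference, merging multiple frames into one GPU inference pass
- ProPainter: replace copy.deepcopy with a shallow copy; call gc.collect after each region is processed
- HardwareAccelerator: add get_available_vram_mb method for querying available VRAM
- README: add the application logo; sync the English README_en.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>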
2026-05-18 11:37:35 +08:00 · 2026-04-08 00:17:50 +08:00
parent 8aac76030d
commit e801d58e80
7 changed files with 177 additions and 128 deletions
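
The commit message mentions adjusting the batch size dynamically from GPU VRAM, but the STTN hunk that does so is not shown here. The following is a minimal sketch of how a free-VRAM reading (such as the value returned by get_available_vram_mb) might be mapped to a batch size. The names pick_batch_size and iter_batches, the 512 MB per-frame cost, the 0.8 headroom factor, and the 1..16 clamp are all assumptions made for illustration, not values taken from this commit.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def pick_batch_size(
    free_vram_mb: float,
    per_frame_mb: float = 512.0,  # assumed per-frame memory cost (hypothetical)
    min_batch: int = 1,
    max_batch: int = 16,          # assumed upper clamp (hypothetical)
    headroom: float = 0.8,        # use only part of the free VRAM (hypothetical)
) -> int:
    """Map a free-VRAM reading in MB to a batch size clamped to [min_batch, max_batch]."""
    if free_vram_mb <= 0 or per_frame_mb <= 0:
        # No GPU, or the query failed and reported 0: fall back to the smallest batch.
        return min_batch
    usable_mb = free_vram_mb * headroom
    return max(min_batch, min(max_batch, int(usable_mb // per_frame_mb)))


def iter_batches(frames: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most batch_size frames."""
    for start in range(0, len(frames), batch_size):
        yield list(frames[start:start + batch_size])
```

A caller might compute the batch size once per video segment, for example `pick_batch_size(accelerator.get_available_vram_mb())` where `accelerator` is a HardwareAccelerator instance, and then feed each chunk from `iter_batches` to a single inference call. Because the method returns 0 on failure, the sketch degrades to one frame per batch instead of raising.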
--- a/backend/tools/hardware_accelerator.py
+++ b/backend/tools/hardware_accelerator.py
@@ -106,6 +106,27 @@ class HardwareAccelerator:
    def set_enabled(self, enable):
        self.__enabled = enable

+    def get_available_vram_mb(self):
+        """Return the available GPU VRAM in MB; return 0 when there is no GPU."""
+        if not self.__enabled:
+            return 0
+        if self.__cuda:
+            try:
+                free_vram = torch.cuda.mem_get_info()[0]  # (free, total)
+                return free_vram / (1024 * 1024)
+            except Exception:
+                return 0
+        if self.__mps:
+            try:
+                # MPS has no direct query interface; use system memory as a reference
+                import subprocess
+                result = subprocess.run(['sysctl', '-n', 'hw.memsize'], capture_output=True, text=True)
+                total_mem = int(result.stdout.strip()) / (1024 * 1024)
+                return total_mem * 0.5  # conservative estimate: assume half is available
+            except Exception:
+                return 0
+        return 0
+
    @property
    def device(self):
        """