Merged
11 changes: 11 additions & 0 deletions ms_agent/cli/run.py
@@ -77,6 +77,13 @@ def define_args(parsers: argparse.ArgumentParser):
            type=str,
            default=None,
            help='API key for accessing ModelScope api-inference services.')
        parser.add_argument(
            '--animation_mode',
            required=False,
            type=str,
            choices=['auto', 'human'],
            default=None,
            help='Animation mode for video_generate project: auto (default) or human.')
        parser.set_defaults(func=subparser_func)

    def execute(self):
@@ -91,6 +98,10 @@ def execute(self):
            self.args.trust_remote_code)  # noqa
        self.args.load_cache = strtobool(self.args.load_cache)

        # Propagate animation mode via environment variable for downstream code agents
        if getattr(self.args, 'animation_mode', None):
            os.environ['MS_ANIMATION_MODE'] = self.args.animation_mode
Comment on lines +102 to +103

medium

Passing configuration through the `MS_ANIMATION_MODE` environment variable creates an implicit dependency, which makes the code harder to trace, test, and maintain. A developer reading `video_agent.py` may not be able to tell where this environment variable is set. Consider passing the parameter explicitly down the call chain instead, for example through the `**kwargs` of `engine.run`.
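
To make the suggestion concrete, here is a minimal sketch of explicit passing with the environment variable kept only as a fallback. The names `resolve_animation_mode` and `run_engine` are hypothetical and are not part of ms-agent; the sketch assumes only that the engine entry point accepts `**kwargs`, as the comment implies.

```python
import os
from typing import Mapping, Optional

VALID_MODES = ("auto", "human")


def resolve_animation_mode(cli_value: Optional[str],
                           env: Mapping[str, str] = os.environ,
                           default: str = "auto") -> str:
    """Prefer the explicit value, then the environment, then the default."""
    mode = cli_value or env.get("MS_ANIMATION_MODE") or default
    if mode not in VALID_MODES:
        raise ValueError(f"unknown animation mode: {mode!r}")
    return mode


def run_engine(query: str, **kwargs) -> dict:
    """Hypothetical stand-in for an engine entry point that accepts **kwargs."""
    # The mode arrives as an explicit keyword argument, so a reader of the
    # downstream code can see where the value comes from.
    mode = resolve_animation_mode(kwargs.get("animation_mode"))
    return {"query": query, "animation_mode": mode}
```

Because the environment is an injectable mapping, the resolution order can be unit-tested without mutating `os.environ`.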


        config = Config.from_task(self.args.config)

        if Config.is_workflow(config):
96 changes: 96 additions & 0 deletions projects/video_generate/README.md
@@ -0,0 +1,96 @@
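# Video Generate

A workflow for producing short AI-generated science explainer videos. It supports a fully automatic mode and a human-in-the-loop mode, generates the script, voice-over, illustrations/animations, and subtitles, and composes them into a finished video.

## Quick checks (required reading)

Before the first run, complete the following checks:

1) Runtime environment
- Windows / Python 3.10+ (recommended)
- FFmpeg installed and on PATH (`ffmpeg -version` runs)
- Manim available (`manim -h` runs)

2) Python dependencies (if not yet installed)
- Dependencies are listed under the repository's requirements, or install as needed: moviepy, Pillow, edge-tts, matplotlib, etc.

3) Asset files (shipped with the repository)
- Custom font and background music: `projects/video_generate/core/asset/`
  - `bg_audio.mp3`
  - `字小魂扶摇手书(商用需授权).ttf`

4) Optional API key (commonly needed in automatic mode)
- MODELSCOPE_API_KEY: used for ModelScope model calls

Note: without a key you can still run the compose-only and human modes, but automatic mode may fail because no LLM is available.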
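
The checks above can be automated with a small preflight script. This is a sketch only: `check_environment` is a hypothetical helper that does not ship with the project, and it tests nothing beyond whether the executables are discoverable on PATH and whether the interpreter meets the recommended version.

```python
import shutil
import sys
from typing import Dict, Iterable, Tuple


def check_environment(tools: Iterable[str] = ("ffmpeg", "manim"),
                      min_python: Tuple[int, int] = (3, 10)) -> Dict[str, bool]:
    """Report whether each tool is on PATH and whether Python is recent enough."""
    report = {tool: shutil.which(tool) is not None for tool in tools}
    report["python"] = sys.version_info >= min_python
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

`shutil.which` performs the same lookup the shell would, so a missing entry here usually means PATH needs fixing rather than the tool needing a reinstall.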

## Option 1: fully automatic mode (auto)

Generate and compose a video from scratch for a given topic:

```powershell
# Optional: set the API key
$env:MODELSCOPE_API_KEY="your-ModelScope-key"
# Run the three-step workflow (script → assets → composition)
ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml" --query "topic" --animation_mode auto --trust_remote_code true
```

Output is written to `ms-agent/projects/video_generate/output/<topic>/`

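## Option 2: human mode (human)

Suited to workflows where a person needs control over the animation: the script, voice-over, illustrations, subtitles, and placeholder foregrounds are generated automatically; the foreground animations are then produced and approved segment by segment in the "human studio", followed by a one-step full composition.

1) Generate the assets first (Manim is not rendered automatically)
```powershell
ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml" --query "topic" --animation_mode human --trust_remote_code true
```

2) Open the human studio (pointing at the topic directory produced by the previous step)
```powershell
# Make sure the ms-agent package directory is on PYTHONPATH
$env:PYTHONPATH="local-project-dir\ms-agent"
# Launch the interactive studio as a module
python -m projects.video_generate.core.human_animation_studio "local-project-dir\ms-agent\projects\video_generate\output\topic"
```

In the studio:
- 1 View pending tasks → 2 Start producing an animation → generate/refine the Manim code → create a preview → approve the animation
- Once every segment is done, the system automatically merges the foregrounds and runs the full composition (background + subtitles + audio + foreground + music) to produce the finished video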

## Option 3: compose only (existing assets)

If the directory already contains `asset_info.json` (or you only want to re-compose):

```powershell
ms-agent run --config "ms-agent/projects/video_generate/workflow_from_assets.yaml" `
  --query "local-project-dir\ms-agent\projects\video_generate\output\<topic>\asset_info.json" `
  --animation_mode human `
  --trust_remote_code true
```

Comment on lines +36 to +71

medium

The example command paths and environment-variable settings in the README may confuse users.

  • The path `ms-agent/projects/...` in the command `ms-agent run --config "ms-agent/projects/video_generate/workflow.yaml"` looks odd. If the user runs from the repository root, the path should be `projects/video_generate/workflow.yaml`. Please state the working directory the commands assume and correct the paths accordingly.
  • `$env:PYTHONPATH="local-project-dir\ms-agent"` is PowerShell syntax. Although the document recommends Windows, providing the bash/zsh equivalent (`export PYTHONPATH="local-project-dir/ms-agent"`) would be friendlier to cross-platform users.
  • The `\` path separator in the commands is Windows-specific; using `/` throughout would improve cross-platform compatibility.

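This flow only performs composition; it does not regenerate the script, illustrations, or animations. If approved transparent foregrounds exist (`finals/scene_*_final.mov`), they take priority.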
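
The priority rule can be pictured with the sketch below. It is illustrative only: `pick_foreground` is a hypothetical helper, and it assumes that the `*` in `scene_*_final.mov` is the scene index and that `finals/` sits directly under the topic directory; the project's actual lookup may differ.

```python
from pathlib import Path
from typing import Optional


def pick_foreground(project_dir: Path, scene_index: int,
                    fallback: Optional[Path] = None) -> Optional[Path]:
    """Return the approved foreground for a scene if present, else the fallback."""
    # Assumed layout: <project_dir>/finals/scene_<index>_final.mov
    approved = Path(project_dir) / "finals" / f"scene_{scene_index}_final.mov"
    return approved if approved.is_file() else fallback
```

Building the path with `pathlib` also avoids hard-coding `\` or `/` as the separator.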

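## Directory layout
- `video_agent.py`: Agent wrapper for the three-step logic
- `workflow.yaml`: three-step orchestration; `workflow_from_assets.yaml`: compose-only orchestration
- `core/workflow.py`: main pipeline; `core/human_animation_studio.py`: human studio
- `core/asset/`: font and background music
- `output/`: run artifacts
- `scripts/compose_from_asset_info.py`: helper script that composes directly from an existing `asset_info.json`

## FAQ
- Exit code 1:
  - Check whether MODELSCOPE_API_KEY is missing (common in automatic mode)
  - Check that ffmpeg / manim are executable (PATH)
  - Inspect the last 80 lines of terminal output to locate the exception
- Font/background mismatch:
  - The background is generated by `create_manual_background`; the font and music come from `core/asset/`; make sure that directory is readable
- TTS / event-loop conflicts:
  - Loop-safe handling is built in; if errors persist, retry and share the tail of the log

## License and notes
- The custom font file is marked "商用需授权" (license required for commercial use); use it only within the scope of a valid license
- The background music is an example only; for commercial use, replace it or make sure the rights are cleared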
256 changes: 256 additions & 0 deletions projects/video_generate/core/animation_production_modes.py
@@ -0,0 +1,256 @@
from enum import Enum
from typing import Dict, List, Optional, Any
from dataclasses import dataclass
import json
import os

class AnimationProductionMode(Enum):
    """Animation production mode."""
    AUTO = "auto"  # fully automatic mode
    HUMAN_CONTROLLED = "human"  # human-controlled mode

class AnimationStatus(Enum):
    """Animation status."""
    PENDING = "pending"  # waiting to be produced
    DRAFT = "draft"  # draft stage
    PREVIEW = "preview"  # preview stage
    REVISION = "revision"  # under revision
    APPROVED = "approved"  # approved
    COMPLETED = "completed"  # production finished
    FAILED = "failed"  # production failed

@dataclass
class AnimationTask:
    """Data structure for an animation task."""
    task_id: str
    segment_index: int
    content: str
    content_type: str
    mode: AnimationProductionMode
    status: AnimationStatus

    # Production
    script: Optional[str] = None
    manim_code: Optional[str] = None
    preview_video_path: Optional[str] = None
    final_video_path: Optional[str] = None
    placeholder_path: Optional[str] = None

    # Human interaction
    human_feedback: List[str] = None
    revision_count: int = 0
    max_revisions: int = 5

    # Timing
    audio_duration: float = 8.0
    creation_time: Optional[str] = None
    completion_time: Optional[str] = None

    def __post_init__(self):
        if self.human_feedback is None:
            self.human_feedback = []

@dataclass
class PlaceholderConfig:
    """Placeholder configuration."""
    width: int = 1280
    height: int = 720
    background_color: str = "#f0f0f0"
    text_color: str = "#333333"
    font_size: int = 48
    placeholder_text: str = "动画制作中..."  # "Animation in progress..."
    show_content_preview: bool = True
    show_progress_indicator: bool = True

class AnimationTaskManager:
    """Animation task manager."""

    def __init__(self, project_dir):
        self.project_dir = project_dir
        self.tasks_file = os.path.join(project_dir, "animation_tasks.json")
        self.tasks: Dict[str, AnimationTask] = {}
        self.load_tasks()

    def create_task(self, segment_index, content, content_type, mode, audio_duration):
        """Create a new animation task; for a duplicate, return the existing ID."""
        import uuid
        from datetime import datetime

        # Check whether a task already exists for this segment
        existing_task = self.get_task_by_segment(segment_index, content_type)
        if existing_task:
            print(f"发现已存在的任务: {existing_task.task_id}")
            return existing_task.task_id

        task_id = f"anim_{segment_index}_{uuid.uuid4().hex[:8]}"

        task = AnimationTask(
            task_id=task_id,
            segment_index=segment_index,
            content=content,
            content_type=content_type,
            mode=mode,
            status=AnimationStatus.PENDING,
            audio_duration=audio_duration,
            creation_time=datetime.now().isoformat()
        )

        self.tasks[task_id] = task
        self.save_tasks()
        print(f"创建新任务: {task_id}")
        return task_id

    def get_task_by_segment(self, segment_index, content_type):
        """Find a task by segment index and content type."""
        for task in self.tasks.values():
            if task.segment_index == segment_index and task.content_type == content_type:
                return task
        return None

    def update_task_status(self, task_id, status):
        """Update a task's status."""
        if task_id in self.tasks:
            self.tasks[task_id].status = status
            self.save_tasks()

    def add_human_feedback(self, task_id, feedback):
        """Record human feedback."""
        if task_id in self.tasks:
            self.tasks[task_id].human_feedback.append(feedback)
            self.tasks[task_id].revision_count += 1
            self.save_tasks()

    def get_task(self, task_id):
        """Get a task by ID."""
        return self.tasks.get(task_id)

    def get_tasks_by_status(self, status):
        """List the tasks that have the given status."""
        return [task for task in self.tasks.values() if task.status == status]

    def save_tasks(self):
        """Save the tasks to the file."""
        import json
        from dataclasses import asdict

        tasks_data = {}
        for task_id, task in self.tasks.items():
            task_dict = asdict(task)
            # Convert enum members to their values
            task_dict['mode'] = task.mode.value
            task_dict['status'] = task.status.value
            tasks_data[task_id] = task_dict

        with open(self.tasks_file, 'w', encoding='utf-8') as f:
            json.dump(tasks_data, f, ensure_ascii=False, indent=2)

    def load_tasks(self):
        """Load the tasks from the file."""
        if not os.path.exists(self.tasks_file):
            return

        try:
            with open(self.tasks_file, 'r', encoding='utf-8') as f:
                tasks_data = json.load(f)

            for task_id, task_dict in tasks_data.items():
                # Restore enum members
                task_dict['mode'] = AnimationProductionMode(task_dict['mode'])
                task_dict['status'] = AnimationStatus(task_dict['status'])

                self.tasks[task_id] = AnimationTask(**task_dict)

        except Exception as e:
            print(f"加载任务文件失败: {e}")
Comment on lines +152 to +164

medium

In `load_tasks`, the `except Exception as e` clause is too broad. It catches every kind of exception and can mask underlying bugs, such as `json.JSONDecodeError` (corrupted file), `FileNotFoundError` (missing file), or `KeyError` (unexpected JSON structure). Catch more specific exception types and log clearer error messages to ease debugging.

Suggested change
-        try:
-            with open(self.tasks_file, 'r', encoding='utf-8') as f:
-                tasks_data = json.load(f)
-            for task_id, task_dict in tasks_data.items():
-                # Restore enum members
-                task_dict['mode'] = AnimationProductionMode(task_dict['mode'])
-                task_dict['status'] = AnimationStatus(task_dict['status'])
-                self.tasks[task_id] = AnimationTask(**task_dict)
-        except Exception as e:
-            print(f"加载任务文件失败: {e}")
+        try:
+            with open(self.tasks_file, 'r', encoding='utf-8') as f:
+                tasks_data = json.load(f)
+            for task_id, task_dict in tasks_data.items():
+                # Restore enum members
+                task_dict['mode'] = AnimationProductionMode(task_dict['mode'])
+                task_dict['status'] = AnimationStatus(task_dict['status'])
+                self.tasks[task_id] = AnimationTask(**task_dict)
+        except FileNotFoundError:
+            # A missing file is a normal case; no error needs to be printed
+            return
+        except (json.JSONDecodeError, KeyError) as e:
+            print(f"加载或解析任务文件失败 ({self.tasks_file}): {e}")
+        except Exception as e:
+            print(f"加载任务时发生未知错误: {e}")
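
As a reduced illustration of the failure modes involved, the sketch below round-trips a dataclass with an enum field through JSON and names each exception it expects. `Mode`, `Task`, `dump_task`, and `load_task` are hypothetical stand-ins for the PR's classes, not its actual code. Beyond the exceptions listed in the comment, it also handles `ValueError` (an unknown enum value) and `TypeError` (an unexpected field), which is an assumption about what a malformed record could trigger.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class Mode(Enum):
    AUTO = "auto"
    HUMAN = "human"


@dataclass
class Task:
    task_id: str
    mode: Mode


def dump_task(task: Task) -> str:
    """Serialize a task, replacing the enum member with its string value."""
    data = asdict(task)
    data["mode"] = task.mode.value  # Enum members are not JSON-serializable
    return json.dumps(data)


def load_task(text: str) -> Task:
    """Parse a task record, converting every expected failure to ValueError."""
    try:
        data = json.loads(text)            # json.JSONDecodeError on bad syntax
        data["mode"] = Mode(data["mode"])  # KeyError / ValueError on bad mode
        return Task(**data)                # TypeError on unexpected fields
    except (json.JSONDecodeError, KeyError, ValueError, TypeError) as exc:
        raise ValueError(f"invalid task record: {exc}") from exc
```

Listing the exceptions explicitly keeps genuinely unexpected errors visible instead of folding them into a generic log line.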


class PlaceholderGenerator:
    """Placeholder generation utility."""

    def __init__(self, config = None):
        self.config = config or PlaceholderConfig()

    def create_placeholder(self, task, output_path):
        """Create a placeholder video."""
        from PIL import Image, ImageDraw, ImageFont
        import tempfile
        import subprocess

        # Create the placeholder image
        img = Image.new('RGB', (self.config.width, self.config.height),
                        self.config.background_color)
        draw = ImageDraw.Draw(img)

        # Add the placeholder text
        try:
            # Try the custom font first
            font_path = os.path.join(os.path.dirname(__file__), 'asset', '字魂龙吟手书(商用需授权).ttf')

medium

The font file name used here, `字魂龙吟手书(商用需授权).ttf`, does not match `字小魂扶摇手书(商用需授权).ttf`, which is the name used in `README.md` and `background_image.py`. Unify the file name to avoid asset-loading failures.

Suggested change
-            font_path = os.path.join(os.path.dirname(__file__), 'asset', '字魂龙吟手书(商用需授权).ttf')
+            font_path = os.path.join(os.path.dirname(__file__), 'asset', '字小魂扶摇手书(商用需授权).ttf')

            if os.path.exists(font_path):
                font = ImageFont.truetype(font_path, self.config.font_size)
            else:
                font = ImageFont.load_default()
        except:
            font = ImageFont.load_default()

        # Main title
        title = self.config.placeholder_text
        title_bbox = draw.textbbox((0, 0), title, font=font)
        title_width = title_bbox[2] - title_bbox[0]
        title_height = title_bbox[3] - title_bbox[1]
        title_x = (self.config.width - title_width) // 2
        title_y = self.config.height // 3

        draw.text((title_x, title_y), title, fill=self.config.text_color, font=font)

        # Content preview
        if self.config.show_content_preview and task.content:
            content_preview = task.content[:50] + "..." if len(task.content) > 50 else task.content
            try:
                content_font = ImageFont.truetype(font_path, self.config.font_size // 2) if os.path.exists(font_path) else ImageFont.load_default()
            except:
                content_font = ImageFont.load_default()

            content_bbox = draw.textbbox((0, 0), content_preview, font=content_font)
            content_width = content_bbox[2] - content_bbox[0]
            content_x = (self.config.width - content_width) // 2
            content_y = title_y + title_height + 50

            draw.text((content_x, content_y), content_preview,
                      fill=self.config.text_color, font=content_font)

        # Progress indicator
        if self.config.show_progress_indicator:
            status_text = f"状态: {task.status.value} | 类型: {task.content_type}"
            try:
                status_font = ImageFont.truetype(font_path, self.config.font_size // 3) if os.path.exists(font_path) else ImageFont.load_default()
            except:
                status_font = ImageFont.load_default()

            status_bbox = draw.textbbox((0, 0), status_text, font=status_font)
            status_width = status_bbox[2] - status_bbox[0]
            status_x = (self.config.width - status_width) // 2
            status_y = self.config.height - 100

            draw.text((status_x, status_y), status_text,
                      fill=self.config.text_color, font=status_font)

        # Save the placeholder image
        temp_img_path = output_path.replace('.mov', '_placeholder.png')
        img.save(temp_img_path)

        # Convert the image to a video
        try:
            cmd = [
                'ffmpeg', '-y',
                '-f', 'image2', '-loop', '1',
                '-i', temp_img_path,
                '-t', str(task.audio_duration),
                '-pix_fmt', 'yuv420p',
                '-r', '15',
                output_path
            ]
            subprocess.run(cmd, check=True, capture_output=True)
            os.remove(temp_img_path)  # clean up the temporary file
            return output_path
        except Exception as e:
            print(f"创建占位符视频失败: {e}")
            return None
Comment on lines +254 to +256

medium

When the ffmpeg process fails, `subprocess.run` raises `CalledProcessError`. The current handler does catch the error, but it does not print ffmpeg's own error output (stderr), which makes debugging harder. Print the contents of `e.stderr` when catching the exception.

Suggested change
-        except Exception as e:
-            print(f"创建占位符视频失败: {e}")
-            return None
+        except subprocess.CalledProcessError as e:
+            print(f"创建占位符视频失败: {e}")
+            if e.stderr:
+                print(f"FFmpeg stderr: {e.stderr.decode('utf-8', errors='ignore')}")
+            return None
+        except Exception as e:
+            print(f"创建占位符视频失败: {e}")
+            return None
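
The same pattern can be tried without ffmpeg installed. In the sketch below, `run_and_report` is a hypothetical helper rather than code from the PR; it passes `text=True` so that `stderr` arrives already decoded, and it also handles the case where the executable is missing altogether, which raises `FileNotFoundError` instead of `CalledProcessError`.

```python
import subprocess
import sys
from typing import List, Tuple


def run_and_report(cmd: List[str]) -> Tuple[bool, str]:
    """Run a command; return (success, captured stderr or error description)."""
    try:
        subprocess.run(cmd, check=True, capture_output=True, text=True)
        return True, ""
    except subprocess.CalledProcessError as exc:
        # The process started but exited non-zero; stderr explains why.
        return False, exc.stderr or ""
    except FileNotFoundError as exc:
        # The executable itself could not be found (for example, not on PATH).
        return False, str(exc)


if __name__ == "__main__":
    # A child Python process that fails on purpose stands in for ffmpeg.
    ok, err = run_and_report([sys.executable, "-c", "import sys; sys.exit('boom')"])
    print(ok, err.strip())
```

Returning the captured text instead of printing it lets the caller decide whether to log it, attach it to the task record, or show it to the user.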

Binary file added projects/video_generate/core/asset/bg_audio.mp3
Binary file not shown.
Binary file not shown.