MeiGen-AI/X-Cut

X-CUT

X-CUT: Chat-Driven Video Editing Agent with Real-Time Rendering

Python 3.12+ · React 18 · TypeScript · FastAPI · Remotion · Agno · License

Features · Demo · Get Started · Skill System · Roadmap

🇨🇳 中文 · 🇬🇧 English


📖 Overview

X-CUT is an AI video editing Agent. Describe your idea in the chat interface; the Agent interprets your intent, asks clarifying questions when needed, dynamically selects and combines skills, and edits a multi-track timeline. Every edit is rendered onto the tracks in real time via Remotion, so each step is instantly visible. No timeline dragging. No keyframe tweaking. Just chat with X-CUT.

"Turn my travel footage into a Vlog with chill background music and subtitles": from that single sentence, the Agent handles asset analysis, script generation, visual arrangement, music, dubbing, MG animation, and final rendering.


✨ Key Features

  • 🎬 Intelligent Asset Analysis & Script Generation: Automatically analyzes uploaded video/image assets β€” shot scale analysis, camera movement analysis, and content understanding. Based on user intent and asset content, the Agent generates a script preview.
  • 🎨 MG Animation Generation: Built-in LLM-powered MG animation code generator that leverages Remotion skills to turn natural-language instructions into Remotion JSX components in real time. Supports adding transitions, opening/ending titles, and any natural-language-described animation rendered to any position in the video, with both create and modify modes. Just describe what you want: "add a travel guide info card on the right side", "make the title more playful" β€” the Agent generates, transpiles, and renders it instantly.
  • 🎡 Smart Music, Dubbing & Subtitles: Automatically generates background music matching the video mood based on asset analysis, synthesizes AI voiceover from script narration (with selectable voice styles and voice cloning), and renders timeline-aligned subtitles β€” all orchestrated by the Agent through a single conversation.
  • πŸ’¬ Chat-Based Editing: Edit anything through conversation β€” add, remove, or modify dubbing, animations, subtitles, voice styles, and more. Rendered in real time via Remotion Player β€” what you see is what you export.
  • πŸ“Œ Reference-Based Modification: The rendered result on the timeline is composed of original assets layered with generated dubbing, music, and animations. Any individual element can be referenced and modified without affecting the rest β€” enabling precise, surgical edits.
  • πŸ–±οΈ Drag-and-Drop Timeline: Not satisfied with the AI-generated edit? Directly drag, reorder, delete, or add clips on the real-time rendered timeline.
  • ⚑ Editing Style Preservation & Sharing: Save any project's editing recipe as a reusable Skill β€” capturing structure, pacing, music style, dubbing preferences, and MG animation patterns. Next time, just load new assets and apply the Skill to instantly replicate the same style, enabling efficient batch production. Also supports sharing via .md files.
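
The reference-based modification idea above can be pictured with a small data model. The sketch below is illustrative only: the `Element` and `Timeline` classes, their fields, and the element ids are assumptions made for this example and are not X-CUT's actual schema. It shows one way that a single layered element can be addressed by id and changed while every other element stays untouched.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Element:
    """One item on a track (hypothetical shape, not X-CUT's schema)."""
    id: str
    kind: str          # e.g. "video", "dubbing", "music", "mg"
    start: float       # seconds from the start of the timeline
    duration: float    # seconds
    props: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Timeline:
    elements: tuple[Element, ...] = ()

    def get(self, element_id: str) -> Element:
        for el in self.elements:
            if el.id == element_id:
                return el
        raise KeyError(element_id)

    def modify(self, element_id: str, **changes) -> "Timeline":
        """Return a new timeline in which only the referenced element changed."""
        self.get(element_id)  # raises KeyError for an unknown reference
        return Timeline(tuple(
            replace(el, **changes) if el.id == element_id else el
            for el in self.elements
        ))


# Original asset layered with generated dubbing and an MG animation.
timeline = Timeline((
    Element("clip-1", "video", 0.0, 8.0),
    Element("dub-1", "dubbing", 0.5, 6.0, {"voice": "calm"}),
    Element("mg-1", "mg", 2.0, 3.0, {"background": "white"}),
))

# "Change the background to transparent" touches only the MG element.
edited = timeline.modify("mg-1", props={"background": "transparent"})
```

Returning a new timeline instead of mutating in place is a design choice of this sketch; it keeps the previous state available, which would make undo straightforward.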

🎬 Demo

Interface Overview

  • Overview 1: X-CUT Entry
  • Overview 2: Chat + Canvas
  • Overview 3: Chat + Editor

Showcase

To meet GitHub's video upload size limit, all videos below have been heavily compressed. X-CUT itself can export at various resolutions, so final output quality is not limited to what these clips show.

Vlog Creation

Assets: 5 travel video clips
Prompt: "Help me make a beach travel vlog"

Screen Recording

vlog-creator.mp4

Result

vlog-res.mp4

Marketing Video Creation

Assets: 9 product images (screenshots from Xiaomi's official website)
Prompt: "Make a promotional video for the Xiaomi Yu7."

Screen Recording

marketing-creator.mp4

Result

marketing-res.mp4

Free Edit

Assets: 1 video clip
Prompt: "Generate a Hainan travel guide in the right third of the screen, mock some data, make it detailed."
Later: "Change the background to transparent."

Screen Recording

free-edit.mp4

Result

free-edit-res.mp4

Assets: Multiple video clips
Prompt: "Add transitions to these materials"
Later: "1. Change to push-pull effect; 2. Add a forest VLOG title with Chinese-English layout, English as subtitle, plus some geometric decorative elements, transparent background."

Screen Recording

free-edit2.mp4

Result

free-edit-res2.mp4

Style Sharing and Replication

Share your editing recipe as a reusable style via the "Share" button: export it as an .md file or save it to your space. Apply any style to a new project to generate videos in the same style.

Screen Recording

style-replication.mp4

🧩 Skill System

Skills are the backbone of X-CUT's Agent, organized by category and easy to extend. The Agent dynamically selects and combines skills based on user intent, enabling flexible and composable editing without rigid pipelines.

.xcut_skills/system/
├── entry/           → the single top-level skill loaded by the Agent
├── scenes/          → vlog-generation, marketing-generation, free-edit,
├── operations/      → audio-edit, dubbing-edit, mg-edit, opening-ending-edit,
│                      track-delete, transition-edit
├── styles/          → marketing-conversion, vlog-natural (shareable editing styles)
├── templates/       → marketing-hook-sell-cta, vlog-story
├── tools/           → analyze-assets, build-script, generate-visuals,
│                      generate-music, add-dubbing, mg-animation,
│                      resolve-voice-id, apply-timeline-strategy, references
└── prompt-specs/    → script-from-style

How Agent + Skills Work Together

  1. Entry Skill: The Agent always loads the entry skill as its single top-level capability.
  2. Scene Selection: Based on user intent, the Agent selects the appropriate scene skill (vlog, marketing, etc.).
  3. Skill Staging: RuntimeSkillBuilder copies the entry skill into a task-scoped directory, then stages selected scene/user/shared skills as references.
  4. Flexible Execution: The Agent reads the staged skill references and orchestrates tools accordingly: no rigid pipeline, just skill-driven intelligence.
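
The staging step can be sketched as follows. This is a minimal illustration under stated assumptions, not the real RuntimeSkillBuilder: the `stage_skills` function, the `SKILL.md` file name, the `references/` subdirectory, and the `tasks/task-001` path are all hypothetical. Only the `.xcut_skills/system/` layout and the skill names come from the tree above.

```python
import shutil
import tempfile
from pathlib import Path


def stage_skills(system_root: Path, task_dir: Path, scene: str) -> Path:
    """Copy the entry skill into task_dir and stage one scene skill as a reference.

    Hypothetical helper: the layout created under task_dir is an assumption.
    """
    staged = task_dir / "entry"
    shutil.copytree(system_root / "entry", staged)
    refs = staged / "references"
    refs.mkdir(exist_ok=True)
    shutil.copytree(system_root / "scenes" / scene, refs / scene)
    return staged


# Build a throwaway skill tree so the sketch runs on its own.
workspace = Path(tempfile.mkdtemp())
system_root = workspace / ".xcut_skills" / "system"
for rel in ("entry", "scenes/vlog-generation", "scenes/free-edit"):
    skill_dir = system_root / rel
    skill_dir.mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text(f"# {skill_dir.name}\n", encoding="utf-8")

# Stage only the scene that matches the user's intent.
staged = stage_skills(system_root, workspace / "tasks" / "task-001", "vlog-generation")
staged_files = sorted(p.relative_to(staged).as_posix() for p in staged.rglob("*.md"))
print(staged_files)
```

Copying into a task-scoped directory means each task sees only the entry skill plus the references selected for it, so unrelated scene skills (here, free-edit) never reach the Agent's context.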

Style Sharing & Import

Styles are a special kind of skill that captures an editing recipe: structure, pacing, music genre, dubbing preferences, and more. Users can:

  • Browse the Style Gallery on the home page to discover community styles
  • Preview a style's full configuration before applying
  • Apply any style to instantly start a new project with that creative direction
  • Share your project's editing recipe to the community as a new style skill
  • Import styles from shared .md files
  • Download any style as a portable .md file
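
To make the import/download round trip concrete, here is a small sketch of serializing a style to Markdown and reading it back. The on-disk format of X-CUT style files is not documented here, so everything below is an assumption: the `Style` fields, the "heading plus `- key: value` lines" layout, and the sample values are invented for illustration. Only the style name vlog-natural appears in the skill tree above.

```python
from dataclasses import dataclass, asdict


@dataclass
class Style:
    """Hypothetical editing recipe; field names are illustrative."""
    name: str
    structure: str
    pacing: str
    music_genre: str
    dubbing: str


def style_to_md(style: Style) -> str:
    """Render the style as a heading followed by '- key: value' lines."""
    lines = [f"# {style.name}", ""]
    for key, value in asdict(style).items():
        if key != "name":
            lines.append(f"- {key}: {value}")
    return "\n".join(lines) + "\n"


def style_from_md(text: str) -> Style:
    """Parse the layout produced by style_to_md back into a Style."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("# "):
            fields["name"] = line[2:].strip()
        elif line.startswith("- ") and ":" in line:
            key, value = line[2:].split(":", 1)
            fields[key.strip()] = value.strip()
    return Style(**fields)


# Sample values are made up for the example.
style = Style(
    name="vlog-natural",
    structure="hook, journey, outro",
    pacing="relaxed",
    music_genre="lo-fi",
    dubbing="warm narration",
)
exported = style_to_md(style)
imported = style_from_md(exported)
```

A plain-text format like this keeps styles readable and diffable, which fits the idea of sharing them as portable .md files.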

🚀 Get Started

We are preparing the open-source model/API integration guide; the code is coming soon.


🗺 Roadmap

Status Milestone
🔜 Web code and deployment guide release
🔜 Standalone Skills release, pluggable into Claude, OpenClaw, and more
🔜 CLI version release
🔜 Long video decomposition & editing
🔜 Talking-head / narration video editing
🔜 One more thing ...

🀝 Contributing

Members


📄 License

This project is licensed under the GNU Affero General Public License v3.0; see the LICENSE file for details.


X-CUT: Talk to the Agent. Enjoy your cut.

Report Bug · Request Feature · Discussions
