ViMax
by
HKUDS

Description: "ViMax: Agentic Video Generation (Director, Screenwriter, Producer, and Video Generator All-in-One)"


Summary Information

Updated 36 minutes ago
Added to GitGenius on May 31st, 2026
Created on March 30th, 2025
Open Issues & Pull Requests: 33 (+0)
Number of forks: 1,325
Total Stargazers: 8,634 (+1)
Total Subscribers: 67 (+0)

Issue Activity (beta)

Open issues: 26
New in 7 days: 0
Closed in 7 days: 2
Avg open age: 149 days
Stale 30+ days: 19
Stale 90+ days: 19

Recent activity

Opened in 7 days: 0
Closed in 7 days: 0
Comments in 7 days: 0
Events in 7 days: 0

Detailed Description

ViMax is an agentic video generation framework designed to automate video production from start to finish. Unlike traditional AI video tools, which are limited to generating short, visually inconsistent clips, ViMax acts as director, screenwriter, producer, and video generator all in one. Its core purpose is to let users turn raw ideas, scripts, or even entire novels into high-quality, coherent videos with minimal manual intervention.

The system addresses several key challenges in AI video generation, such as maintaining character and scene consistency across frames, integrating narrative structure and storytelling depth, and producing longer, more complex videos. ViMax achieves this by orchestrating a multi-agent workflow that automates scriptwriting, storyboarding, character creation, and final video synthesis. Users can simply input their creative concept, and ViMax handles the rest, ensuring a seamless transition from idea to finished video.

ViMax offers several standout features:

1. Idea2Video: This feature allows users to input a simple idea, which is then expanded into a complete video story through intelligent automation of storytelling, character design, and production.
2. Novel2Video: ViMax can adapt entire novels into episodic video content, using narrative compression, character tracking, and scene-by-scene visual adaptation to retain the essence of the original story.
3. Script2Video: Users can provide any screenplay, from personal anecdotes to epic adventures, and ViMax will generate a corresponding video, giving creators full control over their visual storytelling.
4. AutoCameo: This interactive feature enables users to upload a photo of themselves or a pet, which ViMax then integrates as a consistent character throughout the video, allowing for personalized cameo appearances in various creative scripts and storylines.

The architecture of ViMax is built around a multi-agent pipeline with the following stages:

1. Input processing: ideas, scripts, reference images, and style directives.
2. Central orchestration: agent scheduling and resource management.
3. Script understanding: character and environment extraction.
4. Scene and shot planning: storyboarding and shot lists.
5. Visual asset planning: reference image selection and style guidance.
6. Asset indexing: cataloging frames and references.
7. Consistency and continuity checks: character and environment tracking.
8. Visual synthesis and assembly: image generation and video editing.

The final output includes frames, video clips, and logs for further review.
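The staged pipeline described above can be illustrated with a minimal Python sketch. It is not ViMax's actual code: `PipelineState`, `extract_characters`, `plan_shots`, `check_consistency`, and `run_pipeline` are hypothetical names, and each stage is a toy stand-in for what would really be a model-backed agent. The sketch only shows the general idea of passing shared state through an ordered list of stages and collecting a log for later review.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PipelineState:
    """Shared state handed from one stage to the next (hypothetical structure)."""
    script: str
    characters: list[str] = field(default_factory=list)
    shots: list[str] = field(default_factory=list)
    log: list[str] = field(default_factory=list)


Stage = Callable[[PipelineState], PipelineState]


def extract_characters(state: PipelineState) -> PipelineState:
    # Toy stand-in for script understanding: treat "NAME: ..." lines as dialogue.
    names: list[str] = []
    for line in state.script.splitlines():
        speaker, sep, _ = line.partition(":")
        speaker = speaker.strip()
        if sep and speaker and speaker not in names:
            names.append(speaker)
    state.characters = names
    return state


def plan_shots(state: PipelineState) -> PipelineState:
    # Toy stand-in for shot planning: one shot per non-empty script line.
    lines = [line.strip() for line in state.script.splitlines() if line.strip()]
    state.shots = [f"shot {i}: {line}" for i, line in enumerate(lines, start=1)]
    return state


def check_consistency(state: PipelineState) -> PipelineState:
    # Toy continuity check: flag shots that mention no known character.
    for shot in state.shots:
        if not any(name in shot for name in state.characters):
            state.log.append(f"no known character in {shot!r}")
    return state


def run_pipeline(state: PipelineState, stages: list[Stage]) -> PipelineState:
    # Central orchestration reduced to its simplest form: run stages in order.
    for stage in stages:
        state = stage(state)
        state.log.append(f"finished {stage.__name__}")
    return state
```

A caller would build a `PipelineState` from a script and pass it to `run_pipeline` with the stages in order. A real orchestrator would also schedule agents and manage resources, which this sketch leaves out.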

ViMax is designed for efficiency and scalability, featuring intelligent long script generation using retrieval-augmented generation (RAG), expressive storyboard design, multi-camera filming simulation, automated reference image selection, and high-efficiency parallel shot generation. It also includes automated consistency checks using multi-modal large language models (MLLM/VLM) to ensure professional quality and coherence across all frames.
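Parallel shot generation of the kind mentioned above can be sketched with Python's standard `concurrent.futures` module. This is an assumption about the general technique, not a description of how ViMax implements it: `render_shot` is a hypothetical placeholder for a slow call to a video generation backend, and the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def render_shot(shot_description: str) -> str:
    # Hypothetical placeholder for a slow, I/O-bound call to a video backend.
    return f"clip<{shot_description}>"


def render_shots_in_parallel(shots: list[str], max_workers: int = 4) -> list[str]:
    # executor.map returns results in input order, so the clips stay in
    # storyboard order even when individual shots finish out of order.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(render_shot, shots))
```

Threads suit this case because each shot mostly waits on a remote API. Shots that depend on frames from an earlier shot would need to be ordered rather than run independently.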

The repository provides clear instructions for setup and usage, supporting both Linux and Windows environments. It leverages the 'uv' environment manager and supports integration with various chat and image generation models, including Google Gemini and MiniMax, via configurable API keys. Users can quickly get started by cloning the repository, syncing dependencies, and configuring their preferred models.
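The text above does not say how ViMax stores its API keys, so the following is only a sketch of one common approach: reading a per-provider key from the environment and failing early if it is missing. The `ModelConfig` class, the `load_model_config` function, and the `<PROVIDER>_API_KEY` naming convention are all hypothetical and are not taken from the repository.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ModelConfig:
    """Provider name and key for one model backend (hypothetical structure)."""
    provider: str
    api_key: str


def load_model_config(
    provider: str, env: Optional[Mapping[str, str]] = None
) -> ModelConfig:
    # Default to the process environment; accept a mapping to ease testing.
    env = os.environ if env is None else env
    var_name = f"{provider.upper()}_API_KEY"  # hypothetical naming convention
    api_key = env.get(var_name, "").strip()
    if not api_key:
        raise KeyError(f"set {var_name} before selecting provider {provider!r}")
    return ModelConfig(provider=provider, api_key=api_key)
```

For the actual configuration file names and keys, follow the repository's own setup instructions.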

In summary, ViMax represents a significant advancement in AI-driven video creation, offering a unified, end-to-end platform that democratizes professional-quality video production for creators, storytellers, and developers alike.
