Pixelle-Video is an AI-powered, fully automated short-video generation engine designed to make video creation accessible to users with no technical or editing experience. The user supplies a topic or theme, and Pixelle-Video handles the rest of the workflow: script writing, image and video generation, voice synthesis, background music, and final video assembly. The pipeline is modular, so each stage can be customized, from the choice of AI models to the visual style and audio engine.
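To make the staged workflow concrete, here is a minimal Python sketch of a topic-to-video pipeline in which every stage is a swappable function. It is an illustration only, not Pixelle-Video's actual code: the names Segment, Pipeline, write_script, make_image, make_audio, and assemble are all hypothetical, and the stand-in stages return placeholder strings instead of calling real models.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    """One narrated sentence plus the assets generated for it (hypothetical type)."""
    text: str
    image: str = ""
    audio: str = ""


@dataclass
class Pipeline:
    """Hypothetical pipeline: each stage is a plain callable and can be swapped."""
    write_script: Callable[[str], List[str]]
    make_image: Callable[[str], str]
    make_audio: Callable[[str], str]
    assemble: Callable[[List[Segment], str], str]

    def run(self, topic: str, bgm: str = "") -> str:
        # 1. Script: turn the topic into a list of sentences.
        segments = [Segment(text=s) for s in self.write_script(topic)]
        # 2. Visuals and narration: one image and one audio clip per sentence.
        for seg in segments:
            seg.image = self.make_image(seg.text)
            seg.audio = self.make_audio(seg.text)
        # 3. Assembly: combine segments and background music into a video.
        return self.assemble(segments, bgm)


# Stand-in stages (assumptions); real ones would call an LLM, an image
# generation service, a TTS engine, and a video assembler.
def fake_script(topic: str) -> List[str]:
    return [f"{topic}: part {i}" for i in (1, 2)]


def fake_image(text: str) -> str:
    return f"img({text})"


def fake_audio(text: str) -> str:
    return f"wav({text})"


def fake_assemble(segments: List[Segment], bgm: str) -> str:
    return f"video[{len(segments)} segments, bgm={bgm or 'none'}]"


pipeline = Pipeline(fake_script, fake_image, fake_audio, fake_assemble)
result = pipeline.run("Why sleep matters", bgm="default.mp3")
print(result)
```

Because the stages are passed in as arguments, replacing one (for example, a different TTS engine) does not require touching the others, which is the kind of per-stage customization described above.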
The repository provides a web interface that can be launched from a one-click Windows package or installed from source on macOS and Linux. The Windows package bundles all dependencies, so no manual installation of Python, uv, or ffmpeg is required. Once launched, the web interface guides users through system configuration, where they choose a large language model (LLM) for script generation and an image generation service such as ComfyUI or RunningHub.
Pixelle-Video supports a wide range of AI models for text and image generation; the supported language models and providers include GPT, Tongyi Qianwen, DeepSeek, and Ollama, among others. For voice synthesis, it integrates with multiple text-to-speech (TTS) engines, such as Edge-TTS and Index-TTS, and supports voice cloning from uploaded reference audio files. Background music can be selected from built-in options or uploaded by the user.
The video creation process is flexible. Users can let the AI generate a script from a topic or supply their own script. Each sentence or segment is paired with an AI-generated illustration or video clip, and the narration is synthesized with the selected TTS workflow. Templates control the layout and design, and users can adjust image dimensions and prompt prefixes to steer the artistic output. Both vertical and horizontal video formats are supported, to suit different platforms and content types.
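The pairing of script sentences with image prompts can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: the function names split_script and build_prompts, the pixel sizes in SIZES, and the way the prefix is joined to each sentence are all hypothetical, and the sentence splitter handles only simple English punctuation.

```python
import re
from typing import Dict, List, Tuple

# Assumed example sizes (width, height); the real project may use others.
SIZES: Dict[str, Tuple[int, int]] = {
    "vertical": (1080, 1920),
    "horizontal": (1920, 1080),
}


def split_script(script: str) -> List[str]:
    """Split a script into sentences at whitespace that follows ., ! or ?."""
    parts = re.split(r"(?<=[.!?])\s+", script.strip())
    return [p for p in parts if p]


def build_prompts(script: str, prefix: str) -> List[str]:
    """Prepend the style prefix to every sentence to form one prompt per segment."""
    return [f"{prefix}, {sentence}" for sentence in split_script(script)]


prompts = build_prompts(
    "The sun rises. Birds begin to sing!",
    prefix="watercolor illustration",
)
width, height = SIZES["vertical"]

for p in prompts:
    print(f"{width}x{height}: {p}")
```

Keeping the prefix separate from the sentence text means the same script can be re-rendered in a different style by changing one string.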
Recent updates have added action transfer (users upload reference videos and images for motion transfer), digital avatar narration, multi-language TTS support, and improved concurrency for cloud-based image generation. Users can also upload their own photos and videos for AI analysis and script generation, and batch processing allows multiple videos to be created simultaneously. The modular architecture, based on ComfyUI, lets users combine atomic capabilities and swap out models or workflows as needed.
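One common way to run a batch of generation jobs while capping how many hit a cloud service at once is a semaphore. The sketch below uses Python's asyncio for this; it is an assumption about how such a limit could be implemented, not a description of Pixelle-Video's internals. The names generate_video and run_batch are hypothetical, and the job body is a short sleep standing in for real work.

```python
import asyncio
from typing import Dict, List

# Tracks how many jobs run at once so the limit can be observed.
stats: Dict[str, int] = {"active": 0, "peak": 0}


async def generate_video(topic: str) -> str:
    """Hypothetical stand-in for one video generation job."""
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)  # placeholder for the real generation work
    stats["active"] -= 1
    return f"{topic}.mp4"


async def run_batch(topics: List[str], max_concurrent: int = 2) -> List[str]:
    """Run all jobs, allowing at most max_concurrent to execute at a time."""
    sem = asyncio.Semaphore(max_concurrent)

    async def worker(topic: str) -> str:
        async with sem:  # wait here if the limit is already reached
            return await generate_video(topic)

    # gather returns results in the same order as the input topics.
    return await asyncio.gather(*(worker(t) for t in topics))


outputs = asyncio.run(run_batch(["cats", "dogs", "birds"], max_concurrent=2))
print(outputs, "peak concurrency:", stats["peak"])
```

Raising max_concurrent trades higher throughput for more simultaneous load on the image generation backend, so a sensible value depends on the service's rate limits.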
Pixelle-Video is free to use, with recommended configurations for both local and cloud-based operation. The project is inspired by several other open-source tools and maintains an active community for support and feedback. It is released under the Apache 2.0 license, which permits use, modification, and redistribution. Overall, Pixelle-Video is a comprehensive, accessible, and highly customizable tool for automated short-video creation that lets users produce content with minimal effort.