Lightricks Releases LTX-2.3: Open-Source 4K Video Model With Synchronized Audio
Lightricks has released LTX-2.3, an open-source audio-video foundation model built on a Diffusion Transformer (DiT) architecture with 22 billion parameters. The model generates 4K video clips at up to 50 fps with synchronized audio in a single pass, with no separate audio pipeline required.
What it can do
LTX-2.3 supports video generation from text prompts or image inputs, producing clips up to 20 seconds long at resolutions of 1080p, 1440p, or 2160p (4K). It natively handles both 16:9 widescreen and 9:16 vertical formats. The updated VAE delivers sharper texture and edge detail than its predecessor LTX-2, while a 4x larger text connector improves prompt adherence for complex descriptions.
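The published limits above (resolutions, aspect ratios, 50 fps, 20-second clips) can be captured in a small request validator. This is a hypothetical helper sketch, not part of the LTX-2.3 codebase; only the numeric limits come from the release notes.

```python
from dataclasses import dataclass

# Limits stated in the release notes. The ClipRequest class itself
# is an illustrative sketch, not an LTX-2.3 API.
SUPPORTED_HEIGHTS = {1080, 1440, 2160}   # 1080p, 1440p, 4K
SUPPORTED_ASPECTS = {"16:9", "9:16"}
MAX_FPS = 50
MAX_SECONDS = 20

@dataclass(frozen=True)
class ClipRequest:
    height: int
    aspect: str
    fps: int
    seconds: float

    def validate(self) -> None:
        if self.height not in SUPPORTED_HEIGHTS:
            raise ValueError(f"unsupported resolution: {self.height}p")
        if self.aspect not in SUPPORTED_ASPECTS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect}")
        if not 0 < self.fps <= MAX_FPS:
            raise ValueError(f"fps must be in (0, {MAX_FPS}]")
        if not 0 < self.seconds <= MAX_SECONDS:
            raise ValueError(f"clip length must be in (0, {MAX_SECONDS}] s")

    @property
    def frame_count(self) -> int:
        return round(self.fps * self.seconds)

req = ClipRequest(height=2160, aspect="16:9", fps=50, seconds=20)
req.validate()
print(req.frame_count)  # a maximal clip is 1000 frames
```

A maximal request (4K, 50 fps, 20 s) works out to 1000 frames per clip, which is why the VAE's compression matters so much for local inference.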
The model runs entirely on local hardware and is confirmed working on NVIDIA RTX 30/40/50-series GPUs with as little as 8 GB of VRAM. Code is released under Apache 2.0 on GitHub, and model weights are available on Hugging Face.
Why open-source
Lightricks CEO Zeev Farbman argues that closed APIs lock creative studios into dependency on third-party pricing and model updates, a problem his company intends to undercut. "Google and OpenAI want to control your entire pipeline," he wrote. "We put the weights on Hugging Face so you can build your own."
The repo includes inference code, training utilities, and a LoRA trainer, positioning LTX-2.3 as infrastructure for production pipelines rather than a standalone demo tool.
Traction
LTX-2 was already the most downloaded open-source video model before LTX-2.3 shipped. Farbman's announcement post has accumulated over 450 likes and 85 reposts from developers and creators testing the new release.
A free desktop application is available at ltx.io for anyone who wants to run the model locally without setting up the Python environment.