New “Stable Video Diffusion” AI model can animate any still image

November 28, 2023

On Tuesday, Stability AI released Stable Video Diffusion, a new free AI research tool that can turn any still image into a short video, with mixed results. It’s an open-weights preview of two AI models that use a technique called image-to-video, and it can run locally on a machine with an Nvidia GPU.

Last year, Stability AI made waves with the release of Stable Diffusion, an “open weights” image synthesis model that kick-started a surge of open image synthesis and inspired a large community of hobbyists who have built on the technology with their own custom fine-tunings. Now Stability wants to do the same with AI video synthesis, although the tech is still in its infancy.

Right now, Stable Video Diffusion consists of two models: one that produces image-to-video synthesis 14 frames long (called “SVD”), and another that generates 25 frames (called “SVD-XT”). They can operate at varying speeds from 3 to 30 frames per second, and they output short (typically 2- to 4-second) MP4 video clips at 576×1024 resolution.

In our local testing, a 14-frame generation took about 30 minutes to create on an Nvidia RTX 3060 graphics card, but users can experiment with running the models much faster on the cloud through services like Hugging Face and Replicate (some of which you may need to pay for).

In our experiments, the generated animation typically keeps a portion of the scene static and adds panning and zooming effects or animates smoke or fire. People depicted in photos often do not move, although we did get one Getty image of Steve Wozniak to slightly come to life. (Note: Other than the Steve Wozniak Getty Images photo, the images animated in this article were generated with DALL-E 3 and animated using Stable Video Diffusion.)

Given these limitations, Stability emphasizes that the model is still early and is intended for research only.
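As a rough check on the clip lengths quoted above, playback duration is simply frame count divided by frame rate. The sketch below uses the frame counts from the article; the function name and the sample frame rates are illustrative assumptions, not part of Stability AI's tooling.

```python
# Frame counts are taken from the article; everything else is illustrative.
FRAME_COUNTS = {"SVD": 14, "SVD-XT": 25}


def clip_duration_seconds(frames: int, fps: float) -> float:
    """Return the playback length in seconds of `frames` frames shown at `fps`."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames / fps


# Sample rates spanning the 3 to 30 fps range mentioned in the article.
for name, frames in FRAME_COUNTS.items():
    for fps in (3, 6, 30):
        seconds = clip_duration_seconds(frames, fps)
        print(f"{name}: {frames} frames at {fps} fps = {seconds:.2f} s")
```

At 6 frames per second, for instance, the two models work out to roughly 2.3 and 4.2 seconds, which is in line with the typical range reported above.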
“While we eagerly update our models with the latest advancements and work to incorporate your feedback,” the company writes on its website, “this model is not intended for real-world or commercial applications at this stage. Your insights and feedback on safety and quality are important to refining this model for its eventual release.”

Notably, but perhaps unsurprisingly, the Stable Video Diffusion research paper does not reveal the source of the models’ training datasets, only saying that the research team used “a large video dataset comprising roughly 600 million samples” that they curated into the Large Video Dataset (LVD), which consists of 580 million annotated video clips that span 212 years of content in duration.

Stable Video Diffusion is far from the first AI model to offer this kind of functionality. We’ve previously covered other AI video synthesis methods, including those from Meta, Google, and Adobe. We’ve also covered the open-source ModelScope and what many consider the best AI video model at the moment, Runway’s Gen-2 model (Pika Labs is another AI video provider). Stability AI says it is also working on a text-to-video model, which will allow the creation of short video clips using written prompts instead of images.

The Stable Video Diffusion source and weights are available on GitHub, and another easy way to test it locally is by running it through the Pinokio platform, which handles installation dependencies and runs the model in its own environment.
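For a sense of scale, the dataset figures quoted above (580 million clips spanning 212 years) imply an average clip length of about 11.5 seconds. The back-of-the-envelope sketch below assumes a 365.25-day year and that the 212 years refers to total playback time; the paper may count duration differently, and the function name is hypothetical.

```python
# Assumption: one year = 365.25 days of continuous playback.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def average_clip_seconds(total_years: float, clip_count: int) -> float:
    """Return the mean clip length in seconds for a dataset of `clip_count` clips."""
    if clip_count <= 0:
        raise ValueError("clip_count must be positive")
    return total_years * SECONDS_PER_YEAR / clip_count


# Figures quoted in the article for the Large Video Dataset (LVD).
avg = average_clip_seconds(212, 580_000_000)
print(f"Average clip length: {avg:.1f} seconds")
```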
