OpenAI Changes the Face of AI Video

Cory Wright
Feb 16, 2024

Updated: Feb 20, 2024

AI video generation has come on in leaps and bounds in the past year, but no one could have predicted it would jump this far this fast. Well, except maybe OpenAI.

The creators of ChatGPT and DALL-E 3 have just released a preview of their new generative AI system, called Sora. As the previews show, OpenAI has raised the bar on what is possible with AI video.

Compared with some current video generators, Sora is far and away the winner. See the comparisons below, each generated from the same prompt.

PROMPT: The crashing blue waters create white-tipped waves, while the golden light of the setting sun illuminates the rocky shore. A small island with a lighthouse sits in the distance, and green shrubbery covers the cliff’s edge. The steep drop from the road down to the beach is a dramatic feat, with the cliff’s edges jutting out over the sea. This is a view that captures the raw beauty of the coast and the rugged landscape of the Pacific Coast Highway.

Sora

Pika

Pixverse

While current AI video generators can produce some amazing results as well, Sora ups the ante not only in video quality but in sheer length. Most generators produce clips of about 4 seconds; Sora can produce videos of up to 60 seconds.

Not only can videos be longer, but a single prompt can also yield multiple cuts from multiple camera angles, all with the same subjects. Sora also seems to solve many of the morphing, warping, and inconsistency issues that hold back other AI video generators.

PROMPT: A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.

Sora

Pixverse, using the same prompt

While the prompt doesn't specify any cuts, the AI model decides where they might look best. Cuts will more than likely become something users can specify in the prompt to create exactly what they are envisioning, and with 60 seconds to play with, a lot can be done. The Pixverse comparison looks pretty good, but its 4-second cap on video length doesn't allow for much to happen.

Natural Language

Since this is OpenAI, natural language can be used for all the prompts. This means you can describe your scene just as you picture it, without being weighed down by a complicated prompt structure; you can just say exactly what you need. And if it is anything like ChatGPT, making changes will hopefully be just as easy.

OpenAI does admit that Sora has its issues, particularly around complex scenes and physics. It also struggles with cause-and-effect prompts. Take OpenAI's own example of a person taking a bite of a cookie: the AI can create the video of the person, the cookie, and the bite, but doesn't understand that the cookie should afterward have a bite taken out of it. More precise prompting might solve this, such as telling Sora that the person takes a bite and that a bite should then be missing from the cookie. OpenAI doesn't suggest this as a solution, but it would seem to make sense that this level of precision would help. We'll have to wait and see.

But When!?

According to OpenAI's announcement, the model is already being tested by a select group of users. Hopefully it will be available to everyone soon, perhaps as part of a Plus plan or as an additional membership.

Check out the full announcement and more examples at www.openai.com/sora