A Sora-generated image. Prompt: "Several giant wooly mammoths treading through a snowy meadow, their long wooly fur lightly blows in the wind as they walk, snow-covered trees and dramatic snow-capped mountains in the distance, mid-afternoon light with wispy clouds and a sun high in the distance creates a warm glow, the low camera view is stunning capturing the large furry mammal with beautiful photography, depth of field." Image screenshotted with permission from OpenAI.

For the past week, social media feeds have been awash with A.I.-generated videos courtesy of OpenAI’s Sora. Many commentators have expressed awe at the complex, photorealistic worlds that can be built by this new generative tool, but the overall mood is muted compared to the unbridled excitement that greeted DALL-E two years ago.

Like DALL-E, Sora works from text or image prompts, but it creates soundless videos that can last up to one minute and come in a range of resolutions. To launch the tool, OpenAI’s CEO Sam Altman invited users of X (formerly Twitter) to suggest prompts and posted the requested A.I.-generated videos in response. Examples shared so far include different ocean animals competing in a bicycle race, a grandmother’s cooking tutorial from her rustic Tuscan kitchen, and two golden retrievers podcasting on top of a mountain.

Sora is still being safety tested and has not yet been made available for public use, but, much like still-image artists before them, filmmakers, animators, and VFX artists are rushing to grapple with the implications of this new technology. The legendary world-building prowess of directors like Wes Anderson and Tim Burton is unlikely to be challenged by a machine that is not yet capable of producing charm, sophistication, or attention to detail. The case may be different for some of the specialists tasked with bringing these fantasy realms to life.

Sora’s release inevitably dominated a series of panel discussions on the future of A.I. in film production at the Berlinale over the weekend.

“This is game changing,” the L.A.-based director Dave Clark told The Hollywood Reporter. “You should fear the person who uses these tools.” He emphasized, however, that tried-and-tested storytelling techniques would remain fundamental to creating films with widespread appeal. “When you create that 60-second shot of an astronaut soaring through space, then what?”

“A lot of people don’t want to talk about A.I. but it’s just a new thing and we have to work with it,” said Simon Weisse, a prop designer who specializes in miniatures and was keen to draw attention to A.I.’s potential uses. “For background pictures on the miniatures, instead of searching on Google for days to find pictures, we just use ChatGPT.”

How such an impressive text-to-video generator was developed is a question that is likely to attract scrutiny. OpenAI has published a technical report that delves into which A.I. models were used, but little information has yet been released about the data used to train the model. We know that some of the data was licensed and some was “publicly available,” but the latter category remains vague.

This is not particularly surprising in light of the wave of lawsuits OpenAI is facing from artists, authors, and the New York Times over its alleged use of copyrighted material as training data. Last year, the company openly admitted to the U.K. government that, without some kind of legal exemption granting access to copyrighted data, “it would be impossible to train today’s leading A.I. models.”

“Did the training data providers consent to their work being used?” asked Ed Newton-Rex, the CEO of Fairly Trained, on X. “The total lack of info from OpenAI on this doesn’t inspire confidence. Across the A.I. industry, people’s work is being exploited without consent to build products that compete with that work.”

“When I started working on A.I. four decades ago, it simply didn’t occur to me that one of the biggest use cases would be derivative mimicry, transferring value from artists and other creators to mega corporations,” added leading A.I. skeptic Gary Marcus.

OpenAI removed the option to request images in the style of a living artist when it released DALL-E 3 last September. It also introduced what has been widely criticized as an intentionally bothersome process for creators to opt out of having their art used as training data for future models. Third-party tools like Nightshade go further, allowing creators to “poison” their art so that it degrades models trained on it.

The rights of copyright holders are not the only issue raised by the release of Sora. OpenAI products are prevented from generating violent, hateful, or sexual imagery, but A.I. experts are also concerned about the technology’s potential to be used to produce misinformation, particularly with regard to the forthcoming U.S. election as well as a record number of elections happening around the globe this year. For the time being, A.I.-generated videos tend to contain at least a few tell-tale mistakes, and on-screen causes do not always produce their expected effects, which often prevents seamless continuity.

Sora’s potentially troubling ability to animate image prompts is currently being held back from public demonstrations, and it is not clear if or when it will be released to the public.

“There is no reason to believe that text-to-video will not continue to rapidly improve—moving us closer and closer to a time when it will be difficult to distinguish the fake from the real,” the deepfakes expert Hany Farid told New Scientist. He also warned that these eerily soundless videos could soon be combined with A.I.-powered voice cloning.

“The solution to misinformation will involve some level of mitigation on our part, but it will also need understanding from society and for social media networks to adapt as well,” OpenAI’s Aditya Ramesh told Wired. Earlier this month, Meta pledged to label all A.I.-generated imagery on its platforms.

While OpenAI may not be rushing to release its text-to-video generator for public use just yet, it is only a matter of time. When the company launched DALL-E 2 and ChatGPT in 2022, imitators were hot on its heels. This time around, multiple tech giants are competing to lead the charge, including Google and Meta.

Startups like Runway and Pika Labs have also already launched text-to-video generators, and a similar tool has been teased by Stability A.I., the company behind the popular text-to-image generator Stable Diffusion. None of these tools has so far achieved Sora’s crisp, high-definition finish or the impressive length of its outputs, but OpenAI’s new product has no doubt raised the stakes.