How Kling beat Sora in the AI race
Their new text-to-video model could well revolutionise the film industry
June 13th, 2024
In February, OpenAI revealed Sora, a model that, like ChatGPT, works from a text prompt, except that it generates highly realistic videos up to one minute long, surpassing previous models, which were limited to a few seconds. Then in May, at the Google I/O 2024 conference, Google unveiled Veo, which pushes video generation beyond one minute. Today, these two models, still unavailable to the public, must contend with a serious competitor: Kling, developed by Kuaishou Technology, which promises two-minute videos.
Sora by OpenAI is insane.
But KWAI just dropped a Sora-like model called KLING, and people are going crazy over it.
Here are 10 wild examples you don't want to miss:
1. A Chinese man sits at a table and eats noodles with chopsticks pic.twitter.com/MIV5IP3fyQ
Angry Tom (@AngryTomtweets), June 6, 2024
Kuaishou, mainly known for its short-video sharing platform, has rapidly gained popularity since its launch in 2011, becoming China's second-largest social network behind TikTok (known in China as Douyin) and expanding internationally under the name Kwai. While the app offers content ranging from entertainment videos and tutorials to personal vlogs, the company has in parallel strengthened its AI strategy. In August 2023, it presented its KwaiYii family of LLMs and, more recently, its text-to-image model Kolors, similar to DALL-E from competitor OpenAI. Kling, its latest innovation, currently in testing, converts text into videos up to two minutes long at 1080p resolution and 30 frames per second, thanks, according to the company, to "an efficient training infrastructure, extreme inference optimization, and scalable infrastructure." The model also stands out for the flexibility of its output formats: trained at variable resolutions, it can generate videos in various aspect ratios, adapting to different staging and distribution needs.
This is wild.
Chinese AI KLING is breaking the Internet while OpenAI Sora is sleeping.
People with access are already generating AI videos and short films.
The videos look insane.
1. "Zootopia Grand Prix" pic.twitter.com/pmCZctsMtT
Min Choi (@minchoi), June 9, 2024
Kling, like Sora, uses a 3D spatio-temporal attention mechanism and a diffusion transformer, which allow it to model complex movements. Its 3D face and body reconstruction technology (3D VAE) drives facial expressions and body movements from a single image, letting users animate a 3D model with fine control over its expressions and movements, for example by making it dance or sing. Models like Kling could well transform the film industry, as suggested by the upcoming screening of "Sora Shorts", a series of short films created with Sora, at the Tribeca Film Festival, which showcases the potential of these technologies for cinema. One may wonder whether cinema, in a few decades, will still need actors.