Sber has introduced a major update to the Kandinsky generative neural network. With the upgrade to Kandinsky 3.0, the generation of illustrations has significantly improved, and the new Kandinsky Video is the first Russian neural network capable of creating videos.
Let's start with Kandinsky 3.0. The main improvement over version 2.2 is better prompt understanding: generated images now match the prompt more closely without any loss in quality. The model has also become much better at handling prompts that draw on the national cultural code, such as the heroes of Soviet and Russian films and cartoons. The difference is clearly visible in prompts featuring Cheburashka and Kuzya the little domovoy (house spirit):
The developers also showed a comparison with older versions of Kandinsky and with other popular models: Midjourney (labeled MJv.52), Stable Diffusion XL (SDXL) and DALL-E 3. Here are the results for the prompt "beautiful girl":
And here is "man with a beard":
And this is "Barbie and Ken are shopping":
Another new feature is the Inpainting and Outpainting modes: the ability to fit a new object into an existing image or to extend the image beyond its original borders (similar to Generative Fill in Photoshop). Here is an example of extending an image:
And these are examples of adding an object:
You can try out the neural network on the Fusion Brain platform, in the official Telegram bot, or on VKontakte.
Kandinsky Video creates short animations from a text prompt: up to 8 seconds long, at a frame rate of about 30 frames per second and a resolution of up to 512 pixels on the larger side. The user can set the height and width. Video creation is available in beta on Fusion Brain, while the Telegram bot so far only offers a sign-up for the waiting list.
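To get a feel for what these limits mean in practice, here is a rough Python sketch of the arithmetic. It is only an illustration: the function names are hypothetical, the constants are taken from the figures quoted above, and the way the service actually validates or rescales a requested size is not described in the announcement, so proportional downscaling is an assumption.

```python
# Limits as quoted in the announcement (the frame rate is approximate).
MAX_SIDE_PX = 512
MAX_SECONDS = 8
APPROX_FPS = 30


def fit_to_max_side(width, height, max_side=MAX_SIDE_PX):
    """Shrink (width, height) proportionally so the larger side fits max_side.

    Hypothetical helper: the real service may reject or crop oversized
    requests instead of scaling them.
    """
    larger = max(width, height)
    if larger <= max_side:
        return width, height  # already within the limit
    scale = max_side / larger
    return max(1, round(width * scale)), max(1, round(height * scale))


def approx_frame_count(seconds, fps=APPROX_FPS):
    """Estimate the number of frames in a clip, capped at the 8-second limit."""
    clipped = min(seconds, MAX_SECONDS)
    return round(clipped * fps)


if __name__ == "__main__":
    print(fit_to_max_side(1024, 576))  # a 16:9 request scaled to fit
    print(approx_frame_count(8))       # frames in a maximum-length clip
```

Under these assumptions, a maximum-length clip comes to roughly 240 frames, and a 1024x576 request would be reduced to 512x288.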