Sber Unveils a 29B-Parameter GigaChat Model and Early API Access


Sber developers are advancing toward a new version of the GigaChat service, built on one of the most capable Russian-language models to date, featuring 29 billion parameters. The milestone was announced by Andrey Belevtsev, Senior Vice President, CTO, and Head of the Technology Block at Sberbank, during the AI Journey conference, "Journey into the World of Artificial Intelligence."

Belevtsev explained that the forthcoming GigaChat platform will be powered by a large language model that aims to place the service on a par with leading international solutions. He emphasized that training models at this scale represents a colossal and intricate computational challenge, something Sber has not attempted before: training the new model demanded nearly six times the volume of computation required for the 13-billion-parameter ruGPT-3 model in 2021, underscoring the magnitude of the effort.

The executive highlighted that Sber has assembled and refined a unique data set specifically for GigaChat, with hundreds of Sber employees contributing to its development. This data foundation is designed to improve the quality of responses across a wide range of topics and use cases, driving more accurate and helpful interactions for users.

Thanks to these efforts, each new release of GigaChat enables users to tackle problems more effectively. Sber reports that the new large language model adheres more closely to user instructions and can handle more complex tasks. Capabilities now include summarization, rewriting, and editing of texts, alongside a marked improvement in the quality of answers to diverse questions. In internal comparisons between the new model and the previous version, overall response quality increased by about 23 percent, while accuracy in reflecting real-world information improved by roughly 25 percent.

To achieve these results, a substantial amount of experimentation was conducted to enhance the model and speed up the training process. A specialized framework was employed to partition neural network weights across multiple GPUs, enabling efficient training of large language models and reducing memory demands on individual cards. This technical approach helped unlock the potential of larger parameter counts without overwhelming hardware resources.
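The partitioning approach described above resembles sharded training schemes such as ZeRO or FSDP, in which each weight tensor is split across devices so that no single GPU must hold the full model. The article does not name Sber's framework, so the following is an illustrative sketch of the general idea, simulating device shards with plain Python lists:

```python
import math

class ShardedParameter:
    """A weight tensor split evenly across `world_size` simulated devices."""

    def __init__(self, values, world_size):
        self.world_size = world_size
        shard_len = math.ceil(len(values) / world_size)
        # Each rank stores only its own slice of the full weight vector,
        # cutting per-device memory roughly by a factor of world_size.
        self.shards = [values[i * shard_len:(i + 1) * shard_len]
                       for i in range(world_size)]

    def local_memory(self, rank):
        # Elements held on one device: about 1/world_size of the tensor.
        return len(self.shards[rank])

    def all_gather(self):
        # Reassemble the full tensor just-in-time, e.g. for a forward pass.
        return [v for shard in self.shards for v in shard]

weights = list(range(1000))           # stand-in for one large weight tensor
param = ShardedParameter(weights, world_size=4)
print(param.local_memory(0))          # 250: each device holds a quarter
print(param.all_gather() == weights)  # True: gathering restores the tensor
```

In real frameworks the gather happens layer by layer and the gathered copy is freed immediately after use, which is what keeps peak memory on each card low even for models with tens of billions of parameters.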

On the internal evaluation front, the GigaChat model with 29 billion parameters demonstrated superior performance on the Massive Multitask Language Understanding (MMLU) benchmark, outperforming LLaMA 2 34B, the most widely used open analogue. This result positions the new GigaChat variant as a competitive option in the global AI landscape and signals meaningful progress for Sber's research and development program.

Commercially, Sber anticipates early access to the updated API for selected enterprise clients, enabling them to deploy their own solutions, as well as providing the academic community with opportunities to advance research in natural language processing and related fields. This rollout is expected to foster broader adoption while supporting rigorous experimentation in real-world environments.

The AI Journey conference series continues to underscore Sber’s commitment to artificial intelligence, with the latest edition inaugurated on November 22 and running through November 24. The event brings together researchers, developers, and industry leaders to explore advancements in AI and their practical implications for business and society.
