Sber’s GigaChat neural network has reportedly outperformed most publicly available models in recent benchmarking conducted under the MERA framework. The results were shared by Sberbank’s press service, which highlighted the model’s strong performance across a series of structured tasks.
The evaluation featured two Sber neural network configurations: GigaChat PRO and GigaChat Lite+.
In a practical assessment consisting of 21 instruction-based tasks spanning diverse knowledge domains, GigaChat PRO achieved a score of 51.3 out of 100, surpassing the Mixtral 8x7B Instruct model, which earned 47.8 points. These figures come from an open evaluation system intended to offer objective and transparent insights into model capabilities.
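The press materials do not detail how MERA aggregates per-task results into a single figure, but a common approach for composite benchmarks is an unweighted mean of per-task scores scaled to a 0–100 range. The sketch below illustrates that idea only; the task names and per-task values are hypothetical, and MERA’s actual tasks, metrics, and weighting may differ.

```python
# Minimal sketch of one common way a composite 0-100 benchmark score
# could be derived. The task names and scores are hypothetical and are
# NOT actual MERA data; MERA defines its own tasks and aggregation.

def aggregate_score(task_scores: dict[str, float]) -> float:
    """Average per-task scores (each in [0, 1]) and scale to 0-100."""
    if not task_scores:
        raise ValueError("no task scores provided")
    return 100.0 * sum(task_scores.values()) / len(task_scores)

# Hypothetical per-task results for a 21-task suite
# (only three tasks shown for brevity).
model_results = {
    "reading_comprehension": 0.58,
    "commonsense_reasoning": 0.49,
    "math_word_problems": 0.47,
}

print(f"aggregate: {aggregate_score(model_results):.1f} / 100")
```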
Sberbank notes that the higher a model scores, the more effectively the underlying artificial intelligence can handle a broad range of intellectual and everyday tasks. The technology can assist with drafting articles in a chosen style, conducting targeted information searches, and producing analyses grounded in retrieved data.
According to the company, neural networks enable businesses to design their own AI solutions and streamline internal processes. This aligns with the broader goal of leveraging language models to improve efficiency and decision making across organizational functions.
Andrey Belevtsev, Senior Vice President and Head of the Technology block at Sberbank, emphasized the importance of maintaining a precise understanding of what large language models can actually do, especially as development in this space accelerates. He highlighted that evaluation results help users learn how to apply GigaChat effectively and provide researchers with objective data for future training, customization, and ongoing advancement of large language models.
Belevtsev suggested that the test outcomes not only recognize the work of the Sber team but also serve as a foundation for enhancing the service. The aim is to make the platform more convenient and valuable for both everyday users and business customers, driving practical benefits in real-world tasks.
The MERA benchmark, short for Multimodal Evaluation for Russian-language Architectures, was introduced at the AI Journey conference in 2023. The initiative gathered support from several AI Alliance member companies along with academic partners such as Skoltech AI and the National Research University Higher School of Economics. Their collaboration contributed to the development of the evaluation materials and tests that underpin the MERA framework.
These developments illustrate how open evaluation ecosystems can shape the deployment of language and multimodal AI in enterprise settings. By providing clear metrics and comparative results, they help buyers, developers, and researchers make smarter choices about model selection, integration, and ongoing improvement. As large language models continue to advance, transparent benchmarks like MERA are likely to play a central role in guiding responsible progress and practical adoption across industries.
Source: Sberbank press service.