French AI Startup Mistral Unveils New Language Models for Code Generation and Math Reasoning

French AI startup Mistral has introduced two language models aimed at code generation and mathematical reasoning. The first, Codestral Mamba, is a compact and efficient model for code generation, while the second, MathΣtral (Mathstral), focuses on solving complex mathematical problems.

Codestral Mamba: A Swift Code Generation Assistant

Codestral Mamba is designed to generate code quickly despite its relatively small size of 7 billion parameters. The model can handle a context of up to 256k tokens, which translates to roughly 50,000 to 200,000 lines of code, depending on the programming language and style.
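
As a rough sanity check on the figures above, the sketch below works out how many tokens per line the quoted range implies, and how many lines would fit at a few other densities. It assumes "256k" means 256,000 tokens; the function names and the sample densities of 5, 10, and 20 tokens per line are illustrative choices, not measurements from Mistral.

```python
# Back-of-the-envelope check of the context-window claim.
# Assumption: "256k tokens" is read as 256,000 tokens.
CONTEXT_TOKENS = 256_000


def implied_tokens_per_line(lines: int, context_tokens: int = CONTEXT_TOKENS) -> float:
    """Average tokens per line needed for `lines` lines to fill the context."""
    if lines <= 0:
        raise ValueError("lines must be positive")
    return context_tokens / lines


def lines_that_fit(tokens_per_line: float, context_tokens: int = CONTEXT_TOKENS) -> int:
    """Whole number of lines that fit in the context at a given density."""
    if tokens_per_line <= 0:
        raise ValueError("tokens_per_line must be positive")
    return int(context_tokens // tokens_per_line)


if __name__ == "__main__":
    # Densities implied by the article's 50,000 to 200,000 line range.
    for lines in (50_000, 200_000):
        print(f"{lines} lines -> {implied_tokens_per_line(lines):.2f} tokens per line")

    # Hypothetical densities, chosen only for illustration.
    for density in (5, 10, 20):
        print(f"{density} tokens per line -> {lines_that_fit(density)} lines")
```

Under these assumptions, the quoted range corresponds to about 1.3 to 5.1 tokens per line, so code with longer or denser lines would fit proportionally fewer lines in the same window.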

Mistral positions Codestral Mamba as a local code assistant suited to real-time code autocompletion, syntax error detection, and personalized coding assistance. According to Mistral, the model outperforms competitors such as Google’s CodeGemma and even surpasses larger models such as Meta’s CodeLlama.

Codestral Mamba is built on the Mamba architecture, introduced by researchers Albert Gu and Tri Dao, which uses selective state space models (SSMs) instead of the attention mechanism of the traditional Transformer. Because inference time scales linearly with sequence length, rather than quadratically as with attention, the model can manage longer inputs efficiently.
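
To illustrate what processing a sequence linearly means, here is a toy state space recurrence in plain Python. It is a simplified sketch and not Mamba's actual implementation: the function name, the softplus step size, and the parameters `decay_rates` and `readout` are assumptions made for this example. The point is that the model carries a fixed-size state and updates it once per input, so the work grows in proportion to the sequence length.

```python
import math
from typing import List, Sequence


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x)); always positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def toy_selective_scan(
    xs: Sequence[float],
    decay_rates: Sequence[float],
    readout: Sequence[float],
) -> List[float]:
    """Run a toy input-dependent state space recurrence in a single pass.

    For each state channel i:
        keep_t   = exp(-step_t * decay_rates[i])
        h_t[i]   = keep_t * h_{t-1}[i] + (1 - keep_t) * x_t
        y_t      = sum_i readout[i] * h_t[i]
    The step size step_t depends on the input x_t, which is a toy stand-in
    for the "selective" part of a selective SSM.
    """
    if len(decay_rates) != len(readout):
        raise ValueError("decay_rates and readout must have the same length")

    state = [0.0] * len(decay_rates)  # fixed-size memory, independent of len(xs)
    outputs: List[float] = []
    for x in xs:  # one update per token: cost is linear in sequence length
        step = softplus(x)
        for i, rate in enumerate(decay_rates):
            keep = math.exp(-step * rate)
            state[i] = keep * state[i] + (1.0 - keep) * x
        outputs.append(sum(c * h for c, h in zip(readout, state)))
    return outputs


if __name__ == "__main__":
    print(toy_selective_scan([1.0, 1.0, 0.0, 2.0], [0.5, 2.0], [0.7, 0.3]))
```

Unlike attention, which compares every token with every other token, this loop never looks back at earlier inputs directly; everything it remembers lives in `state`.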

Codestral Mamba is available under an Apache 2.0 license, which permits commercial use, modification, and redistribution, so users can integrate it into proprietary software and ship it to customers. The model can be tested on Mistral’s la Plateforme and downloaded from Hugging Face.

MathΣtral: Tackling Advanced Math Problems

Mistral also unveiled MathΣtral, a model designed for mathematical problems that require advanced, multi-step logical reasoning. Released as a tribute to Archimedes, MathΣtral is aimed at academics and scientists working on such problems.

Developed in collaboration with Project Numina, MathΣtral achieves state-of-the-art performance for its size category on several benchmarks, according to Mistral. It scored 56.6% on the MATH benchmark and 63.47% on the MMLU test. Higher scores are possible with additional inference-time computation: Mistral reports 68.37% on MATH with majority voting and 74.59% when a strong reward model picks among 64 candidate answers.
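
One common way to spend additional inference-time computation is majority voting: sample several answers to the same problem and keep the most frequent one. The sketch below shows the idea in isolation. The `sample_answer` callable is a hypothetical stand-in for a call to a model, and the default of 64 samples is an illustrative choice; none of this is Mistral's evaluation code.

```python
from collections import Counter
from typing import Callable, Hashable, Optional


def majority_vote(
    sample_answer: Callable[[], Optional[Hashable]],
    num_samples: int = 64,
) -> Optional[Hashable]:
    """Draw `num_samples` answers and return the most frequent one.

    `sample_answer` is a hypothetical function that returns one final answer
    per call, or None when no answer could be extracted from the output.
    Ties are broken in favor of the answer that appeared first.
    """
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")

    answers = [sample_answer() for _ in range(num_samples)]
    votes = Counter(a for a in answers if a is not None)
    if not votes:
        return None  # every sample failed to produce an answer
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Stub sampler replaying fixed answers, used here instead of a real model.
    canned = iter(["42", "41", "42", None, "42", "7"])
    print(majority_vote(lambda: next(canned), num_samples=6))
```

The cost scales with the number of samples, since each vote is a full generation, which is why this counts as trading extra computation for accuracy.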

According to Mistral, MathΣtral illustrates the performance-to-speed tradeoffs that purpose-built models can achieve, a development philosophy the company promotes on la Plateforme. The model can be fine-tuned for specific mathematical or scientific domains, and its weights are available on Hugging Face.

Industry Impact and Accessibility

These new models reflect Mistral's commitment to specialized AI models for specific domains. Codestral Mamba and MathΣtral illustrate a growing trend toward targeted AI tools that aim for strong performance and efficiency on a narrow set of tasks.

If they perform as advertised, the models could become useful tools for professionals in coding and mathematics, supporting work in technology and research.