Mistral AI has introduced Mistral Large 2, the new generation of its flagship model, which succeeds the previous version with enhanced capabilities. The French AI startup’s new large language model (LLM) is designed to improve on its predecessor’s performance in code generation, mathematics, and reasoning. The firm also claims Mistral Large 2 offers stronger multilingual support and advanced function-calling features.
The LLM also features a 128k context window and supports multiple languages, including French, German, Arabic, Mandarin, and Hindi, as well as 80 coding languages such as Python, Java, C, C++, JavaScript, and Bash. With 123 billion parameters, the model is sized for single-node inference, allowing it to run at high throughput on a single node.
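For a sense of what single-node inference of a 123-billion-parameter model involves, below is a minimal sketch that loads the open weights with Hugging Face transformers and shards them across the GPUs of one machine. The repository name mistralai/Mistral-Large-Instruct-2407, the bfloat16 precision, and the prompt are assumptions for illustration rather than details confirmed in this article.

```python
# Minimal sketch: single-node inference with Hugging Face transformers.
# The repo name below is an assumption; a node with several large GPUs
# is needed to hold the 123B parameters even in bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-Large-Instruct-2407"  # assumed Hugging Face repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard the weights across all GPUs on the node
)

messages = [{"role": "user", "content": "Write a quicksort function in Python."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```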
Mistral Large 2 is available under the Mistral Research License for research and non-commercial use, while commercial deployment requires a Mistral Commercial License. The company says the LLM sets a new benchmark for performance and cost-efficiency, achieving 84% accuracy on the Massive Multitask Language Understanding (MMLU) evaluation. Mistral Large 2 is also designed to deliver improvements in code generation and reasoning, outperforming its predecessor and performing comparably to leading models such as GPT-4o, Claude 3 Opus, and Llama 3 405B.
Greater reasoning capabilities than its predecessors
Significant effort has also gone into enhancing the model’s reasoning capabilities and reducing its tendency to generate incorrect or irrelevant information. To that end, Mistral Large 2 has been fine-tuned for accuracy on mathematical benchmarks, among others, and is better at following instructions and managing long multi-turn conversations. The model also keeps its output concise, which suits business applications.
The new model also excels in handling multilingual documents, Mistral claimed, and has been trained on extensive multilingual data. It features improved function calling and retrieval skills, making it suitable for complex business applications.
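To make the function-calling claim concrete, here is a minimal sketch of a tool-use request against Mistral’s public chat completions endpoint, using the mistral-large-2407 identifier introduced below. The get_invoice_status tool and its schema are hypothetical examples for illustration, not part of Mistral’s API or of this article.

```python
# Minimal sketch of function calling with Mistral Large 2 over the REST API.
# The tool definition (get_invoice_status) is hypothetical.
import json
import os
import requests

API_URL = "https://api.mistral.ai/v1/chat/completions"
API_KEY = os.environ["MISTRAL_API_KEY"]  # key issued on la Plateforme

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_invoice_status",  # hypothetical business tool
            "description": "Look up the payment status of an invoice.",
            "parameters": {
                "type": "object",
                "properties": {"invoice_id": {"type": "string"}},
                "required": ["invoice_id"],
            },
        },
    }
]

payload = {
    "model": "mistral-large-2407",
    "messages": [{"role": "user", "content": "Has invoice INV-1042 been paid?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

resp = requests.post(
    API_URL, headers={"Authorization": f"Bearer {API_KEY}"}, json=payload, timeout=60
)
resp.raise_for_status()
message = resp.json()["choices"][0]["message"]

# When the model opts to call the tool, it returns the function name and
# JSON-encoded arguments for the caller to execute.
for call in message.get("tool_calls") or []:
    print(call["function"]["name"], json.loads(call["function"]["arguments"]))
```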
Mistral Large 2 can be accessed on la Plateforme under the name mistral-large-2407 (version 24.07). It can be tested on le Chat and is also available on HuggingFace. Mistral added that it is consolidating its offering around Mistral NeMo and Mistral Large as general-purpose models, and Codestral and Embed as specialist models. Fine-tuning capabilities are now available on la Plateforme for Mistral Large, Mistral NeMo, and Codestral.
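As a quick illustration of access via la Plateforme, the sketch below sends a single chat completion request to the mistral-large-2407 model. It assumes an API key issued by la Plateforme and Mistral’s public chat completions endpoint; the prompt and sampling parameters are placeholders.

```python
# Minimal sketch: calling mistral-large-2407 on la Plateforme via REST.
import os
import requests

API_URL = "https://api.mistral.ai/v1/chat/completions"
API_KEY = os.environ["MISTRAL_API_KEY"]  # key issued on la Plateforme

payload = {
    "model": "mistral-large-2407",
    "messages": [
        {"role": "user", "content": "Summarise this contract clause in one sentence: ..."}
    ],
    "temperature": 0.2,  # placeholder sampling setting
    "max_tokens": 256,
}

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```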
Cloud collaborations and investigations
Mistral’s influence in the AI value chain has grown in recent years, thanks in large part to its partnerships with major cloud service providers. The Mistral Large 2 model is now available on Google Cloud Platform’s Vertex AI via a Managed API, as well as on Azure AI Studio, Amazon Bedrock, and IBM watsonx.ai. Recently, Mistral AI, in collaboration with NVIDIA, introduced the Mistral NeMo 12B language model. Said to be the French firm’s best small model, Mistral NeMo 12B is designed for enterprise applications, including chatbots, multilingual tasks, coding, and summarization.
The AI startup has also attracted the scrutiny of competition regulators. Earlier this year, the UK Competition and Markets Authority (CMA) ended its investigation into the partnership between Microsoft and Mistral AI. The CMA had been examining whether Microsoft’s collaboration with the French AI startup constituted a ‘relevant merger situation’ that could potentially reduce competition in the market. The British regulator concluded that this was not the case. Announced in February, the partnership involved Microsoft investing $16m in Mistral AI, providing access to its supercomputing infrastructure, and enabling the French firm’s models to be featured on Microsoft’s Azure platform.