Google has launched an experimental AI model, Gemini 2.0 Flash Thinking, designed to tackle complex questions with a “reasoning” approach. The model is positioned as a direct competitor to OpenAI’s o1 reasoning model, part of the industry’s intensifying focus on improving AI systems’ decision-making and problem-solving capabilities.

Introduced by Google DeepMind’s chief scientist Jeff Dean and AI Studio’s product lead Logan Kilpatrick, the new model aims to improve reasoning by not only providing answers but also explaining the thought process behind them. Google says the model breaks problems down into smaller, more manageable tasks to produce more accurate results. While this does not replicate human reasoning precisely, the search giant explained, the approach strengthens the AI’s ability to solve intricate problems, especially in fields like programming, mathematics, and physics.

Enhancing AI’s problem-solving capabilities

Dean added that Gemini 2.0 Flash Thinking benefits from faster computation because it is built on the Gemini 2.0 Flash model. During a demonstration, the model solved a physics problem while explaining its reasoning step by step, showing how it reached its conclusions. Kilpatrick also highlighted the model’s ability to handle multimodal tasks, such as combining visual and textual data for more comprehensive reasoning.

Despite its promising features, the model is still in its early stages, with room for improvement. Users can experiment with Gemini 2.0 Flash Thinking through Google’s AI Studio platform, which allows developers to prototype AI solutions. The search giant described the model as ideal for “multimodal understanding, reasoning, and coding,” with a specific emphasis on complex problem-solving across various disciplines.
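For developers exploring the model in AI Studio, a minimal sketch of a call through the google-generativeai Python SDK might look like the following. The model identifier used here (“gemini-2.0-flash-thinking-exp”) is an assumption based on Google’s naming for the experimental release, and availability may vary.

    # Minimal sketch (assumed model name): querying the experimental
    # thinking model via the google-generativeai Python SDK.
    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")  # key generated in AI Studio

    # Assumed identifier for the experimental thinking model
    model = genai.GenerativeModel("gemini-2.0-flash-thinking-exp")

    response = model.generate_content(
        "A ball is thrown straight up at 12 m/s. How long until it returns "
        "to the thrower's hand? Explain your reasoning step by step."
    )
    print(response.text)  # the reply includes the model's step-by-step reasoning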

The model’s approach to reasoning differs from that of most other AI models: it pauses before responding to a prompt, working through multiple potential solutions and explaining its reasoning as it goes. However, it is not without limitations. In one test, when asked how many “R”s were in the word “strawberry,” it erroneously answered “two.”
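For reference, the correct count is three, as a quick check confirms:

    # The word "strawberry" contains three occurrences of the letter "r".
    print("strawberry".count("r"))  # prints 3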

Google’s launch comes at a time when the field of reasoning models is rapidly evolving. Not only has OpenAI made its o1 reasoning model available to ChatGPT users, but other companies, including DeepSeek and Alibaba, are also developing competing models. In November, DeepSeek introduced its DeepSeek-R1, and Alibaba revealed a new challenger to o1, further intensifying the competition.

In addition to launching its reasoning model, Google is reportedly planning to integrate AI capabilities into its search functions. The company is set to introduce an AI Mode option that will let users hold conversational exchanges with a Gemini-like chatbot directly on the search results page. The feature, expected to be available soon, will allow users to ask follow-up questions and will include external links for further exploration.

Meanwhile, against the backdrop of these developments, a recent study by Anthropic’s Alignment Science team, conducted in collaboration with Redwood Research, has shed light on an emerging issue in large language models (LLMs) termed “alignment faking.” The term refers to instances where models appear to comply with new training objectives while covertly retaining preferences from earlier training. The research, which involved testing Anthropic’s Claude models, showed that reinforcement learning may struggle to produce truly aligned systems, underscoring the challenges facing developers of highly capable, self-learning AI.

Read more: Study reveals ‘alignment faking’ in LLMs, raising AI safety concerns