
Kimi k1.5: The First Non-OpenAI Model to Match Full-Powered O1 Performance

The AI arms race is evolving, and Kimi k1.5 is setting a new benchmark. As a multimodal thinking model, Kimi k1.5 achieves what no company outside OpenAI has managed: performance matching the full-powered O1 model across reasoning benchmarks, without the “preview” or “mini” suffixes. This development marks a major step forward in artificial intelligence, signaling a new era of competition and innovation.

Raising the Bar for Long and Short CoT Reasoning

Kimi k1.5 shines in both Long CoT (Chain of Thought) and Short CoT reasoning tasks, proving its versatility and technical prowess.

1. Long Context Scaling:

Kimi k1.5 pushes the boundaries of long-chain reasoning with context lengths of up to 128k tokens during RL generation. By leveraging partial rollouts, it keeps training efficient while maintaining high performance, enabling the model to handle longer and more complex tasks without sacrificing speed or quality (a rough sketch of the partial-rollout idea appears after this list).

2. Long2Short Optimization:

With Long2Short techniques, Kimi k1.5 uses minimal tokens to complete tasks with maximum efficiency. This approach transfers the strengths of Long CoT models to Short CoT (Chain of Thought) models, keeping them competitive while consuming fewer computational resources (a sketch of one possible length-based reward follows the performance summary below).
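To make the partial-rollout idea concrete, here is a minimal Python sketch of how long generations can be cut into per-iteration segments and resumed later. The Rollout container and the model.generate interface are illustrative assumptions, not Kimi's actual code:

# Minimal sketch of partial rollouts; `model.generate` is a hypothetical
# interface, not Kimi's actual infrastructure.
from dataclasses import dataclass, field

@dataclass
class Rollout:
    prompt: str
    tokens: list = field(default_factory=list)  # tokens generated so far
    done: bool = False

def partial_rollout_step(model, rollout, segment_budget):
    # Generate at most `segment_budget` new tokens, resuming from earlier
    # segments, so one very long chain of thought never blocks the batch.
    new_tokens, finished = model.generate(
        prompt=rollout.prompt,
        prefix=rollout.tokens,          # resume from previously generated tokens
        max_new_tokens=segment_budget,  # cap the work done in this iteration
    )
    rollout.tokens.extend(new_tokens)
    rollout.done = finished
    return rollout

def collect_batch(model, queue, segment_budget=8192, batch_size=32):
    # Mix fresh prompts with resumed partial rollouts until enough finish.
    finished = []
    while len(finished) < batch_size and queue:
        r = partial_rollout_step(model, queue.pop(0), segment_budget)
        (finished if r.done else queue).append(r)
    return finished

The key property is that unfinished trajectories go back into the queue instead of stalling the current iteration.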

In short-chain reasoning tasks, Kimi k1.5 doesn't just compete with SOTA models such as GPT-4o and Claude Sonnet 3.5 in math, coding, and vision/multimodal tasks; it outperforms them, with margins reaching up to +550% on some benchmarks.

This performance leap redefines what’s achievable for compact, scalable AI systems in both long and short contexts.
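One way to picture Long2Short training is as reward shaping that prefers short, correct answers. The Python sketch below shows one plausible length-based reward of that kind; the exact formula is an illustrative assumption rather than a quotation from the report:

def length_shaped_reward(correct, length, min_len, max_len, base_reward=1.0):
    # Illustrative length bonus (an assumption, not Kimi's exact formulation):
    # +0.5 for the shortest sampled answer, -0.5 for the longest.
    if max_len == min_len:
        bonus = 0.0
    else:
        bonus = 0.5 - (length - min_len) / (max_len - min_len)
    if correct:
        return base_reward + bonus   # correct answers: shorter is better
    return min(0.0, bonus)           # wrong answers never get a positive bonus

Combined with an ordinary correctness reward, a term like this nudges the policy toward the concise reasoning style that Short CoT models need.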

A Technical Report Worth Reading

The Kimi team has released a comprehensive technical report detailing the methods, challenges, and breakthroughs behind Kimi k1.5. From reinforcement learning (RL) scaling to infrastructure optimization, the report serves as a valuable resource for researchers and developers.

The report also highlights the simplicity of Kimi's training methodology: it achieves exceptional results without complex techniques like Monte Carlo tree search, value functions, or process reward models, focusing instead on effective RL scaling and multimodal integration (a rough sketch of such an outcome-reward loop appears below).
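To illustrate how such a pipeline can stay simple, here is a hedged Python sketch of an outcome-reward policy update with a per-prompt mean baseline and no tree search, value network, or process reward model; policy.sample, verifier, and policy.update are hypothetical placeholders, not Kimi's infrastructure:

def rl_step(policy, verifier, prompts, samples_per_prompt=8):
    # Sample several answers per prompt, score each with an outcome-only
    # reward, and update the policy on advantages against a per-prompt baseline.
    batch = []
    for prompt in prompts:
        responses = [policy.sample(prompt) for _ in range(samples_per_prompt)]
        rewards = [1.0 if verifier(prompt, r) else 0.0 for r in responses]
        baseline = sum(rewards) / len(rewards)   # per-prompt mean reward
        for response, reward in zip(responses, rewards):
            batch.append((prompt, response, reward - baseline))  # advantage
    policy.update(batch)   # e.g. a policy-gradient-style update on the batch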

Abstract 

Language model pretraining with next token prediction has proven effective for scaling compute but is inherently limited by the quantity of high-quality training data. Scaling reinforcement learning (RL) offers a new avenue for advancing artificial intelligence, enabling large language models (LLMs) to expand their training data through reward-based exploration and, consequently, to scale compute as well.

However, prior work in this area has struggled to deliver competitive results. In this report, we share the training practices behind Kimi k1.5, our latest multimodal LLM trained with RL. Our approach is simple yet effective, achieving state-of-the-art reasoning performance across multiple benchmarks and modalities—e.g., 77.5% Pass@1 on AIME, 94% on Codeforces, and 74.9% on MathVista—matching OpenAI’s O1. Additionally, we introduce long2short techniques that leverage Long CoT strategies to improve Short CoT models. This results in SOTA Short CoT performance—e.g., 60.8% Pass@1 on AIME, 94.6% on MATH500, and 47.3% on LiveCodeBench—outperforming GPT-4o and Claude Sonnet 3.5 by significant margins.

Three Key Takeaways

1.  First-of-its-kind Multimodal SOTA Model: Kimi k1.5 pushes the boundaries of reinforcement learning with LLMs.

2.  Simplicity Wins: It achieves superior performance without complex methods like Monte Carlo tree search or value functions.

3.  Long2Short Innovation: The use of Long CoT techniques to optimize Short CoT models sets new efficiency benchmarks.

The full technical report is available on GitHub.

Jim Fan, a Senior Research Scientist at NVIDIA, commented on Kimi k1.5 on X:

“Kimi shows strong multimodal performance (!) on benchmarks like MathVista, which requires visual understanding of geometry, IQ tests, etc.

“Kimi paper has a LOT more details on the system design: RL infrastructure, hybrid cluster, code sandbox, parallelism strategies; and learning details: long context, CoT compression, curriculum, sampling strategy, test case generation, etc.”

Kimi.ai was founded by CMU PhD Zhilin Yang, co-first author of Transformer-XL, who has collaborated with luminaries such as Yann LeCun (GLoMo) and Yoshua Bengio (HotpotQA). The core team includes inventors of foundational deep-learning techniques such as RoPE, Group Normalization, ShuffleNet, and Relation Network.

Kimi.ai has achieved remarkable growth, reaching 36 million MAUs within its first year. As of December 2024, it ranks among the top five AI chatbot platforms globally, trailing only ChatGPT, Google Gemini, Claude, and Microsoft Copilot (source: SimilarWeb).

Kimi’s journey represents a significant step forward in AI development, inspiring a more collaborative and innovative future for the field.
