Best Open-Source Coding Model 2026: DeepSeek vs MiniMax vs Kimi

Q: What is the best open-source coding model in 2026?

There is no single best model, because DeepSeek V4, MiniMax M3, and Kimi K2.7-Code each report different coding benchmarks. Based on official sources: DeepSeek V4 Flash is the cheapest serious option with a 1M context window, MiniMax M3 reports the highest raw coding score (59% on SWE-Bench Pro, vendor-run), and Kimi K2.7-Code is purpose-built for coding agents and ships under a Modified MIT license.

Q: Which open-source coding model is cheapest?

DeepSeek V4 Flash is the cheapest on the official pricing page at $0.14 per million input tokens and $0.28 per million output tokens, with cache discounts and a 1M context window. MiniMax M3 lists roughly $0.30 input and $1.20 output per million tokens for the smaller context tier, and Kimi K2.7-Code lists $0.95 input and $4.00 output.

Q: Can I self-host DeepSeek V4, MiniMax M3, and Kimi K2.7-Code?

Yes. All three publish open weights. DeepSeek V4 weights are on Hugging Face, Kimi K2.7-Code weights are on Hugging Face under a Modified MIT license, and MiniMax M3 open weights began rolling out in mid-June 2026. Check each provider's license terms before commercial deployment.

TL;DR

There is no single best open-source coding model in 2026, and anyone who hands you one number is hiding something. DeepSeek V4 is the cheapest serious option with a 1M context window and an official "state-of-the-art agentic coding" claim. MiniMax M3 reports the highest raw coding score (59% on SWE-Bench Pro), but that number is vendor-run and its weights only just shipped. Kimi K2.7-Code is purpose-built for coding agents and is the cleanest to self-host, under a Modified MIT license. The catch: all three report different benchmarks, so you cannot rank them on one score. This guide uses official sources only.

Open-Source Coding Models - June 2026

Updated June 15, 2026

DeepSeek V4 (April 24, 2026) ships as V4 Pro and V4 Flash, with a 1M context window, open weights on Hugging Face, and dual Thinking / Non-Thinking modes.
DeepSeek's official release calls V4 Pro open-source state-of-the-art in agentic coding, but publishes charts rather than a single reproducible score.
MiniMax M3 (June 1, 2026) is the first open-weight model to combine frontier coding, up to 1M context, and native multimodality; MiniMax claims 59.0% on SWE-Bench Pro.
MiniMax M3's open weights began rolling out in mid-June 2026, and its benchmarks are run on MiniMax's own infrastructure.
Kimi K2.7-Code (June 12, 2026) is a 1-trillion-parameter / 32B-active coding model built on Kimi K2.6, with a 256K context window and a Modified MIT license.
Kimi reports K2.7-Code scores 62.0 on its own Kimi Code Bench v2 and cuts thinking-token use about 30% versus K2.6.
The only benchmark two of the three share is MCP-Atlas (MiniMax 74.2, Kimi 76.0); DeepSeek does not report it.

Everyone wants one answer: the best open-source coding model. The honest answer is that the three models worth your time in mid-2026 cannot be lined up on a single chart, because each vendor reports a different benchmark.

This guide uses official sources only. I checked DeepSeek's release notes, MiniMax's announcement, and Moonshot's Kimi K2.7-Code model card directly. I did not use leaked benchmark slides, third-party leaderboards, or "I tested it" threads. Where a number is the vendor's own claim, I say so.

Context

DeepSeek V4 and MiniMax M3

$0.14

Cheapest input / 1M

DeepSeek V4 Flash

59%

Top coding claim

MiniMax SWE-Bench Pro

MIT

Cleanest license

Kimi K2.7-Code (modified)

For the full open-model ranking across every use case, see our best open-source AI models guide. This article is only about coding.

Three Models, Three Different Benchmarks

This is why "best coding model" is the wrong question

Here is the problem in one paragraph. DeepSeek says V4 Pro is open-source state-of-the-art in agentic coding, but its release note shows charts, not a single number you can reproduce. MiniMax says M3 scores 59.0% on SWE-Bench Pro. Kimi says K2.7-Code scores 62.0 on Kimi Code Bench v2, a benchmark Moonshot runs itself.

Those are three different tests. A SWE-Bench Pro percentage and a Kimi Code Bench v2 percentage are not the same measurement, so "MiniMax 59 vs Kimi 62" means nothing. The only benchmark any two of them share is MCP-Atlas, a tool-orchestration test (MiniMax 74.2, Kimi 76.0), and that measures agentic tool use, not raw code quality.

What this means for you

Choose by fit, not by a single leaderboard number. The right question is "which one matches my context window, budget, license, and workflow," not "which one has the bigger percentage." The rest of this guide answers the first question.

Head-to-Head: The Official Numbers

Everything each vendor states, side by side

DeepSeek V4 vs MiniMax M3 vs Kimi K2.7-Code

	DeepSeek V4	MiniMax M3	Kimi K2.7-Code
Released	April 24, 2026	June 1, 2026	June 12, 2026
Open weights	Yes (Hugging Face)	Yes (rolled out mid-June)	Yes (Hugging Face)
License	Open-weight (check terms)	Open-weight (check terms)	Modified MIT
Context window	1M	Up to 1M	256K
Architecture	V4 Pro 1.6T / 49B active; V4 Flash 284B / 13B active (MoE)	Sparse MoE with MSA (params not disclosed)	1T total / 32B active (MoE), built on K2.6
Multimodal	Text, with Thinking / Non-Thinking modes	Native image and video input	Vision via MoonViT encoder
Coding benchmark the vendor reports	Claims SOTA agentic coding (no single number)	SWE-Bench Pro 59.0% (vendor-run)	Kimi Code Bench v2 62.0 (own benchmark)
API price (input / output per 1M)	Flash $0.14 / $0.28; Pro $0.435 / $0.87	~$0.30 / $1.20 (smaller-context tier)	$0.95 / $4.00

1. DeepSeek V4: Cheapest Long-Context Coding

The value pick, and the one with the most momentum

DeepSeek V4 Preview went live on April 24, 2026. DeepSeek's official release note lists two models, both with a 1M context window and both available as open weights on Hugging Face.

DeepSeek V4 Official Model Split

Model	Parameters	Positioning
DeepSeek V4 Pro	1.6T total / 49B active	Flagship V4 model for reasoning, world knowledge, and agentic coding
DeepSeek V4 Flash	284B total / 13B active	Fast and economical V4 model

Source: DeepSeek V4 Preview Release

On coding, DeepSeek's release note is confident but vague. It says V4 Pro shows "open-source state-of-the-art in agentic coding benchmarks" and "beats all current open models in Math, STEM, and Coding, rivaling top closed-source models." It backs this with benchmark charts rather than a single reproducible figure, so treat it as a strong claim, not a measured number.

Two things make it the value pick. Price first: V4 Flash is the cheapest serious coding model here, at $0.14 input and $0.28 output per million tokens. Then the ecosystem: DeepSeek says V4 plugs straight into Claude Code, OpenClaw, and OpenCode, so it drops into existing agent setups. For a deeper breakdown, see our DeepSeek V4 guide and the DeepSeek V4 vs Qwen comparison.

Best for

High-volume coding at the lowest cost, whole-repo work that needs a 1M context window, and teams that want open weights plus a published API in one model.

2. MiniMax M3: Highest Coding Claim, Newest Weights

Promising numbers you should verify yourself

MiniMax announced M3 on June 1, 2026, and calls it the first open-weight model to combine frontier coding, up to 1M-token context, and native multimodality in one model. It uses a new sparse attention architecture MiniMax calls MSA, which the company says cuts per-token compute at 1M context to roughly one-twentieth of its previous generation.

Not sure which AI model to use?

12 models · Personalized picks · 60 seconds

Take the Quiz

MiniMax M3 Official Claims

Item	What MiniMax states
Coding (SWE-Bench Pro)	59.0% (surpasses GPT-5.5 and Gemini 3.1 Pro on this test, per MiniMax)
Terminal-Bench 2.1	66.0%
MCP-Atlas (tool use)	74.2%
Context	Up to 1M tokens via MiniMax Sparse Attention (MSA)
Multimodal	Native image and video input; can operate a desktop computer
Weights	Open weights released over roughly 10 days from the June 1 announcement

Source: MiniMax M3 official announcement

Read the asterisk

MiniMax M3's benchmark numbers are the company's own, run on MiniMax infrastructure, and are not independently verified at the time of writing. The open weights only began shipping in mid-June, so real-world, self-hosted reports are still thin. The 59% SWE-Bench Pro figure is strong enough to test, not strong enough to crown.

If the SWE-Bench Pro number holds up under independent testing, M3 could be the most interesting open-weight coding model of the year, and the native multimodality is a genuine edge if your workflow feeds screenshots or video into the model. For now, put it on a branch and benchmark it on your own repository before you trust it in production.

3. Kimi K2.7-Code: Built for Coding Agents

The cleanest license and the strongest tool-use score

Moonshot released Kimi K2.7-Code on June 12, 2026, as a coding-focused model built on Kimi K2.6. Its official Hugging Face model card lists a 1-trillion-parameter Mixture-of-Experts design with 32B active parameters, a 256K context window, and a 400M-parameter MoonViT vision encoder, released under a Modified MIT license.

One flag before you commit to this tier: Moonshot has since announced Kimi K3, a 2.8T-parameter flagship with weights promised by July 27, 2026. Our Kimi K3 review covers the verified benchmarks and what a weights release at that scale would mean for this ranking.

Kimi K2.7-Code Official Details

Item	Official detail
Built on	Kimi K2.6, tuned for real-world long-horizon coding
Architecture	1T total / 32B active Mixture-of-Experts
Context	256K tokens
License	Modified MIT, with open weights on Hugging Face
Coding (own benchmark)	Kimi Code Bench v2 62.0, up from 50.9 on K2.6
MCP-Atlas (tool use)	76.0, the highest tool-use score here
Efficiency	Moonshot says it cuts thinking-token usage about 30% versus K2.6

Source: Moonshot Kimi K2.7-Code model card (Hugging Face)

K2.7-Code earns its spot for agent builders for two reasons. Its MCP-Atlas score of 76.0 is the highest tool-use number here, which matters when the model has to call tools, read a repo, and edit files in a loop. And the Modified MIT license is the cleanest of the three for commercial self-hosting. The API is OpenAI- and Anthropic-compatible, so it slots into existing tooling. For background on Moonshot's agent approach, see our Kimi agent swarm guide.

Want the raw numbers side by side?

Our live AI benchmark leaderboard tracks coding and agent scores with the official source behind every cell.

Open the benchmark leaderboard

Official sources only, archived versions kept for history.

Which One Should You Use?

Pick by goal, not by a single percentage

Quick Decision

1Want the lowest API cost for high-volume coding? Start with DeepSeek V4 Flash at $0.14 / $0.28 per 1M tokens.
2Need the biggest context for whole-repo work? DeepSeek V4 and MiniMax M3 both list up to 1M tokens.
3Chasing the highest reported coding score? Test MiniMax M3, but treat its 59% SWE-Bench Pro as an unverified vendor claim.
4Building coding agents with heavy tool use? Kimi K2.7-Code is purpose-built for it and scores highest on MCP-Atlas (76.0).
5Need image or video input in your coding workflow? MiniMax M3 is natively multimodal; Kimi adds a vision encoder.
6Want the cleanest license to self-host commercially? Kimi K2.7-Code ships under a Modified MIT license.

Best Pick by Goal

Your goal	Best pick	Why
Cheapest coding API	DeepSeek V4 Flash	$0.14 / $0.28 per 1M tokens with a 1M context window
Whole-repo / long context	DeepSeek V4 or MiniMax M3	Both list up to 1M tokens in official docs
Highest reported coding score	MiniMax M3	59.0% SWE-Bench Pro, vendor-run and unverified
Coding agents and tool use	Kimi K2.7-Code	Built for agentic coding; 76.0 on MCP-Atlas
Cleanest self-host license	Kimi K2.7-Code	Modified MIT with open weights on Hugging Face
Multimodal coding input	MiniMax M3	Native image and video input

My practical pick

If I had to start one project today: DeepSeek V4 Flash for cheap, long-context coding; Kimi K2.7-Code when the job is an agent that uses tools and edits a repo; and MiniMax M3 on a test branch until independent benchmarks confirm the SWE-Bench Pro number. Do not pick a "winner" off a single leaderboard screenshot.

If you also want closed models in the mix, run your shortlist through the AI Model Picker, or compare prices in our AI coding tools pricing guide.

Read This Before You Trust Any Number

Four honest caveats

What the official data does not prove

All three benchmark sets are vendor-reported, and none are independently reproduced here. MiniMax M3's weights only began shipping in mid-June, so self-hosting reports are still early. The three models report different benchmarks, so an "X beats Y by N percent" claim is not possible from official data. And licenses differ: only Kimi states a Modified MIT license outright, so confirm DeepSeek's and MiniMax's terms before commercial use.

The practical takeaway is simple. Treat every number above as a starting hypothesis, then benchmark the two or three finalists on your own codebase. Your repository is the only leaderboard that matters.

Official Sources Used

FAQ

What is the best open-source coding model in 2026?

There is no single winner, because the three leading open-weight coding models report different benchmarks. Based on official sources: DeepSeek V4 Flash is the cheapest with a 1M context window, MiniMax M3 reports the highest raw coding score (59% on SWE-Bench Pro, vendor-run), and Kimi K2.7-Code is purpose-built for coding agents and ships under a Modified MIT license.

Is MiniMax M3 better than DeepSeek V4 for coding?

You cannot prove it from official data. MiniMax reports 59% on SWE-Bench Pro, but that test ran on MiniMax's own infrastructure and is not independently verified. DeepSeek calls V4 Pro open-source state-of-the-art in agentic coding but publishes no single comparable number. They report different benchmarks, so a clean head-to-head is not possible yet.

Which open-source coding model is cheapest?

DeepSeek V4 Flash, at $0.14 per million input tokens and $0.28 per million output tokens on DeepSeek's official pricing page, with cache discounts and a 1M context window. MiniMax M3 lists roughly $0.30 input and $1.20 output for the smaller-context tier, and Kimi K2.7-Code lists $0.95 input and $4.00 output.

Can I self-host DeepSeek V4, MiniMax M3, and Kimi K2.7-Code?

Yes. All three publish open weights. DeepSeek V4 and Kimi K2.7-Code weights are on Hugging Face (Kimi under a Modified MIT license), and MiniMax M3 weights began rolling out in mid-June 2026. Check each provider's license before commercial deployment.

Which open coding model has the biggest context window?

DeepSeek V4 and MiniMax M3 both list up to a 1 million token context window in their official docs. Kimi K2.7-Code lists a 256K token context window on its model card.

Want official-source AI breakdowns like this?

Join the newsletter for plain-English model comparisons, pricing checks, and no-hype analysis built on official sources only.

Written by

Paras Tiwari

Founder, Spectrum AI Labs

Founder of Spectrum AI Labs — testing AI tools and models, and writing up what actually ships.

More about Paras →

Best Open-Source Coding Model in 2026: DeepSeek V4 vs MiniMax M3 vs Kimi K2.7-Code