Google's Gemma 4 (31B parameters) beats Meta's Llama 4 (400B+ parameters) on math, coding, and agentic tasks. That's a 31B model outscoring something 13x its size. DeepSeek V4 still hasn't shipped despite months of hype. For most developers, Gemma 4 is the open-source model to use right now: smallest, fastest, Apache 2.0 licensed, runs on a single GPU. Llama 4 is better if you need multimodal (images + video) and have the hardware.
- Gemma 4 31B scores 89.2% on AIME math, 80% on LiveCodeBench, 86.4% on agentic tasks (Google DeepMind)
- Llama 4 Maverick has 128 experts, 17B active parameters, 400B+ total. Released April 5 (Meta)
- Gemma 4 beats Llama 4 on math (89.2% vs 88.3%), coding (80% vs 77.1%), and agentic tasks (86.4% vs 85.5%)
- DeepSeek V4 has NOT been officially released as of April 2026 despite multiple rumored dates
- Gemma 4 is Apache 2.0 licensed - fully free for commercial use with no restrictions
- Llama 4 Scout has a 10M token context window - largest of any open model
- Gemma models have been downloaded over 400 million times since the first generation (Google)
- GLM-5 and Kimi K2.5 lead open-source coding benchmarks but require significantly more hardware
A 31-billion parameter model is beating a 400-billion parameter model on math, coding, and agentic benchmarks. That sentence shouldn't make sense, but here we are in April 2026.
Google dropped Gemma 4 on April 2. Meta released Llama 4 on April 5. DeepSeek V4 is still "coming soon" after months of anticipation. Open-source AI reshuffled in one week. Here's where things stand.
The Ranking
Sorted by what actually matters for developers
Open-source AI models ranked - April 2026
| Rank | Model | Params (active) | Best At | License |
|---|---|---|---|---|
| 1 | Gemma 4 31B Dense | 31B | Best efficiency. Beats 400B rivals on math/coding. | Apache 2.0 |
| 2 | Llama 4 Maverick | 17B (of 400B+) | Best multimodal. Text + image + video. | Meta (restricted) |
| 3 | GLM-5 | 40B (of 744B) | Best coding. #1 on LiveBench Agentic Coding. | Open weight |
| 4 | Kimi K2.5 | 32B (of 1T) | Best agent swarm. 100 parallel sub-agents. | MIT |
| 5 | Llama 4 Scout | 17B (of 109B) | Largest context. 10M token window. | Meta (restricted) |
| 6 | Gemma 4 26B MoE | ~8B active | Good middle ground. Lighter than 31B. | Apache 2.0 |
| 7 | DeepSeek V3.2 | 37B (of 671B) | Cheapest API. $0.14/M input tokens. | MIT-ish |
| 8 | Gemma 4 E4B | ~4B | Best edge model. Runs on phones. | Apache 2.0 |
Gemma 4: The Efficiency King
31 billion parameters doing what 400 billion can't
Google released four Gemma 4 variants on April 2: E2B, E4B, 26B MoE, and 31B Dense. The 31B Dense is the one getting all the attention because of numbers like these:
Gemma 4 31B vs the field
| Benchmark | Gemma 4 31B | Llama 4 Maverick | Llama 4 Scout |
|---|---|---|---|
| AIME 2026 (math) | 89.2% | 88.3% | - |
| GPQA Diamond (science) | 84.3% | 82.3% | - |
| LiveCodeBench v6 (coding) | 80.0% | 77.1% | - |
| Agentic retail benchmark | 86.4% | 85.5% | - |
| Parameters | 31B | 17B active / 400B+ total | 17B active / 109B total |
A 31B model shouldn't beat something with 400B total parameters. But Gemma 4 was built from the ground up for intelligence-per-parameter. Google's approach was "make fewer parameters do more work" instead of "throw more parameters at the problem." It worked.
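The intelligence-per-parameter point can be made concrete with a toy calculation over the numbers in the table above. This is an illustrative sketch only: raw benchmark score divided by parameter count is not a rigorous efficiency metric, but it shows the gap.

```python
# Toy "score per billion total parameters" comparison using the
# LiveCodeBench numbers quoted in this article. Illustrative only.
models = {
    "Gemma 4 31B":      {"params_b": 31,  "livecodebench": 80.0},
    "Llama 4 Maverick": {"params_b": 400, "livecodebench": 77.1},
}

for name, m in models.items():
    efficiency = m["livecodebench"] / m["params_b"]
    print(f"{name}: {efficiency:.2f} points per B params")
```

By this crude measure Gemma 4 delivers roughly 13x the benchmark points per parameter, which is exactly the size gap the headline comparison highlights.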
The licensing is the other big deal. Gemma 4 ships under Apache 2.0, which means completely unrestricted commercial use, modification, and redistribution. Previous Gemma versions had custom licenses with more restrictions. This is Google saying "take it and build whatever you want."
The edge models (E2B and E4B) run on consumer hardware including phones. They handle video, images, and audio natively. The E4B with a 128K context window is probably the most capable AI you can run on a laptop right now.
Why Gemma 4 matters
Smallest top-tier model. Apache 2.0 (no restrictions). Runs on a single GPU. Beats models 13x its size. If you're self-hosting AI, this is where to start.
Llama 4: The Multimodal Giant
First open models built natively for text + images + video
Meta released Llama 4 Scout and Maverick on April 5. Both use mixture-of-experts architecture and both process text, images, and video natively. That's not an add-on or a fine-tune. These models were trained from scratch as multimodal systems.
Scout has 17B active parameters across 16 experts (109B total) and a 10M token context window. That context window is absurd: a full-length novel is on the order of 100K tokens, so you could feed Scout dozens of books and still have room for questions.
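A back-of-the-envelope estimate gives a feel for the scale of 10M tokens. The conversion factors below are rules of thumb for English text (roughly four characters per token, roughly 90,000 words per novel), not exact tokenizer output.

```python
# Rough sense of scale for a 10M-token context window.
# CHARS_PER_TOKEN and WORDS_PER_NOVEL are rule-of-thumb assumptions.
CONTEXT_TOKENS = 10_000_000
CHARS_PER_TOKEN = 4
CHARS_PER_WORD = 6   # including trailing space
WORDS_PER_NOVEL = 90_000

chars = CONTEXT_TOKENS * CHARS_PER_TOKEN   # ~40M characters
words = chars / CHARS_PER_WORD             # ~6.7M words
novels = words / WORDS_PER_NOVEL
print(f"~{words/1e6:.1f}M words, roughly {novels:.0f} novels")
```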
Maverick has 17B active parameters across 128 experts (400B+ total). It's the more capable model and Meta claims it beats GPT-4o and Gemini 2.0 Flash on multiple benchmarks.
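The mixture-of-experts figures mean only a fraction of the weights fire on any given token. A quick sketch of the arithmetic, using the numbers quoted for Scout and Maverick:

```python
# Per-token compute in an MoE model scales with ACTIVE parameters,
# while memory (VRAM) scales with TOTAL parameters. Figures below
# are the ones quoted in this article.
moe_models = {
    "Llama 4 Scout":    {"active_b": 17, "total_b": 109, "experts": 16},
    "Llama 4 Maverick": {"active_b": 17, "total_b": 400, "experts": 128},
}

for name, m in moe_models.items():
    frac = m["active_b"] / m["total_b"]
    print(f"{name}: {frac:.0%} of weights active per token "
          f"({m['experts']} experts)")
```

This is why Maverick's per-token inference cost is closer to a 17B dense model's, even though it needs 400B+ parameters' worth of memory to hold the weights.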
The gap between Llama 4 and Gemma 4 is narrow on text benchmarks. Where Llama 4 pulls ahead: multimodal tasks. If you need a model that understands images and video alongside text, Llama 4 is the best open option. If you only need text, Gemma 4 does it better with fewer resources.
The catch: Meta's license isn't truly open source. There are restrictions on large-scale commercial deployment. For most developers and startups this doesn't matter. For large enterprises deploying at scale, check the license terms carefully. Gemma 4's Apache 2.0 has no such restrictions.
DeepSeek V4: Still Waiting
The most hyped model that doesn't exist yet
DeepSeek V4 was supposed to launch in March. Then people said April. Now the expectation is Q2 or Q3 2026. The benchmarks that have been floating around (80%+ on SWE-bench, 1M context, native multimodal) are from leaked internal data and remain unverified.
DeepSeek V3.2 is what actually exists today. It's a solid model at an unbeatable API price ($0.14/M input tokens). Run the numbers in our AI cost calculator to see how it compares. For production workloads where cost matters more than bleeding-edge performance, V3.2 is hard to argue against.
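The $0.14 per million input tokens figure is easy to put in context with a quick estimate. The helper below is a hypothetical sketch covering input tokens only; output-token pricing is separate and not included.

```python
# Sketch: monthly input-token cost at DeepSeek V3.2's quoted rate
# of $0.14 per million input tokens. Output pricing is not included.
PRICE_PER_M_INPUT = 0.14  # USD per 1M input tokens

def monthly_input_cost(requests_per_day: int, tokens_per_request: int) -> float:
    """Input-token cost for 30 days of traffic, in USD."""
    monthly_tokens = requests_per_day * tokens_per_request * 30
    return monthly_tokens / 1_000_000 * PRICE_PER_M_INPUT

# e.g. 10,000 requests/day at 2,000 input tokens each
print(f"${monthly_input_cost(10_000, 2_000):.2f}/month")
```

At that volume, 600M input tokens a month comes to about $84, which is the kind of number that makes V3.2 hard to argue against for cost-sensitive production workloads.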
But V4? Don't plan around it. We've seen months of "coming soon" and nothing to download. When it ships, we'll rank it. Until then, it's vaporware.
Others Worth Knowing
Two models the rankings miss
GLM-5
Zhipu AI's GLM-5 has 744B total parameters with 40B active. It ranks #1 among open models on LiveBench Agentic Coding. If you build AI agents and have the hardware, this is the coding model to watch.
Kimi K2.5
Moonshot AI's Kimi K2.5 has 1T total parameters with 32B active. The standout feature is Agent Swarm: it can spin up 100 parallel sub-agents for complex tasks. Scores 77.86 on LiveBench Coding. If you need a model that orchestrates multi-step workflows on its own, K2.5 does things other models can't.
Both require serious hardware. Neither runs on a laptop. They're enterprise-grade tools for teams with GPU clusters.
Can You Actually Run These?
Hardware reality check
Hardware requirements (approximate)
| Model | VRAM Needed | Runs On | Quantized? |
|---|---|---|---|
| Gemma 4 E2B | ~2GB | Phone, Raspberry Pi | Yes, designed for edge |
| Gemma 4 E4B | ~4GB | Laptop, any GPU | Yes |
| Gemma 4 26B MoE | ~12-16GB | RTX 3090/4090, A100 | Yes (GGUF) |
| Gemma 4 31B Dense | ~16-24GB | RTX 4090, A100 | Yes (GGUF) |
| Llama 4 Scout (109B) | ~40-60GB | Multi-GPU or cloud | Possible with loss |
| Llama 4 Maverick (400B+) | ~200GB+ | GPU cluster only | Significant quality loss |
| GLM-5 (744B) | ~300GB+ | Data center | Experimental |
| Kimi K2.5 (1T) | ~400GB+ | Data center | Experimental |
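The VRAM numbers in the table follow from a simple rule of thumb: bytes per parameter times parameter count, plus overhead for the KV cache and activations. Here is a rough estimator; the 1.2x overhead factor is an assumption, and real usage varies with context length and quantization scheme.

```python
# Back-of-the-envelope VRAM estimate for model weights.
# bits_per_param: 16 for fp16/bf16, 8 for int8, ~4 for Q4 GGUF quants.
# The 1.2 overhead factor (KV cache, activations) is a rough assumption.
def estimate_vram_gb(params_b: float, bits_per_param: float = 4,
                     overhead: float = 1.2) -> float:
    weight_bytes = params_b * 1e9 * bits_per_param / 8
    return weight_bytes * overhead / 1e9

# Gemma 4 31B at 4-bit quantization: fits on a 24GB card
print(f"{estimate_vram_gb(31, bits_per_param=4):.1f} GB")
# Llama 4 Maverick (400B total) needs cluster-scale memory even at 4-bit
print(f"{estimate_vram_gb(400, bits_per_param=4):.1f} GB")
```

Note that for MoE models the estimate must use total parameters, not active ones: all experts have to sit in memory even though only a few run per token.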
The takeaway: Gemma 4 is the only top-ranked model you can run on a single consumer GPU. Everything else requires enterprise hardware or cloud rental. This is why the 31B model beating 400B rivals matters so much. You don't need a data center to run the best model.
The practical test
If you have an RTX 4090 or equivalent, you can run Gemma 4 31B locally with GGUF quantization. If you have a MacBook with 16GB+ RAM, the E4B model works well with llama.cpp. Everything above Gemma 4 needs cloud GPUs.
Which One Should You Use?
Depends on what you have and what you need
Quick decision
1. Single GPU, need the best text model? Gemma 4 31B Dense. Apache 2.0, runs on one card.
2. Need multimodal (images + video + text)? Llama 4 Maverick if you have the hardware. Gemma 4 E4B if you don't.
3. Need the biggest context window? Llama 4 Scout with 10M tokens. Nothing else comes close.
4. Building AI agents? GLM-5 for agentic coding, Kimi K2.5 for multi-agent swarms.
5. Cheapest API for production? DeepSeek V3.2 at $0.14/M input tokens. Not V4 - that doesn't exist yet.
6. Running on a phone or edge device? Gemma 4 E2B. Designed for it.
7. Need unrestricted commercial license? Gemma 4 (Apache 2.0) or Kimi K2.5 (MIT).
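The quick-decision list condenses into a small lookup. This is a hypothetical helper encoding this article's recommendations, not a real library:

```python
# Hypothetical helper mapping a need to this article's recommended model.
RECOMMENDATIONS = {
    "best_text_single_gpu": "Gemma 4 31B Dense",
    "multimodal":           "Llama 4 Maverick",
    "largest_context":      "Llama 4 Scout",
    "agentic_coding":       "GLM-5",
    "agent_swarms":         "Kimi K2.5",
    "cheapest_api":         "DeepSeek V3.2",
    "edge_device":          "Gemma 4 E2B",
}

def pick_model(need: str) -> str:
    # Gemma 4 31B is the article's default answer for most developers.
    return RECOMMENDATIONS.get(need, "Gemma 4 31B Dense")

print(pick_model("cheapest_api"))  # DeepSeek V3.2
print(pick_model("edge_device"))   # Gemma 4 E2B
```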
Open-source AI used to be "Llama or nothing." Not anymore. Gemma 4 proved smaller models can beat bigger ones with better architecture, and Llama 4 brought real multimodal capability to open weights. Meanwhile, GLM-5 and Kimi K2.5 signal genuine competition from Chinese labs.
For most developers, Gemma 4 31B is the answer. Best model you can run on affordable hardware, with an Apache 2.0 license that imposes zero restrictions. Start there. And if you're deciding between open-source and API-based tools, our task-by-task AI model guide covers both — or take our free AI Model Picker quiz for a personalized recommendation in 60 seconds.
FAQ
What is the best open-source AI model in April 2026?
For efficiency and self-hosting: Gemma 4 31B Dense. Beats models 13x its size on math (89.2%), coding (80%), and agentic tasks (86.4%). Apache 2.0 license. Runs on a single RTX 4090. For multimodal: Llama 4 Maverick. For coding agents: GLM-5 or Kimi K2.5.
Can I run Gemma 4 on my own hardware?
The 31B model needs about 16-24GB VRAM (RTX 4090 or A100 with quantization). The E4B edge model runs on laptops and phones. All models are Apache 2.0 licensed for unrestricted commercial use.
Has DeepSeek V4 been released?
No. As of April 9, 2026, DeepSeek V4 has not shipped. Multiple rumored dates have passed. Current expectation is Q2 or Q3 2026. DeepSeek V3.2 is the latest available model.
Is Llama 4 truly open source?
Llama 4 Scout and Maverick are open-weight downloads on Hugging Face. But Meta's license restricts large-scale commercial deployment. Gemma 4 (Apache 2.0) and Kimi K2.5 (MIT) have no such restrictions.
Keep Reading
Stay ahead of the AI curve
We test new AI tools every week and share honest results. Join our newsletter.
![Gemma 4 vs Llama 4 vs DeepSeek V4: Best Free Open-Source AI Model in 2026 [Ranked] - Featured Image](/_next/image?url=%2Fimages%2Fopen-source-ai-models-ranked-2026.png&w=3840&q=75)


