Short version: DeepSeek V4 is no longer a rumor. DeepSeek's official docs say DeepSeek-V4 Preview went live on April 24, 2026 with two models: V4 Pro and V4 Flash. Both support 1M context, both are available through the API, and DeepSeek links open weights from the release note. Pricing is currently unusual because V4 Pro has a temporary discount running through May 31, 2026.
- DeepSeek-V4 Preview went live on April 24, 2026 according to DeepSeek API Docs
- The two API models are deepseek-v4-pro and deepseek-v4-flash
- DeepSeek-V4-Pro is listed as 1.6T total parameters and 49B active parameters
- DeepSeek-V4-Flash is listed as 284B total parameters and 13B active parameters
- Both V4 models support 1M context and thinking / non-thinking modes
- DeepSeek lists OpenAI-format and Anthropic-format API endpoints
- DeepSeek links a technical report and open weights from the official release note
- DeepSeek says deepseek-chat and deepseek-reasoner will be retired after July 24, 2026 at 15:59 UTC
This article used to be a pre-release tracker. That is no longer the right framing.
DeepSeek has now published an official V4 Preview release note, an API pricing page, model names, context limits, endpoint details, a tech report link, and open weights. So the useful question is not "when will V4 launch?" It is: what exactly did DeepSeek ship, what does it cost, and what should developers change?
This update uses only DeepSeek-owned sources and official DeepSeek-linked docs. No anonymous reports. No third-party benchmark roundups. No guessing.
Source note
The main sources for this guide are DeepSeek's official V4 Preview Release and DeepSeek's official Models & Pricing page. Pricing can change, and DeepSeek explicitly recommends checking the pricing page regularly.
Official Status: DeepSeek V4 Preview Is Live
The old February release-date discussion is obsolete.
DeepSeek's official release note says DeepSeek-V4 Preview is live as of April 24, 2026. It also says the model is open-sourced, with a technical report and open weights linked from the same official post.
What Changed Since the Old Article
| Old article framing | Current official status |
|---|---|
| Expected mid-February 2026 release | Official V4 Preview published April 24, 2026 |
| Rumored 1M context | 1M context listed in DeepSeek's release note and pricing page |
| Expected pricing | Pricing is published in DeepSeek API Docs |
| Expected open-source release | DeepSeek says V4 Preview is open-sourced and links open weights |
| Speculated API details | DeepSeek lists model names, base URLs, supported modes, and features |
That means any old text about "expected release date," "rumored 1M context," or "unconfirmed V4 availability" should be treated as outdated.
Official sources:
- DeepSeek-V4 Preview Release
- DeepSeek Models & Pricing
- DeepSeek V4 open weights collection linked by DeepSeek
- DeepSeek V4 technical report linked by DeepSeek
DeepSeek V4 Pro vs DeepSeek V4 Flash
Same generation, different jobs.
DeepSeek split V4 into two models. V4 Pro is the larger model. V4 Flash is the cheaper, faster option.
DeepSeek V4 Model Details
| Model | Official description | Parameters | Best fit |
|---|---|---|---|
| DeepSeek-V4-Pro | Flagship model for reasoning, world knowledge, and agentic coding | 1.6T total / 49B active | Hard coding, research, long-context work |
| DeepSeek-V4-Flash | Smaller, faster, more economical V4 model | 284B total / 13B active | High-volume use, simple agent tasks, cost-sensitive routing |
Source: DeepSeek V4 Preview Release
DeepSeek says V4 Flash's reasoning capabilities closely approach V4 Pro and that it performs on par with V4 Pro on simple agent tasks. I would still test both on your own workload. "Simple agent task" is a broad phrase, and DeepSeek does not define it tightly in the release note.
API Access and Migration
Keep the base URL, change the model name.
DeepSeek says the API is available now. The migration note is straightforward: keep the same base URL and update the model to either deepseek-v4-pro or deepseek-v4-flash.
DeepSeek V4 API Details
| Item | Official value |
|---|---|
| OpenAI-format base URL | https://api.deepseek.com |
| Anthropic-format base URL | https://api.deepseek.com/anthropic |
| Model names | deepseek-v4-flash, deepseek-v4-pro |
| Thinking modes | Thinking and non-thinking modes supported |
| Context length | 1M |
| Maximum output | 384K |
| JSON output | Supported |
| Tool calls | Supported |
| FIM completion | Non-thinking mode only |
Source: DeepSeek Models & Pricing
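To make the migration concrete, here is a minimal sketch of an OpenAI-format request body using the base URL and model names from the table above. The `chat_payload` helper and its structure are my own illustration, not official DeepSeek sample code; any OpenAI-compatible client or a plain HTTP POST with your API key would consume this the same way.

```python
import json

# Base URLs as listed on DeepSeek's pricing page.
OPENAI_FORMAT_BASE = "https://api.deepseek.com"
ANTHROPIC_FORMAT_BASE = "https://api.deepseek.com/anthropic"

def chat_payload(prompt: str, model: str = "deepseek-v4-flash") -> str:
    """Build an OpenAI-format chat body; migrating only changes `model`."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })

# Swapping to the larger model is a one-argument change:
pro_body = chat_payload("Summarize this repo.", model="deepseek-v4-pro")
```

The point of the sketch is the migration note itself: the base URL and payload shape stay put, and only the model string moves.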
Old DeepSeek model names are on the way out
DeepSeek says deepseek-chat and deepseek-reasoner are currently routed to DeepSeek V4 Flash non-thinking and thinking modes, respectively. The official V4 release note says those names will be retired after July 24, 2026 at 15:59 UTC.
If your app still calls deepseek-chat or deepseek-reasoner, do not wait until the deadline. Move to the V4 model names, then test output length, tool calls, JSON behavior, and latency before production traffic depends on it.
DeepSeek V4 Pricing
The discount matters, so date-stamp your numbers.
DeepSeek's pricing page lists prices per 1M tokens. As of May 1, 2026, V4 Flash is at normal listed pricing, while V4 Pro is shown with a temporary 75% discount. DeepSeek says that V4 Pro discount is extended until May 31, 2026 at 15:59 UTC.
DeepSeek V4 API Pricing as of May 1, 2026
| Model | Cache hit input | Cache miss input | Output |
|---|---|---|---|
| DeepSeek V4 Flash | $0.0028 | $0.14 | $0.28 |
| DeepSeek V4 Pro | $0.003625 (was $0.0145) | $0.435 (was $1.74) | $0.87 (was $3.48) |
Source: DeepSeek Models & Pricing
The cache-hit number is easy to miss. DeepSeek says input cache-hit pricing was reduced to one-tenth of the launch price from April 26, 2026 at 12:15 UTC. If your workload repeats long system prompts or shared context, this can change the economics a lot.
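Here is what that looks like in numbers, using the V4 Flash prices from the table above. The token counts are invented for illustration, and the prices will drift, so treat this as arithmetic on a snapshot rather than a cost model.

```python
# V4 Flash per-1M-token prices from DeepSeek's pricing page (as of May 1, 2026).
HIT, MISS, OUT = 0.0028, 0.14, 0.28

def call_cost(cached_in: int, fresh_in: int, out: int) -> float:
    """USD cost of one call, splitting input into cache-hit and cache-miss tokens."""
    return (cached_in * HIT + fresh_in * MISS + out * OUT) / 1_000_000

# A call reusing an 18K-token system prompt vs. paying cache-miss for all input:
with_cache = call_cost(cached_in=18_000, fresh_in=2_000, out=1_000)
no_cache = call_cost(cached_in=0, fresh_in=20_000, out=1_000)
```

With these example numbers the cached call costs roughly a fifth of the uncached one, which is why prompt structure, not just model choice, drives the bill on repeated long-context workloads.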
Use current pricing for buying decisions
The V4 Pro prices above include a temporary discount. If you are reading this after May 31, 2026, check DeepSeek's pricing page before making a cost comparison.
Benchmarks and Performance Claims
What DeepSeek says, and what we are not claiming.
DeepSeek's release note makes strong performance claims, but the text of the official page does not include full benchmark tables, so this article is intentionally careful.
Official Performance Claims in the Release Note
| Area | DeepSeek's official claim |
|---|---|
| Agentic coding | V4 Pro is described as open-source SOTA in agentic coding benchmarks |
| World knowledge | V4 Pro is described as leading current open models and trailing only Gemini 3.1 Pro |
| Reasoning | V4 Pro is described as beating current open models in Math, STEM, and coding |
| V4 Flash | DeepSeek says V4 Flash reasoning closely approaches V4 Pro |
| Simple agent tasks | DeepSeek says V4 Flash performs on par with V4 Pro on simple agent tasks |
Source: DeepSeek V4 Preview Release
Those are official DeepSeek claims. They are useful, but they are still provider claims. I am not adding third-party benchmark numbers here because you asked for official sources only.
Architecture Notes
What DeepSeek explicitly names in the V4 release.
The old article spent a lot of time on Engram and mHC. Those may be interesting DeepSeek research threads, but the official V4 Preview Release names different items directly.
Architecture and Capability Notes from DeepSeek
| Item | What DeepSeek says |
|---|---|
| DeepSeek Sparse Attention | V4 uses token-wise compression plus DSA |
| Long context | 1M context is listed as the default across official DeepSeek services |
| Agent integrations | DeepSeek says V4 is integrated with agents including Claude Code, OpenClaw, and OpenCode |
| API compatibility | OpenAI ChatCompletions and Anthropic APIs are supported |
| Modes | Both V4 models support thinking and non-thinking modes |
Source: DeepSeek V4 Preview Release
The cleanest way to describe V4 is this: DeepSeek is pushing long context, cheaper inference, and agent workflows. If you need the engineering details, read the official technical report linked from DeepSeek's release note.
Migration Checklist
What to test before switching production traffic.
DeepSeek V4 Migration Steps
1. Replace deepseek-chat or deepseek-reasoner with deepseek-v4-flash or deepseek-v4-pro
2. Keep the base URL unless you are switching between the OpenAI-format and Anthropic-format APIs
3. Decide whether each call should use thinking or non-thinking mode
4. Test JSON output, tool calls, and FIM completion if your app depends on them
5. Check output size assumptions because DeepSeek lists a 384K maximum output
6. Recalculate cost using cache-hit and cache-miss pricing separately
7. Date-stamp any V4 Pro cost comparison because the current discount is temporary
For most apps, V4 Flash is the first model to test. It is much cheaper, it is what the old compatibility model names route to, and DeepSeek positions it for economical production use. Move to V4 Pro when the task genuinely needs the larger model.
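That default-to-Flash advice reduces to a trivial router. The escalation criteria below are illustrative guesses on my part, not DeepSeek guidance; replace them with whatever your own V4 Pro vs V4 Flash testing shows actually needs the larger model.

```python
def pick_model(hard_reasoning: bool = False, agentic_coding: bool = False) -> str:
    """Default to V4 Flash; escalate to V4 Pro only for genuinely hard work.

    The two escalation flags are illustrative, not official routing advice.
    """
    if hard_reasoning or agentic_coding:
        return "deepseek-v4-pro"
    return "deepseek-v4-flash"
```

Even a stub like this keeps the cost decision in one place, so when the V4 Pro discount ends the routing policy changes in one function rather than across the codebase.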
The Bottom Line
DeepSeek V4 is real now. The careful part is pricing and migration.
The old version of this article was built around a question that has been answered. DeepSeek V4 Preview is live. The official docs now give us model names, context length, API formats, feature support, pricing, open weights, and a retirement deadline for old compatibility names.
The biggest practical update is pricing. V4 Flash is extremely cheap on output, and V4 Pro is temporarily discounted. That makes old cost comparisons stale almost immediately. Any production decision should use the current DeepSeek pricing page, not a static number copied from an older post.
My practical take: use V4 Flash as the default candidate for cost-sensitive agent and long-context work, then test V4 Pro only where the task justifies it. And if you still use deepseek-chat or deepseek-reasoner, migrate before July 24, 2026.
For broader model choice, see our Claude Opus 4.7 vs GPT-5.5 vs Gemini 3.1 Pro vs DeepSeek V4 comparison.