Executive Summary
As of the morning of 3 March 2026, DeepSeek AI V4 has not yet been released, though it is widely anticipated any time now, ahead of China’s Two Sessions in 2026. Building on the January 2026 Engram paper, V4 is expected to match the capabilities of leading Western AI models at a considerably lower cost. It should also be fully open-source, enabling firms to operate it and keep all their data securely on their own servers.
According to credible industry leaks, V4 is expected to introduce native multimodal capabilities (interpreting images and videos alongside text), a substantial 1 million token context window (allowing it to process entire manuals or codebases in one go), and expert-level coding performance that internal benchmarks suggest surpasses both Claude and GPT. The model is also said to be built on a new “manifold Hyperconnection” architecture that addresses the industry-wide problem of catastrophic forgetting, meaning it maintains old knowledge reliably as it learns new skills. Strategically, it is reportedly being optimised for chips from domestic makers such as Huawei, Cambricon, and Hygon, reducing dependence on foreign infrastructure, with no early access granted to NVIDIA or AMD.
DeepSeek named the technique in its January 2026 paper “Engram”, a term from neuroscience referring to the instant memory trace that allows you to recall simple facts effortlessly. It aims to give AI the same capability: a quick, cost-effective memory for straightforward questions, saving the full, powerful thought process for complex problems. This could make advanced AI significantly more affordable and secure for regulated industries worldwide.
It would be wise to involve the Technology, Legal, Compliance, and Cybersecurity teams in developing a post-release proof-of-concept for the most valuable use cases, such as thousand-page regulatory filings, contracts, medical records, and codebases. Conducting a structured vendor risk assessment is essential, considering the current geopolitical climate.
DeepSeek V4: What to Expect?
Since DeepSeek has not yet made an official announcement, the following is based on credible leaks and industry speculation. If the rumours are accurate, V4 will introduce several technical improvements important to our business.
- Native Multimodal Understanding – The model will process not just text but images and videos directly. This allows us to analyse charts, screenshots, training videos, or even medical scans without separate tools.
- 1 Million Token Context Window – We could feed it an entire year’s regulatory filings, a complete contract archive, or a vast codebase in one request. It would retain and process every detail without requiring us to split documents into smaller parts. (A token is roughly a word or part of a word; 1 million tokens equals about 750,000 words.)
- Trillion-Parameter Architecture with a Lighter Version – The main model is expected to be much more powerful for complex reasoning, while a smaller “V4 Lite” variant would handle everyday tasks more quickly and more cost-effectively. (Parameters represent the model’s learned knowledge; think of them as the connections in its brain. Generally, more parameters mean greater ability.)
- Superior Coding Performance – Internal benchmarks reportedly show V4 outperforming Claude and GPT on programming tasks. For our teams building internal tools or automating workflows, this could significantly speed up development.
- Domestic Hardware Optimisation – V4 is reportedly being extensively tailored for chips from Chinese makers such as Huawei, Cambricon, and Hygon, with no early access granted to NVIDIA or AMD. Strategically, this indicates it is designed to operate securely on local infrastructure, reducing dependence on foreign technology.
- Manifold Hyperconnection Architecture – This technical innovation addresses “catastrophic forgetting”—the common issue where AI forgets old knowledge when learning new information. The result is a model that remains reliable and consistent over time, much like a trusted employee who builds on experience without losing foundational skills.
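As a rough illustration of the context-window arithmetic in the list above, the sketch below applies the quoted rule of thumb (1 million tokens is about 750,000 words) to estimate whether a document would fit in a single request. The function names and the figure of 500 words per page are assumptions made for illustration; real token counts depend on the model's tokenizer, and the 1 million token window is itself a rumour.

```python
# Rule of thumb from the text: 1,000,000 tokens is roughly 750,000 words.
WORDS_PER_TOKEN = 0.75
CONTEXT_WINDOW_TOKENS = 1_000_000  # rumoured V4 window, not confirmed


def estimate_tokens(word_count: int) -> int:
    """Estimate a token count from a word count using the rule of thumb."""
    return round(word_count / WORDS_PER_TOKEN)


def fits_in_context(word_count: int, window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the estimated token count fits within the window."""
    return estimate_tokens(word_count) <= window


if __name__ == "__main__":
    # Hypothetical 1,000-page filing at an assumed 500 words per page.
    words = 1_000 * 500
    print("estimated tokens:", estimate_tokens(words))
    print("fits in one request:", fits_in_context(words))
```

An estimate like this is only a first filter; a proof-of-concept would count tokens with the released model's own tokenizer.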
DeepSeek is Open Source: An Advantage None of Its Rivals Can Match
Claude, GPT, and Gemini cannot be downloaded. These models are proprietary—your data leaves your organisation, goes to external servers, and is subject to their terms.
DeepSeek V4 is expected to be open-source, enabling organisations to operate it on their own servers and keep their data local. For regulated sectors, this significantly alters the risk profile, although, unlike a cloud API, open-source models demand ongoing maintenance, security updates, and internal expertise for safe deployment. It is a capability to develop, not merely a service to purchase. This requires a skilled IT team and standard servers: a manageable upfront investment that can reduce ongoing cloud expenses.
DeepSeek Knows When to Think Deep, When to Simply Look It Up
Ask today’s AI, “Who is Albert Einstein?” and it uses the same costly computing power as when you ask it to explain relativity. It cannot tell easy from hard, wasting precious computing resources. In January 2026, DeepSeek published a paper introducing an innovative improvement. They named their creation Engram—a term borrowed from neuroscience, where an “engram” is the brain’s instant memory trace (the reason you can remember a simple fact effortlessly).
DeepSeek’s Engram functions similarly for AI: a quick, cost-effective memory system for simple questions, conserving the full, powerful reasoning engine for truly complex problems. This breakthrough could make advanced AI considerably more affordable and secure for regulated industries worldwide. The results are notable. When tested on retrieving facts from very long documents, standard AI scores 84 out of 100, while Engram scores 97.
For teams reviewing thousand-page contracts or filings, this is the difference between reliable answers and missed details that prove costly. It also makes AI much more economical to run. Basic knowledge shifts to memory, which costs 15–20 times less, so DeepSeek V4 is expected to run efficiently on standard company servers rather than requiring costly data-centre equipment for each query.
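The idea of sending easy questions to cheap memory and hard ones to the full reasoning engine can be sketched in a few lines. This is a toy illustration only and does not reflect how Engram works internally: it uses a plain dictionary as "memory" and a placeholder function as the "reasoning engine". The names `FACT_MEMORY`, `expensive_reasoning`, and `answer` are hypothetical, and the cost units (1 for a lookup, 15 for full reasoning) are assumptions taken from the lower end of the 15–20 times figure quoted above.

```python
LOOKUP_COST = 1       # assumed cost units for a memory lookup
REASONING_COST = 15   # assumed: lower end of the 15-20x figure in the text

# Hypothetical store of simple facts standing in for "memory".
FACT_MEMORY = {
    "who is albert einstein?": "A theoretical physicist, best known for relativity.",
}


def expensive_reasoning(question: str) -> str:
    """Placeholder for the full reasoning engine (not a real model call)."""
    return f"[reasoned answer to: {question}]"


def answer(question: str) -> tuple[str, int]:
    """Return (answer, cost): use memory when possible, reasoning otherwise."""
    key = question.strip().lower()
    if key in FACT_MEMORY:
        return FACT_MEMORY[key], LOOKUP_COST
    return expensive_reasoning(question), REASONING_COST


if __name__ == "__main__":
    questions = ["Who is Albert Einstein?", "Explain relativity."]
    total = 0
    for q in questions:
        text, cost = answer(q)
        total += cost
        print(f"{cost:>2} units: {text}")
    baseline = REASONING_COST * len(questions)
    print(f"total cost: {total} units, versus {baseline} without memory")
```

The saving in practice would depend on how many queries are simple enough to be served from memory, which this sketch does not attempt to model.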
DeepSeek Built at a Fraction of the Cost
Training a high-end AI model like GPT or Claude is widely estimated to cost over $1 billion. DeepSeek reported training its V3 model, released in December 2024, for approximately $5.6 million. V4 is expected to be equally efficient.
When this became widely known in January 2025, it took the stock market by surprise. Shares in NVIDIA and other AI-related companies fell sharply. Investors had assumed that developing AI required huge spending. DeepSeek challenged that assumption. The Engram paper illustrates the efficiency-first engineering behind it.
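To put the two training-cost figures quoted above in proportion, the short calculation below computes their ratio. The variable and function names are illustrative. The comparison is indicative only: "over $1 billion" is treated as a lower bound, and the V3 figure as reported covers the final training run rather than all research and development, so the two numbers are not strictly like for like.

```python
# Figures as quoted in the text; both are approximate.
FRONTIER_TRAINING_COST_USD = 1_000_000_000  # "over $1 billion", used as a lower bound
DEEPSEEK_V3_COST_USD = 5_600_000            # "approximately $5.6 million"


def cost_ratio(higher: float, lower: float) -> float:
    """Return how many times larger the first cost is than the second."""
    return higher / lower


if __name__ == "__main__":
    ratio = cost_ratio(FRONTIER_TRAINING_COST_USD, DEEPSEEK_V3_COST_USD)
    print(f"Frontier training cost is at least {ratio:.0f}x the V3 figure")
```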
DeepSeek Born in a Hedge Fund—and Why That Matters
DeepSeek was founded by Liang Wenfeng, a quantitative hedge fund manager, rather than by a Silicon Valley entrepreneur. In 2015, he co-founded High-Flyer, one of China’s leading quant funds, which used machine learning for market predictions. Eight years later, in 2023, Liang spun off DeepSeek as an AI research company funded by High-Flyer.
This quant trading heritage shapes DeepSeek’s focus on efficiency: solving complex problems with limited resources. While US labs may spend billions on breakthroughs, DeepSeek seeks smarter solutions, as the Engram paper shows. This engineering culture of squeezing maximum capability from minimal resources is exactly why V4 is expected to deliver Western-tier performance at a fraction of the cost.
DeepSeek Distillation Attacks
A distillation attack is when one AI company secretly feeds millions of questions to a rival’s model, then uses the answers to train its own model, absorbing the rival’s intelligence without permission.
In February 2026, Anthropic accused DeepSeek of creating 24,000 fake accounts and generating 16 million conversations to secretly copy its AI models, similar to studying a rival’s exam papers. OpenAI made comparable complaints. If true, this was a deliberate and large-scale effort.
Anthropic is justified in criticising it, but this practice, as I understand it, is widespread in the industry, and both OpenAI and Anthropic have used it themselves. OpenAI faces several copyright lawsuits, and Anthropic has also settled a similar case. No lawsuit has been filed against DeepSeek; the company has not responded publicly, and the allegations remain unproven.
DeepSeek in the wider AI Landscape – in brief
- OpenAI GPT is the leading consumer AI with 900 million weekly users, commonly used for customer-facing products such as chatbots and content creation.
- Anthropic Claude is the preferred option in regulated sectors such as finance and healthcare, recognised for its capacity to manage complex documents while ensuring compliance.
- Google Gemini provides the most comprehensive integration with enterprise software—connecting smoothly with Gmail, Docs, Sheets, and Meet—and excels at understanding text, images, video, and audio collectively.
- Grok (xAI) offers live access to Twitter/X data, making it useful for analysing trading sentiment and social media interaction.
- Perplexity AI functions as an AI-powered research assistant, searching various models and sources to provide cited answers promptly.
- DeepSeek AI V4 is the industry’s most eagerly awaited upcoming launch, notable for its open-source approach enabling organisations to operate it on their own servers.
DeepSeek compared with others – in brief
- DeepSeek V4 is aimed at regulated sectors where data sovereignty is essential, such as financial services, healthcare, legal, and government. If the leaks prove accurate, it would match or surpass competitors in document accuracy and coding performance, though with slightly less mature GDPR documentation. It is expected to offer capabilities comparable to top Western models at 10–30 times lower cost, operating entirely within your network with no data leaving your servers.
- Claude Opus 4.6 is the preferred choice for complex document work in regulated sectors: financial services, healthcare, legal, pharma, and professional services. Its industry-leading accuracy on long documents and top-tier coding performance make it ideal for organisations that require proven, certified performance today. Its compliance documentation is the most comprehensive available.
- GPT-5.2 is ideal for high-volume consumer-facing applications in retail, marketing, customer service, and tech operations. It excels in mathematical reasoning and offers extensive integration options, supported by unmatched brand recognition with 900 million weekly users. It is the natural choice for public-facing products that require a familiar name and reliability.
- Gemini 3 Pro provides the most comprehensive enterprise software integration, connecting seamlessly with Gmail, Docs, Sheets, and Meet. It can understand text, images, video, and audio together, and offers a very large context window at the best cost-performance ratio among leading models. It suits organisations already embedded in the Google ecosystem, such as those in retail, education, operations, and professional services.
- Grok and Perplexity: these tools serve different, more specialised purposes and do not sit neatly alongside the four above. Grok 4.1 offers live Twitter/X access, making it useful for market and PR tracking. Perplexity acts as an AI research analyst, combining several models with web search to produce fast, cited reports. Neither is meant for internal, sensitive data or as a primary enterprise platform; for their specific tasks, they are unmatched.
DeepSeek related Risks & Challenges
- Regulatory compliance: Self-hosting eliminates third-party cloud risks but still involves compliance responsibilities. Firms must demonstrate that AI systems adhere to local requirements such as MAS guidelines, PDPA, and GDPR, and model validation is required on an ongoing basis. Note that Italy has banned DeepSeek, and ongoing EU investigations concern data stored on Chinese servers.
- Costs: Hardware expenses are manageable; the main costs are operational—handling security patches, updates, and audits.
- Safety and alignment: Anthropic and OpenAI publish safety documentation for Claude and GPT-5.2; DeepSeek has not published an equivalent, so users must assess safety themselves and set guardrails where needed. For regulated use cases, the lack of published explainability research may also complicate model validation.
- Geopolitical risk: DeepSeek is a Chinese company amidst US-China tensions. Open-source solutions reduce data sovereignty concerns, but issues around origin, export restrictions, and reputational risks persist. Conduct comprehensive vendor risk assessments before deployment.
DeepSeek – Next Steps
- Monitor official DeepSeek channels for V4 announcement
- Prepare cross-functional team (Tech, Legal, Compliance, Cyber) for post-release PoC
- Initiate a vendor risk assessment framework adapted for open-source AI
- Identify use cases: regulatory filings, contracts, medical records, codebases
Sources:
- DeepSeek – Engram Paper (Jan 2026), V3 Technical Report (baseline)
- Analysis – Comparison of Models, Comparisons with DeepSeek-V3
- Hugging Face – Open-source weights & self-hosting
- GitHub – DeepSeek V3
- IBM – DeepSeek Architecture
- WaveSpeed – DeepSeek V4 1M Token Context
- Macaron – Enterprise Security, Data Retention
- Labelbox – Complex Reasoning Leaderboard
- Skywork – APIFree AI Access
- Anthropic – Detecting and preventing distillation attacks (Feb 2026)
- Lawsuits / News – Anthropic, OpenAI, Futurism, Financial Times, Register, Italy
- Strategy – Huawei, Cambricon, Hygon, Cambricon meteoric rise
About the author
Viren Mantri is a cybersecurity advisor and former senior technology leader across Standard Chartered, UBS, McAfee, and KPMG. With 30 years of navigating the intersection of technology, risk, and regulations, he now helps organisations cut through complexity and make better security decisions.
CC-BY Viren Mantri, 2026, licensed under a Creative Commons Attribution 4.0 International License.
Disclaimer: All views expressed here are entirely mine.
