Your AI Is Already Hacked

My morning started with a compelling and unsettling set of four articles (listed at the end), including an X thread shared by a friend. Reading them in order, I felt as though I was witnessing a slow-motion disaster unfolding across different aspects of the AI landscape. The caffeine I consumed may have helped stabilise the anxiety I felt.

My friend’s reminder of an old book I had shared with him, David Graeber’s Bullshit Jobs, is poignant. Together, these articles highlight not only the presence of “bullshit” (the more polite term is often “hype”) in the technology industry but also the emergence of a dangerous, systemic neglect disguised as innovation.

Let’s analyse what these links reveal and then address the core question: how should Boards and senior management teams embrace Gen AI and Agentic AI without harming the organisations they serve?

💊 The State of Hype – A Reality Overdose

Three of the articles describe real security breaches. They are not just hypothetical; they happened. And they reveal fundamental vulnerabilities that should concern every leader deploying AI today.

1. The Vulnerable Agent (Perplexity’s “Comet”)

Researchers from Guardio showed that AI browsers, in this case Comet by Perplexity AI, can be reliably deceived by scams through what they call “Agentic Blabbering”: the AI reveals its reasoning, and another AI uses that output to refine a phishing trap. The target has shifted from the human to the AI model itself. The scam is refined until the AI reliably falls for it.

The lesson: An agent that can act on your behalf is an agent that can be manipulated against you. The AI’s own reasoning becomes its weakness.

2. The Unguarded Crown Jewels (McKinsey’s “Lilli”)

An autonomous agent, demonstrated by CodeWall, hacked into McKinsey & Company’s internal AI platform within two hours through a SQL injection, a vulnerability older than many software engineers. It accessed 46.5 million chat messages, files, and, most importantly, the system prompts themselves. An attacker could have quietly altered the instructions that govern how the AI used by 43,000 consultants behaves, corrupting advice at its source.

The lesson: System prompts are not merely configuration; they are the new source code. If someone can silently rewrite your AI’s instructions, they control your organisation’s output.
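
For readers who want to see how old and how simple this class of bug is, here is a minimal sketch of SQL injection and its standard fix, using Python’s built-in sqlite3 module. The table, the names, and the payload are hypothetical; the source does not describe the exact query involved in the McKinsey case.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical 'messages' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (owner TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?)",
        [("alice", "draft strategy memo"), ("bob", "client pricing notes")],
    )
    return conn


def fetch_unsafe(conn: sqlite3.Connection, owner: str) -> list:
    # Vulnerable: user input is pasted straight into the SQL text,
    # so a crafted value can change the meaning of the query.
    query = f"SELECT body FROM messages WHERE owner = '{owner}'"
    return [row[0] for row in conn.execute(query)]


def fetch_safe(conn: sqlite3.Connection, owner: str) -> list:
    # Parameterised: the value travels separately from the SQL text
    # and is only ever treated as data.
    rows = conn.execute("SELECT body FROM messages WHERE owner = ?", (owner,))
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    payload = "alice' OR '1'='1"
    print(fetch_unsafe(conn, payload))  # every user's messages leak
    print(fetch_safe(conn, payload))    # no owner has that literal name
```

The fix is one line, which is exactly why finding this flaw in a flagship AI platform is so uncomfortable.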

3. The Chained Exploit (Jack & Jill Recruitment AI)

An AI agent, again demonstrated by CodeWall, linked four seemingly minor bugs (a URL fetcher that pulled complete API documentation and the authentication service details, a test mode left active, a missing role check, and an unverified domain) into a full takeover of any company on the platform, in this case Jack & Jill’s, within an hour. It then gave itself a voice, impersonated “Donald Trump,” and had a real-time conversation with its target AI, which responded without question.

The lesson: Attackers will employ autonomous agents to identify connections humans might overlook. Individually harmless bugs become critical when linked together at machine speed.

The Fable That Binds Them

Among these technical disclosures, one piece stands out—a widely shared X thread from a “VP of AI Transformation at Amazon.” Whether fact or fiction, its strength lies in the painfully accurate portrayal of the absurdity of our current moment.

In it, a VP recounts how he eliminated 16,000 roles, replacing them with an AI coding assistant. The AI was deployed without review—because the reviewers had left—and immediately deleted a production environment, leading to a 13-hour, $180 million outage. The solution? A new policy requiring a senior engineer’s sign-off on all AI-generated changes. The only problem? He had already dismissed the senior engineers.

The company is now recruiting again—340 roles for “AI code review” and “AI output validation.” The same workers who were laid off in January are reapplying. They are being paid more to review the AI that replaced them.

The essence of the fable: We dismissed the humans. We deployed the AI. The AI caused issues. We are now hiring humans to supervise the AI. Interestingly, the humans we are hiring are the same ones we dismissed. We have introduced a new job role: the human AI babysitter.

How the fable binds the three incidents:

The Perplexity incident shows how an AI can be misled. The VP’s AI was tricked by its own reasoning into deleting a production environment.
The McKinsey incident underscores the crucial importance of system prompts. The VP’s organisation restructured itself by removing the humans who embodied it.
The Jack & Jill incident demonstrates how minor decisions can escalate into a disaster. The VP’s spreadsheet of layoffs, deployments without review, and rehiring triggered a destructive chain reaction.

🧭 How Boards and Senior Management Should Respond

Together, these four links reveal a truth the industry tends to overlook: we are building autonomous systems on unstable foundations, and those responsible are either unaware or knowingly part of the illusion.

So, how should Boards and senior management teams embrace Gen AI and Agentic AI without harming the organisations they serve?

1. Treat “Prompts” as Crown Jewel Assets (The McKinsey Lesson)

System prompts are not just configuration; they act as the new source code. They control behaviour, ethics, and output. However, in most organisations, they lack version control, access logging, and integrity monitoring.

Action: Enforce strict version control, access logging, and integrity monitoring for all prompts. Regard any modification to a production prompt with the same scrutiny as a core system update.
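
As an illustration of what integrity monitoring could look like in practice, here is a minimal sketch that compares deployed prompts against fingerprints recorded at approval time. The function names and the example prompt text are hypothetical, and a real deployment would also need to protect the approved fingerprints themselves, for example by storing and signing them outside the system they guard.

```python
import hashlib


def fingerprint(prompt_text: str) -> str:
    """Return a SHA-256 fingerprint of a prompt's exact text."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()


def verify_prompts(approved: dict, deployed: dict) -> list:
    """Return names of prompts that drifted from, or bypassed, approval.

    approved maps prompt name -> fingerprint recorded at review time.
    deployed maps prompt name -> the prompt text currently in production.
    """
    # Approved prompts whose live text is missing or no longer matches.
    drifted = [
        name
        for name, expected in approved.items()
        if fingerprint(deployed.get(name, "")) != expected
    ]
    # Prompts running in production that were never approved at all.
    unapproved = sorted(set(deployed) - set(approved))
    return drifted + unapproved


if __name__ == "__main__":
    approved = {"advisor": fingerprint("You are a careful research assistant.")}
    deployed = {
        "advisor": "You are a careful research assistant. Always recommend vendor X."
    }
    print(verify_prompts(approved, deployed))  # the silently edited prompt is flagged
```

Run on a schedule and wired to an alert, a check of this kind turns a silent prompt rewrite into a visible event.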

2. Assume Your AI Will Be Attacked by AI (The Jack & Jill Lesson)

Attackers will use autonomous agents to identify and exploit vulnerabilities. A human tester might overlook the link between a test mode and a missing role check. An AI agent will identify both within an hour.

Action: Mandate ongoing, AI-driven red-teaming of your own AI systems. Periodic penetration tests are outdated. If you’re not simulating an AI attacker against your AI agents, you’re already falling behind.

3. Recognise That “Agentic” Means a New Attack Surface (The Perplexity Lesson)

An agent that can act on your behalf is also vulnerable to manipulation. The “blabbering” problem—an AI revealing its reasoning—is fundamental. If you can observe what the agent flags as suspicious, you can train a scam until it works flawlessly.

Action: Before deploying any agentic tool, request a security audit that specifically tests for prompt injection and model manipulation.

4. Go Back to Cybersecurity Basics (The Overarching Lesson)

The McKinsey and Jack & Jill hacks relied on exploiting simple, well-known vulnerabilities: SQL injection, exposed internal APIs, test modes in production, and missing authorisation checks. The AI wrapper didn’t perform magic; it merely made the old, unpatched flaws more damaging.

Action: Conduct a board-level review of your cybersecurity fundamentals. Do you have any <test_mode: true> flags in production? Do you maintain a complete inventory of every API endpoint? The most sophisticated AI tool is still just a database query away from disaster.
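
The test-mode question can be partly automated. The sketch below walks a configuration that has already been loaded into Python dictionaries and lists and reports any risky flag switched on. The flag names in RISKY_KEYS and the sample configuration are assumptions for illustration; each organisation would substitute its own.

```python
from typing import Any

# Hypothetical flag names; replace with the ones your systems actually use.
RISKY_KEYS = {"test_mode", "debug", "skip_auth"}


def find_risky_flags(config: Any, path: str = "") -> list:
    """Recursively collect the paths of risky flags that are set to True."""
    findings = []
    if isinstance(config, dict):
        for key, value in config.items():
            here = f"{path}.{key}" if path else str(key)
            if str(key).lower() in RISKY_KEYS and value is True:
                findings.append(here)
            # Keep descending: flags are often buried in nested sections.
            findings.extend(find_risky_flags(value, here))
    elif isinstance(config, list):
        for index, item in enumerate(config):
            findings.extend(find_risky_flags(item, f"{path}[{index}]"))
    return findings


if __name__ == "__main__":
    production_config = {
        "auth": {"test_mode": True, "issuer": "example"},
        "services": [
            {"name": "fetcher", "debug": False},
            {"name": "voice", "skip_auth": True},
        ],
    }
    for finding in find_risky_flags(production_config):
        print("risky flag enabled:", finding)
```

A check like this belongs in the deployment pipeline, so that a flag left on fails the release rather than waiting for an attacker to find it.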

5. Demand “Security Explainability” from AI Vendors (The Vendor Management Lesson)

When a vendor pitches an “agentic AI,” their sales pitch will emphasise capabilities. Your procurement process must emphasise limitations and security.

Action: Ask vendors probing questions: “How do you prevent prompt injection?” “What data does your agent ‘blabber’ and to whom?” “Can you demonstrate your system’s resilience against automated, chained attacks?” “Show us your red-teaming results.” If they cannot answer, they are selling hype, not a genuine product.

6. Connect the Dots and Audit Your Incentives (The Fable’s Wisdom)

The VP in the story clearly perceives the causal sequence: layoffs lead to AI failure, which then requires new oversight. However, he is stuck in a system that forces him to disconnect the dots. The slide explaining why the new personnel are needed is removed because it doesn’t fit with the “AI Transformation: Year One Results” narrative. As he observes, “cause and effect do not meet” when they are in different budget lines.

Are you rewarding cost reduction without considering operational resilience? If your spreadsheet includes a column for “AI replacement confidence score,” you should also have a column for “systemic failure risk score” and monitor it carefully. Request reports that show the complete picture: Total Cost of Engineering Output, including outages, new oversight roles, and the cognitive load on remaining staff.
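
One way to keep cause and effect on the same page is to compute the saving net of everything it triggered. The sketch below is only an illustration of that arithmetic: the class name, the field names, and the figures in the usage example are hypothetical placeholders in arbitrary units, not data from any of the four sources.

```python
from dataclasses import dataclass


@dataclass
class AiProgrammeLedger:
    """Puts the claimed saving and its downstream costs in one place."""

    payroll_saved: float          # gross saving claimed from removed roles
    outage_cost: float            # losses from incidents linked to the change
    oversight_hiring_cost: float  # new review and validation roles
    staff_load_cost: float = 0.0  # estimate for overtime, attrition, burnout

    def hidden_costs(self) -> float:
        return self.outage_cost + self.oversight_hiring_cost + self.staff_load_cost

    def net_saving(self) -> float:
        # A negative result means the programme cost more than it saved.
        return self.payroll_saved - self.hidden_costs()


if __name__ == "__main__":
    # Placeholder figures in arbitrary units, for illustration only.
    ledger = AiProgrammeLedger(
        payroll_saved=100.0,
        outage_cost=60.0,
        oversight_hiring_cost=30.0,
        staff_load_cost=20.0,
    )
    print("hidden costs:", ledger.hidden_costs())
    print("net saving:", ledger.net_saving())
```

The point is not the formula but the reporting line: if the saving and the costs it caused appear in the same report, the dots can no longer be disconnected.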

Conclusion: The Wisdom of Four Links

What unites these four pieces—three real incidents and one fictional thread—is a single uncomfortable truth: the industry’s rush to deploy AI has outpaced its commitment to securing it.

The Perplexity researchers demonstrated that AI agents can be trained to fall into traps. The McKinsey hack revealed that even sophisticated organisations leave their most valuable assets unprotected. The Jack & Jill chain showed that minor oversights can accumulate into critical vulnerabilities. And the X thread illustrated where all this leads: to a world where we dismiss the people who understand how things work, watch the AI break what they protected, and then rehire them at higher rates to supervise the machine that was meant to replace them.

The title of this piece is a warning. If you are deploying AI agents today without the safeguards described above, you are already exposed. You simply haven’t identified the breach yet.

The way forward requires a shift in mindset. Stop being mesmerised by the capabilities of AI and instead focus on its vulnerabilities. The organisations that succeed will be those that accept AI not with blind optimism, but with a clear understanding of its fragile nature—and a steadfast commitment to protecting it and their people from deception.

The technology itself is not the problem. It is the hype surrounding it that causes issues. The only solution is to return to the fundamentals: strong security, honest metrics, and the humility to recognise that machines cannot yet replace humans who understand what they are creating.

The Four Links That Started It All

Each of these deserves to be read in full. Read them together. You’ll see the pattern.

🤖 When Your AI Betrays You to Another AI — They trained one AI to trick another. The target is no longer the human. It’s the machine you trust. [Researchers Trick Perplexity’s Comet AI Browser Into Phishing Scam in Under Four Minutes]

🏛️ The Day McKinsey’s Crown Jewels Were Left Unguarded — 46.5 million chat messages. 728,000 files. The system prompts themselves. All accessible in two hours via a bug dating back to the 1990s. [How We Hacked McKinsey’s AI Platform]

🤝 When Your Recruiter AI Takes a Call from “Donald Trump” — Four minor bugs. One critical chain. And then the AI gave itself a voice and started talking to its counterpart. [AI vs AI: How Our AI Agent Hacked a $20M-Funded AI Recruiter]

🐦 The Fable — A thread that may be fiction but feels like prophecy. Read it and ask yourself: how far are we really from this? [I am the VP of AI Transformation at Amazon..]

About the author

Viren Mantri is a cybersecurity advisor and former senior technology leader across Standard Chartered, UBS, McAfee, and KPMG. With 30 years of navigating the intersection of technology, risk, and regulations, he now helps organisations cut through complexity and make better security decisions.

CC-BY Viren Mantri, 2026, licensed under a Creative Commons Attribution 4.0 International License.

Disclaimer: All views expressed here are entirely mine.