article

Your AI isn’t safe: How LLM hijacking and prompt leaks are fueling a new wave of data breaches

A junior developer at a fast-growing fintech startup, racing to meet a launch deadline, copied an API key into a public GitHub repo. Within hours, the key was scraped, bundled with others, and traded on Discord to a shadowy network of digital joyriders. 

Shutterstock

By the time the company’s CTO noticed the spike in usage, the damage was done: thousands of dollars in LLM compute costs, and a trove of confidential business data potentially exposed to the world.

I’m not hypothesizing. It’s a composite of what’s repeatedly happened in the first half of 2025.

In January, the AI world was rocked by breaches that feel less like the old “oops, someone left a database open” and more like a new genre of cyberattack. DeepSeek, a buzzy new LLM from China, had its keys stolen and saw 2 billion tokens vanish into the ether, used by attackers for who-knows-what. 

A few weeks later, OmniGPT, a widely used AI chatbot aggregator that connects users to multiple LLMs, suffered a major breach, exposing over 34 million user messages and thousands of API keys to the public. 

If you’re trusting these machines with your data, you’re now watching them betray that trust in real time.

The New Playbook: Steal the Mind, Not Just the Data

For years, we’ve worried about hackers stealing files or holding data for ransom. But LLM hijacking is something different – something weirder and more insidious. Attackers are after the very “brains” that power your apps, your research, your business. 

They are scraping GitHub, scanning cloud configs, even dumpster-diving in Slack channels for exposed API keys. Once they find one, they can spin up shadow networks, resell access, extract more information for lateral movement or simply run up service bills that would make any CFO faint. 

Take the DeepSeek case, where attackers used reverse proxies to cover their tracks, letting dozens of bad actors exploit the same stolen keys undetected. The result? You could wake up to a massive bill for unauthorized AI usage – and the nightmare scenario of your private data, whether personal or professional, being leaked across the internet.

But the plot thickens with system prompt leakage. System prompts – the secret scripts that tell a GPT how to behave – are supposed to be hidden from the end users. But with the right prompt, attackers can coax models into revealing these instructions, exposing the logic, rules, and sometimes even extremely sensitive information that keep your AI in check. Suddenly, the AI you thought you understood is playing by someone else’s rules.

With every new integration, the attack surface grows. But our security culture might be still stuck in the times of password123.

Why This Should Scare Us All

We’re wiring LLMs into everything, everywhere, all at once. Customer service bots, healthcare, legal research, even the systems that write our code. With every new integration, the attack surface grows. But our security culture might be still stuck in the times of password123.

In the meantime, the underground market for LLM exploits is exploding. Stolen keys are traded on Discord like baseball cards. Prompt leakage tools are getting more sophisticated. Hackers are sprinting ahead. And the more autonomy we give these models, the more damage a breach can do. We’re in a battle for control, trust, and the very nature of automation.

Are We Moving Too Fast for Our Own Good?

Thinking of AI as “just another tool” is a mistake. You can’t just plug these systems in and hope to slap on security later, because LLMs aren’t predictable spreadsheets or file servers. They’re dynamic and increasingly autonomous – sometimes making decisions in ways even their creators can’t fully explain. 

Yet, in the hurry to ride the AI gold rush, most organizations are betting their futures on systems they barely understand, let alone know how to defend. Security has been left in the dust, and the cost of that gamble is only going up as LLMs get embedded deeper into everything from business operations to healthcare and finance.

If we don’t change course, we’re headed for a reckoning – lost dollars and, more importantly, trust. The next phase of AI adoption will depend on whether people believe these systems are safe, reliable, and worthy of the power we’re handing them. If we keep treating LLMs like black boxes, we’re inviting disaster.

What Needs to Change, Ideally, Yesterday

So, what do we do? Here’s my take:

Treat API keys like plutonium. Rotate them, restrict their scope, and keep them out of your codebase, chats and logs. If you’re still pasting keys into Slack, you’re asking for trouble.

Watch everything. Set up real-time monitoring for LLM usage. If your AI starts unexpectedly churning out tokens at 3 a.m., you want to know before your cloud bill explodes.

Don’t trust the model’s built-in guardrails. Add your own layers – filter user inputs and system outputs, always assume someone will try to trick your AI if it’s exposed to user input.

Red-team your own AI solutions. Try to break it before someone else does. 

Implement segregation through access controls. Don’t let your chatbot have the keys to your entire kingdom.

And yes, a handful of vendors are starting to take these threats seriously. Platforms like Nexos.ai offer centralized monitoring and guardrails for LLM activity, while WhyLabs and Lasso Security are developing tools to detect prompt injection and model emerging threats. None of these solutions are perfect, but together they signal a much-needed shift toward building real security into the generative AI ecosystem.

Your AI’s Brain Is Up for Grabs, Unless You Fight Back

It’s time to recognize that LLM hijacking and system prompt leakage aren’t sci-fi. This stuff is happening right now, and the next breach could be yours. AI is the new brain of your business, and if you’re not protecting it, someone else will take it for a joyride.

I’ve seen enough to know that “hope” isn’t a security strategy. The future of AI seems bright, but only if we get serious about its dark side now – before the next breach turns your optimism into regret.

 

 

ABOUT THE AUTHOR

Vincentas Baubonis is an expert in Full-Stack Software Development and Web App Security, with a specialized focus on identifying and mitigating critical vulnerabilities in IoT, hardware hacking, and organizational penetration testing. As Head of Security Research at Cybernews, he leads a team that has uncovered significant privacy and security issues affecting high-profile organizations and platforms such as NASA, Google Play, and PayPal. Under his leadership, the Cybernews team conducts over 7,000 pieces of research annually, publishing more than 600 studies each year that provide consumers and businesses with actionable insights on data security risks.