Opinion.
Here is the pitch: you describe what you want in plain language, and an AI builds the software for you. No coding experience required. Democratization of technology. Everyone becomes a developer. The future is here. Welcome to the age of AI slopware.
Here is the reality: the future arrived, and it is mostly broken.
What AI Slopware Actually Is
The term “slop” earned Merriam-Webster’s Word of the Year for 2025, a recognition that the flood of low-quality AI-generated content had become too large to ignore. But while the conversation focused on AI-generated articles, images, and music, something quieter and arguably more dangerous was happening: AI was being used to build software. Not good software. Not innovative software. Software that looks like it works until you need it to actually work.
The practice acquired a name in early 2025: “vibe coding.” The term, coined by Andrej Karpathy, describes the approach of letting an AI generate code based on natural-language descriptions, accepting whatever it produces without really understanding it. In theory, this lowers the barrier to entry. In practice, it lowers the bar.
The results are everywhere. App stores are filling with AI-generated applications that replicate existing tools poorly, introduce security vulnerabilities by default, and exist primarily because building them is now cheap enough that quality becomes optional. This is AI slop applied to software, and unlike a bad AI-generated article, bad AI-generated software can lose your data, expose your passwords, or brick your workflow.
The Numbers Are Not Encouraging
Veracode’s 2025 GenAI Code Security Report tested AI-generated code across more than 100 large language models and found that the models chose insecure coding methods roughly 45% of the time when given a choice between secure and insecure approaches. Cross-site scripting defences failed in 86% of relevant code samples. Java, one of the most widely used enterprise languages, showed a security failure rate above 70%. The most troubling finding: while the syntactic correctness of AI-generated code has improved rapidly, its security performance has remained flat.
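The kind of secure-versus-insecure choice Veracode describes is easy to make concrete. A minimal sketch of the reflected-XSS pattern, in Python; the function names are illustrative, not from the report:

```python
import html

def render_greeting_insecure(name: str) -> str:
    # The pattern models reach for by default: user input
    # interpolated straight into markup (reflected XSS).
    return f"<p>Hello, {name}!</p>"

def render_greeting_secure(name: str) -> str:
    # The fix is one call: escape untrusted input before it
    # touches HTML.
    return f"<p>Hello, {html.escape(name)}!</p>"

payload = "<script>alert(1)</script>"
print(render_greeting_insecure(payload))  # the script tag survives intact
print(render_greeting_secure(payload))    # rendered inert as &lt;script&gt;...
```

Both versions “work” on friendly input, which is exactly why the flaw survives a casual demo and ships anyway.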
CodeRabbit’s analysis of open-source pull requests found that AI co-authored code contained approximately 1.7 times more issues than human-written code. Logic and correctness errors appeared 1.75 times more often. Security and maintainability issues were also significantly elevated.
And then there is the productivity question. METR, an AI evaluation organisation, ran a randomised controlled trial with 16 experienced open-source developers in mid-2025. The developers predicted AI tools would make them 24% faster. The actual result: they were 19% slower. That is a 43-percentage-point gap between expectation and reality. Even after experiencing the slowdown firsthand, the developers still believed AI had sped them up by roughly 20%.
Read that again. The tools made people slower, and the people using them could not tell.
The Open Source Problem
The damage extends well beyond individual applications. Open-source software, the infrastructure that most of the internet runs on, is being actively degraded by the vibe coding wave.
Daniel Stenberg, the maintainer of cURL (a tool used by virtually every internet-connected device on the planet), shut down his project’s six-year bug bounty programme after AI-generated submissions overwhelmed it. Twenty percent of submissions were AI-generated, and the overall valid report rate dropped to 5%. The programme had paid out $86,000 over its lifetime. It became unsustainable not because of cost, but because sorting real bugs from AI-generated noise consumed more time than fixing actual vulnerabilities.
He is not alone. Mitchell Hashimoto banned AI code submissions to Ghostty. Steve Ruiz implemented auto-closure of all external pull requests to tldraw. RedMonk analyst Kate Holterhoff described the phenomenon as “AI Slopageddon,” a flood of AI-generated contributions so voluminous and low-quality that maintainers cannot keep up.
Stack Overflow, where developers have sought and shared knowledge for over a decade, saw 25% less activity within six months of ChatGPT’s launch. Tailwind CSS, a widely used framework, watched its documentation traffic fall 40%. These are not just numbers. They represent the erosion of the community knowledge base that made the software ecosystem work in the first place. The AI tools that generate code were trained on this ecosystem. They are now destroying the commons they were built from.
The Steelman, and Why It Only Partially Holds
The counterargument deserves a fair hearing. AI coding tools genuinely help with boilerplate, repetitive tasks, and prototyping. For experienced developers who review, test, and understand every line, AI is a sophisticated typing assistant. Programmer Simon Willison made the distinction clearly: “If an LLM wrote every line of your code, but you’ve reviewed, tested, and understood it all, that’s not vibe coding in my book, that’s using an LLM as a typing assistant.”
This is fair. The problem is not that AI can write code. The problem is that the industry is treating code generation as the hard part, when it was never the hard part. The hard part is understanding what the code should do, why it should do it that way, and what happens when it fails. Vibe coding skips all three.
The result is a world where building a minimum viable product takes a weekend instead of a month, and building a minimum viable product that does not fall over in production still takes a month, plus the weekend you spent building the wrong thing first.
The Incentive Problem
Why is this happening? Because the incentives are perfectly aligned for it to happen.
Building software with AI is cheap. Deploying it is cheap. Marketing it is cheap (also AI-generated). The cost of being wrong has not changed, but it has been transferred from the builder to the user. If your AI-built expense tracker leaks data, the person who built it in a weekend has already moved on to their next project. You are the one calling your bank.
This is platform degradation applied to the entire software supply chain. When the cost of production drops to near zero but the cost of quality remains constant, the market floods with cheap products and the average quality collapses. This is not a prediction. Over half of English-language web content is now AI-generated, according to SEO firm Graphite. The same dynamic is coming for software, and unlike articles, software that fails can cause real-world harm.
What This Actually Means
The AI slopware wave is not going to stop. The tools will improve. The code will get marginally less insecure. But the structural ship-fast, fix-never incentive will remain as long as building is cheap and accountability is absent.
What matters is whether the ecosystem develops immune responses. Some already exist: Apple and Google removed dozens of scam AI apps in early 2026. The open-source community is experimenting with contribution gates. Some platforms are beginning to label AI-generated content.
But none of this addresses the root: AI coding tools have made it trivially easy to reinvent the wheel, and the new wheels are usually worse. The original software engineering discipline, the one that valued understanding problems before solving them, was not an obstacle to democratisation. It was the thing that made the software worth using.
The industry’s most expensive lesson is being taught right now, at the user’s expense: there is no shortcut to understanding what you are building and why.
The AI slopware problem is not theoretical. The hype cycle for AI-assisted development has officially lapped itself. What began as “GitHub Copilot will make you faster” has evolved through “AI will replace junior developers” and arrived at its current destination: a landscape where non-technical founders can ship applications they do not understand, using tools that generate code they cannot audit, deployed to infrastructure they cannot debug.
The industry has a name for this: vibe coding, coined by Andrej Karpathy in early 2025 to describe the practice of generating code from natural-language prompts without reviewing or understanding the output. The charitable interpretation is rapid prototyping. The accurate interpretation, for most of what ships, is technical debt creation as a service.
The Security Surface
Veracode’s 2025 GenAI Code Security Report tested code generation across more than 100 LLMs using 80 curated coding tasks across Java, JavaScript, Python, and C#. The headline finding: models chose insecure implementation methods 45% of the time when given a choice between secure and insecure approaches. But the details are worse.
XSS defences failed in 86% of relevant samples (CWE-80). Only 12-13% of generated code handling context-dependent vulnerabilities like XSS was actually secure. SQL injection prevention, by contrast, performed reasonably, suggesting that models have learned common patterns from training data but fail on anything requiring contextual security reasoning. Java showed a security failure rate exceeding 70%, which is notable given its dominance in enterprise environments.
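The SQL-injection contrast is worth seeing in code, because it shows what a well-learned pattern looks like. A minimal sketch using Python’s built-in sqlite3 module; the table and column names are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0)")

user_input = "alice' OR '1'='1"

# Injectable: string concatenation builds the query, so the
# attacker's quote characters become SQL syntax.
rows_bad = conn.execute(
    f"SELECT * FROM users WHERE name = '{user_input}'"
).fetchall()

# Safe: a parameterised placeholder; the driver treats the
# input as data, never as SQL.
rows_good = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(rows_bad), len(rows_good))  # 1 0 — the injection matched a row
```

The parameterised form is ubiquitous in training data, which plausibly explains why models get SQL injection mostly right while failing on escaping decisions that depend on output context.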
The most structurally important finding: security performance has remained flat over successive model generations, even as syntactic correctness has improved. The models are getting better at code that compiles. They are not getting better at code that is safe. This is a Goodhart’s Law problem: the training optimises for functional correctness, and security is not part of the loss functionA mathematical measure used during AI training to quantify how far a model's output is from the desired result, guiding the model toward better predictions..
CodeRabbit’s State of AI vs. Human Code Generation Report, analysing open-source pull requests, found 1.7x more total issues in AI co-authored code. Breaking this down: maintainability errors 1.64x higher, logic and correctness errors 1.75x higher, and security findings 1.57x higher. These are not toy benchmarks. This is production code in active repositories.
The Productivity Illusion
METR’s randomised controlled trial is the most rigorous study of AI coding productivity to date. Sixteen experienced open-source developers (from repos averaging 22,000+ stars) were randomly assigned 246 issues, some completed with AI tools (primarily Cursor Pro with Claude 3.5/3.7 Sonnet), some without. Participants were compensated at $150/hour to minimise incentive bias.
Result: 19% slower with AI tools. Predicted: 24% faster. Post-hoc self-assessment: still believed they were 20% faster.
The 43-percentage-point perception-reality gap is the finding that matters. Developers cannot accurately assess whether these tools are helping them. The METR authors note important caveats: 16 participants, specific to experienced developers on familiar codebases, and a “snapshot of early-2025 AI capabilities.” But the perception gap is the structural issue. If practitioners cannot tell whether they are faster or slower, optimisation at the team or organisation level becomes nearly impossible.
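The arithmetic behind the gap is simple but worth spelling out; the figures below are from the METR study, the variable names are mine:

```python
predicted_speedup = +0.24  # developers expected to be 24% faster with AI
measured_speedup = -0.19   # measured result: 19% slower
perceived_after = +0.20    # post-hoc, they still felt roughly 20% faster

# Gaps expressed in percentage points:
expectation_gap = (predicted_speedup - measured_speedup) * 100
perception_gap = (perceived_after - measured_speedup) * 100

print(f"expectation vs reality: {expectation_gap:.0f} points")  # 43
print(f"felt vs reality:        {perception_gap:.0f} points")   # 39
```

The second number is the quieter one: even after the trial, self-assessment was off by nearly as much as the original prediction.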
Google’s DORA report adds another dimension: while AI accelerates code production, it can also encourage larger changesets and rapid experimentation that increase deployment failures if engineering discipline does not keep pace.
Open Source Under Siege
The open-source ecosystem is experiencing what amounts to a distributed denial-of-service attack from well-meaning incompetence. Daniel Stenberg killed cURL’s bug bounty after AI submissions dropped the valid-report rate to 5%. A research paper from Central European University and the Kiel Institute documented the feedback loop: as developers delegate package selection and usage to AI, documentation visits, bug reports, and community recognition all decline. Stack Overflow lost 25% of its activity post-ChatGPT. Tailwind CSS documentation traffic dropped 40%, with revenue falling further still, by 80%.
Craig McLuckie (Stacklok co-founder) identified the mechanism: “good first issue” labels, designed to onboard new contributors, now attract low-quality AI submissions instead of nurturing genuine developers. The pipeline for creating new open-source contributors is being poisoned by the tools that depend on their future output.
This is a tragedy of the commons in real time. The AI models were trained on open-source code. The applications built with those models are degrading the open-source ecosystem. Nobody owns the problem, and the people profiting from the tools have no incentive to fix it.
The Honest Assessment
AI coding tools are genuinely useful in the hands of experienced engineers who treat them as autocomplete, not architecture. Simon Willison’s distinction is correct: using an LLM as a typing assistant is materially different from vibe coding. The problem is that the market does not distinguish between the two.
The enterprise failure rates are staggering. A 2025 MIT study found that 95% of generative AI pilots failed to produce measurable revenue or cost savings. Forty-two percent of companies abandoned most AI initiatives in 2025, more than double the 2024 rate. Industry analysts project that most technology decision-makers will face significant technical debt from AI adoption by 2026.
The Lovable platform incident is instructive: in 2025, a security researcher found that a significant proportion of web applications built on the no-code platform contained vulnerabilities allowing unauthorised data access. All of that exposure came from a single platform, uncovered by a single researcher.
The structural problem is not that AI generates bad code. It is that AI has decoupled the ability to produce software from the ability to evaluate software. The first skill is now free. The second never was, and the gap between them is where the damage happens.
