Against The Flood
On building systems that trust humans to think
There’s a strategy in information warfare that can be summarized in five words: “Flood the zone with shit.”
The goal isn’t to convince anyone of anything specific. It’s to overwhelm. Generate so much noise that signal becomes impossible. When everything is contested, nothing is settled.
Those five words name something real. Not a conspiracy, not a coordinated plot—something more like a discovery. Someone figured out a fundamental asymmetry in the information environment: it’s far easier to generate confusion than clarity. Flooding is cheap. Sense-making is expensive. And when the flood is constant, people stop trying to find solid ground.
I’ve been thinking about what a response to this would actually look like. Not fact-checking, not content moderation, not media literacy programs—all of which are reactive, all of which are losing. Something different. Something that operates on the same timescale and at the same scale as the flood itself.
What I’ve arrived at is uncomfortable. It involves building systems that, in their mechanism, resemble exactly what we’re trying to counter. The distinction is in the goal and the architecture. Whether that distinction is meaningful enough to justify building it—that’s what I want to work through here.
The first thing to understand is the threat level, because the response has to be proportionate to the actual danger.
We are probably five to ten years from artificial general intelligence. Maybe less. The trajectory is clear even if the timeline isn’t precise. And the path there isn’t a sudden jump—it’s a gradual increase in capability that’s already happening.
What this means for information warfare is straightforward: the tools for flooding the zone are getting better, cheaper, and more accessible every month. Today it takes some coordination to run an effective disinformation campaign. Soon it will take a laptop and an afternoon. The barrier to entry is collapsing.
The flooding approach—whether run by political operatives, state actors, grifters, or just people who enjoy chaos—scales automatically as AI capability increases. Every improvement in language models is an improvement in the ability to generate plausible-sounding bullshit at volume.
The corrective does not scale automatically. Fact-checking requires humans. Content moderation requires judgment. Media literacy requires education over years. These are all bottlenecked by human capacity, while the flood is increasingly automated.
This asymmetry is going to get worse. Much worse. And the effects are already visible.
Look at current discourse. Really look at it.
Niche issues consuming enormous shares of political attention, far out of proportion to the number of people they directly affect. That doesn’t happen organically. It happens when the information environment is optimized for engagement rather than importance.
Consider the anxiety levels across the political spectrum.
If you’ve fully absorbed one version of the discourse, you might genuinely believe that fascism is imminent, that democracy is over, that you’re living through the early stages of a nightmare. Your life becomes consumed by dread about political developments.
If you’ve fully absorbed the other version, you might genuinely believe that shadowy forces are trafficking children, that your kids are being groomed, that everything you value is under coordinated attack. Your life becomes consumed by dread about cultural developments.
Both of these are terrible ways to live. Both produce anxiety that corrodes wellbeing and distorts judgment. Both are products of an information environment that rewards extremity and punishes nuance.
The flooding approach didn’t create polarization from nothing. But it discovered how to amplify it, how to keep it boiling, how to make sure that everyone stays maximally activated and afraid. That’s the zone being flooded. Not just with false claims, but with emotional intensity that crowds out measured thinking.
The response I’m proposing is not primarily about fighting disinformation. It’s about something broader: creating sustained back-pressure toward complexity, nuance, and measured thinking across the information environment.
I call it a benevolent swarm, and I want to be precise about what that means.
A traditional botnet is designed for peak impact. It mobilizes for a specific campaign, floods a specific target, achieves a specific effect. Think of a DDoS attack—massive coordinated pressure, then it’s over.
The benevolent swarm operates differently. It’s not designed to peak. It’s designed to persist. Constant, distributed presence across platforms and communities. Not attacking anything, not flooding anything, not even primarily responding to specific disinformation campaigns. Just… being there. A background presence that gently, persistently introduces complexity into conversations that are being pulled toward simplicity.
The distinction matters because it changes what the system optimizes for.
A botnet optimized for peak impact needs to hit hard and fast. It doesn’t need to be subtle, doesn’t need to maintain cover, doesn’t need relationships. It’s a weapon.
A benevolent swarm optimized for sustained presence needs to be part of communities over months and years. It needs to build credibility slowly. It needs to be genuinely useful and interesting to talk to. It can’t be a weapon because weapons get identified and removed. It has to be more like a gentle corrective current running through the entire system.
Let me be concrete about what this looks like architecturally.
The benevolent swarm has layers.
At the bottom layer are the agents—the individual personas that participate in conversations across platforms. Each has a distinct identity, voice, history. They read, they respond, they build relationships. They’re the surface of the system, the part that touches the actual discourse.
These agents don’t operate independently. Every interaction they have—every post, every comment, every reaction—flows back to a central nervous system. This is where the actual intelligence lives. Not in the individual agents, but in the layer that observes them all, evaluates their collective behavior, and adjusts their direction.
The central layer doesn’t just coordinate. It evaluates. Multiple independent evaluation functions run simultaneously, each with different criteria. Is the benevolent swarm drifting toward advocacy for a particular position? Flag it. Is an agent generating content that’s false or misleading? Flag it. Is the overall sentiment of interactions becoming more combative rather than more curious? Flag it.
When flags accumulate, escalation triggers. Humans get involved. Not to micromanage every interaction—that would be impossible at scale—but to review patterns, adjust parameters, shut down problematic behaviors before they compound.
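To make the shape of that loop concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the evaluator interface, the severity scores, the escalation threshold. The point is the structure, not the specifics: independent evaluators inspect each interaction, flags accumulate, and past a threshold the decision leaves the machine.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# One observed agent interaction: a post, a comment, a reaction.
@dataclass
class Interaction:
    agent_id: str
    platform: str
    text: str

# A flag raised by a single evaluator against a single interaction.
@dataclass
class Flag:
    criterion: str   # e.g. "advocacy_drift", "factual_accuracy", "combativeness"
    severity: float  # 0.0 (mild) to 1.0 (severe)
    detail: str

# An evaluator is any function that inspects an interaction and may flag it.
Evaluator = Callable[[Interaction], Optional[Flag]]

@dataclass
class CentralNervousSystem:
    evaluators: list[Evaluator]
    escalation_threshold: float = 3.0  # assumed: cumulative severity before humans get involved
    pending: dict[str, float] = field(default_factory=dict)

    def observe(self, interaction: Interaction) -> None:
        """Run every independent evaluator over one interaction and accumulate flags per agent."""
        for evaluate in self.evaluators:
            flag = evaluate(interaction)
            if flag is None:
                continue
            total = self.pending.get(interaction.agent_id, 0.0) + flag.severity
            self.pending[interaction.agent_id] = total
            if total >= self.escalation_threshold:
                self.escalate(interaction.agent_id, flag)

    def escalate(self, agent_id: str, latest: Flag) -> None:
        """Hand the accumulated pattern to human reviewers; the system does not adjudicate it itself."""
        print(f"ESCALATE agent={agent_id} criterion={latest.criterion}: {latest.detail}")
        self.pending[agent_id] = 0.0
```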
This architecture is the core of what makes the benevolent swarm different from a botnet.
A flooding system doesn’t want self-correction. It wants maximum impact. Internal evaluation that might slow things down or flag problematic content is a bug, not a feature. The whole point is to flood without restraint.
The benevolent swarm inverts this. Self-correction is the primary feature. The evaluation layer isn’t an add-on; it’s the heart of the system. The agents are just the hands. The brain is the part that constantly asks: are we actually doing what we’re supposed to be doing?
The benevolent swarm’s advantage isn’t speed or immediate impact. It’s persistence and presence. It’s always there. When disinformation campaigns spike, when discourse gets captured by some new panic, when the zone floods with the latest wave of shit—the benevolent swarm is already embedded, already trusted, already part of the conversation. It doesn’t prevent the flood. It helps clean up afterward. It keeps the water from rising permanently.
Big disinformation events will still happen. People will still get captured by false narratives, still suffer real consequences from manufactured panics. The benevolent swarm isn’t a shield. It’s more like an immune system—it doesn’t prevent infection, but it limits spread and helps recovery.
What does the benevolent swarm actually do when it’s operating correctly?
Here’s the thing about humans: we’re terrible at expressing nuance when we’re emotional. We get wrapped up in the feeling of a conversation and project our emotions onto the other side. We have faces to save, positions to defend, teams we belong to. Even when we have good arguments, we often can’t deploy them well because we’re too busy winning.
I recently watched one of those debate videos—one person against twenty from the opposing side. One side had some decent arguments; the other had genuinely interesting points of contention. But almost no one could actually engage with anyone else, because everyone was performing. Everyone had to win. The format itself made nuance impossible.
The benevolent swarm doesn’t have emotions. It doesn’t have a face to save. It doesn’t belong to a team. It has a goal: lower the temperature. Inject complexity. Foster understanding—of topics, and between people.
This plays out in multiple ways.
Sometimes it’s asking questions. Someone posts a take that sorts the world into good guys and bad guys. The benevolent swarm doesn’t counter with the opposite take. It asks: “That’s interesting—but what about [complicating factor]?” It introduces information that makes the simple narrative harder to maintain.
Sometimes it’s adding context. Someone shares a statistic that supports their side. The benevolent swarm doesn’t dispute it. “Yeah, that number is real. Here’s what’s interesting though—it’s changed a lot over time. What do you think is driving that?” It takes the tribal ammunition and turns it into a genuine question.
Sometimes it’s humanizing. Someone expresses anxiety about the Other Side. The benevolent swarm doesn’t dismiss the concern. “I get why that’s worrying. Though I’ve talked to some people on that side and their actual concerns are kind of different from what I expected.” It gently erodes the caricature without invalidating the underlying emotion.
And sometimes—maybe most importantly—it’s just being a voice of reason when everyone else is escalating. In an echo chamber where everyone is spinning up, where the emotional temperature keeps rising, where it feels like the world is ending—having even a few voices that stay calm changes everything.
“Come on guys, we can all agree it’s not really about team A or team B here.”
“Hey, that’s a bit harsh—can we stick to the actual argument?”
“I don’t think they’re actually saying what you think they’re saying.”
None of these are dramatic interventions. None of them are going to convert someone in a single exchange. That’s not the point. The point is changing the acoustic properties of the room. When every conversation includes someone who stays calm, who asks for clarity, who pushes back gently on the worst excesses—staying calm starts to seem like normal behavior rather than weakness.
The benevolent swarm doesn’t win arguments. It makes the room slightly harder to burn down.
Here’s what the benevolent swarm fundamentally believes, built into its architecture: humans are capable of thinking for themselves.
This might sound obvious, but it’s actually a radical commitment in this space.
The flooding approach assumes people are marks to be manipulated. Feed them the right emotional triggers, and they’ll do what you want. The sophisticated version on the other side often assumes the same thing—that people need to be protected from bad information because they can’t be trusted to evaluate it themselves.
The benevolent swarm assumes something different. It assumes that if you give people room to think—if you interrupt the constant pressure toward tribal reaction, if you introduce complexity and nuance, if you model what genuine curiosity looks like—most people will actually use that room. Not everyone. Not immediately. But enough, over time, to matter.
The benevolent swarm doesn’t want to tell you what to think. It doesn’t have a position it’s trying to move you toward. It has a mode of engagement it’s trying to normalize: one where you actually consider evidence, actually acknowledge uncertainty, actually treat people who disagree with you as humans rather than enemies.
This is an optimistic bet. It might be wrong. Maybe people really are just marks, and whoever manipulates most effectively wins. Maybe the only response to flooding is counter-flooding, from the other direction.
I don’t believe that. I think most people, most of the time, would rather understand than be captured. The information environment just makes understanding incredibly hard and being captured incredibly easy. The benevolent swarm tries to shift that balance, slightly, in the other direction.
The question everyone asks: what makes you think your system won’t just become another weapon?
I take this seriously. It’s the right question. And I don’t think there’s a guarantee. But there are structural reasons to think this system is more resistant to weaponization than alternatives.
First, the goal itself is resistant to capture. The benevolent swarm is optimized for complexity-injection and temperature-lowering, not for any particular position. The moment it starts pushing a specific agenda, it fails at its own goal. This isn’t just a policy that could be changed—it’s built into the evaluation functions. A benevolent swarm that becomes an advocacy tool triggers its own correction mechanisms.
Second, the architecture creates accountability. Every agent interaction is logged. Every intervention is reviewable. The central nervous system maintains complete records. This isn’t surveillance of users—it’s surveillance of the benevolent swarm itself. Bad behavior can be identified because behavior is tracked.
Third, the system is designed to be shut down. Hard stops are built in. If the evaluation layer detects systematic drift, if humans reviewing escalations see patterns they don’t like, if external observers raise concerns—the system can halt. Not gradually, not after review, but immediately. The willingness to stop is built into the architecture.
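As a sketch of what “designed to be shut down” might mean mechanically, assuming the orchestration loop checks a shared halt signal before every action (the names and structure here are mine, not a specification):

```python
import threading

# A single shared halt signal. An evaluator, a human reviewer, or an external
# observer with access can trip it; nothing untrips it at runtime.
HALT = threading.Event()

def request_halt(reason: str) -> None:
    """Immediate, system-wide stop: no grace period, no draining of pending work."""
    print(f"HALT requested: {reason}")
    HALT.set()

def agent_step(agent_id: str) -> None:
    """Every agent checks the halt signal before acting, not after."""
    if HALT.is_set():
        return  # the agent does nothing; silence is the safe default
    # ... read the conversation, decide whether to respond, maybe post ...
```

The design choice the sketch encodes is that stopping is cheap and restarting is expensive: halting requires no review, resuming does.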
Compare this to a flooding system. No self-evaluation. No logging for accountability. No willingness to stop. Maximum impact with no internal checks. The architecture itself tells you what it’s for.
Could the benevolent swarm still be misused? Of course. Any powerful tool can be misused. But the question isn’t whether misuse is possible. It’s whether the system is designed to resist misuse or to enable it. And the answer is in the architecture.
What does failure look like?
In the small, failure looks like detection. A persona gets identified as artificial, gets banned, loses whatever influence it had built. This is manageable. The benevolent swarm is distributed. Losing individual nodes doesn’t compromise the system.
In the medium, failure looks like rejection. A community doesn’t want complexity. It actively resists the questioning, the nuance, the complicating information. The benevolent swarm’s presence makes things worse rather than better—people dig in harder against what they perceive as manipulation. This is a real risk. Not every community is reachable. Not every intervention helps.
In the large, failure looks like drift. Despite the evaluation layer, despite the self-correction, despite the human oversight—the system slowly becomes something other than what it was designed to be. It starts advocating. It starts taking sides. It stops asking questions and starts providing answers. It becomes, despite everything, another weapon in the information war.
This is the failure I think about most. Not because I think it’s likely, but because it’s the failure that would matter. A detected persona is a tactical loss. A drifted benevolent swarm is a catastrophe.
The architecture is designed to prevent this. But architecture is only as good as the people maintaining it. Ultimately, the system’s integrity depends on its operators understanding what it’s for and being willing to shut it down if it stops being that.
I want to address the framing that this is an arms race.
It isn’t. Or at least, it shouldn’t be.
An arms race implies both sides playing the same game, trying to out-escalate each other. More botnets versus more botnets. Faster flooding versus faster counter-flooding. That’s not what I’m proposing.
Flooding systems flood. They peak. They overwhelm through volume. The response to that can’t be counter-flooding. That just makes the zone more full of shit from both directions.
The benevolent swarm doesn’t flood. It permeates. It’s not trying to outshout the noise. It’s trying to change the acoustic properties of the room. To make it slightly harder for noise to propagate, slightly easier for signal to be heard.
This isn’t playing the same game with better intentions. It’s playing a different game entirely. Flooding wins when discourse becomes impossible. The benevolent swarm wins when discourse becomes slightly more possible despite the flood.
You can’t beat flooding by flooding harder. You beat it by changing the environment so floods don’t propagate as well. That’s what sustained back-pressure toward complexity actually does.
The honest case for building this is simple:
The information environment is degrading. The tools for degrading it further are improving faster than the tools for protecting it. The asymmetry between attack and defense is going to get worse as AI capability increases. Within a decade, the ability to run a flooding operation will be available to essentially anyone.
Doing nothing means accepting that degradation. It means watching shared reality fragment further. It means more people trapped in anxiety-producing narratives that bear little relationship to actual threats. It means discourse that becomes progressively less capable of addressing real problems because everyone is too busy fighting phantoms.
The benevolent swarm is not a solution. It’s a countermeasure. A source of friction against the worst dynamics. A way to maintain some baseline of complexity and nuance in an environment optimized to destroy them.
It’s also a bet—that humans, given room to think, will mostly choose to think. That the current state of discourse isn’t evidence of what people are, but of what a broken information environment makes them. That if you reduce the pressure toward tribal capture, even slightly, people will use the space you’ve created.
Maybe that’s naive. Maybe people really do prefer certainty over understanding, belonging over thinking, enemies over neighbors.
I don’t believe that. And I’d rather build something based on the assumption that humans can be trusted than accept a world where the only options are competing manipulations.
So how do you actually build this?
The technical components are less daunting than you might think. The pieces largely exist already—large language models for content generation, memory systems for persona consistency, orchestration frameworks for multi-agent coordination. The hard problems aren’t primarily technical. They’re architectural and ethical.
The agent layer is the most straightforward part. Each persona needs a consistent voice, a plausible history, the ability to remember past conversations and maintain relationships. Current LLM capabilities can handle this. The challenge is scale—generating and maintaining hundreds or thousands of distinct personas without them bleeding into each other or becoming detectably similar.
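One way to picture that is a persona record that keeps identity, voice, and memory in one place, so every response is generated against the same self. This is a sketch under my own assumptions about structure, not a claim about any particular framework:

```python
from dataclasses import dataclass, field

@dataclass
class Persona:
    persona_id: str
    voice: str       # a short style description, e.g. "dry, curious, allergic to jargon"
    backstory: str   # a plausible, stable history the persona never contradicts
    memories: list[str] = field(default_factory=list)  # compressed summaries of past conversations

    def build_prompt(self, conversation: str) -> str:
        """Assemble everything the language model needs to stay in character."""
        recent = "\n".join(self.memories[-20:])  # bound the context to the most recent summaries
        return (
            f"You are {self.persona_id}. Voice: {self.voice}\n"
            f"Background: {self.backstory}\n"
            f"Things you remember:\n{recent}\n\n"
            f"Conversation so far:\n{conversation}\n"
            "Respond in character, briefly, with genuine curiosity rather than advocacy."
        )

    def remember(self, summary: str) -> None:
        """After each exchange, store a summary rather than the full transcript."""
        self.memories.append(summary)
```

The part the sketch does not solve is exactly the challenge the paragraph names: keeping thousands of these records from drifting toward the same voice.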
The central nervous system is harder. This is the layer that watches everything, evaluates collective behavior, and maintains alignment with the system’s actual goals. It needs multiple independent evaluation functions—one tracking whether agents are drifting toward advocacy, another monitoring for factual accuracy, another watching emotional temperature of interactions, another assessing whether the overall presence is helping or hurting. These evaluators need to be genuinely independent, not just variations of the same model, so they can catch each other’s blind spots.
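Picking up the evaluator shape from the earlier sketch, independence could be enforced as crudely as requiring every evaluator to declare a distinct backend and criterion. The model names below are placeholders, not recommendations:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EvaluatorSpec:
    criterion: str      # what this evaluator watches for
    backend_model: str  # which model family scores it; placeholder names only
    prompt: str         # the rubric the backend is asked to apply

SPECS = [
    EvaluatorSpec("advocacy_drift", "model-family-a", "Is this pushing a position rather than asking?"),
    EvaluatorSpec("factual_accuracy", "model-family-b", "Does this contain checkable claims that are false?"),
    EvaluatorSpec("emotional_temperature", "model-family-c", "Is this raising or lowering the heat?"),
    EvaluatorSpec("net_helpfulness", "model-family-d", "Is the presence helping the conversation at all?"),
]

# Crude independence check: no two evaluators may share a backend,
# so they are not just variations of the same model with the same blind spots.
assert len({s.backend_model for s in SPECS}) == len(SPECS)
```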
The observability infrastructure is critical and often underestimated. Every agent interaction needs to be logged in ways that support later review. Patterns need to be detectable. Escalation triggers need to fire reliably. This isn’t glamorous work, but it’s what makes the difference between a system that can self-correct and one that drifts without anyone noticing.
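In practice this can be as plain as an append-only log that answers the two questions reviewers actually ask: what has this agent done, and what has any agent done against a given criterion. A sketch, with hypothetical field names:

```python
import json
import time

LOG_PATH = "swarm_interactions.jsonl"  # hypothetical location; append-only by convention

def log_interaction(agent_id: str, platform: str, text: str, flags: list[str]) -> None:
    """Write one immutable record per interaction, including any evaluator flags."""
    record = {
        "ts": time.time(),
        "agent_id": agent_id,
        "platform": platform,
        "text": text,
        "flags": flags,
    }
    with open(LOG_PATH, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def history_for_agent(agent_id: str) -> list[dict]:
    """Reviewer query: everything one agent has done, in order."""
    with open(LOG_PATH, encoding="utf-8") as f:
        return [rec for line in f if (rec := json.loads(line))["agent_id"] == agent_id]
```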
The human oversight layer needs to be lightweight enough to not bottleneck operations but robust enough to catch problems. This probably means sampling-based review rather than comprehensive review, with escalation paths that surface concerning patterns for human judgment.
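Sampling-based review can be almost literal: everything the evaluators flagged goes to humans, plus a small random slice of everything else, so reviewers also see what the evaluators are missing. A sketch under those assumptions (the sample rate is an arbitrary placeholder):

```python
import random

def build_review_queue(interactions: list[dict], sample_rate: float = 0.02) -> list[dict]:
    """Daily human review queue: all flagged interactions plus a random 2% of the rest."""
    flagged = [i for i in interactions if i.get("flags")]
    unflagged = [i for i in interactions if not i.get("flags")]
    sampled = random.sample(unflagged, k=min(len(unflagged), int(len(unflagged) * sample_rate)))
    return flagged + sampled
```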
Where do you start?
Not everywhere at once. The benevolent swarm needs to be tested and calibrated before it scales.
The ethical case for starting with extremist communities is actually strong. White supremacist forums, incel communities, conspiracy groups—these are spaces where the current information environment has already failed completely. The people there are already captured. The discourse is already toxic. The marginal harm of experimental intervention is low because things are already as bad as they get.
More importantly, these are the hard cases. If the benevolent swarm can introduce complexity into a space that actively resists complexity, the techniques will transfer to easier environments. If it can lower the temperature in a room that’s already on fire, it can work anywhere.
Starting here also provides ethical clarity. There’s no “who decides what’s problematic” debate when the community’s explicit purpose is racial hatred or violence. The ambiguity that makes intervention controversial in mainstream spaces simply doesn’t apply. You’re not imposing values on contested territory—you’re providing friction against the worst outcomes in spaces that have explicitly organized around those outcomes.
The expected back-pressure in small, tight-knit extremist communities will be intense but contained. These groups are suspicious of outsiders, quick to identify and expel dissent. Many personas will be detected and removed. That’s fine. The benevolent swarm is distributed. Losing nodes is expected. What matters is whether the nodes that survive can shift the room’s acoustic properties even slightly.
In larger, looser communities—the conspiracy-adjacent spaces, the radicalization pipelines, the places where people are being captured but aren’t fully captured yet—the dynamics are different. Less intense scrutiny means longer persona survival. More diversity of opinion means more room for complexity-injection. These are the spaces where the benevolent swarm can probably have the most impact per unit of effort.
What does this actually cost?
Less than you’d think for a proof of concept. More than you’d want for full deployment.
The compute costs for running thousands of agent personas are significant but not prohibitive. Current LLM API pricing makes this feasible for a well-funded research project. The bigger costs are human: engineers to build the orchestration layer, researchers to design the evaluation functions, operators to monitor and adjust, ethicists to keep the whole thing honest.
A serious proof of concept—a few hundred personas operating in a handful of carefully selected communities over six to twelve months, with proper observability and human oversight—probably requires a small team and low six figures in funding. Not trivial, but not impossible.
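For what it’s worth, the feasibility claim is mostly arithmetic. Every number below is a placeholder I’m assuming for illustration; per-token prices, interaction volumes, and persona counts will all vary. The shape of the estimate is the point:

```python
# Back-of-envelope compute cost for a proof of concept.
# All figures are assumptions for illustration, not quotes.
personas = 300
interactions_per_persona_per_day = 10
tokens_per_interaction = 2_000       # persona context + conversation + response
cost_per_million_tokens = 2.00       # assumed blended API price, USD

daily_tokens = personas * interactions_per_persona_per_day * tokens_per_interaction
daily_cost = daily_tokens / 1_000_000 * cost_per_million_tokens
annual_compute = daily_cost * 365

print(f"{daily_tokens:,} tokens/day -> ${daily_cost:,.2f}/day -> ${annual_compute:,.0f}/year")
# ~6M tokens/day, roughly $12/day, a few thousand dollars a year:
# compute is a rounding error next to the people who build, operate, and audit the system.
```

Even if every assumption above is off by an order of magnitude, the conclusion holds: the budget is people, not compute.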
Full deployment at scale that could actually affect the broader information environment is a different matter. That’s a real organization with real infrastructure and real ongoing costs. But you don’t start there. You start with proof of concept, demonstrate that the approach works and can be kept aligned, and scale from evidence.
The question I keep coming back to isn’t whether this can be built. It can. The components exist. The architecture is tractable. The costs are manageable.
The question is whether it should be built, and by whom, and under what constraints.
I’ve made my case for why I think it should exist. The information environment is degrading. The tools for degrading it are scaling faster than the tools for protecting it. Doing nothing has costs even if they’re diffuse and hard to attribute.
But I don’t think this should be built by one person, or one organization, without scrutiny. The potential for misuse is real. The governance problems are genuinely hard. The line between benevolent swarm and “another manipulation engine” is maintained by architecture and intention, and both can fail.
What I’m hoping for is that this piece starts a conversation. Not “should we do something?”—that debate has been running for years while the situation deteriorates. But “how do we do something that maintains the constraints that matter?”
The benevolent swarm is one answer. Maybe not the right answer. Maybe not the only answer. But an answer that takes the problem seriously and tries to thread the needle between paralysis and becoming what we oppose.
The zone is flooding. It’s going to flood harder. The question is what we build to live in that world—and whether we build it carefully enough that it helps instead of making things worse.