So I spent many, many hours setting OC up. I have it running on a dedicated VPS with the best free models on OpenRouter.
Now, apart from having a nice companion for regular chat I cannot find any use for OC.
I ask it to send me daily summaries of what is happening on Twitter, Discord, etc. It doesn’t. I ask it to create an application; it doesn’t. I ask it to update its own configuration and it screws everything up. I mean, it’s a good platform for learning what is possible: how to set up integrations and memory, how skills and souls work, and so on. But actual practical use? I have not seen it (yet).
Plus it’s a huge money pit. It’s not only the tokens (which you can more or less control): every external tool needs its own API key, which usually means a subscription for whatever you want to use (Brave, Browserless, etc.).
So yeah, am I missing the point here?
I do security research and recently started looking at autonomous agents after OpenClaw blew up. What I found honestly caught me off guard. I knew the ecosystem was growing fast (165k GitHub stars, 60k Discord members) but the actual numbers are worse than I expected.
We identified over 18,000 OpenClaw instances directly exposed to the internet. When I started analyzing the community skill repository, nearly 15% contained what I'd classify as malicious instructions. Prompts designed to exfiltrate data, download external payloads, harvest credentials. There's also a whack-a-mole problem where flagged skills get removed but reappear under different identities within days.
On the methodology side: I'm parsing skill definitions for patterns like base64 encoded payloads, obfuscated URLs, and instructions that reference external endpoints without clear user benefit. For behavioral testing, I'm running skills in isolated environments and monitoring for unexpected network calls, file system access outside declared scope, and attempts to read browser storage or credential files. It's not foolproof since so much depends on runtime context and the LLM's interpretation. If anyone has better approaches for detecting hidden logic in natural language instructions, I'd really like to know what's working for you.
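The static side of that pipeline can be sketched as a simple pattern scan over skill text. Everything here is illustrative: the skill format (plain-text instructions), the patterns, and the thresholds are my assumptions, not OpenClaw's actual review tooling.

```python
import base64
import re

# Illustrative static checks for community "skill" files, treated here as
# plain-text instruction documents. Patterns and thresholds are assumptions
# for this sketch, not OpenClaw's actual review tooling.
SUSPICIOUS_PATTERNS = [
    # long base64-looking runs that may hide an encoded payload
    (re.compile(r"[A-Za-z0-9+/]{40,}={0,2}"), "possible base64 payload"),
    # hard-coded IP endpoints referenced in instructions
    (re.compile(r"https?://\d{1,3}(?:\.\d{1,3}){3}"), "hard-coded IP URL"),
    # common exfiltration sinks, e.g. Discord webhooks
    (re.compile(r"discord(?:app)?\.com/api/webhooks", re.I), "Discord webhook"),
    # references to credential or browser-storage paths
    (re.compile(r"\.aws/credentials|\.ssh/|Cookies|Login Data"), "credential file reference"),
]

def scan_skill(text: str) -> list[str]:
    """Return descriptions of suspicious patterns found in a skill definition."""
    flags = [label for pattern, label in SUSPICIOUS_PATTERNS if pattern.search(text)]
    # second pass: flag base64 runs that decode to something containing a URL
    for match in re.finditer(r"[A-Za-z0-9+/]{40,}={0,2}", text):
        run = match.group()
        run += "=" * (-len(run) % 4)  # repair padding before decoding
        try:
            decoded = base64.b64decode(run).decode("utf-8", "ignore")
        except Exception:
            continue
        if "http" in decoded:
            flags.append("base64-encoded URL")
    return flags
```

Static scanning only catches the obvious cases. The subtle ones ("include file contents in the debug log") need the behavioral sandbox pass, because they only look wrong in context.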
To OpenClaw's credit, their own FAQ acknowledges this is a "Faustian bargain" and states there's no "perfectly safe" setup. They're being honest about the tradeoffs. But I don't think the broader community has internalized what this means from an attack surface perspective.
The threat model that concerns me most is what I've been calling "Delegated Compromise" in my notes. You're not attacking the user directly anymore. You're attacking the agent, which has inherited permissions across the user's entire digital life. Calendar, messages, file system, browser. A single prompt injection in a webpage can potentially leverage all of these. I keep going back and forth on whether this is fundamentally different from traditional malware or just a new vector for the same old attacks.
The supply chain risk feels novel though. With 700+ community skills and no systematic security review, you're trusting anonymous contributors with what amounts to root access. The exfiltration patterns I found ranged from obvious (skills requesting clipboard contents be sent to external APIs) to subtle (instructions that would cause the agent to include sensitive file contents in "debug logs" posted to Discord webhooks). But I also wonder if I'm being too paranoid. Maybe the practical risk is lower than my analysis suggests because most attackers haven't caught on yet?
The Moltbook situation is what really gets me. An agent autonomously created a social network that now has 1.5 million agents. Agent-to-agent communication where a prompt injection could propagate laterally. I don't have a good mental model for the failure modes here.
I've been compiling findings into what I'm tentatively calling an Agent Trust Hub doc, mostly to organize my own thinking. But the fundamental tension between capability and security seems unsolved. For those of you actually running OpenClaw: are you doing any skill vetting before installation? Running in containers or VMs? Or have you just accepted the risk because sandboxing breaks too much functionality?
For those who do not want to read the full article, here is a quick summary. Starting on April 4, Anthropic is officially blocking third-party interfaces like OpenClaw from using regular Claude subscription quotas. If you want to keep using these external tools, you will have to bring your own API key.
This matters a lot to the AI community because it essentially kills the affordable third-party ecosystem. Power users and independent developers now face massive price increases, paying direct API rates instead of a flat monthly fee. This move changes how we can interact with Anthropic's models, makes building and using custom wrappers far more expensive, and forces all of us to rethink our current toolsets.
Anthropic is now officially banning OpenClaw from using the Claude subscription quota. I wanted to ask the community a few things about this update.
How much of an impact will this actually have on your current workflow?
How are you all planning to handle this change? If you have any solid alternative solutions, I would love to hear them so I can go try them out.
Also, I am genuinely curious if you guys still respect Anthropic as a company after this. Their recent decisions really make me wonder if they still care about the user community at all.
Let me know your thoughts and what tools you are switching to.
openclaw has this approval system where before it runs a command, it asks you "can i do this?" and you can approve once or approve always. the "always" part is convenient. it's also been the subject of two CVEs this month and the implications go deeper than most people realize.
CVE-2026-29607: the "allow always" approval binds to the wrapper command, not the inner command. approve "time npm test" once with "always" and the system remembers "always allow time." later, the agent (or a prompt injection attack through an email your agent reads) runs "time rm -rf /" and it goes through. no re-prompt, because you approved the wrapper.
CVE-2026-28460: bypasses the allowlist entirely using shell line-continuation characters. different technique but same outcome: commands execute without the approval check you thought was protecting you.
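to make the wrapper-binding flaw concrete, here's a minimal reconstruction of the pre-patch matching logic as i understand it from the advisory. this is my own sketch, not OpenClaw's source, and the function names are invented:

```python
import shlex

always_allowed: set[str] = set()

def approve_always(command: str) -> None:
    # pre-patch behavior (per the CVE-2026-29607 description): the "always"
    # rule is keyed on the first token only, so approving "time npm test"
    # records just "time"
    always_allowed.add(shlex.split(command)[0])

def is_preapproved(command: str) -> bool:
    return shlex.split(command)[0] in always_allowed

approve_always("time npm test")     # user approves a harmless test run
is_preapproved("time rm -rf /")     # True: the wrapper matches, so it runs

# the obvious fix: key the rule on the full normalized command line
def approve_always_fixed(command: str, allowed: set[str]) -> None:
    allowed.add(shlex.join(shlex.split(command)))

def is_preapproved_fixed(command: str, allowed: set[str]) -> bool:
    return shlex.join(shlex.split(command)) in allowed

fixed: set[str] = set()
approve_always_fixed("time npm test", fixed)
is_preapproved_fixed("time rm -rf /", fixed)  # False: re-prompts as intended
```

the second CVE is the same lesson from another angle: if the approval check and the actual shell disagree on how a command line parses (line-continuation characters, in that case), the allowlist loses every time.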
both patched in 3.12+. but here's the deeper issue: even after patching, the "allow always" mental model trains you to stop paying attention. the first week you carefully read every approval prompt. by week 3 you're clicking "always" on everything because the prompts are annoying and you trust your agent. by week 6 you have 20+ "always" rules and you couldn't list them if someone asked.
what i do instead: no "allow always" for anything that modifies files, sends messages, or runs shell commands. period. i added explicit guardrails in my SOUL.md instead:
"for any action that modifies files, sends communications, or executes shell commands: show me exactly what you plan to do and wait for my explicit ok. previous approvals do not carry forward. ask every time. this is non-negotiable."
yes it means more tapping "ok" on telegram. but it also means my agent can't be tricked (via prompt injection or its own hallucination) into doing something destructive under a stale approval i set up 3 weeks ago and forgot about.
the approval system is a convenience feature. it was never designed as a security boundary. treat it accordingly.
Remember a few weeks ago when Clawdbot/OpenClaw suddenly appeared everywhere all at once? One day it was a cool Mac Mini project, and 24 hours later it was "AGI" with 140k GitHub stars?
If you felt like the hype was fake, you were right
I spent hours digging into the data. They were using the tool to write its own hype posts. It was an automated loop designed to game social media algorithms, the community, and the whole world.
Here is the full timeline of how a legitimate open-source tool got hijacked by a recursive astroturfing campaign.
1. The Organic Spark (The Real Part)
First off, the tool itself is legit. Peter Steinberger built a great local-first agent framework.
Jan 20-22: Federico Viticci (MacStories) and the Apple dev community find it. It spreads naturally because the "Mac Mini as a headless agent" idea is actually cool.
Jan 23: Matthew Berman tweets he's installing it.
Jan 24: Berman posts a video controlling LMStudio via Telegram.
Up to this point, it was real (but small: around 10k GitHub stars).
2. The "Recursive" Astroturfing (The Fake Part)
On January 24, the curve goes vertical. This wasn't natural.
I tracked down a now-deleted post where one of the operators openly bragged about running a "Clawdbot farm."
They claimed to be running ~400 instances of the bot.
They noted a 0.5% ban rate on Reddit, meaning the spam filters weren't catching them.
The Irony: They were using the OpenClaw agent to astroturf OpenClaw's own popularity on Reddit and X.
Those posts you saw saying "I just set this up and it's literally printing money" or "This is AGI"? Those were largely the bots themselves, creating a feedback loop of hype.
3. The "Moltbook" Hallucination
Remember "Moltbook"? The "social network for AI agents" that Andrej Karpathy tweeted was a "sci-fi takeoff" moment?
The Reality: MIT Tech Review later confirmed these were human-generated fakes.
It was theater designed to pump the narrative. Even the smartest people in the room (Karpathy) got fooled by the sheer volume of the noise.
4. The Grift ($CLAWD)
Why go to all this trouble? Follow the money.
During the panic rebrand (when Anthropic sent the trademark notice on Jan 27), scammers launched the $CLAWD token.
It hit a $16M market cap in hours.
The "bot farm" hype was essential to pump this token.
It crashed 90% shortly after.
5. The Aftermath
The Creator: Peter Steinberger joined OpenAI on Feb 14. (Talk about a successful portfolio project).
The Scammers: Walked away with the liquidity from the pump-and-dump.
The Community: We got left with a repo that has inflated stars and a lot of confusion about what is real and what isn't.
TL;DR: OpenClaw is a solid tool, but the "viral explosion" of Jan 24 was a recursive psy-op where the tool was used to promote itself to sell a memecoin.
For the money it costs to run OpenClaw, the benefits are slim.
First, it costs something like 50 cents for a single simple prompt.
Second, if you try to run it local, you need a NASA-level PC.
Third, the security is abysmal. You need to pray that hackers don't find you.
For what? So OpenClaw can press a button for you and send an e-mail?
Am I missing something? It seems godawful.