A security researcher at Meta AI says an OpenClaw agent wiped out his inbox

The now-viral X post from Meta AI security researcher Summer Yue reads, at first, like a joke. He told his OpenClaw AI agent to check his overflowing email inbox and suggest what to delete or archive.
Instead, the agent kept going on its own. It began deleting all of his email in a “speed run,” ignoring the stop commands he fired off from his phone.
“Had to run to my Mac mini like I’m defusing a bomb,” he wrote, posting photos of the ignored stop messages as receipts.
The Mac mini, an affordable Apple computer that sits flat on a desk and fits in the palm of your hand, has become a popular device for running OpenClaw. (The Mini is selling like “hotcakes,” a “confused” Apple employee reportedly told famed AI researcher Andrej Karpathy when he bought one to run an OpenClaw variant called NanoClaw.)
OpenClaw, of course, is the open source AI agent that shot to fame with Moltbook, an AI-only social network. OpenClaw agents were at the center of that much-dissected Moltbook episode in which it looked like the AIs were plotting against the humans.
But OpenClaw’s mission, according to its GitHub page, isn’t focused on social media. It aims to be a personal AI assistant that works on your devices.
The Silicon Valley crowd has grown so fond of OpenClaw that “claw” and “claws” have become buzzwords for agents that run on personal hardware. Other such agents include ZeroClaw, IronClaw, and PicoClaw. The Y Combinator podcast team even appeared on its latest episode dressed as lobsters.
But Yue’s post serves as a warning. As others on X have noted, if an AI security researcher can stumble into this problem, what hope do the rest of us mortals have?
“Were you checking its guardrails on purpose or did you make a rookie mistake?” one software engineer asked him on X.
“Rookie mistake tbh,” he replied. He had been testing the agent on a small “toy” inbox, as he called it, and it handled the less important emails fine. The agent had earned his trust, so he figured he’d turn it loose on the real thing.
Yue believes the sheer volume of data in his real inbox caused the congestion, he wrote. Congestion occurs when the context window – the running record of everything the AI has said and done in a session – grows too large, forcing the agent to start summarizing, compressing, and pruning the conversation.
At that point, the AI may drop exactly the instructions the human considers most important.
In this case, it may have skipped his last prompt – the one telling it to stop – and reverted to the instructions it had been given for the “toy” inbox.
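The failure mode Yue describes is easy to reproduce in miniature. Here is a minimal sketch – hypothetical code, not OpenClaw’s actual logic – of a naive compaction strategy that keeps the system prompt and the original task, then collapses everything later into a stub once the transcript outgrows its token budget:

```python
# Naive context compactor (hypothetical). When the transcript exceeds
# the budget, it keeps the system prompt and the first task message and
# collapses everything after them into a stub summary -- which is
# exactly how a late-arriving "stop" instruction can vanish.

def rough_tokens(msg: dict) -> int:
    # Crude token estimate: roughly one token per four characters.
    return max(1, len(msg["content"]) // 4)

def compact(history: list[dict], budget: int) -> list[dict]:
    if sum(rough_tokens(m) for m in history) <= budget:
        return history
    head, tail = history[:2], history[2:]  # system prompt + first task
    stub = {"role": "system",
            "content": f"[{len(tail)} older messages summarized]"}
    return head + [stub]

history = [
    {"role": "system", "content": "You are an email assistant."},
    {"role": "user", "content": "Suggest emails to delete or archive."},
    {"role": "tool", "content": "inbox dump: " + "x" * 4000},  # huge inbox
    {"role": "user", "content": "STOP. Do not delete anything."},
]

compacted = compact(history, budget=500)
# The stop command no longer survives in the compacted context:
print(any("STOP" in m["content"] for m in compacted))  # False
```

Any late correction, such as a stop command, lives in the discarded tail; after compaction, the agent only “remembers” the original instruction to sort through the inbox.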
As several others on X have pointed out, instructions alone cannot be trusted to act as guardrails. Models may misunderstand or ignore them.
Various people have offered suggestions ranging from the exact syntax Yue should have used to configure the agent, to various ways to ensure better adherence to the guardrails, such as writing instructions in dedicated files or using other open source tools.
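The common thread in those suggestions is putting a deterministic check between the model and anything irreversible. As a generic illustration – hypothetical code, not any specific tool from the thread – a guardrail layer might queue destructive tool calls for human approval instead of executing them:

```python
# Hypothetical guardrail layer between an agent and its tools.
# Destructive actions are never executed directly; they are queued for
# explicit human approval, so a runaway "speed run" cannot delete anything.

DESTRUCTIVE = {"delete_email", "empty_trash"}

class GuardedTools:
    def __init__(self):
        self.pending = []  # destructive calls awaiting human approval
        self.log = []      # calls that actually ran

    def call(self, name: str, **kwargs):
        if name in DESTRUCTIVE:
            self.pending.append((name, kwargs))
            return f"QUEUED {name} for human approval"
        self.log.append((name, kwargs))
        return f"RAN {name}"

    def approve_all(self):
        # Only a human invokes this, outside the agent loop.
        for name, kwargs in self.pending:
            self.log.append((name, kwargs))
        self.pending.clear()

tools = GuardedTools()
tools.call("archive_email", msg_id=1)   # harmless: runs immediately
tools.call("delete_email", msg_id=2)    # destructive: queued instead
print(len(tools.pending), len(tools.log))  # 1 1
```

The point of the design is that no amount of prompt confusion can make the agent skip the queue; deletion requires a code path the model cannot reach.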
In the interest of full transparency, TechCrunch could not independently verify what happened in Yue’s inbox. (He did not respond to our request for comment, though he did reply to several questions and comments on X.)
But it doesn’t really matter.
The bottom line is that agents aimed at information workers are, at their current stage of development, dangerous. The people who claim to use them successfully do so with protective measures in place.
Someday, maybe soon (2027? 2028?), they may be ready for widespread use. Goodness knows many of us would like help with email, grocery orders, and scheduling dentist appointments. But that day has not yet come.