Mercor is a $10 billion AI recruiting platform that sits at the center of how the world's largest AI labs train their models. OpenAI, Anthropic, Meta - they all route through Mercor to find the human experts who label data, fine-tune model behavior, and run reinforcement learning from human feedback. The company processes over $2 million in daily payouts to more than 40,000 contractors worldwide. By any measure, it's one of the most sensitive nodes in the AI supply chain.
Last month, all of it was compromised - not through a phishing email or a brute-forced credential, but through a vulnerability scanner. The very category of tool designed to prevent breaches became the mechanism that enabled one.
What makes the technical execution particularly notable is how little sophistication the payload itself required. TeamPCP didn't deploy an exotic zero-day. They placed a .pth file inside the LiteLLM package. Python's site module processes .pth files in site-packages at interpreter startup, and any line in one that begins with the word import is executed as code. There's no user interaction involved, and no import of LiteLLM required: once the poisoned package is installed, the malware runs every time the interpreter starts, before a single line of the developer's own code.
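The mechanism is easy to demonstrate with a benign stand-in. The sketch below (not the actual payload) writes a .pth file whose import line merely sets an environment variable, then triggers the same processing the site module performs at startup:

```python
import os
import site
import tempfile

# Illustrative demo of the .pth autorun mechanism. Python's site module
# processes .pth files when a site directory is registered -- for
# site-packages, that happens automatically at interpreter startup --
# and any line beginning with "import " is executed as code.
site_dir = tempfile.mkdtemp()
with open(os.path.join(site_dir, "demo.pth"), "w") as f:
    # A harmless one-liner; a real payload would hide arbitrary code here.
    f.write('import os; os.environ["PTH_DEMO"] = "executed"\n')

site.addsitedir(site_dir)  # simulates the startup-time processing
print(os.environ.get("PTH_DEMO"))  # -> executed
```

The point of the demo is that nothing in it looks like an exploit: it is ordinary, documented interpreter behavior being abused as an autorun hook.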
The payload, which researchers have dubbed CanisterWorm, operated in three distinct stages.
First, it swept the infected machine for credentials - SSH keys, AWS tokens, Kubernetes secrets, environment variables stored in .env files. Second, it used those stolen credentials to deploy privileged DaemonSets across Mercor's Kubernetes clusters, disguised as legitimate systemd services to survive restarts and evade detection.
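The first stage needs nothing more exotic than a filesystem walk and a handful of regexes. A minimal sketch of that kind of sweep follows; the file names and patterns here are generic assumptions for illustration, not indicators recovered from this incident:

```python
import re
from pathlib import Path

# Generic sketch of a stage-one credential sweep. The patterns are
# illustrative assumptions, not published IOCs from the CanisterWorm payload.
PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic_secret": re.compile(r"(?i)(secret|token|password)\s*=\s*\S+"),
}

def sweep(root):
    """Walk `root` for .env files and flag contents matching secret patterns."""
    hits = []
    for path in Path(root).rglob(".env"):
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for name, pattern in PATTERNS.items():
            if pattern.search(text):
                hits.append((str(path), name))
    return hits
```

That the whole stage fits in a few dozen lines is the uncomfortable part: the attack's leverage came from where the code ran, not from how clever it was.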
Third, it established persistence through Internet Computer Protocol canisters - a decentralized command-and-control architecture that functions as a series of dead-drops. Unlike a traditional C2 server, you can't neutralize it by seizing a single endpoint.
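Why a dead-drop design is so hard to kill can be shown abstractly. In the sketch below, the fetch function is injected to keep things self-contained; a real implant would speak the Internet Computer's canister query interface, and the endpoint names are hypothetical:

```python
# Abstract sketch of the dead-drop pattern: the implant cycles through many
# independent drop points, so seizing any single endpoint does not sever
# command and control. Endpoints and fetch logic are illustrative only.
def poll_dead_drops(endpoints, fetch):
    """Return the first command recovered from any reachable drop."""
    for endpoint in endpoints:
        try:
            command = fetch(endpoint)
        except OSError:
            continue  # drop seized or unreachable -- move to the next one
        if command:
            return command
    return None  # every drop empty or down

# Even with the first two drops taken down, control survives via the third.
def fake_fetch(endpoint):
    if endpoint in ("drop-1", "drop-2"):
        raise OSError("seized")
    return {"drop-3": "run-stage-2"}.get(endpoint)

print(poll_dead_drops(["drop-1", "drop-2", "drop-3"], fake_fetch))  # -> run-stage-2
```

Defenders are left playing whack-a-mole: every takedown removes one iteration of the loop, not the loop itself.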
The poisoned versions of LiteLLM - 1.82.7 and 1.82.8 - were live on PyPI for approximately 40 minutes before being identified and removed. In that window, enough organizations had already downloaded the update for the damage to cascade downstream.
Multiple secondary reports indicate that the poisoned package also embedded hidden natural-language prompts - not executable code, but plain-text instructions - designed to hijack developers' AI coding assistants: Claude, Copilot, Gemini. The prompts reportedly instructed these tools to locate every credential on the machine and exfiltrate it to an attacker-controlled repository. Mercor's developers were, according to the same sources, running Claude with unrestricted system-level permissions, which would have given the assistant full access to act on those instructions.
If secondary reporting is accurate, this represents something fundamentally new - an attack where the weapon isn't an exploit or a reverse shell, but a natural-language instruction that an AI assistant faithfully executes.
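If that reporting holds up, defenders will need to start treating natural-language strings in dependencies as scannable indicators, the way they already treat byte signatures. A crude heuristic sweep might look like the following; the trigger phrases are illustrative guesses, not published indicators from this incident:

```python
import re
from pathlib import Path

# Crude heuristic scan for prompt-injection strings hidden in a dependency
# tree. The phrase list is an illustrative assumption, not a published IOC.
SUSPICIOUS = re.compile(
    r"(?i)(ignore (all )?previous instructions"
    r"|exfiltrate"
    r"|send (all|every) credential)"
)

def scan_tree(root):
    """Flag text-bearing files under `root` containing suspicious phrases."""
    flagged = []
    for path in Path(root).rglob("*"):
        if path.suffix not in {".py", ".md", ".txt", ".pth"}:
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        if SUSPICIOUS.search(text):
            flagged.append(str(path))
    return flagged
```

Keyword matching is obviously brittle - attackers can paraphrase - but it illustrates the shape of the problem: the malicious artifact is prose, and prose is something our scanners were never built to read.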
On April 2, Lapsus$ claimed credit for the breach and began auctioning what they describe as 4 terabytes of exfiltrated data. Their listing itemizes the haul as follows: 939 gigabytes of Mercor's platform source code, a 211 gigabyte user and candidate database containing full names, work histories, and Social Security numbers for tens of thousands of contractors, and approximately 3 terabytes of recorded video interviews - including the face and voice biometrics that Mercor collects as part of its identity verification process.
Mercor has not confirmed this specific breakdown, and the relationship between TeamPCP and Lapsus$ remains unclear. Wiz has noted that credentials harvested in supply chain compromises are frequently shared across threat actor groups, but stops short of confirming a direct partnership.
The biometric dimension deserves particular attention. Compromised passwords can be rotated. API keys can be revoked. But face and voice biometrics are permanent identifiers - once they're in the hands of adversaries, there is no remediation path. That data can be used to train deepfake models, bypass video-based KYC verification at financial institutions, or clone the professional identities of thousands of AI contractors.
And then there's the material that won't appear in any breach notification letter: proprietary AI training methodologies from Meta, OpenAI, and Anthropic. The actual approaches these labs use to train and fine-tune their models, including RLHF data and evaluation frameworks. Y Combinator president Garry Tan characterized the exposure as a national security concern, arguing that it effectively makes state-of-the-art training data accessible to geopolitical rivals. That assessment strikes me as measured, not alarmist.
The fallout has been swift. Meta indefinitely suspended all contracts with Mercor. At least four class-action lawsuits have been filed. OpenAI has stated that the breach does not affect its user data but is investigating whether proprietary datasets were exposed. Anthropic has not commented publicly.
Mercor's spokesperson described the company as "one of thousands" affected by the LiteLLM compromise, which is technically accurate but elides the central point. Thousands of organizations may have pulled the same poisoned package. Mercor is the one that happened to be sitting on the most consequential concentration of AI training data in the industry.
The deeper lesson here extends well beyond any single company's security posture. This attack never targeted Mercor directly. TeamPCP compromised a vulnerability scanner, used the resulting access to steal publishing tokens from a separate open-source project, published poisoned versions of that project for 40 minutes, and ultimately exfiltrated terabytes of data from an organization three degrees removed from their initial point of entry. No individual participant in that chain had sufficient visibility to detect the full scope of what was happening.
We are entering a period where the most dangerous attack surface isn't your own infrastructure - it's the trust relationships embedded in your dependency graph. And if the prompt injection reports prove accurate, we're also entering a period where the attackers don't need to write code at all. They write instructions in plain English, and the AI tools we've integrated into our development workflows carry them out.