U.S. Government to Test AI Models from Google, Microsoft, and xAI Before Public Release
By Sara Montes de Oca · 6 hours ago · 2 min read

The Center for AI Standards and Innovation announced agreements Tuesday with Google DeepMind, Microsoft, and Elon Musk's xAI that will allow federal officials to evaluate artificial intelligence models before they reach the public — a step that signals the Trump administration's deepening role in AI oversight.
CAISI, which operates under the U.S. Department of Commerce, said it will "conduct pre-deployment evaluations and targeted research to better assess frontier AI capabilities and advance the state of AI security," according to a release from the center.
The agreements extend an existing framework. CAISI struck similar partnerships with OpenAI and Anthropic in 2024, and those earlier agreements have since been renegotiated to reflect directives from Commerce Secretary Howard Lutnick and the administration's AI Action Plan, the center said.
Beyond CAISI's formal agreements, the White House has been weighing the creation of a new AI working group that would explore broader oversight procedures, including pre-release vetting of models, according to a source close to the discussions who asked not to be named because the details are confidential. The group would bring together tech executives and government officials, and may be established through an executive order.
The White House told CNBC that discussion about potential executive orders is speculation, and that any policy announcement will come directly from President Donald Trump.
The administration's renewed attention to AI oversight follows, in part, Anthropic's announcement last month of a powerful new model, Claude Mythos Preview. According to Anthropic, Mythos excels at identifying weaknesses and security flaws in software, prompting the company to limit its rollout to a select group of companies through a new cybersecurity initiative called Project Glasswing.
Anthropic CEO Dario Amodei subsequently met with senior members of the Trump administration at the White House to discuss the model — a meeting that took place even after the Defense Department had declared Anthropic a supply chain risk, characterizing the company as a potential threat to U.S. national security. Both the White House and Anthropic described that meeting as "productive."
Tuesday's announcement reflects a broader tension in Washington over how to govern AI systems that are growing more capable — and, in some cases, more consequential for national security — while the industry moves quickly toward public deployment.
With agreements now covering five of the most prominent AI developers — OpenAI, Anthropic, Google DeepMind, Microsoft, and xAI — CAISI's pre-deployment evaluation network is expanding at a pace that suggests the framework could become a standard checkpoint for frontier models ahead of release. Whether the forthcoming working group will formalize that process through executive action remains an open question.