The guardrails appear to be coming off at Anthropic, the AI lab founded by former OpenAI employees who once championed responsible AI development. The company has released a new version of its Responsible Scaling Policy (RSP) that weakens one of its cornerstone safety pledges: the promise to halt progress on new models if their capabilities outpace internal safety mechanisms.
“We’re releasing the third version of our Responsible Scaling Policy (RSP), the voluntary framework we use to mitigate catastrophic risks from AI systems,” Anthropic announced.
Anthropic’s safety shift: what’s changed?
Under the earlier framework, Anthropic was required to pause or delay the training of more advanced models if their abilities exceeded the company’s safety controls. That clause, however, has been removed.
In its statement to Business Insider, Anthropic said that the decision reflected “heightened competition” and a lack of effective regulation across the AI sector.
The company confirmed it would “no longer abide by its commitment to pause the scaling and/or delay the deployment of new models” even when development surpasses existing safety measures.
In its updated policy, Anthropic said the broader environment around AI regulation has shifted significantly, with governments prioritising competitiveness and economic gains over risk management.
“The current policy environment has shifted toward prioritising AI competitiveness and economic growth, while safety-oriented discussions have yet to gain meaningful traction at the federal level,” the company explained.
Chief Science Officer Jared Kaplan elaborated on the reasoning in an interview with Time Magazine, noting that the original policy had fallen behind the pace of technological change.
“We felt that it wouldn’t actually help anyone for us to stop training AI models,” Kaplan said. “We didn’t really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”
Balancing competition, safety, and government pressure
Anthropic emphasised that the update does not signal abandonment of its safety mission. The company said it remains “convinced” that meaningful government involvement in AI safety is “both necessary and achievable” but added that such efforts have been slower than expected.
“Effective government engagement on AI safety is both necessary and achievable,” Anthropic said. “But it’s proving to be a long-term project, not something that is happening organically as AI becomes more capable or crosses certain thresholds.”
Going forward, the company said it will continue to make recommendations for AI safety across the industry but will separate those guidelines from its own internal development plans.
The timing of Anthropic’s policy revision has drawn attention. It comes amid tensions with the US Department of Defense, following reports that Defence Secretary Pete Hegseth had issued CEO Dario Amodei a Friday deadline to reinstate some AI safeguards. Failure to comply could jeopardise Anthropic’s $200 million defence contract and potentially see the firm placed on a government blacklist.
However, a source familiar with the matter told media outlets that the timing of the policy change was unrelated to the Pentagon dispute.
Anthropic’s revised stance underscores a growing dilemma across the AI industry: whether companies can balance safety and competitiveness as development outpaces regulation.