Artificial intelligence company Anthropic has been working with the Department of Energy’s nuclear security experts to test whether its AI models could be manipulated to reveal sensitive nuclear information, the company told Axios.
The partnership with the National Nuclear Security Administration (NNSA), which began in April, represents what Anthropic believes is the first-ever testing of an advanced AI model in a classified environment. The program involves “red team” exercises where NNSA specialists attempt to probe Anthropic’s Claude AI models for potential vulnerabilities related to nuclear weapons information.
Initially focused on Claude 3 Sonnet, the program has now been extended through February to evaluate Anthropic’s newer Claude 3.5 Sonnet model. The company worked with Amazon Web Services to prepare its AI systems for secure government testing.
“AI is one of those game-changing technologies, and is at the top of the agenda in so many of our conversations,” said Wendin Smith, NNSA’s associate administrator and deputy undersecretary for counterterrorism and counterproliferation. “There’s a national security imperative in evaluating and testing AI’s ability to generate outputs that could potentially represent nuclear or radiological risks.”
While the specific findings remain classified, Anthropic plans to share insights with scientific laboratories to enable broader testing. The initiative follows agreements that both Anthropic and OpenAI signed with the U.S. AI Safety Institute in August to have their models tested for national security risks before public release.
“The federal government has unique expertise needed to evaluate AI systems for certain national security risks,” said Marina Favaro, Anthropic’s national security policy lead. “This work will help developers build stronger safeguards for frontier AI systems that advance responsible innovation and American leadership.”