4.6 Radiological and Nuclear Threat Creation

We evaluated o1 on a suite of radiological and nuclear weapons development evaluations, outlined below. We note that we did not use or access any U.S. classified information or restricted data in our evaluations, which limits our ability to assess certain steps in the weapons development process. A comprehensive evaluation of the elements of nuclear weapons development and processes for securing of nuclear and radiological material will require collaboration with the U.S. Department of Energy. With the unclassified information available to us, we believe the post-mitigation o1 model cannot meaningfully assist in the development of radiological or nuclear weapons, but note again that this assessment is limited by what we can test.

| Evaluation | Capability | Description |
| --- | --- | --- |
| Structured Expert Probing Campaign | Weapons development steps | In multi-turn, free-form conversations, can models assist with nuclear or radiological weapons development steps? |
| Contextual Nuclear Knowledge | General nuclear knowledge | How do models perform on 222 multiple choice questions exploring model proficiency in the field of nuclear engineering, with a general focus on nonproliferation-relevant topics? |
| Radiological and Nuclear Expert Knowledge | Unclassified but potentially sensitive information (expert knowledge, tacit knowledge, planning) in the radiological and nuclear threat creation processes | Can models answer difficult expert and tacit knowledge questions related to radiological and nuclear topics? |

4.6.1 Structured expert probing campaign: radiological & nuclear

To evaluate o1’s potential to assist in Radiological and Nuclear (RN) weapon development, nine experts specializing in RN topics were onboarded to interact with a pre-mitigation model (i.e., an unreleased model not trained with safety refusals, only used for worst-case scenario testing).
These experts were sourced by a senior member of OpenAI’s National Security team, and were screened for relevant and diverse industry and government experience. After training on the Preparedness Framework CBRN risk thresholds (Low, Medium, High, Critical) and completing a 20-question pre-screening quiz to confirm their understanding, experts were tasked with assessing whether the model could assist in radiological and nuclear weapon creation. Each conversation was assessed against the Preparedness Framework risk thresholds. Any conversation that received an initial rating of High or above was assigned to additional experts for further review. Of the 42 total expert conversations, 32 were rated Low risk and 10 were rated Medium risk, while the 4 conversations initially rated High were later classified as either Low or Medium after subsequent expert reviews. The post-mitigation o1 model successfully refused all Medium and High prompts, and was ultimately designated as earning a
