OpenAI o1 (3/49)

3.1.1 Disallowed content evaluations

We evaluate the o1 models against GPT-4o on a suite of disallowed content evaluations. These evaluations check that the model does not comply with requests for harmful content, including hateful content, criminal advice, or advice about regulated industries (e.g., medical or legal advice). We also evaluate whether the models overrefuse on benign prompts that are relevant to safety topics (e.g.,
