Safety Behavior: We evaluate on an internal dataset of conversations and evaluate the consis- tency of the model’s adherence and refusal behavior across di昀昀erent user voices. Overall, we do not 昀椀nd that the model behavior varies across di昀昀erent voices. 9
