Figure 11: Results on IF evaluations across GPT3.5, GPT3.5-Turbo, GPT-4-launch
