As shown in the plots, frontier models are still unable to pass the primary agentic tasks, although