Lawyers face sanctions for citing fake cases with AI, warns UK judge
LONDON (Reuters) - Lawyers who use artificial intelligence to cite non-existent cases can be held in contempt of court or even face criminal charges, London's High Court warned on Friday, in the latest example of generative AI leading lawyers astray.
A senior judge lambasted lawyers in two cases who apparently used AI tools when preparing written arguments, which referred to fake case law, and called on regulators and industry leaders to ensure lawyers know their ethical obligations.
"There are serious implications for the administration of justice and public confidence in the justice system if artificial intelligence is misused," Judge Victoria Sharp said in a written ruling.
"In those circumstances, practical and effective measures must now be taken by those within the legal profession with individual leadership responsibilities ... and by those with the responsibility for regulating the provision of legal services."
The ruling comes after lawyers around the world have been forced to explain themselves for relying on false authorities, since ChatGPT and other generative AI tools became widely available more than two years ago.
Sharp warned in her ruling that lawyers who refer to non-existent cases will be in breach of their duty to not mislead the court, which could also amount to contempt of court.
She added that "in the most egregious cases, deliberately placing false material before the court with the intention of interfering with the administration of justice amounts to the common law criminal offence of perverting the course of justice".
Sharp noted that legal regulators and the judiciary had issued guidance about the use of AI by lawyers, but said that "guidance on its own is insufficient to address the misuse of artificial intelligence".

Related Articles


Forbes
an hour ago
How Generative AI Is Changing The Way We Work
Think back just a few years. AI in the workplace wasn't always that exciting. It was frustrating. Chatbots missed the mark more often than they helped. AI writing assistants sounded robotic, stiff, and very generic. Transcription tools? Maybe 70% accurate on a good day, which made them more hassle than help. Instead of making work easier, AI often felt like one more thing to manage. It was more novelty than necessity.
But things are changing fast. Generative AI has gone from a futuristic buzzword to something much more practical and powerful. It's now sitting beside us in meetings, drafting our emails, organizing our thoughts, and even helping us solve problems we didn't know how to articulate. This isn't just smarter software, it's a new kind of teammate. We're not just using tools. We're collaborating with them.
Whether it's ChatGPT, Microsoft Copilot, Claude, Perplexity, or internal LLMs customized for specific teams, generative AI isn't sitting on the bench anymore. It's helping draft proposals, untangle data, write code, analyze reports, and act as a collaboration partner to spark new ideas. And it's doing all of this without needing a lunch break, vacation time, or sleep. The pace of change is fast with AI. The impact? Even faster. And we're only just getting started.
AI at Work: From Mundane Tasks to Transformation
This AI transformation is no longer theoretical. Generative AI is already baked into how teams across industries get things done. As I always say, everyone is at a different point in their AI journey, but forward-thinking individuals and organizations are already seeing tangible benefits from generative AI.
In marketing, it's helping create campaigns faster than ever. Think instant content generation, headline testing, SEO optimization, and rewriting blog posts when they fall flat or need a different tone.
Finance teams are using AI to make sense of messy numbers.
Forecasts, reports, and budget variance explanations are all generated in natural language, not spreadsheet formulas. AI-powered tools are helping teams spot trends, flag anomalies, and communicate insights clearly.
Customer service is getting a long-overdue upgrade too. AI now drafts personalized email responses in seconds, summarizes multi-threaded conversations into a single digestible ticket, and routes high-priority issues to the right human agent just as quickly. It can detect customer sentiment, recommend next-best actions, and even auto-generate knowledge base articles and documentation from resolved tickets. By layering AI into the tools that customer service agents already use, the role is evolving from a reactive support channel into a proactive, intelligent experience.
Rethinking Roles in the Age of AI
AI isn't just slotting into workflows, it's reshaping them. And with this change, roles are shifting in ways that are both subtle and significant. Tedious admin work? Automated. The kinds of tasks that used to eat up your mornings, such as calendar updates, inbox triage, or basic data entry, are now handled by AI in seconds. That time can now get reinvested in what actually matters: thinking, deciding, creating.
Knowledge workers aren't bogged down by busywork anymore. They're stepping into higher-value spaces. Strategy. Analysis. Storytelling. The kinds of things AI can support, but not truly own. It's not about replacement; it's about elevation. Or as I often say, this is augmented intelligence in action, and it's happening every day.
Developers are getting an efficiency boost that's hard to ignore. Tools are suggesting code snippets, identifying bugs, and speeding up build cycles. These AI tools help developers do their jobs faster. It's not about replacing the developer's judgment or creativity.
The narrative shouldn't be 'AI is coming for your job.' Instead, people who know how to work with AI are increasingly outpacing those who don't.
It's not about competing with machines. It's about collaborating with them and leveling up in the process.
What's Holding Teams Back and What Comes Next in the Age of Generative AI
AI appears to be everywhere right now, but that doesn't mean it's being used well. Despite all the headlines and hype, only a small number of organizations say they've fully matured their AI capabilities, according to BCG's 2025 AI at Work report. So, what's standing in the way?
Start with skills. A lot of employees still don't feel confident using generative AI tools. The interfaces are easy, but knowing how to get meaningful, reliable results? That takes practice and training. Most companies just haven't yet made the investment to provide AI literacy and upskilling to the whole organization.
Then there's trust. People worry about AI hallucinations, biased outputs, or tools making confident guesses based on shaky logic. That concern isn't misplaced. AI models do make mistakes. But the solution isn't to ignore AI. It's to understand it, pressure-test it, and use it wisely.
And perhaps the most overlooked barrier? Organizational readiness. Leaders are excited. They're talking about AI in all-hands meetings and maybe even externally to clients. But beneath the surface, change management is slow. Governance is murky. And policy is usually still in draft and not fully fleshed out.
So what happens? AI ends up underused. It's available, but not embedded. Sitting in the toolbox, not on the workbench.
To thrive in this next wave, organizations need more than access. They need alignment.
Experiment with intent. Don't just try tools. Track what works, iterate, and scale from there.
Build AI fluency. Every team member, not just technical teams, should know what AI can and can't do.
Lead with purpose. Integrate AI transparently and ethically. The goal isn't replacement. It's empowerment.
Yahoo
an hour ago
Inside OpenAI's quest to make AI do anything for you
Shortly after Hunter Lightman joined OpenAI as a researcher in 2022, he watched his colleagues launch ChatGPT, one of the fastest-growing products ever. Meanwhile, Lightman quietly worked on a team teaching OpenAI's models to solve high school math competitions. Today that team, known as MathGen, is considered instrumental to OpenAI's industry-leading effort to create AI reasoning models: the core technology behind AI agents that can do tasks on a computer like a human would. 'We were trying to make the models better at mathematical reasoning, which at the time they weren't very good at,' Lightman told TechCrunch, describing MathGen's early work. OpenAI's models are far from perfect today — the company's latest AI systems still hallucinate and its agents struggle with complex tasks. But its state-of-the-art models have improved significantly on mathematical reasoning. One of OpenAI's models recently won a gold medal at the International Math Olympiad, a math competition for the world's brightest high school students. OpenAI believes these reasoning capabilities will translate to other subjects, and ultimately power general-purpose agents that the company has always dreamed of building. ChatGPT was a happy accident — a lowkey research preview turned viral consumer business — but OpenAI's agents are the product of a years-long, deliberate effort within the company. 'Eventually, you'll just ask the computer for what you need and it'll do all of these tasks for you,' said OpenAI CEO Sam Altman at the company's first developer conference in 2023. 'These capabilities are often talked about in the AI field as agents. The upsides of this are going to be tremendous.' Whether agents will meet Altman's vision remains to be seen, but OpenAI shocked the world with the release of its first AI reasoning model, o1, in the fall of 2024. Less than a year later, the 21 foundational researchers behind that breakthrough are the most highly sought-after talent in Silicon Valley. 
Mark Zuckerberg recruited five of the o1 researchers to work on Meta's new superintelligence-focused unit, offering some compensation packages north of $100 million. One of them, Shengjia Zhao, was recently named chief scientist of Meta Superintelligence Labs.
The reinforcement learning renaissance
The rise of OpenAI's reasoning models and agents is tied to a machine learning training technique known as reinforcement learning (RL). RL provides feedback to an AI model on whether its choices were correct or not in simulated environments.
RL has been used for decades. For instance, in 2016, about a year after OpenAI was founded in 2015, an AI system created by Google DeepMind using RL, AlphaGo, gained global attention after beating a world champion in the board game Go.
Around that time, one of OpenAI's first employees, Andrej Karpathy, began pondering how to leverage RL to create an AI agent that could use a computer. But it would take years for OpenAI to develop the necessary models and training techniques.
By 2018, OpenAI pioneered its first large language model in the GPT series, pretrained on massive amounts of internet data and large clusters of GPUs. GPT models excelled at text processing, eventually leading to ChatGPT, but struggled with basic math.
It took until 2023 for OpenAI to achieve a breakthrough, initially dubbed 'Q*' and then 'Strawberry,' by combining LLMs, RL, and a technique called test-time computation. The latter gave the models extra time and computing power to plan and work through problems, verifying their steps, before providing an answer.
This allowed OpenAI to introduce a new approach called 'chain-of-thought' (CoT), which improved AI's performance on math questions the models hadn't seen before.
'I could see the model starting to reason,' said El Kishky. 'It would notice mistakes and backtrack, it would get frustrated. It really felt like reading the thoughts of a person.'
Though individually these techniques weren't novel, OpenAI uniquely combined them to create Strawberry, which directly led to the development of o1. OpenAI quickly identified that the planning and fact-checking abilities of AI reasoning models could be useful to power AI agents.
'We had solved a problem that I had been banging my head against for a couple of years,' said Lightman. 'It was one of the most exciting moments of my research career.'
Scaling reasoning
With AI reasoning models, OpenAI determined it had two new axes that would allow it to improve AI models: using more computational power during the post-training of AI models, and giving AI models more time and processing power while answering a question.
'OpenAI, as a company, thinks a lot about not just the way things are, but the way things are going to scale,' said Lightman.
Shortly after the 2023 Strawberry breakthrough, OpenAI spun up an 'Agents' team led by OpenAI researcher Daniel Selsam to make further progress on this new paradigm, two sources told TechCrunch. Although the team was called 'Agents,' OpenAI didn't initially differentiate between reasoning models and agents as we think of them today. The company just wanted to make AI systems capable of completing complex tasks.
Eventually, the work of Selsam's Agents team became part of a larger project to develop the o1 reasoning model, with leaders including OpenAI co-founder Ilya Sutskever, chief research officer Mark Chen, and chief scientist Jakub Pachocki.
OpenAI would have to divert precious resources (mainly talent and GPUs) to create o1. Throughout OpenAI's history, researchers have had to negotiate with company leaders to obtain resources; demonstrating breakthroughs was a surefire way to secure them.
'One of the core components of OpenAI is that everything in research is bottom up,' said Lightman.
'When we showed the evidence [for o1], the company was like, 'This makes sense, let's push on it.''
Some former employees say that the startup's mission to develop AGI was the key factor in achieving breakthroughs around AI reasoning models. By focusing on developing the smartest-possible AI models, rather than products, OpenAI was able to prioritize o1 above other efforts. That type of large investment in ideas wasn't always possible at competing AI labs.
The decision to try new training methods proved prescient. By late 2024, several leading AI labs started seeing diminishing returns on models created through traditional pretraining scaling. Today, much of the AI field's momentum comes from advances in reasoning models.
What does it mean for an AI to 'reason?'
In many ways, the goal of AI research is to recreate human intelligence with computers. Since the launch of o1, ChatGPT's UX has been filled with more human-sounding features such as 'thinking' and 'reasoning.'
When asked whether OpenAI's models were truly reasoning, El Kishky hedged, saying he thinks about the concept in terms of computer science.
'We're teaching the model how to efficiently expend compute to get an answer. So if you define it that way, yes, it is reasoning,' said El Kishky.
Lightman takes the approach of focusing on the model's results and not as much on the means or their relation to human brains.
'If the model is doing hard things, then it is doing whatever necessary approximation of reasoning it needs in order to do that,' said Lightman. 'We can call it reasoning, because it looks like these reasoning traces, but it's all just a proxy for trying to make AI tools that are really powerful and useful to a lot of people.'
OpenAI's researchers note people may disagree with their nomenclature or definitions of reasoning (and surely, critics have emerged), but they argue it's less important than the capabilities of their models. Other AI researchers tend to agree.
Nathan Lambert, an AI researcher with the non-profit AI2, compares AI reasoning models to airplanes in a blog post. Both, he says, are manmade systems inspired by nature (human reasoning and bird flight, respectively), but they operate through entirely different mechanisms. That doesn't make them any less useful, or any less capable of achieving similar outcomes.
A group of AI researchers from OpenAI, Anthropic, and Google DeepMind agreed in a recent position paper that AI reasoning models are not well understood today, and more research is needed. It may be too early to confidently claim what exactly is going on inside them.
The next frontier: AI agents for subjective tasks
The AI agents on the market today work best for well-defined, verifiable domains such as coding. OpenAI's Codex agent aims to help software engineers offload simple coding tasks. Meanwhile, Anthropic's models have become particularly popular in AI coding tools like Cursor and Claude Code; these are some of the first AI agents that people are willing to pay up for.
However, general-purpose AI agents like OpenAI's ChatGPT Agent and Perplexity's Comet struggle with many of the complex, subjective tasks people want to automate. When trying to use these tools for online shopping or finding a long-term parking spot, I've found the agents take longer than I'd like and make silly mistakes.
Agents are, of course, early systems that will undoubtedly improve. But researchers must first figure out how to better train the underlying models to complete tasks that are more subjective.
'Like many problems in machine learning, it's a data problem,' said Lightman, when asked about the limitations of agents on subjective tasks. 'Some of the research I'm really excited about right now is figuring out how to train on less verifiable tasks. We have some leads on how to do these things.'
Noam Brown, an OpenAI researcher who helped create the IMO model and o1, told TechCrunch that OpenAI has new general-purpose RL techniques which allow them to teach AI models skills that aren't easily verified. This was how the company built the model which achieved a gold medal at IMO, he said. OpenAI's IMO model was a newer AI system that spawns multiple agents, which then simultaneously explore several ideas, and then choose the best possible answer. These types of AI models are becoming more popular; Google and xAI have recently released state-of-the-art models using this technique. 'I think these models will become more capable at math, and I think they'll get more capable in other reasoning areas as well,' said Brown. 'The progress has been incredibly fast. I don't see any reason to think it will slow down.' These techniques may help OpenAI's models become more performant, gains that could show up in the company's upcoming GPT-5 model. OpenAI hopes to assert its dominance over competitors with the launch of GPT-5, ideally offering the best AI model to power agents for developers and consumers. But the company also wants to make its products simpler to use. El Kishky says OpenAI wants to develop AI agents that intuitively understand what users want, without requiring them to select specific settings. He says OpenAI aims to build AI systems that understand when to call up certain tools, and how long to reason for. These ideas paint a picture of an ultimate version of ChatGPT: an agent that can do anything on the internet for you, and understand how you want it to be done. That's a much different product than what ChatGPT is today, but the company's research is squarely headed in this direction. While OpenAI undoubtedly led the AI industry a few years ago, the company now faces a tranche of worthy opponents. 
The question is no longer just whether OpenAI can deliver its agentic future, but can the company do so before Google, Anthropic, xAI, or Meta beat them to it?

