Latest news with #EpochAI


The Irish Sun
3 days ago
- Science
- The Irish Sun
We went head-to-head with AI and LOST as 30 of Earth's top brains left 'frightened' after secret battle with chatbot
A SUPER-SMART artificial intelligence (AI) chatbot has spooked mathematicians who believe tech companies are on the verge of creating a robot "genius".
Thirty of the world's most renowned mathematicians congregated in Berkeley, California, in mid-May for a secret maths battle against a machine.
The bot uses a large language model (LLM) called o4-mini, which was produced by ChatGPT creator OpenAI.
And it proved itself to be smarter than some of the human geniuses graduating from universities today, according to Ken Ono, a mathematician at the University of Virginia and a leader and judge at the meeting.
It was able to crack some of the toughest maths problems out there in mere minutes - work that would have taken a human expert weeks or months.
OpenAI had asked Epoch AI, a nonprofit that benchmarks AI models, to come up with 300 maths questions whose solutions had not yet been published.
This meant the AI couldn't just trawl the internet for the answers; it had to work them out on its own.
The group of mathematicians, hand-selected by Elliot Glazer, a recent maths Ph.D. graduate hired by Epoch AI, were tasked with coming up with the hardest problems they could.
Everyone who participated had to sign a nondisclosure agreement requiring them to communicate only through the secure messaging app Signal.
This would prevent the AI from potentially seeing their conversations and using them to train its robot brain.
Only a small group of people in the world are capable of developing such questions, let alone answering them.
Each problem the o4-mini couldn't solve would earn its creator a $7,500 reward.
By April 2025, Glazer found that o4-mini could solve around 20 percent of the questions.
Then, at the in-person, two-day meeting in May, participants finalised their last batch of challenge questions.
The 30 attendees were split into groups of six and competed against each other to devise problems that they could solve but would stump the AI reasoning bot.
By the end of that Saturday night, the bot's mathematical prowess was proving too strong.
"I came up with a problem which experts in my field would recognize as an open question in number theory — a good Ph.D.-level problem," said Ono, as reported by Live Science.
Early that Sunday morning, Ono alerted the rest of the participants.
"I was not prepared to be contending with an LLM like this," he said.
"I've never seen that kind of reasoning before in models. That's what a scientist does. That's frightening."
Over the two days, the bot was able to solve some of the world's trickiest maths problems.
"I have colleagues who literally said these models are approaching mathematical genius," added Ono.
"I've been telling my colleagues that it's a grave mistake to say that generalised artificial intelligence will never come, [that] it's just a computer.
"I don't want to add to the hysteria, but in some ways these large language models are already outperforming most of our best graduate students in the world."
Just 10 questions stumped the bot, according to researchers.
Yang-Hui He, a mathematician at the London Institute for Mathematical Sciences and an early pioneer of using AI in maths, said: "This is what a very, very good graduate student would be doing - in fact, more."


Scottish Sun
3 days ago
- Science
- Scottish Sun
We went head-to-head with AI and LOST as 30 of Earth's top brains left 'frightened' after secret battle with chatbot


The Sun
3 days ago
- Business
- The Sun
We went head-to-head with AI and LOST as 30 of Earth's top brains left 'frightened' after secret battle with chatbot


Forbes
25-06-2025
- Business
- Forbes
What Happens When LLMs Run Out Of Useful Data?
There is one obvious solution to a looming shortage of written content: have LLMs generate more of it.
By SAP Insights Team
Most of us feel like we're drowning in data. And yet, in the world of generative AI, a looming data shortage is keeping some researchers up at night. A 2024 report from the nonprofit watchdog Epoch AI projected that large language models (LLMs) could run out of fresh, human-generated training data as soon as 2026. Earlier this year, the ubiquitous Elon Musk declared that 'the cumulative sum of human knowledge has been exhausted in AI training,' and that the doomsday scenario envisioned by some AI researchers 'happened basically last year.'
GenAI is unquestionably a technology whose breakthroughs in power and sophistication have generally relied on ever-larger datasets to train on. Beneath the flurry of investment, adoption, and general GenAI activity, a quiet concern has surfaced: What if the fuel driving all this progress is running low? There is one obvious solution to a looming shortage of written content: have LLMs generate more of it.
The role of synthetic data in large language models
Synthetic data is computer-generated information that has the same statistical properties and patterns as real data but doesn't include real-world records. Amazon recently had success using this method with LLM-generated pairs of questions and answers to fine-tune a customer service model. Because the task was narrow and the outputs were easily reviewed by human beings, the additional training on synthetic data helped the model get better at responding accurately to customer inquiries, even in scenarios it hadn't seen before.
Another use case for synthetic data is for businesses using proprietary data to train bespoke LLMs—whether building them from scratch or, more commonly, layering retrieval-augmented generation (RAG) atop a commercial foundation model. In many such cases, the proprietary data involved is tightly structured, such as with historical transaction records formatted like spreadsheets with dates, locations, and dollar amounts. In contexts like these, LLM-generated synthetic data is often indistinguishable from the real thing and just as effective for training.
But in less narrowly defined training scenarios, specifically the development of those big commercial models RAG relies on, the risks of training on synthetic data are real. The most widely cited danger has the dramatic name 'model collapse.' In a 2024 study published in Nature, researchers showed that when models are repeatedly trained on synthetic data generated by other models, they gradually lose diversity and accuracy, drifting further from the true distribution of real-world data until they can no longer produce reliably useful output.
Mohan Shekar, SAP's AI and quantum adoption lead for cloud-based ERP, likens the process to 'model incest.' With every successive iteration, a model trained on its own output will tend to reinforce biases and flaws that may at first have been barely noticeable, until those minor defects become debilitating deformities. Long before reaching these extreme states, models trained with synthetic data have also been shown to exhibit a dullness and predictability reflecting their lack of fresh input. Such models may still have their uses, especially for mundane work and applications, but as Shekar puts it, 'If you're trying to innovate—really innovate—[a synthetic-data–trained model] won't get you there. It's just remixing what you already had.'
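The 'model collapse' dynamic described above is easier to picture with a toy experiment. The sketch below is a hedged illustration only, with a one-dimensional Gaussian standing in for a model; it is not the Nature study's actual setup. Each "generation" is fitted solely to samples drawn from the previous fit, so sampling error compounds and the estimated distribution drifts away from the original data.

```python
# Toy analogue of "model collapse": fit a Gaussian to data, sample "synthetic"
# data from the fit, refit on that sample, and repeat. No fresh real data ever
# re-enters the loop, so estimation error compounds across generations and the
# fitted distribution drifts away from the original one.
import numpy as np

rng = np.random.default_rng(0)
real_data = rng.normal(loc=0.0, scale=1.0, size=1_000)  # "human-generated" data

mu, sigma = real_data.mean(), real_data.std()
for generation in range(1, 11):
    synthetic = rng.normal(mu, sigma, size=1_000)   # next model sees only synthetic output
    mu, sigma = synthetic.mean(), synthetic.std()   # refit on the synthetic sample
    print(f"generation {generation:2d}: mean={mu:+.3f}, std={sigma:.3f}")
```

Re-running the loop with different seeds shows the drift is erratic rather than graceful, which is roughly the concern: with nothing but model output in the training mix, there is no signal pulling the estimates back toward the real distribution.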
Some researchers, including OpenAI CEO Sam Altman, have long argued that innovation in how models are trained may soon start to matter more than what they're trained on. The next wave of breakthroughs, the thinking goes, may come from rethinking the architecture and logic of training itself and then applying those new ideas. Yaad Oren, head of research and innovation at SAP, is confident that such a shift is underway. Recent advances in training methods already mean 'you can shrink the amount of data needed to build a robust product,' he says.
One of those recent advances is multimodal training: building models that learn not just from text but also from video, audio, and other inputs. These models can effectively multiply one dataset by another, combining different types of information to create new datasets. Oren gives the example of voice recognition in cars during a rainstorm. For car manufacturers trying to train an LLM to understand and follow spoken natural-language instructions from a driver, rain in the background presents a hurdle. One unwieldy solution, says Oren, would be to 'record millions of hours of people talking in the rain' to familiarize the model with the soundwaves produced by a person asking for directions in a torrential downpour. More elegant and practical, though, is to combine an existing dataset of human speech with existing datasets of 'different rain and weather sounds,' he says. The result is a model that can decipher speech across a full range of meteorological backdrops—without ever having encountered the combination firsthand.
Even more promising is the potential impact of quantum computing on model training. 'What quantum brings in,' says Shekar, 'is a way to look at all the possible options that exist within your datasets and derive patterns, connections, and possibilities that were not visible before.' Quantum computing could even increase the total supply of usable data by accessing the vast, underutilized oceans of so-called unstructured data, says Shekar. 'Instead of needing 50 labeled images to train a model,' he says, 'you might be able to throw in 5,000 unlabeled ones and still get a more accurate result.'
That could be a very big deal indeed. AI engineers have long had the same feelings about unstructured data that physicists have about dark matter: an exquisite blend of awe, annoyance, and yearning. If quantum computing finally unlocks it, especially in tandem with multimodal learning and other innovations, today's fears of a data drought might recede.
A version of this story appears on
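Oren's rain-in-the-car example earlier in this article is, in essence, data augmentation by mixing. The sketch below is a hedged illustration of that general idea, not any vendor's actual pipeline: the placeholder arrays, sample rate, and SNR values are assumptions, and real clips would be loaded from audio files. It overlays a "weather noise" waveform on a clean "speech" waveform at several signal-to-noise ratios, so every speech/noise/SNR combination becomes a new training example.

```python
# Hedged sketch: combine a clean-speech dataset with a weather-noise dataset by
# mixing waveforms at a target signal-to-noise ratio (SNR). The random arrays
# stand in for real recordings.
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Overlay `noise` on `speech` so the result has roughly the given SNR."""
    noise = np.resize(noise, speech.shape)            # loop/trim noise to match length
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise

rng = np.random.default_rng(1)
sample_rate = 16_000                                  # assumed sample rate
speech_clip = rng.normal(size=sample_rate * 3)        # placeholder 3-second "speech" clip
rain_clip = rng.normal(size=sample_rate)              # placeholder 1-second "rain" clip

# Each (speech, noise, SNR) combination yields a new training example,
# multiplying the effective size of the speech dataset.
augmented = [mix_at_snr(speech_clip, rain_clip, snr) for snr in (0, 5, 10, 20)]
print(len(augmented), "augmented clips of", augmented[0].shape[0], "samples each")
```

The same pattern extends to other modalities: one existing dataset is used to perturb or enrich another, multiplying the effective training data without recording anything new.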
Yahoo
07-06-2025
- Science
- Yahoo
Inside the Secret Meeting Where Mathematicians Struggled to Outsmart AI
On a weekend in mid-May, a clandestine mathematical conclave convened. Thirty of the world's most renowned mathematicians traveled to Berkeley, Calif., with some coming from as far away as the U.K. The group's members faced off in a showdown with a 'reasoning' chatbot that was tasked with solving problems they had devised to test its mathematical mettle. After throwing professor-level questions at the bot for two days, the researchers were stunned to discover it was capable of answering some of the world's hardest solvable problems. 'I have colleagues who literally said these models are approaching mathematical genius,' says Ken Ono, a mathematician at the University of Virginia and a leader and judge at the meeting.
The chatbot in question is powered by o4-mini, a so-called reasoning large language model (LLM). It was trained by OpenAI to be capable of making highly intricate deductions. Google's equivalent, Gemini 2.5 Flash, has similar abilities. Like the LLMs that powered earlier versions of ChatGPT, o4-mini learns to predict the next word in a sequence. Compared with those earlier LLMs, however, o4-mini and its equivalents are lighter-weight, more nimble models that train on specialized datasets with stronger reinforcement from humans. The approach leads to a chatbot capable of diving much deeper into complex problems in math than traditional LLMs.
To track the progress of o4-mini, OpenAI previously tasked Epoch AI, a nonprofit that benchmarks LLMs, to come up with 300 math questions whose solutions had not yet been published. Even traditional LLMs can correctly answer many complicated math questions. Yet when Epoch AI asked several such models these questions, which were dissimilar to those they had been trained on, the most successful were able to solve less than 2 percent, showing these LLMs lacked the ability to reason. But o4-mini would prove to be very different.
Epoch AI hired Elliot Glazer, who had recently finished his math Ph.D., to join the new collaboration for the benchmark, dubbed FrontierMath, in September 2024. The project collected novel questions over varying tiers of difficulty, with the first three tiers covering undergraduate-, graduate- and research-level challenges. By February 2025, Glazer found that o4-mini could solve around 20 percent of the questions. He then moved on to a fourth tier: 100 questions that would be challenging even for an academic mathematician. Only a small group of people in the world would be capable of developing such questions, let alone answering them. The mathematicians who participated had to sign a nondisclosure agreement requiring them to communicate solely via the messaging app Signal. Other forms of contact, such as traditional e-mail, could potentially be scanned by an LLM and inadvertently train it, thereby contaminating the dataset.
The group made slow, steady progress in finding questions. But Glazer wanted to speed things up, so Epoch AI hosted the in-person meeting on Saturday, May 17, and Sunday, May 18. There, the participants would finalize the final batch of challenge questions. Ono split the 30 attendees into groups of six. For two days, the academics competed against themselves to devise problems that they could solve but would trip up the AI reasoning bot. Each problem the o4-mini couldn't solve would garner the mathematician who came up with it a $7,500 reward.
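For a rough sense of how a benchmark like this is scored, the sketch below is a generic, hedged harness and not Epoch AI's actual FrontierMath code: `ask_model` is a placeholder for a call to the model under test, the answers are assumed to be exactly checkable values, and the only figure taken from the reporting above is the $7,500 per-stumper reward.

```python
# Hedged, generic sketch of scoring a held-out math benchmark: each question has
# a privately held reference answer, the model's final answer is compared
# exactly, and the solve rate plus per-stumper rewards are tallied.
from dataclasses import dataclass

@dataclass
class Problem:
    prompt: str
    answer: int          # privately held reference answer
    author: str

def ask_model(prompt: str) -> int:
    """Placeholder for a call to the model being evaluated."""
    return 0             # a real harness would parse the model's final answer

def score(problems: list[Problem], reward_per_stumper: int = 7_500) -> None:
    solved, rewards = 0, {}
    for p in problems:
        if ask_model(p.prompt) == p.answer:
            solved += 1
        else:                                  # the problem's author earns the reward
            rewards[p.author] = rewards.get(p.author, 0) + reward_per_stumper
    print(f"solved {solved}/{len(problems)} ({100 * solved / len(problems):.1f}%)")
    print("rewards owed:", rewards)

score([Problem("toy question: 2 + 2 = ?", 4, "demo author")])
```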
By the end of that Saturday night, Ono was frustrated with the bot, whose unexpected mathematical prowess was foiling the group's progress. 'I came up with a problem which experts in my field would recognize as an open question in number theory—a good Ph.D.-level problem,' he says. He asked o4-mini to solve the question. Over the next 10 minutes, Ono watched in stunned silence as the bot unfurled a solution in real time, showing its reasoning process along the way. The bot spent the first two minutes finding and mastering the related literature in the field. Then it wrote on the screen that it wanted to try solving a simpler 'toy' version of the question first in order to learn. A few minutes later, it wrote that it was finally prepared to solve the more difficult problem. Five minutes after that, o4-mini presented a correct but sassy solution. 'It was starting to get really cheeky,' says Ono, who is also a freelance mathematical consultant for Epoch AI. 'And at the end, it says, 'No citation necessary because the mystery number was computed by me!''
Defeated, Ono jumped onto Signal early that Sunday morning and alerted the rest of the participants. 'I was not prepared to be contending with an LLM like this,' he says, 'I've never seen that kind of reasoning before in models. That's what a scientist does. That's frightening.'
Although the group did eventually succeed in finding 10 questions that stymied the bot, the researchers were astonished by how far AI had progressed in the span of one year. Ono likened it to working with a 'strong collaborator.' Yang-Hui He, a mathematician at the London Institute for Mathematical Sciences and an early pioneer of using AI in math, says, 'This is what a very, very good graduate student would be doing—in fact, more.' The bot was also much faster than a professional mathematician, taking mere minutes to do what it would take such a human expert weeks or months to complete.
While sparring with o4-mini was thrilling, its progress was also alarming. Ono and He express concern that the o4-mini's results might be trusted too much. 'There's proof by induction, proof by contradiction, and then proof by intimidation,' He says. 'If you say something with enough authority, people just get scared. I think o4-mini has mastered proof by intimidation; it says everything with so much confidence.'
By the end of the meeting, the group started to consider what the future might look like for mathematicians. Discussions turned to the inevitable 'tier five'—questions that even the best mathematicians couldn't solve. If AI reaches that level, the role of mathematicians would undergo a sharp change. For instance, mathematicians may shift to simply posing questions and interacting with reasoning-bots to help them discover new mathematical truths, much the same as a professor does with graduate students. As such, Ono predicts that nurturing creativity in higher education will be a key in keeping mathematics going for future generations. 'I've been telling my colleagues that it's a grave mistake to say that generalized artificial intelligence will never come, [that] it's just a computer,' Ono says. 'I don't want to add to the hysteria, but in many ways these large language models are already outperforming most of our best graduate students in the world.'