Humans beat AI gold-level score at top maths contest

Google's Gemini chatbot solved five out of the six maths problems set at the IMO. (EPA Images pic)
SYDNEY: Humans beat generative AI models made by Google and OpenAI at a top international mathematics competition, despite the programmes reaching gold-level scores for the first time.
Neither model scored full marks – unlike five young people at the International Mathematical Olympiad (IMO), a prestigious annual competition where participants must be under 20 years old.
Google said yesterday that an advanced version of its Gemini chatbot had solved five out of the six maths problems set at the IMO, held in Australia's Queensland this month.
'We can confirm that Google DeepMind has reached the much-desired milestone, earning 35 out of a possible 42 points – a gold medal score,' the US tech giant cited IMO president Gregor Dolinar as saying.
'Their solutions were astonishing in many respects. IMO graders found them to be clear, precise and most of them easy to follow.'
Around 10% of human contestants won gold-level medals, and five received perfect scores of 42 points.
US ChatGPT maker OpenAI said that its experimental reasoning model had scored a gold-level 35 points on the test.
The result 'achieved a longstanding grand challenge in AI' at 'the world's most prestigious math competition', OpenAI researcher Alexander Wei wrote on social media.
'We evaluated our models on the 2025 IMO problems under the same rules as human contestants,' he said.
'For each problem, three former IMO medallists independently graded the model's submitted proof.'
Google achieved a silver-medal score at last year's IMO in the British city of Bath, solving four of the six problems.
That took two to three days of computation – far longer than this year, when its Gemini model solved the problems within the 4.5-hour time limit, it said.
The IMO said tech companies had 'privately tested closed-source AI models on this year's problems', the same ones faced by 641 competing students from 112 countries.
'It is very exciting to see progress in the mathematical capabilities of AI models,' said IMO president Dolinar.
Contest organisers could not verify how much computing power had been used by the AI models or whether there had been human involvement, he cautioned.

Related Articles

Google AI system wins gold medal in International Mathematical Olympiad

SAN FRANCISCO: An artificial intelligence system built by Google DeepMind, the tech giant's primary AI lab, has achieved 'gold medal' status in the annual International Mathematical Olympiad, a premier math competition for high school students. It was the first time that a machine – which solved five of the six problems at the 2025 competition, held in Australia this month – reached that level of success, Google said in a blog post Monday.

The news is another sign that leading companies are continuing to improve their AI systems in areas such as math, science and computer coding. This kind of technology could accelerate the research of mathematicians and scientists and streamline the work of experienced computer programmers.

Two days before Google revealed its feat, an OpenAI researcher said in a social media post that the startup had built technology that achieved a similar score on this year's questions, although it did not officially enter the competition.

Both systems were chatbots that received and responded to the questions much like humans. Other AI systems have participated in the International Mathematical Olympiad, or IMO, but they could answer questions only after human experts translated them into a computer programming language built for solving math problems.

'We solved these problems fully in natural language,' Thang Luong, a senior staff research scientist at Google DeepMind, said in an interview. 'That means there was no human intervention – at all.'

After OpenAI started the AI boom with the release of ChatGPT in late 2022, the leading chatbots could answer questions, write poetry, summarise news articles, even write a little computer code. But they often struggled with math.

Over the past two years, companies such as Google and OpenAI have built AI systems better suited to mathematics, including complex problems that the average person cannot solve. Last year, Google DeepMind unveiled two systems that were designed for math: AlphaGeometry and AlphaProof. Competing in the IMO, these systems achieved 'silver medal' performance, solving four of the competition's six problems. It was the first time a machine reached silver-medal status.

Other companies, including a startup called Harmonic, have built similar systems. But systems such as AlphaProof and Harmonic are not chatbots. They can answer questions only after mathematicians translate the questions into Lean, a computer programming language designed for solving math problems.

This year, Google entered the IMO with a chatbot that could read and respond to questions in English. This system is not yet available to the public. Called Gemini Deep Think, the technology is what scientists call a 'reasoning' system. This kind of system is designed to reason through tasks involving math, science and computer programming. Unlike previous chatbots, this technology can spend time thinking through complex problems before settling on an answer. Other companies, including OpenAI, Anthropic and China's DeepSeek, offer similar technologies.

Like other chatbots, a reasoning system initially learns its skills by analysing enormous amounts of text culled from across the internet. Then it learns additional behaviour through extensive trial and error in a process called reinforcement learning.

A reasoning system can be expensive, because it spends additional time thinking about a response. Google said Deep Think had spent the same amount of time with the IMO as human participants did: 4 1/2 hours.
But the company declined to say how much money, processing power or electricity had been used to complete the test.

In December, an OpenAI system surpassed human performance on a closely watched reasoning test called ARC-AGI. But the company ran afoul of competition rules because it spent nearly US$1.5mil (RM6.3mil) in electricity and computing costs to complete the test, according to pricing estimates. – ©2025 The New York Times Company. This article originally appeared in The New York Times.
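The 'thinking time' described above can be pictured, very loosely, as spending an inference budget on drafting and checking candidate answers before settling on one. The toy Python sketch below illustrates only that general idea and assumes nothing about how Gemini Deep Think or OpenAI's model actually work; propose_candidate and verify are hypothetical stand-ins.

```python
import random

# Illustrative sketch only (not Google's or OpenAI's actual method): one common
# way to spend extra inference-time compute is to draft candidate answers and
# check each one, settling only on an answer that passes verification.
# Here a toy "model" guesses integer roots of a polynomial and a checker
# verifies each guess by substituting it back into the equation.

def propose_candidate(rng: random.Random) -> int:
    """Stand-in for a model drafting one candidate answer."""
    return rng.randint(-20, 20)

def verify(candidate: int) -> bool:
    """Stand-in for checking a candidate: is it a root of x^2 - 5x + 6?"""
    return candidate ** 2 - 5 * candidate + 6 == 0

def solve_with_extra_compute(budget: int, seed: int = 0):
    """Spend up to `budget` attempts 'thinking' before committing to an answer."""
    rng = random.Random(seed)
    for _ in range(budget):
        candidate = propose_candidate(rng)
        if verify(candidate):
            return candidate
    return None  # budget exhausted without a verified answer

if __name__ == "__main__":
    # A larger budget makes a verified answer more likely but costs more,
    # mirroring the article's point that reasoning systems are expensive to run.
    print(solve_with_extra_compute(budget=5))
    print(solve_with_extra_compute(budget=200))
```

In this toy setting, raising the budget raises the chance of returning a verified answer, at the price of more computation, which is the trade-off the article points to when it notes that the companies would not disclose their compute costs.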

Google's advertising revenue grows as AI reshapes search results

SAN JOSE: Google's online advertising business, the main revenue driver of its parent company Alphabet, continued its growth as the company further integrates artificial intelligence (AI) into its search engine. In the second quarter, advertising revenues rose 10.4% year-on-year to $71.3 billion, the company reported on Wednesday, slightly exceeding analysts' expectations.

Google and Alphabet chief executive Sundar Pichai said AI was "positively impacting every part of the business," highlighting new AI-powered features like AI Overviews and AI Mode that are enhancing user engagement.

Google's advertising business is under close scrutiny as the company increasingly integrates AI-generated summaries into its search engine to directly answer user queries. This could reduce the incentive for users to click on links next to the search results, which is how Google generates revenue.

Overall, Alphabet's total revenue increased 14% to $96.4 billion, beating market forecasts. Net income rose more than 19% to $28.2 billion. Alphabet's shares fell about 1.5% in after-hours trading following the results. – dpa

AI helps Latin scholars decipher ancient Roman texts

A new artificial intelligence tool, partly developed by Google researchers, can now help Latin scholars piece together puzzles from the past, according to a study published on July 23. (Pixabay pic)

PARIS: Around 1,500 Latin inscriptions are discovered every year, offering an invaluable view into the daily life of ancient Romans – and posing a daunting challenge for the historians tasked with interpreting them. But a new artificial intelligence tool, partly developed by Google researchers, can now help Latin scholars piece together these puzzles from the past, according to a study published on July 23.

Inscriptions in Latin were commonplace across the Roman world, from laying out the decrees of emperors to graffiti on the city streets. One mosaic outside a home in the ancient city of Pompeii even warns: "Beware of the dog".

These inscriptions are "so precious to historians because they offer first-hand evidence of ancient thought, language, society and history", said study co-author Yannis Assael, a researcher at Google's AI lab DeepMind. "What makes them unique is that they are written by the ancient people themselves across all social classes on any subject. It's not just history written by the elite," Assael, who co-designed the AI model, told a press conference.

However, these texts have often been damaged over the millennia. "We usually don't know where and when they were written," Assael said.

So the researchers created a generative neural network, which is an AI tool that can be trained to identify complex relationships between types of data. They named their model Aeneas, after the Trojan hero and son of the Greek goddess Aphrodite. It was trained on data about the dates, locations and meanings of Latin inscriptions from an empire that spanned five million square kilometres over two millennia.

Thea Sommerschield, an epigrapher at the University of Nottingham who co-designed the AI model, said that "studying history through inscriptions is like solving a gigantic jigsaw puzzle". "You can't solve the puzzle with a single isolated piece, even though you know information like its colour or its shape," she explained. "To solve the puzzle, you need to use that information to find the pieces that connect to it."

Tested on Augustus

This can be a huge job. Latin scholars have to compare inscriptions against "potentially hundreds of parallels", a task which "demands extraordinary erudition" and "laborious manual searches" through massive library and museum collections, the study in the journal Nature said.

The researchers trained their model on 176,861 inscriptions – worth up to 16 million characters – five percent of which contained images. It can now estimate the location of an inscription among the 62 Roman provinces, offer a decade when it was produced and even guess what missing sections might have contained, they said.

To test their model, the team asked Aeneas to analyse a famous inscription called "Res Gestae Divi Augusti", in which Rome's first emperor Augustus detailed his accomplishments. Debate still rages among historians about when exactly the text was written.

Though the text is riddled with exaggerations, irrelevant dates and erroneous geographical references, the researchers said that Aeneas was able to use subtle clues such as archaic spelling to land on two possible dates – the same two dates that historians have been debating.

More than 20 historians who tried out the model found it provided a useful starting point in 90 percent of cases, according to DeepMind.

The best results came when historians used the AI model together with their skills as researchers, rather than relying solely on one or the other, the study said.

"Since their breakthrough, generative neural networks have seemed at odds with educational goals, with fears that relying on AI hinders critical thinking rather than enhances knowledge," said study co-author Robbe Wulgaert, a Belgian AI researcher. "By developing Aeneas, we demonstrate how this technology can meaningfully support the humanities by addressing concrete challenges historians face." – AFP
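As a rough picture of the task the article attributes to Aeneas, placing an inscription in one of the 62 provinces, dating it to a decade and proposing text for damaged sections, the hypothetical Python mock-up below sketches that kind of input and output. The names AttributionResult and attribute_inscription, and every example value, are invented for illustration and do not reflect the real model or any published interface.

```python
from dataclasses import dataclass

# Hypothetical mock-up of the kind of output the article describes for Aeneas:
# a location estimate among the 62 Roman provinces, a decade-level date, and
# guesses for missing text. Names and values are invented; they only
# illustrate the described task, not the real system.

@dataclass
class AttributionResult:
    province: str            # most likely of the 62 Roman provinces
    decade: int              # estimated decade of production, e.g. -10 for the 10s BCE
    restorations: list[str]  # candidate fills for a damaged section marked [---]

def attribute_inscription(text: str) -> AttributionResult:
    """Stand-in for the model: in reality this step would run a trained
    generative neural network over the inscription text (and any image)."""
    # Dummy values purely so the sketch runs end to end.
    return AttributionResult(
        province="Italia",
        decade=-10,
        restorations=["pontifex maximus", "tribunicia potestate"],
    )

if __name__ == "__main__":
    damaged = "IMP CAESAR DIVI F AVGVSTVS [---] FECIT"
    result = attribute_inscription(damaged)
    print(result.province, result.decade, result.restorations)
```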
