Latest news with #o1


Time of India
2 days ago
- Business
- Time of India
Meet Trapit Bansal, Meta's new AI superintelligence team hire - Is Meta poaching top talent from OpenAI?
Meta has poached Trapit Bansal, a key AI researcher from OpenAI who contributed significantly to the company's early AI reasoning and reinforcement learning efforts. Bansal's move to Meta's new AI superintelligence team underscores the intense competition for AI talent; the team aims to develop next-generation AI reasoning models rivaling those of OpenAI and Google. Bansal, who joined OpenAI in 2022, is now among the most publicly visible names to leave the firm for the brand-new superintelligence initiative, which is fast attracting top minds in the field, according to a TechCrunch report. OpenAI spokesperson Kayla Wood confirmed to TechCrunch that Bansal had departed the company, and Bansal's LinkedIn page states that he left OpenAI in June this year.
During his time at OpenAI, Bansal worked closely with co-founder Ilya Sutskever and played an instrumental role in the development of the company's foundational AI reasoning model, o1. Growing interest in AI reasoning models, especially as rival models such as OpenAI's o3 and DeepSeek's R1 hit new performance milestones, makes Bansal's move even more impactful. Bansal brings his expertise to an impressive team at Meta's AI superintelligence lab, which includes former Scale AI CEO Alexandr Wang, ex-Google DeepMind researcher Jack Rae, and machine learning veteran Johan Schalkwyk, as per the report. Bloomberg and The Wall Street Journal reported that several other former OpenAI researchers, Lucas Beyer, Alexander Kolesnikov, and Xiaohua Zhai, have also recently joined Meta. The lab's mission is to develop next-generation AI reasoning models that may match or exceed those of OpenAI and Google; Meta, however, has not yet released a public AI reasoning model. Meta CEO Mark Zuckerberg has also been making compensation deals in the $100 million range to lure top AI talent to his new team, as reported by TechCrunch.
However, it is not known what Bansal was offered in this deal. Meta has also reportedly tried to acquire startups with heavy-hitting AI research labs, such as Sutskever's Safe Superintelligence, Mira Murati's Thinking Machines Lab, and Perplexity, to further fill out its new AI unit, but those talks never progressed to a final stage. On a recent podcast, OpenAI CEO Sam Altman asserted that Meta has been trying to poach his startup's top talent, but highlighted that 'none of our best people have decided to take him up on that.'

Business Standard
2 days ago
- Business
- Business Standard
Meta recruits leading OpenAI researcher Trapit Bansal for AI reasoning lab
Meta has onboarded a prominent OpenAI researcher, Trapit Bansal, to work on advanced artificial intelligence (AI) reasoning models within its recently established AI superintelligence team, according to a report by TechCrunch. Trapit Bansal had been with OpenAI since 2022 and played a major role in launching the company's reinforcement learning research, working closely with co-founder Ilya Sutskever. He is named as one of the original contributors to OpenAI's first AI reasoning model, known as o1. His LinkedIn profile indicates that he left OpenAI in June. OpenAI spokesperson Kayla Wood confirmed to TechCrunch that Bansal had indeed exited the organisation.

Boost to Meta's AI superintelligence team
Bansal is expected to significantly strengthen Meta's new AI superintelligence group, which already includes notable figures such as former Scale AI CEO Alexandr Wang. The team is also in discussions to bring in former GitHub CEO Nat Friedman and Safe Superintelligence co-founder Daniel Gross. His expertise could help Meta develop a cutting-edge AI reasoning model to compete with leading offerings like OpenAI's o3 and DeepSeek's R1. At present, Meta does not have a publicly available AI reasoning model.

Zuckerberg's high-profile hiring strategy
In recent months, Meta CEO Mark Zuckerberg has aggressively recruited top AI talent, reportedly offering compensation packages as high as $100 million. While Bansal's offer remains undisclosed, his decision to join indicates the success of Zuckerberg's strategy in attracting leading AI researchers. According to The Wall Street Journal, Bansal will join other recent hires from OpenAI — Lucas Beyer, Alexander Kolesnikov, and Xiaohua Zhai — at Meta. The team also includes Jack Rae, formerly of Google DeepMind, and Johan Schalkwyk, previously with startup Sesame, according to a Bloomberg report.
Attempts to acquire AI startups fell through
In a bid to expand its AI capabilities further, Meta also explored acquiring startups known for their AI research, including Safe Superintelligence (co-founded by Sutskever), Mira Murati's Thinking Machines Lab, and Perplexity. However, none of these talks reached a final agreement. On a recent podcast, OpenAI CEO Sam Altman commented on Meta's recruitment attempts, stating, 'None of our best people have decided to take him up on that.'

AI reasoning a critical focus for Meta
Developing powerful AI reasoning models is essential for Meta's new unit. Over the past year, firms such as OpenAI, Google, and DeepSeek have released high-performing models that can tackle complex tasks by reasoning through problems before producing answers. This approach, which makes use of additional computation time and resources, has led to improved performance both in benchmarks and in real-world applications.

Future ambitions for Meta's AI lab
Meta's AI superintelligence group is expected to become a crucial part of its wider operations, similar to the role DeepMind plays within Google. The company has plans to develop AI agents for enterprise use, led by Clara Shih, the former CEO of Salesforce AI.


CNBC
10-06-2025
- Business
- CNBC
Microsoft-backed AI lab Mistral is launching its first reasoning model in challenge to OpenAI
French artificial intelligence firm Mistral is launching its first reasoning model on Tuesday to compete with rival options from the likes of OpenAI and China's DeepSeek. The startup, which is backed by U.S. tech giant Microsoft, said that it plans to release its own reasoning model, Magistral, which is "competitive with all the others," including OpenAI's o1 and Chinese AI firm DeepSeek's R1, according to CEO Arthur Mensch. Reasoning models are systems that can execute more complicated tasks through a step-by-step logical thought process. Mistral's new model "is great at mathematics [and] great at coding," Mensch told CNBC's Arjun Kharpal onstage during a fireside chat at London Tech Week. The Mistral boss said that the unique selling point of the company's upcoming Magistral reasoning model is that it will be able to reason in European languages. "Historically, we've seen U.S. models reason in English and Chinese models reason in Chinese," Mensch said. At the start of this year, Chinese AI startup DeepSeek released a reasoning model called R1 that shocked the AI community — and global markets — by promising competitive performance with OpenAI's rival o1 model at a lower cost.
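The step-by-step process Mensch describes can be pictured with a toy sketch in plain Python. No model is involved, and the function names and worked arithmetic are illustrative assumptions, not any vendor's API: a conventional model analog returns an answer in one shot, while a reasoning-model analog records intermediate steps before committing to its answer.

```python
# Toy illustration of "reasoning" vs. direct answering; not any vendor's API.
def solve_directly(a: int, b: int, c: int) -> int:
    # One-shot analog: emit the final answer with no visible steps.
    return a * b + c

def solve_with_reasoning(a: int, b: int, c: int) -> tuple[list[str], int]:
    # Reasoning analog: record each intermediate step, then answer.
    steps = []
    product = a * b
    steps.append(f"Compute {a} * {b} = {product}")
    total = product + c
    steps.append(f"Add {c}: {product} + {c} = {total}")
    return steps, total

steps, answer = solve_with_reasoning(3, 4, 5)
# solve_directly(3, 4, 5) gives the same answer but without the trace.
```

The extra bookkeeping is the analog of the additional computation time a reasoning model spends working through a problem before it responds.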

Business Standard
29-04-2025
- Business
- Business Standard
Alibaba launches Qwen3 AI, claims it's better than DeepSeek R1: Details
Alibaba Group Holding on Tuesday unveiled Qwen3, the third generation of its open-source artificial intelligence (AI) model series, raising the stakes in an increasingly competitive Chinese and global AI market. The Qwen3 family boasts faster processing speeds and expanded multilingual capabilities compared to other AI models, including DeepSeek-R1 and OpenAI's o1.

What is the Qwen3 series?
The Qwen3 range features eight models, varying from 600 million to 235 billion parameters, each offering performance improvements, according to Alibaba's cloud computing division. Parameters, often seen as a measure of an AI model's complexity and capability, are essential for tasks such as language understanding, coding, and mathematical problem-solving.

How do Qwen3 models compare to rivals?
According to benchmark tests cited by the developers, the Qwen3-235B and Qwen3-4B models either matched or outperformed advanced competitors from both Chinese and international companies — including OpenAI's o1, Google's Gemini, and DeepSeek's R1 — particularly in instruction following, coding support, text generation, mathematical problem-solving, and complex reasoning. "Qwen3 represents a significant milestone in our journey towards artificial general intelligence and artificial superintelligence," the Qwen team said, highlighting that enhanced pre-training and reinforcement learning had resulted in a marked leap in the models' intelligence. "Notably, our smaller MoE model, Qwen3-30B-A3B, surpasses QwQ-32B, and even the compact Qwen3-4B rivals the performance of the much larger Qwen2.5-72B-Instruct," the company added in a blog post on the launch.

Qwen3 introduces 'hybrid reasoning' capability
One of the standout features of the Qwen3 series is its hybrid reasoning capability. Users can select between a slower but deeper "thinking" mode for complex tasks and a faster "non-thinking" mode for quicker, simpler responses.
This flexibility aims to cater to diverse user needs, from casual interactions to advanced problem-solving. In contrast, DeepSeek-R1 primarily uses Chain-of-Thought (CoT) reasoning, a method where the model generates a sequence of thought steps or reasoning processes before providing a final answer. Training for the Qwen3 models involved 36 trillion tokens across 119 languages and dialects, tripling the language scope achieved by its predecessor, Qwen2.5. This expansion is expected to significantly enhance the models' ability to understand and generate multilingual content.

Where and how to use Qwen3?
The new Qwen3 models are available for download on platforms such as Hugging Face, ModelScope, Kaggle, and Microsoft's GitHub. Alibaba recommends deployment using frameworks like SGLang and vLLM, while users who prefer local integration can turn to tools such as Ollama, LMStudio, MLX, and KTransformers.

Global AI race
The release of Qwen3 arrives at a time when the global AI landscape is witnessing a surge of new developments. Baidu recently unveiled two upgraded models, and DeepSeek's R2 launch is also anticipated soon.
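The Chain-of-Thought pattern described above can be sketched in a few lines of plain Python. The prompt wording and the 'Answer:' marker are assumptions chosen for illustration, not DeepSeek's or Alibaba's actual templates: the model is asked to lay out numbered reasoning steps, and the caller strips those steps off to surface only the final answer.

```python
# Hypothetical CoT scaffolding: request numbered reasoning steps, then
# discard them so only the final answer reaches the user.
def build_cot_prompt(question: str) -> str:
    return (
        f"Question: {question}\n"
        "Think step by step, numbering each step, then write a line "
        "starting with 'Answer:' that gives only the final answer."
    )

def extract_answer(model_output: str) -> str:
    # Keep the text after the last 'Answer:' marker, dropping the
    # intermediate reasoning trace.
    return model_output.rsplit("Answer:", 1)[-1].strip()

raw = "1. 7 * 6 = 42\n2. 42 + 8 = 50\nAnswer: 50"
print(extract_answer(raw))  # prints "50"
```

Qwen3's "non-thinking" mode skips this intermediate trace entirely, which is what makes it faster for simple requests.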


South China Morning Post
29-04-2025
- Business
- South China Morning Post
Alibaba unveils Qwen3 AI models that it says outperform DeepSeek R1
The Qwen3 family consists of eight models, ranging from 600 million parameters to 235 billion, with enhancements across all models, according to the Qwen team at Alibaba's cloud computing unit. Alibaba owns the South China Morning Post. In AI, parameters are a measurement of the variables present during model training. They serve as an indicator of sophistication: larger parameter sizes typically suggest greater capacity. Benchmark tests cited by Alibaba revealed that models such as Qwen3-235B and Qwen3-4B matched or exceeded the performance of advanced models from both domestic and overseas competitors – including OpenAI's o1, Google's Gemini and DeepSeek's R1 – in areas like instruction following, coding assistance, text generation, mathematical skills and complex problem solving. The launch of Qwen3, which was anticipated this month as previously reported by the Post, is expected to solidify Alibaba's position as a leading provider of open-source models. With over 100,000 derivative models built upon it, Qwen is currently the world's largest open-source AI ecosystem, surpassing Facebook parent Meta Platforms' Llama community.