WEKA Debuts NeuralMesh Axon For Exascale AI Deployments
Korea Herald | July 8, 2025
New Offering Delivers a Unique Fusion Architecture That's Being Leveraged by Industry-Leading AI Pioneers Like Cohere, CoreWeave, and NVIDIA to Deliver Breakthrough Performance Gains and Reduce Infrastructure Requirements For Massive AI Training and Inference Workloads
PARIS and CAMPBELL, Calif., July 8, 2025 /PRNewswire/ -- From RAISE SUMMIT 2025: WEKA unveiled NeuralMesh Axon, a breakthrough storage system that leverages an innovative fusion architecture designed to address the fundamental challenges of running exascale AI applications and workloads. NeuralMesh Axon seamlessly fuses with GPU servers and AI factories to streamline deployments, reduce costs, and significantly enhance AI workload responsiveness and performance, transforming underutilized GPU resources into a unified, high-performance infrastructure layer.
Building on the company's recently announced NeuralMesh storage system, the new offering enhances its containerized microservices architecture with powerful embedded functionality, enabling AI pioneers, AI cloud and neocloud service providers to accelerate AI model development at extreme scale, particularly when combined with NVIDIA AI Enterprise software stacks for advanced model training and inference optimization. NeuralMesh Axon also supports real-time reasoning, with significantly improved time-to-first-token and overall token throughput, enabling customers to bring innovations to market faster.
AI Infrastructure Obstacles Compound at Exascale
Performance is make-or-break for large language model (LLM) training and inference workloads, especially when running at extreme scale. Organizations that run massive AI workloads on traditional storage architectures, which rely on replication-heavy approaches, waste NVMe capacity, face significant inefficiencies, and struggle with unpredictable performance and resource allocation.
The reason? Traditional architectures weren't designed to process and store massive volumes of data in real time. They create latency and bottlenecks in data pipelines and AI workflows that can cripple exascale AI deployments. Underutilized GPU servers and outdated data architectures turn premium hardware into idle capital, resulting in costly downtime for training workloads.

Inference workloads struggle with memory-bound barriers, including key-value (KV) caches and hot data, resulting in reduced throughput and increased infrastructure strain. Limited KV cache offload capacity creates data access bottlenecks and complicates resource allocation for incoming prompts, directly impacting operational expenses and time-to-insight.

Many organizations are transitioning to NVIDIA accelerated compute servers, paired with NVIDIA AI Enterprise software, to address these challenges. However, without modern storage integration, they still encounter significant limitations in pipeline efficiency and overall GPU utilization.
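To make the memory-bound barrier concrete, here is a rough, illustrative estimate of how quickly a KV cache outgrows GPU memory; the model dimensions below are hypothetical examples, not figures from WEKA or NVIDIA:

```python
# Illustrative KV-cache sizing for a transformer LLM (hypothetical dimensions).
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, dtype_bytes=2):
    # Keys and values are both cached, hence the factor of 2.
    return 2 * layers * kv_heads * head_dim * seq_len * batch * dtype_bytes

# Example: a 70B-class model with grouped-query attention at a 128K context.
per_seq = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128,
                         seq_len=128_000, batch=1)
print(f"{per_seq / 1e9:.0f} GB per sequence")            # ~42 GB
print(f"{32 * per_seq / 1e12:.1f} TB for 32 sequences")  # ~1.3 TB, far beyond HBM
```

At that scale, serving concurrent long-context prompts forces the KV cache off the GPU, which is why offload capacity and bandwidth become the gating factors for inference throughput.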
Built For The World's Largest and Most Demanding Accelerated Compute Environments
To address these challenges, NeuralMesh Axon's high-performance, resilient storage fabric fuses directly into accelerated compute servers by leveraging local NVMe, spare CPU cores, and the servers' existing network infrastructure. This unified, software-defined compute and storage layer delivers consistent microsecond latency for both local and remote workloads, outpacing traditional network file protocols like NFS.
Additionally, when leveraging WEKA's Augmented Memory Grid capability, it can provide near-memory speeds for KV cache loads at massive scale. Unlike replication-heavy approaches that squander aggregate capacity and collapse under failures, NeuralMesh Axon's unique erasure coding design tolerates up to four simultaneous node losses, sustains full throughput during rebuilds, and enables predefined resource allocation across the existing NVMe, CPU cores, and networking resources—transforming isolated disks into a memory-like storage pool at exascale and beyond while providing consistent low latency access to all addressable data.
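The capacity argument can be sketched with simple arithmetic. The 16+4 stripe below is an assumption chosen to match the stated tolerance of four simultaneous node losses; the announcement does not disclose WEKA's exact layout:

```python
# Usable capacity: replication vs. k+m erasure coding (illustrative only).
def replication_efficiency(copies):
    return 1 / copies            # one usable copy out of N stored

def erasure_efficiency(k, m):
    return k / (k + m)           # k data shards out of k+m total shards

print(f"3x replication: {replication_efficiency(3):.0%} usable, tolerates 2 losses")
print(f"16+4 erasure:   {erasure_efficiency(16, 4):.0%} usable, tolerates 4 losses")
```

Under these assumptions, erasure coding more than doubles usable capacity while surviving more simultaneous failures, which is the trade-off the announcement is pointing at.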
Cloud service providers and AI innovators operating at exascale require infrastructure that can match the exponential growth in model complexity and dataset sizes. NeuralMesh Axon is designed for organizations at the forefront of AI innovation that need immediate, extreme-scale performance rather than gradual scaling over time: AI cloud providers and neoclouds building AI services, regional AI factories, major cloud providers developing AI solutions for enterprise customers, and large enterprises deploying demanding AI inference and training workloads that must scale quickly while optimizing their infrastructure investments to support rapid innovation cycles.
Delivering Game-Changing Performance for Accelerated AI Innovation
Early adopters, including Cohere, the industry's leading security-first enterprise AI company, are already seeing transformational results.
Cohere is among WEKA's first customers to deploy NeuralMesh Axon to power its AI model training and inference workloads. Faced with high innovation costs, data transfer bottlenecks, and underutilized GPUs, Cohere first deployed NeuralMesh Axon in the public cloud to unify its AI stack and streamline operations.
"For AI model builders, speed, GPU optimization, and cost-efficiency are mission-critical. That means using less hardware, generating more tokens, and running more models—without waiting on capacity or migrating data," said Autumn Moulder, vice president of engineering at Cohere. "Embedding WEKA's NeuralMesh Axon into our GPU servers enabled us to maximize utilization and accelerate every step of our AI pipelines. The performance gains have been game-changing: Inference deployments that used to take five minutes can occur in 15 seconds, with 10 times faster checkpointing. Our team can now iterate on and bring revolutionary new AI models, like North, to market with unprecedented speed."
To improve training and help develop North, Cohere's secure AI agents platform, the company is deploying WEKA's NeuralMesh Axon on CoreWeave Cloud, creating a robust foundation to support real-time reasoning and deliver exceptional experiences for Cohere's end customers.
"We're entering an era where AI advancement transcends raw compute alone—it's unleashed by intelligent infrastructure design. CoreWeave is redefining what's possible for AI pioneers by eliminating the complexities that constrain AI at scale," said Peter Salanki, CTO and co-founder at CoreWeave. "With WEKA's NeuralMesh Axon seamlessly integrated into CoreWeave's AI cloud infrastructure, we're bringing processing power directly to data, achieving microsecond latencies that reduce I/O wait time and deliver more than 30 GB/s read, 12 GB/s write, and 1 million IOPS to an individual GPU server. This breakthrough approach increases GPU utilization and empowers Cohere with the performance foundation they need to shatter inference speed barriers and deliver advanced AI solutions to their customers."
"AI factories are defining the future of AI infrastructure built on NVIDIA accelerated compute and our ecosystem of NVIDIA Cloud Partners," said Marc Hamilton, vice president of solutions architecture and engineering at NVIDIA. "By optimizing inference at scale and embedding ultra-low latency NVMe storage close to the GPUs, organizations can unlock more bandwidth and extend the available on-GPU memory for any capacity. Partner solutions like WEKA's NeuralMesh Axon deployed with CoreWeave provide a critical foundation for accelerated inferencing while enabling next-generation AI services with exceptional performance and cost efficiency."
The Benefits of Fusing Storage and Compute For AI Innovation
NeuralMesh Axon delivers immediate, measurable improvements for AI builders and cloud service providers operating at exascale.
"The infrastructure challenges of exascale AI are unlike anything the industry has faced before. At WEKA, we're seeing organizations struggle with low GPU utilization during training and GPU overload during inference, while AI costs spiral into millions per model and agent," said Ajay Singh, chief product officer at WEKA. "That's why we engineered NeuralMesh Axon, born from our deep focus on optimizing every layer of AI infrastructure from the GPU up. Now, AI-first organizations can achieve the performance and cost efficiency required for competitive AI innovation when running at exascale and beyond."
Availability
NeuralMesh Axon is currently available in limited release for large-scale enterprise AI and neocloud customers, with general availability scheduled for fall 2025. For more information, visit www.weka.io.
About WEKA
WEKA is transforming how organizations build, run, and scale AI workflows through NeuralMesh™, its intelligent, adaptive mesh storage system. Unlike traditional data infrastructure, which becomes more fragile as AI environments expand, NeuralMesh becomes faster, stronger, and more efficient as it scales, growing with your AI environment to provide a flexible foundation for enterprise and agentic AI innovation. Trusted by 30% of the Fortune 50 and the world's leading neoclouds and AI innovators, NeuralMesh maximizes GPU utilization, accelerates time to first token, and lowers the cost of AI innovation. Learn more at www.weka.io, or connect with us on LinkedIn and X.
Related Articles

Canada's Cohere taps Seoul to lead AI growth in Asia-Pacific
Korea Herald | 20 hours ago
Cohere, a Toronto-based security-first artificial intelligence company, announced Tuesday its expansion into the Asia-Pacific region, unveiling plans to establish a new office in Seoul. The move underscores Cohere's commitment to delivering secure, cutting-edge AI solutions to enterprises and governments across diverse languages and markets. The Seoul office will serve as a strategic hub for business growth and innovation throughout the region, reinforcing Korea's rising importance in the global AI landscape.

"We are excited to build a strong local team, support forward-looking clients and collaborate with government bodies to deliver meaningful and safe AI solutions that drive economic productivity," said Cohere CEO and co-founder Aidan Gomez. To lead the initiative, the company has appointed Andrew Chang as president of Cohere APAC; he previously held executive roles at leading global tech companies such as Google Cloud, Microsoft, IBM and Samsung SDS.

Cohere has already begun hiring top talent in Korea to accelerate the adoption of secure, multilingual AI solutions across regulated sectors that handle sensitive data, such as finance, health care and manufacturing. The company will also launch a research grants program, Cohere Labs, and strengthen partnerships with leaders like LG CNS, a Korean information technology solutions provider under LG Group, with which it recently secured a major public sector AI project with the Ministry of Foreign Affairs.

Cohere, founded in 2019 by three former Google Brain researchers, has raised a total of $970 million from major investors, including Nvidia and AMD. Its current valuation is estimated at around $5.5 billion.

Seoul shares reach near 4-yr high on strong chip gains
Korea Herald | 6 days ago
South Korean stocks rose for the fourth consecutive session Thursday to climb to a near four-year high, driven by overnight gains in US artificial intelligence chip giant Nvidia that lifted semiconductor shares. The local currency gained against the US dollar.

The benchmark Korea Composite Stock Price Index added 49.49 points, or 1.58 percent, to close at 3,183.23, marking the highest closing level since Sept. 7, 2021, when the index finished at 3,187.42. The KOSPI also extended its winning streak to a fourth straight session, which began Monday. Trade volume was moderate at 589.8 million shares worth 14 trillion won (US$10.2 billion), with gainers beating decliners 597 to 287.

Foreign and institutional investors led the rally, scooping up a net 445.8 billion won and 41.6 billion won worth of stocks, respectively. Individuals dumped a net 560 billion won. In the US market, Nvidia became the world's first company to hit $4 trillion in market value on Wednesday, pushing up the tech-heavy Nasdaq to an all-time high.

In Seoul, semiconductor and internet shares were among the biggest winners. SK hynix, a key supplier to Nvidia, jumped 5.69 percent to 297,000 won, and Samsung Electronics, the world's largest memory chip maker, gained 0.99 percent to 61,000 won. Naver, the No. 1 internet platform company, increased 2.17 percent to 259,500 won, and its rival Kakao climbed 0.50 percent to 60,800 won.

Pharmaceutical stocks also finished in positive territory, with industry leader Samsung Biologics surging 6.09 percent to 1,080,000 won, while SK biopharm advanced 5.54 percent to 99,000 won. Samyang Foods, best known for its global hit Buldak spicy ramyeon, added 1.28 percent to a record 1,498,000 won.

However, Hybe, the management agency behind global superstars BTS, fell 0.9 percent to 274,500 won as its founder, Bang Si-hyuk, is set to face criminal charges for allegedly engaging in illegal transactions ahead of the company's initial public offering in 2020. The local currency was quoted at 1,370.0 won against the greenback at 3:30 p.m., up 5.0 won from the previous session. (Yonhap)

LG CNS debuts new LLM beating GPT-4o
Korea Herald | 6 days ago
LG CNS, a Korean information technology solutions provider under LG Group, has partnered with Canadian artificial intelligence startup Cohere to jointly develop a new inference large language model supporting 23 languages, including Korean and English. With 111 billion parameters, the model outperformed leading global LLMs such as OpenAI's GPT-4o and Anthropic's Claude 3.7 Sonnet, the Korean company said.

The latest model comes just two months after LG CNS and Cohere unveiled a 7-billion-parameter, Korean-specialized lightweight model. Cohere is considered a leading challenger to OpenAI in the global LLM market. According to LG CNS, the newly released inference LLM is optimized for complex reasoning tasks, which the company aims to utilize for "agentic AI" services, where AI autonomously assesses situations and executes multistep tasks.

"With differentiated AI capabilities and competitiveness, we aim to provide specialized agentic AI services tailored to client businesses and become the leading partner in advancing enterprise experience," said Kim Tae-hoon, senior vice president and head of the AI & cloud division at LG CNS.

With the addition of the new large-scale and lightweight models co-developed with Cohere and the Exaone model developed by LG AI Research, LG CNS said it is prepared with a full LLM lineup to support customized agentic AI services for clients. To develop the new LLM, LG CNS said it integrated its extensive IT expertise and AI capabilities into Cohere's enterprise-grade LLM, known as Command. The Command model is already in use at major global institutions, including Canada's largest bank, the Royal Bank of Canada.

LG CNS said it plans to offer the new LLM in an on-premise format -- models that run locally on a company's own servers, not in the cloud -- reducing the risk of external exposure. The company also explained that the LLM can run on just two graphics processing units, whereas models with over 100 billion parameters typically require at least four, enabling cost-effective deployment.

In benchmarking tests such as Math500 and AIME 2024, which evaluate mathematical reasoning and logic, the LLM jointly developed by LG CNS and Cohere surpassed GPT-4o, GPT-4.1 and Claude 3.7 Sonnet in both Korean and English, the company said. The model supports 23 languages, including Korean, English, Japanese, Chinese, Hebrew and Persian. In Korean-language benchmarks, it demonstrated state-of-the-art performance among on-premise LLMs.
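A back-of-the-envelope memory check shows why two GPUs can be plausible for a 111-billion-parameter model; the precisions and the 80 GB GPU size below are illustrative assumptions, as LG CNS did not disclose the deployment details:

```python
# Rough weight-memory estimate for a 111B-parameter model (assumptions only).
PARAMS = 111e9

for name, bytes_per_param in [("FP16", 2), ("INT8", 1)]:
    gb = PARAMS * bytes_per_param / 1e9
    print(f"{name}: {gb:.0f} GB -> {gb / 80:.1f}x 80 GB GPUs (weights alone)")
# FP16: 222 GB -> 2.8x 80 GB GPUs (needs 4+ once activations/KV cache are added)
# INT8: 111 GB -> 1.4x 80 GB GPUs (can fit on 2 with careful headroom)
```

Under these assumptions, quantizing weights to 8 bits is what brings a 100B-plus model within reach of a two-GPU server, consistent with the company's claim.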
