Atlas Cloud Launches High-Efficiency AI Inference Platform, Outperforming DeepSeek
NEW YORK CITY, NEW YORK / ACCESS Newswire / May 28, 2025 / Atlas Cloud, the all-in-one AI competency center for training and deploying AI models, today announced the launch of Atlas Inference, an AI inference platform that dramatically reduces GPU and server requirements, enabling faster, more cost-effective deployment of large language models (LLMs).
Atlas Inference, co-developed with SGLang, an AI inference engine, maximizes GPU efficiency by processing more tokens faster and with less hardware. When comparing DeepSeek's published performance results, Atlas Inference's 12-node H100 cluster outperformed DeepSeek's reference implementation of their DeepSeek-V3 model while using two-thirds of the servers. Atlas' platform reduces infrastructure requirements and operational costs while addressing hardware costs, which represent up to 80% of AI operational expenses.
"We built Atlas Inference to fundamentally break down the economics of AI deployment," said Jerry Tang, Atlas CEO. "Our platform's ability to process 54,500 input tokens and 22,500 output tokens per second per node means businesses can finally make high-volume LLM services profitable instead of merely break-even. I believe this will have a significant ripple effect throughout the industry. Simply put, we're surpassing industry standards set by hyperscalers by delivering superior throughput with fewer resources."
Atlas Inference's performance also exceeds major players like Amazon, NVIDIA and Microsoft, delivering up to 2.1 times greater throughput using 12 nodes compared to competitors' larger setups. It maintains sub-5-second first-token latency and 100-millisecond inter-token latency with more than 10,000 concurrent sessions, ensuring a scaled, superior experience. The platform's performance is driven by four key innovations:
Prefill/Decode Disaggregation: Separates compute-intensive operations from memory-bound processes to optimize efficiencyDeepExpert (DeepEP) Parallelism with Load Balancers: Ensures over 90% GPU utilizationTwo-Batch OverlapTechnology: Increases throughput by enabling larger batches and utilization of both compute and communication phases simultaneouslyDisposableTensor Memory Models: Prevents crashes during long sequences for reliable operation
"This platform represents a significant leap forward for AI inference," said Yineng Zhang, Core Developer at SGLang. "What we built here may become the new standard for GPU utilization and latency management. We believe this will unlock capabilities previously out of reach for the majority of the industry regarding throughput and efficiency."
Combined with a lower cost per token, linear scaling behavior, and reduced emissions compared to leading vendors, Atlas Inference provides a cost-efficient and scalable AI deployment.
Atlas Inference works with standard hardware and supports custom models, giving customers complete flexibility. Teams can upload fine-tuned models and keep them isolated on dedicated GPUs, making the platform ideal for organizations requiring brand-specific voice or domain expertise.
The platform is available immediately for enterprise customers and early-stage startups.
About Atlas Cloud
Atlas Cloud is your all-in-one AI competency center, powering leading AI teams with safe, simple, and scalable infrastructure for training and deploying models. Atlas Cloud also offers an on-demand GPU platform that delivers fast, serverless compute. Backed by Dell, HPE, and Supermicro, Atlas delivers near instant access to up to 5,000 GPUs across a global SuperCloud fabric with 99% uptime and baked-in compliance. Learn more at atlascloud.ai.
SOURCE: Atlas Cloud
press release
Hashtags

Try Our AI Features
Explore what Daily8 AI can do for you:
Comments
No comments yet...
Related Articles
Yahoo
15 minutes ago
- Yahoo
This Chinese Stock Just Launched Something That Could Be Even Bigger and More Powerful Than DeepSeek
Baidu (BIDU) shares are in focus after the company made its ERNIE large language models (LLMs) open source – effectively enabling the global AI community to use them for R&D. Note that Baidu's ERNIE 4.5 series outperforms DeepSeek's V3 model on a significant number of benchmarks across key capability categories. Elon Musk's Tesla Makes History With 'First Time That a Car Has Delivered Itself to Its Owner' This Defense Stock Could Be the Next Palantir. Should You Buy It Now? Cathie Wood Is Pounding the Table on AMD Stock. Should You Buy Shares Now? Get exclusive insights with the FREE Barchart Brief newsletter. Subscribe now for quick, incisive midday market analysis you won't find anywhere else. Still, Baidu stock is down some 19% versus its year-to-date high at the time of writing. Baidu shares stand to benefit rather significantly from management's decision to open source ERNIE models. By making its LLMs freely available under the Apache 2.0 license, BIDU invites global developers to build, fine-tune, and deploy on its architecture – accelerating adoption and ecosystem growth. Baidu's move mirrors the success of DeepSeek's open-source strategy, which reshaped China's AI landscape. The open-source release also enhances transparency and trust – potentially attracting enterprise and government partnerships. In short, the announcement positions BIDU as a foundational platform – a shift that could unlock long-term monetization and valuation upside. Baidu stock hasn't been particularly exciting for investors this year, but Miranda Zhuang – a Bank of America analyst – recommends owning it for the long term. Zhuang is bullish on BIDU shares primarily because the Beijing-headquartered firm has already launched fully autonomous vehicle operations in China. Moreover, Baidu is fully committed to expanding its robotaxi services internationally, which she's convinced will drive incremental revenue growth in the years ahead. Integrating artificial intelligence will help BIDU reinvigorate growth in its advertising business as well, according to the BofA analyst. Miranda Zhuang currently has a 'Buy' rating on Baidu shares and a price target of $100, indicating potential upside of another 18% from here. Other Wall Street firms also expect BIDU shares to extend gains as the Chinese tech behemoth continues to tap on artificial intelligence in pursuit of lower costs and higher returns. At the time of writing, the consensus rating on Baidu stock sits at 'Moderate Buy' with the mean target of about $105 indicating potential upside of some 22% from current levels. On the date of publication, Wajeeh Khan did not have (either directly or indirectly) positions in any of the securities mentioned in this article. All information and data in this article is solely for informational purposes. This article was originally published on Error in retrieving data Sign in to access your portfolio Error in retrieving data Error in retrieving data Error in retrieving data Error in retrieving data

Associated Press
18 minutes ago
- Associated Press
CyCraft Launches XecGuard: LLM Firewall for Trustworthy AI
TAIPEI, TAIWAN - Media OutReach Newswire - 1 July 2025 - CyCraft, a leading AI cybersecurity firm, today announced the global launch of XecGuard, the industry's first plug-and-play LoRA security module purpose-built to defend Large Language Models (LLMs). XecGuard's introduction marks a pivotal moment for secure, trustworthy AI, addressing the critical security challenges posed by the rapid adoption of LLMs. CyCraft Co-Founders (from left to right): Benson Wu (CEO), Jeremy Chiu (CTO), and PK Tsung (CISO) are leading the mission to build the world's most advanced AI security platform. Trustworthy AI Matters The transformative power of Large Language Models (LLMs) brings significant security uncertainty, requiring enterprises to urgently safeguard their AI models from malicious attacks like prompt injection, prompt extraction, and jailbreak attempts. Historically, AI security has been an 'optional add-on' rather than a fundamental feature, leaving valuable AI and data exposed. This oversight can compromise sensitive data, undermine service stability, and erode customer trust. CyCraft emphasizes that 'AI security must be a standard feature—not an optional add-on,' believing it's paramount for delivering stable and trustworthy intelligent services. The Imminent Need for Proactive AI Defense The need for immediate and effective AI security is more critical than ever before. As AI becomes increasingly embedded in core business operations, the attack surface expands exponentially, making proactive defenses an absolute necessity. CyCraft has leveraged its extensive 'battle-tested expertise across critical domains—including government, finance, and high-tech manufacturing' to precisely address these emerging AI-specific threats. The development of XecGuard signifies a shift from 'using AI to tackle cybersecurity challenges' to now 'using AI to protect AI' , ensuring that security and resilience are embedded from day one. 'AI security must be a standard feature—not an optional add-on,' stated Benson Wu, CEO, highlighting XecGuard's resilience and integration of experience from defending critical sectors. Jeremy Chiu, CTO and Co-Founder, emphasized, 'In the past, we used AI to tackle cybersecurity challenges; now, we're using AI to protect AI,' adding that XecGuard enables enterprises to confidently adopt AI and deliver trustworthy services. PK Tsung, CISO, concluded, 'With XecGuard, we're empowering enterprises to embed security and resilience from day one' as part of their vision for the world's most advanced AI security platform. CyCraft's Solution: XecGuard Empowers Secure AI Deployment CyCraft leads with the global launch of XecGuard, the industry's first plug-and-play LoRA security module purpose-built to defend LLMs. XecGuard provides robust protection against prompt injection, prompt extraction, and jailbreak attacks, ensuring enterprise-grade resilience for AI models. Its seamless deployment allows instant integration with any LLM without architectural modification, delivering powerful autonomous defense out of the box. XecGuard is available as a SaaS, an OpenAI-compatible LLM firewall on your cloud (e.g., AWS or Cloudflare Workers AI), or an embedded firewall for on-premises, NVIDIA-powered custom LLM servers. Rigorously validated on major open-source models like Llama 3B, Qwen3 4B, Gemma3 4B, and DeepSeek 8B, it consistently improves security resilience while preserving core performance, enabling even small models to achieve protection comparable to large commercial-grade systems. Even small models gain enterprise-level defenses, approaching large commercial-grade performance. Real-world validation through collaboration with APMIC, an NVIDIA partner, integrated XecGuard into the F1 open-source model, demonstrating an average 17.3% improvement in overall security defense scores and up to 30.1% in specific attack scenarios via LLM Red Teaming exercises. With XecGuard and the Safety LLM service, CyCraft delivers enterprise-grade AI security, accelerating the adoption of resilient and trustworthy AI across industries, empowering organizations to deploy AI securely, protect sensitive data, and drive innovation with confidence. To learn more about how XecGuard can protect your LLMs and to request a demo, visit: Hashtag: #CyCraft #LLMFirewall #AISecurity The issuer is solely responsible for the content of this announcement. About CyCraft Technology CyCraftis a leading AI-driven cybersecurity company in the Asia-Pacific region. Trusted by hundreds of organizations in defense, finance, and semiconductor industries, our AI is designed to prevent, preempt, and protect against cyber threats. Our expertise has been recognized by top-tier institutions like Gartner and IDC and showcased at prestigious global conferences, including Black Hat, DEFCON, EMNLP, and Code Blue.

Associated Press
3 hours ago
- Associated Press
Imagene AI Hits $45M in Total Funding After Completing $23M Series B Led by Larry Ellison
Imagene AI delivers a living intelligence engine for adaptive, insight-driven precision medicine. MIAMI, FL / ACCESS Newswire / July 1, 2025 / Imagene AI, a pioneer in multi‑modal foundation models for precision medicine, today announced the successful close of a $23 million Series B financing round led by Oracle Chairman and CTO Larry Ellison. This latest infusion brings Imagene AI's total capital raised to $45 million under terms that reflect the company's accelerating momentum and expanding strategic footprint. The round also included participation from existing investor Aguras Pathology Investments. 'We are honored to have the continued support of one of the world's foremost technology visionaries,' said Dean Bitan, CEO and Co‑founder of Imagene AI. 'We're developing the core intelligence that allows precision medicine to function as an adaptive, insight-driven system - from real-time trial design to rapid patient identification and more informed clinical decisions. Our goal is to make every trial more responsive, every insight more actionable, and every patient journey more personalized.' 'Imagene AI's ability to unite imaging, omics, and clinical data in a single foundation model is exactly the kind of breakthrough we believe will drive the next generation of drugs and diagnostics,' said Larry Ellison, Oracle Chairman and CTO. 'Their approach opens the door to a more adaptive, biologically fluent era of precision medicine.' Powering Living Precision Medicine Through Adaptive Intelligence At the core of Imagene AI is a dynamic intelligence infrastructure, an orchestration layer that continuously evolves with every stream of biological and clinical data. Histopathology, molecular profiles, and clinical context are routed in real-time through our proprietary foundation models, large language models, and biology-tuned analysis continuous loop ensures insights remain current and biologically grounded, empowering clinicians and researchers with personalized, up-to-date guidance at every stage of the trial. This momentum builds on a series of important milestones. Imagene AI's state-of-the-art digital pathology foundation model, trained on more than 1.5 million biopsy images, has demonstrated benchmark performance in key research tasks, particularly when working with limited data, a challenge common in real-world scenarios. The company launched the OI Suite platform to a broader community of users, including non-AI experts, delivering capabilities for biomarker discovery, indication expansion, and patient identification across multiple therapeutic areas. Additionally, Imagene AI is expanding strategic collaborations, such as its partnership with Tempus, to bring rapid predictive assays to more clinicians and patients, guiding treatment decisions with greater precision. Contact InformationAvital Rabani VP of Marketing SOURCE: Imagene press release