Kioxia AiSAQ Improves AI Inference With Lower DRAM Costs

Forbes, 11 hours ago
Artificial Intelligence
In April this year, Kioxia's Rory Bolt briefed me on Kioxia's AiSAQ, an open-source project intended to promote the expanded use of SSDs in RAG AI solutions. The focus in AI is moving from generating foundational models with massive and expensive training to cost-effective and scalable ways of creating inference solutions that can solve real-world problems.
Retrieval-Augmented Generation (RAG) is an approach to AI that combines traditional information retrieval systems with large language models (LLMs). RAG enhances the performance of LLMs by allowing them to access and incorporate information from external knowledge sources, such as databases, websites, and internal documents, before generating a response. This approach helps LLMs produce more accurate, contextually relevant, and up-to-date information, especially when dealing with specific domains or real-time data.
Kioxia has used AI to improve the output of its NAND fabs since 2017, mostly using machine vision to monitor trends and defect rates. In 2020, Kioxia used AI to generate the world's first AI-designed manga, Phaedo, drawing on manga artwork and stories based on Osamu Tezuka's work.
I was told that although larger data centers feed data to their AI models from hard drives, many in-house solutions train using data on SSDs. These solutions often start with foundational LLMs created from very large data sets and use RAG, with in-house and perhaps more up-to-date data, to tune the foundational model for a particular application and to avoid hallucinations. The image below illustrates how a database can be used to tune the original LLM.
How Retrieval-Augmented Generation works to improve LLM Inference
Here the customer query is answered using the LLM as well as domain-specific and up-to-date information in a vector database. Such RAG solutions can keep the database index and vectors all in DRAM, but that approach can use a lot of memory, making it very expensive, particularly for large databases.
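To make the retrieval step concrete, here is a minimal sketch of the flow described above, in plain Python. It is an illustration only: the three-dimensional vectors, the `STORE` contents, and the helper names are hypothetical, the vector store is a small in-memory list, and no real embedding model or LLM is called.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query_vec, store, k=2):
    """Return the k passages whose vectors are most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def build_prompt(question, passages):
    """Prepend the retrieved passages to the question before it goes to the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}"

# Hypothetical toy vector database: (embedding, passage) pairs.
STORE = [
    ([1.0, 0.0, 0.0], "Passage about SSD storage."),
    ([0.0, 1.0, 0.0], "Passage about DRAM pricing."),
    ([0.9, 0.1, 0.0], "Passage about NAND flash."),
]

if __name__ == "__main__":
    query_vec = [1.0, 0.0, 0.0]  # stand-in for an embedded user query
    passages = retrieve(query_vec, STORE, k=2)
    print(build_prompt("How is the data stored?", passages))
```

In a production system the sorted scan would be replaced by an approximate nearest neighbor index, which is where the DRAM cost discussed here comes from.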
Microsoft developed DiskANN, which moved the bulk of the vector DB content to SSDs. This reduced the DRAM footprint required for the DB, enabling greater scaling of vector DBs. It is used in products such as Azure Vector DB and Cosmos DB.
Kioxia's All-in-Storage ANNS with Product Quantization, or AiSAQ, completes the move of database vectors into storage, further reducing the DRAM requirements. These three approaches are represented in the drawing below.
Comparison of database DRAM requirements for DRAM and SSD-based RAG architectures
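A rough back-of-the-envelope model shows why the three layouts differ so much in DRAM use. This is a sketch under stated assumptions, not Kioxia's or Microsoft's sizing method: it assumes float32 vectors, a graph index with 4-byte neighbor IDs, and product-quantized codes of a fixed size per vector, and the layout names, default values, and `fixed_overhead` parameter are all hypothetical.

```python
def dram_bytes(layout, num_vectors, dim, pq_bytes=32, graph_degree=64, fixed_overhead=0):
    """Estimate DRAM needed by a vector index under three simplified layouts.

    Assumed sizes (illustrative only):
      - full vectors: dim float32 values, 4 bytes each
      - graph index: graph_degree neighbor IDs per vector, 4 bytes each
      - PQ codes: pq_bytes per vector
    """
    full_vectors = num_vectors * dim * 4
    graph = num_vectors * graph_degree * 4
    pq_codes = num_vectors * pq_bytes

    if layout == "all_dram":
        # Index and full vectors both live in DRAM.
        return full_vectors + graph + fixed_overhead
    if layout == "pq_in_dram":
        # DiskANN-style: full vectors and graph on SSD, compressed codes in DRAM.
        return pq_codes + fixed_overhead
    if layout == "all_in_storage":
        # AiSAQ-style: compressed codes also on SSD, so DRAM does not grow with num_vectors.
        return fixed_overhead
    raise ValueError(f"unknown layout: {layout}")

if __name__ == "__main__":
    n, d = 1_000_000_000, 128  # hypothetical billion-vector database
    for layout in ("all_dram", "pq_in_dram", "all_in_storage"):
        gib = dram_bytes(layout, n, d) / 2**30
        print(f"{layout:>15}: {gib:,.1f} GiB")
```

The point of the model is the scaling behavior: the first two layouts grow linearly with the number of vectors, while the all-in-storage layout is bounded by a fixed overhead.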
Kioxia says that this approach enables greater scalability for RAG workflows and thus better accuracy in the models. The image below shows the significant reduction in DRAM required for large databases compared with the DRAM-based and DiskANN approaches, as well as the improved query accuracy.
AiSAQ reduces DRAM costs, improves speed and inference accuracy
In early July, Kioxia announced further improvements to AiSAQ. The new open-source release adds flexible controls that let system architects define the balance point between search performance and the number of vectors, which are opposing factors given the fixed capacity of SSD storage in a system. As a result, architects of RAG systems can fine-tune that balance for specific workloads and their requirements, without any hardware modifications.
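One hypothetical way to picture this trade-off is as a capacity calculation. The sketch below assumes that a search-optimized layout stores the compressed codes of each node's neighbors next to the node on the SSD, which costs more bytes per vector and so leaves room for fewer vectors. That layout detail, the function names, and all the numbers are assumptions made for illustration; the release itself is not described at this level in the briefing.

```python
def bytes_per_node(dim, degree, pq_bytes, inline_neighbor_codes):
    """SSD bytes used by one vector under a simplified, hypothetical node layout."""
    base = dim * 4 + degree * 4  # float32 vector plus 4-byte neighbor IDs
    if inline_neighbor_codes:
        # Search-optimized: keep every neighbor's PQ code beside the node.
        return base + degree * pq_bytes
    # Capacity-optimized: keep only the node's own PQ code.
    return base + pq_bytes

def vectors_that_fit(ssd_capacity_bytes, node_bytes):
    """Number of vectors a fixed SSD capacity can hold at a given per-node size."""
    if node_bytes <= 0:
        raise ValueError("node_bytes must be positive")
    return ssd_capacity_bytes // node_bytes

if __name__ == "__main__":
    capacity = 10**12  # hypothetical 1 TB of SSD space reserved for the index
    for inline in (True, False):
        size = bytes_per_node(dim=128, degree=64, pq_bytes=32, inline_neighbor_codes=inline)
        count = vectors_that_fit(capacity, size)
        print(f"inline neighbor codes={inline}: {size} B/vector, {count:,} vectors")
```

Under these assumptions the same drive holds several times more vectors in the capacity-oriented layout, at the price of extra reads per search step.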
Kioxia's AiSAQ allows more scalable RAG AI inference systems by moving database vectors entirely into storage, thus avoiding DRAM growth with increasing database sizes.