Technology that raises AI efficiency and reduces reliance on memory
Still at the paper stage, with some saying the ‘impact is overstated’
Lower entry barriers for companies could even lift demand
The memory semiconductor market has been unsettled since Google published a paper on TurboQuant, a technology that sharply reduces the memory used by artificial intelligence (AI). Shares of major chipmakers such as Samsung Electronics and SK hynix fell steeply for a second straight day on the 27th.
TurboQuant, unveiled in a paper by Google Research on the 24th (local time), is a technique that makes AI models more efficient and reduces their memory usage. It compresses the KV cache, a kind of ‘interim memory’ that stores the preceding conversation in large language models (LLMs), without degrading performance. By compressing the context data that is crucial for AI inference, it relieves bottlenecks and boosts computation speed while minimizing accuracy loss, thereby maintaining performance in coding, Q&A, and text summarization.
According to Google’s findings, the technique cuts memory usage to as little as one-sixth and increases data processing speed by up to eightfold. The TurboQuant research team comprised eight members, including Google’s Amir Zandieh and Vahab Mirrokni, as well as Insu Han, a professor at KAIST.
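To give a sense of scale for the ‘one-sixth’ figure, the following is a rough back-of-the-envelope sketch, not taken from the paper. It assumes the common approximation that a KV cache stores one key and one value vector per layer for every token, and it uses a hypothetical model shape (32 layers, 8 KV heads, head dimension 128, 16-bit values, a 128,000-token context) chosen purely for illustration. The function name `kv_cache_bytes` is likewise an assumption, and the code says nothing about how TurboQuant itself achieves its compression.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bits_per_value=16):
    """Approximate KV cache size in bytes for a single sequence.

    Assumes one key vector and one value vector are stored per layer
    for every token in the context.
    """
    values = 2 * layers * kv_heads * head_dim * seq_len  # 2 = keys + values
    return values * bits_per_value / 8


# Hypothetical model shape, chosen only for illustration.
baseline = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=128_000)

# Apply the "as little as one-sixth" figure reported in the article.
compressed = baseline / 6

GIB = 1024 ** 3
print(f"baseline KV cache:   {baseline / GIB:.2f} GiB")
print(f"at one-sixth size:   {compressed / GIB:.2f} GiB")
```

Under these assumed numbers, a cache of roughly 15.6 GiB per long conversation would shrink to about 2.6 GiB, which is why investors read the paper as a potential drag on memory demand for inference.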
Shares of memory-chip makers wobbled on expectations that TurboQuant, if applied in practice, would cut the memory capacity required for AI inference workloads. The sell-off was not limited to Samsung Electronics and SK hynix: on the New York stock market on the 26th (local time), Micron plunged more than 7% and SanDisk, the largest U.S. NAND memory company, more than 11%.
Some even say TurboQuant delivered a shock comparable to China’s LLM DeepSeek. They argue it is a breakthrough that could reduce the AI industry’s dependence on memory. Cloudflare CEO Matthew Prince likened TurboQuant to “Google’s DeepSeek,” saying it “shows there is still room for improvement in the AI inference industry in areas such as speed, memory usage, and power consumption.”
Others argue that the impact of TurboQuant, which is still only at the paper-publication stage, is being overstated. If AI memory efficiency improves and costs fall, they say, the barrier to entry for companies could drop, which could in turn increase demand for AI memory. Tech outlet TechCrunch noted that algorithmic techniques that operate only in the inference phase cannot solve the massive memory demand of AI training. SemiAnalysis analyst Ray Wang told CNBC that “as AI hardware performance improves and AI models become more powerful, more hardware will be needed.”
Reuters Yonhap News