AMD’s Ryzen AI Halo recently went on sale for $4,000, sparking an interesting debate about how it compares to Nvidia’s slightly pricier DGX Spark offering.
The configuration that the Ryzen AI Halo offers, however, has been on the market for a few months now, and while most OEMs and enterprise providers are selling that same configuration, Shenzhen-based memory and storage company Longsys has taken things a step further.
The storage giant demonstrated a localized version of a 397B-parameter AI model running on its own version of the Ryzen AI Halo, featuring the same 16-core Ryzen AI Max+ 395 and 128GB of RAM configuration.
How was the Ryzen AI Max+ 395 able to run such a massive model with only 128GB of RAM?
While the model being run was not explicitly stated, it seems to be a customized version derived from Alibaba’s Qwen 3.5 397B (A17B), a multimodal foundation model that leverages a Mixture-of-Experts (MoE) approach, the same design that made the original DeepSeek such a potent challenger.
Even with INT4 quantization, the model’s memory requirements far exceed what the demo device has on offer: only 96GB of VRAM is available to the GPU in a 128GB unified configuration, versus the estimated 200-250GB of VRAM the model needs to run.
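The gap can be checked with back-of-the-envelope arithmetic. The sketch below is only an estimate: it counts weight storage alone (no KV cache, activations, or quantization metadata), treats a gigabyte as 10^9 bytes, and assumes the "A17B" suffix means roughly 17 billion parameters are active per token, as in Qwen's usual naming. The function name is illustrative, not taken from any library.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Ignores KV cache, activations, and quantization scales, so the real
    footprint is higher than this figure.
    """
    return params_billion * bits_per_weight / 8


GPU_VISIBLE_GB = 96  # VRAM available to the GPU in the 128GB unified setup

total_int4 = weight_memory_gb(397, 4)   # all experts, INT4
active_int4 = weight_memory_gb(17, 4)   # assumed ~17B active params per token
shortfall = total_int4 - GPU_VISIBLE_GB

print(f"All weights at INT4:    {total_int4:.1f} GB")
print(f"Active weights at INT4: {active_int4:.1f} GB")
print(f"Shortfall vs 96 GB:     {shortfall:.1f} GB")
```

Under these assumptions the full weight set lands at the low end of the 200-250GB estimate, while the weights touched for any single token are a small fraction of it, which is what makes keeping only part of the model in memory plausible.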
The secret sauce is Longsys’s recently unveiled custom SPU and iSA configuration, which can compress data in real time. The company says this lets it fit as much as twice the amount of data in storage drives of up to 128GB, and it leverages a caching layer that reduces DRAM requirements considerably.
The approach offloads experts that are not in active use to a large, fast storage buffer, from which the AI chip can reload them when needed.
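The general idea of keeping only recently used experts in memory and fetching the rest from storage can be sketched as a small least-recently-used cache. This is a generic illustration, not Longsys’s implementation: the `ExpertCache` class, the `fake_load` function standing in for a storage read, and the LRU eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict
from typing import Callable


class ExpertCache:
    """Keeps at most `capacity` experts resident; others are loaded on demand."""

    def __init__(self, capacity: int, load_from_storage: Callable[[int], object]):
        self.capacity = capacity
        self.load_from_storage = load_from_storage  # hypothetical storage read
        self.resident = OrderedDict()  # expert_id -> weights, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, expert_id: int) -> object:
        if expert_id in self.resident:
            # Already in memory: mark as most recently used.
            self.resident.move_to_end(expert_id)
            self.hits += 1
            return self.resident[expert_id]
        # Not in memory: fetch from the storage buffer.
        self.misses += 1
        weights = self.load_from_storage(expert_id)
        self.resident[expert_id] = weights
        if len(self.resident) > self.capacity:
            # Evict the least recently used expert to make room.
            self.resident.popitem(last=False)
        return weights


def fake_load(expert_id: int) -> str:
    """Stand-in for reading (and decompressing) an expert from storage."""
    return f"weights-for-expert-{expert_id}"


cache = ExpertCache(capacity=2, load_from_storage=fake_load)
for expert_id in [0, 1, 0, 2, 1]:  # experts chosen by a hypothetical router
    cache.get(expert_id)

print(cache.hits, cache.misses, list(cache.resident))
```

In a scheme like this, every miss costs a storage read, so throughput depends on how often the router reuses experts that are already resident and on how fast the storage buffer can deliver the rest.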
In a press release, Longsys claimed its…