LoCaLUT: Harnessing Capacity–Computation Tradeoffs for LUT-Based Inference in DRAM-PIM
Jan 31, 2026 by Junguk Hong, Changmin Shin, Sukjin Kim, Si Ung Noh, Taehee Kwon, Seong-Yeol Park, Hanjun Kim, Youngsok Kim, Jinho Lee (International Symposium on High-Performance Computer Architecture)
DOI 10.1109/HPCA68181.2026.11408523
We turned DRAM-PIM’s memory abundance into compute by packing many MACs into LUT lookups, then shrank and streamed those LUTs to fit real devices: canonicalization kills redundant entries, a tiny remap LUT recovers the canonical forms, and slice streaming pulls only the useful columns into the buffer. The result, LoCaLUT, makes low-bit DNN inference on UPMEM-style PIMs actually fast and scalable without adding any logic. Check the repo if you want to steal the idea.
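A minimal sketch of the core trick, not the paper's code: replace low-bit multiply-accumulate with pure table lookups, then halve the table by storing only non-negative "canonical" weight rows and reapplying the sign with a small remap step. All names (`mac_via_lut`, `mac_via_canonical_lut`, the 4-bit width) are illustrative assumptions.

```python
# Toy LUT-based MAC for low-bit operands, with a sign canonicalization.
BITS = 4                       # assumed operand width
OFF = 1 << (BITS - 1)          # offset mapping a signed value to an index

# Full LUT: FULL[w + OFF][a + OFF] = w * a for every signed 4-bit pair.
FULL = [[w * a for a in range(-OFF, OFF)] for w in range(-OFF, OFF)]

# Canonical LUT: only non-negative |w| rows are stored (half the rows);
# a remap step reapplies the sign at lookup time.
HALF = [[m * a for a in range(-OFF, OFF)] for m in range(OFF)]

def mac_via_lut(weights, acts):
    """Dot product via full-table lookups: no multiplies at runtime."""
    return sum(FULL[w + OFF][a + OFF] for w, a in zip(weights, acts))

def mac_via_canonical_lut(weights, acts):
    """Same result from the halved table plus a sign remap.
    Toy restriction: |w| must fit the stored rows, so w != -2**(BITS-1)."""
    total = 0
    for w, a in zip(weights, acts):
        sign, mag = (-1, -w) if w < 0 else (1, w)
        total += sign * HALF[mag][a + OFF]
    return total

w = [3, -2, 0, 7, -5]
a = [1, 4, -8, -3, 6]
ref = sum(x * y for x, y in zip(w, a))
assert mac_via_lut(w, a) == ref
assert mac_via_canonical_lut(w, a) == ref
```

The point of the canonicalization is capacity: redundant table entries (here, the sign-symmetric half) are dropped, and a cheap remap restores them on the fly, which is the same capacity-for-computation trade the entry describes.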
source S2, crossref