A Highly-Parallel AI Accelerator Architecture for Convolution and Activation, Implemented in Verilog
Feb 28, 2026 by Qilin Xie, Jingyu Zhang (Science and Technology of Engineering, Chemistry and Environmental Protection)
DOI 10.61173/fmwnqv14
A Verilog prototype of a highly parallel accelerator for LeNet-5's first convolution layer (C1) and its ReLU activation. The design combines synchronous data reuse with grouped convolution units in an FPGA-friendly topology, targeting higher throughput and better energy efficiency than software inference on edge hardware. For readers interested in pragmatic, reusable accelerator blueprints for embedded CNNs, the paper traces a path from algorithm to low-level RTL and reports speedups with modest resource use.
source S2, crossref
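
To make the C1+ReLU operation concrete, here is a minimal software reference model in Python. It is not the authors' RTL. It assumes the standard LeNet-5 C1 shape (32x32 input, six 5x5 kernels, stride 1, no padding) and uses plain integers as a stand-in for whatever number format the hardware uses. The function name `c1_relu` and its arguments are hypothetical. The loop fetches each input window once and shares it across all kernels, which loosely mirrors the data-reuse idea in the summary above.

```python
def relu(x):
    """Rectified linear unit: clamp negative values to zero."""
    return x if x > 0 else 0


def c1_relu(image, kernels, biases):
    """Valid (no padding), stride-1 convolution followed by ReLU.

    image:   H x W list of lists
    kernels: list of K x K lists, one per output feature map
    biases:  one bias per kernel
    Returns a list of (H-K+1) x (W-K+1) feature maps.
    """
    h, w = len(image), len(image[0])
    k = len(kernels[0])
    out_h, out_w = h - k + 1, w - k + 1
    maps = [[[0] * out_w for _ in range(out_h)] for _ in kernels]
    for r in range(out_h):
        for c in range(out_w):
            # Fetch the K x K window once and reuse it for every kernel.
            # In hardware, parallel convolution units would consume it in
            # the same cycle; here the reuse is modelled sequentially.
            window = [image[r + i][c:c + k] for i in range(k)]
            for m, (kern, bias) in enumerate(zip(kernels, biases)):
                acc = bias
                for i in range(k):
                    for j in range(k):
                        acc += window[i][j] * kern[i][j]
                maps[m][r][c] = relu(acc)
    return maps


# Assumed LeNet-5 C1 shape: 32x32 input, six 5x5 kernels.
image = [[1] * 32 for _ in range(32)]
kernels = [[[m - 2] * 5 for _ in range(5)] for m in range(6)]
feature_maps = c1_relu(image, kernels, [0] * 6)
```

With these toy inputs the result is six 28x28 feature maps (32 - 5 + 1 = 28), and the kernels with negative weights produce all-zero maps after ReLU. A model like this could serve as a golden reference when checking RTL simulation output.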