Architecture Design of a Convolutional Neural Network Accelerator for Heterogeneous Computing Based on a Fused Systolic Array
Jan 1, 2026 by Yang Zong, Zhenhao Ma, Jian Ren, Yu Cao, Meng Li, Bin Liu (Sensors)
DOI 10.3390/s26020628
We built a low-power CNN accelerator that fuses convolution, batch normalization (BN), and activation into a 2D systolic array and pairs it with a RISC-V CPU for heterogeneous execution. Running YOLOv5n on an FPGA, it reaches about 20.6 GFLOPs at 1.96 W, or about 10.46 GOPs/W. The design combines operator fusion, a lightweight core, a locking and prefetch mechanism that keeps asynchronous execution stable, and a fused-systolic microarchitecture aimed at efficient embedded inference.
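To make the operator-fusion idea concrete, here is a minimal sketch of standard BN folding: the BN scale and shift are absorbed into the convolution weights and bias ahead of time, so a single multiply-accumulate pass followed by the activation gives the same result as three separate operators. This is the generic textbook technique and not necessarily the scheme used in the paper. The function names are hypothetical, a single output value stands in for a full convolution, and ReLU is an assumed stand-in because the summary does not name the activation.

```python
import math

def conv_bn_relu_reference(x, w, b, gamma, beta, mean, var, eps=1e-5):
    """Unfused path: one conv output (a dot product), then BN, then ReLU."""
    acc = sum(xi * wi for xi, wi in zip(x, w)) + b
    y = gamma * (acc - mean) / math.sqrt(var + eps) + beta
    return max(0.0, y)

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold the BN parameters into the conv weights and bias (done offline)."""
    scale = gamma / math.sqrt(var + eps)
    w_folded = [wi * scale for wi in w]
    b_folded = (b - mean) * scale + beta
    return w_folded, b_folded

def conv_bn_relu_fused(x, w_folded, b_folded):
    """Fused path: a single MAC loop, then the activation on the accumulator."""
    acc = b_folded
    for xi, wi in zip(x, w_folded):
        acc += xi * wi  # one multiply-accumulate per step
    return max(0.0, acc)

# Illustrative values only (assumptions, not taken from the paper).
x = [0.5, -1.0, 2.0]
w = [0.2, 0.4, -0.1]
w_folded, b_folded = fold_bn(w, b=0.1, gamma=1.5, beta=0.3, mean=-0.2, var=0.8)
fused_out = conv_bn_relu_fused(x, w_folded, b_folded)
reference_out = conv_bn_relu_reference(x, w, 0.1, 1.5, 0.3, -0.2, 0.8)
```

Because the folding happens before inference, the fused path needs no separate BN pass over the intermediate feature map, which is the general motivation for fusing these operators in hardware.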
source S2, crossref