
Architecture Design of a Convolutional Neural Network Accelerator for Heterogeneous Computing Based on a Fused Systolic Array

Jan 1, 2026 by Yang Zong, Zhenhao Ma, Jian Ren, Yu Cao, Meng Li, Bin Liu (Italian National Conference on Sensors)

DOI 10.3390/s26020628



We built a low-power CNN accelerator that fuses convolution (Conv), batch normalization (BN), and activation into a single 2D systolic array and pairs it with a RISC-V CPU for heterogeneous execution. Running YOLOv5n on an FPGA, the design reaches about 20.6 GFLOPs at 1.96 W, or roughly 10.46 GOPs/W. The efficiency comes from four elements: operator fusion, a lightweight core, a locking and prefetch mechanism that keeps asynchronous execution stable, and the fused-systolic microarchitecture.
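The Conv, BN and activation fusion mentioned above can be illustrated in software. The sketch below is not the paper's hardware design; it only shows the standard algebra of folding BN parameters into convolution weights and bias so that one multiply-accumulate pass plus an activation replaces three separate operators. The function names, the flattened-patch representation, and the choice of ReLU as the activation are all assumptions made for illustration.

```python
import math

def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BN parameters into conv weights and bias.

    weights: one flat list of kernel weights per output channel.
    All other arguments: one value per output channel.
    (Hypothetical helper, not taken from the paper.)
    """
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, mu, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)            # BN scale for this channel
        folded_w.append([scale * w for w in w_c])  # scale every kernel weight
        folded_b.append(scale * (b_c - mu) + bt)   # absorb mean shift and beta
    return folded_w, folded_b

def relu(x):
    # Assumed activation; the accelerator may use a different one.
    return x if x > 0.0 else 0.0

def conv_bn_act_reference(patch, weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Unfused path: dot product, then BN, then activation, per channel."""
    out = []
    for w_c, b_c, g, bt, mu, v in zip(weights, bias, gamma, beta, mean, var):
        y = sum(w * x for w, x in zip(w_c, patch)) + b_c
        y = g * (y - mu) / math.sqrt(v + eps) + bt
        out.append(relu(y))
    return out

def conv_fused(patch, folded_w, folded_b):
    """Fused path: one multiply-accumulate pass, then activation."""
    return [
        relu(sum(w * x for w, x in zip(w_c, patch)) + b_c)
        for w_c, b_c in zip(folded_w, folded_b)
    ]

# Toy example: one 3-element input patch, two output channels.
patch = [1.0, -2.0, 0.5]
weights = [[0.2, 0.1, -0.3], [-0.5, 0.4, 0.1]]
bias = [0.05, -0.1]
gamma, beta = [1.5, 0.8], [0.1, -0.2]
mean, var = [0.3, -0.4], [0.9, 1.6]

folded_w, folded_b = fold_batchnorm(weights, bias, gamma, beta, mean, var)
reference = conv_bn_act_reference(patch, weights, bias, gamma, beta, mean, var)
fused = conv_fused(patch, folded_w, folded_b)
```

Because the folding is done once, offline, each processing element in a systolic array would only need the folded weights and bias at inference time, which is one plausible reason fusion reduces intermediate memory traffic.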



