Paper Archive

[JSSC'20] SIMBA

A 0.32–128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Inference Accelerator With Ground-Referenced Signaling in 16 nm

Abstract Custom accelerators improve the energy efficiency, area efficiency, and performance of deep neural network (DNN) inference. This article presents a scalable DNN accelerator consisting of ...

[FPGA'20] AutoDNNchip

AutoDNNchip: An Automated DNN Chip Predictor and Builder for Both FPGAs and ASICs

Abstract Recent breakthroughs in Deep Neural Networks (DNNs) have fueled a growing demand for domain-specific hardware accelerators (i.e., DNN chips). However, designing DNN chips is non-trivial b...

[TC'20] A. Ardakani et al.

Fast and Efficient Convolutional Accelerator for Edge Computing

Abstract ZASCA achieves a performance efficiency of up to 94 percent over a set of state-of-the-art CNNs for image classification with dense representation where the performance efficiency is the ...

[TVLSI'20] Jaehyeong Sim et al.

An Energy-Efficient Deep Convolutional Neural Network Inference Processor With Enhanced Output Stationary Dataflow in 65-nm CMOS

Abstract A deep convolutional neural network (CNN) inference processor based on a novel enhanced output stationary (EOS) dataflow that employs dedicated register files (RFs) for storing reused act...

[arXiv.org'19] Gemmini

Gemmini: An Agile Systolic Array Generator Enabling Systematic Evaluations of Deep-Learning Architectures

Abstract Gemmini is presented: an open-source, agile systolic array generator that enables systematic evaluations of deep-learning architectures and achieves two to three orders of magnitude speed...

[ICCAD'19] MAGNet

MAGNet: A Modular Accelerator Generator for Neural Networks

Abstract Deep neural networks have been adopted in a wide range of application domains, leading to high demand for inference accelerators. However, the high cost associated with ASIC hardware desi...

[MICRO'19] ASV

ASV: Accelerated Stereo Vision System

Abstract Estimating depth from stereo vision cameras, i.e., "depth from stereo", is critical to emerging intelligent applications deployed in energy- and performance-constrained devices, such as a...

[MICRO'19] SparTen

SparTen: A Sparse Tensor Accelerator for Convolutional Neural Networks

Abstract Convolutional neural networks (CNNs) are emerging as powerful tools for image processing. Recent machine learning work has reduced CNNs' compute and data volumes by exploiting the natural...

[MICRO'19] Simba

Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture

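Abstract Using multi-chip modules (MCMs) for package-level integration is a promising approach for building large-scale systems. Compared to a large monolithic die, an MCM combines many smaller chiplets into one larger system, substantially reducing fabrication and design costs. Because of the high area, performance, and energy overheads associated with inter-chiplet communication, current MCMs typically contain only a handful of coarse...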

[MICRO'19] eCNN

eCNN: A Block-Based and Highly-Parallel CNN Accelerator for Edge Inference

Abstract This paper applies a block-based inference flow which can eliminate all the DRAM bandwidth for feature maps and accordingly proposes a hardware-oriented network model, ERNet, to optimize ...