
2023 Survey of Compute-in-Memory / Near-Memory-Computing Macro Circuit Progress

Posted: 2023-01-03
Tags: Computing, AI, near-memory, 2023, TOPS, in-memory, Edge, Memory, Macro

This survey focuses on compute-in-memory (CIM) and near-memory-computing macro circuits presented at ISSCC 2020 through 2022. Although the ISSCC 2023 advance program has been released, only the paper titles are visible so far, so those papers are not compiled here; this post will be updated once the proceedings become available. Besides SRAM, which receives the most attention, macros based on Flash, RRAM, DRAM, PCM, STT-MRAM, and other memory technologies have also been reported. In addition to CIM macros, there is also processor-level work. This compilation may have omissions; corrections are welcome.

CIM/near-memory macros can be divided into two classes by operating principle: digital and analog. Digital designs feed the memory readout directly into adder trees to complete the remaining computation, whereas analog designs perform the summation in the analog domain and then convert the result with an ADC.
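The digital-vs-analog distinction above can be illustrated with a minimal behavioral sketch in Python. This is not taken from any of the listed papers; the function names and the 4-bit ADC parameter are illustrative. The key difference it shows: the digital path accumulates partial products exactly through an adder tree, while the analog path sums on a shared bitline and loses resolution at the ADC.

```python
import numpy as np

def digital_cim_mac(inputs, weights):
    # Digital CIM: each bitcell produces a full-swing 1b partial product
    # (modeled as a bitwise AND of 1b input and 1b weight), and an adder
    # tree accumulates the exact sum in the digital domain.
    partial_products = inputs & weights           # per-cell AND
    return int(partial_products.sum())            # lossless adder tree

def analog_cim_mac(inputs, weights, adc_bits=4):
    # Analog CIM: partial products accumulate as charge/current on a
    # shared bitline; the continuous sum is then quantized by an ADC,
    # so the readout has finite resolution.
    analog_sum = float((inputs & weights).sum())  # charge-domain sum
    full_scale = len(inputs)                      # largest possible sum
    levels = 2 ** adc_bits - 1
    code = round(analog_sum / full_scale * levels)
    return code * full_scale / levels             # quantized readout

rng = np.random.default_rng(0)
x = rng.integers(0, 2, 64)   # 64 binary activations
w = rng.integers(0, 2, 64)   # 64 binary weights stored in the array
print(digital_cim_mac(x, w))             # exact result
print(analog_cim_mac(x, w, adc_bits=4))  # ADC-limited result
```

With 64 rows and a 4-bit ADC, the analog readout can deviate from the exact sum by up to half an ADC LSB, which is exactly the peripheral-circuit trade-off discussed below.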

Trends visible in this compilation include:

  1. The number of CIM/near-memory papers is growing rapidly: only 6 in 2020, rising to 9 in 2021 and 13 in 2022
  2. Digital approaches are on the rise: since first appearing in 2021, they quickly grew to 4 papers in 2022
  3. Starting in 2021, CIM macros have been integrated as IP into ML processors: 4 such works in 2021, growing to 6 in 2022

Some points that still deserve attention:

  1. Support for multiple precisions, including multiple fixed-point precisions and floating point
  2. Peripheral circuit overhead: ADCs for analog designs, adder trees for digital designs
  3. Sparsity exploitation
  4. Weight updating
  5. On-chip training
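Points 1 and 3 interact in bit-serial digital CIM designs (the approach behind works such as 21.15.4 with its bit-level-sparsity optimization). A minimal sketch, assuming a bit-serial scheme where activations are streamed one bit-plane per cycle: any all-zero bit-plane can be skipped outright, and precision scales simply by changing the number of streamed bits. The function name and loop structure here are illustrative, not from any specific paper.

```python
def bit_serial_mac_zero_skip(activations, weights, n_bits=8):
    # Bit-serial MAC: the array processes one activation bit-plane per
    # cycle; the per-plane column sum is shifted by the bit position and
    # accumulated. An all-zero plane is skipped, saving a full array
    # access + accumulation cycle (bit-level sparsity).
    acc = 0
    cycles = 0
    for b in range(n_bits):
        plane = [(a >> b) & 1 for a in activations]
        if not any(plane):            # zero bit-plane: skip entirely
            continue
        cycles += 1
        psum = sum(p * w for p, w in zip(plane, weights))
        acc += psum << b              # weight by bit position
    return acc, cycles

# Small activations leave the high bit-planes empty, so most of the
# 8 nominal cycles are skipped while the dot product stays exact.
result, used = bit_serial_mac_zero_skip([3, 1, 2, 0], [5, -2, 7, 4])
print(result, used)  # 27, 2
```

The same loop also shows why variable precision comes almost for free in bit-serial designs: lowering `n_bits` directly trades accuracy for cycles without changing the array.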

Analog

20.15.2 A 28nm 64Kb Inference-Training Two-Way Transpose Multibit 6T SRAM Compute-in-Memory Macro for AI Edge Chips

20.15.3 A 351TOPS/W and 372.4GOPS Compute-in-Memory SRAM Macro in 7nm FinFET CMOS for Machine-Learning Applications

20.15.4 A 22nm 2Mb ReRAM Compute-in-Memory Macro with 121-28TOPS/W for Multibit MAC Computing for Tiny AI Edge Devices

20.15.5 A 28nm 64Kb 6T SRAM Computing-in-Memory Macro with 8b MAC Operation for AI Edge Chips

20.33.1 A 74 TMACS/W CMOS-RRAM Neurosynaptic Core with Dynamically Reconfigurable Dataflow and In-situ Transposable Weights for Probabilistic Graphical Models

20.33.2 A Fully Integrated Analog ReRAM Based 78.4TOPS/W Compute-In-Memory Chip with Fully Parallel MAC Computing

21.15.1 A Programmable Neural-Network Inference Accelerator Based on Scalable In-Memory Computing

21.15.2 A 2.75-to-75.9TOPS/W Computing-in-Memory NN Processor Supporting Set-Associate Block-Wise Zero Skipping and Ping-Pong CIM with Simultaneous Computation and Weight Updating

21.15.3 A 65nm 3T Dynamic Analog RAM-Based Computing-in-Memory Macro and CNN Accelerator with Retention Enhancement, Adaptive Analog Sparsity and 44TOPS/W System Energy Efficiency

21.15.4 A 5.99-to-691.1TOPS/W Tensor-Train In-Memory-Computing Processor Using Bit-Level-Sparsity-Based Optimization and Variable-Precision Quantization

21.16.1 A 22nm 4Mb 8b-Precision ReRAM Computing-in-Memory Macro with 11.91 to 195.7TOPS/W for Tiny AI Edge Devices

21.16.2 eDRAM-CIM: Compute-In-Memory Design with Reconfigurable Embedded-Dynamic-Memory Array Realizing Adaptive Data Converters and Charge-Domain Computing

21.16.3 A 28nm 384kb 6T-SRAM Computation-in-Memory Macro with 8b Precision for AI Edge Chips

21.29.1 A 40nm 64Kb 56.67TOPS/W Read-Disturb-Tolerant Compute-in-Memory/Digital RRAM Macro with Active-Feedback-Based Read and In-Situ Write Verification

22.7.5 A 512Gb In-Memory-Computing 3D-NAND Flash Supporting Similar-Vector-Matching Operations on Edge-AI Devices

22.11.2 A 22nm 4Mb STT-MRAM Data-Encrypted Near-Memory Computation Macro with a 192GB/s Read-and-Decryption Bandwidth and 25.1-55.1TOPS/W 8b MAC for AI Operations

22.11.3 A 40-nm, 2M-Cell, 8b-Precision, Hybrid SLC-MLC PCM Computing-in-Memory Macro with 20.5 - 65.0TOPS/W for Tiny-AI Edge Devices

22.11.4 An 8-Mb DC-Current-Free Binary-to-8b Precision ReRAM Nonvolatile Computing-in-Memory Macro using Time-Space-Readout with 1286.4 - 21.6TOPS/W for Edge-AI Devices

22.11.8 A 28nm 1Mb Time-Domain Computing-in-Memory 6T-SRAM Macro with a 6.6ns Latency, 1241GOPS and 37.01TOPS/W for 8b-MAC Operations for Edge-AI Devices

22.15.3 COMB-MCM: Computing-on-Memory-Boundary NN Processor with Bipolar Bitwise Sparsity Optimization for Scalable Multi-Chiplet-Module Edge Machine Learning

22.15.6 DIANA: An End-to-End Energy-Efficient DIgital and ANAlog Hybrid Neural Network SoC

Digital

21.16.4 An 89TOPS/W and 16.3TOPS/mm2 All-Digital SRAM-Based Full-Precision Compute-In Memory Macro in 22nm for Machine-Learning Edge Applications

22.11.1 A 1ynm 1.25V 8Gb, 16Gb/s/pin GDDR6-based Accelerator-in-Memory supporting 1TFLOPS MAC Operation and Various Activation Functions for Deep-Learning Applications

22.11.6 A 5-nm 254-TOPS/W 221-TOPS/mm2 Fully-Digital Computing-in-Memory Macro Supporting Wide-Range Dynamic-Voltage-Frequency Scaling and Simultaneous MAC and Write Operations

22.11.7 A 1.041-Mb/mm2 27.38-TOPS/W Signed-INT8 Dynamic-Logic-Based ADC-less SRAM Compute-In-Memory Macro in 28nm with Reconfigurable Bitwise Operation for AI and Embedded Applications

22.15.5 A 28nm 29.2TFLOPS/W BF16 and 36.5TOPS/W INT8 Reconfigurable Digital CIM Processor with Unified FP/INT Pipeline and Bitwise In-Memory Booth Multiplication for Cloud Deep Learning Acceleration

22.29.1 184QPS/W 64Mb/mm2 3D Logic-to-DRAM Hybrid Bonding with Process-Near-Memory Engine for Recommendation System

22.29.3 A 28nm 15.59μJ/Token Full-Digital Bitline-Transpose CIM-Based Sparse Transformer Accelerator with Pipeline/Parallel Reconfigurable Modes

From: https://www.cnblogs.com/sasasatori/p/17020890.html
