arXiv cs.AR Daily Update

On April 13, 2026, cs.AR had 9 paper updates:

  • 4 new submissions: Neuromorphic Computing ([1], [4]), Energy Efficiency (DRIFT [3], [1]), RTL Verification ([2]), Code Generation ([2]), Diffusion Model (DRIFT [3])
  • 1 cross-listed submission: DNN Deployment (MATCHA [5]), Heterogeneous Computing (MATCHA [5]), Edge Computing (MATCHA [5])
  • 4 replacements: GPU Computing (DMA-Latte [7], [9]), LLM Inference (DMA-Latte [7], [8]), RTL Generation (ChipSeek [6]), Reinforcement Learning (ChipSeek [6]), EDA (ChipSeek [6])

Overall trend: today's papers focus mainly on Neuromorphic Computing, Energy Efficiency, and Edge Computing.

Accepted papers: [3] (DAC 2026), [5] (DAC 2026), [6] (ACL 2026)

Papers with open-source code: [6] (code)


New submissions (4)

[1] Memory Wall is not gone: A Critical Outlook on Memory Architecture in Digital Neuromorphic Computing

  • arXiv: 2604.08774
  • Authors: Amirreza Yousefzadeh, Sameed Sohail, Ana Lucia Varbanescu
  • Subjects: cs.AR; cs.NE
  • Tags: Neuromorphic Computing, Energy Efficiency
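  • Summary: This paper critically examines digital neuromorphic processors and their strategies for mitigating the memory bottleneck. It finds that on-chip memory systems (SRAM as well as emerging technologies such as STT-MRAM) have become the dominant consumers of area and energy, creating a new memory wall.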

[2] From Indiscriminate to Targeted: Efficient RTL Verification via Functionally Key Signal-Driven LLM Assertion Generation

  • arXiv: 2604.08932
  • Authors: Yonghao Wang, Hongqin Lyu, Boling Chen, MinYang Bao, Wenchao Ding, Feng Gu, Zhiteng Chao, Jianan Mu, Kan Shi, Tiancheng Wang, Huawei Li
  • Subjects: cs.AR
  • Tags: RTL Verification, Code Generation
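  • Summary: This paper proposes the AgileAssert framework, which builds an RTL semantic graph to identify key signals and guides an LLM to generate targeted assertions. This shifts verification from indiscriminate to targeted, improving coverage while reducing the number of assertions.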

[3] DRIFT: Harnessing Inherent Fault Tolerance for Efficient and Reliable Diffusion Model Inference

  • arXiv: 2604.09073
  • Authors: Jinqi Wen, Tong Xie, Runsheng Wang, Meng Li
  • Subjects: cs.AR
  • Tags: Diffusion Model, Fault Tolerance, Energy Efficiency
  • Venue: DAC 2026
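  • Summary: This paper proposes the DRIFT framework, which exploits the inherent fault tolerance of diffusion models. Using a resilience-aware DVFS strategy and a rollback algorithm, it achieves efficient and reliable inference, saving 36% energy on average or delivering a 1.7x speedup.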

[4] A 0.5-V Linear Neuromorphic Voltage-to-Spike Encoder Using a Bulk-Driven Transconductor

  • arXiv: 2604.09315
  • Authors: Meysam Akbari, Erika Covi, Kea-Tiong Tang
  • Subjects: cs.AR; cs.NE
  • Tags: Neuromorphic Computing, Circuit Design, Low Power
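  • Summary: This paper presents an ultra-low-power voltage-to-spike encoder that pairs a linearized bulk-driven transconductor with a LIF neuron with a DPI front end. It achieves near-linear voltage-to-firing-rate conversion while consuming only 22-180 nW at 0.5 V.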

Cross-listed submissions (1)

[5] MATCHA: Efficient Deployment of Deep Neural Networks on Multi-Accelerator Heterogeneous Edge SoCs

  • arXiv: 2604.09124 (cross-listed)
  • Authors: Enrico Russo, Mohamed Amine Hamdi, Alessandro Ottaviano, Francesco Conti, Angelo Garofalo, Daniele Jahier Pagliari, Maurizio Palesi, Luca Benini, Alessio Burrello
  • Subjects: cs.DC; cs.AR; cs.LG
  • Tags: DNN Deployment, Heterogeneous Computing, Edge Computing
  • Venue: DAC 2026
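  • Summary: This paper proposes the MATCHA framework for efficiently deploying DNNs on edge SoCs with multiple heterogeneous accelerators. It enables parallel execution through pattern matching, tiling, and mapping, reducing inference latency by up to 35%.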

Replacements (4)

[6] ChipSeek: Optimizing Verilog Generation via EDA-Integrated Reinforcement Learning

  • arXiv: 2507.04736 (replaced)
  • Authors: Zhirong Chen, Kaiyan Chang, Zhuolin Li, Cangyuan Li, Xinyang He, Chujie Chen, Mengdi Wang, Haobo Xu, Yinhe Han, Huawei Li, Ying Wang
  • Subjects: cs.AI; cs.AR; cs.PL
  • Tags: RTL Generation, Reinforcement Learning, EDA
  • Venue: ACL 2026
  • Code: code
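  • Summary: This paper proposes the ChipSeek framework, which uses reinforcement learning with hierarchical rewards to guide an LLM toward functionally correct, PPA-optimized RTL code, integrating direct feedback from EDA simulators and synthesis tools.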

[7] DMA-Latte: Expanding the Reach of DMA Offloads to Latency-bound ML Communication

  • arXiv: 2511.06605 (replaced)
  • Authors: Suchita Pati, Shaizeen Aga, Mahzabeen Islam, Ryan Quach, Saleel Kudchadker, Mohamed Assem Ibrahim
  • Subjects: cs.DC; cs.AR
  • Tags: DMA, GPU Computing, LLM Inference
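  • Summary: This paper extends DMA offloads to latency-bound scenarios. Using new features of the AMD MI300X GPU, it achieves significant performance gains and power savings in ML communication collectives and LLM inference.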

[8] Offline-First LLM Architecture for Adaptive Learning in Low-Connectivity Environments

  • arXiv: 2603.03339 (replaced)
  • Authors: Joseph Walusimbi, Ann Move Oguti, Joshua Benjamin Ssentongo, Keith Ainebyona
  • Subjects: cs.CY; cs.AR; cs.CL; cs.HC
  • Tags: Education Technology, LLM Inference, Edge Computing
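  • Summary: This paper proposes an offline-first LLM architecture that uses quantized models and hardware-aware model selection to enable AI-assisted learning in low-connectivity environments, supporting adaptive response levels of varying complexity.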

[9] Fine-Grained Power and Energy Attribution on AMD GPU/APU-Based Exascale Nodes

  • arXiv: 2604.06056 (replaced)
  • Authors: Adam McDaniel, Michael Jantz, Ashesh Sharma, Steve Abbott, Steven Martin, Shreyas Khandekar, Brandon Neth, Bruno Villasenor Alvarez, Aditya Kashi, Wael Elwasif, Oscar Hernandez
  • Subjects: cs.DC; cs.AR
  • Tags: Power Management, High Performance Computing, GPU Computing
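  • Summary: This paper proposes a method for fine-grained power and energy attribution on AMD GPU/APU-based exascale nodes. It reconstructs power data to enable time-aligned, phase-level attribution and applies the method to benchmarks such as rocHPL.

This post is licensed under CC BY 4.0 by the author.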
