arXiv cs.AR Daily Update
cs.AR had 9 paper updates on April 13, 2026:
- 4 new submissions: Neuromorphic Computing ([1], [4]), Energy Efficiency (DRIFT [3], [1]), RTL Verification ([2]), Code Generation ([2]), Diffusion Model (DRIFT [3])
- 1 cross-listed submission: DNN Deployment (MATCHA [5]), Heterogeneous Computing (MATCHA [5]), Edge Computing (MATCHA [5])
- 4 replacement submissions: GPU Computing (DMA-Latte [7], [9]), LLM Inference (DMA-Latte [7], [8]), RTL Generation (ChipSeek [6]), Reinforcement Learning (ChipSeek [6]), EDA (ChipSeek [6])
Overall trend: today's papers focus mainly on Neuromorphic Computing, Energy Efficiency, and Edge Computing.
Accepted papers: [3] (DAC 2026), [5] (DAC 2026), [6] (ACL 2026)
New submissions (4)
[1] Memory Wall is not gone: A Critical Outlook on Memory Architecture in Digital Neuromorphic Computing
- arXiv: 2604.08774
- Authors: Amirreza Yousefzadeh, Sameed Sohail, Ana Lucia Varbanescu
- Subjects: cs.AR; cs.NE
- Tags: Neuromorphic Computing, Energy Efficiency
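- Summary: This paper critically examines digital neuromorphic processors and their strategies for mitigating the memory bottleneck. It finds that on-chip memory systems (including SRAM and emerging technologies such as STT-MRAM) have become the dominant consumers of area and energy, creating a new memory wall.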
[2] From Indiscriminate to Targeted: Efficient RTL Verification via Functionally Key Signal-Driven LLM Assertion Generation
- arXiv: 2604.08932
- Authors: Yonghao Wang, Hongqin Lyu, Boling Chen, MinYang Bao, Wenchao Ding, Feng Gu, Zhiteng Chao, Jianan Mu, Kan Shi, Tiancheng Wang, Huawei Li
- Subjects: cs.AR
- Tags: RTL Verification, Code Generation
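- Summary: This paper proposes the AgileAssert framework, which builds an RTL semantic graph to identify key signals and guides an LLM to generate targeted assertions. This shifts verification from indiscriminate to targeted, raising coverage while reducing the number of assertions.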
[3] DRIFT: Harnessing Inherent Fault Tolerance for Efficient and Reliable Diffusion Model Inference
- arXiv: 2604.09073
- Authors: Jinqi Wen, Tong Xie, Runsheng Wang, Meng Li
- Subjects: cs.AR
- Tags: Diffusion Model, Fault Tolerance, Energy Efficiency
- Venue: DAC 2026
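- Summary: This paper proposes the DRIFT framework, which exploits the inherent fault tolerance of diffusion models. It uses a resilience-aware DVFS strategy and a rollback algorithm to achieve efficient and reliable inference, saving 36% energy on average or delivering a 1.7x speedup.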
[4] A 0.5-V Linear Neuromorphic Voltage-to-Spike Encoder Using a Bulk-Driven Transconductor
- arXiv: 2604.09315
- Authors: Meysam Akbari, Erika Covi, Kea-Tiong Tang
- Subjects: cs.AR; cs.NE
- Tags: Neuromorphic Computing, Circuit Design, Low Power
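- Summary: This paper presents an ultra-low-power voltage-to-spike encoder that pairs a linearized bulk-driven transconductor with a LIF neuron with a DPI front end. It achieves near-linear voltage-to-firing-rate conversion while consuming only 22-180 nW at 0.5 V.
Cross-listed submissions (1)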
[5] MATCHA: Efficient Deployment of Deep Neural Networks on Multi-Accelerator Heterogeneous Edge SoCs
- arXiv: 2604.09124 (cross-listed)
- Authors: Enrico Russo, Mohamed Amine Hamdi, Alessandro Ottaviano, Francesco Conti, Angelo Garofalo, Daniele Jahier Pagliari, Maurizio Palesi, Luca Benini, Alessio Burrello
- Subjects: cs.DC; cs.AR; cs.LG
- Tags: DNN Deployment, Heterogeneous Computing, Edge Computing
- Venue: DAC 2026
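- Summary: This paper proposes the MATCHA framework for efficiently deploying DNNs on edge SoCs with multiple heterogeneous accelerators. It enables parallel execution through pattern matching, tiling, and mapping, reducing inference latency by up to 35%.
Replacement submissions (4)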
[6] ChipSeek: Optimizing Verilog Generation via EDA-Integrated Reinforcement Learning
- arXiv: 2507.04736 (replaced)
- Authors: Zhirong Chen, Kaiyan Chang, Zhuolin Li, Cangyuan Li, Xinyang He, Chujie Chen, Mengdi Wang, Haobo Xu, Yinhe Han, Huawei Li, Ying Wang
- Subjects: cs.AI; cs.AR; cs.PL
- Tags: RTL Generation, Reinforcement Learning, EDA
- Venue: ACL 2026
- Code: code
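- Summary: This paper proposes the ChipSeek framework, which uses reinforcement learning with hierarchical rewards to guide an LLM to generate functionally correct, PPA-optimized RTL code. It integrates direct feedback from EDA simulators and synthesis tools.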
[7] DMA-Latte: Expanding the Reach of DMA Offloads to Latency-bound ML Communication
- arXiv: 2511.06605 (replaced)
- Authors: Suchita Pati, Shaizeen Aga, Mahzabeen Islam, Ryan Quach, Saleel Kudchadker, Mohamed Assem Ibrahim
- Subjects: cs.DC; cs.AR
- Tags: DMA, GPU Computing, LLM Inference
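- Summary: This paper extends DMA offloads to latency-bound scenarios. Using new features of the AMD MI300X GPU, it achieves significant performance gains and power savings in ML communication collectives and LLM inference.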
[8] Offline-First LLM Architecture for Adaptive Learning in Low-Connectivity Environments
- arXiv: 2603.03339 (replaced)
- Authors: Joseph Walusimbi, Ann Move Oguti, Joshua Benjamin Ssentongo, Keith Ainebyona
- Subjects: cs.CY; cs.AR; cs.CL; cs.HC
- Tags: Education Technology, LLM Inference, Edge Computing
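- Summary: This paper proposes an offline-first large language model architecture that uses quantized models and hardware-aware model selection to enable AI-assisted learning in low-connectivity environments. It supports adaptive response levels of varying complexity.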
[9] Fine-Grained Power and Energy Attribution on AMD GPU/APU-Based Exascale Nodes
- arXiv: 2604.06056 (replaced)
- Authors: Adam McDaniel, Michael Jantz, Ashesh Sharma, Steve Abbott, Steven Martin, Shreyas Khandekar, Brandon Neth, Bruno Villasenor Alvarez, Aditya Kashi, Wael Elwasif, Oscar Hernandez
- Subjects: cs.DC; cs.AR
- Tags: Power Management, High Performance Computing, GPU Computing
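- Summary: This paper proposes a method for fine-grained power and energy attribution on AMD GPU/APU-based exascale nodes. It reconstructs power data to enable time-aligned, phase-level attribution and applies the method to benchmarks such as rocHPL.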
This post is licensed under CC BY 4.0 by the author.