arXiv cs.AR Daily Update

The cs.AR category had 8 paper updates on April 16, 2026:

  • 5 new submissions: Hardware Acceleration (ATLAAS [2], [1], [4]), Circuit Design ([3], [4], [5]), Edge Computing ([1], [3]), Memory Architecture ([1]), EDA (ATLAAS [2])
  • 1 cross-listed submission: Edge Computing (BioTrain [6]), Medical AI (BioTrain [6]), On-Device Learning (BioTrain [6])
  • 2 replaced submissions: LLM Inference (Sandwich [7]), LLM Serving (Sandwich [7]), RTL Verification (ChatSVA [8]), LLM Agent (ChatSVA [8]), Code Generation (ChatSVA [8])

Overall trend: today's papers focus mainly on Hardware Acceleration, Edge Computing, and Circuit Design.

Accepted papers: [3] (ISQED 2026), [5] (ISCAS 2026), [7] (DAC 2026), [8] (DAC 2026)

Open-source papers: none


New submissions (5)

[1] Tensor Memory Engine: On-the-fly Data Reorganization for Ideal Locality

  • arXiv: 2604.13319
  • Authors: Denis Hoornaert, Cole Strickler, Manos Athanassoulis, Marco Caccamo, Heechul Yun, Renato Mancuso
  • Subjects: cs.AR
  • Tags: Memory Architecture, Edge Computing, Hardware Acceleration
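  • Summary: This paper proposes the Tensor Memory Engine, a hardware-software co-design inserted into the CPU data path that reorganizes memory layout on the fly to provide ideal cache locality, addressing the memory wall for data-intensive applications in edge computing.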

[2] ATLAAS: Automatic Tensor-Level Abstraction of Accelerator Semantics

  • arXiv: 2604.13523
  • Authors: Ruijie Gao, Haoran Jin, Jirong Yang, Nathaniel Bleier
  • Subjects: cs.AR
  • Tags: EDA, Hardware Acceleration, RTL Verification
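  • Summary: This paper proposes ATLAAS, the first MLIR-based end-to-end flow that automatically lifts accelerator semantics extracted from RTL into a tensor ISA specification, enabling automatic generation of the software stack.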

[3] Cross-Layer Co-Optimized LSTM Accelerator for Real-Time Gait Analysis

  • arXiv: 2604.13543
  • Authors: Mohammad Hasan Ahmadilivani, Levent Aksoy, Mohammad Eslami, Jaan Raik, Alar Kuusik
  • Subjects: cs.AR; cs.LG
  • Tags: Circuit Design, Medical AI, Edge Computing
  • Venue: ISQED 2026
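  • Summary: This paper proposes the first cross-layer co-optimized LSTM accelerator ASIC design for real-time gait analysis, with comprehensive design space exploration from the software level down to the layout level, balancing hardware complexity against accuracy.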

[4] GEM3D CIM General Purpose Matrix Computation Using 3D Integrated SRAM eDRAM Hybrid Compute In Memory on Memory Architecture

  • arXiv: 2604.13969
  • Authors: Subhradip Chakraborty, Ankur Singh, Akhilesh R. Jaiswal
  • Subjects: cs.AR
  • Tags: Compute-in-Memory, Circuit Design, Hardware Acceleration
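  • Summary: This paper proposes a 3D-integrated SRAM-eDRAM hybrid compute-in-memory architecture that performs general-purpose matrix operations (such as transpose, element-wise addition, and multiplication) directly within memory crossbar arrays, extending the application range of conventional compute-in-memory.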

[5] An ASIC Emulated Oscillator Ising/Potts Machine Solving Combinatorial Optimization Problems

  • arXiv: 2604.14027
  • Authors: Yilmaz Ege Gonul, Baris Taskin
  • Subjects: cs.AR; cs.ET
  • Tags: Neural Combinatorial Optimization, Circuit Design, Hardware Acceleration
  • Venue: ISCAS 2026
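  • Summary: This paper proposes a custom ASIC architecture that digitally emulates an oscillator Ising/Potts machine to solve NP-hard combinatorial optimization problems, delivering significant performance and energy-efficiency gains while retaining programmability and accuracy.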

Cross-listed submissions (1)

[6] BioTrain: Sub-MB, Sub-50mW On-Device Fine-Tuning for Edge-AI on Biosignals

  • arXiv: 2604.13359 (cross-listed)
  • Authors: Run Wang, Victor J. B. Jung, Philip Wiese, Sebastian Frey, Giusy Spacone, Francesco Conti, Alessio Burrello, Luca Benini
  • Subjects: cs.LG; cs.AR; eess.SP
  • Tags: Edge Computing, Medical AI, On-Device Learning
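  • Summary: This paper proposes the BioTrain framework, which enables full-network fine-tuning of biosignal models under milliwatt-level power and sub-megabyte memory constraints, effectively addressing domain shift on edge devices.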

Replaced submissions (2)

[7] Sandwich: Joint Configuration Search and Hot-Switching for Efficient CPU LLM Serving

  • arXiv: 2507.18454 (replaced)
  • Authors: Juntao Zhao, Jiuru Li, Chuan Wu
  • Subjects: cs.AR; cs.AI; cs.DC; cs.PL
  • Tags: LLM Inference, LLM Serving
  • Venue: DAC 2026
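  • Summary: This paper proposes the Sandwich system, which resolves resource contention between the prefill and decode phases of LLM serving on CPUs through seamless phase switching, substructure-aware core allocation, and dynamic tensor program generation.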

[8] ChatSVA: Bridging SVA Generation for Hardware Verification via Task-Specific LLMs

  • arXiv: 2604.02811 (replaced)
  • Authors: Lik Tung Fu, Jie Zhou, Shaokai Ren, Mengli Zhang, Jia Xiong, Hugo Jiang, Nan Guan, Xi Wang, Jun Yang
  • Subjects: cs.AR; cs.AI
  • Tags: RTL Verification, LLM Agent, Code Generation
  • Venue: DAC 2026
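  • Summary: This paper proposes the ChatSVA system, which automatically generates SystemVerilog assertions using a multi-agent framework and substantially outperforms existing methods in both syntactic and functional correctness rates.

This post is licensed under CC BY 4.0 by the author.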
