arXiv cs.AR Daily Update

cs.AR had 9 paper updates on April 15, 2026:

  • 3 new submissions: Circuit Design (HARP [1]), Memory Architecture (HARP [1]), FPGA (CODO [2]), HLS (CODO [2]), High Performance Computing (EPAC [3])
  • 3 cross-listed submissions: LLM Inference ([4]), Imitation Learning ([4]), LLM Agent (Aethon [5]), Multi-Agent System (Aethon [5]), Knowledge Distillation (TCL [6])
  • 3 replaced submissions: FPGA (L-PCN [7]), 3D Vision (L-PCN [7]), EDA ([8]), Circuit Design ([8]), Formal Methods ([9])

Overall trend: today's papers focus mainly on Circuit Design, FPGA, and Memory Architecture.

Accepted papers: [2] (ISCA 2026), [3] (CF 2026), [4] (DAC 2026), [7] (ISCA 2026)

Open-source papers: [2] (code)


New submissions (3)

[1] HARP: Hadamard-Domain Write-and-Verify for Noise-Robust RRAM Programming

  • arXiv: 2604.12420
  • Authors: Ilhuan Choi, Jiwon Yoo, Yoona Lee, Yewon Jeong, Jason Jaesung Lee, Woo-Seok Choi
  • Subjects: cs.AR
  • Tags: Circuit Design, Memory Architecture
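  • Summary: This paper proposes a Hadamard-domain write-and-verify framework that improves the reliability of RRAM programming under low signal-to-noise-ratio conditions. It replaces conventional verification reads with orthogonal Hadamard patterns, significantly reducing read-noise variance without additional analog hardware and achieving higher accuracy and energy efficiency.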
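
To give a feel for why orthogonal Hadamard patterns can lower read-noise variance, here is a minimal sketch of the classical Hadamard multiplexing argument. It is not HARP's actual scheme: the additive, pattern-independent Gaussian read noise, the ideal ±1 pattern weights, and all function names (`hadamard`, `direct_read`, `hadamard_read`, `error_variance`) are assumptions made only for illustration. Under those assumptions, inverting N pattern reads averages the noise over N measurements, so the per-cell error variance drops by roughly a factor of N.

```python
import random
import statistics


def hadamard(n):
    """Return an n x n Sylvester Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def direct_read(x, sigma, rng):
    """Read each cell once; every read picks up independent Gaussian noise."""
    return [v + rng.gauss(0.0, sigma) for v in x]


def hadamard_read(x, sigma, rng):
    """Take n pattern-weighted reads, then invert with x_hat = H^T m / n."""
    n = len(x)
    h = hadamard(n)
    # Assumed noise model: one additive noise sample per pattern read.
    m = [sum(h[k][j] * x[j] for j in range(n)) + rng.gauss(0.0, sigma)
         for k in range(n)]
    return [sum(h[k][i] * m[k] for k in range(n)) / n for i in range(n)]


def error_variance(read, x, sigma, trials, seed):
    """Empirical variance of the per-cell read error over many trials."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        estimate = read(x, sigma, rng)
        errors.extend(e - v for e, v in zip(estimate, x))
    return statistics.pvariance(errors)


if __name__ == "__main__":
    cells = [0.1 * i for i in range(8)]  # hypothetical cell values
    print("direct  :", error_variance(direct_read, cells, 0.05, 2000, seed=1))
    print("hadamard:", error_variance(hadamard_read, cells, 0.05, 2000, seed=1))
```

Both read schemes use the same number of measurements (one per cell or one per pattern), so the comparison isolates the effect of the orthogonal patterns. A real RRAM array adds effects this toy model ignores, such as signal-dependent noise and the cost of realizing signed weights.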

[2] CODO: An Automated Compiler for Comprehensive Dataflow Optimization

  • arXiv: 2604.12618
  • Authors: Weichuang Zhang, Yiquan Wang, Xinzhou Zhang, Chi Zhang, Yu Feng, Xiaofeng Hou, Chao Li, Jieru Zhao, Minyi Guo
  • Subjects: cs.AR
  • Tags: FPGA, HLS
  • Venue: ISCA 2026
  • Code: code
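  • Summary: This paper introduces CODO, an automated compiler that generates efficient dataflow accelerators on FPGAs. The compiler detects and eliminates both coarse-grained and fine-grained dataflow violations and optimizes on-chip and off-chip data movement, achieving significant latency speedups on typical compute kernels and DNN models.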

[3] EPAC: The Last Dance

  • arXiv: 2604.12715
  • Authors: Filippo Mantovani, Fabio Banchelli, Pablo Vizcaino, Roger Ferrer, Oscar Palomar, Francesco Minervini, Jesus Labarta, Mauro Olivieri, Sebastiano Pomata, Pedro Marcuello, Jordi Cortina, Alberto Moreno, Josep Sans, Roger Espasa, Vassilis Papaefstathiou, Nikolaos Dimou, Georgios Ieronymakis, Antonis Psathakis, Michalis Giaourtas, Iasonas Mastorakis, Manolis Marazakis, Eric Guthmuller, Andrea Bocco, Jérôme Fereyre, César Fuguet, Mate Kovač, Mario Kovač, Luka Mrković, Josip Ramljak, Luca Bertaccini, Tim Fischer, Frank K. Gurkaynak, Paul Scheffler, Luca Benini, Bhavishya Goel, Madhavan Manivannan, Tiago Rocha, Nuno Neves, Jens Krüger
  • Subjects: cs.AR; cs.DC
  • Tags: High Performance Computing, Hardware Acceleration
  • Venue: CF 2026
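  • Summary: This paper presents EPAC, a RISC-V accelerator chip developed under the European Processor Initiative that integrates three different RISC-V compute units to support different types of workloads. The chip is implemented in the 22FDX process and has been successfully taped out and validated, laying a foundation for the European HPC processor ecosystem.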

Cross-listed submissions (3)

[4] Active Imitation Learning for Thermal- and Kernel-Aware LFM Inference on 3D S-NUCA Many-Cores

  • arXiv: 2604.11948 (cross-listed)
  • Authors: Yixian Shen, Chaoyao Shen, Jan Deen, George Floros, Andy Pimentel, Anuj Pathania
  • Subjects: cs.LG; cs.AR
  • Tags: LLM Inference, Imitation Learning
  • Venue: DAC 2026
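  • Summary: This paper proposes AILFM, a scheduling framework based on active imitation learning for thermal-aware scheduling of large foundation model inference on 3D S-NUCA many-core systems. The framework learns a near-optimal scheduling policy from Oracle demonstrations, maximizing performance while maintaining thermal safety.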

[5] Aethon: A Reference-Based Replication Primitive for Constant-Time Instantiation of Stateful AI Agents

  • arXiv: 2604.12129 (cross-listed)
  • Authors: Swanand Rao, Kiran Kashalkar, Parvathi Somashekar, Priya Krishnan
  • Subjects: cs.AI; cs.AR; cs.DC; cs.MA
  • Tags: LLM Agent, Multi-Agent System
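  • Summary: This paper introduces Aethon, a reference-based replication primitive for near-constant-time instantiation of stateful AI agents. It represents an agent instance as a composed view of a stable definition, layered memory, and local context overlays, significantly reducing creation overhead and supporting large-scale multi-agent orchestration.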

[6] TCL: Enabling Fast and Efficient Cross-Hardware Tensor Program Optimization via Continual Learning

  • arXiv: 2604.12891 (cross-listed)
  • Authors: Chaoyao Shen, Linfeng Jiang, Yixian Shen, Tao Xu, Guoqing Li, Anuj Pathania, Andy D. Pimentel, Meng Zhang
  • Subjects: cs.LG; cs.AR
  • Tags: Knowledge Distillation, Continual Learning
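  • Summary: This paper introduces TCL, a compiler framework for fast tensor program optimization across hardware platforms, combining an active learning strategy, a Mamba-based cost model, and continual knowledge distillation. On CPU and GPU platforms it achieves significant tuning-time speedups and inference-latency reductions compared with existing methods.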

Replaced submissions (3)

[7] L-PCN: A Point Cloud Accelerator Exploiting Spatial Locality through Octree-based Islandization

  • arXiv: 2604.10716 (replaced)
  • Authors: Yiming Gao, Jieming Yin, Yuxiang Wang, Xiangru Chen, Zhilei Chai, Bowen Jiang, Jiliang Zhang, Herman Lam
  • Subjects: cs.AR
  • Tags: FPGA, 3D Vision
  • Venue: ISCA 2026
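  • Summary: This paper proposes L-PCN, a point cloud accelerator that exploits spatial locality through octree-based islandization. Using point cloud partitioning and hub-based scheduling, it reduces redundant operations in feature fetching and computation and achieves significant speedups on FPGA.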

[8] EMSpice 3: Full-chip Temperature-Aware Multiphysics Electromigration and IR-Drop Analysis

  • arXiv: 2604.10743 (replaced)
  • Authors: Haotian Lu, Sheldon X.-D. Tan
  • Subjects: cs.AR
  • Tags: EDA, Circuit Design
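  • Summary: This paper introduces EMSpice 3, a full-chip, temperature-aware multiphysics framework for coupled analysis of electromigration, thermomigration, and IR drop in power grid networks. The framework incorporates spatial thermal maps, enabling practical, map-aware full-chip EM reliability assessment.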

[9] The Program Hypergraph: Multi-Way Relational Structure for Geometric Algebra, Spatial Compute, and Physics-Aware Compilation

  • arXiv: 2603.17627 (replaced)
  • Authors: Houston Haynes
  • Subjects: cs.PL; cs.AR
  • Tags: Formal Methods, Geometric Algebra
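  • Summary: This paper introduces the Program Hypergraph (PHG) as a generalization of the program semantic graph, promoting binary edges to hyperedges of arbitrary arity. The framework supports geometric algebra computation and grade inference, and can jointly derive memory layout and hardware partitioning from a single graph structure.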
This post is licensed under CC BY 4.0 by the author.