arXiv cs.AR Daily Update
cs.AR had 15 paper updates on April 22, 2026:
- 6 new submissions: Hardware Architecture (CHICO-Agent [2], ChipLight [4], [1]), Energy Efficiency ([3], [6]), FPGA ([5], [6]), Branch Prediction ([1]), LLM Agent (CHICO-Agent [2])
- 1 cross-listed submission: EDA ([7]), Hardware Architecture ([7])
- 8 replacement submissions: FPGA (LUTstructions [13], [8]), Hardware Architecture (LUTstructions [13], [12]), Energy Efficiency ([8]), GPU Computing (CASS [9]), Code Generation (CASS [9])
Overall trend: today's papers concentrate on Hardware Architecture, FPGA, and Energy Efficiency.
Accepted papers: [4] (DATE 2026), [6] (ARCS 2023), [8] (ARCS 2024), [10] (ACL 2026), [14] (DAC 2026)
Open-source papers: none
New Submissions (6)
[1] Optimizing Branch Predictor for Graph Applications
- arXiv: 2604.18698
- Authors: Upasna, Venkata Kalyan Tavva
- Subjects: cs.AR
- Tags: Hardware Architecture, Branch Prediction
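- Summary: This paper targets the frequent branch mispredictions in graph applications and proposes optimizing the branch predictor to improve the overall performance of graph processing. The authors analyze different types of branch behavior and argue that branch predictors still have room for improvement on the branches that cause mispredictions.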
[2] CHICO-Agent: An LLM Agent for the Cross-layer Optimization of 2.5D and 3D Chiplet-based Systems
- arXiv: 2604.18764
- Authors: Qihang Wu, Aman Arora, Vidya A. Chhabria
- Subjects: cs.AR
- Tags: LLM Agent, Hardware Architecture
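- Summary: This paper presents CHICO-Agent, an LLM-driven framework for cross-layer optimization of 2.5D/3D chiplet-based systems. The framework maintains a knowledge base that captures parameter-outcome trends and uses an admin-field multi-agent workflow for collaborative exploration, finding lower-cost configurations than a simulated-annealing baseline.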
[3] A Comparative Analysis of ARM and x86-64 Laptop-Class Processors: Architecture, Assembly-Level Performance, and Energy Efficiency
- arXiv: 2604.18896
- Authors: Mustafa Mert Özyılmaz
- Subjects: cs.AR
- Tags: Hardware Architecture, Energy Efficiency
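- Summary: This paper compares ARM (Apple M3) and x86-64 (AMD Ryzen) laptop processors both architecturally and experimentally. The experiments show the Apple platform to be markedly more energy-efficient than the Ryzen platform, using about 5.82× less energy on the Fibonacci benchmark and about 6.38× less on the matrix multiplication benchmark.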
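An "N× less energy" ratio can also be read as a percentage saving. The short sketch below does that arithmetic for the two ratios quoted in the summary; the function name is illustrative and does not come from the paper.

```python
def ratio_to_saving(ratio: float) -> float:
    """Convert an 'N times less energy' ratio into a fractional saving."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    # Using 1/ratio of the baseline energy means saving the remainder.
    return 1.0 - 1.0 / ratio


# Ratios as quoted in the summary of [3].
for name, ratio in [("Fibonacci", 5.82), ("matrix multiplication", 6.38)]:
    print(f"{name}: {ratio}x less energy is a {ratio_to_saving(ratio):.1%} saving")
```

Under this reading, the two ratios correspond to savings of roughly 83% and 84% relative to the Ryzen platform.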
[4] ChipLight: Cross-Layer Optimization of Chiplet Design with Optical Interconnects for LLM Training
- arXiv: 2604.18909
- Authors: Kangbo Bai, Zhantong Zhu, Yifan Ding, Tianyu Jia
- Subjects: cs.AR
- Tags: LLM Training, Hardware Architecture
- Venue: DATE 2026
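- Summary: This paper presents ChipLight, a cross-layer, multi-objective design optimization method for training clusters built from chiplets and optical interconnects. It co-optimizes the chiplet architecture, the training parallelism strategy, and the optical interconnect topology, significantly improving LLM training efficiency.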
[5] Design Rules for Extreme-Edge Scientific Computing on AI Engines
- arXiv: 2604.19106
- Authors: Zhenghua Ma, G Abarajithan, Dimitrios Danopoulos, Olivia Weng, Francesco Restuccia, Ryan Kastner
- Subjects: cs.AR; cs.AI; cs.LG
- Tags: FPGA, Edge Computing
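- Summary: This paper studies how neural networks for extreme-edge scientific computing should be implemented across AI Engines and programmable logic. The authors propose the Latency-Adjusted Resource Equivalence (LARE) metric, along with spatial and API-level dataflow optimizations for low-latency scientific inference.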
[6] Energy Efficient LSTM Accelerators for Embedded FPGAs through Parameterised Architecture Design
- arXiv: 2604.19293
- Authors: Chao Qian, Tianheng Ling, Gregor Schiele
- Subjects: cs.AR
- Tags: FPGA, Energy Efficiency
- Venue: ARCS 2023
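- Summary: This paper presents an LSTM hardware accelerator design optimized for resource-constrained embedded FPGAs. Its tunable optimization parameters let it adapt to different scenarios; it reaches an energy efficiency of 11.89 GOP/s/W and outperforms related work in both execution speed and energy consumption.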
Cross-listed Submissions (1)
[7] A PPA-Driven 3D-IC Partitioning Selection Framework with Surrogate Models
- arXiv: 2604.18806 (cross-listed)
- Authors: Shang Wang, Shuai Liu, Owen Randall, Matthew E. Taylor
- Subjects: cs.LG; cs.AR
- Tags: EDA, Hardware Architecture
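- Summary: This paper presents DOPP, a PPA-driven 3D-IC partitioning selection framework based on surrogate models. Across eight 3D-IC designs it delivers significant improvements in congestion, wirelength, WNS, TNS, and power while substantially reducing evaluation cost.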
Replacement Submissions (8)
[8] Idle is the New Sleep: Configuration-Aware Alternative to Powering Off FPGA-Based DL Accelerators During Inactivity
- arXiv: 2407.12027 (replaced)
- Authors: Chao Qian, Christopher Cichiwskyj, Tianheng Ling, Gregor Schiele
- Subjects: cs.AR; cs.AI
- Tags: FPGA, Energy Efficiency
- Venue: ARCS 2024
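- Summary: This paper proposes an idle-waiting strategy for FPGA-based deep learning accelerators and, by optimizing configuration parameters, cuts configuration energy by 40.13×. In duty-cycled operation, the strategy outperforms the conventional on-off strategy for request periods of up to 499.06 ms.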
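The trade-off between idling and powering off can be illustrated with a simple per-request energy model. This is a sketch under stated assumptions, not the paper's model: it assumes the on-off strategy pays a fixed reconfiguration energy on every request, while idle-waiting pays a constant idle power between requests. All function names and numbers are hypothetical placeholders.

```python
def on_off_energy(e_config_mj: float, e_infer_mj: float) -> float:
    """Energy per request (mJ) when the FPGA powers off and is reconfigured each time."""
    return e_config_mj + e_infer_mj


def idle_wait_energy(e_infer_mj: float, p_idle_mw: float,
                     period_ms: float, t_infer_ms: float) -> float:
    """Energy per request (mJ) when the FPGA stays configured and idles between requests."""
    idle_ms = max(period_ms - t_infer_ms, 0.0)
    # mW * ms gives microjoules; divide by 1000 to get millijoules.
    return e_infer_mj + p_idle_mw * idle_ms / 1000.0


def break_even_period_ms(e_config_mj: float, p_idle_mw: float,
                         t_infer_ms: float) -> float:
    """Request period at which both strategies cost the same energy."""
    return t_infer_ms + e_config_mj * 1000.0 / p_idle_mw


# Hypothetical placeholder values, not measurements from the paper.
E_CONFIG, E_INFER, P_IDLE, T_INFER = 12.0, 0.5, 30.0, 5.0
threshold = break_even_period_ms(E_CONFIG, P_IDLE, T_INFER)
for period in (200.0, 1000.0):
    idle = idle_wait_energy(E_INFER, P_IDLE, period, T_INFER)
    off = on_off_energy(E_CONFIG, E_INFER)
    better = "idle-waiting" if idle < off else "on-off"
    print(f"period={period} ms, break-even={threshold} ms, cheaper: {better}")
```

In this model, idle-waiting wins for request periods shorter than the break-even point, which matches the shape of the result reported in the summary: lowering either the idle power or the configuration energy moves that threshold.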
[9] CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark
- arXiv: 2505.16968 (replaced)
- Authors: Ahmed Heakl, Gustavo Bertolo Stahl, Sarim Hashmi, Seung Hun Eddie Han, Mukul Ranjan, Arina Kharlamova, Salman Khan, Abdulrahman Mahmoud
- Subjects: cs.AR; cs.AI; cs.CL; cs.LG; cs.PL
- Tags: GPU Computing, Code Generation
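- Summary: This paper introduces CASS, the first dataset and model suite for cross-architecture GPU code translation (CUDA↔HIP, SASS↔RDNA3). The trained domain-specific translation models reach 88.2% accuracy on CUDA→HIP and 69.1% on SASS→RDNA3, significantly outperforming commercial baselines.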
[10] ChatHLS: Towards Systematic Design Automation and Optimization for High-Level Synthesis
- arXiv: 2507.00642 (replaced)
- Authors: Runkai Li, Jia Xiong, Xiuyuan He, Jieru Zhao, Jiaqi Lv, Haowen Fang, Lei Qi, Xi Wang
- Subjects: cs.AR
- Tags: HLS, LLM Agent
- Venue: ACL 2026
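- Summary: This paper presents ChatHLS, a multi-agent HLS design framework that uses specialized LLMs for automated debugging and directive tuning. Combining an adaptive error-case expansion mechanism with QoR-aware reasoning, it improves debugging by 32.6% over Gemini-3-pro and significantly speeds up the hardware development flow.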
[11] ODMA: On-Demand Memory Allocation Strategy for LLM Serving on LPDDR-Class Accelerators
- arXiv: 2512.09427 (replaced)
- Authors: Guoqiang Zou, Wanyu Wang, Hao Zheng, Longxiang Yin, Yinhe Han
- Subjects: cs.AR; cs.AI
- Tags: LLM Serving, Memory Architecture
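- Summary: This paper presents ODMA, an on-demand memory allocation strategy for LLM serving on LPDDR-class accelerators. It combines a lightweight length predictor, adaptive bucket partitioning, and a safety pool, achieving up to 19.25% higher KV-cache utilization and 23-27% higher throughput on the Cambricon MLU370-X4.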
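To make the three ingredients named in the summary concrete, here is a minimal sketch of bucketed KV-cache reservation with a shared overflow pool. It is an illustration only: the class, its methods, and the fixed bucket sizes are hypothetical, the length predictor is represented by a plain integer argument, and the adaptive re-partitioning described in the paper is not modeled.

```python
from bisect import bisect_left


class BucketedKVAllocator:
    """Hypothetical sketch: reserve KV-cache slots by predicted length bucket."""

    def __init__(self, bucket_sizes, safety_pool_tokens):
        self.bucket_sizes = sorted(bucket_sizes)
        self.safety_pool_free = safety_pool_tokens

    def reserve(self, predicted_len: int) -> int:
        """Round the predicted length up to the smallest bucket that fits it."""
        i = bisect_left(self.bucket_sizes, predicted_len)
        # Predictions longer than every bucket are clamped to the largest one.
        return self.bucket_sizes[min(i, len(self.bucket_sizes) - 1)]

    def extend(self, reserved: int, actual_len: int) -> bool:
        """Cover an under-prediction from the safety pool; False if the pool is exhausted."""
        overflow = max(actual_len - reserved, 0)
        if overflow > self.safety_pool_free:
            return False
        self.safety_pool_free -= overflow
        return True


alloc = BucketedKVAllocator([128, 512, 2048], safety_pool_tokens=256)
slot = alloc.reserve(100)          # fits in the 128-token bucket
ok = alloc.extend(slot, 200)       # 72 extra tokens come from the safety pool
print(slot, ok, alloc.safety_pool_free)
```

The intuition is that reserving by predicted length avoids allocating the maximum sequence length for every request, while the pool absorbs requests that run longer than predicted.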
[12] A Case for Hypergraphs to Model and Map SNNs on Neuromorphic Hardware
- arXiv: 2601.16118 (replaced)
- Authors: Marco Ronzani, Cristina Silvano
- Subjects: cs.AR; cs.NE
- Tags: Neuromorphic Computing, Hardware Architecture
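- Summary: This paper proposes raising the abstraction used for SNNs from graphs to hypergraphs in order to better map neurons to cores on neuromorphic hardware. By exposing hyperedge co-membership, the hypergraph model faithfully captures intra-core spike replication, and exploiting hyperedge overlap and locality yields higher-quality mappings.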
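The difference between the two abstractions shows up in a toy cost comparison. The sketch below assumes, for illustration only, that a spike needs to be delivered once per destination core and is then replicated locally; the function names and cost definitions are hypothetical and are not the paper's metrics.

```python
from collections import defaultdict


def graph_cut_cost(synapses, core_of):
    """Pairwise-edge model: every synapse that crosses cores counts once."""
    return sum(1 for src, dst in synapses if core_of[src] != core_of[dst])


def hypergraph_cost(synapses, core_of):
    """Hyperedge model: one hyperedge per source neuron, counted once per remote core it reaches."""
    remote_cores = defaultdict(set)
    for src, dst in synapses:
        if core_of[dst] != core_of[src]:
            remote_cores[src].add(core_of[dst])
    return sum(len(cores) for cores in remote_cores.values())


# Neuron 0 sits in core A and fans out to three neurons in core B.
synapses = [(0, 1), (0, 2), (0, 3)]
core_of = {0: "A", 1: "B", 2: "B", 3: "B"}
print(graph_cut_cost(synapses, core_of))   # 3 crossing synapses
print(hypergraph_cost(synapses, core_of))  # 1 remote core reached
```

The pairwise model charges three transfers for what is, under the stated assumption, a single inter-core spike, which is the kind of overcounting the hypergraph view avoids.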
[13] LUTstructions: Self-loading FPGA-based Reconfigurable Instructions
- arXiv: 2602.20802 (replaced)
- Authors: Philippos Papaphilippou
- Subjects: cs.AR
- Tags: FPGA, Hardware Architecture
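- Summary: This paper explores reconfigurable instructions implemented by integrating reconfigurable regions into a softcore. The proposed LUTstruction architecture is optimized for low-latency custom instructions and wide reconfiguration, yielding an FPGA-on-FPGA design with no significant frequency overhead.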
[14] COmPOSER: Circuit Optimization of mm-wave/RF circuits with Performance-Oriented Synthesis for Efficient Realizations
- arXiv: 2603.20486 (replaced)
- Authors: Subhadip Ghosh, Surya Srikar Peri, Ramprasath S., Sosina A. Berhan, Endalk Y. Gebru, Ramesh Harjani, Sachin S. Sapatnekar
- Subjects: cs.AR
- Tags: Circuit Design, EDA
- Venue: DAC 2026
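- Summary: This paper presents COmPOSER, an open-source end-to-end framework for RF/mm-wave design automation that turns target specifications into optimized circuits and layouts. It unifies schematic synthesis, layout generation, and place-and-route, matching expert hand-crafted designs in a 65 nm process while delivering a 100-300× productivity gain.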
[15] The data heat island effect: quantifying the impact of AI data centers in a warming world
- arXiv: 2603.20897 (replaced)
- Authors: Andrea Marinoni, Erik Cambria, Weisi Lin, Mauro Dalla Mura, Jocelyn Chanussot, Edoardo Ragusa, Chi Yan Tso, Yihao Zhu, Benjamin Horton
- Subjects: cs.CY; cs.AI; cs.AR
- Tags: Data Center, AI Sustainability
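- Summary: This paper quantifies the thermal impact of AI data centers on their surroundings. Remote-sensing measurements of land surface temperature show that surrounding areas warm by 2°C on average after an AI supercomputing center begins operating. The study estimates that more than 340 million people could be affected by this data heat island effect, with significant implications for community and regional welfare.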
This post is licensed under CC BY 4.0 by the author.