Abstract: The demand for high-speed matrix multiplication continues to grow due to recent developments in image processing, graphics processing, digital signal processing, and communication via ...
CUDA-L2 is a system that combines large language models (LLMs) and reinforcement learning (RL) to automatically optimize Half-precision General Matrix Multiply (HGEMM) CUDA kernels. CUDA-L2 ...
Abstract: In intelligent connected vehicle applications, tasks such as path planning and health management involve numerous matrix operations, particularly matrix multiplication. Due to limited ...