Can a Small Language Model Predict Kernel Latency, Memory, and Model Accuracy from Code? A New Regression Language Model (RLM) Says Yes
Researchers from Cornell and Google introduce a unified Regression Language Model (RLM) that predicts numeric outcomes directly from code strings, covering GPU kernel latency, program memory usage, and even neural network accuracy and latency, without hand-engineered features. A 300M-parameter encoder–decoder initialized from T5-Gemma achieves strong rank correlations across heterogeneous tasks and languages, using a single text-to-number…
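To make the "text-to-number" idea concrete, here is a minimal sketch of how a decoder can emit a real-valued prediction as a short token sequence (sign, mantissa digits, exponent) that is then parsed back into a float. The token scheme below is a hypothetical illustration of the general technique, not the paper's exact vocabulary or implementation.

```python
import math

# Hypothetical token scheme: <+>/<-> sign, fixed-width mantissa digits, <E*> exponent.
MANTISSA_DIGITS = 4

def encode_number(x: float) -> list[str]:
    """Serialize a float into sign/digit/exponent tokens (illustrative only)."""
    sign = "<+>" if x >= 0 else "<->"
    x = abs(x)
    exp = math.floor(math.log10(x)) if x > 0 else 0
    mant = round(x / 10 ** exp * 10 ** (MANTISSA_DIGITS - 1)) if x > 0 else 0
    if mant >= 10 ** MANTISSA_DIGITS:  # rounding overflowed the mantissa width
        mant //= 10
        exp += 1
    digits = [f"<{d}>" for d in str(mant).zfill(MANTISSA_DIGITS)]
    return [sign, *digits, f"<E{exp}>"]

def decode_number(tokens: list[str]) -> float:
    """Parse the token sequence back into a float."""
    sign = 1.0 if tokens[0] == "<+>" else -1.0
    digits = "".join(t[1] for t in tokens[1:-1])
    exp = int(tokens[-1][2:-1])
    # Mantissa digits are interpreted as d.ddd x 10^exp.
    return sign * int(digits) * 10 ** (exp - len(digits) + 1)

print(decode_number(encode_number(123.4)))   # round-trips to 123.4
print(decode_number(encode_number(-0.056)))  # round-trips to -0.056
```

Because the output is just a token sequence, one decoder head can regress latencies, memory footprints, and accuracies at very different scales without per-task normalization, which is the appeal of a unified text-to-number formulation.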