arXiv cs.LG AI Research Apr 22

Local Linearity of LLMs Enables Activation Steering via Model-Based Linear Optimal Control

Significance: 3/5

Researchers propose an activation steering method for LLMs that treats the model's layer-wise dynamics as a linear time-varying system. Using a linear quadratic regulator (LQR) and layer-wise Jacobians, the approach applies closed-loop control to steer model behavior with minimal overhead and no additional training.
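
To make the control-theory idea concrete, here is a minimal sketch of a finite-horizon LQR on a linear time-varying system, written with NumPy. It is not the paper's implementation: the random near-identity matrices in `A_list` merely stand in for layer-wise Jacobians, the state is assumed to be the deviation of an activation from a desired target, the identity input matrices assume the steering vector is added directly to the activation, and every name and cost weight is hypothetical.

```python
import numpy as np

def lqr_gains(A_list, B_list, Q, R, Q_final):
    """Backward Riccati recursion for x[t+1] = A[t] x[t] + B[t] u[t].

    Returns one feedback gain K[t] per step, so that u[t] = -K[t] x[t].
    """
    P = Q_final
    gains = []
    for A, B in zip(reversed(A_list), reversed(B_list)):
        # K = (R + B^T P B)^{-1} B^T P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # P = Q + A^T P (A - B K)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]

def rollout(A_list, B_list, x0, gains=None):
    """Propagate the state through all steps, with or without feedback."""
    x = x0.copy()
    for t, (A, B) in enumerate(zip(A_list, B_list)):
        if gains is None:
            u = np.zeros(B.shape[1])   # open loop: no steering
        else:
            u = -gains[t] @ x          # closed loop: feedback on the current state
        x = A @ x + B @ u
    return x

rng = np.random.default_rng(0)
d, n_layers = 4, 6                     # toy sizes, not real model dimensions

# Assumption: stand-ins for per-layer Jacobians (residual layers are near identity).
A_list = [np.eye(d) + 0.1 * rng.standard_normal((d, d)) for _ in range(n_layers)]
# Assumption: the steering vector is added directly to the activation.
B_list = [np.eye(d) for _ in range(n_layers)]

Q = np.zeros((d, d))                   # no running penalty on the deviation
R = 0.1 * np.eye(d)                    # penalty on steering magnitude
Q_final = 10.0 * np.eye(d)             # penalty on the final-layer deviation

gains = lqr_gains(A_list, B_list, Q, R, Q_final)

x0 = rng.standard_normal(d)            # initial deviation from the target
open_loop = rollout(A_list, B_list, x0)
closed_loop = rollout(A_list, B_list, x0, gains)

print("final deviation without steering:", np.linalg.norm(open_loop))
print("final deviation with LQR steering:", np.linalg.norm(closed_loop))
```

In this toy setup the gains are computed once in a backward pass and then applied as a cheap matrix-vector product per layer, which is one way to read the "minimal overhead" claim; how the paper obtains and uses its Jacobians may differ.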

Why it matters: Treating model activations as controllable dynamical systems offers a training-free pathway toward more precise, real-time behavioral alignment and safety interventions.
Read the original at arXiv cs.LG

Tags

#llm #activation steering #control theory #alignment #transformer
