Gemma
Coverage
Researchers investigate why transformer models struggle to generalize to unseen tokens in symbolic reasoning tasks. The study identifies 'representational collapse' in unembeddings as a key cause and proposes architectural and training interventions to improve performance.
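As a rough illustration of what collapse at the unembedding could look like (not the paper's code), the sketch below measures how tightly the unembedding rows of tokens never seen in training cluster together compared with the rest of the vocabulary. The `collapse_report` helper and the synthetic demo are assumptions made for illustration only.

```python
# Illustrative diagnostic, not the study's actual analysis: near-identical
# unembedding rows for unseen tokens would be a symptom of representational
# collapse, since the model cannot tell those tokens apart at the output.
import numpy as np

def mean_pairwise_cosine(rows: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of row vectors."""
    normed = rows / np.linalg.norm(rows, axis=1, keepdims=True)
    sims = normed @ normed.T
    off_diag = sims[~np.eye(len(rows), dtype=bool)]
    return float(off_diag.mean())

def collapse_report(unembedding: np.ndarray, unseen_ids: list) -> dict:
    """Compare clustering of unseen-token rows against the trained rows."""
    unseen = np.zeros(len(unembedding), dtype=bool)
    unseen[unseen_ids] = True
    return {
        "unseen_mean_cos": mean_pairwise_cosine(unembedding[unseen]),
        "seen_mean_cos": mean_pairwise_cosine(unembedding[~unseen]),
    }

# Synthetic demo: rows for "unseen" tokens stay near a shared initialization
# (collapsed), while rows for trained tokens are spread out.
rng = np.random.default_rng(0)
U = rng.normal(size=(1000, 64))
unseen_ids = list(range(900, 1000))
U[unseen_ids] = rng.normal(size=(1, 64)) + 0.01 * rng.normal(size=(100, 64))
print(collapse_report(U, unseen_ids))  # unseen_mean_cos ~ 1.0, seen_mean_cos ~ 0.0
```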
Researchers introduced ActuBench, a multi-agent LLM pipeline designed to automatically generate and evaluate complex actuarial reasoning tasks. The system uses specialized agents for drafting, distractor construction, and verification to ensure high-quality assessment items aligned with professional standards.
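A minimal sketch of the three-role pipeline, assuming a generic chat-completion client behind a placeholder `call_llm`; the prompts, the `Item` structure, and the retry policy are illustrative assumptions, not ActuBench's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Optional

def call_llm(prompt: str) -> str:
    # Placeholder: plug in whatever LLM client the pipeline uses.
    raise NotImplementedError

@dataclass
class Item:
    question: str = ""
    distractors: list = field(default_factory=list)
    verified: bool = False

def draft_agent(topic: str) -> Item:
    # Drafting agent: produce one exam-style question with a worked solution.
    q = call_llm(f"Draft one exam-style actuarial question on: {topic}. "
                 "Include the worked solution.")
    return Item(question=q)

def distractor_agent(item: Item, n: int = 3) -> Item:
    # Distractor-construction agent: plausible but incorrect answer options.
    raw = call_llm(f"Write {n} plausible but incorrect answer options for:\n{item.question}")
    item.distractors = [line.strip() for line in raw.splitlines() if line.strip()][:n]
    return item

def verifier_agent(item: Item) -> Item:
    # Verification agent: check correctness and alignment with professional standards.
    verdict = call_llm("Check this item for correctness and alignment with "
                       f"professional actuarial standards. Reply PASS or FAIL.\n{item.question}")
    item.verified = verdict.strip().upper().startswith("PASS")
    return item

def generate_item(topic: str, max_attempts: int = 3) -> Optional[Item]:
    # Chain the agents; retry a few times if verification fails.
    for _ in range(max_attempts):
        item = verifier_agent(distractor_agent(draft_agent(topic)))
        if item.verified:
            return item
    return None
```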
Researchers have introduced a two-dimensional early-exit strategy for LLM inference that halts computation early along two axes: across layers and across sentences. The method yields substantial computational savings and speed-ups on classification tasks across open-source models such as Llama and Gemma.
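A schematic sketch of how the two exit dimensions could interact on a batch of sentences, assuming per-layer linear exit heads and a max-softmax confidence threshold; the actual exit criterion and head design in the paper may differ.

```python
import numpy as np

def softmax(x):
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def early_exit_classify(hidden, layers, exit_heads, threshold=0.9):
    """hidden: (batch, dim) sentence representations entering layer 0.
    layers[i]: callable mapping (m, dim) -> (m, dim) for the i-th layer.
    exit_heads[i]: (dim, num_classes) probe attached after layer i (assumed)."""
    preds = np.full(hidden.shape[0], -1)
    active = np.arange(hidden.shape[0])        # sentences still being processed
    for layer, head in zip(layers, exit_heads):
        hidden = layer(hidden)                 # layer-wise axis: run one more layer
        probs = softmax(hidden @ head)         # lightweight exit classifier
        done = probs.max(axis=1) >= threshold
        preds[active[done]] = probs[done].argmax(axis=1)
        active, hidden, probs = active[~done], hidden[~done], probs[~done]
        if active.size == 0:                   # every sentence exited early
            break
    if active.size:                            # no exit fired: use last layer's prediction
        preds[active] = probs.argmax(axis=1)
    return preds

# Tiny demo with random "layers" and exit heads.
rng = np.random.default_rng(0)
dim, n_classes, depth = 16, 3, 4
layers = [lambda h, W=rng.normal(size=(dim, dim)) / dim**0.5: np.tanh(h @ W) for _ in range(depth)]
heads = [rng.normal(size=(dim, n_classes)) for _ in range(depth)]
print(early_exit_classify(rng.normal(size=(8, dim)), layers, heads, threshold=0.6))
```

Sentences that cross the confidence threshold are dropped from the batch, so later layers process progressively fewer inputs; that is where the computational savings come from in this sketch.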
