Gemma
Coverage
Researchers investigate why transformer models struggle to generalize to unseen tokens in symbolic reasoning tasks. The study identifies 'representational collapse' in unembeddings as a key cause and proposes architectural and training interventions to improve performance.
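As a rough illustration of what collapse at the unembedding could look like (not the paper's code), the sketch below measures how tightly the unembedding rows of tokens never seen in training cluster together compared with the rest of the vocabulary. The `collapse_report` helper and the synthetic demo are assumptions made for illustration only.

```python
# Illustrative diagnostic, not the study's actual analysis: near-identical
# unembedding rows for unseen tokens would be a symptom of representational
# collapse, since the model cannot tell those tokens apart at the output.
import numpy as np

def mean_pairwise_cosine(rows: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of row vectors."""
    normed = rows / np.linalg.norm(rows, axis=1, keepdims=True)
    sims = normed @ normed.T
    off_diag = sims[~np.eye(len(rows), dtype=bool)]
    return float(off_diag.mean())

def collapse_report(unembedding: np.ndarray, unseen_ids: list) -> dict:
    """Compare clustering of unseen-token rows against the trained rows."""
    unseen = np.zeros(len(unembedding), dtype=bool)
    unseen[unseen_ids] = True
    return {
        "unseen_mean_cos": mean_pairwise_cosine(unembedding[unseen]),
        "seen_mean_cos": mean_pairwise_cosine(unembedding[~unseen]),
    }

# Synthetic demo: rows for "unseen" tokens stay near a shared initialization
# (collapsed), while rows for trained tokens are spread out.
rng = np.random.default_rng(0)
U = rng.normal(size=(1000, 64))
unseen_ids = list(range(900, 1000))
U[unseen_ids] = rng.normal(size=(1, 64)) + 0.01 * rng.normal(size=(100, 64))
print(collapse_report(U, unseen_ids))  # unseen_mean_cos ~ 1.0, seen_mean_cos ~ 0.0
```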
Researchers introduced ActuBench, a multi-agent LLM pipeline designed to automatically generate and evaluate complex actuarial reasoning tasks. The system uses specialized agents for drafting, distractor construction, and verification to ensure high-quality assessment items aligned with professional standards.
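A minimal sketch of the three-role pipeline, assuming a generic chat-completion client behind a placeholder `call_llm`; the prompts, the `Item` structure, and the retry policy are illustrative assumptions, not ActuBench's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Optional

def call_llm(prompt: str) -> str:
    # Placeholder: plug in whatever LLM client the pipeline uses.
    raise NotImplementedError

@dataclass
class Item:
    question: str = ""
    distractors: list = field(default_factory=list)
    verified: bool = False

def draft_agent(topic: str) -> Item:
    # Drafting agent: produce one exam-style question with a worked solution.
    q = call_llm(f"Draft one exam-style actuarial question on: {topic}. "
                 "Include the worked solution.")
    return Item(question=q)

def distractor_agent(item: Item, n: int = 3) -> Item:
    # Distractor-construction agent: plausible but incorrect answer options.
    raw = call_llm(f"Write {n} plausible but incorrect answer options for:\n{item.question}")
    item.distractors = [line.strip() for line in raw.splitlines() if line.strip()][:n]
    return item

def verifier_agent(item: Item) -> Item:
    # Verification agent: check correctness and alignment with professional standards.
    verdict = call_llm("Check this item for correctness and alignment with "
                       f"professional actuarial standards. Reply PASS or FAIL.\n{item.question}")
    item.verified = verdict.strip().upper().startswith("PASS")
    return item

def generate_item(topic: str, max_attempts: int = 3) -> Optional[Item]:
    # Chain the agents; retry a few times if verification fails.
    for _ in range(max_attempts):
        item = verifier_agent(distractor_agent(draft_agent(topic)))
        if item.verified:
            return item
    return None
```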
Researchers have introduced a two-dimensional early-exit strategy for LLM inference that halts computation early along two axes: across layers and across sentences. The method yields substantial computational savings and speed-ups on classification tasks across open-source models such as Llama and Gemma.
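A schematic sketch of how the two exit dimensions could interact on a batch of sentences, assuming per-layer linear exit heads and a max-softmax confidence threshold; the actual exit criterion and head design in the paper may differ.

```python
import numpy as np

def softmax(x):
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def early_exit_classify(hidden, layers, exit_heads, threshold=0.9):
    """hidden: (batch, dim) sentence representations entering layer 0.
    layers[i]: callable mapping (m, dim) -> (m, dim) for the i-th layer.
    exit_heads[i]: (dim, num_classes) probe attached after layer i (assumed)."""
    preds = np.full(hidden.shape[0], -1)
    active = np.arange(hidden.shape[0])        # sentences still being processed
    for layer, head in zip(layers, exit_heads):
        hidden = layer(hidden)                 # layer-wise axis: run one more layer
        probs = softmax(hidden @ head)         # lightweight exit classifier
        done = probs.max(axis=1) >= threshold
        preds[active[done]] = probs[done].argmax(axis=1)
        active, hidden, probs = active[~done], hidden[~done], probs[~done]
        if active.size == 0:                   # every sentence exited early
            break
    if active.size:                            # no exit fired: use last layer's prediction
        preds[active] = probs.argmax(axis=1)
    return preds

# Tiny demo with random "layers" and exit heads.
rng = np.random.default_rng(0)
dim, n_classes, depth = 16, 3, 4
layers = [lambda h, W=rng.normal(size=(dim, dim)) / dim**0.5: np.tanh(h @ W) for _ in range(depth)]
heads = [rng.normal(size=(dim, n_classes)) for _ in range(depth)]
print(early_exit_classify(rng.normal(size=(8, dim)), layers, heads, threshold=0.6))
```

Sentences that cross the confidence threshold are dropped from the batch, so later layers process progressively fewer inputs; that is where the computational savings come from in this sketch.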
