Coverage
This research investigates how explicit belief graphs affect LLM performance in cooperative multi-agent reasoning, using the game Hanabi as a testbed. The study identifies a failure mode it terms 'Planner Defiance' in certain models and finds that the effectiveness of graph integration depends heavily on whether the graph is supplied as context or used as a structural gate on action selection.
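The context-versus-gate distinction can be made concrete with a minimal sketch. Everything here is an illustrative assumption (the `BeliefGraph` class, the action dictionaries, the playability rule), not the paper's actual representation: the point is only that a structural gate filters actions before any policy ranks them, so a belief-inconsistent "defiant" move can never be emitted.

```python
# Hypothetical sketch: a belief graph used as a structural gate on action
# selection, rather than merely passed as context. All names and data
# structures here are illustrative assumptions.

class BeliefGraph:
    """Maps card slots to the set of identities believed possible."""
    def __init__(self, beliefs):
        self.beliefs = beliefs  # e.g. {"slot_0": {"R1", "R2"}, ...}

    def permits(self, action):
        """An action passes the gate only if consistent with beliefs."""
        if action["type"] != "play":
            return True  # hints/discards are not gated in this sketch
        believed = self.beliefs.get(action["slot"], set())
        # Gate rule (assumed): only play a card whose every believed
        # identity is currently playable.
        return bool(believed) and believed <= action["playable_now"]


def gated_select(candidate_actions, graph):
    """Filter candidates before any policy ranks them, so an
    inconsistent action can never be chosen."""
    allowed = [a for a in candidate_actions if graph.permits(a)]
    # Safe fallback if the gate rejects everything.
    return allowed or [{"type": "discard", "slot": "oldest"}]


graph = BeliefGraph({"slot_0": {"R1"}, "slot_1": {"R1", "B4"}})
candidates = [
    {"type": "play", "slot": "slot_0", "playable_now": {"R1", "G1"}},
    {"type": "play", "slot": "slot_1", "playable_now": {"R1", "G1"}},
]
# Only the slot_0 play survives: slot_1's believed identities include
# B4, which is not playable, so the gate rejects it.
print(gated_select(candidates, graph))
```

Under the context-only condition, by contrast, the same graph would simply be serialized into the prompt, leaving the model free to act against it.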
This research investigates the internal mechanisms of LLMs such as GPT-2 and Llama 3.2 to identify where societal biases reside within the networks. The study locates specific neurons and attention heads that encode stereotypical information, with the aim of better understanding and mitigating biased outputs.
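One common way to localize such components, sketched here with entirely synthetic data, is to rank individual neurons by how differently they activate on stereotypical versus anti-stereotypical inputs. The planted neuron index, the matrix shapes, and the effect-size score are all assumptions for illustration; real work would use actual GPT-2 or Llama 3.2 hidden states over a bias benchmark.

```python
import numpy as np

# Hypothetical sketch: rank neurons by an effect size comparing their
# activations on stereotypical vs. anti-stereotypical sentence pairs.
# The data is synthetic; neuron 42 is planted to respond to "stereotype".

rng = np.random.default_rng(0)
n_pairs, n_neurons = 200, 768

stereo = rng.normal(0.0, 1.0, (n_pairs, n_neurons))
anti = rng.normal(0.0, 1.0, (n_pairs, n_neurons))
stereo[:, 42] += 2.0  # planted bias-sensitive neuron

# Effect size per neuron: mean activation gap scaled by pooled std.
gap = stereo.mean(axis=0) - anti.mean(axis=0)
pooled_std = np.sqrt((stereo.var(axis=0) + anti.var(axis=0)) / 2)
effect = gap / pooled_std

# Top-scoring neurons are candidate bias encoders.
top = np.argsort(-np.abs(effect))[:5]
print(top)  # neuron 42 dominates in this synthetic setup
```

The same scoring idea extends to attention heads by treating each head's output as the unit being compared instead of a single neuron.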
Researchers have identified spectral phase transitions in the hidden activation spaces of large language models during reasoning versus factual recall. Analyzing 11 models across 5 architectures, the study shows how spectral properties can predict reasoning steps and answer correctness.
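As a minimal illustration of what a "spectral property" of an activation space can mean, the sketch below computes the entropy of the normalized singular-value spectrum of a (tokens x hidden_dim) activation matrix. The matrices are synthetic stand-ins, and this particular metric is an assumption, not necessarily the one used in the study: a concentrated (low-rank) spectrum yields low entropy, while a diffuse spectrum yields high entropy.

```python
import numpy as np

# Hypothetical sketch: spectral entropy of an activation matrix as one
# possible "spectral property". Synthetic matrices stand in for real
# (tokens x hidden_dim) hidden states.

def spectral_entropy(acts):
    """Entropy of the normalized singular-value energy spectrum."""
    s = np.linalg.svd(acts, compute_uv=False)
    p = (s ** 2) / np.sum(s ** 2)  # normalized spectral "energy"
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

rng = np.random.default_rng(0)
diffuse = rng.normal(size=(64, 128))                 # near full-rank
low_rank = rng.normal(size=(64, 3)) @ rng.normal(size=(3, 128))  # rank 3

# A diffuse spectrum has higher entropy than a rank-3 one.
print(spectral_entropy(diffuse) > spectral_entropy(low_rank))
```

Tracking such a scalar layer by layer, or token by token, is one way a sharp change in spectral shape could mark a transition between recall-like and reasoning-like processing.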
