Apr 22
Where Fake Citations Are Made: Tracing Field-Level Hallucination to Specific Neurons in LLMs
Significance: 3/5
Researchers investigated why large language models generate fake citations, finding that among citation fields, author names are particularly prone to hallucination. The study identifies specific 'hallucination neurons' in the Qwen2.5-32B-Instruct model and demonstrates that suppressing these neurons can improve citation accuracy.
Why it matters
Identifying specific neurons responsible for hallucinations offers a potential mechanistic pathway for engineering more reliable, fact-based generative outputs.
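The suppression experiment maps naturally onto activation hooks. Below is a minimal sketch of how one might zero out specific MLP neurons at inference time, assuming the standard Hugging Face transformers layout for Qwen2-family models (model.model.layers[i].mlp.down_proj); the layer and neuron indices are hypothetical placeholders, not values reported in the paper.

```python
# Minimal sketch: suppress candidate "hallucination neurons" with PyTorch
# forward pre-hooks. Layer/neuron indices are HYPOTHETICAL placeholders,
# not values from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Qwen/Qwen2.5-32B-Instruct"  # the model studied in the paper
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

# Hypothetical neuron coordinates: {layer index: [MLP hidden-unit indices]}.
SUSPECT_NEURONS = {40: [1123, 5871], 41: [209]}


def make_pre_hook(neuron_ids):
    """Zero the chosen MLP hidden units just before the down projection."""
    def pre_hook(module, args):
        hidden = args[0].clone()       # shape: (batch, seq, intermediate_size)
        hidden[..., neuron_ids] = 0.0  # ablate the suspect neurons
        return (hidden,)
    return pre_hook


handles = [
    model.model.layers[layer].mlp.down_proj.register_forward_pre_hook(
        make_pre_hook(ids)
    )
    for layer, ids in SUSPECT_NEURONS.items()
]

prompt = "Give a full citation for one paper on citation hallucination."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))

for h in handles:
    h.remove()  # detach hooks to restore the unmodified model
```

In practice the candidate neurons would come from the paper's attribution analysis; the hook approach simply lets them be ablated at inference time without any retraining.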
Entities mentioned
Qwen
Tags
#hallucination #llm #citations #interpretability #qwen
Related coverage
- Global South Opportunities: Pivotal Research Fellowship 2026 (Q3): AI Safety Research Opportunity
- arXiv cs.AI: An Intelligent Fault Diagnosis Method for General Aviation Aircraft Based on Multi-Fidelity Digital Twin and FMEA Knowledge Enhancement
- arXiv cs.AI: PExA: Parallel Exploration Agent for Complex Text-to-SQL
- arXiv cs.AI: The Power of Power Law: Asymmetry Enables Compositional Reasoning
- arXiv cs.AI: On the Existence of an Inverse Solution for Preference-Based Reductions in Argumentation