arXiv cs.AI AI Safety 11h ago

When AI reviews science: Can we trust the referee?

★★★★★ significance 3/5

This paper investigates the reliability and security risks of using large language models for scientific peer review. It identifies vulnerabilities such as prompt injection attacks, authority bias, and hallucination, providing a taxonomy of risks across the review lifecycle.

Why it matters Automating peer review introduces systemic vulnerabilities like authority bias and prompt injection that could compromise the integrity of scientific validation.

Read the original at arXiv cs.AI

Related coverage

arXiv cs.AIPhySE: A Psychological Framework for Real-Time AR-LLM Social Engineering Attacks
arXiv cs.AIUlterior Motives: Detecting Misaligned Reasoning in Continuous Thought Models
arXiv cs.AIAgentic Adversarial Rewriting Exposes Architectural Vulnerabilities in Black-Box NLP Pipelines
arXiv cs.AIStructural Enforcement of Goal Integrity in AI Agents via Separation-of-Powers Architecture
arXiv cs.CLMechanistic Steering of LLMs Reveals Layer-wise Feature Vulnerabilities in Adversarial Settings

When AI reviews science: Can we trust the referee?

Tags

Related coverage