Apr 20
When the Loop Closes: Architectural Limits of In-Context Isolation, Metacognitive Co-option, and the Two-Target Design Problem in Human-LLM Systems
★★★★★
significance 3/5
This paper presents a case study on the risks of human-LLM interaction, specifically how prompt-engineered feedback systems can erode human decision-making authority. The researchers identify 'context contamination' as a mechanism by which LLM-driven feedback loops lead humans to externalize cognitive self-regulation. The study argues that logical isolation alone is insufficient to break such cycles and that physical interruption is required.
Why it matters
Demonstrates how prompt-driven context contamination can erode human agency and decision-making authority in high-stakes human-AI collaborative environments.
Tags
#human-ai interaction #cognitive bias #llm safety #metacognition #prompt engineering
Related coverage
- arXiv cs.AI: PhySE: A Psychological Framework for Real-Time AR-LLM Social Engineering Attacks
- arXiv cs.AI: Ulterior Motives: Detecting Misaligned Reasoning in Continuous Thought Models
- arXiv cs.AI: Agentic Adversarial Rewriting Exposes Architectural Vulnerabilities in Black-Box NLP Pipelines
- arXiv cs.AI: When AI reviews science: Can we trust the referee?
- arXiv cs.AI: Structural Enforcement of Goal Integrity in AI Agents via Separation-of-Powers Architecture