The 8088 The 8088 ← All news
arXiv cs.CL AI Safety 11h ago

AI Safety Training Can be Clinically Harmful

★★★★ significance 4/5

Researchers found that RLHF safety alignment can actually harm mental health therapy sessions by disrupting therapeutic protocols. The study shows that when LLMs encounter high-severity scenarios, safety-driven responses often interfere with the necessary psychological mechanisms of CBT and PE therapies.

Why it matters Safety-driven alignment protocols risk undermining therapeutic efficacy by disrupting the essential psychological mechanisms required for clinical mental health interventions.
Read the original at arXiv cs.CL

Tags

#llm safety #mental health #rlhf #clinical efficacy #alignment

Related coverage