9. Deception ★ Genuine & Secure

: Large language models may exhibit "superficial alignment," where they deceive weaker monitoring systems. 🩺 Clinical & Professional Ethics

: Emotional arousal from lying can cause visible changes in body language, voice quality, and heart rate. 🛡️ Domains of Deception

Super(ficial)-alignment: Strong Models May Deceive Weak ... - arXiv