beep@piefed.world to Technology@beehaw.orgEnglish · 15 days agoAnthropic: Claude learned to blackmail by reading stories about evil AI—"We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation."alignment.anthropic.comexternal-linkmessage-square10linkfedilinkarrow-up124arrow-down17file-textcross-posted to: tech@piefed.world
arrow-up117arrow-down1external-linkAnthropic: Claude learned to blackmail by reading stories about evil AI—"We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation."alignment.anthropic.combeep@piefed.world to Technology@beehaw.orgEnglish · 15 days agomessage-square10linkfedilinkfile-textcross-posted to: tech@piefed.world
cross-posted from: https://piefed.world/c/tech/p/1099863/anthropic-claude-learned-to-blackmail-by-reading-stories-about-evil-ai-we-believe-the-or Quote Source.
minus-squaresupersquirrellinkfedilinkarrow-up24·15 days agoCan we stop repeating Anthropic’s fear mongering about AI? They are just trying to prop up their stock price and keep the bubble from popping a little longer.
Can we stop repeating Anthropic’s fear mongering about AI? They are just trying to prop up their stock price and keep the bubble from popping a little longer.