Not even stupid, just badly trained for that purpose. It’s no different from an LLM asked to write code that gets most of it right but flubs a subroutine. Misalignment doesn’t imply bad or evil; it’s just doing what it thinks the goal really is while we’re ignorant of the results.