it will lose its ability to differentiate between there and their and its and it’s.

  • driving_crooner@lemmy.eco.br
    9 months ago

    It’s about the counting subreddit. Its posts were included in the data used to build the tokenizer’s vocabulary, but were then removed from the training data. This user posted so much on that subreddit that a token was created for their username, but nothing was associated with that token during training, so the model doesn’t know how to act when the token is present.
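
A rough sketch of that mismatch, for illustration only: it assumes a toy word-level tokenizer rather than the subword scheme real models use, and `counter_user`, `build_vocab`, and `untrained_tokens` are hypothetical names, not anything from an actual model or library. The idea is that the vocabulary is built from one corpus, the model is trained on a filtered copy, and any token that never shows up in the filtered copy is left with nothing learned about it.

```python
from collections import Counter

def build_vocab(corpus, min_count=2):
    """Toy word-level 'tokenizer': keep every word seen at least min_count times."""
    counts = Counter(word for doc in corpus for word in doc.split())
    return {word for word, n in counts.items() if n >= min_count}

def untrained_tokens(vocab, training_corpus):
    """Return vocabulary entries that never occur in the training corpus."""
    seen = {word for doc in training_corpus for word in doc.split()}
    return sorted(vocab - seen)

# Corpus used to build the vocabulary (includes the counting posts).
tokenizer_corpus = [
    "the model reads the text",
    "the text is split into tokens",
    "the model sees tokens",
    "counter_user 101",
    "counter_user 102",
    "counter_user 103",
]

# Corpus used for training: the counting posts were filtered out (assumed).
training_corpus = [doc for doc in tokenizer_corpus if "counter_user" not in doc]

vocab = build_vocab(tokenizer_corpus)
orphans = untrained_tokens(vocab, training_corpus)
print(orphans)  # tokens the model has an ID for but never saw in training
```

In this toy setup the username is frequent enough to earn a vocabulary entry, yet it never appears in the training data, which is the situation the comment describes.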