• Zaktor
    link
    fedilink
    English
    arrow-up
    2
    ·
    10 months ago

    That’s just not true. Semantic encodings work. It’s not like neural networks are some new untested concept, the LLMs have some new tricks under the hood and are way more extensive in their training goal, but they’re fundamentally the same thing. All neural networks are mimicry machines enabled and limited by their data, but mimicking largely correct data produces largely correct results when the answer, or interpolatable answers exists in the training data. The problem arises when asked to go further and further afield from their inputs. Some interpolation and substitutions work, but it gets increasingly unreliable the more niche the answer is.

    While the LLM hype has very seriously oversold their abilities, the instinctive backlash to say they’re useless is similarly way off-base.

    • Veraticus@lib.lgbtOP
      link
      fedilink
      English
      arrow-up
      2
      ·
      10 months ago

      No one is saying “they’re useless.” But they are indeed bullshit machines, for the reasons the author (and you yourself) acknowledged. Their purposes is to choose likely words. That likely and correct are frequently the same shouldn’t blind us to the fact that correctness is a coincidence.

      • Zaktor
        link
        fedilink
        English
        arrow-up
        1
        ·
        10 months ago

        That likely and correct are frequently the same shouldn’t blind us to the fact that correctness is a coincidence.

        That’s an absurd statement. Do you have any experience with machine learning?

          • Zaktor
            link
            fedilink
            English
            arrow-up
            1
            ·
            10 months ago

            Yes, it’s been my career for the last two decades and before that was the focus of my education. The idea that “correctness is a coincidence” is absurd and either fails to understand how training works or rejects the entire premise of large data revealing functional relationships in the underlying processes.

            • Veraticus@lib.lgbtOP
              link
              fedilink
              English
              arrow-up
              1
              ·
              10 months ago

              Or you’ve simply misunderstood what I’ve said despite your two decades of experience and education.

              If you train a model on a bad dataset, will it give you correct data?

              If you ask a question a model it doesn’t have enough data to be confident about an answer, will it still confidently give you a correct answer?

              And, more importantly, is it trained to offer CORRECT data, or is it trained to return words regardless of whether or not that data is correct?

              I mean, it’s like you haven’t even thought about this.