• gerryflap@feddit.nl
    5 months ago

    Machine learning and compression have always been closely tied together. A model is trying to learn the “rules” that describe the data rather than memorizing all the data.

    I remember implementing a paper older than me in our “Information Theory” course at university that treated the creation of a decision tree as compression. Their algorithm considered the cost of sending the decision tree itself plus all the exceptions to it. If a node in the tree increased the overall message size, it would simply be pruned. This way they ensured that you wouldn’t draw conclusions from very little data and would only add the big patterns in the data.
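    A rough sketch of that pruning idea, from memory and heavily simplified. The bit costs below are my own toy encoding for two classes, not the paper's exact coding scheme, and `leaf_cost`, `split_cost` and `keep_split` are names I made up for illustration:

```python
import math

def exceptions_cost(n, k):
    """Bits to say which k of n examples disagree with the leaf's default class."""
    # log2(n + 1) bits to state k, then log2(C(n, k)) bits to pick the positions.
    return math.log2(n + 1) + math.log2(math.comb(n, k))

def leaf_cost(labels):
    """Message length of a leaf: 1 bit 'this is a leaf' + 1 bit default class + exceptions."""
    n = len(labels)
    ones = sum(labels)
    k = min(ones, n - ones)  # the minority class forms the exceptions
    return 2 + exceptions_cost(n, k)

def split_cost(left, right, n_attributes):
    """Message length of a decision node with two leaf children."""
    # 1 bit 'this is a split' + bits to name the attribute + both children.
    return 1 + math.log2(n_attributes) + leaf_cost(left) + leaf_cost(right)

def keep_split(left, right, n_attributes):
    """Keep the node only if it makes the total message shorter than a single leaf."""
    return split_cost(left, right, n_attributes) < leaf_cost(left + right)

print(keep_split([0] * 20, [1] * 20, 4))  # True: a big, clean pattern pays for itself
print(keep_split([0], [1], 4))            # False: two examples can't justify a node
```

    The second call is the interesting one: the split is “perfect”, but with only two examples the node costs more bits than just listing the exception, so it gets pruned.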

    Fundamentally it is just compression; it’s just a way better method of compression than all the models that we had before.

    EDIT: The paper I’m talking about is “Inferring decision trees using the minimum description length principle” - J. Ross Quinlan & Ronald L. Rivest

      • gerryflap@feddit.nl
        5 months ago

        Oh I never knew, but it seems true. Both lines of research are mentioned on his Wikipedia page. It’s so impressive how these researchers are behind so many different and interesting papers.

        • Goddard Guryon
          5 months ago

          Yup, it seems crazy to me how deep an insight one needs to be able to, say, connect the dots between compression and machine learning. And now it looks to me like he has done a lot of the foundational work in these fields. Super cool stuff.

    • TyrantTW@lemmy.ml
      5 months ago

      Thank you for this contribution! I was familiar with the idea of ML models capturing a compressed snapshot of the data, but that work on exploring its limits in DTs looks very interesting.