• Asafum@feddit.nl · 2 days ago

    Stocks take a hit with new foreign competition and they are immediately attacked. Coincidence I’m sure lol

    “We will deport you too if you sign up to deepseek!”

    -Trump admin probably

    • Breve@pawb.social · 2 days ago

      He’s going to slap 100% tariffs on all network packets passing in and out of China! That will fix everything including the price of eggs!

  • TheGrandNagus@lemmy.world · 2 days ago

    “Open source” and “commercial” aren’t opposites; plenty of models we consider commercial are also ‘open source’ - an obvious example being Facebook/Meta’s models…

    Looking outside of AI, there are plenty more examples. Chromium is open source; does that mean it and Google’s Blink web rendering engine are non-commercial? I’d say no.

    It should also be noted that there’s been some pushback recently on whether models trained on closed data sources should be called “open source” just because the model itself is.

    • gsfraley@lemmy.world · 2 days ago

      Fair, but even if it is commercial, the project being open-source is a huge step in the right direction. Specifically for DeepSeek, it has a number of censored topics like “Tiananmen Square” that it refuses to speak to, but because it’s open source, unaffiliated third parties have been able to release spins of it that reintegrate those sensitive topics.

      Perfect? Definitely not, but I’d say it’s almost 50% of the way there compared to the awful nature of ChatGPT/Google’s models. And it also made people realize that this is possible, so there’ll be more people taking it in a good direction who otherwise might not’ve tried.

    • Epzillon@lemmy.world · 2 days ago

      The Open Source Initiative has defined what it believes constitutes “open source AI” (https://opensource.org/ai/open-source-ai-definition). This includes detailed descriptions of the training data and explanations of how it was obtained, selected, labeled, processed, and filtered. As long as a company utilizes a model trained on unspecified data, I will assume the data is either stolen or otherwise unlawfully obtained from non-consenting users.

      To be clear, I have not read up on DeepSeek yet, but I have a hard time believing their training data is specified according to the OSI definition, since no big model has done so yet. Releasing the model source code means little for AI compared to all its training data.

      • cyd@lemmy.world · 2 days ago

        No AI org of any significant size will ever disclose its full training set, and it’s foolish to expect such a standard to be met. There is just too much liability. No matter how clean your data collection procedure is, there’s no way to guarantee that a data set with billions of samples won’t contain at least one thing a lawyer could zero in on and drag you into a lawsuit over.

        What DeepSeek did (full disclosure of methods in a scientific paper, release of weights under an MIT license, and release of some auxiliary code) is as much as one can expect.

        • Epzillon@lemmy.world · 2 days ago

          As I wrote in my comment, I have not read up on DeepSeek, but if this is true it is definitely a step in the right direction.

          I am not saying I expect any company of significant scale to follow the OSI definition since, as you say, it is too high-risk. I do still believe that if you cannot prove to me that your AI is not abusing artists or creators by using their art, or not using data non-consensually acquired from users of your platform, you are not providing an ethical or moral service. This is my main concern with AI. Big tech keeps showing us, time and time again, that they really don’t care about these topics, and this needs to change.

          Imo AI today is developing and expanding way too fast for the general consumer to understand it, and by extension the legal and justice systems too. We need more laws in place regarding how to handle AI and the data it uses and produces. We need more education on what AI is actually doing.

      • TheGrandNagus@lemmy.world · 2 days ago

        Indeed. This is what I was thinking of, except I couldn’t remember whether it was the OSI or the FSF pushing for it and, well, I’m too lazy to check lol

        Thanks

      • TheGrandNagus@lemmy.world · 2 days ago

        They do use it commercially. It’s licensed for commercial use, just with restrictions outside of Meta, because they want to be anticompetitive.