Python security developer-in-residence decries use of bots that ‘cannot understand code’

Software vulnerability submissions generated by AI models have ushered in a “new era of slop security reports for open source” – and the devs maintaining these projects wish bug hunters would rely less on results produced by machine learning assistants.

Seth Larson, security developer-in-residence at the Python Software Foundation, raised the issue in a blog post last week, urging those reporting bugs not to use AI systems for bug hunting.

“Recently I’ve noticed an uptick in extremely low-quality, spammy, and LLM-hallucinated security reports to open source projects,” he wrote, pointing to similar findings from the Curl project in January. “These reports appear at first glance to be potentially legitimate and thus require time to refute.”

Larson argued that low-quality reports should be treated as if they’re malicious.

As if to underscore the persistence of these concerns, a Curl project bug report posted on December 8 shows that nearly a year after maintainer Daniel Stenberg raised the issue, he’s still confronted by “AI slop” – and wasting his time arguing with a bug submitter who may be partially or entirely automated.

In response to the bug report, Stenberg wrote:

“We receive AI slop like this regularly and at volume. You contribute to [the] unnecessary load of Curl maintainers and I refuse to take that lightly and I am determined to act swiftly against it. Now and going forward.

“You submitted what seems to be an obvious AI slop ‘report’ where you say there is a security problem, probably because an AI tricked you into believing this. You then waste our time by not telling us that an AI did this for you and you then continue the discussion with even more crap responses – seemingly also generated by AI.”

Spammy, low-grade online content existed long before chatbots, but generative AI models have made it easier to produce the stuff. The result is pollution in journalism, web search, and of course social media.

For open source projects, AI-assisted bug reports are particularly pernicious because they require consideration and evaluation from security engineers – many of them volunteers – who are already pressed for time.

Larson told The Register that while he sees relatively few low-quality AI bug reports – fewer than ten each month – they represent the proverbial canary in the coal mine.

“Whatever happens to Python or pip is likely to eventually happen to more projects or more frequently,” he warned. “I am concerned mostly about maintainers that are handling this in isolation. If they don’t know that AI-generated reports are commonplace, they might not be able to recognize what’s happening before wasting tons of time on a false report. Wasting precious volunteer time doing something you don’t love and in the end for nothing is the surest way to burn out maintainers or drive them away from security work.”

Larson argued that the open source community needs to get ahead of this trend to mitigate potential damage.

“I am hesitant to say that ‘more tech’ is what will solve the problem,” he said. “I think open source security needs some fundamental changes. It can’t keep falling onto a small number of maintainers to do the work, and we need more normalization and visibility into these types of open source contributions.

“We should be answering the question: ‘how do we get more trusted individuals involved in open source?’ Funding for staffing is one answer – such as my own grant through Alpha-Omega – and involvement from donated employment time is another.”

While the open source community mulls how to respond, Larson asks that bug submitters not submit reports unless they’ve been verified by a human – and that they not use AI, because “these systems today cannot understand code.” He also urges platforms that accept vulnerability reports on behalf of maintainers to take steps to limit automated or abusive security report creation.

  • Lucy :3@feddit.org
    3 hours ago

    Some would say that this is a tactic by entities like the NSO Group to drown real vulnerabilities in spam.

    It’s not unlikely. Not unlikely at all.

  • Lvxferre@mander.xyz
    5 hours ago

    “Larson argued that low-quality reports should be treated as if they’re malicious.”

    It’s refreshing and uplifting to see this sort of sanity.

  • The Doctor@beehaw.org
    9 hours ago

    Our security@ address at $dayjob gets about that many a month. Lots of folks blindly sending bug reports and “politely requesting a finder’s fee for disclosing properly.”

    The shit of it is, they’re all for stuff we don’t even use. IIS vuln reports when we only use Apache. Stuff like that.

  • Bezier@suppo.fi
    8 hours ago

    So people are asking LLMs to come up with a problem that they can then file?

    • bluGill@fedia.io
      8 hours ago

      There are a number of jobs that count various community contributions in your review. This is a very large topic - write a paper in a major journal, give a speech at a conference, submit high-profile bug reports - and you can get a large raise for it. (The worry, of course, is that you build a reputation and someone gives you a great offer, so if they want to keep you they had better pay enough that you won’t leave.) Exactly what gets you those promotions/raises isn’t clear, in part because they need flexibility for someone who honestly discovers a new way to earn that reputation, and thus they have to give them a promotion. People who don’t deserve the promotion see the policy in place and look for ways to cheat their way to it.

        • DdCno1@beehaw.org (OP)
          5 hours ago

          Do you really think that had AI been available to apparatchiks in Communist countries, they wouldn’t have used it to advance their careers?

          The problem isn’t capitalism, it’s human nature, regardless of the system. Incentivize behavior that is beneficial to the individual (even if just in the short term) but not to society as a whole, and people will engage in it. It doesn’t matter if there’s a democratically elected leader, monarch or first party secretary at the helm of the nation.

          • forrgott@lemm.ee
            16 minutes ago

            This type of parasitic, even sociopathic behavior is directly rewarded in capitalism, though. Kinda figure that’s all they meant.

            Also, if capitalism is anywhere, it’s everywhere. Or this is at least true as long as the United States is in the picture…

  • tal@lemmy.today
    9 hours ago

    So, this isn’t quite the issue being raised by the article – that’s bug reports generated on bug trackers, apparently by a bot that they aren’t running.

    However, I do feel that there’s more potential with existing LLMs in checking and flagging potential errors than in outright writing code. Like, I’d rather have something like a “code grammar checker” that highlights potential errors for my examination rather than something that generates code from scratch itself and hopes that I will adequately review it.

    • GenderNeutralBro@lemmy.sdf.org
      8 hours ago

      “I’d rather have something like a ‘code grammar checker’ that highlights potential errors for my examination rather than something that generates code from scratch itself”

      Agreed. The other good use case I’ve found is as a faster reference for simple things. LLMs are absolutely great for one-liners and generating troublesome (but logically simple) things like complex xpath queries. But I still haven’t seen one generate a good script of even moderate complexity without hand-holding. In some cases I’ve been able to get usable output with a few shots, saving me a bit of time compared to if I’d written the whole darned thing from scratch.

      I’ve found LLMs very useful for coding, but they aren’t replacing my actual coding, per se. They replace looking things up, like through man pages, language references, or StackOverflow. Something like ffmpeg, for example, has a million options and it is always a little annoying to sift through the docs manually when I just need to do one specific task.

      I’m sure it’ll happen sooner or later. I’m not naive enough to claim that “computers will never be able to do $THING” anymore. I’ll say “not in the next year”, though.