Researchers were able to successfully hack into more than half their test websites using autonomous teams of GPT-4 bots, co-ordinating their efforts and spawning new bots at will. And this was using previously-unknown, real-world 'zero day' exploits.
Bruteforce and fuzztesting with AI identifying possible success results?