Which AI is the ethically sourced one?
There are a number of open-weight, open-source models out there whose training data comes from openly documented public sources. Look up BLOOM and Falcon; there are others.
JetBrains’ AI code suggestions were only trained on code whose authors gave explicit permission for it, but that’s the only one I know of off the top of my head. Most chat-oriented LLMs (ChatGPT, Claude, Gemini…) were almost certainly trained on data obtained through corporate piracy.