There are lots of articles about bad use cases of ChatGPT that Google has already provided for decades.
Want to get bad medical advice for the weird pain in your belly? Google can tell you it’s cancer, no problem.
Do you want to know how to make drugs without a lab? Google even gives you links to stores where you can buy the materials for it.
Want some racism/misogyny/other evil content? Google is your ever helpful friend and garbage dump.
What’s the difference apart from ChatGPT’s inability to link to existing sources?
Edit: Just to clear things up. This post is specifically not about the new use cases that come from AI. Sure, Google cannot make semi-non-functional mini programs automatically, and Google will not write a fake paper in whole for me. I am specifically talking about the “This will change the world” articles, that mirror stuff that Google can do exactly like ChatGPT can.
It’s precisely ChatGPT’s inability to link the sources that is the concern.
Also, the fact that Google is so bad about this stuff should give us some pause about directly handing even more incredibly powerful, not-well-curated tools to a constituency as broad as “everyone with an internet connection”. It’s actually insanely bad that this is a problem with Google; we’ve just normalized it there and it sucks!
At the heart of the issue I think is the fact that GPT can trick enough people into believing that there’s organized thought behind what it says. So people have started trusting and using AI in spaces that it doesn’t belong. Some fields have been resistant, but when there are places that operate under the incentive of cheapest labor wins (lowest bid contracts, for example), AI as a whole has been infiltrating under the guise of capitalism in places it shouldn’t currently (or perhaps ever) exist.
When you tell people not to trust Google results or an article posted on Facebook, there’s a surface level of understanding from most people that yeah, you can go on the internet and spout nonsense and you shouldn’t trust these as sources. People might be a bit more willing to trust Wikipedia nowadays, despite all their librarian friends and teachers telling them not to (and for hard research, this is still quite true). But despite the fact that AI is literally trained on data from the internet, people for some reason don’t view it the same way as the data it’s trained on. It’s ingested from people talking about and posting links to nonsense on Facebook. It reflects this very data, yet many people treat it like it’s a thesaurus, or a dictionary, or better than Google results. It’s not, and we need to teach people that it isn’t.
The fact that there are lawyers already facing repercussions for using ChatGPT in their cases and citing things that don’t exist proves it.
The hype cycle around AI right now is misleading. It isn’t revolutionary because of these niche one-off use-cases, it’s revolutionary because it’s one AI that can do anything. Problem with that is what it’s most useful for is boring for non-technical people.
Take the library I wrote to create “semantic functions” from natural language tasks - one of the examples I keep going to in order to demonstrate the usefulness is
@semantic
def list_people(text) -> list[str]:
    '''List the people mentioned in the given text.'''
8 months ago, this would’ve been literally impossible. I could approximate it with thousands of lines of code using SpaCy and other NLP libraries to do NER, maybe a dictionary of known names with fuzzy matching, some heuristics to rule out city names or more advanced sentence structure parsing for false positives, but the result would be guaranteed to be worse for significantly more effort. Here, I just tell the AI to do it and it… does. Just like that. But you can’t hype up an algorithm that does boring stuff like NLP, so people focus on the danger of AI (which is real, but laymen and news focus on the wrong things), how it’s going to take everyone’s jobs (it will, but that’s a problem with our system which equates having a job to being allowed to live), how it’s super-intelligent, etc. It’s all the business logic and doing things that are hard to program but easy to describe that will really show off its power.
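For anyone wondering what a decorator like this could look like under the hood, here is a minimal sketch. It is not the library mentioned above: the `semantic(complete)` signature, the prompt wording, the JSON reply format and the `fake_model` stand-in are all assumptions made for illustration. A real setup would pass in a function that actually calls a model.

```python
import functools
import json
from typing import Callable, List


def semantic(complete: Callable[[str], str]):
    """Hypothetical decorator factory.

    `complete` is assumed to be any function that sends a prompt to a
    language model and returns the reply as plain text.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(text: str) -> List[str]:
            # The docstring is the task description; the function body is never run.
            task = (func.__doc__ or "").strip()
            prompt = (
                f"Task: {task}\n"
                "Reply with a JSON array of strings and nothing else.\n"
                f"Text: {text}"
            )
            reply = complete(prompt)
            # json.loads raises a ValueError subclass if the reply is not JSON.
            result = json.loads(reply)
            if not isinstance(result, list) or not all(isinstance(x, str) for x in result):
                raise ValueError(f"expected a JSON array of strings, got: {reply!r}")
            return result
        return wrapper
    return decorator


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return '["Ada Lovelace", "Charles Babbage"]'


@semantic(fake_model)
def list_people(text) -> list[str]:
    '''List the people mentioned in the given text.'''


print(list_people("Ada Lovelace wrote to Charles Babbage."))
```

The validation step matters more than it looks: the model's reply is just text, so the wrapper has to check it really is a list of strings before handing it to the rest of the program.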
I’m both really excited and worried about the part where AI takes over so many jobs that enough people will be without work. I wonder how society will deal with that: will everyone get a proper base “salary” for existing, or will there be huge refugee-like camps for the poor jobless people?
Under capitalism, I fear automation will mean that people who lose jobs machines can do are left with only worse, often dangerous options. Meanwhile, entertainment and the like will be flooded by even shittier-quality AI-made crap. I can only pray it will mean everyone’s basic needs being covered, but that requires a huge shift.
Yeah, I just don’t see that happening. The whole “western” world is taking hard turn to the right, and that’s not going to get better any time soon.
The problem is that the ones that will benefit from AI taking over are the big companies that create such AIs: Google, Meta, Apple. They will grow exponentially by having AIs work for them 24/7. So it’s not like humanity as a whole will grow, it will just be these companies, and they will slowly become the rulers of humanity.
It’s not so much the inability to link sources but the active laundering of those sources that bugs me. We’ve been lucky that shady information has largely had a vibe that’s pretty easy to spot. ChatGPT presents everything with the same level of professionalism.
Worse, while we might collectively start discounting direct chatbot output because LLMs are dirty liars, scammers can now cheaply rewrite their typo-ridden weird ass screeds into something resembling professionally produced copy.
Oftentimes scammers put a few typos and whatnot into initial contact to weed out smarter people, mainly if the scam is going to involve phone calls or something. Scams just trying to get passwords or infect your computer might try harder to look legit.
Only because interacting with people smart enough to recognize their spelling/grammar as fucked is a waste of their time. If it’s borderline free because an AI does the work, there’s way less need to do that.
I like to think of it from a different standpoint. Propaganda and fake news have existed for hundreds of years, if not millennia. It’s just that in the past they were mostly created by wealthy folks, and now anyone can create their own.
You could even say, propaganda and fake news were the original form.
One enables the other, or rather the snake is constantly eating itself. SEO content and clickbait were already plagiarizing and consuming human communication, polluting the web by crowding out actual information – ChatGPT and LLMs calcify and turbo-charge this. Tech companies are reacting by piling their own LLMs on top – ingesting garbage and generating yet more garbage. Soon enough, appending " reddit" to our search terms will not be enough to quickly and freely get human information from the web.
Meanwhile – laymen are being told that ChatGPT is an oracle, an intelligence, by companies and enthusiasts trying to build a crypto-style hype train. And the laymen are reacting accordingly. They are being told that ChatGPT knows everything. It doesn’t even know what a pineapple is.
Yeah, the direction commercial AI took is truly disheartening… Like, AI is a useful tool, but it’s been “business-ized” to the point where everyone puts AI in places where it shouldn’t be. Mostly because people don’t understand what they are doing, so surely an AI model will…
The other day a dude wanted to dev an app with me about some random shit with an AI, except it could all be done with standard algorithms, and would probably perform much better too.
I looked at him and almost facepalmed on the spot…
Yeah, like that one in this thread who uses an LLM in Python to perform trivial tasks. They write a function with an LLM prompt as the body and then it gets executed by the LLM at runtime. Python was apparently not inefficient enough.
I don’t know, if I need a trivial function I just code it. Then I know it works and performs in a decent time.
I mean, that use case is definitely cool, but there is no way this should ever be used in production code.
It’s kinda like using dynamite as a novel heating system.
I think it’s a combination of the inability to link to sources (as you have stated), the confidence with which it may provide incorrect information, and a lack of proper understanding from many people as to how LLMs work and exactly how incorrect they can be at times.
Sure, people can lie on the internet, but a chat bot talking to me and lying? Shouldn’t computers not be able to do that? (/s of course)
The issue is that LLMs are fundamentally not able to not know something. Non-LLM filters that are strapped in front of an LLM can catch stuff like that (“As an LLM I am not able to…”), but if the request makes it through the filter, the LLM is not able to say “Sorry, I don’t know that”, because the data set doesn’t contain that.
For example, there aren’t a lot of API documentations that contain a “Sorry, I don’t know how this endpoint works”.
Strongly agreed. I view this as the biggest issue with LLMs. They will hallucinate a confidently incorrect answer for those cases. It makes them misinformation machines.
Getting reliable information out of an LLM is almost impossible.
The hallucinations look so real, that to spot them, you need to already know the correct answer. And if you know the correct answer, why do you need to ask the question?
And if you don’t know the answer, you can never know whether the answer is actually based in reality at all or a pure fabrication.
It’s not the inability to link sources, it’s the wholesale manufacture of them. It’s a language model, not a search engine. It doesn’t get its information from somewhere. It generates it probabilistically based on the structure of the sentence it’s forming.
It’ll include sources if the sentence structure suggests they should be there, but they’ll also just be built by probabilistic insertion of words.
“It’ll include sources if the sentence structure suggests they should be there, but they’ll also just be built by probabilistic insertion of words.”
I’ve seen people attempt to train an LLM on information with sources. The end result was a model that would still hallucinate false information, and follow it up with a convincing-looking source that doesn’t actually exist or a link that just leads to a 404 page. The way current LLMs work makes it impossible for them to mention accurate sources by default, as they don’t remember full sentences or even any actual information, but just pick up some underlying patterns.
Currently the best you can do is let an LLM come up with search engine queries to find relevant and up-to-date information for a certain question, and then make it formulate an answer based on what it found, including links to the page(s) it used. The main problem here is that LLMs are not great yet at verifying whether a source is accurate, and most people will just take anything that mentions a source as hard fact without even looking at what the source is.
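The search-then-answer flow described above can be sketched roughly like this. Everything here is an assumption for illustration: `answer_with_sources`, the `Page` type, the prompt wording, and the `fake_llm` / `fake_search` stand-ins (with an example.com placeholder URL) are made up so the sketch runs offline. The one design point it tries to show is that the links are attached by ordinary code from the search results, so the model never gets the chance to invent a URL.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Page:
    url: str   # where the excerpt came from
    text: str  # excerpt returned by the search backend


def answer_with_sources(
    question: str,
    llm: Callable[[str], str],
    search: Callable[[str], List[Page]],
    max_pages: int = 3,
) -> str:
    # Step 1: let the model turn the question into a search query.
    query = llm(f"Write one search engine query for this question: {question}").strip()
    # Step 2: fetch real pages; the URLs come from the search backend, not the model.
    pages = search(query)[:max_pages]
    if not pages:
        return "No sources found, so no answer is given."
    # Step 3: ask the model to answer from the retrieved excerpts only.
    excerpts = "\n\n".join(f"[{i}] {p.text}" for i, p in enumerate(pages, start=1))
    draft = llm(
        "Answer the question using only the numbered excerpts below.\n\n"
        f"{excerpts}\n\nQuestion: {question}"
    ).strip()
    # Step 4: append the links in code so they cannot be invented by the model.
    links = "\n".join(f"[{i}] {p.url}" for i, p in enumerate(pages, start=1))
    return f"{draft}\n\nSources:\n{links}"


# Offline stand-ins so the sketch runs without any network or model access.
def fake_llm(prompt: str) -> str:
    if prompt.startswith("Write one search engine query"):
        return "pentium d 840 tdp"
    return "The Pentium D 840 has a TDP of 130W [1]."


def fake_search(query: str) -> List[Page]:
    return [Page(url="https://example.com/pentium-d-840", text="Pentium D 840, TDP: 130W")]


print(answer_with_sources("What is the TDP of the Pentium D 840?", fake_llm, fake_search))
```

Note what this does not solve: nothing checks that the retrieved page is accurate, or that the drafted answer actually reflects it. It only guarantees that the listed links exist in the search results.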
It’s like a fancy interface for Google’s “I’m feeling lucky” button.
I think the difference is Google is just linking you to this content. On the other hand ChatGPT is pretty confidently telling people all these things. Add the fact that a lot of people consider ChatGPT to be some sort of all-knowing entity and it’s a recipe for disaster.
There’s something that worries me about GPT-like technologies, and I see very few people talking about it: GPT-based social media bots.
It can enable people and groups to create much more advanced mass manipulation strategies. Imagine a lot of GPT accounts on all sites creating comments arguing for or against something every time it’s mentioned, in very natural language that can fool most people.
It worries me a lot, and I’m sure it will be done at some point. If recent elections around the world were a mess due to a lot of social media manipulation and fake news campaigns, now imagine that powered by gpt.
I was gonna reply to this in the style of ChatGPT, but I somehow feel like that’d be the same as joking about having a bomb at airport security. But yeah, this is my main concern as well. Not only social media, but even blogs and reputable-looking websites which can act as “sources”. And what about Wikipedia bots?
I’m not worried about the loss of jobs or the sentience of computers, but rather the incapability to discern what’s real and what’s not. Could online human certificates be a thing? Multi-factor authentication (that is somehow still anonymous)?
I have a hard time imagining a system that can simultaneously identify someone as uniquely human while still maintaining anonymity. Any given website or person online might not know your name, but you would have to have some sort of public key that would identify you. That key would be a fingerprint that could tie all your online activity together for anyone interested.
I don’t know. Social media bots have been doing exactly that quite well for a long time. Turns out, you don’t actually have to write a comment, you just need to find another one that talks about the same key words and copy it in.
You still get great natural language (since it is natural language) and it fools most people as well.
Political talking points aren’t that varied. There are a handful of different takes on each topic and people repeat them already, so just copying them doesn’t make much of a difference.
It’s not the same. GPT-based bots add much more to the situation.
Current bots are easily identifiable, and can just be banned when spotted, but GPT bots can interact in a way that makes it more difficult to spot them. They can be programmed to present different personalities and tastes, commenting in several places, and even chit-chatting here and there. Then they will do their propaganda, considering the context, arguing and replying to counterarguments.
It’s a much more complex structure, and much harder to identify. Today, gpt produces text following some patterns, but that’s something that can be improved.
All we can really hope for is effective AI-driven detection methods for AI generated content. Here’s hoping that AIs are good at spotting one another.
That’s not a workable solution. Since Meta’s model (LLaMA) was leaked, there has been such rapid advancement on the open-source side of LLMs that the tech has diverged too far to ever be detectable.
You can now spin up a custom, targeted LLM in a few hours on low-power consumer hardware. And it beats the massive incumbents within the narrower scope of the training.
Think of a Facebook comment bot, targeted specifically to sound like pro-[VIEW] comments, complete with typos and Internet slang. Or a high school essay bot, trained exclusively on 5-paragraph essays.
The detection tech right now gives a very high false positive rate, and there are also AI tools that rewrite text to avoid detection by the existing detection tools.
This exactly. Only I am quite certain it’s already being used this way, on a much wider scale than we have any way to measure.
The problem with chatGPT is that it allows for automation of content creation.
Imagine a single guy using ChatGPT to control thousands of social media bots that answer in a human-like way and are able to follow conversations and context, but that all defend the same point of view.
Or imagine a single guy controlling thousands of “local news blogs” that have a constant stream of fresh AI-generated content (both articles and comments), once again all pushing the same narrative.
That is the main problem with things like chatGPT, if not controlled - they allow anyone to create their own “troll farm”.
But that was possible before. Just that these bots would just copy content, add random words and run it through Google Translate, translating it to a different language and back again. That already did the trick for the last 10 or so years.
Those bots wouldn’t pass the Turing test, that’s for sure. Pure spam like you’re describing is one thing; arguing with an AI (and losing) without even being aware that it’s an AI on the other side is another.
A friend of mine uses it to re-type emails to sound more professional. He even got a couple of others to start doing it at his workplace. A few people have started to notice one particular employee has suddenly completely changed how he talks in emails. It’s very amusing, but it works extremely well for my friend.
He even pays the $20 USD/month for the “premium” or whatever version. He’s in the C-suite at the company, so it’s nothing to him to pay for the service. Other than instances like that, or simple coding (hey, I need a quick BS landing page, or I need this added to whatever), it’s pretty overblown for how people seem to think it works.
I have to copy a lot of text from a pdf but the returns are inserted in weird places.
I used to do this whole workaround in word.
Now I’m just like, “chatgpt, can you remove all the extra returns from the text I send you?” “Sure no problem”
It takes me like 5 minutes instead of 20 per document.
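For what it’s worth, this particular cleanup can also be done deterministically with a few lines of Python. This is only a sketch under some assumptions: it treats a blank line as a real paragraph break and every other line break as a stray PDF return, and the name `unwrap_pdf_text` is made up here.

```python
import re


def unwrap_pdf_text(text: str) -> str:
    """Join lines that a PDF copy broke apart, keeping paragraph breaks."""
    # Normalize Windows and old Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: a blank line marks a genuine paragraph break.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = []
    for para in paragraphs:
        # Rejoin words hyphenated across a line break ("docu-\nment" -> "document").
        # Caveat: this also merges real compounds that happen to wrap at the hyphen.
        para = re.sub(r"(\w)-\n(\w)", r"\1\2", para)
        # Turn remaining line breaks and runs of whitespace into single spaces.
        para = re.sub(r"\s+", " ", para).strip()
        if para:
            cleaned.append(para)
    return "\n\n".join(cleaned)


sample = "This is a docu-\nment with\nodd breaks.\n\nSecond\nparagraph."
print(unwrap_pdf_text(sample))
```

Unlike the chatbot route, this never rewords anything, which matters if the copied text has to stay verbatim.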
While the inability to source is a huge problem, you also have to keep in mind that complaining about AI has other objectives beyond the obvious “AI bad”.
- it’s marketing: “Our thing is so powerful it could irreparably change someone’s life” is still advertising even if that irreparable change is bad. Saying “AI so powerful it’s dangerous” just sounds less advertis-y than “AI so powerful you cannot not invest in it” despite both leading to similar conclusions (you can look back at the “fearvertising” done during the original AI boom: same paint, different color)
- it’s begging for regulatory barriers to be put into place: Anyone with a couple of million can build an LLM from scratch. That might sound like a lot, but it’s only getting cheaper and it doesn’t need highly intricate systems to replicate. Specifically, the ability to finetune a large model with few datapoints allows even open-source non-profits like OpenAssistant to compete against the likes of Google and OpenAI: Google has made that very explicit in their leaked “We have no moat” memo. This is why you see people like Sam Altman talking to Congress about the dangers of AI: he has no serious competitive advantage and hopes that with sufficient fear-mongering he can get the government to give him one.
Complaining about AI is as much about the AI as it is about the economic incentives behind AI.
Previously, making misinformation, propaganda, spam, etc., even with Google’s help, was still a manual activity bound by human limitations. Now you can have a fully autonomous scam bot that will scale to infinity relatively cheaply.
Scam call centers right now are hugely successful, unfortunately, and they’re limited by the human beings manning the phones. Imagine a fleet of GPT agents scamming old ladies out of their life savings at record efficiency.
I feel like it’s the occasional unpredictability that people are scared of. Whether it’s people being unable to tell if something was created by ChatGPT, if it’s pulling false sources, or people finding ways around set limitations and filters.
I used it to help a friend with a cover letter for a job. I pasted in what my friend had written and asked if it could make it sound better. It literally just made up stuff to make it sound better.
Each step drastically lowers the barriers to get to that end, and also distorts the monetary incentives for people operating that technology to make them more likely to deceive you.
It used to be you have to go to a bookstore/library to read some crack theory on the ailment of your choice. In order to get your crack theories published, you had gatekeepers, publishers, bookstore owners, and then librarians, who would choose what books to stock. Pretty hard to abuse your powers to deceive people.
Then it became easier because you can do it on a desktop with google. Then it became easier because you can just ask google on your phone. Now if you can get a solid SEO page, you’re not gatekept by anyone.
Now it’s easier since you have an authoritative AI that tells you exactly how to do it, and in theory, you can freely develop these AIs to give answers that match your “version” of reality in order to get the most engagement and money out of you. Imagine you go to a website and some “doctor” chats you up that’s actually just a conversational AI, and it just persuades you via pseudo-scientific language that is targeted towards your personal preferences just to get you to buy their snake oil.
I dunno.
Every history book I’ve ever read seems to indicate that our species tends toward using new technology for the worst purposes.
Sure, but many purposes that people are afraid that ChatGPT could be used for are already 1:1 possible with Google.
Reminds me of the 3D printed gun discussion. Sure, you can use a 3D printer to print a super crappy gun that can’t build up enough pressure for the projectile to gain some actual lethal speed without exploding.
Or you could do what people have been doing for centuries and just get a pipe and some other hardware from the hardware store and build a gun manually. That’s literally how guns were made for centuries. (Except that you have to replace the big box hardware store with the correct equivalent of the time.)
eBay has rolled out some kind of language model plugin for their listing app. It will generate a short product description for you. I’ve been using ChatGPT to do this for a while, and also asking it to optimize titles within the 80-character limit.
I can’t always think of related words that other people might search for so this helps me a lot.
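Since a model will not reliably respect a hard character limit, it can help to check the suggested title in code before pasting it into a listing. A small sketch, assuming the 80-character limit mentioned above; `fit_title` and `MAX_TITLE_LEN` are names made up for this example.

```python
MAX_TITLE_LEN = 80  # title limit as stated in the comment above


def fit_title(title: str, limit: int = MAX_TITLE_LEN) -> str:
    """Trim a suggested title to the limit without cutting a word in half."""
    # Collapse repeated whitespace so spaces do not waste characters.
    title = " ".join(title.split())
    if len(title) <= limit:
        return title
    # Find the last space that still leaves the kept part within the limit.
    cut = title.rfind(" ", 0, limit + 1)
    if cut == -1:
        # A single word longer than the limit: fall back to a hard cut.
        return title[:limit]
    return title[:cut]


suggestion = (
    "Intel Pentium D 840 3.4 GHz Dual Core Desktop CPU Processor "
    "LGA775 Socket Tested Working Vintage Retro Computing"
)
title = fit_title(suggestion)
print(len(title), title)
```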
I guess, LLM generated hallucinations can’t be worse than the current atrocities that product pages are.
The only downside of the LLM version is that they tend to be much longer with much more nonsense that you have to parse to get to the real content.
Kinda like Aliexpress product pages. “Superior Ultrafast Powersaving Hightec Most Modern Best Beautiful CPU”.
Hopefully they don’t combine it with an image generator that then also generates nonsense product pictures.
From what I can tell it simply uses your current item title and item specifics that you’ve entered.
I actually have dall-e images for my store categories specifically because they look surreal as hell. It fits with my branding though.
eBay is better than AliExpress or Amazon with the SEO titles because it limits you to 80 characters.
The issue with their descriptions is that they haven’t changed the description area in 20 years and you have sellers embedding a whole website from 1998 in there. That, and it’s impossible to find the product description in the app, so most buyers don’t read it.
I gave ChatGPT the prompt “write an seo optimized product description for a pentium D processor”
Introducing the Pentium D Processor: Unleash Powerhouse Performance
Upgrade your computing experience with the Pentium D Processor, a true powerhouse that delivers exceptional performance and efficiency. Engineered with advanced technology and optimized for speed, this dual-core processor is perfect for demanding tasks, multimedia applications, and seamless multitasking. With its blazing clock speed and enhanced multitasking capabilities, you can effortlessly tackle complex applications, intensive gaming, and multimedia editing tasks. Experience unrivaled speed and responsiveness that allows you to breeze through your workload with ease.
Not only does the Pentium D Processor excel in performance, but it also offers uncompromised efficiency. Built on the latest architecture, this processor intelligently balances power consumption and performance, ensuring that you get the most out of every watt. Enjoy an optimized computing experience without sacrificing power or utility bills. With seamless multitasking, enhanced multimedia performance, and wide compatibility, the Pentium D Processor is a reliable choice for users who demand both speed and efficiency. Upgrade today and unlock the true power of efficient computing.
That might be good for SEO, but for a buyer this is utter garbage.
The content is mostly not true at all. A Pentium D might have been optimized for speed and a true powerhouse at launch, but launch was 2005. Even the cheapest CPU introduced in 2022 (Intel Celeron G6900) has >6x the performance of the Pentium D.
And even when the Pentium D was released, its massive power consumption of 95W-130W TDP was criticized, so all the talk about balancing power consumption would have been wrong even in 2005.
What I’d like to see as a buyer would be the following product description:
- Intel Pentium D 840, 3.4 GHz
- Introduction date: May 2005
- Dual core
- CPU Mark score: 736
- TDP: 130W
- Compatible with Windows XP - Windows 10
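A description like the list above does not need a language model at all; it can be rendered from structured fields with a plain template. A minimal sketch: the `CpuListing` class and its field names are made up for illustration, and the values are the ones from the list above.

```python
from dataclasses import dataclass


@dataclass
class CpuListing:
    model: str
    clock_ghz: float
    introduced: str
    cores: int
    cpu_mark: int
    tdp_watts: int
    os_support: str

    def describe(self) -> str:
        """Render the facts as a short bullet list, nothing else."""
        lines = [
            f"- {self.model}, {self.clock_ghz} GHz",
            f"- Introduction date: {self.introduced}",
            f"- Cores: {self.cores}",
            f"- CPU Mark score: {self.cpu_mark}",
            f"- TDP: {self.tdp_watts}W",
            f"- Compatible with {self.os_support}",
        ]
        return "\n".join(lines)


listing = CpuListing(
    model="Intel Pentium D 840",
    clock_ghz=3.4,
    introduced="May 2005",
    cores=2,
    cpu_mark=736,
    tdp_watts=130,
    os_support="Windows XP - Windows 10",
)
print(listing.describe())
```

Every line in the output traces back to a field the seller entered, so there is nothing for a generator to make up.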
Instead, ChatGPT waffles on for two full, dense paragraphs of meaningless marketing nonsense.
Speaking from a 2023 perspective, there is actually not a single statement in these two paragraphs that is true. Not even the product name is correct (for a marketing text it should include the name Intel and the correct model number, as there were 15 variants with a ~35% performance difference between the cheapest and the best).
“most buyers don’t read it”
Because it’s hard to find any valuable or even correct information on these product pages…