You have activated the Falsifiability trap card - LLMs as tutors = lol

V0ldek@awful.systems · edit-2 11 months ago

pikasaurX4@lemm.ee · edit-2 11 months ago

This drives me up the wall. Any time I point this out, the AI fanboys are so quick to say “well, that’s v3.x. If you try on 4.x it’s actually much better.” Like, sure it is. These things are really good at sounding like they know what they’re talking about, but they will just lie. Especially any time numbers or math are involved. I’ve had a chat bot tell me things like 10+3=15. And like you pointed out, if you call it out, it always says “oh my bad” and then just lies some more or doubles down. It would be cool if they could be used to teach things, but when I’ve tried it for learning the rules to games, it just lies, fills in important numbers with other, similar numbers, and presents them as completely factual. So if I ever used it for something I truly didn’t know about, I wouldn’t be able to trust anything it said.

self@awful.systems · 11 months ago

just like with crypto, there’s already a long list of cliches that AI fanboys use to excuse how shitty their favorite technology is:

that’s because you’re using GPT-3.5 Turbo. if you just pay an exorbitant amount for early access to GPT-5, you’ll see it does so much better (please ignore all previous claims of GPT-3 being revolutionary)
the model doesn’t work as well as I think it used to, but I will still insist there’s no scaling problem or hidden human labor
you’re prompting it wrong
the LLM sucks because it’s being censored. please ignore that all of the uncensored models fucking suck too, when they’re not just ChatGPT with a spicy initial prompt
multi-modal LLMs will fix this. wait no, multi-agent LLMs. fuck it, I’ll just link a bunch of research papers that read like press releases and OpenAI blog posts that are press releases
making mistakes like a shitty computer program only makes the LLM more human-like, because my mental model for people is that they’re all shitty stupid computer programs that fuck up and lie all the time too

self@awful.systems · 11 months ago

speaking of chatgpt not knowing about games, please enjoy the classic that is chatgpt vs stockfish

froztbyte@awful.systems · 11 months ago

at multiple points I wanted to rewind that (garrrr, gifs) just to check on whether it did, in fact, just magically try to move a piece straight over another in an illegal move. amazing

Amoeba_Girl@awful.systems · edit-2 11 months ago

i think about this game all the time it’s so so good. the way the cheating escalates only for it to

spoiler

illegally move its king in front of the pawn and die

.

best game since murphy vs mr endon

swlabr@awful.systems · 11 months ago

that’s because you’re using GPT-3.5 Turbo. if you just pay an exorbitant amount for early access to GPT-5, you’ll see it does so much better (please ignore all previous claims of GPT-3 being revolutionary)

business card scene from american psycho but it’s LLM variants

froztbyte@awful.systems · 11 months ago

openai blog post that elaborates on the press releases documenting the prevalence of bad research papers as a result of openai products

zogwarg@awful.systems · edit-2 11 months ago

Let’s not forget the:

Ah! PotemkinTurd-4.0 is getting worse! Like it’s starting to make all the same mistakes that PotemkinTurd-3.0 used to make! Honestly Poirot-2 is just as good now.

Cue an answer from PotemCorp:

We haven’t changed anything since the release of 4.0, but thanks, we’ll look into possible causes.

Like yes, those are big spaghetti monsters of RLHF and sad attempts at content filtering and/or removal of liability by PotemCorp, but isn’t a much more rational explanation that the product was never that good to begin with, is fundamentally random, and that sometimes the shit sticks and sometimes it doesn’t?

"Not all AI content is spam, but I think right now all spam is AI content." - awful.systems