GPT-4's details are leaked.

manitcor@lemmy.intai.tech · 2 years ago

lanolinoil@lemmy.world · 2 years ago

The interesting part to me:

The missing dataset is a custom dataset of college textbooks collected by hand for as many courses as possible.

This is very easy to convert to a txt file and then, with self-instruct, into instruction form. This creates the “illusion” that GPT-4 “is smart” no matter who uses it.

Computer scientist? Sure! It can help you with your questions about P != NP. Philosophy major? It can totally talk to you about epistemology.

Don’t you see? It was trained on the textbooks. It is so obvious.

This could explain some (but not all) of the ‘magic’ I have seen with GPT-4 vs. GPT-3.

Even if you put a bunch of textbooks into Google, it still couldn’t help me build a video game engine.

Thread by @Yampeleg on Thread Reader App