Meta and Microsoft say they will buy AMD's new AI chip as an alternative to Nvidia's

misk · 1 year ago


Aniki 🌱🌿@lemm.ee · 1 year ago

I would kill to run my models on my own AMD linux server.

dublet@lemmy.world · 1 year ago

Does GPT4all not allow that? Or do you have specific other models?

Aniki 🌱🌿@lemm.ee · edit-2 1 year ago

I haven’t super looked into it, but I’m not interested in playing the GPU game against the gamers, so if AMD can do a Tesla equivalent with gobs of RAM and no display hardware, I’d be all about it.

Right now it’s looking like I’m going to build a server with a pair of K80s off eBay for a hundred bucks, which will give me 48GB of RAM to run models in.

dublet@lemmy.world · 1 year ago

Some of the LLMs it ships with are very reasonably sized and still impressive. I can run them on a laptop with 32GB of RAM.

Aniki 🌱🌿@lemm.ee · 1 year ago

This is very interesting! Thanks for the link. I’ll dig into this when I manage to have some time.

tal@lemmy.today · edit-2 1 year ago

if AMD can do a Tesla equivalent with gobs of RAM and no display hardware I’d be all about it.

That segment of the market is less price-sensitive than gamers, which is why Nvidia is demanding the prices that they are for it.

An Nvidia H100 will give you 80GB of VRAM, but you’ll pay $30,000 for it.

AMD competing more strongly with Nvidia in this sector will improve pricing, but I doubt very much that it’s going to make compute cards cheaper than GPUs.

Besides, if you did wind up with compute cards being cheaper, you’d have gamers just rendering frames on compute cards and then using something else to push the image to the screen. I know that Linux can do that with PRIME, and I assume that Windows can as well. That’d cause their attempt to split the market by price to fail. Nah, they’re going to split things up by amount of VRAM on the card, not by whether there’s a video interface on it.

I suspect that a better option is to figure out ways to reasonably split up models to run on lower-VRAM GPUs in parallel.
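
To make that idea concrete, here is a rough sketch of one way the splitting could work: assign consecutive layers to cards in order, moving to the next card when the current one is full. The function name `split_layers` and all the sizes are made up for illustration; this is not how any particular framework does it, and it ignores activations, KV cache, and the cost of moving data between cards.

```python
def split_layers(layer_sizes_gb, device_vram_gb):
    """Assign consecutive layers to devices, in order, within each device's VRAM.

    Hypothetical helper for illustration only: it counts weights alone and
    ignores activation memory and inter-device transfer cost.
    """
    assignment = [[] for _ in device_vram_gb]
    device = 0
    used = 0.0
    for index, size in enumerate(layer_sizes_gb):
        # Move on to the next device when this layer no longer fits.
        while device < len(device_vram_gb) and used + size > device_vram_gb[device]:
            device += 1
            used = 0.0
        if device == len(device_vram_gb):
            raise ValueError(f"layer {index} does not fit on the remaining devices")
        assignment[device].append(index)
        used += size
    return assignment


# Made-up numbers: ten 4 GB layers spread over four 12 GB devices.
plan = split_layers([4] * 10, [12, 12, 12, 12])
for device_index, layers in enumerate(plan):
    print(f"device {device_index}: layers {layers}")
```

Each card then only holds its own slice of the weights, and the output of one slice gets handed to the next card. The four 12 GB devices are picked to roughly match how a pair of K80s would show up, since each K80 is two GPUs with 12 GB apiece.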