

Ah, a lot of good info! Thanks, I’ll look into all of that!


Appreciate all the info! I did find this calculator the other day, and it’s pretty clear the RTX 4060 in my server isn’t going to do much, though its NVMe drive may help with offloading.
https://apxml.com/tools/vram-calculator
I’m also not sure anything under 10 tokens per second will be usable, though I’ve never really tried it.
I’d be hesitant to buy something just for AI that doesn’t also have RT cores, because I do a lot of Blender rendering. RDNA 5 is supposed to have more competitive RT cores along with NPU cores, so I guess my ideal would be an SoC with a ton of RAM. Maybe by the time RDNA 5 releases, the RAM situation will have blown over and we’ll have much better options for AMD SoCs with strong compute capabilities that aren’t just one-trick ponies for rasterization or AI.
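For reference, here’s roughly the kind of arithmetic I assume those VRAM calculators do under the hood: weights plus KV cache plus a bit of overhead. Every number in this sketch is a made-up illustration, and real models that use grouped-query attention need a smaller KV cache than this naive formula suggests:

```python
# Back-of-envelope VRAM estimate for running an LLM: weights + KV cache + overhead.
# All figures are illustrative assumptions, not measurements of any specific model.

def estimate_vram_gb(params_billions, bits_per_weight, n_layers, hidden_dim,
                     context_len, kv_bytes_per_elem=2, overhead_gb=1.5):
    weights_gb = params_billions * 1e9 * bits_per_weight / 8 / 1e9
    # KV cache: 2 tensors (K and V) x layers x hidden_dim x context length x bytes,
    # assuming full multi-head attention (no grouped-query savings).
    kv_gb = 2 * n_layers * hidden_dim * context_len * kv_bytes_per_elem / 1e9
    return weights_gb + kv_gb + overhead_gb

# Hypothetical 70B-parameter model, 4-bit quantized, 8k context:
print(f"{estimate_vram_gb(70, 4, 80, 8192, 8192):.0f} GB")  # ~58 GB
# The 4-bit weights alone are ~35 GB, which is why an 8 GB RTX 4060 has to offload.
```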


I’ve been looking into self-hosting LLMs, and it seems a $10k GPU is kind of a requirement to run a decently sized model at a reasonable tokens/s rate. There’s CPU and SSD offloading, but I’d imagine it would be frustratingly slow to use. I even find cloud-based AI like GH Copilot rather annoyingly slow. Even so, GH Copilot is only about $20 a month per user, and I’d be curious what the actual cost per user is once you factor in hardware and electricity.
What we have now is clearly an experimental first generation of the tech, but the industry is building out data centers as though it’s always going to require massive GPUs/NPUs with wicked quantities of VRAM to run these things. If it really does require huge data centers full of expensive hardware, where each user prompt takes minutes of compute time on a $10k GPU, then it can’t possibly be profitable to charge a nominal monthly fee for this tech, but maybe there are optimizations I’m unaware of.
Even so, if the tech does evolve and it becomes a lot cheaper to host these things, will all these new data centers still be needed? On the other hand, if the hardware requirements don’t decrease by an order of magnitude, it won’t be cost-effective to offer LLMs as a service, in which case I don’t imagine the new data centers will be needed either.
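Just to put rough numbers on that, here’s the napkin math I have in mind; every figure below is a guess on my part, not real provider data:

```python
# Napkin math: what does one GPU-hour of LLM serving cost, and what does that
# work out to per million tokens? All numbers are guesses for illustration only.

gpu_price_usd = 30_000            # assumed price of a data-center GPU, amortized
gpu_lifetime_years = 4
power_kw = 0.7                    # assumed GPU + share of server/cooling draw
electricity_usd_per_kwh = 0.10
batched_tokens_per_second = 1500  # assumed aggregate throughput across many users

hourly_capex = gpu_price_usd / (gpu_lifetime_years * 365 * 24)
hourly_power_cost = power_kw * electricity_usd_per_kwh
cost_per_gpu_hour = hourly_capex + hourly_power_cost

tokens_per_hour = batched_tokens_per_second * 3600
cost_per_million_tokens = cost_per_gpu_hour / tokens_per_hour * 1e6

print(f"~${cost_per_gpu_hour:.2f}/GPU-hour, ~${cost_per_million_tokens:.2f}/1M tokens")
```

The whole thing seems to hinge on that batching number: if one GPU really can serve many users concurrently, a $20/month fee might pencil out; if each prompt monopolizes a $10k+ card for minutes, it can’t.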


That’s a great point. For any aspect of big tech that isn’t corrupted or enshittified, some unsung heroes probably fought hard to make it that way.


As workers of conscience, we …
Here I was thinking everyone with a conscience quit long ago or refused to work for surveillance capitalists in the first place.


Hopefully the intention is to contribute and help make it into something that can serve large organizations reliably.


I have a router with OpenWRT, which has great firewall capabilities that I use to block specific devices from having internet access while still allowing them to connect to my local network. It’s a useful solution for any device you want connected to local services but don’t trust not to phone home.
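For anyone curious, such a rule in /etc/config/firewall looks roughly like this; the rule name and MAC address are just placeholders for the device you want to cut off:

```
# Reject WAN-bound traffic from one device, leaving its LAN access untouched.
config rule
	option name 'Block-Untrusted-Device'
	option src 'lan'
	# MAC address of the untrusted device (placeholder)
	option src_mac 'AA:BB:CC:DD:EE:FF'
	option dest 'wan'
	option target 'REJECT'
```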


Well, at least now there’s an LLM that can hallucinate based on the contents of all those books.


I think you have a meme: https://imgflip.com/memegenerator/Drake-Hotline-Bling


Yeah, it sure has been a while: maybe 15 or even close to 20 years! You’re right, I think I’d really enjoy replaying it on my Steam Deck, so I’ll take your advice. It’s amazing how well 20-year-old classics like that still hold up.


I’ve been waiting since 2013 for a new Splinter Cell, but I won’t be willing to boot Windows or run anything with malware (a.k.a. DRM, spyware, kernel rootkits), so I suppose it’s probably a lost cause altogether.
Although there are usually 3rd party “security experts” who provide alternative distributions free of such malware.


Next up, a mass influx of refurbished GPUs that have been subjected to ionizing radiation.


It’s somehow satisfying to get a brand new machine with Windows pre-installed and never let Windows boot even once. 😎


I actually started watching a video from this channel a while back and was disappointed to learn it wasn’t him. I suppose some people don’t mind and find the videos interesting, but I personally find it disturbing, as though they’re re-animating his corpse or something. 😆


Richard Freymann explaining light
Sounds like it’s not actually Feynman, it’s AI.
This isn’t his voice — it’s our tribute to his teaching style, created purely for education and inspiration. No impersonation intended, just deep respect for one of history’s greatest teachers. 🙏
All content is created to inspire, educate, and encourage reflection. This channel follows YouTube’s monetization policies, including clear labeling of synthetic media.


Yeah, LLMs do a decent job explaining what code does.


Nice, though $3k is still getting pretty pricey. I see mini PCs with an AMD Ryzen AI Max+ 395 and 96GB of RAM can be had for $2k, or even $1k with less RAM: https://www.gmktec.com/products/amd-ryzen™-ai-max-395-evo-x2-ai-mini-pc?variant=f6803a96-b3c4-40e1-a0d2-2cf2f4e193ff
I’m looking for something that also does path tracing well if I’m going to drop that kind of coin. It sounds like this chip can be on par with a 4070 for rasterization, but it only gets a Blender benchmark score of 495 compared to 3110 for even an RTX 4060. RDNA 5 with proper RT cores should drastically change the picture for chips like this, though.