Dedicated ASICs are where all the hotness lies. The flexibility of FPGAs doesn’t seem to overcome their overhead for most users. Not sure if that will change once custom ASICs become too expensive again and all the magic money furnaces run out of bills to burn.
ASICs are single purpose, with the benefit of potential power-efficiency improvements. Not at all useful for something like running neural networks, especially not when they’re being retrained and updated.
FPGAs are fully (re)programmable. There’s a reason why datacenters don’t lease ASIC instances.
Um. lol What? You may want to do your research here, because you’re so far off base I don’t think you’re even playing the right game.
Ok, so you should just go ahead and tell all the ASIC companies then.
https://www.allaboutcircuits.com/news/intel-and-google-collaborate-on-computing-asic-data-centers/
https://www.datacenterfrontier.com/servers/article/33005340/closer-look-metas-custom-asic-for-ai-computing
https://ieeexplore.ieee.org/document/7551392
Seriously. You realize that the most successful TPUs in the industry are ASICs, right? And that all the “AI” components in your phone are too? What are you even talking about here?
TPUs are specific to individual model frameworks, and engineers avoid using them for that reason. The most successful adoptions so far are vendor-locked-in NN models a la Amazon (Trainium) and Google (Coral), and neither has wide adoption since they have limited scopes. GPUs being flexible in this arena is exactly why companies like OpenAI are struggling to justify the cost of using them over TPUs: GPUs are easy to get running up front, but the cost is insane, and TPUs are even more expensive in most cases. TPUs are also inflexible should you need to do something like multi-model inference (detection + evaluation + result, etc.).
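To put the multi-model point in concrete terms, here’s a rough sketch in plain Python/NumPy (the two models are dummy stand-ins, so every function name here is hypothetical) of the kind of detection → evaluation → result chaining that’s trivial on a general-purpose accelerator but awkward on a fixed-function, single-graph part:

    import numpy as np

    def detector(frame):
        # Model 1 (hypothetical): return bounding boxes for regions of interest.
        h, w = frame.shape[:2]
        return [(0, 0, w // 2, h // 2)]

    def classifier(crop):
        # Model 2 (hypothetical): score a detected region.
        return float(crop.mean())

    def pipeline(frame):
        # Detection -> evaluation -> result. Each stage can be a different
        # model (even a different framework) when the accelerator is general
        # purpose; a single-graph accelerator makes this chaining awkward.
        results = []
        for (x, y, w, h) in detector(frame):
            crop = frame[y:y + h, x:x + w]
            results.append(((x, y, w, h), classifier(crop)))
        return results

    print(pipeline(np.zeros((480, 640, 3), dtype=np.uint8)))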
As I said, ASICs are single purpose, so you’re stuck running a limited model engine (TensorFlow) and instruction set. They also take a lot of engineering effort to design, so unless you’re going all-in on a specific engine and think you’re going to be good for years, it’s short-sighted to do so. If you read up, you’ll see the most commonly deployed edge boards in the world are… Jetsons.
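For a sense of what that lock-in looks like in practice, here’s a minimal sketch of Edge TPU (Coral) inference: the model has to be a quantized TFLite graph pre-compiled with Google’s edgetpu_compiler, and the runtime is the TFLite interpreter plus the Edge TPU delegate. The model filename below is hypothetical.

    import numpy as np
    import tflite_runtime.interpreter as tflite

    # Only a quantized .tflite model that has been run through
    # edgetpu_compiler will actually execute on the Edge TPU.
    interpreter = tflite.Interpreter(
        model_path="mobilenet_v2_quant_edgetpu.tflite",  # hypothetical file
        experimental_delegates=[tflite.load_delegate("libedgetpu.so.1")],
    )
    interpreter.allocate_tensors()

    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # Edge TPU models take quantized tensors; shape/dtype are model-specific.
    frame = np.zeros(inp["shape"], dtype=inp["dtype"])
    interpreter.set_tensor(inp["index"], frame)
    interpreter.invoke()
    scores = interpreter.get_tensor(out["index"])

Anything the compiler can’t map onto the Edge TPU’s op set falls back to the CPU, which is the “limited instruction set” problem in a nutshell.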
Enter FPGAs.
FPGAs show speedups for things like transcoding and inference in the 2x–5x range for specific workloads, and much higher for ML purposes and in-memory datasets (think Apache Ignite + Arrow workloads), all at a massive reduction in power and cooling, so they’re obviously very attractive for datacenters to put into production. The newer slew of chips is even reprogrammable “on the fly,” meaning a simple context switch and reflash can take milliseconds, and multi-purpose workloads can exist in a single application, where this was problematic before.
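As a rough illustration of that on-the-fly reprogramming (a sketch assuming a PYNQ-capable Zynq board and two prebuilt bitstreams whose names are made up here), switching workloads is just loading a different overlay:

    import time
    from pynq import Overlay

    # Load the first accelerator design; the constructor downloads the
    # bitstream to the fabric.
    t0 = time.time()
    transcode = Overlay("transcode_accel.bit")   # hypothetical bitstream
    print(f"transcode overlay loaded in {time.time() - t0:.3f}s")

    # ... run the transcoding workload against this design's IP blocks ...

    # Context switch: reflash the fabric with a different design.
    t0 = time.time()
    inference = Overlay("inference_accel.bit")   # hypothetical bitstream
    print(f"inference overlay loaded in {time.time() - t0:.3f}s")

How you then talk to the IP inside each overlay (MMIO registers, DMA) is design-specific, so that part is left out.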
So unless you’ve got some articles about the most prescient AI companies currently using GPUs and moving to ASICs, the field is wide open for FPGAs, and datacenter adoption of them says it’s the path forward unless Nvidia starts kicking out more efficient devices.
Now ask OpenAI to type out for you what the drawbacks of FPGAs are. Also, the newest slew of chips uses partially charged NAND gates instead of FPGAs.
Almost all the ASICs in use right now implement the basic math functions, activations, etc., and the higher-level work happens in more generalized silicon. You cannot get the transistor densities necessary for modern accelerator work in an FPGA.
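To illustrate the division of labor being described (a rough sketch in plain NumPy with made-up shapes, not any particular chip’s programming model): the fixed-function block handles only the fused matmul + activation, while layer sequencing, control flow, and post-processing stay on general-purpose silicon.

    import numpy as np

    def accel_matmul_relu(x, w, b):
        # Stand-in for what a fixed-function block typically does:
        # a fused matrix multiply + bias + activation, and nothing else.
        return np.maximum(x @ w + b, 0.0)

    def run_model(x, layers):
        # Everything else -- layer sequencing, control flow, data movement,
        # post-processing -- runs on general-purpose silicon (the CPU here).
        for w, b in layers:
            x = accel_matmul_relu(x, w, b)   # the "offloaded" kernel
        return x.argmax(axis=-1)             # host-side post-processing

    rng = np.random.default_rng(0)
    layers = [(rng.standard_normal((64, 64)), np.zeros(64)) for _ in range(3)]
    print(run_model(rng.standard_normal((1, 64)), layers))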