Auf YouTube findest du die angesagtesten Videos und Tracks. Außerdem kannst du eigene Inhalte hochladen und mit Freunden oder gleich der ganzen Welt teilen.
Most of the largest datasets are kind of garbage because of this. I’ve had this idea to run the data through the network every epoch and evict samples that are too similar to the output for the next epoch but never tried it. Probably someone smarter than me already tried that and it didn’t work. I just feel like there’s some mathematical way around this we aren’t seeing. Humans are great at filtering the cruft so there must be some indicators there.
Most of the largest datasets are kind of garbage because of this. I’ve had this idea to run the data through the network every epoch and evict samples that are too similar to the output for the next epoch but never tried it. Probably someone smarter than me already tried that and it didn’t work. I just feel like there’s some mathematical way around this we aren’t seeing. Humans are great at filtering the cruft so there must be some indicators there.