• jacksilver@lemmy.world
    2 days ago

    My point was that a Mixture of Experts (MoE) model could suffer from generalization issues. Although, reading more, I'm not sure if it's the newer R model that had the MoE element.
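    For context, a rough toy sketch of the MoE routing idea I mean (illustrative numpy example, not any specific model's actual implementation): a gating network picks the top-k experts per token, so only a sparse slice of the parameters is active for any given input, which is roughly where the generalization concern comes from.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MoE layer: a gating network scores experts and only the top-k fire.
d_model, n_experts, top_k = 16, 4, 2

gate_w = rng.normal(size=(d_model, n_experts))                 # gating weights
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x):
    # x: (d_model,) single token representation
    logits = x @ gate_w                                        # score each expert
    top = np.argsort(logits)[-top_k:]                          # keep only top-k experts
    probs = np.exp(logits[top]) / np.exp(logits[top]).sum()    # renormalize their weights
    # Output is a weighted sum over just the selected experts.
    return sum(p * (x @ experts[i]) for p, i in zip(probs, top))

y = moe_forward(rng.normal(size=d_model))
print(y.shape)  # (16,)
```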