• Arthur Besse@lemmy.ml
    3 years ago

    It appears that the captioning model on that website was trained on the MSCOCO dataset, which was sourced from Google and Bing image search, and also from Flickr.