“Checking in on Scott’s composition image bet with imagen 3” by Dave Orr

2024/12/22
LessWrong (30+ Karma)

2.5 years ago, Scott Alexander bet that by June 2025 image generation would have more or less solved compositionality. The bet was operationalized through 5 prompts, of which a model must get at least 3 correct. There was a premature declaration of victory, but if the bet has since been settled, I haven't heard about it. It's time. Google's Imagen 3 gets 4/5. The bet specifies 10 shots per prompt, but I'm just going to show the four images it generates, since that's plenty.

  1. A stained glass picture of a woman in a library with a raven on her shoulder with a key in its mouth. This is the only one that Imagen doesn't get; it makes multiple mistakes in the composition. It's a bit ironic that this is the one it missed, given that the whole genesis of the bet was about designing stained glass.
  2. An oil painting of a man in [...]

The original text contained 5 images which were described by AI.


First published: December 22nd, 2024

Source: https://www.lesswrong.com/posts/8HSpbaAg8hvhiFDHB/checking-in-on-scott-s-composition-image-bet-with-imagen-3

---

Narrated by TYPE III AUDIO.


Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.