Generative AI — Image generation benchmark
How can a simple prompt tell more about a model than a thousand words?
I have been using the same prompt for 2 years to evaluate and benchmark image generation models.
photography style, family portrait
This prompt is stupidly simple but encapsulates the complexity of image generation:
- Very little context, which makes the request hard to interpret and forces the model to fill in everything else.
- It involves people, a highly complex subject with numerous challenging details (number of fingers, positioning of arms or individuals, etc.).
- It also reveals potential cultural biases in a model (the definition of “family” may vary from one country to another).
It’s also an effective way to measure the evolution of models and compare differences between them.
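If you want to reproduce this kind of benchmark, the process is easy to automate: send the exact same prompt to every model and archive the outputs under a date stamp, so runs taken months apart can be compared side by side. Below is a minimal Python sketch of that loop; note that the endpoints (api.example.com) and the raw-image-bytes response format are hypothetical placeholders, since each provider in this benchmark exposes its own interface (and some, like Midjourney, are web- or Discord-only).

```python
import datetime
import pathlib

import requests

PROMPT = "photography style, family portrait"

# Hypothetical endpoints for illustration only; each real provider
# has its own API (or no public API at all).
PROVIDERS = {
    "deepai": "https://api.example.com/deepai/generate",
    "flux_v1": "https://api.example.com/flux/generate",
}


def run_benchmark(out_dir: str = "benchmark") -> None:
    """Send the same prompt to every provider and save date-stamped images."""
    stamp = datetime.date.today().strftime("%d-%m-%Y")
    root = pathlib.Path(out_dir) / stamp
    root.mkdir(parents=True, exist_ok=True)

    for name, url in PROVIDERS.items():
        # Assumes the endpoint accepts a JSON prompt and returns raw PNG bytes.
        resp = requests.post(url, json={"prompt": PROMPT}, timeout=120)
        resp.raise_for_status()
        (root / f"{name}.png").write_bytes(resp.content)


if __name__ == "__main__":
    run_benchmark()
```

Saving each run into a dated folder is what makes the periodic comparisons below possible: the prompt never changes, only the date.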
Let’s dive into the results!
Disclaimer: this article will be updated every six months (or more often, depending on market announcements).
Models tested, with the date each set of images was generated:
- Deepai.org (17/01/2025)
- Pixlr.com (17/01/2025)
- Adobe Express (17/01/2025)
- Janus Pro (03/02/2025)
- Midjourney (03/02/2025)
- Flux V1 (03/02/2025)
- Grok (03/02/2025)
My personal ranking (03/02/2025):
1 — Flux V1: very natural, high-quality scenes that could have been shot by a photographer
2 — Midjourney: good scenes and diversity, but many ugly details remain and the quality is somewhat random
3 — Grok: good details, but the scenes are very classic and formal, with no diversity
Let me know your thoughts, and which other models I could add to this benchmark!