| Model | Biology Acc | Biology Score | Engineering Acc | Engineering Score | General Acc | General Score | Overall Acc | Overall Score |
|---|---|---|---|---|---|---|---|---|
| Nano Banana Pro | 0.849 | 0.625 | 0.708 | 0.434 | 0.816 | 0.601 | 0.791 | 0.553 |
| Wan2.5 | 0.714 | 0.433 | 0.606 | 0.309 | 0.755 | 0.519 | 0.692 | 0.420 |
| GPT-4o | 0.704 | 0.425 | 0.556 | 0.258 | 0.718 | 0.463 | 0.660 | 0.382 |
| Nano Banana | 0.697 | 0.400 | 0.579 | 0.276 | 0.716 | 0.468 | 0.664 | 0.381 |
| Seedream | 0.680 | 0.393 | 0.560 | 0.260 | 0.688 | 0.442 | 0.642 | 0.365 |
| Imagen-3 | 0.600 | 0.288 | 0.492 | 0.195 | 0.638 | 0.377 | 0.577 | 0.287 |
| FLUX-dev | 0.592 | 0.286 | 0.444 | 0.167 | 0.616 | 0.359 | 0.551 | 0.270 |

If you find ProImage-Bench useful in your research, please consider citing our paper:
@article{ni2025proimage,
title={ProImage-Bench: Rubric-Based Evaluation for Professional Image Generation},
author={Ni, Minheng and Yang, Zhengyuan and Zhang, Yaowen and Li, Linjie and Lin, Chung-Ching and Lin, Kevin and Wang, Zhendong and Wang, Xiaofei and Liu, Shujie and Zhang, Lei and others},
journal={arXiv preprint arXiv:2512.12220},
year={2025}
}