Image Evals Cookbook #2408

Merged
kathylau-oai merged 11 commits into main from image-evals
Feb 3, 2026
@emre-openai
Contributor

Summary

Image evals cookbook for evaluating image editing and generation use cases.

Motivation

We don't have any resources to share with customers who are building image use cases.


For new content

When contributing new content, read through our contribution guidelines, and mark the following action items as completed:

  • I have added a new entry in registry.yaml (and, optionally, in authors.yaml) so that my content renders on the cookbook website.
  • I have conducted a self-review of my content based on the contribution guidelines:
    • Relevance: This content is related to building with OpenAI technologies and is useful to others.
    • Uniqueness: I have searched for related examples in the OpenAI Cookbook, and verified that my content offers new insights or unique information compared to existing documentation.
    • Spelling and Grammar: I have checked for spelling or grammatical mistakes.
    • Clarity: I have done a final read-through and verified that my submission is well-organized and easy to understand.
    • Correctness: The information I include is correct and all of my code executes successfully.
    • Completeness: I have explained everything fully, including all necessary references and citations.

We will rate each of these areas on a scale from 1 to 4, and will only accept contributions that score 3 or higher on all areas. Refer to our contribution guidelines for more details.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8aa80a1a6d

Contributor

@kathylau-oai kathylau-oai left a comment

Changes look good to me!

@minh-hoque
Contributor

The beginning mentions that the cookbook focuses on two sections, but then lists:

  1. Human feedback alignment
    • Rubric-based labels and pairwise preferences to capture subjective quality and “vibe”
    • Calibration techniques to keep human judgments consistent over time
  4) Strategy for building evals
    • Start with non-negotiable correctness gates
    • Add graded quality metrics once failures are controlled
    • Tag failure modes to drive targeted iteration

Contributor

@minh-hoque minh-hoque left a comment

Requested a change.

Overall, looks much better and much more focused on how to create image evals vs. also talking about strategy, etc.

@kathylau-oai
Contributor

Changes made

Contributor

@minh-hoque minh-hoque left a comment

LGTM!

@kathylau-oai kathylau-oai merged commit f0f4a14 into main Feb 3, 2026
1 check passed
@kathylau-oai kathylau-oai deleted the image-evals branch February 3, 2026 21:00
3 participants