test(docs): add end to end evaluation doc tests#2442

Open
BloggerBust wants to merge 9 commits into confident-ai:main from BloggerBust:test/docs-end-to-end-llm-evals

Conversation

@BloggerBust (Contributor) commented Jan 19, 2026

  • add deterministic/offline end-to-end coverage for single-turn + multi-turn evaluate() flows
  • validate EvaluationResult/TestResult shape plus dataset JSON and CSV export schemas
  • add dedicated cache end-to-end tests covering write_cache/use_cache and expected on-disk artifacts
  • add end-to-end tests for evaluate() configs (AsyncConfig, ErrorConfig, DisplayConfig) using deterministic metrics
  • introduce top-level test fixtures (telemetry opt-out, isolated .deepeval dir, settings reset, tracing cleanup) and keep core-only env sandboxing in tests/test_core
  • add CLI smoke test

- end-to-end tests for docs/docs/evaluation-end-to-end-llm-evals.mdx
- add deterministic offline E2E tests covering single-turn and multi-turn flows
- validate EvaluationResult/TestResult shape and dataset JSON/CSV artifact schemas
- add offline fixtures to disable dotenv loading and browser opening
- add networked CLI smoke test gated on OPENAI_API_KEY
- add dedicated GitHub Actions workflow to run docs-based tests
- run DeepEval end-to-end documentation tests in CI with secrets
- support maintainer-only PRs, main branch pushes, and manual dispatch
- temporarily disable Confident docs tests pending fixes
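The deterministic, offline flow described in the bullets above can be sketched without any network calls. Everything below is an illustrative stand-in, not deepeval's real API: the metric class, the result shape, and the export schema are invented for the sketch, and only show the pattern such tests rely on, i.e. a metric with a fixed score so the result shape and the serialized JSON/CSV artifacts can be asserted exactly.

```python
import csv
import io
import json
from dataclasses import dataclass, field, asdict

# Illustrative stand-ins for deepeval's metric/result types (NOT the real classes).
@dataclass
class DeterministicMetric:
    score: float = 1.0          # fixed score -> fully reproducible, offline
    threshold: float = 0.5

    def measure(self, test_case):
        return self.score

    def is_successful(self):
        return self.score >= self.threshold

@dataclass
class TestResult:
    name: str
    success: bool
    metrics: dict = field(default_factory=dict)

def evaluate_offline(cases, metric):
    # Deterministic "evaluate()" sketch: no LLM calls, stable output every run.
    return [
        TestResult(name=c, success=metric.is_successful(),
                   metrics={"deterministic": metric.measure(c)})
        for c in cases
    ]

def export_json(results):
    # Hypothetical JSON artifact schema: a list of result objects.
    return json.dumps([asdict(r) for r in results], indent=2)

def export_csv(results):
    # Hypothetical CSV artifact schema: one row per test case.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["name", "success", "score"])
    writer.writeheader()
    for r in results:
        writer.writerow({"name": r.name, "success": r.success,
                         "score": r.metrics["deterministic"]})
    return buf.getvalue()

results = evaluate_offline(["case-1", "case-2"], DeterministicMetric(score=0.9))
```

Because the score never varies, a test can assert the full serialized artifact byte-for-byte, which is what makes this kind of coverage viable in CI without secrets.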
greptile-apps bot (Contributor) commented Jan 19, 2026

Skipped: This PR was not opened by one of your configured authors: (tanayvaswani, trevor-cai, kritinv, ...)

vercel bot commented Jan 19, 2026

@BloggerBust is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.

- add deterministic metrics for missing-param and raising error scenarios
- add ErrorConfig tests for skip_on_missing_params and ignore_errors (including precedence)
- add AsyncConfig, CacheConfig, and DisplayConfig behavior/validation coverage
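The precedence behavior these tests exercise can be illustrated with a small stand-in. Note the hedge: the ErrorConfig field names come from the bullets above, but the exception type and the resolution order sketched here are assumptions made for illustration, not deepeval's documented semantics.

```python
from dataclasses import dataclass

class MissingParamsError(Exception):
    """Stand-in for a metric failing because test-case parameters are absent."""

@dataclass
class ErrorConfig:
    # Field names taken from the PR bullets; the semantics below are assumed.
    ignore_errors: bool = False
    skip_on_missing_params: bool = False

def run_metric(metric_fn, config):
    """Assumed precedence: a missing-params failure is handled by
    skip_on_missing_params first; ignore_errors then covers any other error."""
    try:
        return ("ok", metric_fn())
    except MissingParamsError:
        if config.skip_on_missing_params:
            return ("skipped", None)
        if config.ignore_errors:
            return ("ignored", None)
        raise
    except Exception:
        if config.ignore_errors:
            return ("ignored", None)
        raise

def raises_missing():
    raise MissingParamsError("expected_output not set")
```

With both flags enabled, the sketch resolves a missing-params failure as "skipped" rather than "ignored", which is the kind of precedence interaction a dedicated test pins down.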
…ites

- Extract generic evaluate() e2e flows into dedicated test files
- Add cache behavior coverage for write_cache/use_cache and on-disk artifacts
- Add evaluate config coverage for AsyncConfig/ErrorConfig/DisplayConfig
- Introduce top-level test fixtures for telemetry opt-out, settings reset, and tracing cleanup
- Remove the monolithic end-to-end test file and reorganize fixtures between tests/ and tests/test_core/
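The fixture responsibilities listed above (telemetry opt-out, isolated .deepeval directory, settings reset) can be sketched as a stdlib context manager. The DEEPEVAL_TELEMETRY_OPT_OUT variable name is an assumption based on deepeval's documentation; the save-and-restore isolation pattern itself is generic.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def isolated_deepeval_env():
    """Run a test body with telemetry opted out and a throwaway working
    directory, so nothing touches the real .deepeval folder.
    DEEPEVAL_TELEMETRY_OPT_OUT is an assumed env var name."""
    saved_env = dict(os.environ)
    saved_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        try:
            os.environ["DEEPEVAL_TELEMETRY_OPT_OUT"] = "YES"
            os.chdir(tmp)          # on-disk artifacts land in the temp dir
            yield tmp
        finally:
            os.chdir(saved_cwd)    # settings reset: restore cwd and env
            os.environ.clear()
            os.environ.update(saved_env)
```

In a pytest suite the same idea would live in a conftest.py fixture (as this PR does); the context-manager form is used here only to keep the sketch dependency-free.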
Confident tests can all go under tests/confident, so we can flatten
this test suite
@trevor-cai changed the title from "test(docs): add end-to-end evaluation doc tests" to "test(docs): add component evaluation doc tests" on Jan 19, 2026
@trevor-cai changed the title from "test(docs): add component evaluation doc tests" to "test(docs): add end to end evaluation doc tests" on Jan 19, 2026
@A-Vamshi force-pushed the test/docs-end-to-end-llm-evals branch from 760fe18 to dca7006 on January 20, 2026 at 17:30

2 participants