Skip to content

Conversation

@jxnl
Copy link
Collaborator

@jxnl jxnl commented Feb 6, 2026

This PR isolates documentation, examples, and docs test/tooling updates from the broader mode-registry refactor work.

Why

  • The original branch mixed large runtime/provider refactors with substantial docs churn.
  • Splitting docs out reduces review scope for the runtime PR and makes docs review parallelizable.

What is included

  • docs/** updates (concepts, integrations, examples index/content, migration and mode docs)
  • examples/** updates aligned with new docs/API usage
  • docs test grouping changes in tests/docs/**
  • docs tooling and site config updates in mkdocs.yml, scripts/check_links.py, scripts/validate_headings.py, scripts/fix_api_calls.py, scripts/fix_doc_tests.py
  • plan cleanup in plan/seo_plan.md

What is intentionally excluded

  • Runtime/provider implementation refactors under instructor/**
  • non-doc test refactors outside tests/docs/**
  • dependency/runtime locking and code-path behavior changes

Validation

  • No additional runtime tests executed in this split-only PR.
  • Split verified by ensuring no runtime paths remain in diff scope against main.

Note

Low Risk
Documentation-only changes (no runtime code) with primary risk being inaccurate examples or mode names leading to user confusion.

Overview
Updates the documentation set to align with the core mode naming and current recommended APIs, including widespread edits across blog posts and concept pages (e.g., switching legacy Mode.* values to Mode.TOOLS/Mode.JSON/Mode.MD_JSON, and tightening/formatting code examples).

Expands and reorganizes the API reference (docs/api.md) to explicitly enumerate the public surface area (clients, creation helpers, DSL modules, schemas, validation, batch, distillation, multimodal, hooks, exceptions, patching) and adds a new docs/api-docstring-assessment.md documenting docstring quality gaps.

Refactors several concept guides for concision and clarity, notably a major rewrite/condensation of docs/concepts/batch.md and docs/concepts/error_handling.md, plus adds a new docs/concepts/citation.md guide for CitationMixin usage and citation validation patterns.

Written by Cursor Bugbot for commit 03a757b. Configure here.

jxnl and others added 30 commits January 18, 2026 13:40
- Add ModeRegistry for O(1) handler lookups via (Provider, Mode) tuples
- Add ModeHandler base class and protocol interfaces
- Add patch_v2() function for unified provider patching
- Add registry-based retry logic with handler integration
- Add exception hierarchy (RegistryError, ValidationContextError)
- Add mode normalization with deprecation warnings
- Add @register_mode_handler decorator for handler registration
- Add registry unit tests

This PR was written by [Cursor](https://cursor.com)
- Remove debug logging blocks in retry.py that wrote to hardcoded local path
- Fix GENAI_STRUCTURED_OUTPUTS enum value to avoid alias collision
- Fix sync retry to extract stream parameter from kwargs like async version
- Add docs/concepts/mode-migration.md explaining legacy mode deprecation
- Add tests/v2/test_mode_normalization.py for mode normalization logic
- Update mkdocs.yml with mode migration guide link
- Tests skip gracefully when handlers not yet registered

This PR was written by [Cursor](https://cursor.com)
- Fix tautological test assertion to verify handler exists
- Use provider-specific deprecated modes in warning test
- Add instructor/v2/providers/anthropic/ with handlers for TOOLS, JSON_SCHEMA, PARALLEL_TOOLS, ANTHROPIC_REASONING_TOOLS modes
- Add instructor/v2/providers/openai/ with handlers for TOOLS, JSON_SCHEMA, MD_JSON, PARALLEL_TOOLS, RESPONSES_TOOLS modes
- Update instructor/v2/__init__.py with from_anthropic and from_openai exports
- Update instructor/auto_client.py with v2 routing integration
- Add tests/v2/test_provider_modes.py for integration tests
- Add tests/v2/test_handlers_parametrized.py for unit tests
- Add tests/v2/test_openai_streaming.py for streaming tests

This PR was written by [Cursor](https://cursor.com)
- Remove debug logging in auto_client.py for Cohere client
- Fix google provider to use v1 from_genai (v2 not available yet)
- Add empty check for text_blocks in Anthropic MD_JSON handler
- Add None check for tool_calls in OpenAI PARALLEL_TOOLS handler
- Add instructor/v2/providers/genai/ with handlers for TOOLS, JSON modes
- Add instructor/v2/providers/cohere/ with handlers for TOOLS, JSON_SCHEMA, MD_JSON modes
- Add instructor/v2/providers/mistral/ with handlers for TOOLS, JSON_SCHEMA, MD_JSON modes
- Update instructor/v2/__init__.py with from_genai, from_cohere, from_mistral exports
- Add tests/v2/test_genai_integration.py
- Add tests/v2/test_cohere_handlers.py
- Add tests/v2/test_mistral_client.py and test_mistral_handlers.py

This PR was written by [Cursor](https://cursor.com)
- Remove debug logging in Cohere client
- Fix shallow copy mutation in Cohere handlers (copy messages list)
- Add empty list check in Mistral MD_JSON handler
…iter, Bedrock)

- Add instructor/v2/providers/xai/ with handlers for TOOLS, JSON_SCHEMA, MD_JSON modes
- Add instructor/v2/providers/groq/ with handlers for TOOLS, MD_JSON modes
- Add instructor/v2/providers/fireworks/ with handlers for TOOLS, MD_JSON modes
- Add instructor/v2/providers/cerebras/ with handlers for TOOLS, MD_JSON modes
- Add instructor/v2/providers/writer/ with handlers for TOOLS, MD_JSON modes
- Add instructor/v2/providers/bedrock/ with handlers for TOOLS, MD_JSON modes
- Update instructor/v2/__init__.py with all provider exports
- Add provider-specific test files for all 6 providers

All 11 v2 providers are now implemented.

This PR was written by [Cursor](https://cursor.com)
- Fix Bedrock MD_JSON handler to return early for None response_model
- Fix Fireworks async streaming to await the coroutine
- Fix xAI async streaming filter to only check for tool_calls
- Add fallback error handling for xAI sync streaming
- Add list content case to xAI MD_JSON handler
- Add truncated output detection to Writer handlers
Test reorganization:
- Move cache tests to tests/cache/
- Move core tests to tests/core/ (exceptions, patch, retry, schema)
- Move multimodal tests to tests/multimodal/
- Move processing tests to tests/processing/
- Move provider tests to tests/providers/
- Remove obsolete/duplicate test files

Unified test infrastructure:
- Add tests/v2/test_client_unified.py - Parametrized tests for all provider clients
- Add tests/v2/test_handler_registration_unified.py - Handler registration validation
- Add tests/v2/test_routing.py - Provider routing tests
- Add tests/v2/README.md - Test documentation
- Add tests/v2/UNIFICATION_OPPORTUNITIES.md - Future consolidation notes

This PR was written by [Cursor](https://cursor.com)
The test expects a deprecation warning that hasn't been added to v1 from_anthropic yet
Documentation:
- Add instructor/v2/README.md with comprehensive architecture documentation
- Update docs/modes-comparison.md with v2 mode mappings
- Update docs/integrations/anthropic.md, genai.md, google.md, bedrock.md
- Update docs/concepts/from_provider.md with v2 routing
- Update docs/api.md with v2 exports
- Update CLAUDE.md with v2 development notes

Code cleanup:
- Update pyproject.toml version
- Update .github/workflows/test.yml
- Minor fixes in instructor/core/, dsl/, processing/, providers/
- Remove obsolete plan/seo_plan.md

This PR was written by [Cursor](https://cursor.com)
Remove debug logging blocks that write to hardcoded local path
Fix non-deterministic test collection by sorting providers before parameterization
Remove debug logging blocks that write to hardcoded local path in prepare_request and parse_response methods
- Pass stream_extractor to Partial/Iterable streaming helpers (keep legacy fallback)

- Remove stray no-op import in vertexai shim

- Restore useful type info in openai_schema TypeError
Remove redundant handler files and register these providers directly
via OPENAI_COMPAT_PROVIDERS list. These providers use OpenAI-compatible
APIs, so they can share the same handler implementations.

- Add GROQ, FIREWORKS, CEREBRAS to OPENAI_COMPAT_PROVIDERS
- Update client imports to use OpenAI handlers module
- Update _HANDLER_SPECS to point to OpenAI handlers
- Remove redundant handler files (groq, fireworks, cerebras)
- Update registry to remove deleted handler modules
- Introduced `_parse_with_registry` to centralize parsing logic and handle deprecation warnings.
- Updated `ResponseSchema` methods for parsing Anthropic tools, JSON, OpenAI functions, and tools to use the new method.
- Deprecated `ResponseSchema.parse_*` methods in favor of `process_response` and `ResponseSchema.from_response` with core modes.
- Updated documentation to reflect the deprecation of legacy `ResponseSchema.parse_*` helpers.
…egistry

# Conflicts:
#	pyproject.toml
#	uv.lock
@cloudflare-workers-and-pages
Copy link

cloudflare-workers-and-pages bot commented Feb 6, 2026

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Preview URL Updated (UTC)
✅ Deployment successful!
View logs
instructor 1e0dba1 Commit Preview URL

Branch Preview URL
Feb 06 2026, 05:17 PM

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is ON. A Cloud Agent has been kicked off to fix the reported issue.

# Logging setup
logging.basicConfig(level=logging.INFO)

from instructor import Instructions, FinetuneFormat
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate Instructions import in code example

Low Severity

Instructions is imported twice: at line 37 with from instructor import Instructions and at line 42 with from instructor import Instructions, FinetuneFormat. The first import is redundant since the second import includes Instructions.

Fix in Cursor Fix in Web

@cursor
Copy link

cursor bot commented Feb 6, 2026

Bugbot Autofix prepared fixes for 1 of the 1 bugs found in the latest run.

  • ✅ Fixed: Duplicate Instructions import in code example
    • Removed the redundant Instructions import from the distillation documentation code example, leaving a single combined import.

@jxnl jxnl merged commit caea4ab into main Feb 6, 2026
@jxnl jxnl deleted the codex/pr7-docs-examples branch February 6, 2026 17:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants