v2.40.2

@j-mendez j-mendez released this 23 Jan 21:20
· 173 commits to main since this release

What's Changed

Solve web challenges, perform actions, and more with remote multimodal iterative automation.

  • Remote Multimodal Engine for Chrome automation using vision + LLM
  • Iterative automation loop:
    capture → infer plan → execute → re-capture → repeat
  • Unified RemoteMultimodalConfigs to configure:
    • API endpoint
    • Model selection
    • Prompts
    • Retry behavior
    • Capture strategies
  • Strict JSON automation plans:
    { "label": "...", "done": true|false, "steps": [...] }

Example usage:

website.with_remote_multimodal(Some(
    RemoteMultimodalConfigs::new("http://localhost:11434/v1/chat/completions", "GLM-4.7-Flash")
        .with_api_key(Some(API_KEY))
));
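
The iterative loop described above (capture → infer plan → execute → re-capture → repeat) can be illustrated with a small, self-contained sketch. This is not the crate's implementation: the `Plan` struct only mirrors the JSON shape shown in the notes, and `Page`, `Planner`, `MockPage`, `MockPlanner`, `run_loop`, `demo`, and the `max_rounds` cap are all hypothetical names assumed for illustration.

```rust
/// Mirrors the plan shape from the release notes:
/// { "label": "...", "done": true|false, "steps": [...] }
/// (Steps are plain strings here; the real step format is not specified in the notes.)
#[derive(Debug, Clone)]
struct Plan {
    label: String,
    done: bool,
    steps: Vec<String>,
}

/// Hypothetical stand-in for the browser side: capture state, execute a step.
trait Page {
    fn capture(&mut self) -> String;
    fn execute(&mut self, step: &str);
}

/// Hypothetical stand-in for the remote vision + LLM endpoint.
trait Planner {
    fn infer_plan(&mut self, capture: &str) -> Plan;
}

/// Runs capture -> infer plan -> execute -> re-capture until a plan reports
/// `done` or `max_rounds` is reached. Returns the labels of the plans seen.
fn run_loop(page: &mut dyn Page, planner: &mut dyn Planner, max_rounds: usize) -> Vec<String> {
    let mut labels = Vec::new();
    for _ in 0..max_rounds {
        let capture = page.capture();
        let plan = planner.infer_plan(&capture);
        labels.push(plan.label.clone());
        for step in &plan.steps {
            page.execute(step);
        }
        if plan.done {
            break;
        }
    }
    labels
}

/// Mock page that records every executed step.
struct MockPage {
    executed: Vec<String>,
}

impl Page for MockPage {
    fn capture(&mut self) -> String {
        format!("state after {} actions", self.executed.len())
    }
    fn execute(&mut self, step: &str) {
        self.executed.push(step.to_string());
    }
}

/// Mock planner: asks for one click on the first capture, then reports done.
struct MockPlanner;

impl Planner for MockPlanner {
    fn infer_plan(&mut self, capture: &str) -> Plan {
        if capture.starts_with("state after 0") {
            Plan {
                label: "click checkbox".to_string(),
                done: false,
                steps: vec!["click #verify".to_string()],
            }
        } else {
            Plan { label: "finished".to_string(), done: true, steps: Vec::new() }
        }
    }
}

/// Drives the loop with the mocks and returns (plan labels, executed steps).
fn demo() -> (Vec<String>, Vec<String>) {
    let mut page = MockPage { executed: Vec::new() };
    let mut planner = MockPlanner;
    let labels = run_loop(&mut page, &mut planner, 5);
    (labels, page.executed)
}
```

Calling `demo()` runs two rounds: the first plan executes one step, and the second plan sets `done` and ends the loop. Capping the number of rounds is a design choice assumed here so that a planner which never reports `done` cannot loop forever.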

Full Changelog: v2.39.14...v2.40.2