Add intelligent tool response optimization with call_tool integration #267

Open

aponcedeleonch wants to merge 3 commits into main from call-tool-optimized

.github/workflows/code-checks.yml

@@ -7,15 +7,16 @@ on:
       workflow_call:
     jobs:
-      code_quality:
-        name: Code Quality
-        uses: ./.github/workflows/code-quality.yml
-      # Download models once, before image build
+      # Download models once, before image build and tests
       download_models:
         name: Download Models
         uses: ./.github/workflows/download-models.yml
+      code_quality:
+        name: Code Quality
+        uses: ./.github/workflows/code-quality.yml
+        needs: download_models
       image_build:
         name: Build Docker Image
         uses: ./.github/workflows/image-build.yml

.github/workflows/code-quality.yml

@@ -14,6 +14,12 @@ jobs:
         steps:
           - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+          - name: Download ML models artifact
+            uses: actions/download-artifact@95815c38cf2ff2164869cbab79da8d1f422bc89e # v4.2.1
+            with:
+              name: ml-models
+              path: models/
           - name: Install uv
             uses: astral-sh/setup-uv@61cb8a9741eeb8a550a1b8544337180c0fc8476b # v7.2.0
             with:
@@ -30,10 +36,10 @@ jobs:
           - name: Run Linting
             run: task lint
           - name: Run Type Checking
             run: task typecheck
           - name: Run Tests
             run: task test

.github/workflows/update-thv-models.yml

@@ -84,6 +84,9 @@ jobs:
                 sleep 1
               done
+          - name: Install Dependencies
+            run: task install
           - name: Generate ToolHive Models
             env:
               MANAGE_THV: "false"

.gitignore

@@ -12,6 +12,10 @@ wheels/
     # Local configuration files
     *.local*
+    # Environment files (keep .env.example)
+    .env
+    .env.local
     # Database files
     *.db
@@ -33,3 +37,15 @@ examples/anthropic_comparison/*.png
     # Pre-downloaded ML models (downloaded by scripts/download_models.py)
     models/
+    # AppWorld data
+    data/
+    experiments/
+    conversations/
+    # ONNX models (too large for git)
+    src/mcp_optimizer/response_optimizer/models/
+    # Experiment state and results
+    **/*_state.json
+    **/*_results.json

CLAUDE.md

     - pyproject.toml should be the central place for configuring the project, i.e. linters, typecheckers, testing, etc
     - Always prefer to use native Python types over custom types, e.g. use `list` instead of `List`, `dict` instead of `Dict`, etc.
     - Prefer using `uv run python -c "import this"` instead of `python -c "import this"`. This ensures that the correct python version and environment is used.
+    - Prefer using module-level imports instead of function-level
     ## Code Structure
     - The main server code is located in `src/mcp_optimizer/server.py`
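A small illustration of the module-level import guideline added to CLAUDE.md above. The function names are hypothetical and only the standard library is used; this is a sketch of the convention, not code from this PR:

```python
import json  # module-level: runs once when the module loads and is visible at the top of the file


def dump_settings(settings: dict[str, int]) -> str:
    """Serialize settings using the module-level import (preferred style)."""
    return json.dumps(settings, sort_keys=True)


def dump_settings_discouraged(settings: dict[str, int]) -> str:
    """Same behavior, written with the function-level import the guideline discourages."""
    import json as _json  # function-level: hides the dependency inside the body

    return _json.dumps(settings, sort_keys=True)
```

Both functions return the same string; the difference is only where the dependency is declared.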

Taskfile.yml

@@ -4,36 +4,28 @@ tasks:
       install:
         desc: Install dependencies
         cmds:
-          - uv sync --dev --all-packages --group security --group examples
+          - uv sync --dev --all-packages --group security
       lint:
         desc: Run linting with ruff
         cmds:
           - uv run ruff check .
-        deps:
-          - install
       format:
         desc: Fix linting issues and format code
         cmds:
           - uv run ruff format .
           - uv run ruff check --fix .
-        deps:
-          - install
       typecheck:
         desc: Run type checking with ty
         cmds:
           - uv run ty check .
-        deps:
-          - install
       test:
         desc: Run unit tests with pytest
         cmds:
           - uv run pytest
-        deps:
-          - install
       check:
         desc: Run all checks (lint, typecheck, tests, and security)
@@ -51,29 +43,21 @@ tasks:
           - uv run pip-audit --ignore-vuln CVE-2026-0994
           - uv run bandit -r src/ -f json -o bandit-report.json || true
           - uv run pip-audit --ignore-vuln CVE-2026-0994 --format=json --output=pip-audit-report.json || true
-        deps:
-          - install
       sbom:
         desc: Generate Software Bill of Materials (SBOM)
         cmds:
           - uv run cyclonedx-py environment --output-format json --output-file sbom.json
-        deps:
-          - install
       generate-thv-models:
         desc: Generate Pydantic models from Toolhive's OpenAPI specification
         cmds:
           - ./scripts/generate_toolhive_models.sh
-        deps:
-          - install
       run-migrations:
         desc: Run database migrations
         cmds:
           - uv run alembic upgrade head
-        deps:
-          - install
       download-models:
         desc: Download ML models for offline/airgapped deployments
@@ -115,8 +99,6 @@ tasks:
           TOOLHIVE_PORT: "8080"
         cmds:
           - uv run mcpo
-        deps:
-          - install
       run-in-thv:
         desc: Build mcp-optimizer and run it in ToolHive
@@ -146,3 +128,39 @@ tasks:
         desc: Check status of all MCP server examples
         cmds:
           - ./examples/mcp-servers/status-mcp-servers.sh
+      appworld-install:
+        desc: Install AppWorld data (installs from source).
+          Installing from source requires Git LFS.
+          If it is installed and fails try cleaning uv cache with `rm -rf $(uv cache dir)/git-v0` and re-running.
+          Reference - https://github.com/astral-sh/uv/issues/14173
+        env:
+          UV_GIT_LFS: "1"
+        cmds:
+          - uv sync --dev --all-packages --group security --group examples
+          - uv run appworld install
+          - task: appworld-download-data
+          - uv run appworld --version
+      appworld-download-data:
+        desc: Download AppWorld data if not present
+        status:
+          - test -d ./data
+        cmds:
+          - uv run appworld download data
+      appworld-serve-api:
+        desc: Start AppWorld API server (port 9000) in isolated environment. Downloads base DBs if not present.
+        cmds:
+          - mkdir -p data/base_dbs
+          - uv run appworld serve apis --port 9000
+      appworld-serve-mcp:
+        desc: Start AppWorld MCP server (port 10000) in isolated environment
+        cmds:
+          - uv run appworld serve mcp http --remote-apis-url http://localhost:9000 --port 10000
+      appworld-experiment:
+        desc: Run AppWorld experiment (requires servers running, run `task local-dev` first)
+        cmds:
+          - uv run python examples/call_tool_optimizer/run_experiment.py {{.CLI_ARGS}}

examples/anthropic_comparison/comparison_orchestrator.py

@@ -8,11 +8,12 @@
     import structlog
     from mcp_optimizer_agent import McpOptimizerAgentRunner
     from metrics import MetricsComputer
-    from models import ComparisonReport, ComparisonResult, TestCase, TestDataset
     from native_approach import NativeApproachRunner
     from results_exporter import ResultsExporter
     from tool_converter import ToolConverter
+    from .models import ComparisonReport, ComparisonResult, TestCase, TestDataset
     logger = structlog.get_logger(__name__)

examples/anthropic_comparison/ingest_test_data.py

@@ -20,7 +20,7 @@
     from mcp_optimizer.db.workload_tool_ops import WorkloadToolOps
     from mcp_optimizer.embeddings import EmbeddingManager
     from mcp_optimizer.ingestion import IngestionService
-    from mcp_optimizer.token_counter import TokenCounter
+    from mcp_optimizer.response_optimizer.token_counter import TokenCounter
     logger = structlog.get_logger(__name__)

examples/anthropic_comparison/mcp_optimizer_agent.py

@@ -7,7 +7,6 @@
     import structlog
     from mcp.types import ListToolsResult
-    from models import ChosenMcpServerTool, McpOptimizerSearchResult, TestCase
     from pydantic_ai import Agent
     from pydantic_ai.agent import AgentRunResult
     from pydantic_ai.messages import ModelRequest, ModelResponse, ToolCallPart, ToolReturnPart
@@ -20,6 +19,8 @@
     from mcp_optimizer.embeddings import EmbeddingManager
     from mcp_optimizer.server import find_tool
+    from .models import ChosenMcpServerTool, McpOptimizerSearchResult, TestCase
     logger = structlog.get_logger(__name__)
     SYSTEM_PROMPT = """You are a tool selection agent designed to identify the most appropriate tool

examples/anthropic_comparison/metrics.py

@@ -5,15 +5,16 @@
     import click
     import structlog
-    from models import (
+    from .models import (
         AggregateMetrics,
         ComparisonReport,
         ComparisonResult,
         McpOptimizerSearchResult,
         NativeSearchResult,
         TestCase,
     )
-    from results_exporter import ResultsExporter
+    from .results_exporter import ResultsExporter
     logger = structlog.get_logger(__name__)

examples/anthropic_comparison/native_approach.py

@@ -6,7 +6,8 @@
     import structlog
     from anthropic import AsyncAnthropic
-    from models import NativeSearchResult, TestCase
+    from .models import NativeSearchResult, TestCase
     logger = structlog.get_logger(__name__)

examples/anthropic_comparison/results_exporter.py

@@ -6,11 +6,12 @@
     import matplotlib.pyplot as plt
     import structlog
-    from models import ComparisonReport
     from rich.console import Console
     from rich.panel import Panel
     from rich.table import Table
+    from .models import ComparisonReport
     logger = structlog.get_logger(__name__)

examples/anthropic_comparison/tool_search_comparison.py

@@ -3,11 +3,12 @@
     from pathlib import Path
     import click
-    from comparison_orchestrator import ComparisonOrchestrator
-    from results_exporter import ResultsExporter
     from mcp_optimizer.configure_logging import configure_logging
+    from .comparison_orchestrator import ComparisonOrchestrator
+    from .results_exporter import ResultsExporter
     @click.command()
     @click.option(

examples/call_tool_optimizer/.env.example

@@ -0,0 +1,5 @@
+    # Copy this file to .env and fill in your values
+    # The .env file can be placed here or in the project root
+    # Required: OpenRouter API key for LLM access
+    OPENROUTER_API_KEY=your_openrouter_api_key_here
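
The comments in `.env.example` say the `.env` file may live either next to the example or in the project root. A minimal sketch of how a script could resolve the key under that convention, using only the standard library. The helper names `parse_env_text` and `load_api_key`, the lookup order, and the choice to raise on a missing key are illustrative assumptions, not code from this PR:

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_api_key(candidates: list[Path]) -> str:
    """Return OPENROUTER_API_KEY from the process environment, else from the first .env that defines it."""
    key = os.environ.get("OPENROUTER_API_KEY")
    if key:
        return key
    for path in candidates:
        if path.is_file():
            found = parse_env_text(path.read_text()).get("OPENROUTER_API_KEY")
            if found:
                return found
    # Assumption: the key is required, so a missing value is treated as an error.
    raise RuntimeError("OPENROUTER_API_KEY is not set and no .env file defines it")
```

Calling `load_api_key([Path("examples/call_tool_optimizer/.env"), Path(".env")])` would check the process environment first and then each file in order.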
