inference 1.0 RC1 #1959
base: main
Conversation
…e-changes of defaults
…with use-inference-models
```python
np_images: List[np.ndarray] = [
    load_image_bgr(
        v,
        disable_preproc_auto_orient=kwargs.get(
            "disable_preproc_auto_orient", False
        ),
    )
    for v in images
]
mapped_kwargs = self.map_inference_kwargs(kwargs)
return self._model.pre_process(np_images, **mapped_kwargs)
```
⚡️Codeflash found 12% (0.12x) speedup for InferenceModelsObjectDetectionAdapter.preprocess in inference/core/models/inference_models_adapters.py
⏱️ Runtime : 4.82 milliseconds → 4.28 milliseconds (best of 39 runs)
📝 Explanation and details
The optimized code achieves a 12% speedup by eliminating redundant operations in the preprocess method:
Key Optimization:
The critical change is hoisting the `kwargs.get("disable_preproc_auto_orient", False)` call outside the list comprehension. In the original code, this dictionary lookup was performed once per image (747 times in the profiler results), taking ~1.48ms total. The optimized version performs the lookup just once before the loop, reducing it to a negligible ~46μs.
Why This Works:
- Dictionary lookups in Python have overhead (hash computation, key comparison)
- The `disable_preproc_auto_orient` value is constant across all images in a batch
- By extracting it to a variable, we eliminate 746 redundant lookups per batch
- This is particularly impactful when processing larger batches (see the 200-image test showing similar gains)
Additional Cleanup:
The `map_inference_kwargs` method was removed because it simply returned `kwargs` unchanged. This eliminates an unnecessary method call (~3.66ms in the original) and simplifies the code path; the `kwargs` are now passed directly to `self._model.pre_process()`.
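A pass-through like this costs an attribute lookup and a function call per invocation while changing nothing. A minimal sketch (hypothetical classes, not the adapter itself) of the shape that was removed:

```python
class Before:
    def map_inference_kwargs(self, kwargs):
        # identity mapping: returns its input unchanged
        return kwargs

    def preprocess(self, kwargs):
        mapped = self.map_inference_kwargs(kwargs)  # extra call, same dict
        return mapped

class After:
    def preprocess(self, kwargs):
        return kwargs  # kwargs forwarded directly, no intermediate call

kw = {"disable_preproc_auto_orient": True}
# both paths return the very same dict object
assert Before().preprocess(kw) is After().preprocess(kw)
```

Removing an identity mapping is safe only as long as no subclass overrides it to do real work; presumably that was verified here before deleting the hook.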
Performance Profile:
- Line profiler shows the list comprehension time dropped from 33.1ms to 31.7ms (4.2% faster at the loop level)
- The overall `preprocess` method improved from 42.8ms to 37.1ms (13% faster)
- Test results confirm consistent 5-15% speedups across single images, batches, and edge cases
This optimization is most beneficial when preprocess is called frequently with batches of images, as the per-image overhead reduction compounds with batch size.
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⏪ Replay Tests | 🔘 None Found |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 6 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
import types
import inference.core.models.inference_models_adapters as adapters_module
import numpy as np
# imports
import pytest # used for our unit tests
from inference.core.models.inference_models_adapters import (
InferenceModelsObjectDetectionAdapter,
)
# Helper lightweight model used by tests to capture calls to pre_process.
# This is not a mock from unittest.mock; it's a tiny concrete object used only
# to observe how preprocess forwards its data to the model.
class MinimalModel:
def __init__(self, return_value=None):
self.last_called_with = None
self.return_value = return_value if return_value is not None else {"ok": True}
def pre_process(self, np_images, **kwargs):
# Record what we received for assertions in tests
self.last_called_with = (list(np_images), dict(kwargs))
return self.return_value
def test_preprocess_single_image_invokes_load_and_model(monkeypatch):
# Prepare a deterministic ndarray to be returned by the patched load_image_bgr.
fake_image = np.zeros((8, 8, 3), dtype=np.uint8)
# Track calls to the patched loader to assert disable_preproc_auto_orient usage.
called = {"count": 0, "last_flag": None}
def fake_load_image_bgr(value, disable_preproc_auto_orient=False):
# Ensure the value passed through correctly (we don't assert a specific type)
called["count"] += 1
called["last_flag"] = disable_preproc_auto_orient
# return a copy to ensure preprocess can't mutate the original easily
return fake_image.copy()
# Patch the adapter module's load_image_bgr function (it was imported there)
monkeypatch.setattr(adapters_module, "load_image_bgr", fake_load_image_bgr)
# Build an adapter instance without calling __init__ to avoid heavy external dependencies.
adapter = object.__new__(InferenceModelsObjectDetectionAdapter)
# Provide a minimal model that records calls and returns a known value.
model = MinimalModel(return_value={"result": "single"})
adapter._model = model
# Call preprocess with a single "image" (could be any sentinel value)
sentinel = "SINGLE_IMAGE_SENTINEL"
codeflash_output = adapter.preprocess(sentinel, some_kw=1)
out = codeflash_output # 12.4μs -> 12.3μs (0.819% faster)
np_images_passed, kwargs_passed = model.last_called_with
def test_preprocess_batch_images_and_disable_flag(monkeypatch):
# Prepare a small batch
batch = ["img0", "img1", "img2"]
returned_images = [
np.full((4, 4, 3), fill_value=i, dtype=np.uint8) for i in range(len(batch))
]
# A loader that returns distinct arrays per invocation and records the flags
call_info = {"flags": []}
def fake_load_image_bgr(value, disable_preproc_auto_orient=False):
# choose an image from returned_images based on the invocation count
idx = len(call_info["flags"])
call_info["flags"].append(disable_preproc_auto_orient)
# Return a copy to avoid accidental shared-state modifications in tests
return returned_images[idx].copy()
monkeypatch.setattr(adapters_module, "load_image_bgr", fake_load_image_bgr)
adapter = object.__new__(InferenceModelsObjectDetectionAdapter)
model = MinimalModel(return_value={"result": "batch"})
adapter._model = model
# Call preprocess with the batch and force disable_preproc_auto_orient True
codeflash_output = adapter.preprocess(batch, disable_preproc_auto_orient=True)
out = codeflash_output # 8.49μs -> 7.35μs (15.4% faster)
np_images_passed, kwargs_passed = model.last_called_with
# Each image received by model should match the distinct arrays returned by our fake loader
    for i, arr in enumerate(np_images_passed):
        assert np.array_equal(arr, returned_images[i])
def test_preprocess_empty_list_calls_model_with_empty_list(monkeypatch):
# Ensure loader would raise if called (it should not be called for an empty list)
def loader_should_not_be_called(value, disable_preproc_auto_orient=False):
raise AssertionError(
"load_image_bgr must not be called for an empty input list"
)
monkeypatch.setattr(adapters_module, "load_image_bgr", loader_should_not_be_called)
adapter = object.__new__(InferenceModelsObjectDetectionAdapter)
model = MinimalModel(return_value={"result": "empty"})
adapter._model = model
codeflash_output = adapter.preprocess([], someflag=True)
out = codeflash_output # 3.30μs -> 3.00μs (10.1% faster)
np_images_passed, kwargs_passed = model.last_called_with
def test_preprocess_uses_map_inference_kwargs(monkeypatch):
# Simple loader that always returns the same array
monkeypatch.setattr(
adapters_module,
"load_image_bgr",
lambda v, disable_preproc_auto_orient=False: np.zeros(
(2, 2, 3), dtype=np.uint8
),
)
adapter = object.__new__(InferenceModelsObjectDetectionAdapter)
model = MinimalModel(return_value={"ok": "mapped"})
adapter._model = model
# Attach a custom map_inference_kwargs bound method to this instance that transforms kwargs.
def custom_mapper(self, kwargs):
# Return a new dict that intentionally alters/filters incoming kwargs
return {"mapped_key": "mapped_value"}
adapter.map_inference_kwargs = types.MethodType(custom_mapper, adapter)
# Call preprocess with arbitrary kwargs; they should be replaced by custom_mapper output
codeflash_output = adapter.preprocess("dummy_input", original="value")
out = codeflash_output # 7.82μs -> 7.43μs (5.13% faster)
_, kwargs_passed = model.last_called_with
def test_preprocess_propagates_loader_exceptions(monkeypatch):
# Patch loader to raise a ValueError for the first element
def failing_loader(value, disable_preproc_auto_orient=False):
raise ValueError("invalid image data")
monkeypatch.setattr(adapters_module, "load_image_bgr", failing_loader)
adapter = object.__new__(InferenceModelsObjectDetectionAdapter)
adapter._model = MinimalModel()
with pytest.raises(ValueError) as excinfo:
adapter.preprocess("bad_input") # 3.35μs -> 3.41μs (1.73% slower)
def test_preprocess_large_batch_handles_many_images(monkeypatch):
# Create a batch of 200 small "images" to test scaling of the preprocessing step.
batch_size = 200
batch = [f"img_{i}" for i in range(batch_size)]
# Loader that returns the same small ndarray for each call and counts calls
call_count = {"n": 0}
def generic_loader(value, disable_preproc_auto_orient=False):
call_count["n"] += 1
# return a tiny array unique by filling with the call index modulo 256 to keep memory low
return np.full((1, 1, 3), fill_value=call_count["n"] % 256, dtype=np.uint8)
monkeypatch.setattr(adapters_module, "load_image_bgr", generic_loader)
adapter = object.__new__(InferenceModelsObjectDetectionAdapter)
# Model returns the number of images it received to make verification simple
class ReturnCountModel:
def __init__(self):
self.last_called_with = None
def pre_process(self, np_images, **kwargs):
self.last_called_with = (list(np_images), dict(kwargs))
return {"received": len(np_images)}
model = ReturnCountModel()
adapter._model = model
codeflash_output = adapter.preprocess(batch)
out = codeflash_output # 356μs -> 351μs (1.40% faster)
np_images_passed, _ = model.last_called_with
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To test or edit this optimization locally: `git merge codeflash/optimize-pr1959-2026-02-04T11.48.16`
```diff
-        np_images: List[np.ndarray] = [
-            load_image_bgr(
-                v,
-                disable_preproc_auto_orient=kwargs.get(
-                    "disable_preproc_auto_orient", False
-                ),
-            )
-            for v in images
-        ]
-        mapped_kwargs = self.map_inference_kwargs(kwargs)
-        return self._model.pre_process(np_images, **mapped_kwargs)
+        disable_preproc_auto_orient = kwargs.get("disable_preproc_auto_orient", False)
+        np_images: List[np.ndarray] = [
+            load_image_bgr(v, disable_preproc_auto_orient=disable_preproc_auto_orient)
+            for v in images
+        ]
+        return self._model.pre_process(np_images, **kwargs)
```