Pull request overview
This PR improves the clarity of metrics logging by:
- Renaming metrics methods and fields for better clarity (increment→increase, finished→completed)
- Adding tracking for API-level request states (running/waiting at the API server level)
- Reorganizing the log output to clearly separate API server metrics from engine core metrics
- Making the prefix cache hit rate conditional in the output (only shown when non-zero)
Changes:
- Renamed metrics methods from `increment_*` to `increase_*` and `num_finished_reqs` to `num_completed_reqs`
- Added `num_api_running_reqs` and `num_api_waiting_reqs` tracking to distinguish API-level from engine-level request states
- Improved log message format to show API server and Engine core metrics separately with clearer labels
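As a rough illustration of the log layout described above (API server metrics separated from engine core metrics, with the prefix cache hit rate shown only when non-zero), here is a minimal sketch. It is not the code from `lmdeploy/metrics/loggers.py`: the `MetricsSnapshot` class, the `format_metrics_line` function, the engine-level field names, and the exact label wording are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class MetricsSnapshot:
    """Hypothetical container; only the num_*_reqs API names come from the PR."""
    num_total_reqs: int = 0
    num_completed_reqs: int = 0
    num_api_running_reqs: int = 0
    num_api_waiting_reqs: int = 0
    # Assumed names for the engine-level counters.
    num_engine_running_reqs: int = 0
    num_engine_waiting_reqs: int = 0
    prefix_cache_hit_rate: float = 0.0


def format_metrics_line(s: MetricsSnapshot) -> str:
    """Build one log line with API server and engine core sections."""
    api_part = (f"API server: total={s.num_total_reqs}, "
                f"completed={s.num_completed_reqs}, "
                f"running={s.num_api_running_reqs}, "
                f"waiting={s.num_api_waiting_reqs}")
    engine_part = (f"Engine core: running={s.num_engine_running_reqs}, "
                   f"waiting={s.num_engine_waiting_reqs}")
    parts = [api_part, engine_part]
    # Only report the prefix cache hit rate when it is non-zero.
    if s.prefix_cache_hit_rate > 0:
        parts.append(f"Prefix cache hit rate: {s.prefix_cache_hit_rate:.1%}")
    return " | ".join(parts)
```

Calling `format_metrics_line(MetricsSnapshot())` yields a line with only the two sections; the third section appears once `prefix_cache_hit_rate` is greater than zero.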
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lmdeploy/serve/async_engine.py | Updated to track API-level running requests in the model_inst context manager and use renamed metrics methods |
| lmdeploy/metrics/stats.py | Added new fields for API-level request tracking and updated documentation to explain the metric relationships |
| lmdeploy/metrics/metrics_processor.py | Renamed methods and added increase/decrease methods for API running requests tracking |
| lmdeploy/metrics/loggers.py | Reorganized log output format, made prefix cache conditional, and updated Prometheus metric names |
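The table says API-level running requests are tracked inside the `model_inst` context manager using increase/decrease methods. A minimal sketch of that pattern follows, under stated assumptions: `ApiRequestCounter`, its method names, and `track_api_running` are hypothetical stand-ins, and the example is synchronous for brevity, which may not match the actual engine code.

```python
from contextlib import contextmanager


class ApiRequestCounter:
    """Hypothetical stand-in for the metrics processor."""

    def __init__(self) -> None:
        self.num_api_running_reqs = 0

    def increase_api_running_requests(self) -> None:
        self.num_api_running_reqs += 1

    def decrease_api_running_requests(self) -> None:
        self.num_api_running_reqs -= 1


@contextmanager
def track_api_running(counter: ApiRequestCounter):
    """Count a request as running for the lifetime of the with-block."""
    counter.increase_api_running_requests()
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions, so the gauge cannot leak.
        counter.decrease_api_running_requests()
```

Putting the decrement in `finally` keeps the counter balanced even when the request fails partway through.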
lvhan028
reviewed
Feb 5, 2026
lmdeploy/metrics/stats.py
Outdated
num_total_reqs: API server, the number of all requests received since server start.
num_completed_reqs: API server, the number of successfully completed requests since server start.
num_api_running_reqs: API server, the number of requests being assigned to engine instances.
num_api_waiting_reqs: API server, the number of requests waiting for free engine instances.
Collaborator
num_api_routed_reqs: API server, the number of requests routed to request handles.
num_api_waiting_reqs: API server, the number of requests waiting for free request handles.
lvhan028
reviewed
Feb 5, 2026
lmdeploy/metrics/stats.py
Outdated
num_total_reqs: int = 0
num_finished_reqs: int = 0
num_completed_reqs: int = 0
num_api_running_reqs: int = 0
Collaborator
num_api_routed_reqs: int = 0
Collaborator
Please merge the latest main to resolve the conflicts.
lvhan028
approved these changes
Feb 5, 2026
Try to make the metrics log clearer. Now, it looks like: