feat(trainer): add structured and configurable logging support#110
feat(trainer): add structured and configurable logging support#110anencore94 wants to merge 2 commits intokubeflow:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
149c8bc to
56896c3
Compare
6f99044 to
9642b41
Compare
Pull Request Test Coverage Report for Build 20360630788Details
💛 - Coveralls |
andreyvelich
left a comment
There was a problem hiding this comment.
Thank you for this @anencore94!
/assign @kubeflow/kubeflow-sdk-team
| "get_logger", | ||
| "setup_logging", |
There was a problem hiding this comment.
shall we move this to a separate package (e.g. kubeflow/logging)
There was a problem hiding this comment.
I agree with it will be better for the long-term purpose, how about migrating in another PR ?
However, I could apply that feature in this PR too. Feel free to give your feedback!
| ) | ||
|
|
||
| job_id = self.backend.train(runtime=runtime, initializer=initializer, trainer=trainer) | ||
| logger.debug("Successfully created TrainJob with ID: %s", job_id) |
There was a problem hiding this comment.
Can you move all of the debug statements out of backend, and place them into trainer_client.py
sdk/kubeflow/trainer/backends/kubernetes/backend.py
Lines 254 to 256 in 9642b41
@szaher @kramaranya Does it sound good ?
|
|
||
|
|
||
| def setup_logging( | ||
| level: Union[str, int] = "INFO", |
There was a problem hiding this comment.
As we discussed here, do we need to have INFO level for our logger ? #85 (comment)
There was a problem hiding this comment.
Let me double-check your word!
Did you mean that SDK will only have debug() logs, so there could be meaningless to introduce logger level, right ?
| class TestLoggingConfig: | ||
| """Test logging configuration functionality.""" | ||
|
|
||
| def test_get_logger(self): |
There was a problem hiding this comment.
Please refactor these test cases, similar to this:
| ValueError: Invalid backend configuration. | ||
|
|
||
| """ | ||
| logger.debug("Initializing TrainerClient with backend_config=%s", backend_config) |
There was a problem hiding this comment.
How you propagate your loggers to the clients ? Should it be part of the TrainerClient() property ?
There was a problem hiding this comment.
I just thought that users to use logger like this pattern in the description.
from kubeflow.trainer import TrainerClient, setup_logging
# Setup logging (optional - NullHandler prevents noise by default)
setup_logging(level="DEBUG", format_type="console")
# Use SDK - debug messages will appear if logging is configured
client = TrainerClient()
There was a problem hiding this comment.
Log records created by module-level loggers (e.g. logging.getLogger(name)) are attached to a named logger such as "kubeflow.trainer.api.trainer_client".
By default, these records propagate up through the logger name hierarchy ("kubeflow.trainer.api.trainer_client" -> "kubeflow.trainer.api" -> "kubeflow.trainer" -> "kubeflow" -> root) until they reach a logger that has handlers attached.
The handlers and levels configured on the "kubeflow" and root loggers here, therefore control how all Kubeflow SDK log records are ultimately emitted.
|
Thanks for the review! Sorry for the delay, I'll follow up this on this week 🙏 |
9642b41 to
38a15b7
Compare
This PR implements comprehensive logging support for the Kubeflow SDK as requested in Issue kubeflow#85. - **NullHandler Pattern**: Prevents logging noise by default, users can override with their own configuration - **Configurable Logging**: Support for console, detailed, and JSON output formats - **Structured Logging**: JSON formatter for log aggregation systems (ELK stack, Fluentd, etc.) - **Environment-based Configuration**: Configure logging via environment variables - `kubeflow/trainer/logging/config.py`: Centralized logging configuration - `kubeflow/trainer/logging/formatters.py`: Custom formatters including JSON structured logging - `kubeflow/trainer/logging/logging_test.py`: Comprehensive test suite (14 tests) - **SDK Integration**: Added debug logging to TrainerClient for better observability - **Package Integration**: Exposed logging utilities through kubeflow.trainer.logging - **NullHandler Setup**: Configured at package level to prevent logging noise ✅ **Consistent use of Python's logging library instead of print statements** - All SDK operations now use proper Python logging - NullHandler pattern prevents unwanted output by default ✅ **Support for different logging levels (DEBUG, INFO, WARNING, ERROR)** - Full support for all standard Python logging levels - Configurable via setup_logging() function or environment variables ✅ **Ability for users to configure log formatting and destinations** - Console, detailed, and JSON output formats - File output support - Custom formatter support ✅ **Clear and actionable log messages for key SDK operations** - Debug messages in TrainerClient initialization - Backend selection logging - Job creation and ID logging - **14 comprehensive unit tests** covering all logging functionality - **NullHandler pattern verification** with proper isolation - **SDK integration testing** with real TrainerClient usage - **Application integration examples** with file and console logging - **All tests pass** with proper linting compliance ```python from kubeflow.trainer import TrainerClient, setup_logging setup_logging(level="DEBUG", format_type="console") client = TrainerClient() ``` ```python setup_logging(level="INFO", format_type="json") ``` ```bash export KUBEFLOW_LOG_LEVEL=DEBUG export KUBEFLOW_LOG_FORMAT=json ``` None - this is a pure addition with backward compatibility. No migration required. Existing code will continue to work without changes. Users can optionally configure logging for better observability. Resolves kubeflow#85 Signed-off-by: Jaeyeon Kim <anencore94@gmail.com>
268ec58 to
564aac4
Compare
- Export configure_from_env function in logging module's public API to enable environment-based logging configuration - Add cleanup_logging fixture to ensure proper test isolation by storing and restoring logging handlers between tests - Add propagation settings in test_sdk_integration_with_nullhandler to ensure SDK logging works correctly when user configures logging Signed-off-by: Jaeyeon Kim <anencore94@gmail.com>
4711bf1 to
2fab08c
Compare
What this PR does / why we need it:
This PR implements comprehensive structured and configurable logging support for the Kubeflow SDK as requested in Issue #85. The implementation provides:
The logging system ensures that the SDK is quiet by default (using NullHandler) but provides rich logging capabilities when users explicitly configure logging.
Which issue(s) this PR fixes (optional, in
Fixes #<issue number>, #<issue number>, ...format, will close the issue(s) when PR gets merged):Fixes #85
Recent Updates
Latest Improvements (Commit: 4711bf1)
configure_from_envfunction: Now available in the public API for easier environment-based configurationcleanup_loggingpytest fixture with autouse to prevent test interferenceFeatures Implemented
Core Logging Infrastructure
configure_from_env()Key Components
kubeflow/trainer/logging/config.py: Centralized logging configurationget_logger(): Get properly named logger instancessetup_logging(): Configure logging with various optionsconfigure_from_env(): Configure from environment variables (now exported)kubeflow/trainer/logging/formatters.py: Custom formatters including JSON structured loggingkubeflow/trainer/logging/logging_test.py: Comprehensive test suite with proper isolationIntegration Points
TrainerClientfor better observabilitykubeflow.trainer.loggingIssue #85 Requirements Fulfilled
✅ Consistent use of Python's logging library instead of print statements
✅ Support for different logging levels (DEBUG, INFO, WARNING, ERROR)
setup_logging()function or environment variables✅ Ability for users to configure log formatting and destinations
✅ Clear and actionable log messages for key SDK operations
Testing
Usage Examples
Basic Usage
JSON Logging for Production
Environment Configuration
Code Review
A comprehensive code review was performed with the following assessment:
Breaking Changes
None - this is a pure addition with backward compatibility.
Migration Guide
No migration required. Existing code will continue to work without changes.
Users can optionally configure logging for better observability.
Checklist: