UN-3096 add 1st e2e test case #179
Changes from 60 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,53 @@ | ||
| name: Smoke Test | ||
|
|
||
| on: | ||
| schedule: | ||
| - cron: '0 12 * * *' # 12:00 PM UTC = 4:00 AM PT (standard) | ||
| workflow_dispatch: # enables manual triggering | ||
|
|
||
| jobs: | ||
| test: | ||
| runs-on: ubuntu-latest | ||
| container: | ||
| image: python:3.12-slim | ||
|
|
||
| steps: | ||
| - name: Checkout | ||
| uses: actions/checkout@v4 | ||
|
|
||
| - name: Install dependencies | ||
| run: | | ||
| apt-get update && apt-get install -y shellcheck | ||
| pip install -r requirements-build.txt | ||
| pip install -r requirements.txt | ||
|
|
||
| - name: Show installed packages | ||
| run: pip freeze | ||
|
|
||
|
|
||
| - name: Run Smoke Test [excluding all other tests] | ||
| run: python -v -m tests.e2e.tools.smoke_test_runner | ||
| env: | ||
| OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }} | ||
|
|
||
| - name: Notify Slack on success | ||
| if: success() | ||
| uses: slackapi/slack-github-action@v1.24.0 | ||
| with: | ||
| payload: | | ||
| { | ||
| "text": "✅ *Integration Tests Passed* for `${{ github.repository }}` on `${{ github.ref_name }}`" | ||
| } | ||
| env: | ||
| SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }} | ||
|
|
||
| - name: Notify Slack on failure | ||
| if: failure() | ||
| uses: slackapi/slack-github-action@v1.24.0 | ||
| with: | ||
| payload: | | ||
| { | ||
| "text": "❌ *Integration Tests Failed* for `${{ github.repository }}` on `${{ github.ref_name }}`" | ||
| } | ||
| env: | ||
| SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }} | ||
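
The cron comment in the workflow above notes that 12:00 UTC corresponds to 4:00 AM Pacific during standard time. GitHub Actions schedules run in UTC, so the local time shifts by an hour during daylight saving time. A minimal sketch of that conversion, assuming fixed offsets of UTC-8 (PST) and UTC-7 (PDT); the helper name `cron_time_in_pacific` is hypothetical and not part of this PR:

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Fixed offsets, so no time zone database is required.
PST = timezone(timedelta(hours=-8), "PST")  # Pacific Standard Time
PDT = timezone(timedelta(hours=-7), "PDT")  # Pacific Daylight Time


def cron_time_in_pacific(utc_hour: int, utc_minute: int = 0) -> Tuple[str, str]:
    """Return the (PST, PDT) wall-clock times for a UTC cron trigger."""
    # The date is arbitrary; only the time of day matters with fixed offsets.
    run_utc = datetime(2024, 1, 1, utc_hour, utc_minute, tzinfo=timezone.utc)
    return (
        run_utc.astimezone(PST).strftime("%H:%M"),
        run_utc.astimezone(PDT).strftime("%H:%M"),
    )


if __name__ == "__main__":
    # Cron '0 12 * * *' fires at 12:00 UTC.
    print(cron_time_in_pacific(12))  # ('04:00', '05:00')
```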
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -36,10 +36,10 @@ jobs: | |
| run: build_scripts/run_shellcheck.sh | ||
|
|
||
| - name: Run flake8 | ||
| run: flake8 | ||
| run: flake8 | ||
|
|
||
| - name: Run pytest (excluding integration tests) | ||
| run: pytest --verbose -m "not integration" --timer-top-n 10 | ||
| - name: Run pytest Run All Other Tests (excluding integration and e2e) | ||
|
Contributor
This name seems redundant. How about just
||
| run: pytest --verbose -m "not integration and not smoke and not e2e" --timer-top-n 10 | ||
|
Contributor
Is your smoke test just an instance of the e2e tests? If so, saying both 'not smoke' and 'not e2e' may be redundant. I'm not sure.
||
| env: | ||
| OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }} | ||
| AGENT_TOOL_PATH: "./neuro_san/coded_tools" | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,6 +8,11 @@ timeout-decorator==0.5.0 | |
| coverage==7.6.1 | ||
| pytest-cov==5.0.0 | ||
| parameterized | ||
| pexpect | ||
| pyhocon | ||
| pytest-xdist | ||
| pytest-timeout | ||
|
Contributor
Author
Added requirements for the e2e tests.
Contributor
Should these requirements go to
||
| psutil | ||
|
|
||
| # Code quality | ||
| flake8==7.1.1 | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,99 @@ | ||
| # 🧪 End-to-End Testing Suite for `music_nerd_pro` | ||
|
|
||
| This directory contains the full end-to-end (E2E) test infrastructure for the `music_nerd_pro` agent, including configuration, reusable utilities, test cases, and server lifecycle control tools. | ||
|
|
||
| --- | ||
|
|
||
| ## Directory Structure | ||
|
|
||
| ```text | ||
| tests/e2e/ | ||
| ├── README.md # ← You're here | ||
| ├── configs/ | ||
| │   └── config.hocon # HOCON config defining all agent connections | ||
| ├── conftest.py # Shared pytest setup, CLI options, parametrization, server startup | ||
| ├── requirements.txt # Pip requirements for test environment | ||
| ├── test_cases_data/ | ||
| │   └── mnpt_data.hocon # Input data and expectations for test runner | ||
| ├── tests/ | ||
| │   └── test_run_agent_cli_music_nerd_pro.py # Main test case driver (used by orchestrators) | ||
| ├── tools/ | ||
| │   ├── smoke_test_runner.py # Orchestrator: start → test → stop | ||
| │   ├── start_server_manual.py # Manual: starts server and stores PID | ||
| │   ├── stop_all_servers.py # Manual: stops all running agent servers from PID file | ||
| │   └── stop_last_server.py # Manual: stops only the most recently started server | ||
|
Contributor
Author
Updated README.
||
| └── utils/ | ||
|     ├── logging_config.py # Shared logging setup (file + console) | ||
|     ├── music_nerd_pro_hocon_loader.py # Extracts structured test data from HOCON config | ||
|     ├── music_nerd_pro_output_parser.py # Parses CLI outputs for verification | ||
|     ├── music_nerd_pro_runner.py # Executes the CLI test logic | ||
|     ├── server_manager.py # Manages agent server lifecycle (start, stop, PID tracking) | ||
|     ├── server_state.py # In-memory + file-based PID state tracking | ||
|     ├── thinking_file_builder.py # Generates `thinking_file` argument path | ||
|     └── verifier.py # Assertion helper for output validation | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## How to Run E2E Tests | ||
|
|
||
| ### Option 1: Manual Mode | ||
|
|
||
| ```bash | ||
| # 1. Start agent server manually | ||
| python tests/e2e/tools/start_server_manual.py | ||
|
|
||
| # 2. Run E2E CLI tests | ||
| pytest tests/e2e/tests/test_run_agent_cli_music_nerd_pro.py \ | ||
| --capture=no --connection grpc --thinking-file --repeat 1 -n auto | ||
|
|
||
| # 3. Stop all running agent servers | ||
| python tests/e2e/tools/stop_all_servers.py | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ### ⚡ Option 2: Orchestrated Smoke Test | ||
|
|
||
| Run everything in one go: | ||
|
|
||
| ```bash | ||
| python -m tests.e2e.tools.smoke_test_runner | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Test CLI Options | ||
|
|
||
| | Option | Description | | ||
| |------------------|--------------------------------------------------| | ||
| | `--connection` | One of: `direct`, `grpc`, `http` | | ||
| | `--repeat` | Number of repetitions per connection | | ||
| | `--thinking-file`| Enables logging of agent `thinking_file` output | | ||
| | `-n` | pytest-xdist option: number of parallel workers (e.g. `-n auto`) | | ||
|
|
||
| --- | ||
|
|
||
| ## Test Environment Setup | ||
|
|
||
| ```bash | ||
| pip install -r tests/e2e/requirements.txt | ||
| ``` | ||
|
|
||
| You must also have the `neuro_san` package accessible via `PYTHONPATH`. | ||
|
|
||
| --- | ||
|
|
||
| ## Notes | ||
|
|
||
| - PID tracking is handled via `/tmp/neuro_san_server.pid`. | ||
| - Multiple PIDs are supported and cleaned up automatically. | ||
| - The test file `test_run_agent_cli_music_nerd_pro.py` is ignored during normal discovery unless triggered explicitly. | ||
| - Logging is unified under `/tmp/e2e_server.log`. | ||
|
|
||
| --- | ||
|
|
||
| ## 🛠️ Authors & Maintenance | ||
|
|
||
| Maintained by QA & Platform Engineering. | ||
| Contact: `@vincent.nguyen` | ||
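
The README notes above say that PIDs are tracked in a file and that multiple PIDs are supported, with separate tools to stop all servers or only the most recent one. The contents of `server_state.py` are not shown in this part of the diff, so the following is only a minimal sketch of how file-based tracking of several PIDs could work. It assumes one PID per line, and all function names here are hypothetical:

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def record_pid(pid_file: Path, pid: int) -> None:
    """Append a PID on its own line so several servers can be tracked."""
    with pid_file.open("a", encoding="utf-8") as handle:
        handle.write(f"{pid}\n")


def read_pids(pid_file: Path) -> List[int]:
    """Return all tracked PIDs in start order (empty if the file is missing)."""
    if not pid_file.exists():
        return []
    return [int(token) for token in pid_file.read_text(encoding="utf-8").split()]


def pop_last_pid(pid_file: Path) -> Optional[int]:
    """Remove and return the most recently recorded PID, if any."""
    pids = read_pids(pid_file)
    if not pids:
        return None
    last = pids.pop()
    pid_file.write_text("".join(f"{pid}\n" for pid in pids), encoding="utf-8")
    return last


def clear_pids(pid_file: Path) -> List[int]:
    """Return every tracked PID and delete the PID file."""
    pids = read_pids(pid_file)
    pid_file.unlink(missing_ok=True)
    return pids


if __name__ == "__main__":
    # Use a temporary directory instead of /tmp/neuro_san_server.pid for the demo.
    with tempfile.TemporaryDirectory() as tmp:
        demo_file = Path(tmp) / "neuro_san_server.pid"
        record_pid(demo_file, 1001)
        record_pid(demo_file, 1002)
        print(read_pids(demo_file))     # [1001, 1002]
        print(pop_last_pid(demo_file))  # 1002
        print(clear_pids(demo_file))    # [1001]
```

Actually terminating the processes (for example with `psutil`, which this PR adds to the build requirements) is left out of the sketch on purpose.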
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| # config.hocon | ||
| # Agent config & connection setup | ||
|
|
||
| connection = ["direct", "grpc", "http"] | ||
|
Contributor
Author
This is a shared HOCON config for all e2e tests.
||
| agent = [music_nerd_pro] | ||
|
|
||
| model_llm = ["gpt-4o", "llama3.1"] | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,164 @@ | ||
| # conftest.py | ||
|
|
||
| # ------------------------------------------------------------------------ | ||
| # Pytest configuration for shared CLI options, dynamic test generation, | ||
| # session-wide logging setup, and agent server lifecycle management. | ||
| # ------------------------------------------------------------------------ | ||
|
|
||
| import pytest | ||
| import os | ||
| import logging | ||
| from pyhocon import ConfigFactory | ||
| from pathlib import Path | ||
| from utils.logging_config import setup_logging, DEFAULT_LOG_PATH | ||
| setup_logging() # Make sure logger is initialized | ||
|
|
||
|
|
||
| # ------------------------------------------------------------------------------ | ||
| # Constants | ||
| # ------------------------------------------------------------------------------ | ||
|
|
||
| THINKING_FILE_PATH = "/private/tmp/agent_thinking" | ||
| LOG_PATH = DEFAULT_LOG_PATH # shared with logging_config | ||
| NAME_CONFIG_HOCON = "share_agent_config" | ||
|
|
||
| # ------------------------------------------------------------------------------ | ||
| # One-time Log Cleanup + Logging Setup | ||
| # ------------------------------------------------------------------------------ | ||
|
|
||
| try: | ||
| # Truncate the log file for a clean start (don't delete it) | ||
| open(LOG_PATH, "w").close() | ||
|
Contributor
Author
Pytest will clean up all existing e2e logs at the start of this test.
||
|
|
||
| print(f"[setup] Truncated log file: {LOG_PATH}") | ||
| except Exception as e: | ||
| print(f"[setup] WARNING: Could not prepare log file: {e}") | ||
|
|
||
|
|
||
| # Initialize shared logging (both file and console) | ||
| setup_logging(log_path=LOG_PATH) | ||
| logging.info("✅ Logging system initialized by conftest.py") | ||
|
|
||
| # ------------------------------------------------------------------------------ | ||
| # Load Static Agent Configuration (HOCON) | ||
| # ------------------------------------------------------------------------------ | ||
|
|
||
| CONFIG_HOCON_PATH = os.path.join(os.path.dirname(__file__), "configs", NAME_CONFIG_HOCON + ".hocon") | ||
|
|
||
| config = ConfigFactory.parse_file(CONFIG_HOCON_PATH) | ||
|
Contributor
Author
Parse the HOCON config to get the connections.
||
|
|
||
| # ------------------------------------------------------------------------------ | ||
| # Pytest Hooks | ||
| # ------------------------------------------------------------------------------ | ||
|
|
||
|
|
||
| def pytest_ignore_collect(collection_path: Path, config): | ||
| """ | ||
| Prevents pytest from collecting a specific test file during discovery. | ||
|
|
||
| This is used to ignore test_agent_cli_music_nerd_pro.py during normal pytest runs, | ||
| because: | ||
| - It depends on a pre-started server (via start_server_manual.py) | ||
| - It is intended to be run only as part of tools/smoke_test_runner.py | ||
| - This helps avoid accidental test failures or unwanted execution | ||
|
|
||
| Note: Uses pathlib.Path as required by pytest 9+ (fix for PytestRemovedIn9Warning). | ||
| """ | ||
| return "test_agent_cli_music_nerd_pro.py" in str(collection_path) | ||
|
Contributor
Author
This tells pytest to skip this test file, i.e. not collect it.
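
The hook above ignores a path whenever the hard-coded file name occurs as a substring of it. A minimal sketch for checking which paths that rule would skip; the helper name `is_ignored` and the sample paths are illustrative only, and the second path assumes the test file is named as listed in this PR's README:

```python
from pathlib import PurePosixPath

# File name matched by the substring check in pytest_ignore_collect.
IGNORED_NAME = "test_agent_cli_music_nerd_pro.py"


def is_ignored(collection_path: PurePosixPath) -> bool:
    """Mirror the hook's rule: ignore when the name occurs anywhere in the path."""
    return IGNORED_NAME in str(collection_path)


if __name__ == "__main__":
    candidates = [
        PurePosixPath("tests/e2e/tests/test_agent_cli_music_nerd_pro.py"),
        PurePosixPath("tests/e2e/tests/test_run_agent_cli_music_nerd_pro.py"),
        PurePosixPath("tests/e2e/conftest.py"),
    ]
    for path in candidates:
        print(f"{path}: ignored={is_ignored(path)}")
```

Under these assumptions only the first path is ignored, so it may be worth confirming that the name in the hook matches the actual test file name.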
||
|
|
||
|
|
||
| def pytest_configure(config): | ||
| """ | ||
| Pytest hook: called once at the start of the test session. | ||
| This function logs useful context about the test configuration. | ||
|
|
||
| - Logs the repeat count from `--repeat` CLI option (default = 1) | ||
| - Detects if pytest-xdist is enabled (i.e., running in parallel) | ||
| """ | ||
| # Fetch repeat count from command-line option or default to 1 | ||
| repeat = config.getoption("repeat", default=1) | ||
|
|
||
| # Check if we are in a worker process (i.e., xdist parallel run) | ||
| is_parallel = hasattr(config, "workerinput") | ||
|
|
||
| # Emit a log entry showing test mode | ||
| logging.info(f"🧪 Test mode: repeat={repeat}, parallel={is_parallel}") | ||
| logging.info("Custom Environment Info") | ||
| logging.info(f"thinking-file path : {THINKING_FILE_PATH}") | ||
|
|
||
|
|
||
| # This is a special pytest hook. Do not rename it! | ||
| # Pytest uses this to register custom CLI options. | ||
| def pytest_addoption(parser): | ||
vince-leaf marked this conversation as resolved.
|
||
| """ | ||
| Defines CLI options: | ||
| --connection: Limit tests to a specific connection (e.g., direct/grpc/http) | ||
| --repeat: Repeat each test multiple times | ||
| --thinking-file: Enables optional thinking_file logging | ||
| """ | ||
| group = parser.getgroup("custom options") | ||
| group.addoption("--connection", action="store", default=None, | ||
| help="Specify a connection to test: direct, grpc, or http.") | ||
| group.addoption("--repeat", action="store", type=int, default=1, | ||
| help="Number of times to repeat each test.") | ||
| group.addoption("--thinking-file", action="store_true", default=False, | ||
| help="Enable thinking_file output per test run.") | ||
|
|
||
|
|
||
| def pytest_generate_tests(metafunc): | ||
| # Skip parametrization if running the orchestrator module (test_*.py) | ||
| # This avoids injecting parameters into the orchestration entrypoint file, | ||
| # which is responsible for launching tests, not running them directly. | ||
| if metafunc.module.__name__.endswith("test_none"): | ||
| return | ||
|
|
||
| # Only proceed if the test function expects 'connection_name' as a fixture | ||
| if "connection_name" in metafunc.fixturenames: | ||
| # Load all available connection types from the HOCON config (e.g., ['grpc', 'http', 'direct']) | ||
| all_connections = load_connections() | ||
|
Contributor
Author
By default, all three connections are used.
||
|
|
||
| # Read CLI overrides (if any) | ||
| selected = metafunc.config.getoption("connection") # --connection grpc | ||
| repeat = metafunc.config.getoption("repeat") # --repeat 3 | ||
|
|
||
| # If a specific connection is requested, validate and filter | ||
| if selected: | ||
| if selected not in all_connections: | ||
| raise ValueError(f"Connection '{selected}' not in config: {all_connections}") | ||
| all_connections = [selected] | ||
|
|
||
| # 🧪 Build parameter combinations: (connection, repeat_index) | ||
| # ----------------------------------------------------------------------------- | ||
| # This block is responsible for *generating the test matrix*. | ||
| # It determines how many test cases will be launched based on: | ||
| # - the list of connections (e.g., grpc, http, direct) | ||
| # - the --repeat CLI argument (e.g., --repeat 3) | ||
| # | ||
| # Example: | ||
| # If connections = ['grpc', 'http'] and repeat = 2, this will produce: | ||
| # - grpc_run1 | ||
| # - grpc_run2 | ||
| # - http_run1 | ||
| # - http_run2 | ||
| # | ||
| # These become individual pytest cases, allowing for: | ||
| # - Parallel execution (when using `-n auto`) | ||
| # - Fine-grained control over test case identifiers and logs | ||
| # | ||
| # The generated values are injected into the test function via parametrize. | ||
| test_params = [ | ||
| pytest.param(conn, i, id=f"{conn}_run{i+1}") | ||
| for conn in all_connections | ||
| for i in range(repeat) | ||
|
Contributor
Author
Generate the matrix of runners.
||
| ] | ||
|
|
||
| # Inject parameters into the test function | ||
| # This allows dynamic test generation using standard pytest features | ||
| metafunc.parametrize("connection_name, repeat_index", test_params) | ||
|
|
||
|
|
||
| def load_connections(): | ||
| """ | ||
| Returns the list of connections from the test config. | ||
| """ | ||
| return config.get("connection") | ||
vince-leaf marked this conversation as resolved.
Outdated
|
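
The matrix generation in `pytest_generate_tests` above can be tried outside pytest. This is a minimal standalone sketch of the same logic (connection filter plus repeat expansion); the function name `build_test_matrix` is hypothetical and does not appear in the PR:

```python
from typing import List, Optional, Tuple


def build_test_matrix(
    connections: List[str],
    repeat: int,
    selected: Optional[str] = None,
) -> List[Tuple[str, int, str]]:
    """Return (connection, repeat_index, test_id) tuples for every planned run."""
    if selected:
        # Mirror the --connection check: reject names missing from the config.
        if selected not in connections:
            raise ValueError(f"Connection '{selected}' not in config: {connections}")
        connections = [selected]
    return [
        (conn, index, f"{conn}_run{index + 1}")
        for conn in connections
        for index in range(repeat)
    ]


if __name__ == "__main__":
    for _conn, _index, test_id in build_test_matrix(["grpc", "http"], repeat=2):
        print(test_id)
    # Prints grpc_run1, grpc_run2, http_run1, http_run2 (one per line).
```

Each tuple corresponds to one `pytest.param(conn, i, id=...)` entry, so the number of generated test cases is the number of connections times the `--repeat` value.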
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| json | ||
| logging | ||
| os | ||
| sys | ||
| pexpect | ||
| psutil | ||
| pyhocon | ||
| pytest | ||
| pytest-xdist | ||
| pytest-timeout | ||
| pytest-timer | ||
vince-leaf marked this conversation as resolved.
Outdated
|
||
| subprocess | ||
| re | ||
I added a new GitHub cron job workflow file to trigger the smoke test, as @donn-leaf suggested.