Skip to content

Comments

UN-3096 add 1st e2e test case#179

Closed
vince-leaf wants to merge 63 commits intomainfrom
un-3096_add_test_case_all_agent_cli_connections
Closed

UN-3096 add 1st e2e test case#179
vince-leaf wants to merge 63 commits intomainfrom
un-3096_add_test_case_all_agent_cli_connections

Conversation

@vince-leaf
Copy link
Contributor

@vince-leaf vince-leaf commented Apr 26, 2025

This PR is set to have all the general infrastructure to create e2e tests like [smoke test, load test, etc.)

Note: This e2e test does not use any of Dan's test infrastructure for this PR. It would be the next to do.

This directory contains the complete end-to-end (E2E) test infrastructure for the music_nerd_pro agent, including configuration, reusable utilities, test cases, and server lifecycle control tools.

I significantly refactored my first version, which had a start/stop server as its tool files. I also made a base e2e test case that can be called by pytest manually, which allowed me to create our smoke test as below.

📁 Directory Structure

tests/e2e/
├── README.md                      # ✅ You're here
├── configs/
│   └── config.hocon               # HOCON config defining all agent connections
├── conftest.py                    # Shared pytest setup, CLI options, parametrization, server startup
├── requirements.txt               # Pip requirements for test environment
├── test_cases_data/
│   └── mnpt_data.hocon            # Input data and expectations for the test runner
├── tests/
│   └── test_run_agent_cli_music_nerd_pro.py  # Main test case driver (used by orchestrators)
├── tools/
│   ├── smoke_test_runner.py       # Orchestrator: start → test → stop
│   ├── start_server_manual.py     # Manual: starts server and stores PID
│   ├── stop_all_servers.py        # Manual: stops all running agent servers from the PID file
│   └── stop_last_server.py        # Manual: stops only the most recently started server
└── utils/
    ├── logging_config.py          # Shared logging setup (file + console)
    ├── music_nerd_pro_hocon_loader.py  # Extracts structured test data from HOCON config
    ├── music_nerd_pro_output_parser.py # Parses CLI outputs for verification
    ├── music_nerd_pro_runner.py   # Executes the CLI test logic
    ├── server_manager.py          # Manages agent server lifecycle (start, stop, PID tracking)
    ├── server_state.py            # In-memory + file-based PID state tracking
    ├── thinking_file_builder.py   # Generates `thinking_file` argument path
    └── verifier.py                # Assertion helper for output validation

```bash
e2e/
├── README.md # This documentation
├── configs/ # Static agent configuration
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added README file

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good that you have all your e2e stuff together under its own directory.

pexpect
pyhocon
pytest-xdist
pytest-timeout
Copy link
Contributor Author

@vince-leaf vince-leaf Apr 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added requirement for e2e tests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these requirements go to tests/e2e/requirements.txt then?

# conftest.py
# ------------------------------------------------------------------------
# Provides custom CLI flags, dynamic test generation, and environment setup.
# Pytest configuration to share like MusicNerdPro test
Copy link
Contributor Author

@vince-leaf vince-leaf Apr 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is triggered by pytest to set up the environment for the e2e test cases to run. It is located and should be in the e2e directory.


# --- Step 5: Double-check that PID file is removed (optional cleanup)
assert not os.path.exists(PID_FILE), f"❌ [SERVER] PID file still exists: {PID_FILE}"
print(f"🧹 [SERVER] PID file successfully removed: {PID_FILE}")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean-up the PID file.

@@ -0,0 +1,60 @@
# tests/e2e/tools/stop_last_server.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to stop the PID listed in the file, which is started by the start server tool.


# ------------------------------------------------------------------------
# Shared Logging Setup for E2E Tests and CLI Runners
# ------------------------------------------------------------------------
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging utils

import os
from pyhocon import ConfigFactory

NAME_DATA_HOCON = "music_nerd_pro_data"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to loader music nerd pro hocon

def extract_agent_response(output: str, start_marker="Response from MusicNerdPro:"):
"""
Extracts agent's response text based on a start marker.
Strips out trailing cost information or CLI prompts.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the Music Nerd Pro agent output, CLI gets the message

try:
parsed = json.loads(block.replace("'", '"'))
if "running_cost" in parsed:
return f"running_cost: {parsed['running_cost']}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loop Agent output to get COST output from CLI Music Nerd Pro

# mnpt_runner.py
# ------------------------------------------------------------------------
# CLI-based test runner: drives input/output to the MusicNerdPro agent CLI
# ------------------------------------------------------------------------
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test's heart uses the agent_cli and simulates the user reading and typing input from the terminal.

thinking_file_arg = build_thinking_file_arg(conn, repeat_index, use_thinking_file)

# Build command to launch agent CLI
command = (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From this active terminal launch, the agent_cli

logging.info(f"[TEST] CMD: {command}")

# Start the agent CLI process
child = pexpect.spawn(command, encoding="utf-8", echo=False)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the pexpect to capture the text from the terminal.

"""
start = time.time()
while time.time() - start < timeout:
try:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will ensure the process stops this proc


def start_server(conn: str):
"""
Start the agent server as a detached background subprocess.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start server

# After starting the subprocess and writing the PID
with open(PID_FILE, "a") as f:
f.write(f"{proc.pid}\n")
logging.info(f"📝 [SERVER] PID {proc.pid} written to {PID_FILE}")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

write its PID to file

def stop_all_servers():
"""
Stop all PIDs listed in the PID file. Clean up the file.
"""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stop all server according to PID's file.



def stop_server_by_pid(pid: int):
try:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stop sever by a pid id



def _remove_pid_from_file(pid: int):
if not os.path.exists(PID_FILE):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete as clean up pip file

@@ -0,0 +1,53 @@
name: Smoke Test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new GitHub CRON Job smoke test trigger file, as @donn-leaf suggested.


- name: Run Smoke Tests
run: |
python -m tests.e2e.tools.smoke_test_runner
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed smoke test from building tests file.

@donn-leaf donn-leaf self-requested a review May 8, 2025 23:00

- name: Run pytest (excluding integration tests)
run: pytest --verbose -m "not integration" --timer-top-n 10
- name: Run pytest Run All Other Tests (excluding integration and e2e)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name seems redundant. How about just
Run pytest excluding integration, e2e and smoke tests

- name: Run pytest (excluding integration tests)
run: pytest --verbose -m "not integration" --timer-top-n 10
- name: Run pytest Run All Other Tests (excluding integration and e2e)
run: pytest --verbose -m "not integration and not smoke and not e2e" --timer-top-n 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is your smoke test just an instance of e2e tests? Maybe saying 'not smoke' and 'not e2e' is redundant? I'm not sure.

Copy link
Contributor

@ofrancon ofrancon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me but please put the requirements either in requirements-build.txt or in the e2e folder, but not both - that's confusing. Make sure the doc reflects your choice. Thanks!

pexpect
pyhocon
pytest-xdist
pytest-timeout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these requirements go to tests/e2e/requirements.txt then?

@vince-leaf vince-leaf closed this May 16, 2025
@vince-leaf vince-leaf deleted the un-3096_add_test_case_all_agent_cli_connections branch May 16, 2025 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants