63 commits
dfb11bd
add 2 test case and config files
vince-leaf Apr 16, 2025
2fd90a3
updated and fixed front-man effective response
vince-leaf Apr 26, 2025
6a653b2
Added 1st e2e infrastructure with 1 test case
vince-leaf Apr 26, 2025
87bb429
Merge branch 'main' into un-3096_add_test_case_all_agent_cli_connections
vince-leaf Apr 26, 2025
fe68a4e
clean-up
vince-leaf Apr 26, 2025
539cd33
clean-up
vince-leaf Apr 26, 2025
9119952
fixed broken flake8
vince-leaf Apr 26, 2025
ca4cdd0
flake8 reported no newline at end of file, but none
vince-leaf Apr 26, 2025
993cac0
fixed flake8
vince-leaf Apr 26, 2025
219a0b1
added my test dependencies to requirements-build.txt
vince-leaf Apr 28, 2025
e60fcad
add pytest command for path on e2e tests
vince-leaf Apr 28, 2025
a1bfefd
fixed typo
vince-leaf Apr 28, 2025
66fc08e
updated to make flake8 happy
vince-leaf Apr 28, 2025
c1273fc
Made flake8 happy
vince-leaf Apr 28, 2025
4294245
Merge branch 'main' into un-3096_add_test_case_all_agent_cli_connections
vince-leaf Apr 28, 2025
c6215e4
tweaked e2e pytest
vince-leaf Apr 28, 2025
f5f6d9d
edit e2e pytest
vince-leaf Apr 28, 2025
769c352
Update cost values
vince-leaf Apr 28, 2025
2b4a38a
Merge branch 'main' into un-3096_add_test_case_all_agent_cli_connections
vince-leaf Apr 28, 2025
5802313
Merge branch 'main' into un-3096_add_test_case_all_agent_cli_connections
vince-leaf Apr 29, 2025
f57ceec
removed extra
vince-leaf Apr 29, 2025
e58d261
combined fileterwarning to top pytest.ini
vince-leaf Apr 29, 2025
c138c5f
Added server service utility
vince-leaf Apr 30, 2025
5ae944e
added start and stop server service
vince-leaf Apr 30, 2025
d669421
added ignore warning
vince-leaf Apr 30, 2025
bdd9548
renamed to smoketest
vince-leaf Apr 30, 2025
a74094c
updated to run smoke test
vince-leaf Apr 30, 2025
abdcde8
made flake8 happy
vince-leaf May 1, 2025
e028a2e
ignore pytest warning
vince-leaf May 1, 2025
475a76d
debug
vince-leaf May 1, 2025
cfe348f
debug
vince-leaf May 1, 2025
2c2d694
debug failure
vince-leaf May 1, 2025
67649c1
debug
vince-leaf May 1, 2025
db5a2a1
fixed flake8
vince-leaf May 1, 2025
38c79c5
debug
vince-leaf May 1, 2025
c250fda
increased timeout on wait for prompt
vince-leaf May 1, 2025
700459f
added logging
vince-leaf May 1, 2025
db10cf7
make flake8 happy
vince-leaf May 1, 2025
fadcd3b
Made Flake8 happy
vince-leaf May 1, 2025
4526cbd
made flake8 happy
vince-leaf May 1, 2025
3832285
add condition
vince-leaf May 1, 2025
a84833d
a major refactor to support start&stop server service
vince-leaf May 6, 2025
19b5fd8
update trigger smoke-test
vince-leaf May 6, 2025
a37c0a7
made flake8 happy
vince-leaf May 6, 2025
aab8039
add test requirement
vince-leaf May 6, 2025
476d9c6
tweaked stop all servers script
vince-leaf May 6, 2025
469f982
more tweaks
vince-leaf May 6, 2025
3c09c55
fixed a minor info message
vince-leaf May 6, 2025
00782ef
Tweaked timeout
vince-leaf May 6, 2025
bfa15b8
Changes Smoke-test to run after Unit tests
vince-leaf May 6, 2025
739a198
updated readme
vince-leaf May 7, 2025
40de058
Merge branch 'main' into un-3096_add_test_case_all_agent_cli_connections
vince-leaf May 7, 2025
84168d4
renamed the files
vince-leaf May 7, 2025
c778f69
renamed hocon
vince-leaf May 7, 2025
a9e4aee
added more comment
vince-leaf May 7, 2025
ea1d37b
added comment
vince-leaf May 7, 2025
d983628
added comment
vince-leaf May 7, 2025
be264d4
added comment
vince-leaf May 7, 2025
7e0c486
add smoketest cron job
vince-leaf May 8, 2025
7c30777
Removed smoke test build test
vince-leaf May 8, 2025
b4ffbd8
updated test result text
vince-leaf May 8, 2025
417f680
Merge branch 'main' into un-3096_add_test_case_all_agent_cli_connections
vince-leaf May 8, 2025
4b952a7
removed requirement file
vince-leaf May 9, 2025
53 changes: 53 additions & 0 deletions .github/workflows/smoke.yml
@@ -0,0 +1,53 @@
πŸ’¬ vince-leaf (Contributor Author): I added a new GitHub CRON Job smoke test trigger file, as @donn-leaf suggested.

name: Smoke Test

on:
  schedule:
    - cron: '0 12 * * *'  # 12:00 PM UTC = 4:00 AM PT (standard)
  workflow_dispatch:      # enables manual triggering

jobs:
  test:
    runs-on: ubuntu-latest
    container:
      image: python:3.12-slim

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Install dependencies
        run: |
          apt-get update && apt-get install -y shellcheck
          pip install -r requirements-build.txt
          pip install -r requirements.txt

      - name: Show installed packages
        run: pip freeze

      - name: Run Smoke Test [excluding all other tests]
        run: python -v -m tests.e2e.tools.smoke_test_runner
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

      - name: Notify Slack on success
        if: success()
        uses: slackapi/slack-github-action@v1.24.0
        with:
          payload: |
            {
              "text": "βœ… *Smoke Tests Passed* for `${{ github.repository }}` on `${{ github.ref_name }}`"
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}

      - name: Notify Slack on failure
        if: failure()
        uses: slackapi/slack-github-action@v1.24.0
        with:
          payload: |
            {
              "text": "❌ *Smoke Tests Failed* for `${{ github.repository }}` on `${{ github.ref_name }}`"
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
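The success and failure notification steps build the same payload shape inline via the `payload` field. A minimal sketch of that message construction (the function name `smoke_slack_payload` is illustrative, not part of the workflow or the Slack action):

```python
import json


def smoke_slack_payload(passed: bool, repo: str, ref: str) -> str:
    """Build the Slack message body used by the success/failure steps."""
    emoji, word = ("βœ…", "Passed") if passed else ("❌", "Failed")
    return json.dumps({"text": f"{emoji} *Smoke Tests {word}* for `{repo}` on `{ref}`"})


print(smoke_slack_payload(True, "org/repo", "main"))
```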
6 changes: 3 additions & 3 deletions .github/workflows/tests.yml
@@ -36,10 +36,10 @@ jobs:
run: build_scripts/run_shellcheck.sh

- name: Run flake8
run: flake8
run: flake8

- name: Run pytest (excluding integration tests)
run: pytest --verbose -m "not integration" --timer-top-n 10
- name: Run pytest Run All Other Tests (excluding integration and e2e)
πŸ’¬ Contributor: This name seems redundant. How about just "Run pytest excluding integration, e2e and smoke tests"?

run: pytest --verbose -m "not integration and not smoke and not e2e" --timer-top-n 10
πŸ’¬ Contributor: Is your smoke test just an instance of e2e tests? Maybe saying 'not smoke' and 'not e2e' is redundant? I'm not sure.

env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
AGENT_TOOL_PATH: "./neuro_san/coded_tools"
8 changes: 8 additions & 0 deletions pytest.ini
@@ -4,6 +4,14 @@
markers =
    integration: Mark a test as an integration test. These generally take > 30 seconds to complete.

    # Prevents PytestUnknownMarkWarning
    e2e: marks tests as end-to-end tests
    smoke: marks tests as smoke tests

filterwarnings =
    # Ignore warnings about protobuf 4
    ignore:Type google._upb._message.* uses PyType_Spec with a metaclass that has custom tp_new:DeprecationWarning

    # Ignore warning about pexpect
    ignore:.*use of forkpty.*:DeprecationWarning:pty
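These `filterwarnings` entries use the same `action:message:category[:module]` matching rules as Python's stdlib `warnings` filters, where the message part is a regex matched against the start of the warning text. A minimal stdlib sketch of the `forkpty` filter above:

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning by default
    # Equivalent of the ini entry: ignore:.*use of forkpty.*:DeprecationWarning
    warnings.filterwarnings("ignore", message=".*use of forkpty.*",
                            category=DeprecationWarning)
    warnings.warn("The use of forkpty is deprecated", DeprecationWarning)
    warnings.warn("some other warning", UserWarning)

# Only the non-matching warning survives the ignore filter.
print(len(caught))                  # β†’ 1
print(caught[0].category.__name__)  # β†’ UserWarning
```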
5 changes: 5 additions & 0 deletions requirements-build.txt
@@ -8,6 +8,11 @@ timeout-decorator==0.5.0
coverage==7.6.1
pytest-cov==5.0.0
parameterized
pexpect
pyhocon
pytest-xdist
pytest-timeout
πŸ’¬ vince-leaf (Contributor Author, Apr 28, 2025): Added requirement for e2e tests

πŸ’¬ Contributor: Should these requirements go to tests/e2e/requirements.txt then?

psutil

# Code quality
flake8==7.1.1
99 changes: 99 additions & 0 deletions tests/e2e/README.md
@@ -0,0 +1,99 @@
# πŸ§ͺ End-to-End Testing Suite for `music_nerd_pro`

This directory contains the full end-to-end (E2E) test infrastructure for the `music_nerd_pro` agent, including configuration, reusable utilities, test cases, and server lifecycle control tools.

---

## πŸ“ Directory Structure

```text
tests/e2e/
β”œβ”€β”€ README.md # βœ… You're here
β”œβ”€β”€ configs/
β”‚ └── config.hocon # HOCON config defining all agent connections
β”œβ”€β”€ conftest.py # Shared pytest setup, CLI options, parametrization, server startup
β”œβ”€β”€ requirements.txt # Pip requirements for test environment
β”œβ”€β”€ test_cases_data/
β”‚ └── mnpt_data.hocon # Input data and expectations for test runner
β”œβ”€β”€ tests/
β”‚ └── test_run_agent_cli_music_nerd_pro.py # Main test case driver (used by orchestrators)
β”œβ”€β”€ tools/
β”‚ β”œβ”€β”€ smoke_test_runner.py # Orchestrator: start β†’ test β†’ stop
β”‚ β”œβ”€β”€ start_server_manual.py # Manual: starts server and stores PID
β”‚ β”œβ”€β”€ stop_all_servers.py # Manual: stops all running agent servers from PID file
β”‚ └── stop_last_server.py # Manual: stops only the most recently started server
└── utils/
β”œβ”€β”€ logging_config.py # Shared logging setup (file + console)
β”œβ”€β”€ music_nerd_pro_hocon_loader.py # Extracts structured test data from HOCON config
β”œβ”€β”€ music_nerd_pro_output_parser.py # Parses CLI outputs for verification
β”œβ”€β”€ music_nerd_pro_runner.py # Executes the CLI test logic
β”œβ”€β”€ server_manager.py # Manages agent server lifecycle (start, stop, PID tracking)
β”œβ”€β”€ server_state.py # In-memory + file-based PID state tracking
β”œβ”€β”€ thinking_file_builder.py # Generates `thinking_file` argument path
└── verifier.py # Assertion helper for output validation
```

---

## 🚦 How to Run E2E Tests

### πŸ” Option 1: Manual Mode

```bash
# 1. Start agent server manually
python tests/e2e/tools/start_server_manual.py

# 2. Run E2E CLI tests
pytest tests/e2e/tests/test_run_agent_cli_music_nerd_pro.py \
--capture=no --connection grpc --thinking-file --repeat 1 -n auto

# 3. Stop all running agent servers
python tests/e2e/tools/stop_all_servers.py
```

---

### ⚑ Option 2: Orchestrated Smoke Test

Run everything in one go:

```bash
python -m tests.e2e.tools.smoke_test_runner
```
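The orchestrated mode wraps the manual steps into one start β†’ test β†’ stop sequence. A hedged sketch of that control flow (the function and command names here are illustrative, not the actual `smoke_test_runner` API):

```python
import subprocess


def run_smoke(start_cmd, test_cmd, stop_cmd, run=subprocess.call):
    """Start the server, run the tests, and always stop the server.

    Returns the test exit code, or 1 if the server failed to start.
    """
    if run(start_cmd) != 0:
        return 1  # server never came up; nothing to stop
    try:
        return run(test_cmd)
    finally:
        run(stop_cmd)  # stop even if the tests failed or raised
```

In the real runner, `start_cmd` and `stop_cmd` would correspond to `start_server_manual.py` and `stop_all_servers.py`; the `finally` block guarantees servers are not left running after a failed test pass.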

---

## βœ… Test CLI Options

| Option | Description |
|------------------|--------------------------------------------------|
| `--connection` | One of: `direct`, `grpc`, `http` |
| `--repeat` | Number of repetitions per connection |
| `--thinking-file`| Enables logging of agent `thinking_file` output |
| `-n` | pytest-xdist option: number of parallel workers (`auto` = one per CPU core) |

---

## πŸ“¦ Test Environment Setup

```bash
pip install -r tests/e2e/requirements.txt
```

You must also have the `neuro_san` package accessible via `PYTHONPATH`.

---

## 🧠 Notes

- PID tracking is handled via `/tmp/neuro_san_server.pid`.
- Multiple PIDs are supported and cleaned up automatically.
- The test file `test_run_agent_cli_music_nerd_pro.py` is ignored during normal discovery unless triggered explicitly.
- Logging is unified under `/tmp/e2e_server.log`.

---
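The PID-file tracking described in the notes can be approximated with the stdlib alone. This is a hedged sketch of the liveness check, not the actual `server_manager.py` logic (the real code uses `psutil`; `live_pids` is an illustrative name):

```python
import os
from pathlib import Path


def live_pids(pid_file):
    """Return the PIDs in pid_file that still refer to running processes."""
    path = Path(pid_file)
    if not path.exists():
        return []
    alive = []
    for token in path.read_text().split():
        try:
            pid = int(token)
        except ValueError:
            continue  # malformed entry; skip it
        try:
            os.kill(pid, 0)  # signal 0: existence check, delivers no signal
        except ProcessLookupError:
            continue  # stale PID; process already exited
        except PermissionError:
            pass  # process exists but is owned by another user
        alive.append(pid)
    return alive
```

Stale entries filtered out this way can then be rewritten back to the PID file, which is what "cleaned up automatically" amounts to.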

## πŸ› οΈ Authors & Maintenance

Maintained by QA & Platform Engineering.
Contact: `@vincent.nguyen`
8 changes: 8 additions & 0 deletions tests/e2e/configs/share_agent_config.hocon
@@ -0,0 +1,8 @@
# config.hocon
# Agent config & connection setup

connection = ["direct", "grpc", "http"]
πŸ’¬ vince-leaf (Contributor Author): This is a shared config hocon for all e2e tests.

agent = [music_nerd_pro]

model_llm = ["gpt-4o", "llama3.1"]

164 changes: 164 additions & 0 deletions tests/e2e/conftest.py
@@ -0,0 +1,164 @@
# conftest.py

# ------------------------------------------------------------------------
# Pytest configuration for shared CLI options, dynamic test generation,
# session-wide logging setup, and agent server lifecycle management.
# ------------------------------------------------------------------------

import pytest
import os
import logging
from pyhocon import ConfigFactory
from pathlib import Path
from utils.logging_config import setup_logging, DEFAULT_LOG_PATH
setup_logging() # Make sure logger is initialized


# ------------------------------------------------------------------------------
# Constants
# ------------------------------------------------------------------------------

THINKING_FILE_PATH = "/private/tmp/agent_thinking"
LOG_PATH = DEFAULT_LOG_PATH # shared with logging_config
NAME_CONFIG_HOCON = "share_agent_config"

# ------------------------------------------------------------------------------
# One-time Log Cleanup + Logging Setup
# ------------------------------------------------------------------------------

try:
    # Truncate the log file for a clean start (don't delete it)
    open(LOG_PATH, "w").close()
πŸ’¬ vince-leaf (Contributor Author, May 7, 2025): Pytest truncates all existing e2e logs at the start of the test run.


    print(f"[setup] Truncated log file: {LOG_PATH}")
except Exception as e:
    print(f"[setup] WARNING: Could not prepare log file: {e}")


# Initialize shared logging (both file and console)
setup_logging(log_path=LOG_PATH)
logging.info("βœ… Logging system initialized by conftest.py")

# ------------------------------------------------------------------------------
# Load Static Agent Configuration (HOCON)
# ------------------------------------------------------------------------------

CONFIG_HOCON_PATH = os.path.join(os.path.dirname(__file__), "configs", NAME_CONFIG_HOCON + ".hocon")

config = ConfigFactory.parse_file(CONFIG_HOCON_PATH)
πŸ’¬ vince-leaf (Contributor Author): Parse the config hocon to get the connections.


# ------------------------------------------------------------------------------
# Pytest Hooks
# ------------------------------------------------------------------------------


def pytest_ignore_collect(collection_path: Path, config):
    """
    Prevents pytest from collecting a specific test file during discovery.

    This is used to ignore test_agent_cli_music_nerd_pro.py during normal pytest runs,
    because:
    - It depends on a pre-started server (via start_server_manual.py)
    - It is intended to be run only as part of tools/smoke_test_runner.py
    - This helps avoid accidental test failures or unwanted execution

    Note: Uses pathlib.Path as required by pytest 9+ (fix for PytestRemovedIn9Warning).
    """
    return "test_agent_cli_music_nerd_pro.py" in str(collection_path)
πŸ’¬ vince-leaf (Contributor Author): This tells pytest to skip (never collect) this test file during normal runs.



def pytest_configure(config):
    """
    Pytest hook: called once at the start of the test session.
    This function logs useful context about the test configuration.

    - Logs the repeat count from `--repeat` CLI option (default = 1)
    - Detects if pytest-xdist is enabled (i.e., running in parallel)
    """
    # Fetch repeat count from command-line option or default to 1
    repeat = config.getoption("repeat", default=1)

    # Check if we are in a worker process (i.e., xdist parallel run)
    is_parallel = hasattr(config, "workerinput")

    # Emit a log entry showing test mode
    logging.info(f"πŸ§ͺ Test mode: repeat={repeat}, parallel={is_parallel}")
    logging.info("Custom Environment Info")
    logging.info(f"thinking-file path : {THINKING_FILE_PATH}")


# This is a special pytest hook. Do not rename it!
# Pytest uses this to register custom CLI options.
def pytest_addoption(parser):
    """
    Defines CLI options:
      --connection: Limit tests to a specific connection (e.g., direct/grpc/http)
      --repeat: Repeat each test multiple times
      --thinking-file: Enables optional thinking_file logging
    """
    group = parser.getgroup("custom options")
    group.addoption("--connection", action="store", default=None,
                    help="Specify a connection to test: direct, grpc, or http.")
    group.addoption("--repeat", action="store", type=int, default=1,
                    help="Number of times to repeat each test.")
    group.addoption("--thinking-file", action="store_true", default=False,
                    help="Enable thinking_file output per test run.")


def pytest_generate_tests(metafunc):
    # πŸ›‘ Skip parametrization if running the orchestrator module (test_*.py)
    # This avoids injecting parameters into the orchestration entrypoint file,
    # which is responsible for launching tests, not running them directly.
    if metafunc.module.__name__.endswith("test_none"):
        return

    # βœ… Only proceed if the test function expects 'connection_name' as a fixture
    if "connection_name" in metafunc.fixturenames:
        # Load all available connection types from the HOCON config (e.g., ['grpc', 'http', 'direct'])
        all_connections = load_connections()
πŸ’¬ vince-leaf (Contributor Author): By default, all three connections are tested.


# Read CLI overrides (if any)
selected = metafunc.config.getoption("connection") # --connection grpc
repeat = metafunc.config.getoption("repeat") # --repeat 3

# πŸ” If a specific connection is requested, validate and filter
if selected:
if selected not in all_connections:
raise ValueError(f"Connection '{selected}' not in config: {all_connections}")
all_connections = [selected]

# πŸ§ͺ Build parameter combinations: (connection, repeat_index)
# -----------------------------------------------------------------------------
# This block is responsible for *generating the test matrix*.
# It determines how many test cases will be launched based on:
# - the list of connections (e.g., grpc, http, direct)
# - the --repeat CLI argument (e.g., --repeat 3)
#
# Example:
# If connections = ['grpc', 'http'] and repeat = 2, this will produce:
# - grpc_run1
# - grpc_run2
# - http_run1
# - http_run2
#
# These become individual pytest cases, allowing for:
# βœ… Parallel execution (when using `-n auto`)
# βœ… Fine-grained control over test case identifiers and logs
#
# The generated values are injected into the test function via parametrize.
test_params = [
pytest.param(conn, i, id=f"{conn}_run{i+1}")
for conn in all_connections
for i in range(repeat)
πŸ’¬ vince-leaf (Contributor Author): Generates the matrix of runners.

        ]

        # Inject parameters into the test function
        # This allows dynamic test generation using standard pytest features
        metafunc.parametrize("connection_name, repeat_index", test_params)


def load_connections():
    """
    Returns the list of connections from the test config.
    """
    return config.get("connection")
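The parametrization in `pytest_generate_tests` reduces to a simple cross-product of connections and repeat indices. A standalone sketch of the ID scheme it produces (`build_test_ids` is an illustrative name, not part of the conftest):

```python
def build_test_ids(connections, repeat):
    """Cross connections with repeat indices, mirroring the pytest.param IDs."""
    return [f"{conn}_run{i + 1}" for conn in connections for i in range(repeat)]


print(build_test_ids(["grpc", "http"], 2))
# β†’ ['grpc_run1', 'grpc_run2', 'http_run1', 'http_run2']
```

With `--repeat 2` and no `--connection` filter, each of the three connections therefore yields two independent pytest cases that xdist can schedule in parallel.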
13 changes: 13 additions & 0 deletions tests/e2e/requirements.txt
@@ -0,0 +1,13 @@
# Note: json, logging, os, sys, subprocess, and re are Python stdlib modules
# and need no pip entry; only third-party packages belong here.
pexpect
psutil
pyhocon
pytest
pytest-xdist
pytest-timeout
pytest-timer