UN-3096 add 1st e2e test case by vince-leaf · Pull Request #179 · cognizant-ai-lab/neuro-san

vince-leaf · 2025-04-26T00:23:13Z

This PR sets up the general infrastructure for creating e2e tests (smoke tests, load tests, etc.).

Note: The e2e test in this PR does not use any of Dan's test infrastructure. Integrating it would be the next step.

This directory contains the complete end-to-end (E2E) test infrastructure for the music_nerd_pro agent, including configuration, reusable utilities, test cases, and server lifecycle control tools.

I significantly refactored my first version, whose tool files only started and stopped the server. I also added a base e2e test case that can be run manually with pytest, which let me build the smoke test described below.

📁 Directory Structure

tests/e2e/
├── README.md                      # ✅ You're here
├── configs/
│   └── config.hocon               # HOCON config defining all agent connections
├── conftest.py                    # Shared pytest setup, CLI options, parametrization, server startup
├── requirements.txt               # Pip requirements for test environment
├── test_cases_data/
│   └── mnpt_data.hocon            # Input data and expectations for the test runner
├── tests/
│   └── test_run_agent_cli_music_nerd_pro.py  # Main test case driver (used by orchestrators)
├── tools/
│   ├── smoke_test_runner.py       # Orchestrator: start → test → stop
│   ├── start_server_manual.py     # Manual: starts server and stores PID
│   ├── stop_all_servers.py        # Manual: stops all running agent servers from the PID file
│   └── stop_last_server.py        # Manual: stops only the most recently started server
└── utils/
    ├── logging_config.py          # Shared logging setup (file + console)
    ├── music_nerd_pro_hocon_loader.py  # Extracts structured test data from HOCON config
    ├── music_nerd_pro_output_parser.py # Parses CLI outputs for verification
    ├── music_nerd_pro_runner.py   # Executes the CLI test logic
    ├── server_manager.py          # Manages agent server lifecycle (start, stop, PID tracking)
    ├── server_state.py            # In-memory + file-based PID state tracking
    ├── thinking_file_builder.py   # Generates `thinking_file` argument path
    └── verifier.py                # Assertion helper for output validation

neuro_san/coded_tools/music_nerd_pro/accounting.py

neuro_san/registries/music_nerd_pro.hocon

vince-leaf · 2025-04-26T00:28:31Z

tests/e2e/README.md

+```bash
+e2e/
+├── README.md              # This documentation
+├── configs/                # Static agent configuration


Added README file

It's good that you have all your e2e stuff together under its own directory.

tests/e2e/configs/config.hocon

.github/workflows/tests.yml

vince-leaf · 2025-04-28T23:20:57Z

requirements-build.txt

+pexpect
+pyhocon
+pytest-xdist
+pytest-timeout


Added requirements for the e2e tests

Should these requirements go to tests/e2e/requirements.txt then?

vince-leaf · 2025-04-28T23:23:42Z

tests/e2e/conftest.py

+# conftest.py
+# ------------------------------------------------------------------------
+# Provides custom CLI flags, dynamic test generation, and environment setup.
+# Pytest configuration to share like MusicNerdPro test


pytest picks up this file automatically to set up the environment for the e2e test cases. It lives in the e2e directory and should stay there.
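For readers unfamiliar with how a conftest.py provides CLI flags and dynamic test generation, here is a minimal sketch using pytest's standard `pytest_addoption` and `pytest_generate_tests` hooks. The `--repeat` flag is a hypothetical name chosen for illustration and is not taken from this PR; `repeat_index` is borrowed from the runner code, but how the PR actually wires it up is an assumption.

```python
# conftest.py sketch. The "--repeat" option is hypothetical, not the PR's flag.


def pytest_addoption(parser):
    """Register a custom command-line flag with pytest."""
    parser.addoption(
        "--repeat",
        action="store",
        type=int,
        default=1,
        help="How many times to run each parametrized e2e test (hypothetical).",
    )


def pytest_generate_tests(metafunc):
    """Parametrize every test that requests a `repeat_index` argument."""
    if "repeat_index" in metafunc.fixturenames:
        repeat = metafunc.config.getoption("--repeat")
        # One generated test per index: 0, 1, ..., repeat - 1.
        metafunc.parametrize("repeat_index", range(repeat))
```

With hooks like these, running `pytest --repeat 3` would generate three instances of any test function that takes a `repeat_index` parameter.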

tests/e2e/pytest.ini

tests/e2e/requirements.txt

tests/e2e/test_cases_data/mnpt_data.hocon

vince-leaf · 2025-05-07T22:49:04Z

tests/e2e/tools/stop_all_servers.py

+
+    # --- Step 5: Double-check that PID file is removed (optional cleanup)
+    assert not os.path.exists(PID_FILE), f"❌ [SERVER] PID file still exists: {PID_FILE}"
+    print(f"🧹 [SERVER] PID file successfully removed: {PID_FILE}")


Cleans up the PID file.
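One thing to keep in mind with the `assert` above: Python strips `assert` statements when run with `-O`, so a tool script may prefer an explicit check. A minimal sketch of that alternative follows; `remove_pid_file` is a hypothetical helper name, not a function from this PR.

```python
import os


def remove_pid_file(pid_file: str) -> bool:
    """Delete pid_file if it exists; return True once it is gone."""
    try:
        os.remove(pid_file)
    except FileNotFoundError:
        pass  # Already removed, which is the state we want.
    return not os.path.exists(pid_file)
```

The caller can then print the success or failure message based on the return value instead of relying on an assertion.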

vince-leaf · 2025-05-07T22:50:47Z

tests/e2e/tools/stop_last_server.py

@@ -0,0 +1,60 @@
+# tests/e2e/tools/stop_last_server.py
+


Another option is to stop only the PID listed in the file, i.e. the server that was started by the start-server tool.
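A rough sketch of the "stop only the last one" idea, assuming the PID file holds one PID per line with the newest entry last. That file layout and the helper names `read_pids` and `stop_last_server` are assumptions for illustration; the real script may store and stop PIDs differently.

```python
import os
import signal
from typing import List, Optional


def read_pids(pid_file: str) -> List[int]:
    """Return the PIDs recorded in pid_file, oldest first (assumed one per line)."""
    if not os.path.exists(pid_file):
        return []
    with open(pid_file, encoding="utf-8") as handle:
        return [int(line) for line in handle if line.strip()]


def stop_last_server(pid_file: str) -> Optional[int]:
    """Send SIGTERM to the most recently recorded PID and drop it from the file."""
    pids = read_pids(pid_file)
    if not pids:
        return None
    last_pid = pids.pop()
    try:
        os.kill(last_pid, signal.SIGTERM)
    except ProcessLookupError:
        pass  # The process already exited; still remove the stale entry.
    with open(pid_file, "w", encoding="utf-8") as handle:
        handle.writelines(f"{pid}\n" for pid in pids)
    return last_pid
```

Rewriting the file after each stop keeps the remaining PIDs available for a later stop-all run.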

vince-leaf · 2025-05-07T22:53:30Z

tests/e2e/utils/logging_config.py

+
+# ------------------------------------------------------------------------
+# Shared Logging Setup for E2E Tests and CLI Runners
+# ------------------------------------------------------------------------


Logging utils

vince-leaf · 2025-05-07T22:54:38Z

tests/e2e/utils/music_nerd_pro_hocon_loader.py

+import os
+from pyhocon import ConfigFactory
+
+NAME_DATA_HOCON = "music_nerd_pro_data"


Loader for the Music Nerd Pro HOCON test data

vince-leaf · 2025-05-07T22:58:16Z

tests/e2e/utils/music_nerd_pro_output_parser.py

+def extract_agent_response(output: str, start_marker="Response from MusicNerdPro:"):
+    """
+    Extracts agent's response text based on a start marker.
+    Strips out trailing cost information or CLI prompts.


Extracts the agent's message from the Music Nerd Pro CLI output
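To make the marker-based extraction concrete, here is a minimal sketch of how such a function could work. The signature is extended with a hypothetical `end_markers` parameter, and both trailer strings are assumptions about what the CLI prints after a response; this is not the body from the PR.

```python
from typing import Optional, Sequence


def extract_agent_response(
    output: str,
    start_marker: str = "Response from MusicNerdPro:",
    # Assumed trailers: the cost dict and the next CLI prompt.
    end_markers: Sequence[str] = ("{'running_cost'", "Please enter"),
) -> Optional[str]:
    """Return the text after the last start_marker, cut at the first trailer."""
    start = output.rfind(start_marker)
    if start == -1:
        return None  # The agent never answered in this output.
    text = output[start + len(start_marker):]
    cut_points = [pos for pos in (text.find(m) for m in end_markers) if pos != -1]
    if cut_points:
        text = text[: min(cut_points)]
    return text.strip()
```

Using `rfind` picks the most recent response when the captured output contains several turns.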

vince-leaf · 2025-05-07T22:59:27Z

tests/e2e/utils/music_nerd_pro_output_parser.py

+        try:
+            parsed = json.loads(block.replace("'", '"'))
+            if "running_cost" in parsed:
+                return f"running_cost: {parsed['running_cost']}"


Loops over the agent output to extract the running cost from the Music Nerd Pro CLI
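A possible alternative to the quote swap before `json.loads`: if the CLI prints the block as a Python dict literal (an assumption based on the single quotes), `ast.literal_eval` can parse it directly and will not break on apostrophes inside string values. The sketch below uses a hypothetical function name and a simple regex that only matches non-nested `{...}` blocks.

```python
import ast
import re
from typing import Optional


def extract_running_cost(output: str) -> Optional[str]:
    """Return 'running_cost: <value>' from the first dict-like block that has it."""
    for block in re.findall(r"\{[^{}]*\}", output):
        try:
            parsed = ast.literal_eval(block)
        except (ValueError, SyntaxError):
            continue  # Not a Python literal; try the next block.
        if isinstance(parsed, dict) and "running_cost" in parsed:
            return f"running_cost: {parsed['running_cost']}"
    return None
```

`ast.literal_eval` only evaluates literals, so it is safe to run on untrusted CLI output in a way that `eval` is not.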

vince-leaf · 2025-05-07T23:01:50Z

tests/e2e/utils/music_nerd_pro_runner.py

+# mnpt_runner.py
+# ------------------------------------------------------------------------
+# CLI-based test runner: drives input/output to the MusicNerdPro agent CLI
+# ------------------------------------------------------------------------


This is the heart of the test: it drives agent_cli and simulates a user reading output and typing input at the terminal.

vince-leaf · 2025-05-07T23:03:02Z

tests/e2e/utils/music_nerd_pro_runner.py

+    thinking_file_arg = build_thinking_file_arg(conn, repeat_index, use_thinking_file)
+
+    # Build command to launch agent CLI
+    command = (


This command launches agent_cli from the active terminal.

vince-leaf · 2025-05-07T23:04:00Z

tests/e2e/utils/music_nerd_pro_runner.py

+    logging.info(f"[TEST] CMD: {command}")
+
+    # Start the agent CLI process
+    child = pexpect.spawn(command, encoding="utf-8", echo=False)


Uses pexpect to capture the text from the terminal.

vince-leaf · 2025-05-07T23:05:48Z