Chatbot Tester

Automated testing tool for web chatbots with multi-project support, local AI, and advanced reporting.

Getting Started

New here?	Start with
5-minute setup	Quick Start Guide
Detailed configuration	New Project Guide
All options	Configuration Reference

# Clone and install
git clone https://github.com/corradofrancolini/chatbot-tester.git
cd chatbot-tester && python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt && playwright install chromium

# Create your first project
python run.py --new-project

Features

Multi-project: Test different chatbots from the same installation
3 Modes: Train, Assisted, Auto for each testing phase
Local AI: Ollama for privacy-first analysis
Flexible Reports: Local HTML + optional Google Sheets
Report Export: PDF, Excel, HTML, CSV
Notifications: Desktop (macOS), Email, Microsoft Teams
Full Screenshots: Capture the entire conversation with all products visible
Single-Turn Mode: Execute only the initial question, without follow-ups
LangSmith Integration: Advanced chatbot response debugging
Testing Analysis: A/B comparison, regressions, flaky tests
Parallel Execution: Multi-browser for fast testing
Scheduled Runs: Local cron and GitHub Actions
Performance Metrics: Timing, throughput, latency tracking with alerting
Bilingual: Italian and English
Health Check: Service verification before execution
Cloud Execution: Run tests on GitHub Actions without local Chromium
Docker Ready: Ready-to-use container

Quick Start

1. Installation

# Clone the repository
git clone https://github.com/corradofrancolini/chatbot-tester.git
cd chatbot-tester

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt
playwright install chromium

2. Run tests

# Auto mode - new run
python run.py -p my-chatbot -m auto --no-interactive --new-run

# Auto mode - continue existing run
python run.py -p my-chatbot -m auto --no-interactive

# Train mode (learning)
python run.py -p my-chatbot -m train

CLI Commands

Test Execution

# New full run (creates new Google Sheets sheet)
python run.py -p <project> -m auto --no-interactive --new-run

# Continue existing run (pending tests only)
python run.py -p <project> -m auto --no-interactive

# Execute single test
python run.py -p <project> -m auto --no-interactive -t TEST_050

# Re-run single test (overwrite)
python run.py -p <project> -m auto --no-interactive -t TEST_050 --tests all

# Re-run all failed tests
python run.py -p <project> -m auto --no-interactive --tests failed

# Re-run all tests (overwrite)
python run.py -p <project> -m auto --no-interactive --tests all
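
When these commands are driven from a script (a scheduler or a CI step, for example), it can help to assemble the argument list in code. The following is a minimal sketch that uses only the flags shown above; `build_run_command` is a hypothetical helper, not part of chatbot-tester, and it assumes `run.py` is in the current directory.

```python
import sys


def build_run_command(project, mode="auto", test_id=None, tests=None, new_run=False):
    """Assemble a run.py argument list from the flags documented above.

    Hypothetical helper for illustration; not part of chatbot-tester.
    """
    cmd = [sys.executable, "run.py", "-p", project, "-m", mode, "--no-interactive"]
    if test_id:
        cmd += ["-t", test_id]
    if tests:
        if tests not in ("all", "pending", "failed"):
            raise ValueError(f"unknown --tests value: {tests}")
        cmd += ["--tests", tests]
    if new_run:
        cmd.append("--new-run")
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it.
    print(" ".join(build_run_command("my-chatbot", tests="failed")[1:]))
```

To execute the command instead of printing it, the returned list can be passed to `subprocess.run(cmd, check=True)`.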

Report Export

# Export last run to HTML
python run.py -p <project> --export html

# Export specific run to PDF
python run.py -p <project> --export pdf --export-run 15

# Export all formats
python run.py -p <project> --export all

Performance Metrics

# Show performance report for last run
python run.py -p <project> --perf-report

# Historical dashboard (last N runs)
python run.py -p <project> --perf-dashboard 10

# Compare two runs (e.g., local vs cloud)
python run.py -p <project> --perf-compare 15:16

# List all runs from all projects
python run.py --list-runs
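
The `--perf-compare` value above is two run numbers joined by a colon. How `run.py` parses it is not documented here; the sketch below shows one plausible way to validate such a value before passing it on. `parse_run_pair` is a hypothetical name used only for illustration.

```python
def parse_run_pair(value):
    """Split a string such as "15:16" into two run numbers.

    Hypothetical helper; the project's own parsing may differ.
    """
    left, sep, right = value.partition(":")
    if not sep:
        raise ValueError(f"expected RUN_A:RUN_B, got {value!r}")
    try:
        return int(left), int(right)
    except ValueError:
        raise ValueError(f"run numbers must be integers, got {value!r}") from None


if __name__ == "__main__":
    run_a, run_b = parse_run_pair("15:16")
    print(f"comparing run {run_a} with run {run_b}")
```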

Notifications

# Test desktop notification
python run.py -p <project> --test-notify

# Send notification after run
python run.py -p <project> -m auto --no-interactive --notify desktop

Options

Option	Description	Default
`-p, --project`	Project name	-
`-m, --mode`	Mode: train, assisted, auto	train
`-t, --test`	Single test ID to execute	-
`--tests`	Which tests: all, pending, failed	pending
`--new-run`	Create new run on Google Sheets	false
`--no-interactive`	Non-interactive execution	false
`--dry-run`	Simulate without executing	false
`--health-check`	Check services and exit	false
`--skip-health-check`	Skip service verification	false
`--headless`	Browser in headless mode	false
`--lang`	Interface language: it, en	it
`--debug`	Detailed debug output	false
`--export`	Export report: pdf, excel, html, csv, all	-
`--export-run`	Run number to export	latest
`--notify`	Send notification: desktop, email, teams, all	-
`--test-notify`	Test notification configuration	false
`--perf-report`	Show performance report	-
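`--perf-dashboard`	Historical performance dashboard (last N runs)	-
`--perf-compare`	Compare two runs, given as A:B	-
`--list-runs`	List recent runs from all projects	-
`--watch-cloud`	Monitor a cloud run with a progress bar	-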
`-v, --version`	Show version	-

For the complete guide to all configuration options, see docs/CONFIGURATION.md.
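
To make the defaults in the table concrete, here is a small `argparse` sketch that mirrors a subset of the options. It is an illustration of how the documented flags and defaults fit together, not the parser that `run.py` actually uses.

```python
import argparse


def build_parser():
    """Parser mirroring part of the options table (illustrative only)."""
    parser = argparse.ArgumentParser(prog="run.py")
    parser.add_argument("-p", "--project", help="Project name")
    parser.add_argument("-m", "--mode", choices=["train", "assisted", "auto"], default="train")
    parser.add_argument("-t", "--test", help="Single test ID to execute")
    parser.add_argument("--tests", choices=["all", "pending", "failed"], default="pending")
    parser.add_argument("--new-run", action="store_true")
    parser.add_argument("--no-interactive", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--headless", action="store_true")
    parser.add_argument("--lang", choices=["it", "en"], default="it")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-p", "my-chatbot", "-m", "auto", "--no-interactive"])
    # Options that were not passed fall back to the defaults from the table.
    print(args.mode, args.tests, args.lang)  # prints: auto pending it
```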

Test Modes

Mode	Description	When to use
Train	You execute tests manually while the tool learns	Initial setup
Assisted	AI suggests, you confirm	Validation, corrections
Auto	Fully automatic	Regression testing

Configuration

run_config.json

Each project has a projects/<name>/run_config.json file:

{
  "env": "DEV",
  "active_run": 15,
  "mode": "auto",
  "use_langsmith": true,
  "use_ollama": true,
  "single_turn": true
}

Option	Description	Accessible from
`env`	Environment: DEV, STAGING, PROD	Menu > Configure
`active_run`	Active run number on Google Sheets	Automatic
`dry_run`	Simulate without executing	Menu > Toggle
`use_langsmith`	Enable LangSmith tracing	Menu > Toggle
`use_rag`	Enable RAG retrieval	Menu > Toggle
`use_ollama`	Enable Ollama evaluation	Menu > Toggle
`single_turn`	Initial question only, no follow-ups	Menu > Toggle
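
For scripts that need to read this file, the sketch below loads `run_config.json` and fills in any keys that are absent. The fallback values in `ASSUMED_DEFAULTS` are assumptions chosen for illustration (the project's real defaults may differ), and `load_run_config` is a hypothetical helper.

```python
import json
from pathlib import Path

# Assumed fallbacks for illustration only; not taken from the project.
ASSUMED_DEFAULTS = {
    "env": "DEV",
    "dry_run": False,
    "use_langsmith": False,
    "use_rag": False,
    "use_ollama": False,
    "single_turn": False,
}
VALID_ENVS = {"DEV", "STAGING", "PROD"}


def load_run_config(project_dir):
    """Read <project_dir>/run_config.json and merge it over the assumed defaults."""
    path = Path(project_dir) / "run_config.json"
    config = dict(ASSUMED_DEFAULTS)
    config.update(json.loads(path.read_text(encoding="utf-8")))
    if config["env"] not in VALID_ENVS:
        raise ValueError(f"env must be one of {sorted(VALID_ENVS)}, got {config['env']!r}")
    return config
```

Usage would look like `load_run_config("projects/my-chatbot")["single_turn"]`. Since `active_run` is managed automatically, a script like this is better suited to reading the file than to editing it.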

Runtime Toggles

From interactive menu: Project > Toggle Options

[1] Dry Run:      OFF  (simulate without executing)
[2] LangSmith:    ON   (tracing active)
[3] RAG:          OFF  (disabled)
[4] Ollama:       ON   (AI evaluation)
[5] Single Turn:  ON   (initial question only)

Screenshots

Screenshots capture the entire conversation with all products visible.

Automatically hides: input bar, footer, scroll indicators
Expands containers to show all content
Saves to: reports/<project>/run_<N>/screenshots/
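
The hide-then-capture approach described above can be sketched with Playwright's sync API: inject a stylesheet that hides the unwanted elements, then take a full-page screenshot. This is not the project's implementation. The function names are hypothetical, the selectors are placeholders that depend on the chatbot's markup, and expanding scrollable containers is not covered.

```python
def build_hide_css(selectors):
    """Return one CSS rule that hides every selector in the list."""
    return ", ".join(selectors) + " { display: none !important; }"


def capture_conversation(page, output_path, hide_selectors):
    """Hide UI chrome, then take a full-page screenshot.

    `page` is expected to be a Playwright sync-API Page object.
    """
    if hide_selectors:
        page.add_style_tag(content=build_hide_css(hide_selectors))
    page.screenshot(path=output_path, full_page=True)
```

With a real page, a call might look like `capture_conversation(page, "TEST_050.png", [".input-bar", "footer"])`, where both selectors and the file name are placeholders.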

Project Structure

chatbot-tester/
├── run.py                  # Entry point
├── CLAUDE.md               # Project notes for Claude Code
│
├── config/
│   ├── .env                # Credentials (gitignored)
│   └── settings.yaml       # Global settings
│
├── projects/               # Configured projects
│   └── <project-name>/
│       ├── project.yaml    # Chatbot configuration
│       ├── tests.json      # Test cases
│       ├── run_config.json # Current run state
│       └── browser-data/   # Browser session
│
├── reports/                # Local reports
│   └── <project-name>/
│       └── run_<N>/
│           ├── report.html
│           ├── screenshots/
│           └── performance/
│
└── src/                    # Source code
    ├── browser.py          # Playwright automation
    ├── tester.py           # Test logic
    ├── config_loader.py    # Configuration management
    ├── export.py           # Export PDF, Excel, HTML, CSV
    ├── notifications.py    # Desktop, Email, Teams notifications
    └── performance.py      # Performance metrics collection
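
As a quick sanity check of the layout above, the following sketch reports which of the three per-project files are missing. `missing_project_files` is a hypothetical helper, and treating all three files as required is an assumption.

```python
from pathlib import Path

# Per-project files listed in the tree above; assumed here to be mandatory.
REQUIRED_FILES = ("project.yaml", "tests.json", "run_config.json")


def missing_project_files(project_name, root="projects"):
    """Return the names of expected files that are absent from the project folder."""
    project_dir = Path(root) / project_name
    return [name for name in REQUIRED_FILES if not (project_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_project_files("my-chatbot")
    print("missing:", ", ".join(missing) if missing else "nothing")
```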

Integrations

Ollama (Local AI)

# Install Ollama
brew install ollama

# Start service
ollama serve

# Download model
ollama pull llama3.2:3b
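
Once the service is running, the model can be queried over Ollama's local HTTP API. The sketch below is a minimal standalone client and says nothing about how chatbot-tester itself talks to Ollama. It assumes the default address `localhost:11434`, and the function names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_generate_request(prompt, model="llama3.2:3b"):
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(prompt, model="llama3.2:3b", timeout=60):
    """Send the prompt and return the generated text (requires `ollama serve`)."""
    with urllib.request.urlopen(build_generate_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```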

Google Sheets

Create a project in the Google Cloud Console
Enable the Google Sheets API and the Google Drive API
Create OAuth 2.0 credentials
Configure the credentials in config/.env

LangSmith

Create an account on smith.langchain.com
Generate an API key
Configure the key in config/.env

Notifications

Configure in config/settings.yaml:

notifications:
  desktop:
    enabled: true      # macOS native
    sound: true
  email:
    enabled: false
    smtp_host: "smtp.gmail.com"
    smtp_port: 587
    smtp_user: "your@email.com"
    recipients: ["team@email.com"]
  teams:
    enabled: false
    webhook_url_env: "TEAMS_WEBHOOK_URL"  # Environment variable
  triggers:
    on_complete: true  # Notify on run completion
    on_failure: true   # Notify on failures

Microsoft Teams: Create an Incoming Webhook in the target Teams channel, then set the environment variable:

export TEAMS_WEBHOOK_URL="https://outlook.office.com/webhook/..."
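
To confirm the webhook works independently of the tool, a message can be posted to it directly. This sketch assumes the webhook accepts a simple JSON body with a `text` field, as classic Incoming Webhooks do; `build_teams_request` and `send_teams_message` are hypothetical names, not functions from `src/notifications.py`.

```python
import json
import os
import urllib.request


def build_teams_request(text, webhook_url=None):
    """Build a POST request carrying a plain-text message for a Teams webhook."""
    url = webhook_url or os.environ.get("TEAMS_WEBHOOK_URL")
    if not url:
        raise RuntimeError("TEAMS_WEBHOOK_URL is not set")
    return urllib.request.Request(
        url,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_teams_message(text, timeout=10):
    """Post the message and return the HTTP status code."""
    with urllib.request.urlopen(build_teams_request(text), timeout=timeout) as resp:
        return resp.status
```

Reading the URL from the environment keeps the secret out of `settings.yaml`, which matches the `webhook_url_env` setting above.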

Deployment

Run tests without local Chromium. See docs/DEPLOYMENT.md for the complete guide.

GitHub Actions (recommended)

# Install GitHub CLI
brew install gh

# Launch tests in the cloud
gh workflow run chatbot-test.yml -f project=my-chatbot -f mode=auto

# Monitor cloud run with progress bar
python run.py --watch-cloud

Docker

# Build
docker build -t chatbot-tester .

# Run (absolute host path for the bind mount; relative paths fail on older Docker versions)
docker run -v "$(pwd)/projects:/app/projects" chatbot-tester -p my-chatbot -m auto

Health Check

# Verify services before execution
python run.py --health-check -p my-chatbot
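
Which services `--health-check` verifies is not listed here. As a rough illustration of the idea, the sketch below runs a set of named checks and reports each as up or down. The only concrete check, `ollama_is_up`, assumes Ollama's default address; both function names are hypothetical.

```python
import urllib.request


def ollama_is_up(base_url="http://localhost:11434", timeout=2):
    """Return True if the local Ollama API answers on its model-list endpoint."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, refused connections, and timeouts
        return False


def run_health_checks(checks):
    """Run each named check; a check that raises counts as a failure."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in run_health_checks({"ollama": ollama_is_up}).items():
        print(f"{name}: {'OK' if ok else 'DOWN'}")
```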

Documentation

Guide	Description
QUICKSTART.md	5-minute setup guide
NEW_PROJECT.md	Detailed project configuration
CONFIGURATION.md	Complete guide to all options
DEPLOYMENT.md	Deploy on Docker, GitHub Actions, PyPI
TROUBLESHOOTING.md	Common issues and solutions
CLAUDE.md	Development notes for Claude Code

Troubleshooting

Browser doesn't open

source .venv/bin/activate
playwright install chromium

"Module not found" error

source .venv/bin/activate
pip install -r requirements.txt

Session expired

# Delete the saved browser session, then run again to start a fresh one
rm -rf projects/<name>/browser-data/
python run.py -p <name>

License

MIT License - see LICENSE for details.
