1. Manually generate REAL-WORLD RESULTS that mcp-testing must return.
2. Add grouping so that CRUD tests are effectively idempotent, and thus can be run repeatedly.
3. Testing /stateless must explicitly fail if the Docker container isn't available.
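
One possible shape for items 2 and 3, as a minimal sketch rather than the project's actual code. All names here (`crud_group`, `require_container`, `ContainerUnavailable`, the container name passed in) are hypothetical, and a plain dict stands in for whatever backend the CRUD tests really hit. The idea is that each group creates a uniquely named record and always deletes it afterwards, so reruns start from the same state, and that a missing container raises an error instead of being skipped silently.

```python
import contextlib
import subprocess
import uuid


class ContainerUnavailable(RuntimeError):
    """Hypothetical error: the container needed by the /stateless tests is not running."""


def _docker_running_names():
    # `docker ps --format '{{.Names}}'` prints one running container name per line.
    out = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True, text=True, check=True, timeout=10,
    )
    return out.stdout.split()


def require_container(name, list_running=_docker_running_names):
    """Raise (never skip) unless a container called `name` is running.

    `list_running` is injectable so the check itself can be tested without Docker.
    """
    try:
        running = list_running()
    except (OSError, subprocess.SubprocessError) as exc:
        # Covers a missing docker binary, a non-zero exit, and a timeout.
        raise ContainerUnavailable(f"could not query docker: {exc}") from exc
    if name not in running:
        raise ContainerUnavailable(f"container {name!r} is not running")


@contextlib.contextmanager
def crud_group(store, prefix="crud-test"):
    """Create a uniquely named record, hand its key to the test, then always delete it.

    `store` is an assumption: a dict standing in for the real backend.
    """
    key = f"{prefix}-{uuid.uuid4().hex}"
    store[key] = {"status": "created"}      # Create
    try:
        yield key                           # Read and Update happen in the test body
    finally:
        store.pop(key, None)                # Delete, even if the test body raised


if __name__ == "__main__":
    store = {}
    for _ in range(2):                      # running the group twice leaves no residue
        with crud_group(store) as key:
            store[key]["status"] = "updated"
    print("records left behind:", len(store))
```

In a pytest setup, the same two pieces would probably become fixtures: `require_container` called from a session-scoped fixture used by the /stateless tests, and `crud_group` wrapped in a fixture that yields the key.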