Safe, Untrusted, Proof-Carrying AI Agents: towards the agentic lakehouse

Overview

This repo contains the code to reproduce the prototype from our paper "Safe, Untrusted, Proof-Carrying AI Agents: towards the agentic lakehouse", presented at S2AI@IEEE Big Data 2025. In particular, we use Bauplan as a programmable lakehouse (together with its MCP server) to showcase how an LLM-based agent can autonomously repair a data pipeline in a cloud lakehouse, safely and under human supervision.

If you're curious about the final result before diving into the code, you can check out this short video for a walkthrough of the prototype.

Setup

Bauplan API key

To use Bauplan, you need a free API key from the website. Once you have your key, follow the instructions to create a local configuration file.

Python environment

We use uv to manage the Python environment. Run

uv sync

to create the environment and install the dependencies for this project.

Environment variables

Create a .env file inside the src folder by copying the local.env file and filling in the API keys required by your inference provider of choice.
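Before launching anything, it can help to confirm that the .env file contains the keys you expect. The following is a minimal sketch using only the standard library; it is not part of this repo, and the key name ANTHROPIC_API_KEY is a hypothetical placeholder (check local.env for the actual variable names).

```python
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or left empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path("src") / ".env"
    # HYPOTHETICAL key name: replace with the names listed in local.env.
    required = ["ANTHROPIC_API_KEY"]
    if env_path.exists():
        found = parse_env_text(env_path.read_text())
        print("missing keys:", missing_keys(found, required))
    else:
        print(f"{env_path} not found; copy local.env first")
```

Run it from the repository root, so that the relative path src/.env resolves.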

MCP server

Get the MCP server from GitHub and follow the instructions to set it up. Start the MCP server with:

uv run python main.py --transport streamable-http --profile claudeagent

NOTE: we use claude as the model for this example and claudeagent as the corresponding Bauplan profile: make sure to change the relevant startup parameters in launch_agent.py (and possibly model_utils.py) if you wish to run with OpenAI or TogetherAI as inference provider.
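Since both the agent and the tests need the MCP server to be up, a quick reachability check can save a confusing failure later. The sketch below only tests whether a TCP port accepts connections; the host and port in the usage line are assumptions (this README does not state them), so use whatever address your MCP server prints at startup.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # ASSUMED address: replace with the host/port reported by the MCP server.
    host, port = "127.0.0.1", 8000
    state = "reachable" if is_port_open(host, port) else "not reachable"
    print(f"MCP server at {host}:{port} is {state}")
```

Note that an open port only shows that something is listening there, not that it is the Bauplan MCP server or that the claudeagent profile is valid.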

Run the "self-repairing pipeline" loop

Run the agent

When you launch the main agentic loop, the script first runs a faulty pipeline (the code is in bpln_pipeline) to create a failed job, and then runs the agent, prompting it to repair (generically) recent failed jobs (the exact request to the agent is in queries.py). Since we have just run a pipeline, we know that the request can be fulfilled. You can also check the Bauplan dashboard to verify that the run was indeed attempted and failed.

cd src
uv run python launch_agent.py
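The control flow described above (fail first, then ask the agent to repair) can be sketched as follows. This is an illustration under stated assumptions, not the code in launch_agent.py: JobResult, run_repair_demo, and the two injected callables are hypothetical names, and the stubs stand in for the real Bauplan run and the real agent call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class JobResult:
    """Hypothetical summary of a pipeline run."""
    job_id: str
    succeeded: bool


def run_repair_demo(
    run_pipeline: Callable[[], JobResult],
    run_agent: Callable[[str], str],
    query: str,
) -> str:
    """Run the faulty pipeline, check that it failed, then hand off to the agent."""
    job = run_pipeline()
    if job.succeeded:
        # Without a failed job there is nothing for the agent to repair.
        raise RuntimeError(f"job {job.job_id} succeeded; expected a failure")
    return run_agent(query)


if __name__ == "__main__":
    # Stubs only: the real script runs bpln_pipeline and an LLM-based agent.
    def fake_pipeline() -> JobResult:
        return JobResult(job_id="job-1", succeeded=False)

    def fake_agent(prompt: str) -> str:
        return f"agent received: {prompt}"

    print(run_repair_demo(fake_pipeline, fake_agent, "Repair recent failed jobs"))
```

Passing the pipeline and the agent as callables keeps the ordering logic testable without touching the lakehouse or an inference provider.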

Run the tests

We provide scenario-based tests to make sure the agent is calling the expected MCP-provided tools for certain predefined queries. To run the tests, make sure the MCP server is running (see above) and then run:

cd src
uv run pytest -vvv

NOTE: tests are currently only set up for the claudeagent profile.
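To give a sense of what a scenario-based check looks like, here is a minimal sketch of the idea: record the tools the agent called for a query, then assert that the expected ones are among them. AgentTrace, missing_tools, and the tool names are hypothetical and chosen for illustration; the actual tests in src may be structured differently and obtain the trace from a live agent run.

```python
from dataclasses import dataclass, field


@dataclass
class AgentTrace:
    """Minimal record of one agent run: the tool names it invoked, in order."""
    tool_calls: list[str] = field(default_factory=list)


def missing_tools(trace: AgentTrace, expected: set[str]) -> set[str]:
    """Return the expected tools that the agent never called."""
    return expected - set(trace.tool_calls)


def test_repair_query_calls_expected_tools():
    # HYPOTHETICAL trace and tool names; a real test would build the trace by
    # running the agent against the MCP server.
    trace = AgentTrace(tool_calls=["list_jobs", "get_job_logs", "create_branch"])
    expected = {"list_jobs", "get_job_logs"}
    assert missing_tools(trace, expected) == set()
```

Checking for a subset of expected tools, rather than an exact call sequence, tolerates the run-to-run variation that is typical of LLM-based agents.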

License

The code is released "as is" under the MIT License. See the LICENSE file for details.
