This folder contains the config and code snippets referenced by the tutorial:
- Redpanda Connect HTTP ingestion -> clickstream
- Redpanda Iceberg Topics (configured on the topic) -> Iceberg tables
- FastAPI inference service -> /score
- Redpanda Connect scoring pipeline -> bid_requests -> bid_predictions
- Python bidder loop consumer -> bid_predictions

Files in this folder:
- connect/connect-clickstream.yaml
- connect/connect-score-bids.yaml
- inference/inference_service.py
- inference/requirements.txt
- bidder/bidder.py
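
Before starting, it can help to confirm that the working directory actually has the layout listed above. The following is a minimal sketch using only the standard library; the helper name `missing_files` is illustrative and not part of the tutorial code.

```python
from pathlib import Path

# Files this folder is expected to contain, as listed above.
EXPECTED_FILES = [
    "connect/connect-clickstream.yaml",
    "connect/connect-score-bids.yaml",
    "inference/inference_service.py",
    "inference/requirements.txt",
    "bidder/bidder.py",
]


def missing_files(root="."):
    """Return the expected files that are not present under root.

    Hypothetical helper for illustration; it only checks that each path
    exists as a regular file and does not inspect file contents.
    """
    base = Path(root)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

Run it from the folder root; an empty result means the Docker volume mounts and the `cd inference` step in the instructions should find their files.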

To run the tutorial end to end:
- Start the Redpanda Iceberg docker-compose lab: https://github.com/redpanda-data/redpanda-labs/tree/main/docker-compose/iceberg
- Create topics:

      rpk topic create clickstream --topic-config=redpanda.iceberg.mode=key_value
      rpk topic create bid_requests
      rpk topic create bid_predictions

- Run the clickstream ingestion Connect pipeline (Docker):

      docker run --rm -it --network redpanda-labs_default -p 4196:4196 \
        -v "$PWD/connect/connect-clickstream.yaml:/connect.yaml:ro" \
        docker.redpanda.com/redpandadata/connect:latest run /connect.yaml

- Train the model in Jupyter and save it as /home/jovyan/work/bid_value_model.joblib, then copy it to inference/:

      docker cp spark-iceberg:/home/jovyan/work/bid_value_model.joblib ./inference/bid_value_model.joblib

- Run the inference service on the host:

      cd inference
      python3 -m venv .venv && source .venv/bin/activate
      pip install -r requirements.txt
      uvicorn inference_service:app --host 0.0.0.0 --port 8000

- Run the scoring Connect pipeline (Docker):

      docker run --rm -it --network redpanda-labs_default \
        -v "$PWD/connect/connect-score-bids.yaml:/connect.yaml:ro" \
        docker.redpanda.com/redpandadata/connect:latest run /connect.yaml

- Produce bid requests:

      echo '{"request_id":"req-1","ad_id":55,"campaign_id":9,"event_type":"view"}' | rpk topic produce bid_requests --format '%v\n'

- Consume predictions:

      rpk topic consume bid_predictions -n 2 -f '%k %v\n'

- Run the bidder loop:

      pip install kafka-python
      python bidder/bidder.py

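
The bidder loop in the last step amounts to consuming bid_predictions and turning each prediction into a bid decision. The sketch below is not the contents of bidder/bidder.py; it is a minimal illustration built on kafka-python's `KafkaConsumer`. The broker address `localhost:19092`, the group id, the `predicted_value` field name, and the `0.5` threshold are all assumptions to adjust for your setup.

```python
import json


def decide_bid(prediction, threshold=0.5):
    """Return True when the predicted value meets the threshold.

    Assumption: the prediction record is a JSON object with a numeric
    "predicted_value" field. The real field name depends on what the
    scoring pipeline writes to bid_predictions.
    """
    if not isinstance(prediction, dict):
        return False
    value = prediction.get("predicted_value")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return value >= threshold


def run(bootstrap_servers="localhost:19092", topic="bid_predictions"):
    """Consume predictions forever and print a decision for each one."""
    # Imported here so decide_bid stays usable without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,  # assumed broker address
        group_id="bidder-sketch",             # assumed consumer group id
        auto_offset_reset="earliest",
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")) if raw else None,
    )
    for message in consumer:
        record = message.value
        if not isinstance(record, dict):
            continue  # skip empty or non-object payloads
        action = "BID" if decide_bid(record) else "SKIP"
        print(action, record.get("request_id"))
```

Calling `run()` starts the loop; stop it with Ctrl+C. Keeping the decision in a separate pure function makes the bidding rule easy to test without a broker.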
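
Note: the scoring Connect pipeline calls http://host.docker.internal:8000/score. That hostname resolves automatically under Docker Desktop on macOS and Windows.
On Linux it may not resolve by default. The usual adjustment is to add --add-host=host.docker.internal:host-gateway to the docker run command for the scoring pipeline (Docker 20.10 or later). Host networking is an alternative, but the container then leaves the redpanda-labs_default network, so the broker address in connect/connect-score-bids.yaml would also need to point at a port exposed on the host.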
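
To check the inference service independently of the Connect pipeline, a request can be sent to /score directly from the host. The following is a minimal sketch with the standard library. It assumes the service accepts a JSON POST body shaped like the sample bid request in the steps above and replies with JSON; the response fields are not specified in this folder's notes, so the sketch returns whatever comes back. The function names are illustrative.

```python
import json
import urllib.request

# Same shape as the sample record produced to bid_requests above.
SAMPLE_REQUEST = {
    "request_id": "req-1",
    "ad_id": 55,
    "campaign_id": 9,
    "event_type": "view",
}


def build_score_request(payload, url="http://localhost:8000/score"):
    """Build (but do not send) a JSON POST request for the /score endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score(payload, url="http://localhost:8000/score", timeout=5.0):
    """Send the payload to /score and return the decoded JSON response.

    Assumption: the service answers POST with a JSON body.
    """
    request = build_score_request(payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With uvicorn running, `print(score(SAMPLE_REQUEST))` from the host shows the raw prediction. If that works but the Connect pipeline still fails, the problem is more likely the container-to-host hostname than the service itself.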