Fill docs with content #392
base: feature/docs-toc
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1 +1,41 @@ | ||
| # What is streams-bootstrap? | ||
|
|
||
| `streams-bootstrap` is a Java library that standardizes the development and operation of Kafka-based applications (Kafka | ||
| Streams and plain Kafka clients). | ||
|
|
||
| The framework supports Apache Kafka 4.1 and Java 17. Its modules are published to Maven Central for straightforward | ||
| integration into existing projects. | ||
|
|
||
| ## Why use it? | ||
|
|
||
| Kafka Streams and the core Kafka clients provide strong primitives for stream processing and messaging, but they do not | ||
| prescribe: | ||
|
|
||
| - How to structure a full application around those primitives | ||
| - How to configure applications consistently | ||
| - How to deploy and operate these services on Kubernetes | ||
| - How to perform repeatable reprocessing and cleanup | ||
| - How to handle errors and large messages uniformly | ||
|
|
||
| `streams-bootstrap` addresses these aspects by supplying: | ||
|
|
||
| 1. **Standardized base classes** for Kafka Streams and client applications. | ||
| 2. **A common CLI/configuration contract** for all Kafka applications. | ||
| 3. **Helm-based deployment templates** and conventions for Kubernetes. | ||
| 4. **Built-in reset/clean workflows** for reprocessing and state management. | ||
| 5. **Consistent error-handling** and dead-letter integration. | ||
| 6. **Testing infrastructure** for local development and CI environments. | ||
| 7. **Optional blob-storage-backed serialization** for large messages. | ||
|
|
||
| ## Architecture | ||
|
|
||
| The framework uses a modular architecture with a clear separation of concerns. | ||
|
|
||
| ### Core Modules | ||
|
|
||
| - `streams-bootstrap-core`: Core abstractions for application lifecycle, execution, and cleanup | ||
| - `streams-bootstrap-cli`: CLI framework based on `picocli` | ||
| - `streams-bootstrap-test`: Utilities for testing streams-bootstrap applications | ||
| - `streams-bootstrap-large-messages`: Support for handling large Kafka messages | ||
| - `streams-bootstrap-cli-test`: Test support for CLI-based applications | ||
|
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,15 +2,187 @@ | |
|
|
||
| ## Application types | ||
|
|
||
| - App | ||
| - ConfiguredApp | ||
| - ExecutableApp | ||
| In streams-bootstrap, there are three application types: | ||
|
|
||
| - **App** | ||
| - **ConfiguredApp** | ||
| - **ExecutableApp** | ||
|
|
||
| --- | ||
|
|
||
| ### App | ||
|
|
||
| The **App** represents your application logic implementation. Each application type has its own `App` interface: | ||
|
|
||
| - **StreamsApp** – for Kafka Streams applications | ||
| - **ProducerApp** – for producer applications | ||
| - **ConsumerApp** – for consumer applications | ||
| - **ConsumerProducerApp** – for consumer–producer applications | ||
|
|
||
| You implement the appropriate interface to define your application's behavior. | ||
|
|
||
| --- | ||
|
|
||
| ### ConfiguredApp | ||
|
|
||
| A **ConfiguredApp** pairs an `App` with its configuration. Examples include: | ||
|
|
||
| - `ConfiguredConsumerApp<T extends ConsumerApp>` | ||
| - `ConfiguredProducerApp<T extends ProducerApp>` | ||
|
|
||
| This layer handles Kafka property creation, combining: | ||
|
|
||
| - base configuration | ||
| - app-specific configuration | ||
| - environment variables | ||
| - runtime configuration | ||
|
|
||
| --- | ||
|
|
||
| ### ExecutableApp | ||
|
|
||
| An **ExecutableApp** is a `ConfiguredApp` with runtime configuration applied, making it ready to execute. | ||
| It can create: | ||
|
|
||
| - a **Runner** for running the application | ||
| - a **CleanUpRunner** for cleanup operations | ||
|
|
||
| --- | ||
|
|
||
| ### Usage Pattern | ||
|
|
||
| 1. You implement an **App**. | ||
| 2. The framework wraps it in a **ConfiguredApp**, applying the configuration. | ||
| 3. Runtime configuration is then applied to create an **ExecutableApp**, which can be: | ||
|
|
||
| - **run**, or | ||
| - **cleaned up**. | ||
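The three layers above can be pictured with a minimal, self-contained sketch. All type names here (`SketchApp`, `SketchConfiguredApp`, `SketchExecutableApp`) are hypothetical stand-ins invented for illustration, not the streams-bootstrap classes, and the merge order is simplified to base, app, then runtime configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an App: only the application logic and its own config.
interface SketchApp {
    Map<String, Object> createKafkaProperties();

    String name();
}

// Hypothetical stand-in for a ConfiguredApp: an app paired with base configuration.
final class SketchConfiguredApp {
    private final SketchApp app;
    private final Map<String, Object> baseConfig;

    SketchConfiguredApp(final SketchApp app, final Map<String, Object> baseConfig) {
        this.app = app;
        this.baseConfig = baseConfig;
    }

    // Applying runtime configuration yields an object that is ready to execute.
    SketchExecutableApp withRuntimeConfiguration(final Map<String, Object> runtimeConfig) {
        final Map<String, Object> merged = new HashMap<>(this.baseConfig);
        merged.putAll(this.app.createKafkaProperties());
        merged.putAll(runtimeConfig);
        return new SketchExecutableApp(this.app, merged);
    }
}

// Hypothetical stand-in for an ExecutableApp: can create a runner or a clean-up runner.
final class SketchExecutableApp {
    private final SketchApp app;
    private final Map<String, Object> kafkaProperties;

    SketchExecutableApp(final SketchApp app, final Map<String, Object> kafkaProperties) {
        this.app = app;
        this.kafkaProperties = kafkaProperties;
    }

    Map<String, Object> kafkaProperties() {
        return Map.copyOf(this.kafkaProperties);
    }

    Runnable createRunner() {
        return () -> System.out.println("running " + this.app.name());
    }

    Runnable createCleanUpRunner() {
        return () -> System.out.println("cleaning " + this.app.name());
    }
}

final class LayeringSketch {
    public static void main(final String[] args) {
        final SketchApp app = new SketchApp() {
            @Override
            public Map<String, Object> createKafkaProperties() {
                return Map.of("acks", "all");
            }

            @Override
            public String name() {
                return "demo-app";
            }
        };
        final SketchExecutableApp executable = new SketchConfiguredApp(app, Map.of("acks", "1"))
                .withRuntimeConfiguration(Map.of("bootstrap.servers", "kafka:9092"));
        executable.createRunner().run();
        executable.createCleanUpRunner().run();
    }
}
```

The point of the split is that application logic never needs to know where its configuration comes from, and the same executable object serves both the run and the clean-up path.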
|
|
||
| --- | ||
|
|
||
| ## Application lifecycle | ||
|
|
||
| - Running an application | ||
| - Cleaning an application | ||
| Applications built with streams-bootstrap follow a defined lifecycle with specific states and transitions. | ||
|
|
||
| The framework manages this lifecycle through the `KafkaApplication` base class and provides several extension points for | ||
| customization. | ||
|
|
||
| | Phase | Description | Entry Point | | ||
| |----------------|--------------------------------------------------------------------------|----------------------------------------------------------| | ||
| | Initialization | Parse CLI arguments, inject environment variables, configure application | `startApplication()` or `startApplicationWithoutExit()` | | ||
| | Preparation | Execute pre-run/pre-clean hooks | `onApplicationStart()`, `prepareRun()`, `prepareClean()` | | ||
| | Execution | Run main application logic or cleanup operations | `run()`, `clean()`, `reset()` | | ||
| | Shutdown | Stop runners, close resources, cleanup | `stop()`, `close()` | | ||
|
|
||
| ### Running an application | ||
|
|
||
| Applications built with streams-bootstrap can be started in two primary ways: | ||
|
|
||
| - **Via Command Line Interface**: When packaged as a runnable JAR (for example, in a container), | ||
| the `run` command is the default entrypoint. An example invocation: | ||
|
|
||
| ```bash | ||
| java -jar example-app.jar \ | ||
| run \ | ||
| --bootstrap-servers kafka:9092 \ | ||
| --input-topics input-topic \ | ||
| --output-topic output-topic \ | ||
| --schema-registry-url http://schema-registry:8081 | ||
| ``` | ||
|
|
||
| - **Programmatically**: The application subclass calls `startApplication(args)` on startup. Example for a Kafka Streams | ||
| application: | ||
|
|
||
| ```java | ||
| public static void main(final String[] args) { | ||
| new MyStreamsApplication().startApplication(args); | ||
| } | ||
| ``` | ||
|
|
||
| ### Cleaning an application | ||
|
|
||
| The framework provides a built-in mechanism to clean up all resources associated with an application. | ||
|
|
||
| When the cleanup operation is triggered, the following resources are removed: | ||
|
|
||
| **TODO:** extend the table for new consumer apps | ||
|
|
||
| | Resource Type | Description | Streams Apps | Producer Apps | | ||
|
Member
We need to extend the table for new consumer apps |
||
| |---------------------|-----------------------------------------------------------|--------------|---------------| | ||
| | Output Topics | The main output topic of the application | ✓ | ✓ | | ||
| | Intermediate Topics | Topics for stream operations like `through()` | ✓ | N/A | | ||
| | Internal Topics | Topics for state stores or repartitioning (Kafka Streams) | ✓ | N/A | | ||
| | Consumer Groups | Consumer group metadata | ✓ | N/A | | ||
| | Schema Registry | All registered schemas | ✓ | ✓ | | ||
|
Member
This is tied to topic deletion so we should state that somewhere in general (and manage topic clean up hooks) |
||
|
|
||
| Cleanup can be triggered: | ||
|
|
||
| - **Via Command Line**: by running the `clean` command, for example from a Helm cleanup job | ||
| - **Programmatically**: | ||
|
|
||
| ```java | ||
| // For streams applications | ||
| try (StreamsCleanUpRunner cleanUpRunner = streamsApp.createCleanUpRunner()) { | ||
|     cleanUpRunner.clean(); | ||
| } | ||
|
|
||
| // For producer applications | ||
| try (CleanUpRunner cleanUpRunner = producerApp.createCleanUpRunner()) { | ||
|     cleanUpRunner.clean(); | ||
| } | ||
| ``` | ||
|
|
||
| Cleanup operations are idempotent: running them again after a partial or failed cleanup is safe. | ||
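One common way to make a cleanup step safe to retry is to treat "already absent" as success rather than as an error. The sketch below illustrates that idea against an in-memory set of topic names. The class `IdempotentCleanUpSketch` and its methods are hypothetical, and this is not a description of how the framework implements cleanup internally.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration: deleting a resource that is already gone is a no-op.
final class IdempotentCleanUpSketch {
    private final Set<String> existingTopics;

    IdempotentCleanUpSketch(final Set<String> existingTopics) {
        this.existingTopics = new HashSet<>(existingTopics);
    }

    // Returns true if the topic was deleted by this call, false if it was already absent.
    boolean deleteTopicIfExists(final String topic) {
        return this.existingTopics.remove(topic);
    }

    Set<String> remainingTopics() {
        return Set.copyOf(this.existingTopics);
    }

    public static void main(final String[] args) {
        final IdempotentCleanUpSketch cleanUp = new IdempotentCleanUpSketch(Set.of("output-topic"));
        // The second call finds nothing to delete and simply reports that, without failing.
        System.out.println(cleanUp.deleteTopicIfExists("output-topic"));
        System.out.println(cleanUp.deleteTopicIfExists("output-topic"));
    }
}
```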
|
|
||
| ## Configuration | ||
|
|
||
| Kafka properties are applied in the following order (later values override earlier ones): | ||
|
|
||
| 1. Base configuration | ||
| 2. App configuration from `createKafkaProperties()` | ||
| 3. Environment variables with the `KAFKA_` prefix | ||
| 4. Runtime arguments (`--bootstrap-servers`, etc.) | ||
| 5. Serialization configuration from `ProducerApp.defaultSerializationConfig()` or `StreamsApp.defaultSerializationConfig()` | ||
| 6. CLI overrides via `--kafka-config` | ||
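The "later values override earlier ones" rule amounts to merging configuration layers in order. A minimal sketch, assuming each layer is available as a plain map: the class `PropertyPrecedenceSketch` and the example values are hypothetical and only illustrate the precedence rule, not the framework's actual merging code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of layered configuration with last-writer-wins semantics.
final class PropertyPrecedenceSketch {

    // Layers are applied first to last, so a key in a later layer replaces earlier values.
    static Map<String, Object> merge(final List<Map<String, Object>> layersInOrder) {
        final Map<String, Object> merged = new LinkedHashMap<>();
        for (final Map<String, Object> layer : layersInOrder) {
            merged.putAll(layer);
        }
        return merged;
    }

    public static void main(final String[] args) {
        final Map<String, Object> base = Map.of("acks", "1");
        final Map<String, Object> appConfig = Map.of("acks", "all", "linger.ms", 5);
        final Map<String, Object> cliOverrides = Map.of("linger.ms", 20);
        final Map<String, Object> merged = merge(List.of(base, appConfig, cliOverrides));
        System.out.println(merged.get("acks"));
        System.out.println(merged.get("linger.ms"));
    }
}
```

In this example the app layer replaces the base value for `acks`, and the CLI layer replaces the app value for `linger.ms`.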
|
|
||
| The framework automatically parses environment variables with the `APP_` prefix (configurable via `ENV_PREFIX`). | ||
| Environment variables are converted to CLI arguments: | ||
|
|
||
| ```text | ||
| APP_BOOTSTRAP_SERVERS → --bootstrap-servers | ||
| APP_SCHEMA_REGISTRY_URL → --schema-registry-url | ||
| APP_OUTPUT_TOPIC → --output-topic | ||
| ``` | ||
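The mapping shown above follows a simple naming rule: strip the prefix, lowercase the rest, and replace underscores with hyphens. A minimal sketch of that rule, assuming the conversion is purely name-based: the class `EnvToCliSketch` is hypothetical, and the framework's real parsing may differ in details such as ordering or value handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of turning prefixed environment variables into CLI arguments.
final class EnvToCliSketch {

    static List<String> toCliArgs(final Map<String, String> environment, final String prefix) {
        final List<String> cliArgs = new ArrayList<>();
        // A TreeMap gives a deterministic (alphabetical) order for the generated arguments.
        for (final Map.Entry<String, String> entry : new TreeMap<>(environment).entrySet()) {
            final String name = entry.getKey();
            if (!name.startsWith(prefix)) {
                continue; // variables without the prefix are ignored
            }
            final String option = "--" + name.substring(prefix.length())
                    .toLowerCase(Locale.ROOT)
                    .replace('_', '-');
            cliArgs.add(option);
            cliArgs.add(entry.getValue());
        }
        return cliArgs;
    }

    public static void main(final String[] args) {
        final Map<String, String> environment = Map.of(
                "APP_BOOTSTRAP_SERVERS", "kafka:9092",
                "APP_OUTPUT_TOPIC", "output-topic",
                "PATH", "/usr/bin");
        System.out.println(toCliArgs(environment, "APP_"));
    }
}
```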
|
|
||
| Additionally, Kafka-specific environment variables with the `KAFKA_` prefix are automatically added to the Kafka | ||
| configuration. | ||
|
|
||
| ### Schema Registry integration | ||
|
|
||
| When the `--schema-registry-url` option is provided: | ||
|
|
||
| - Schemas are registered automatically during application startup | ||
| - Schema cleanup is handled as part of the `clean` command | ||
| - Schema evolution is fully supported | ||
|
|
||
| ## Command line interface | ||
|
|
||
| The framework provides a unified command-line interface for application configuration. | ||
|
|
||
| ### CLI Commands | ||
|
|
||
| - `run`: Run the application | ||
| - `clean`: Delete topics and consumer groups | ||
| - `reset`: Reset internal state and offsets (for Streams apps) | ||
|
|
||
| ### Common CLI Configuration Options | ||
|
|
||
| - `--bootstrap-servers`: Kafka bootstrap servers (required) | ||
| - `--schema-registry-url`: URL of the Schema Registry (used for Avro serialization) | ||
| - `--kafka-config`: Additional Kafka configuration as `key=value` pairs | ||
vostres marked this conversation as resolved.
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,15 +1,120 @@ | ||
| # Producer apps | ||
| # Producer applications | ||
|
|
||
| Producer apps are applications that generate data and send it to a Kafka topic. | ||
| They can be used to produce messages from various sources, such as databases, files, or real-time events. | ||
| Producer applications generate data and send it to Kafka topics. They can be used to produce messages from various | ||
| sources, such as databases, files, or real-time events. | ||
|
|
||
| streams-bootstrap provides a structured way to build producer applications with consistent configuration handling, | ||
| command-line support, and lifecycle management. | ||
|
|
||
| --- | ||
|
|
||
| ## Application lifecycle | ||
|
|
||
| - Running an application | ||
| - Cleaning an application | ||
| ### Running an application | ||
|
|
||
| Producer applications are executed using the `ProducerRunner`, which runs the producer logic defined by the application. | ||
|
|
||
| Unlike Kafka Streams applications, producer applications can follow either of two execution models: | ||
|
|
||
| - Run to completion and terminate automatically, or | ||
| - Run continuously when implemented as long-lived services | ||
|
|
||
| The execution model is fully controlled by the producer implementation and its runnable logic. | ||
|
|
||
| --- | ||
|
|
||
| ### Cleaning an application | ||
|
|
||
| Producer applications support a dedicated `clean` command. | ||
|
|
||
| ```bash | ||
| java -jar my-producer-app.jar \ | ||
| --bootstrap-servers localhost:9092 \ | ||
| --output-topic my-topic \ | ||
| clean | ||
| ``` | ||
|
|
||
| The clean process can perform the following operations: | ||
|
|
||
| - Delete output topics | ||
| - Delete registered schemas from Schema Registry | ||
| - Execute custom cleanup hooks defined by the application | ||
|
|
||
| Applications can register custom cleanup logic by overriding `setupCleanUp`. | ||
|
|
||
| --- | ||
|
|
||
| ## Configuration | ||
|
|
||
| ### Serialization configuration | ||
|
|
||
| Producer applications define key and value serialization using the `defaultSerializationConfig()` method in their | ||
| `ProducerApp` implementation. | ||
|
|
||
| ```java | ||
|
|
||
| @Override | ||
| public SerializerConfig defaultSerializationConfig() { | ||
| return new SerializerConfig(StringSerializer.class, SpecificAvroSerializer.class); | ||
| } | ||
| ``` | ||
|
|
||
| ### Kafka properties | ||
|
|
||
| Producer-specific Kafka configuration can be customized by overriding `createKafkaProperties()`: | ||
|
|
||
| ```java | ||
|
|
||
| @Override | ||
| public Map<String, Object> createKafkaProperties() { | ||
| return Map.of( | ||
| ProducerConfig.ACKS_CONFIG, "all", | ||
|
Member
This is part of the base config. We should mention which base configs there are |
||
| ProducerConfig.RETRIES_CONFIG, 3, | ||
| ProducerConfig.BATCH_SIZE_CONFIG, 16384, | ||
| ProducerConfig.LINGER_MS_CONFIG, 5 | ||
| ); | ||
| } | ||
| ``` | ||
|
|
||
| These properties are merged with defaults and CLI-provided configuration. | ||
|
|
||
| --- | ||
|
|
||
| ### Lifecycle hooks | ||
|
|
||
| #### Clean up | ||
|
|
||
| Custom cleanup logic that is not tied to Kafka topics can be registered via cleanup hooks: | ||
|
|
||
| ```java | ||
|
|
||
| @Override | ||
| public void setupCleanUp(final EffectiveAppConfiguration configuration) { | ||
| configuration.addCleanupHook(() -> { | ||
| // Custom cleanup logic | ||
| }); | ||
| } | ||
| ``` | ||
|
|
||
| Topic-related cleanup should be implemented using topic hooks. | ||
|
|
||
| --- | ||
|
|
||
| ## Command line interface | ||
|
|
||
| Producer applications inherit standard CLI options from `KafkaApplication`. | ||
|
|
||
| ```text | ||
| --bootstrap-servers Kafka bootstrap servers (comma-separated) (Required) | ||
| --bootstrap-server Alias for --bootstrap-servers (Required) | ||
| --schema-registry-url URL of the Schema Registry (Optional) | ||
| --kafka-config Additional Kafka config (key=value,...) (Optional) | ||
| --output-topic Default output topic (Optional) | ||
|
Member
Only mention those defined for Producer apps |
||
| --labeled-output-topics Named output topics (label1=topic1,...) (Optional) | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Deployment | ||
|
|
||
| TODO | ||