Skip to content

Conversation

@cosmo0920
Copy link
Contributor

@cosmo0920 cosmo0920 commented Jan 30, 2026

Currently, filter_log_to_metrics frequently allocates heap memory.
This causes memory fragmentation and take a longer time to allocate memory which corresponds to running period.
Instead, we need to optimize this kind of heap memory allocations and suppress CPU stale for waiting I/O operations for memory.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Increased maximum label capacity from 32 to 128.
    • Introduced pre-allocated runtime label structures and pre-created accessors for faster, allocation-free filtering.
    • Added emitter aliasing when an explicit emitter name is not provided.
  • Bug Fixes

    • Reworked label setup and validation with overflow/mismatch checks.
    • Fixed resource cleanup to free runtime label buffers and accessors, preventing memory leaks.

✏️ Tip: You can customize this high-level summary in your review settings.

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
…llocations

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
@coderabbitai
Copy link

coderabbitai bot commented Jan 30, 2026

📝 Walkthrough

Walkthrough

Refactors log_to_metrics to pre-allocate runtime label structures: introduces label counting and preparation helpers, replaces per-call dynamic label construction with pre-created record accessors and contiguous buffers, updates Kubernetes label handling, and consolidates emitter aliasing and cleanup with stronger validation.

Changes

Cohort / File(s) Summary
Header Structure Updates
plugins/filter_log_to_metrics/log_to_metrics.h
Added <stddef.h>, raised MAX_LABEL_COUNT from 32 to 128, and extended struct log_to_metrics_ctx with label_ras, label_values_buf, and label_values for pre-allocated runtime label storage.
Label runtime helpers & init/refactor
plugins/filter_log_to_metrics/log_to_metrics.c
Added count_labels() and prepare_label_runtime(); converted set_labels() to a two-pass flow; replaced per-call label allocation with pre-allocated buffers and ctx->label_ras accessors; created/destroyed record accessors at init/destroy; added overflow/mismatch checks and strengthened validation.
Filter path & emitter wiring
plugins/filter_log_to_metrics/log_to_metrics.c
Updated cb_log_to_metrics_init() and cb_log_to_metrics_filter() to use ctx->label_counter, ctx->label_keys, ctx->label_accessors, ctx->label_ras, and ctx->label_values instead of building labels per-call; introduced emitter aliasing derived from filter name and adjusted emitter validation/configuration.
Removed legacy logic
plugins/filter_log_to_metrics/log_to_metrics.c
Removed legacy fill_labels logic and associated per-call label construction code paths; adjusted teardown to free new runtime buffers and accessors.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

backport to v4.0.x, backport to v4.1.x

Suggested reviewers

  • edsiper
  • koleini
  • fujimotos

Poem

🐰 I hopped through labels, one by one—
Counting, packing, then they’re done.
Buffers ready, accessors born,
No more per-call chaos at dawn,
A tidy harvest — hop, hop, hooray! 🥕🐇

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.27% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main objective: optimizing memory allocations in the filter_log_to_metrics plugin through pre-allocation strategies and reduced runtime fragmentation.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch cosmo0920-use-optimized-memory-allocations-on-log_to_metrics

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@plugins/filter_log_to_metrics/log_to_metrics.c`:
- Around line 810-821: The code assigns the result of flb_sds_printf directly to
emitter_alias_tmp which can return NULL and cause the original SDS to leak;
instead, call flb_sds_printf into a temporary pointer (e.g., tmp), check if tmp
is NULL, and if so call flb_sds_destroy(emitter_alias_tmp), flb_errno(),
log_to_metrics_destroy(ctx) and return -1; on success assign emitter_alias_tmp =
tmp. This uses the existing symbols emitter_alias_tmp, flb_sds_create_size,
flb_sds_printf, flb_sds_destroy and preserves current error handling via
log_to_metrics_destroy(ctx).

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plugins/filter_log_to_metrics/log_to_metrics.c (2)

360-475: ⚠️ Potential issue | 🟡 Minor

Tighten the bounds guard before indexing label arrays.
If the computed total and fill pass ever diverge (e.g., config mutation or unexpected properties), the current > check can still allow one out-of-bounds write before the final mismatch check. Use >= to fail fast before indexing.

🛠️ Proposed fix
-        if (counter > ctx->label_counter) {
+        if (counter >= ctx->label_counter) {
             flb_plg_error(ctx->ins, "internal label counter overflow");
             return -1;
         }

956-1079: ⚠️ Potential issue | 🟠 Major

Confirm thread-safety vulnerability in shared ctx->label_values buffer.

The label_values buffer is allocated once per filter instance and reused across all concurrent invocations from multiple input sources. Since Fluent Bit has input worker threads that independently process chunks and invoke filters (via flb_filter_do), multiple workers can simultaneously call cb_log_to_metrics_filter with the same ctx. The vulnerable window is between writing to label_values (lines 982–999) and passing it to cmt_counter_inc, cmt_gauge_set, or cmt_histogram_observe (lines 1009, 1030, 1051). A concurrent writer can corrupt label values mid-operation.

Locking exists on chunks and tasks but not on filter instances or their context. To fix: either allocate label_values per-invocation (stack or local scope), use thread-local storage, add a mutex around the vulnerable window, or buffer label values before the cmt call completes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants