-
Notifications
You must be signed in to change notification settings - Fork 1.9k
filter_log_to_metrics: Use optimized memory allocations #11414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
filter_log_to_metrics: Use optimized memory allocations #11414
Conversation
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
…llocations Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
📝 WalkthroughWalkthroughRefactors log_to_metrics to pre-allocate runtime label structures: introduces label counting and preparation helpers, replaces per-call dynamic label construction with pre-created record accessors and contiguous buffers, updates Kubernetes label handling, and consolidates emitter aliasing and cleanup with stronger validation. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@plugins/filter_log_to_metrics/log_to_metrics.c`:
- Around line 810-821: The code assigns the result of flb_sds_printf directly to
emitter_alias_tmp which can return NULL and cause the original SDS to leak;
instead, call flb_sds_printf into a temporary pointer (e.g., tmp), check if tmp
is NULL, and if so call flb_sds_destroy(emitter_alias_tmp), flb_errno(),
log_to_metrics_destroy(ctx) and return -1; on success assign emitter_alias_tmp =
tmp. This uses the existing symbols emitter_alias_tmp, flb_sds_create_size,
flb_sds_printf, flb_sds_destroy and preserves current error handling via
log_to_metrics_destroy(ctx).
13732cf to
b9f17cb
Compare
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
plugins/filter_log_to_metrics/log_to_metrics.c (2)
360-475:⚠️ Potential issue | 🟡 MinorTighten the bounds guard before indexing label arrays.
If the computed total and fill pass ever diverge (e.g., config mutation or unexpected properties), the current>check can still allow one out-of-bounds write before the final mismatch check. Use>=to fail fast before indexing.🛠️ Proposed fix
- if (counter > ctx->label_counter) { + if (counter >= ctx->label_counter) { flb_plg_error(ctx->ins, "internal label counter overflow"); return -1; }
956-1079:⚠️ Potential issue | 🟠 MajorConfirm thread-safety vulnerability in shared
ctx->label_valuesbuffer.The
label_valuesbuffer is allocated once per filter instance and reused across all concurrent invocations from multiple input sources. Since Fluent Bit has input worker threads that independently process chunks and invoke filters (viaflb_filter_do), multiple workers can simultaneously callcb_log_to_metrics_filterwith the samectx. The vulnerable window is between writing tolabel_values(lines 982–999) and passing it tocmt_counter_inc,cmt_gauge_set, orcmt_histogram_observe(lines 1009, 1030, 1051). A concurrent writer can corrupt label values mid-operation.Locking exists on chunks and tasks but not on filter instances or their context. To fix: either allocate
label_valuesper-invocation (stack or local scope), use thread-local storage, add a mutex around the vulnerable window, or buffer label values before the cmt call completes.
Currently, filter_log_to_metrics frequently allocates heap memory.
This causes memory fragmentation and take a longer time to allocate memory which corresponds to running period.
Instead, we need to optimize this kind of heap memory allocations and suppress CPU stale for waiting I/O operations for memory.
Enter
[N/A]in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-testlabel to test for all targets (requires maintainer to do).Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Bug Fixes
✏️ Tip: You can customize this high-level summary in your review settings.