-
-
Notifications
You must be signed in to change notification settings - Fork 935
Add iterator slice helpers and channel distinct/tee utilities #791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add iterator slice helpers and channel distinct/tee utilities #791
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #791 +/- ##
==========================================
+ Coverage 94.26% 94.33% +0.06%
==========================================
Files 18 18
Lines 2860 2894 +34
==========================================
+ Hits 2696 2730 +34
Misses 149 149
Partials 15 15
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
it/seq.go
Outdated
| if !yield(item) { | ||
| return | ||
| } | ||
| count++ | ||
| if count >= n { | ||
| return | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we reduce it to one condition?
Just like it is already in all other places, like:
Lines 575 to 579 in b6d1bf3
| count++ | |
| if count > n && !yield(buf[idx]) { | |
| return | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done - condensed to a single condition as suggested. Thanks
| // Distinct returns a channel with duplicate values removed. | ||
| // The first occurrence is preserved, and ordering is maintained. | ||
| func Distinct[T comparable](upstream <-chan T) <-chan T { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should at least be mentioned that memory can run out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a note in the doc comments
| // Tee duplicates the stream into multiple output channels without blocking. | ||
| // If an output channel is full or not ready, the value is dropped for that channel. | ||
| func Tee[T any](count, channelsBufferCap int, upstream <-chan T) []<-chan T { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, this function is too specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean we should remove Tee because FanOut/FanIn already cover this, or keep it as a separate helper?
it/seq.go
Outdated
| } | ||
| count++ | ||
| if count >= n { | ||
| if count > n || !yield(item) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In your case you need to avoid additional iteration by collection
if !yield(item) || count >= n {Sorry for the misleading comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant all the relevant places
TakeWhile, TakeFilter too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
it/seq.go
Outdated
| if step >= size { | ||
| buffer = buffer[:0] | ||
| skip = step - size | ||
| } else { | ||
| buffer = buffer[step:] | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
step, size is constant
So this may be precalculated before loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be simplified to something like
skip = skipDelta
buffer = buffer[I:J]But now I've actually noticed that this forward buffer shift is incorrect.
This leads to expansion and reallocation on next appends
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, thanks. Updated the forward shift to keep capacity by copying the overlap to the front and reslicing so we avoid extra reallocations on subsequent appends.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Managed to improve the implementation, benchmarks show about a 10% speedup
BenchmarkItSliding/old_window_step1_n10
BenchmarkItSliding/old_window_step1_n10-11 2913607 379.6 ns/op
BenchmarkItSliding/new_window_step1_n10
BenchmarkItSliding/new_window_step1_n10-11 3341169 341.3 ns/op
BenchmarkItSliding/old_overlap_step3_n10
BenchmarkItSliding/old_overlap_step3_n10-11 4146115 285.8 ns/op
BenchmarkItSliding/new_overlap_step3_n10
BenchmarkItSliding/new_overlap_step3_n10-11 4664964 250.7 ns/op
BenchmarkItSliding/old_no_overlap_n10
BenchmarkItSliding/old_no_overlap_n10-11 4272240 280.9 ns/op
BenchmarkItSliding/new_no_overlap_n10
BenchmarkItSliding/new_no_overlap_n10-11 4828736 249.4 ns/op
BenchmarkItSliding/old_gap_step8_n10
BenchmarkItSliding/old_gap_step8_n10-11 4425630 263.0 ns/op
BenchmarkItSliding/new_gap_step8_n10
BenchmarkItSliding/new_gap_step8_n10-11 5213691 235.1 ns/op
BenchmarkItSliding/old_window_step1_n100
BenchmarkItSliding/old_window_step1_n100-11 382444 3064 ns/op
BenchmarkItSliding/new_window_step1_n100
BenchmarkItSliding/new_window_step1_n100-11 406326 2968 ns/op
BenchmarkItSliding/old_overlap_step3_n100
BenchmarkItSliding/old_overlap_step3_n100-11 652431 1794 ns/op
BenchmarkItSliding/new_overlap_step3_n100
BenchmarkItSliding/new_overlap_step3_n100-11 703446 1694 ns/op
BenchmarkItSliding/old_no_overlap_n100
BenchmarkItSliding/old_no_overlap_n100-11 778430 1523 ns/op
BenchmarkItSliding/new_no_overlap_n100
BenchmarkItSliding/new_no_overlap_n100-11 812672 1454 ns/op
BenchmarkItSliding/old_gap_step8_n100
BenchmarkItSliding/old_gap_step8_n100-11 855152 1381 ns/op
BenchmarkItSliding/new_gap_step8_n100
BenchmarkItSliding/new_gap_step8_n100-11 914552 1303 ns/op
BenchmarkItSliding/old_window_step1_n1000
BenchmarkItSliding/old_window_step1_n1000-11 40515 29331 ns/op
BenchmarkItSliding/new_window_step1_n1000
BenchmarkItSliding/new_window_step1_n1000-11 42220 28129 ns/op
BenchmarkItSliding/old_overlap_step3_n1000
BenchmarkItSliding/old_overlap_step3_n1000-11 71319 16808 ns/op
BenchmarkItSliding/new_overlap_step3_n1000
BenchmarkItSliding/new_overlap_step3_n1000-11 74110 16148 ns/op
BenchmarkItSliding/old_no_overlap_n1000
BenchmarkItSliding/old_no_overlap_n1000-11 86786 13783 ns/op
BenchmarkItSliding/new_no_overlap_n1000
BenchmarkItSliding/new_no_overlap_n1000-11 89229 13398 ns/op
BenchmarkItSliding/old_gap_step8_n1000
BenchmarkItSliding/old_gap_step8_n1000-11 94996 12489 ns/op
BenchmarkItSliding/new_gap_step8_n1000
BenchmarkItSliding/new_gap_step8_n1000-11 99624 11980 ns/op
This PR is a follow‑up to the previous slice‑helpers PR. It adds iterator versions of Take, TakeWhile, and TakeFilter along with tests, examples, and docs.
New Channel Functions
Distinct / DistinctBy
Removes duplicate values (by value or key) while preserving order.
Production use cases:
Tee
Best‑effort fan‑out: duplicates the stream to multiple outputs, dropping values for slow outputs instead of blocking.
Production use cases: