improve performance and memory usage in HashUtils.djb2 #28
Closed
skimbrel-figma wants to merge 3 commits into statsig-io:main from
Runtime profiling indicates this method generates a great many memory allocations. Comparing with the JS implementation, we saw that the intent of the `hash &= hash` line was to force the JS runtime to keep the number as a 32-bit integer. That is the correct way to do it in JS, but in Ruby `hash &= hash` is a no-op; as a result, the `hash` local grows ever larger and requires more and more memory, since Ruby integers are unbounded.

Fix: truncate the hash value on each iteration with the same 32-bit `0xFFFFFFF` constant used at the end instead.
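For readers without the diff at hand, here is a minimal sketch of the before/after behavior described above. It is not the SDK's actual code: the module and method names are hypothetical, the methods return the integer rather than whatever the real method returns, and the mask is assumed to be the full 32-bit value `0xFFFFFFFF` (eight hex digits), whereas the description writes the constant with seven.

```ruby
# Hypothetical names; only the loop bodies follow the PR description.
module Djb2Sketch
  # Assumption: a full 32-bit mask (eight hex digits).
  MASK_32 = 0xFFFFFFFF

  # Before: `hash &= hash` does nothing in Ruby, so `hash` keeps growing
  # and only the final mask brings it back to 32 bits.
  def self.unbounded(input_str)
    hash = 0
    input_str.each_char do |c|
      hash = (hash << 5) - hash + c.ord
      hash &= hash
    end
    hash & MASK_32
  end

  # After: mask on every iteration so the intermediate value stays small.
  def self.truncated(input_str)
    hash = 0
    input_str.each_char do |c|
      hash = ((hash << 5) - hash + c.ord) & MASK_32
    end
    hash
  end
end
```

Because `hash` never goes negative here and masking with `0xFFFFFFFF` is reduction modulo 2**32, both variants should return the same value for the same input; only the size of the intermediate integers differs.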
Contributor
Author
Benchmarking before/after with a 256-char string on my laptop:
A quick, imprecise comparison by inspecting
I spot-checked a handful of input values to ensure the output hash value did not change.
Contributor
Nice find! Pulling this in to run tests on it now.
Contributor
Author
@tore-statsig actually i just realized we can do even better: the only thing inside the each loop is `.ord`, which is available as `each_codepoint`
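The follow-up idea in this thread, iterating with `each_codepoint` instead of calling `.ord` on each character, might look roughly like the sketch below. The module name is hypothetical, the return type is simplified to an integer, and the mask is again assumed to be `0xFFFFFFFF`.

```ruby
# Hypothetical name; a sketch of the each_codepoint variant, not the merged code.
module Djb2CodepointSketch
  # Assumption: a full 32-bit mask (eight hex digits).
  MASK_32 = 0xFFFFFFFF

  def self.djb2(input_str)
    hash = 0
    # each_codepoint yields integers directly, so no one-character String
    # needs to be created just to call .ord on it.
    input_str.each_codepoint do |codepoint|
      hash = ((hash << 5) - hash + codepoint) & MASK_32
    end
    hash
  end
end
```

For a validly encoded string, `each_codepoint` yields the same integers that `each_char` followed by `.ord` would, so the resulting hash should be unchanged.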
statsig-kong bot pushed a commit that referenced this pull request May 14, 2025
#28 """ Runtime profiling indicates this method generates several many memory allocations. Comparing to the JS implementation, we saw the intent of the `hash &= hash` line was to force the JS runtime to keep the number as a 32-bit integer. This is indeed the correct way to do it in JS, but not in Ruby; as a result, the `hash` local will grow ever larger, requiring more and more memory since Ruby supports unbounded integers. Fix: truncate the hash value on each iteration with the same 32-bit `0xFFFFFFF` constant used at the end instead. """ Co-authored-by: Sam Kimbrel <98781278+skimbrel-figma@users.noreply.github.com>
Contributor
The original version of this is released
Contributor
Author
great! we've been running the second commit (switching to
statsig-kong bot pushed a commit that referenced this pull request May 22, 2025
the only thing inside the each loop is .ord, which is available as each_codepoint #28 --------- Co-authored-by: Sam Kimbrel <98781278+skimbrel-figma@users.noreply.github.com>