-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Improve base-32 hash decoding performance with reverse map #13680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! This is indeed much better: faster and clearer.
Just one small thing, then let's land it!
The changes include: * Defining nix32Chars as a constexpr char[]. * Adding a constexpr std::array<unsigned char, 256> (reverseNix32Map) to map characters to their base-32 digit values at compile time. * Replacing the slow character search loop with a direct lookup using reverseNix32Map. * Removing std::once_flag/isBase32 logic in references.cc in favor of reverseNix32Map Signed-off-by: Alexander V. Nikolaev <avn@avnik.info>
cad1d68
to
4bfc007
Compare
@Ericson2314 your suggestion applied |
I wouldn't mind if we could add a benchmark for this, but this could be done in a follow-up. |
@xokdvium also wouldn't be bad to add dedicated tests for both nix32 encoding/decoding |
std::array<unsigned char, 256> map{}; | ||
|
||
for (size_t i = 0; i < map.size(); ++i) | ||
map[i] = 0xFF; // invalid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this sentinel value could be given a name? Not a huge fan of hardcoded constants, since those don't tend to be self documenting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alternatively this could be an array mapping to std::optional
and nullopt
would be the sentinel, which should be also okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xokdvium any suggestiions for name of constant? my idea is invalid_hash_value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the sentinel value just be zero?
Absolutely. If you'd like you could handle it in a separate PR. Any help with the test coverage would be greatly appreciated. |
Note re #13682 that this PR is not supposed to affect how reference scanning works. It just affects how the parsing of hashes themselves (a |
Motivation
This PR replaces the linear search over nix32Chars with a constexpr-initialized reverse lookup table for decoding base-32 hashes. The changes include:
This significantly improves performance and clarity of base-32 decoding logic, while preserving correctness.