Skip to content

Do not copy string content if possible while base64 encoding/decoding #7997

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

dreamer-dead
Copy link
Contributor

Main intent of this PR is to avoid string copying while encoding/decoding.
The encode change is quite trivial since it didn't copy the input.
But the decode case is opposite: it needs to remove few symbols from the input string.

So I decided to split code of decode function into 2 overloads: first that takes a string_view and a special function that takes a rvalue reference and can change the string content.

I believe that the added linear lookup with find_first_of("\r\n=") will not make things worse in comparison with previous code but now the encoded string is not copied every time.

PTAL

@dreamer-dead dreamer-dead requested review from a team as code owners April 19, 2023 09:00
@mike-myers-tob mike-myers-tob added refactor Related to osquery code refactoring ready for review Pull requests that are ready to be reviewed by a maintainer labels May 9, 2023
@directionless
Copy link
Member

Looks pretty similar to other string_view changes. Any of the c folks want to weigh in? (@alessandrogario @Smjert @marcosd4h )

@directionless directionless added this to the 5.10.0 milestone Sep 26, 2023
Copy link
Member

@Smjert Smjert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This first needs to be rebased on master; the base64::encode function has already been changed to support std::string_view on master, so I think this leaves us with decode only.

std::string decode(std::string&& encoded) {
boost::erase_all(encoded, "\r\n");
boost::erase_all(encoded, "\n");
boost::trim_right_if(encoded, boost::is_any_of("="));
Copy link
Member

@Smjert Smjert Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something I don't get from the original function is why we need to do this at all.
This seems an unnecessary performance hit to incur into, which might not be necessary to do (and we also have to scan the string again to understand if it's necessary to), when I think the caller should provide a valid base64 decodable string.

@Smjert Smjert modified the milestones: 5.10.0, 5.11.0 Oct 6, 2023
@directionless directionless modified the milestones: 5.11.0, 5.12.0 Dec 27, 2023
@directionless directionless modified the milestones: 5.12.0, 5.13 Feb 29, 2024
@directionless directionless removed this from the 5.13 milestone Jul 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready for review Pull requests that are ready to be reviewed by a maintainer refactor Related to osquery code refactoring virtual tables
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants