Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(PUP-11122) Ensure reg values are null terminated #9205

Merged
merged 2 commits into from
Jan 23, 2024

Conversation

joshcooper
Copy link
Contributor

@joshcooper joshcooper commented Jan 10, 2024

RegQueryValueExW doesn't guarantee the returned buffer for the lpData
parameter is null terminated, so ensure that it is.

When retrieving a registry value of type REG_SZ or REG_EXPAND_SZ extend the
buffer by 1 wchar (2 bytes) so we can always write a wchar null terminator that
is guaranteed not to overwrite user data.

The resulting wchar array is then guaranteed to be properly wchar null
terminated:

\u0041
\u0042
\u0000

which when UTF-16LE encoded will result in the byte array:

\x41 \x00
\x42 \x00
\x00 \x00

If Windows does null terminate, then we will add an additional wchar null
terminator, which won't hurt anything.

The same applies to REG_MULTI_SZ, except it is supposed to have two wchar null
terminators, for a total of 4 bytes. One terminator is for the last string in
the array, and one terminator is for the entire array. For example, if the array
contains the strings ['A', 'B'] and Windows does not null terminate the
lpData buffer, then the resulting UTF-16LE encoded byte array will contain:

\x41 \x00
\x00 \x00
\x42 \x00
\x00 \x00
\x00 \x00

@joshcooper joshcooper force-pushed the win_reg branch 2 times, most recently from e681e19 to a8fc33d Compare January 18, 2024 01:12
@joshcooper joshcooper changed the title Win reg (PUP-11122) Ensure reg values are null terminated Jan 18, 2024
RegQueryValueExW doesn't guarantee the returned buffer for the `lpData`
parameter is null terminated, so ensure that it is.

When retrieving a registry value of type REG_SZ or REG_EXPAND_SZ extend the
buffer by 1 wchar (2 bytes) so we can always write a wchar null terminator that
is guaranteed not to overwrite user data.

The resulting wchar array is then guaranteed to be properly wchar null
terminated:

    \u0041
    \u0042
    \u0000

which when UTF-16LE encoded will result in the byte array:

    \x41 \x00
    \x42 \x00
    \x00 \x00

If Windows does null terminate, then we will add an additional wchar null
terminator, which won't hurt anything.

The same applies to REG_MULTI_SZ, except it is supposed to have two wchar null
terminators, for a total of 4 bytes. One terminator is for the last string in
the array, and one terminator is for the entire array. For example, if the array
contains the strings ['A', 'B'] and Windows does not null terminate the
`lpData` buffer, then the resulting UTF-16LE encoded byte array will contain:

    \x41 \x00
    \x00 \x00
    \x42 \x00
    \x00 \x00
    \x00 \x00
@joshcooper joshcooper marked this pull request as ready for review January 18, 2024 23:51
@joshcooper joshcooper requested a review from a team as a code owner January 18, 2024 23:51
@joshcooper
Copy link
Contributor Author

This should be backported in 7.x

@joshcooper
Copy link
Contributor Author

If Windows does correctly terminate the string and we add extra padding, then I wanted to verify it wouldn't cause issues, for example, when converting the REG_MULTI_SZ value to an array. First we convert the UTF-16LE string to UTF-8 in FFI::Pointer#read_wide_string, and then later, we split the UTF-8 string to an array using split(/\0/). This shows how extra padding is ignored and the correct two element array is generated:

irb(main):005:0> WNUL = "\0".encode('UTF-16LE')
=> "\u0000"
...
irb(main):014:0> (["a","b"].map {|s| s.encode('UTF-16LE')}.join(WNUL) + WNUL + WNUL + WNUL + WNUL)
=> "a\u0000b\u0000\u0000\u0000\u0000"
irb(main):015:0> (["a","b"].map {|s| s.encode('UTF-16LE')}.join(WNUL) + WNUL + WNUL + WNUL + WNUL).encode('UTF-8')
=> "a\u0000b\u0000\u0000\u0000\u0000"
irb(main):016:0> (["a","b"].map {|s| s.encode('UTF-16LE')}.join(WNUL) + WNUL + WNUL + WNUL + WNUL).encode('UTF-8').split(/\0/)
=> ["a", "b"]

@tvpartytonight tvpartytonight merged commit 3192f4a into puppetlabs:main Jan 23, 2024
9 checks passed
@joshcooper joshcooper deleted the win_reg branch January 23, 2024 21:00
@joshcooper joshcooper added the backport 7.x Generate a backport PR to 7.x label Jan 23, 2024
Copy link

Successfully created backport PR for 7.x:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 7.x Generate a backport PR to 7.x bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants