If you split a string when the remaining bytes after the last delimiter are fewer than the delimiter length, the loop condition within zstr_split_next causes an unsigned integer underflow.
Line 1085:
for (size_t i = 0; i <= remaining - it->delim.len; i++)
When remaining is less than it->delim.len, the subtraction wraps and the loop runs past the buffer.
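To make the wrap concrete, here is a minimal sketch of the arithmetic in isolation. The helper name `loop_upper_bound` is hypothetical and is not part of zstr; it only reproduces the `remaining - it->delim.len` expression using plain `size_t` values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compile-time check: size_t subtraction wraps instead of going negative. */
static_assert((size_t)0 - (size_t)1 == SIZE_MAX, "size_t subtraction wraps");

/* Hypothetical helper (not zstr code): the value that i is compared
 * against in the loop condition at line 1085. */
static size_t loop_upper_bound(size_t remaining, size_t delim_len)
{
    /* Unsigned arithmetic is modulo SIZE_MAX + 1, so this wraps
     * whenever remaining < delim_len. */
    return remaining - delim_len;
}
```

With `remaining == 0` and a one-byte delimiter the bound is `SIZE_MAX`, so `i <= bound` is true for every `i` and nothing in the condition stops the loop at the end of the buffer.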
You can repro this on any string that ends with the delimiter, for example "hello world ": zero bytes remain after the final space, which is fewer than the one-byte delimiter.
#include "include/base.h"
#include "include/zstr.h"

int main(void) {
    zstr s = zstr_lit("hello world ");
    zstr_view sv = zstr_as_view(&s);

    zstr_split_iter it = zstr_split_init(sv, " ");
    zstr_view chunk;

    while (zstr_split_next(&it, &chunk)) {
        LOG_INFO("chunk: '%.*s'\n", (int)chunk.len, chunk.data);
    }
    zstr_free(&s);
    return 0;
}
Here's how I compiled the repro:
gcc -fsanitize=address -g -o repro repro.c
ASAN output:
=================================================================
==406993==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ba4041000e0 at pc 0x7fa406a91d4b bp 0x7ffee4f2a590 sp 0x7ffee4f29d38
READ of size 1 at 0x7ba4041000e0 thread T0
#0 0x7fa406a91d4a (/usr/lib/libasan.so.8+0x91d4a) (BuildId: 14b88b415de3de114ecf21fa09c781d8be3785a4)
#1 0x7fa406a92277 in memcmp (/usr/lib/libasan.so.8+0x92277) (BuildId: 14b88b415de3de114ecf21fa09c781d8be3785a4)
#2 0x55845f0f0fe4 in zstr_split_next include/zstr.h:1088
#3 0x55845f0f1355 in main /mnt/secondary/code_prod/C/beej-guide/repro.c:11
#4 0x7fa406627c4d in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:59
#5 0x7fa406627d8a in __libc_start_main_impl ../csu/libc-start.c:360
#6 0x55845f0f01a4 in _start (/mnt/secondary/code_prod/C/beej-guide/repro+0x11a4) (BuildId: b3ffa51dc0c0fd7ef8b92426a9f4dad0e4e66fea)
Address 0x7ba4041000e0 is located in stack of thread T0 at offset 224 in frame
#0 0x55845f0f11c1 in main /mnt/secondary/code_prod/C/beej-guide/repro.c:4
This frame has 4 object(s):
[48, 64) 'sv' (line 6)
[80, 96) 'chunk' (line 9)
[112, 160) 'it' (line 8)
[192, 224) 's' (line 5) <== Memory access at offset 224 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow include/zstr.h:1088 in zstr_split_next
Shadow bytes around the buggy address:
0x7ba4040ffe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x7ba4040ffe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x7ba4040fff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x7ba4040fff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x7ba404100000: f1 f1 f1 f1 f1 f1 00 00 f2 f2 00 00 f2 f2 00 00
=>0x7ba404100080: 00 00 00 00 f2 f2 f2 f2 00 00 00 00[f3]f3 f3 f3
0x7ba404100100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x7ba404100180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x7ba404100200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x7ba404100280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x7ba404100300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==406993==ABORTING
One possible fix is a guard around the for loop so that it only runs when the delimiter can fit in the remaining data:
if (remaining >= it->delim.len)
{
    for (size_t i = 0; i <= remaining - it->delim.len; i++)
    {
        ....
    }
}
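As a rough illustration of the guard proposed above, the sketch below applies it to a standalone search function. `find_delim` and its parameters are hypothetical stand-ins for the search inside zstr_split_next, not the library's actual code, and it assumes a non-empty delimiter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the delimiter search (not zstr code).
 * Returns the offset of the first match, or SIZE_MAX if there is none. */
static size_t find_delim(const char *data, size_t remaining,
                         const char *delim, size_t delim_len)
{
    assert(delim_len > 0); /* assumption: empty delimiters are rejected earlier */

    /* Guard: if the delimiter cannot fit, skip the loop entirely so the
     * subtraction below is never evaluated with remaining < delim_len. */
    if (remaining < delim_len)
        return SIZE_MAX;

    for (size_t i = 0; i <= remaining - delim_len; i++) {
        if (memcmp(data + i, delim, delim_len) == 0)
            return i;
    }
    return SIZE_MAX;
}
```

An equivalent option is to write the condition as `i + delim_len <= remaining`, which avoids the subtraction altogether; either form should keep the `memcmp` reads inside the buffer for the trailing-delimiter case.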