Segmentation fault (null pointer read and/or write) when reading CDX files #23

@the-blank-x

Steps to reproduce:

  1. Save the following to crashpoc.cdx:
 CDX a b a m s k r M V g u
http://tillystranstuesdays.com/ 20240113012144 http://tillystranstuesdays.com/ text/html 200 AFQB6VVCWSKWEIAEJADJZAFMXOEGHO57 - - 1358 tillystranstuesdays.warc.gz <urn:uuid:4489ae0e-2e7d-482d-bff6-e86b02a3d719>
  2. Run wget-at --warc-dedup=crashpoc.cdx --warc-file=test https://example.com

Expected behavior: wget-at downloads https://example.com into test.warc.gz

Actual behavior:

> ~/gits/wget-lua/src/wget --warc-dedup=crashpoc.cdx --warc-file=test https://example.com
zsh: segmentation fault (core dumped)  ~/gits/wget-lua/src/wget --warc-dedup=crashpoc.cdx --warc-file=test

Additional information:

> ~/gits/wget-lua/src/wget -V
GNU Wget 1.21.3-at.20231215.01 built on linux-gnu.

-cares +digest -gpgme +https +ipv6 +iri +large-file -metalink +nls 
+ntlm +opie +psl +ssl/gnutls 

Wgetrc: 
    /usr/local/etc/wgetrc (system)
Locale: 
    /usr/local/share/locale 
Compile: 
    gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/usr/local/etc/wgetrc" 
    -DLOCALEDIR="/usr/local/share/locale" -I. -I../lib -I../lib 
    -I/usr/include/luajit-2.1 -I/usr/include/p11-kit-1 -DHAVE_LIBGNUTLS 
    -DNDEBUG -ggdb -O0 
Link: 
    gcc -I/usr/include/p11-kit-1 -DHAVE_LIBGNUTLS -DNDEBUG -ggdb -O0 
    -lpcre2-8 -luuid -lidn2 -lnettle -lgnutls -lzstd -lz -lpsl -lm -ldl 
    -lluajit-5.1 ../lib/libgnu.a -lunistring 

Backtrace from GDB:

#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:76
#1  0x00005555555c0c1a in xstrdup (string=0x0) at xmalloc.c:338
#2  0x00005555555a270b in store_warc_record (uri=0x5555556126b0 "http://tillystranstuesdays.com/", date=0x0, uuid=0x555555612710 "<urn:uuid:4489ae0e-2e7d-482d-bff6-e86b02a3d719>", 
    digest=0x7fffffffe4b0 "\001`\037V\242\264\225b \004H\006\234\200\254\273\210c\273\277\377\177") at warc.c:1415
#3  0x00005555555a2a7c in warc_process_cdx_line (lineptr=0x5555556125b0 "http://tillystranstuesdays.com/", field_num_original_url=0x2, field_num_checksum=0x5, field_num_record_id=0xa) at warc.c:1520
#4  0x00005555555a2c9e in warc_load_cdx_dedup_file () at warc.c:1591
#5  0x00005555555a2e70 in warc_init () at warc.c:1658
#6  0x0000555555591d4d in main (argc=0x4, argv=0x7fffffffe488) at main.c:2088

store_warc_record is called with a null pointer as its second parameter:

store_warc_record(original_url, NULL, record_id, digest);

store_warc_record doesn't check its arguments for null pointers, so xstrdup (date) ends up calling strlen on a null pointer, hence the segfault:

wget-lua/src/warc.c

Lines 1405 to 1422 in c1fe609

/* Store the WARC record in the warc_dedup_table using a warc_dedup_key and a
   warc_dedup_record for the data. Copies all variables to the hash table. */
static void
store_warc_record (const char *uri, const char *date, const char *uuid,
                   const char *digest)
{
  struct warc_dedup_record *rec = xmalloc (sizeof (struct warc_dedup_record));
  struct warc_dedup_key *key = xmalloc (sizeof (struct warc_dedup_key));
  rec->uri = xstrdup (uri);
  rec->date = xstrdup (date);
  rec->uuid = xstrdup (uuid);
  key->uri = xstrdup (uri);
  memcpy (rec->digest, digest, SHA1_DIGEST_SIZE);
  memcpy (key->digest, digest, SHA1_DIGEST_SIZE);
  hash_table_put (warc_dedup_table, key, rec);
}

When I was initially diagnosing this issue, I got a segfault from another area:

wget-lua/src/warc.c

Lines 1511 to 1519 in c1fe609

  char *digest;
  base32_decode_alloc (checksum, strlen (checksum), &checksum_v,
                       &checksum_l);
  xfree (checksum);
  if (checksum_v != NULL && checksum_l == SHA1_DIGEST_SIZE)
    {
      /* This is a valid line with a valid checksum. */
      memcpy (digest, checksum_v, SHA1_DIGEST_SIZE);

digest is an uninitialised pointer when memcpy writes through it, which causes a segfault or, worse, silent memory corruption. (In my case digest was 0x0, but after recompiling with -ggdb -O0 it became some random writable pointer.)
