Why is the built-in es code from initial.es so big? #179

@jpco

(Note: This is all using gcc on my 64-bit Linux laptop.)

If I compile es at HEAD with -Os (with the treededup() call commented out; it's a fine optimization, but it messes with the math) and then strip it, the resulting binary is 245328 bytes. If I delete all of the definitions in initial.es and do the same thing, the es binary ends up at 130640 bytes. That's a difference of 114688 bytes, or 112 KiB, of binary image purely due to the contents of initial.es. Doesn't that seem way too large?

Looking deeper, there are two major parts to initial.c: the declaration of the "nodes" - all the lists and strings and trees and such that make up the value of the variables, and then the defs variable which actually defines the variables which use these trees. The other components in the file don't vary with the size of initial.es, so they shouldn't be accounted for in the size difference.

For the first part, I see

; cat initial.c | awk '/^static const/ {print $3}' | grep -v 'struct' | sort | uniq -c
    251 char
     76 Closure
    113 List
    112 Term
    355 Tree_p
    773 Tree_pp
    528 Tree_s
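
For anyone who wants to repeat the count without awk, here is a rough Python equivalent of the pipeline above. It is only a sketch: the `count_decl_types` helper and the sample lines are made up for illustration and are not taken from a real initial.c.

```python
from collections import Counter

def count_decl_types(source: str) -> Counter:
    """Count `static const <type> ...` declarations by type.

    Mirrors: awk '/^static const/ {print $3}' | grep -v 'struct' | sort | uniq -c
    """
    counts = Counter()
    for line in source.splitlines():
        if not line.startswith("static const"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        type_name = fields[2]          # awk's $3
        if "struct" in type_name:      # grep -v 'struct'
            continue
        counts[type_name] += 1
    return counts

# Hypothetical sample lines, not copied from a real initial.c.
sample = """\
static const char s_0[] = "example";
static const Tree_s t_0 = { 0 };
static const Tree_s t_1 = { 0 };
static const struct Foo f_0 = { 0 };
int unrelated;
"""

for type_name, n in sorted(count_decl_types(sample).items()):
    print(f"{n:7d} {type_name}")
```

Against a real build, something like `count_decl_types(open("initial.c").read())` should give the same table as the awk version.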

From some more awk-hackery, those 251 char (really char[]) declarations are collectively around 2070 bytes based on the length of the strings.

The other data types have sizes (based on sizeof() on my machine):

| Type | sizeof (bytes) |
|---|---|
| Closure | 16 |
| List | 16 |
| Term | 16 |
| Tree_p | 16 |
| Tree_pp | 24 |
| Tree_s | 16 |

The counts and sizes of these types all add up to 37496 bytes. Adding in the strings estimate gives us 39566 bytes. That's all the "nodes".

The defs variable is an 88-element array where each element is two pointers (16 bytes) in size. That's 1408 bytes, so adding that to the total makes 40974 bytes.
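
The tally above can be checked with a few lines of Python. The counts and sizes are the ones reported in this issue; the 2070-byte string figure is the rough estimate from the awk output, not an exact measurement.

```python
# Declaration counts (from the awk output) and sizeof() values reported above.
counts = {"Closure": 76, "List": 113, "Term": 112,
          "Tree_p": 355, "Tree_pp": 773, "Tree_s": 528}
sizes = {"Closure": 16, "List": 16, "Term": 16,
         "Tree_p": 16, "Tree_pp": 24, "Tree_s": 16}

STRING_BYTES = 2070      # approximate total for the 251 char[] declarations
DEFS_ENTRIES = 88        # elements in the defs array
DEFS_ENTRY_SIZE = 16     # two pointers per element on a 64-bit machine

node_bytes = sum(counts[t] * sizes[t] for t in counts)
nodes_with_strings = node_bytes + STRING_BYTES
total = nodes_with_strings + DEFS_ENTRIES * DEFS_ENTRY_SIZE

print(node_bytes, nodes_with_strings, total)  # 37496 39566 40974
```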

Honestly, 40974 bytes still feels like a lot, but it's directly based on the representation of these structures in memory, so shrinking that would require rethinking the memory layout of Trees and Lists, and that's a pretty tall order. (Actually, I notice the NodeKinds in those Tree structures are using a total of 13248 bytes. That seems like relatively low-hanging fruit.)
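
To show where the 13248 comes from: there are 355 + 773 + 528 = 1656 Tree nodes, and 13248 / 1656 = 8 bytes per NodeKind. The ctypes sketch below assumes a layout of an int-sized kind followed by a pointer, which is a guess that matches the 16-byte Tree_p size above, not something copied from the es headers. On a typical 64-bit Linux machine the kind should end up occupying an 8-byte slot because of pointer alignment.

```python
import ctypes

class TreeP(ctypes.Structure):
    # Hypothetical layout (an assumption, not the real es definition):
    # an enum-sized kind followed by a single pointer.
    _fields_ = [("kind", ctypes.c_int), ("p", ctypes.c_void_p)]

# Offset of the first pointer = bytes consumed by kind plus alignment padding.
kind_slot = TreeP.p.offset
tree_count = 355 + 773 + 528   # Tree_p + Tree_pp + Tree_s declarations

print("sizeof(TreeP):", ctypes.sizeof(TreeP))
print("bytes per kind slot:", kind_slot)
print("total bytes in kind slots:", tree_count * kind_slot)
```

If the kind only needs a byte or two, most of each slot is padding, which is why it looks like low-hanging fruit.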

The much weirder thing I notice is that after a strip, initial.o is 48704 bytes on my machine. That seems like a basically reasonable size for a file holding 40974 bytes of data. But then where the heck are the other ~70k bytes coming from in the final es binary? In a binary that's 240k total, that mysterious 70k alone makes up like 30% of the entire image!
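
For reference, here is the gap worked out both ways, using only the numbers already quoted in this issue. Depending on whether you subtract the stripped initial.o size or the estimated data size, the unexplained part is roughly 66k to 74k bytes, which is where the "~70k" comes from.

```python
full_binary = 245328      # stripped es built with the full initial.es
empty_binary = 130640     # stripped es built with initial.es emptied out
initial_o = 48704         # stripped initial.o
estimated_data = 40974    # nodes + strings + defs estimate

delta = full_binary - empty_binary
unexplained_vs_object = delta - initial_o
unexplained_vs_estimate = delta - estimated_data

print(delta, unexplained_vs_object, unexplained_vs_estimate)  # 114688 65984 73714

# Share of the whole image taken by the unexplained bytes.
print(f"{unexplained_vs_object / full_binary:.1%}")
print(f"{unexplained_vs_estimate / full_binary:.1%}")
```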
