WIP: Embed APE loader inside APE #267

Merged
merged 2 commits into from
Sep 17, 2021

Conversation

lemaitre
Contributor

@lemaitre lemaitre commented Sep 5, 2021

This is an implementation of the loader proposal from #263.

The idea is to embed a tiny ELF inside the APE that loads the APE into memory and jumps into it to start it. The loader is copied into the TMP folder and started from there. The loader is from Justine's example: 969174e.
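
The copy step described above can be sketched in shell roughly as follows. This is a minimal illustration, not the code from this PR: the helper name `extract_loader` and the idea of passing the loader's byte offset and size as arguments are assumptions, and a real APE would need those values fixed at link time.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): copy SIZE bytes starting at byte
# OFFSET of FILE into DEST, then mark DEST executable.
extract_loader() {
  file=$1 offset=$2 size=$3 dest=$4
  # bs=1 makes skip= and count= byte-granular, at the cost of speed.
  dd if="$file" of="$dest" bs=1 skip="$offset" count="$size" 2>/dev/null &&
    chmod +x "$dest"
}
```

An APE's shell header could then run something like `extract_loader "$0" "$offset" "$size" "$dest" && exec "$dest" "$0" "$@"`, where `$offset`, `$size`, and `$dest` are placeholders for values this sketch does not define.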

Before considering merging, I would like some feedback, especially about how to organize the new code and where to put it.
Also, I won't be able to create a macOS loader, so I might need some help there.

Owner

@jart jart left a comment

Welcome as a new contributor. We need to go through one quick hurdle before we can proceed. Could you please email Justine Tunney [email protected] and say that you intend to assign her the copyright
to the changes you contribute to Cosmopolitan? See CONTRIBUTING.md for further details. We make the ask because it helps us ship the tiniest most legally permissive prim and proper binaries.
You also only need to do it one time, so any future changes you send us can be quickly merged.

@@ -645,6 +666,11 @@ apesh: .ascii "'\n#'\"\n" # sixth edition shebang
.ascii "fi\n"
.ascii "exit $R\n"
.endobj apesh
#ifdef APE_LOADER
.section .ape.loader,"a",@progbits
.incbin APE_LOADER
Owner

Nice

.ascii "type ape-loader >/dev/null 2>&1 && "
.ascii "exec ape-loader \"$0\" \"$@\"\n"
#ifdef APE_LOADER
// There is no system-wide APE loader, but there is one
Owner

  1. I would recommend having this only happen for the ape-no-modify-self.o build.
  2. You might want to check $(uname -s) = Linux

Contributor Author

I was under the impression that, if a system-wide ape-loader exists, using it is always the right thing to do.
The copy of the embedded loader is done only if APE_LOADER is defined, which, I expect, happens only for ape-no-modify-self.o.

I need a separate macro, as APE_LOADER gives the path of the loader ELF.

I just copied the test for Linux/MacOS which was already here. The check for the existence of a directory is faster than calling uname in a subshell.
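
The two detection styles compared above can be sketched as follows. The function names are hypothetical, and using `/Applications` as the macOS marker directory is an assumption made for illustration; the thread does not say which directory the existing test checks.

```shell
#!/bin/sh
# Directory probe: `[ -d ... ]` is usually a shell builtin, so no child
# process is needed. The default path is an assumed macOS marker.
looks_like_macos() {
  [ -d "${1:-/Applications}" ]
}

# uname probe: the command substitution forks a subshell and runs uname.
is_linux() {
  [ "$(uname -s)" = Linux ]
}
```

The directory probe avoids a fork and exec, which is why it tends to be cheaper. The uname probe asks the kernel directly instead of relying on a filesystem convention.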

jart added a commit that referenced this pull request Sep 12, 2021
qjsc.com now has a -n do nothing flag so the makefile can create a
localized binary. See also #267 where we have an exciting new change
aiming to address this particular APE gotcha.
@lemaitre
Contributor Author

Could you please email Justine Tunney [email protected] and say that you intend to assign her the copyright
to the changes you contribute to Cosmopolitan?

I've just sent it to you from my personal email.

@lemaitre
Contributor Author

lemaitre commented Sep 17, 2021

I've run some benchmarks of how fast the loader is:

| native  | direct loader | shell loader | shell loader + copy |
|---------|---------------|--------------|---------------------|
| 1.52 ms | 1.57 ms       | 8.07 ms      | 15.7 ms             |

The binary tested is hello.com (with stdout redirected to /dev/null).
So the loader by itself is really fast and has a negligible impact.
However, the shell trick alone accounts for about 5-6 ms, and the copy of the loader takes another 7 ms.

All in all, I think it's reasonable, especially because this overhead does not increase with the size of the APE.
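
As a rough cross-check, the per-step costs can be derived by subtracting adjacent timings. This sketch hard-codes the four figures reported above; the function name `overheads` is hypothetical.

```shell
#!/bin/sh
# Per-step overhead in milliseconds, from the four reported timings.
overheads() {
  awk 'BEGIN {
    native = 1.52; direct = 1.57; shell = 8.07; copy = 15.7
    printf "loader %.2f\n", direct - native   # cost of the loader itself
    printf "shell %.2f\n", shell - direct     # cost of going through sh
    printf "copy %.2f\n", copy - shell        # cost of copying the loader
  }'
}

overheads
```

By this subtraction the loader itself costs about 0.05 ms, the shell step about 6.5 ms, and the copy about 7.6 ms.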

Owner

@jart jart left a comment

LGTM Thank You!

especially because this overhead does not increase with the size of the APE.

That's definitely the best part!

The 5ms constant factor can be avoided by using something like binfmt_misc or, more ideally, by getting ape support merged into kernels.
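
For illustration, a binfmt_misc registration on Linux might be built as in the sketch below. It follows the kernel's `:name:type:offset:magic:mask:interpreter:flags` rule format. The function name `binfmt_rule`, the magic string `MZqFpD`, and the installed loader path are assumptions, none of which is stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print a binfmt_misc rule that matches files starting
# with the assumed APE magic and hands them to the interpreter given in $1.
binfmt_rule() {
  printf ':APE:M::MZqFpD::%s:\n' "$1"
}

# Registering requires root and a mounted binfmt_misc filesystem, e.g.:
#   binfmt_rule /usr/bin/ape-loader > /proc/sys/fs/binfmt_misc/register
```

With such a rule in place the kernel would invoke the loader with the program path as its first argument, which matches how the shell header calls `ape-loader "$0" "$@"`.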

@jart jart merged commit 18ccfeb into jart:master Sep 17, 2021
@lemaitre
Contributor Author

@jart I know you were eager to merge this, but I think it was not ready yet (hence the "WIP" prefix in the name).
To me there were a few questions that needed answers:

  • Should we have a test for a system-wide ape-loader before trying anything else?
  • When copying the loader in the temp folder, should we randomize the destination a bit?
  • Should we keep the old code for copying the whole APE (in case the loader is not embedded)?

Maybe I could just create an issue to discuss this, and then another MR when we have answers.
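
On the question of randomizing the destination, one possible approach is sketched below, assuming `mktemp` is available (it is common but not specified by POSIX). The helper name `make_loader_path` and the `ape-loader.XXXXXX` template are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: create a uniquely named, private file in the temp
# directory and print its path. mktemp creates the file atomically, which
# avoids races on a predictable name.
make_loader_path() {
  mktemp "${TMPDIR:-/tmp}/ape-loader.XXXXXX"
}
```

The trade-off is that a random name cannot be reused across runs, so the loader would be copied every time and something would have to clean up the old copies. A fixed name allows reuse but is predictable.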
