guestagent binaries are too huge (k8s.io deps should be replaced with exec("kubectl")?) #3237

Open
AkihiroSuda opened this issue Feb 14, 2025 · 7 comments

Comments

@AkihiroSuda
Member

The footprint of the guestagent binaries has grown from 6.2M + 6.6M (v0.1.0) to 46M + 45M + 46M + 48M (v1.0.6) over the past four years:

$ ls -lh lima-0.1.0-Darwin-x86_64.tar.gz 
-rw-r--r--  1 suda  staff   8.4M  2 14 14:23 lima-0.1.0-Darwin-x86_64.tar.gz

$ tar xf lima-0.1.0-Darwin-x86_64.tar.gz

$ ls -lh share/lima/lima-guestagent.Linux-*
-rw-r--r--  1 suda  staff   6.2M  5 14  2021 share/lima/lima-guestagent.Linux-aarch64
-rw-r--r--  1 suda  staff   6.6M  5 14  2021 share/lima/lima-guestagent.Linux-x86_64

$ ls -lh lima-1.0.6-Darwin-x86_64.tar.gz
-rw-r--r--  1 suda  staff    61M  2 14 14:23 lima-1.0.6-Darwin-x86_64.tar.gz

$ tar xf lima-1.0.6-Darwin-x86_64.tar.gz

$ ls -lh share/lima/lima-guestagent.Linux-*
-rw-r--r--  1 suda  staff    46M  2 13 09:04 share/lima/lima-guestagent.Linux-aarch64
-rw-r--r--  1 suda  staff    45M  2 13 09:04 share/lima/lima-guestagent.Linux-armv7l
-rw-r--r--  1 suda  staff    46M  2 13 09:04 share/lima/lima-guestagent.Linux-riscv64
-rw-r--r--  1 suda  staff    48M  2 13 09:04 share/lima/lima-guestagent.Linux-x86_64
@jandubois
Member

jandubois commented Feb 14, 2025

Most of this comes from adding the Kubernetes port forwarding (#1355) in pkg/guestagent/kubernetesservice which pulled in the k8s.io/* dependencies:

tar tvfz lima-0.14.2-Darwin-arm64.tar.gz| grep guest
-rw-r--r--  0 runner staff 7139328 23 Dec  2022 ./share/lima/lima-guestagent.Linux-x86_64
-rw-r--r--  0 runner staff 6946816 23 Dec  2022 ./share/lima/lima-guestagent.Linux-riscv64
-rw-r--r--  0 runner staff 6881280 23 Dec  2022 ./share/lima/lima-guestagent.Linux-aarch64

tar tvfz lima-0.15.0-Darwin-arm64.tar.gz| grep guest
-rw-r--r--  0 runner staff 38367232 28 Feb  2023 ./share/lima/lima-guestagent.Linux-x86_64
-rw-r--r--  0 runner staff 37027840 28 Feb  2023 ./share/lima/lima-guestagent.Linux-riscv64
-rw-r--r--  0 runner staff 36765696 28 Feb  2023 ./share/lima/lima-guestagent.Linux-aarch64

Not sure what to do about this. Are you concerned about the size of the Lima distribution, or about the amount of data we add to cidata.iso? If it is the latter, we could create a minimal guestagent without the API-based scrapers and use the full guestagent only when the "expensive" scrapers are needed.

I do expect the size to grow a little more when we add API-based scrapers for dockerd and containerd as well, but nothing like the size of the k8s libraries.

@AkihiroSuda
Member Author

AkihiroSuda commented Feb 14, 2025

> Not sure what to do about this.

Probably we can eliminate a bunch of deps just by modifying pkg/guestagent/kubernetesservice to shell out to kubectl?

> Are you concerned about the size of the Lima distribution, or about the amount of data we add to cidata.iso?

Both

AkihiroSuda changed the title from "guestagent binaries are too huge" to "guestagent binaries are too huge (k8s.io deps should be replaced with exec("kubectl")?)" on Feb 14, 2025
@jandubois
Member

> Probably we can eliminate a bunch of deps just by modifying pkg/guestagent/kubernetesservice to shell out to kubectl?

How would that work for creating a service watcher that gets notified "immediately" when the service definitions change?

One of the reasons to switch to API-based scrapers is to get rid of the 3-second delay between ports being opened/closed and the corresponding change being made to the host socket.

So I could see us adding a slimmed-down guestagent for when you don't need/want the API-based scrapers. That would reduce the size of cidata.iso, but increase the size of the Lima distro slightly more.

Not sure if any of that is worth the effort; most people don't seem to care.

@jandubois
Member

An alternative could be to implement scrapers as independent processes.

That means the hostagent would have to deal with the de-duplication when multiple scrapers report the same port changes. But I think we already have a race condition for this in the current code.
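
To make the de-duplication idea concrete, here is a minimal sketch, not Lima's actual code. It assumes each scraper process reports port-open and port-close events tagged with its own name, and that the hostagent should touch the host socket only when the first scraper reports a port and when the last one drops it. The names `deduper`, `portKey`, and the scraper labels are hypothetical, and the type is not goroutine-safe as written.

```go
package main

import "fmt"

// portKey identifies a guest port (hypothetical type for this sketch).
type portKey struct {
	Protocol string
	Port     int
}

// deduper remembers which scrapers currently report each port as open.
type deduper struct {
	sources map[portKey]map[string]struct{}
}

func newDeduper() *deduper {
	return &deduper{sources: make(map[portKey]map[string]struct{})}
}

// add records that scraper sees the port as open. It returns true only for
// the first reporter, i.e. when the forward should actually be created.
func (d *deduper) add(scraper string, k portKey) bool {
	set, known := d.sources[k]
	if !known {
		set = make(map[string]struct{})
		d.sources[k] = set
	}
	set[scraper] = struct{}{}
	return !known
}

// remove records that scraper no longer sees the port. It returns true only
// when the last reporter is gone, i.e. when the forward should be torn down.
func (d *deduper) remove(scraper string, k portKey) bool {
	set, known := d.sources[k]
	if !known {
		return false
	}
	if _, reported := set[scraper]; !reported {
		return false
	}
	delete(set, scraper)
	if len(set) > 0 {
		return false
	}
	delete(d.sources, k)
	return true
}

func main() {
	d := newDeduper()
	k := portKey{Protocol: "tcp", Port: 8080}
	fmt.Println("open forward:", d.add("kubernetes", k))
	fmt.Println("open forward:", d.add("procnet", k))
	fmt.Println("close forward:", d.remove("kubernetes", k))
	fmt.Println("close forward:", d.remove("procnet", k))
}
```

In a real hostagent this state would need a mutex, or would have to live in a single goroutine, since independent scrapers would report concurrently.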

@AkihiroSuda
Member Author

Why can't we use kubectl get --watch?

@jandubois
Member

> Why can't we use kubectl get --watch?

Maybe we can; I will discuss with @Nino-K. I think he just finished merging all the scrapers from the Rancher Desktop Windows guestagent with the Lima guestagent, so he will know better than me.

@afbjorklund
Member

afbjorklund commented Feb 14, 2025

You can use CONFIG_GUESTAGENT_COMPRESS to reduce the size of the guestagent binaries in the datadir by 75%.
