[Windows Please] #909

Open
jhay06 opened this issue Jun 18, 2022 · 57 comments
Labels
enhancement New feature or request expert help wanted Extra attention is needed roadmap Roadmap

Comments

@jhay06

jhay06 commented Jun 18, 2022

Description

Hi,
This is such great work, hoping for the Windows version.

WSL is good but very slow in most cases, I prefer to have this in Windows :) hope that this will be available soon.

Thanks

@AkihiroSuda AkihiroSuda added enhancement New feature or request help wanted Extra attention is needed expert labels Jun 18, 2022
@jandubois
Member

WSL is good but very slow in most cases

I don't see how we could improve on the performance of WSL and Hyper-V on Windows. So unless somebody else has ideas, and the ability to implement them, this is unlikely to happen.

@afbjorklund
Member

I think you can use the QEMU binaries with the WHPX acceleration, along with any available ISO, to see the "baseline" performance. If that is not enough, I guess this is more a request for running Windows containers (like Docker) ?

https://docs.microsoft.com/en-us/virtualization/windowscontainers/about/

I would be OK with being able to run VMs the same way on Windows, that is available on Mac and Linux today (i.e. QEMU). We had this with docker-machine and podman-machine, so it should be "possible" also for containerd-machine (lima)

But I don't run Windows myself, and when I do it is with something like MSYS

@afbjorklund
Member

afbjorklund commented Jun 19, 2022

There are some GOOS=windows compilation issues on master, but those should be easy to fix:

# github.com/lima-vm/lima/pkg/lockutil
pkg/lockutil/lockutil.go:34:27: undefined: unix.LOCK_EX
pkg/lockutil/lockutil.go:38:28: undefined: unix.LOCK_UN
pkg/lockutil/lockutil.go:48:10: undefined: unix.Flock
pkg/lockutil/lockutil.go:49:27: undefined: unix.EINTR
note: module requires Go 1.18
# github.com/lima-vm/lima/pkg/networks
pkg/networks/validate.go:76:25: undefined: syscall.Stat_t
note: module requires Go 1.18

The "lockutil" errors are just missing the Windows code that is already available in nerdctl. The syscall.Stat_t use needs wrapping...

https://github.com/containerd/nerdctl/tree/master/pkg/lockutil


EDIT: added:

make GOOS=windows
_output/bin/limactl: PE32+ executable (console) x86-64 (stripped to external PDB), for MS Windows

@afbjorklund
Member

afbjorklund commented Jun 19, 2022

Something like: (see MSYS2, and https://www.alpinelinux.org/downloads/)

$ /c/Program\ Files/qemu/qemu-system-x86_64 -m 512 -smp 1 \
  -accel whpx,kernel-irqchip=off -cdrom alpine-virt-3.16.0-x86_64.iso
Windows Hypervisor Platform accelerator is operational

The display looks broken (no input), but -serial stdio almost works.

ISOLINUX 6.04 6.04-pre1  Copyright (C) 1994-2015 H. Peter Anvin et al
boot:


   OpenRC 0.44.10 is starting up Linux 5.15.41-0-virt (x86_64)

 * /proc is already mounted
 * Mounting /run ... * /run/openrc: creating directory
 * /run/lock: creating directory
 * /run/lock: correcting owner
 * Caching service dependencies ... [ ok ]
 * Remounting devtmpfs on /dev ... [ ok ]
 * Mounting /dev/mqueue ... [ ok ]
 * Mounting modloop  ... * Verifying modloop
 [ ok ]
 * Mounting security filesystem ... [ ok ]
 * Mounting debug filesystem ... [ ok ]
 * Mounting persistent storage (pstore) filesystem ... [ ok ]
 * Starting busybox mdev ... [ ok ]
 * Loading hardware drivers ... [ ok ]
 * Loading modules ... [ ok ]
 * Setting system clock using the hardware clock [UTC] ... [ ok ]
 * Checking local filesystems  ... [ ok ]
 * Remounting filesystems ... [ ok ]
 * Mounting local filesystems ... [ ok ]
 * Configuring kernel parameters ... [ ok ]
 * Migrating /var/lock to /run/lock ... [ ok ]
 * Creating user login records ... [ ok ]
 * Cleaning /tmp directory ... [ ok ]
 * Setting hostname ... [ ok ]
 * Starting busybox syslog ... [ ok ]
 * Starting firstboot ... [ ok ]

Welcome to Alpine Linux 3.16
Kernel 5.15.41-0-virt on an x86_64 (/dev/ttyS0)

localhost login: root
root
Welcome to Alpine!

The Alpine Wiki contains a large amount of how-to guides and general
information about administrating Alpine systems.
See <http://wiki.alpinelinux.org/>.

You can setup the system with the command: setup-alpine

You may change this message by editing /etc/motd.

localhost:~#

EDIT: The kernel-irqchip thing was a workaround for a startup error:

whpx: injection failed, MSI (0, 0) delivery: 0, dest_mode: 0, trigger mode: 0, vector: 0, lost (c0350005)

And with "almost works", I mean this console has some weird issues:

localhost:~# apk add containerd
pk add containerd
-ash: pk: not found

EDIT: -display sdl works (better than "gtk")

qemu-whpx-efi-sdl

Here the console interaction works better.

@afbjorklund
Member

afbjorklund commented Jun 19, 2022

So lima "works", and qemu "works". Left to do is making them work together, and add some documentation. 😃

Accelerator: https://docs.microsoft.com/en-us/virtualization/api/hypervisor-platform/hypervisor-platform (WHPX)

WSL is good but very slow in most cases, I prefer to have this in Windows :)

The main difference between WSL2 and Lima, is that lima uses a new virtual machine for each instance...
With the Windows Subsystem for Linux, all the system containers share the same VM kernel (like in LXC)

It is possible to start one Linux distribution (like Alpine), and then start system containers for Ubuntu or whatever.
Then the experience should be similar, same goes with sharing files - if opting in to use 9p (same as WSL uses)

@afbjorklund
Member

afbjorklund commented Jun 26, 2022

Almost got it to run, final hurdle is converting paths for qemu (dos, argh) and scripts for ssh (don't ask)

"[hostagent] qemu[stderr]: C:\\Program Files\\qemu\\qemu-system-x86_64.exe: cannot create PID file: Failed to create PID file"
"[hostagent] stdout=\"\", stderr=\"command-line line 0: invalid quotes\\r\\n\", err=failed to execute script \"ssh\": stdout=\"\", stderr=\"command-line line 0: invalid quotes\\r\\n\": exit status 255"

Fixes (PRs):

  • lima compiles for GOOS=windows, cross-compiled on Linux
  • unit tests run for GOOS=windows, using wine64 on Linux

Verified:

  • regular limactl.exe operations (download, etc.) work OK on Windows 10
  • starting a virtual machine with hardware acceleration works on Windows 10

Fallbacks:

  • fallback to user "lima" using existing code, due to DOMAIN\user
  • use id -u and id -g where available, otherwise fallback uid gid
  • add home directory to the LimaUser, instead of using it "raw"
  • use cygpath $HOME where available, otherwise just use "filepath"
  • use windows paths (filepath) for host home and unix paths (path) for guest home
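
The "cygpath where available, otherwise filepath" fallback above could look roughly like the following. This is a minimal sketch, not the code from the PR: the names toUnixPath and hostPathForShell are hypothetical, and the string fallback assumes MSYS2-style /c/... paths (Cygwin's cygpath would print /cygdrive/c/... instead).

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// toUnixPath converts a drive-letter Windows path such as `C:\Users\me`
// to the MSYS2-style form `/c/Users/me`. Hypothetical fallback for when
// cygpath is not on PATH; paths without a drive letter only get their
// backslashes flipped.
func toUnixPath(p string) string {
	p = strings.ReplaceAll(p, `\`, "/")
	if len(p) >= 2 && p[1] == ':' {
		return "/" + strings.ToLower(p[:1]) + p[2:]
	}
	return p
}

// hostPathForShell (hypothetical name) prefers `cygpath -u` when it is
// available and falls back to the pure string conversion otherwise.
func hostPathForShell(p string) string {
	if _, err := exec.LookPath("cygpath"); err == nil {
		if out, err := exec.Command("cygpath", "-u", p).Output(); err == nil {
			return strings.TrimSpace(string(out))
		}
	}
	return toUnixPath(p)
}

func main() {
	fmt.Println(toUnixPath(`C:\Users\me\.lima`))
	fmt.Println(hostPathForShell(`C:\Users\me\.lima`))
}
```

Keeping the string conversion separate from the cygpath call makes the fallback testable on any host, including the Linux cross-compile setup mentioned above.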

Workarounds:

User needs to add qemu, and regular tools - either MSYS2 or Git for Windows (MinGW) would work...

It's all normal programs, so it would be possible to install qemu-system-x86_64.exe and ssh.exe etc.
It does not require a Unix environment (like Cygwin) or other emulator, besides the regular QEMU (and Lima).

Using the "whpx" accelerator requires Windows with Hyper-V (Pro?), falling back to "haxm" would be possible.


Will make a PR for the fallbacks, but the rest needs a design decision - or to wait for AF_UNIX support ?

At this point it is just a proof-of-concept or technical demo, users are still recommended to use WSL2.

Note: this does not improve the performance (with Hyper-V), but it should be on par with the Mac version ?

I assume that all developers will be using Unix, and will not set up anything for PowerShell or DOS etc.

@arixmkii

AF_UNIX support ?

@afbjorklund are you aware of any activity to enable AF_UNIX for Windows builds on the QEMU side? This could benefit other projects as well, such as Podman providing podman machine with macOS-like behavior as an alternative to the WSL2 option.

@afbjorklund
Member

afbjorklund commented Jun 27, 2022

Sorry, I don't know anything about it. The information I stumbled upon so far looked more like "gross hacks" than anything else.

https://cygwin.com/pipermail/cygwin/2020-June/245088.html

https://stackoverflow.com/questions/23086038/what-mechanism-is-used-by-msys-cygwin-to-emulate-unix-domain-sockets

I will assume that Unix sockets are unavailable on Windows

@afbjorklund
Member

afbjorklund commented Jun 27, 2022

Afaik, Podman only uses Unix sockets for legacy (pre 18.09) Docker clients ? The other clients use SSH directly

@arixmkii

Podman machine uses unix socket for qmp at least

-qmp unix://var/folders/<redacted>/T/podman/qmp_podman-machine-default.sock,server=on,wait=off 

And looks like for virtio-serial device

-device virtio-serial -chardev socket,path=/var/folders/<redacted>/T/podman/podman-machine-default_ready.sock,server=on,wait=off,id=podman-machine-default_ready 

These are extracts from podman machine start command line running on MacOS.

@afbjorklund
Member

afbjorklund commented Jun 27, 2022

Oh, I thought you meant for the podman connection...
(Formerly known as CONTAINER_HOST or PODMAN_USER/PODMAN_HOST/PODMAN_PORT)

@arixmkii

arixmkii commented Jun 27, 2022

No. I was talking about podman machine command and framework specifically. Having AF_UNIX in QEMU windows build could reduce the amount of platform specific branches to implement QEMU backed podman machine command for modern Windows versions.

@afbjorklund
Member

afbjorklund commented Jun 27, 2022

For the PoC, I just used -chardev pipe (and mkfifo) for the qemu control.

       -chardev pipe,id=id,path=path
           Create a two-way connection to the guest. The behaviour differs slightly between Windows hosts and other hosts:

           On Windows, a single duplex pipe will be created at \\.\pipe\path.

           On other hosts, 2 pipes will be created called path.in and path.out. Data written to path.in will be received by the guest. Data written by the guest can be read
           from path.out. QEMU will not create these fifos, and requires them to be present.

           path forms part of the pipe path as described above. path is required.

Didn't bother creating a qmp Monitor for Unix though, "left as an exercise"

@afbjorklund
Member

afbjorklund commented Jun 27, 2022

But otherwise, I would be happy enough if exec.Command actually worked (with filepath)

Note that the examples in this package assume a Unix system. They may not run on Windows, and they do not run in the Go Playground used by golang.org and godoc.org.

https://pkg.go.dev/os/exec#Command

On Windows, processes receive the whole command line as a single string and do their own parsing.

@afbjorklund
Member

afbjorklund commented Jun 29, 2022

Turns out that qemu doesn't start up correctly with pipe chardev. Switch them to null, and it works.

The WHPX accelerator is not compatible with -cpu max, so that needs a special case (like -cpu host)

Using Wine is too unstable to do anything but run unit tests, even with -accel tcg there are random failures.

The path issues were caused by the os.UserHomeDir value not being compatible with exec.Command...

In case your home directory is C:\Users\AndersBjörklund or something, it fails to encode it properly.

This affects the default $LIMA_HOME and ~/.ssh, so probably needs some $HOME workaround/fallback <sigh>.
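
The -cpu special case mentioned above might be sketched as below. This is only an illustration under assumptions: cpuArgs is a hypothetical helper, and the thread does not settle which CPU model WHPX accepts, so the replacement model is passed in by the caller ("qemu64" in main is just a placeholder value).

```go
package main

import "fmt"

// cpuArgs returns the -cpu arguments for QEMU. When the accelerator is
// "whpx" and the requested model is "max", it substitutes fallback,
// because WHPX is reported above to be incompatible with -cpu max.
// The fallback model is the caller's choice (an assumption here).
func cpuArgs(accel, model, fallback string) []string {
	if accel == "whpx" && model == "max" {
		model = fallback
	}
	return []string{"-cpu", model}
}

func main() {
	fmt.Println(cpuArgs("whpx", "max", "qemu64")) // placeholder fallback
	fmt.Println(cpuArgs("tcg", "max", "qemu64"))  // other accelerators keep "max"
}
```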

@afbjorklund
Member

afbjorklund commented Jun 29, 2022

More random whpx failures:

{"level":"debug","msg":"qemu[stdout]: Windows Hypervisor Platform accelerator is operational","time":"2022-06-29T17:02:37+02:00"}
{"level":"debug","msg":"qemu[stderr]: C:\\Program Files\\qemu\\qemu-system-x86_64.exe: WHPX: Failed to emulate MMIO access with EmulatorReturnStatus: 2","time":"2022-06-29T17:02:37+02:00"}
{"level":"debug","msg":"qemu[stderr]: C:\\Program Files\\qemu\\qemu-system-x86_64.exe: WHPX: Failed to exec a virtual processor","time":"2022-06-29T17:02:37+02:00"}
{"error":"exit status 3","level":"info","msg":"QEMU has exited","time":"2022-06-29T17:02:37+02:00"}

Come and go, mysteriously...

It seems to mostly affect the default (ubuntu) image, even though only the ISO / URL changes?

Not the alpine (alpine-lima) image, but unfortunately it does not include any nerdctl/containerd.

@afbjorklund
Member

afbjorklund commented Jun 29, 2022

Unfortunately, the terminal detection and signal handling is all messed up.

time="2022-06-29T17:25:44+02:00" level=info msg="Terminal is not available, proceeding without opening an editor"

If you terminate the limactl shell, then the limactl start kills the qemu.

{"level":"info","msg":"Received SIGINT, shutting down the host agent","time":"2022-06-29T17:23:14+02:00"}

@afbjorklund
Member

afbjorklund commented Jul 1, 2022

In theory, this would be the way to fix the home:

\\?\C:\Users\AndersBjörklund (UNC)

In practice, this is the only workaround that works:

C:\Users\ANDERS~1 (DOS)


Probably want to flip these internal paths back to regular again, before displaying them to the user ?

Ironically, this seems to be done using filepath.EvalSymlinks (which currently breaks LimaDir)

EDIT: Added PR, instead of hardcoded string:

@afbjorklund
Member

afbjorklund commented Jul 2, 2022

Basic operation on Windows 10, when using Alpine with a custom containerd + nerdctl installation.

limactl start template://alpine

$ limactl ls
NAME      STATUS     SSH                ARCH      CPUS    MEMORY    DISK      DIR
alpine    Running    127.0.0.1:51129    x86_64    4       4GiB      100GiB    C:\Users\ANDERS~1\.lima\alpine

$ uname
MINGW64_NT-10.0-19044

$ lima uname
Linux

$ lima sudo nerdctl version
Client:
 Version:       v0.21.0
 OS/Arch:       linux/amd64
 Git commit:    9ddf5226eabcbb7b4b43987f3b0f8d53d86d3bca

Server:
 containerd:
  Version:      v1.6.6
  GitCommit:    10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1

Note: no mounts, until the host/guest path situation is sorted out

DEBU[0002] the host home does not seem mounted, so the guest shell will have a different cwd

Note: no virtfs on Windows, which means no 9p, only sshfs mounts

ERROR: Feature virtfs cannot be enabled: virtio-9p (virtfs) requires Linux or macOS


Ubuntu template is still broken, MSYS2 terminal is still broken ("invalid quotes")

hostagent/useragent uses insecure ports, and qmp/serial sockets are disabled...

@afbjorklund
Member

afbjorklund commented Jul 2, 2022

This is the EFI bug, turns out alpine still uses BIOS:

https://gitlab.com/qemu-project/qemu/-/issues/513

Seems like a workaround is to use -bios instead ?

EDIT: Indeed, that was it (with a custom OVMF.fd)

lima-ubuntu-qemu-whpx

So now both images are working OK, with WHPX.


@AkihiroSuda AkihiroSuda added the roadmap Roadmap label Jul 19, 2022
@afbjorklund
Member

afbjorklund commented Jul 22, 2022

Fixed the quoting issues for MSYS, so now all three consoles should work (with lima)

  1. MSYS2 (msys64 subsystem)
  2. MinGW64 (Git for Windows)
  3. Command Prompt (cmd.exe)

Will push "port" and "pipe" up as drafts, and rebase and clear up the home directory...

  • port: use tcp sockets instead of unix sockets, for hostagent/guestagent
  • pipe: use named pipes instead of unix sockets, for qemu communication
  • add better handling of the external OVMF_CODE.fd, unlike the internal BIOS

@afbjorklund
Member

Typical output:

MSYS2

lima-windows-msys2

MinGW64

lima-windows-mingw64

cmd.exe

lima-windows-cmd

@AkihiroSuda
Member

Thanks a lot @afbjorklund

port: use tcp sockets instead of unix sockets, for hostagent/guestagent

This is fine until ssh.exe supports UNIX sockets, but this TCP socket has to be protected with mTLS to avoid potential attacks from malicious web sites via WebSockets.
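
For illustration, the "port" fallback under discussion amounts to choosing the listener type per platform. The sketch below is an assumption about its shape, not the draft PR's code: listenAgent is a hypothetical name, and it deliberately binds to 127.0.0.1 only. It does not add the mTLS protection requested above, so it is a starting point rather than something safe to ship.

```go
package main

import (
	"fmt"
	"log"
	"net"
)

// listenAgent (hypothetical) opens the agent listener. With useTCP set,
// it binds a loopback-only TCP socket on a random free port; otherwise
// it listens on the Unix socket at sockPath.
// NOTE: the TCP variant is unauthenticated; TLS is intentionally left out.
func listenAgent(useTCP bool, sockPath string) (net.Listener, error) {
	if useTCP {
		return net.Listen("tcp", "127.0.0.1:0")
	}
	return net.Listen("unix", sockPath)
}

func main() {
	ln, err := listenAgent(true, "")
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()
	fmt.Println(ln.Addr().Network(), ln.Addr().String())
}
```

Wrapping the returned listener with crypto/tls (tls.NewListener) and requiring client certificates would be one way to approach the mTLS requirement; that part is not shown here.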

@afbjorklund
Member

afbjorklund commented Jul 22, 2022

This is fine until ssh.exe supports UNIX sockets, but this TCP socket has to be protected with mTLS to avoid potential attacks from malicious web sites via WebSockets.

I know, that is why I left it in draft. It's the same status as Docker's port 2375: OK for testing/development, but needs port 2376 for deployment/production. Same thing with the named pipes unfortunately, currently it is using "null" instead of "pipe" in qemu.

       -chardev pipe,id=%s,path=%s
       -chardev socket,id=%s,path=%s,server=on,wait=off

Anyway, I will put the code up there for reading - hopefully there is some reasonable implementation to add tls to it (?), and hopefully there is some easy fix / patch to qemu for windows to allow it to still boot even when given the pipe option.
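
The two -chardev forms quoted above can be selected per host OS along these lines. A minimal sketch: chardevArg is a hypothetical helper, and the id and path values in main are placeholders. Note that, as reported in this thread, QEMU on Windows currently fails to boot with the pipe chardev, which is why the PoC uses "null" for now.

```go
package main

import (
	"fmt"
	"runtime"
)

// chardevArg (hypothetical) builds the value for QEMU's -chardev option:
// a named pipe on Windows, a Unix socket server elsewhere. The format
// strings are the two quoted in the comment above.
func chardevArg(goos, id, path string) string {
	if goos == "windows" {
		return fmt.Sprintf("pipe,id=%s,path=%s", id, path)
	}
	return fmt.Sprintf("socket,id=%s,path=%s,server=on,wait=off", id, path)
}

func main() {
	// Placeholder id and path, for illustration only.
	fmt.Println("-chardev", chardevArg(runtime.GOOS, "char-serial", "serial.sock"))
}
```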


Investigating some weird panic with the dns server as well, commented it out - but need to find out why it won't start...

                logrus.Debugf("Start %v server listening on: %v", network, addr)
                if e := s.ListenAndServe(); e != nil {
                        panic(e)
                }

So it remains in the "proof of concept" status, reason for pushing it is so that any Windows developer can help out.

@arixmkii

@afbjorklund Thank you for your PRs. 👍 I will explore what was the progress.

@arixmkii

OpenSSH MUX is going to be tricky. Even in msys2 or cygwin environments it works only in proxy mode, because passing FDs is not supported on Windows. But the native port of OpenSSH for Windows is currently missing AF_UNIX support (I haven't checked yet whether this is due to compiler settings or missing bits in the code).

@arixmkii

Another show stopper for now. QEMU needs this patch (I will update my builds) https://lists.gnu.org/archive/html/qemu-devel/2022-07/msg04837.html
And the original issue reported in QEMU gitlab https://gitlab.com/qemu-project/qemu/-/issues/513
They wanted to include the patch in 7.2, but it doesn't have "enough correctness", so, was demoted to a hack and excluded.

So, for the moment, my experiments are limited to deprecated/centos-7 as the only template using legacy BIOS.

@arixmkii

arixmkii commented Jan 1, 2023

Custom-built QEMU with 2 patches. Default box with UEFI. OpenSSH from msys2. Running default nginx. Automatic forwarding didn't happen, so I had to trigger it over ssh manually. File mounts were disabled during these experiments - they are next, after I solve the port forwarding issues. Unix socket (ga.sock) forwarding through ssh is done via ssh to a random local tcp port, and then from that random local tcp port to the unix socket with gocat. SSH multiplexing was switched to proxy mode as FD forwarding doesn't work on Windows (even in cygwin/msys2).

Screenshot 2023-01-01 205737

@arixmkii

arixmkii commented Jan 3, 2023

C:\qcw-utils\shells>limactl start
? Creating an instance "default" Proceed with the current configuration
time="2023-01-04T00:41:03+02:00" level=info msg="Attempting to download the image from \"https://cloud-images.ubuntu.com/releases/22.10/release-20221201/ubuntu-22.10-server-cloudimg-amd64.img\"" digest="sha256:4228fae635160ee2eeebda7b3f466e99729121958c125c6fbefe79178355d09b"
time="2023-01-04T00:41:04+02:00" level=info msg="Using cache \"C:\\\\Users\\\\Arthu\\\\AppData\\\\Local\\\\lima\\\\download\\\\by-url-sha256\\\\ba6a54c549d4809547852ec360ce975a9c17f2b960299755bb2ce033412526e2\\\\data\""
time="2023-01-04T00:41:04+02:00" level=info msg="Attempting to download the nerdctl archive from \"https://github.com/containerd/nerdctl/releases/download/v1.1.0/nerdctl-full-1.1.0-linux-amd64.tar.gz\"" digest="sha256:5440c7b3af63df2ad2c98e185e06a27b4a21eea334b05408e84f8502251d9459"
time="2023-01-04T00:41:04+02:00" level=info msg="Using cache \"C:\\\\Users\\\\Arthu\\\\AppData\\\\Local\\\\lima\\\\download\\\\by-url-sha256\\\\dabf40bf0b785bdfa60d957219520047fd7f68de336c973b22796a9aa1dcf1f6\\\\data\""
time="2023-01-04T00:41:06+02:00" level=info msg="[hostagent] Starting QEMU (hint: to watch the boot progress, see \"C:\\\\Users\\\\Arthu\\\\.lima\\\\default\\\\serial.log\")"
time="2023-01-04T00:41:06+02:00" level=info msg="SSH Local Port: 60022"
time="2023-01-04T00:41:06+02:00" level=info msg="[hostagent] Waiting for the basic requirement 1 of 1: \"ssh\""
time="2023-01-04T00:41:26+02:00" level=info msg="[hostagent] The basic requirement 1 of 1 is satisfied"
time="2023-01-04T00:41:26+02:00" level=info msg="[hostagent] Waiting for the essential requirement 1 of 3: \"ssh\""
time="2023-01-04T00:41:36+02:00" level=info msg="[hostagent] Waiting for the essential requirement 1 of 3: \"ssh\""
time="2023-01-04T00:41:36+02:00" level=info msg="[hostagent] The essential requirement 1 of 3 is satisfied"
time="2023-01-04T00:41:36+02:00" level=info msg="[hostagent] Waiting for the essential requirement 2 of 3: \"user session is ready for ssh\""
time="2023-01-04T00:41:36+02:00" level=info msg="[hostagent] The essential requirement 2 of 3 is satisfied"
time="2023-01-04T00:41:36+02:00" level=info msg="[hostagent] Waiting for the essential requirement 3 of 3: \"the guest agent to be running\""
time="2023-01-04T00:41:36+02:00" level=info msg="[hostagent] The essential requirement 3 of 3 is satisfied"
time="2023-01-04T00:41:36+02:00" level=info msg="[hostagent] Waiting for the optional requirement 1 of 2: \"systemd must be available\""
time="2023-01-04T00:41:36+02:00" level=info msg="[hostagent] The optional requirement 1 of 2 is satisfied"
time="2023-01-04T00:41:36+02:00" level=info msg="[hostagent] Waiting for the optional requirement 2 of 2: \"containerd binaries to be installed\""
time="2023-01-04T00:41:46+02:00" level=info msg="[hostagent] Not forwarding TCP 127.0.0.54:53"
time="2023-01-04T00:41:46+02:00" level=info msg="[hostagent] Not forwarding TCP 127.0.0.53:53"
time="2023-01-04T00:41:46+02:00" level=info msg="[hostagent] Not forwarding TCP [::]:22"
time="2023-01-04T00:42:16+02:00" level=info msg="[hostagent] Waiting for the optional requirement 2 of 2: \"containerd binaries to be installed\""
time="2023-01-04T00:42:31+02:00" level=info msg="[hostagent] The optional requirement 2 of 2 is satisfied"
time="2023-01-04T00:42:31+02:00" level=info msg="[hostagent] Waiting for the final requirement 1 of 1: \"boot scripts must have finished\""
time="2023-01-04T00:42:43+02:00" level=info msg="[hostagent] The final requirement 1 of 1 is satisfied"
time="2023-01-04T00:42:43+02:00" level=info msg="READY. Run `lima` to open the shell."

C:\qcw-utils\shells>lima uname -a
Linux lima-default 5.19.0-26-generic #27-Ubuntu SMP PREEMPT_DYNAMIC Wed Nov 23 20:44:15 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

C:\qcw-utils\shells>lima nerdctl run -d --name nginx -p 127.0.0.1:8080:80 nginx:alpine
docker.io/library/nginx:alpine:                                                   resolved       |++++++++++++++++++++++++++++++++++++++|
index-sha256:dd8a054d7ef030e94a6449783605d6c306c1f69c10c2fa06b66a030e0d1db793:    done           |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:0f4e03e4e0e854bafe7ce689a4c2476feb07a88a465bbc7d0f155dd89a6b00db: done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:1e415454686a67ed83fb7aaa41acb2472e7457061bcdbbf0f5143d7a1a89b36c:   done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:ee68d3549ec890fe6714463c6bfc4b33e306afc2615a08630331ee41822144c3:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c158987b05517b6f2c5913f3acef1f2182a32345a304fe357e3ace5fadcad715:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:1e35f6679faba967c30b48c7bfb6b7e25928dd6737016537ea364512dd38c6f6:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:cb9626c74200bdb3d984ee7b36a36ea70587a935037415619bf4293ed3bb17f2:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:b6334b6ace34beacddd0cb6040a9407a6878a685b5350059053044806a38780a:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:f1d1c9928c827be4879dbab8d9004fb9f53e6c004e3ed4a128f8dded83294d98:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:9b6f639ec6ead43b54f835d2df60c3f0f250d9e0443231488912deb846d111ae:    done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 4.9 s                                                                    total:  1.7 Mi (362.7 KiB/s)

3a62a72d43c08fa9603465b25d62dde5a23f01b5a5fa0ba212079c3e81d5f24f

C:\qcw-utils\shells>curl http://localhost:8080
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

C:\qcw-utils\shells>

Port forwarding is ok now. Next priorities

  1. try out sshfs mounts
  2. try out 9pfs mounts
  3. invent something smarter than using Dos (8.3) filenames, because one can get this name only for file already present and if the socket is not there the best one can get is a directory Dos name with added non Dos filename, which doesn't look safe.
  4. prepare test build from forked sources

Further down the road:

  5. prepare code for upstream
  6. bugfixes...
  7. repeat from 5.

@AkihiroSuda
Member

Thanks @arixmkii

Unix socket (ga.sock) forwarding through ssh is done via ssh to random local tcp and then random local tcp to unix socket with gocat.

TCP sockets are dangerous even on localhost, so this has to be protected with mTLS or something similar.
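
To make the mechanism concrete, the TCP-to-Unix-socket hop that gocat performs in the setup above can be sketched in plain Go as follows. This is an illustration only and an assumption about the general shape of such a relay, not gocat's implementation: relay, startEcho and demo are hypothetical names, the echo server merely stands in for the guest agent socket, and the relay has none of the protection asked for above (no TLS, no authentication).

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net"
	"os"
	"path/filepath"
)

// relay accepts TCP connections on ln and pipes each one to the Unix
// socket at sockPath, in both directions. It returns when ln is closed.
func relay(ln net.Listener, sockPath string) {
	for {
		c, err := ln.Accept()
		if err != nil {
			return
		}
		go func(c net.Conn) {
			defer c.Close()
			u, err := net.Dial("unix", sockPath)
			if err != nil {
				log.Printf("dial %s: %v", sockPath, err)
				return
			}
			defer u.Close()
			done := make(chan struct{}, 2)
			go func() { io.Copy(u, c); done <- struct{}{} }()
			go func() { io.Copy(c, u); done <- struct{}{} }()
			<-done // when one direction ends, the deferred Closes stop the other
		}(c)
	}
}

// startEcho is a stand-in for the guest agent: a Unix socket that
// echoes back whatever it receives.
func startEcho(sockPath string) (net.Listener, error) {
	ul, err := net.Listen("unix", sockPath)
	if err != nil {
		return nil, err
	}
	go func() {
		for {
			c, err := ul.Accept()
			if err != nil {
				return
			}
			go func(c net.Conn) {
				defer c.Close()
				io.Copy(c, c)
			}(c)
		}
	}()
	return ul, nil
}

// demo wires the echo server, the relay and a TCP client together and
// returns what the client reads back through the relay.
func demo(msg string) (string, error) {
	dir, err := os.MkdirTemp("", "relay")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "ga.sock")

	ul, err := startEcho(sock)
	if err != nil {
		return "", err
	}
	defer ul.Close()

	ln, err := net.Listen("tcp", "127.0.0.1:0") // loopback only, random port
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go relay(ln, sock)

	c, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer c.Close()
	if _, err := c.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(c, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := demo("ping")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(got)
}
```

Any local process (or a browser page reaching localhost) can connect to such a port, which is exactly the concern raised here.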

@arixmkii

arixmkii commented Jan 3, 2023

TCP sockets are dangerous even on localhost

Indeed they are. If the handshake becomes a part of the proxying app, then at least it would isolate the main lima code base from having to support a TCP-based GA client. But this is somewhat distant. For now I had better focus on patches for making lima more Windows compatible. I will start opening issues covering the identified isolated topics in the coming days.

Update: If the MS fork of OpenSSH ever starts supporting AF_UNIX and ControlPath options, many things would be solved automagically (dreaming here a bit).

@arixmkii

arixmkii commented Jan 4, 2023

SSHFS looks pretty dead to me. Will try 9pfs first and if this works, then will not even try sshfs, probably.

@AkihiroSuda
Member

SSHFS looks pretty dead to me.

Yes, there are active forks though

@arixmkii

arixmkii commented Jan 4, 2023

This looks better. Though, building it with msys2 support and also bringing cygfuse to msys2 looks like one more challenge. 😅

It seems that sshfs binary is needed only for guest OS. 😌

@arixmkii

arixmkii commented Jan 7, 2023

Another take on "bad" paths issue

func toCygpath(p string) string {
	if runtime.GOOS == "windows" {
		cp, _ := call([]string{"cygpath", "-u", p}, nil)
		cd := path.Dir(cp)
		cf := path.Base(cp)
		h := sha256.New()
		h.Write([]byte(cd))
		sha256_hash := hex.EncodeToString(h.Sum(nil))
		td := path.Join("/tmp", sha256_hash)
		_, err := call([]string{"test", "-d", td}, nil)
		if err == nil {
			return path.Join(td, cf)
		}
		_, err = call([]string{"ln", "-s", cd, td}, []string{"MSYS=winsymlinks:nativestrict"})
		if err == nil {
			return path.Join(td, cf)
		}
		return cp
	}
	return p
}

This utilizes symlinks. We create a symlink from a good path to the "bad" one. The good path is constructed as "/tmp/SHA256VALUEOFBADONE". The problem is that creating a symlink on a Windows host requires either elevation or the developer mode setting in the OS. And obviously running limactl and all its children in elevated mode is not a great idea. So we are left with developer mode, which is kinda okay-ish as I consider Lima a development tool, but not everyone would agree. A solution would be to provide a script, which the user could run elevated, that creates all symlinks for the specific user in advance; the code would then just use that link via path substitution if it detects one. It would be much better if ssh ControlPath would just accept all valid paths (I think it is the only place where it rejects paths with whitespace, not so sure about full unicode support).

Update 1: the good part is that we only need this magic around SSH commands, all other tools should work with Windows paths.
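
The naming scheme used by toCygpath above can be isolated as a pure function, which makes it easy to check without touching the filesystem. A small sketch: aliasFor is a hypothetical name, and it only computes the alias path; creating the symlink (and the elevation or developer mode that requires) is left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
)

// aliasFor maps a Unix-style path to /tmp/<sha256 of its directory>/<base>,
// mirroring the scheme in toCygpath above. The hex digest contains no
// spaces or non-ASCII characters, whatever the original directory was.
func aliasFor(p string) string {
	sum := sha256.Sum256([]byte(path.Dir(p)))
	return path.Join("/tmp", hex.EncodeToString(sum[:]), path.Base(p))
}

func main() {
	// Example path with a space and a non-ASCII character (illustrative only).
	fmt.Println(aliasFor("/c/Users/Anders Björklund/.lima/default/ssh.sock"))
}
```

Because the digest depends only on the directory, all sockets in one instance directory share a single alias directory, so one symlink per instance directory is enough.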

@arixmkii

arixmkii commented Jan 8, 2023

Status updates:

  1. SSHFS works in RW mode
  2. 9pfs works in RO mode
    2.1. contacted developer of the patchset on some hints about RW mode
    2.2 reported 2 bugs to the developer - current implementation can't handle directories with symlinks or Unix domain sockets.
  3. The sample above, I consider a reasonable implementation for fixing path issues. At least for the moment.

Regarding 9pfs. It did work only RO, when I tried it with Podman, so, it is consistent at least, but let's see what response I will get about that issue, because annotation to the patchset mentions Write as supported.

Will work on creating separate issues for the code parts that need improvements to support Windows.

@arixmkii

Tested this workaround #909 (comment) with cygwin. This works with cygwin symlinks, which don't require any sort of elevation or developer mode. So, in this respect, cygwin has an edge over the msys2 option.

@arixmkii

Published QEMU build with 9pfs and pflash/UEFI patches (functionally equal to what I used for my experiments, but now built with CI) https://github.com/arixmkii/qcw/releases/tag/v0.0.8

@subfuzion

Hey folks! I'm working on a book, and as someone completely enamored with the Lima experience on the Mac for demonstrating Linux system programming, it would be wonderful to see parity of experience for Windows users. Respect for all the open source contributors here -- the effort I see in this thread alone to make this happen is nothing short of amazing, so how unrealistic am I to hope that this is something that might be achieved by early 2024?

@afbjorklund
Member

Another possibility for Windows users, besides using the portable QEMU, would be a Hyper-V driver...
(it would be doable, the VirtualBox driver #1277 was not too much work to set up - for the minimum)

But I have not paid attention to what the current status is, the PoC was "working" but had security issues.

@Anutrix

Anutrix commented Jan 21, 2024

Since #1721 got merged, what's the status now? Do the docs mention how to install limactl on Windows or WSL2?

@AkihiroSuda
Member

Since #1721 got merged, what's the status now? Do the docs mention how to install limactl on Windows or WSL2?

No docs yet (contribution is wanted), but it is installed and tested in the CI like this

windows:
  name: "Windows tests"
  runs-on: windows-2022-8-cores
  timeout-minutes: 30
  steps:
  - name: Enable WSL2
    run: |
      wsl --set-default-version 2
      # Manually install the latest kernel from MSI
      Invoke-WebRequest -Uri "https://wslstorestorage.blob.core.windows.net/wslblob/wsl_update_x64.msi" -OutFile "wsl_update_x64.msi"
      $pwd = (pwd).Path
      Start-Process msiexec.exe -Wait -ArgumentList "/I $pwd\wsl_update_x64.msi /quiet"
      wsl --update
      wsl --status
      wsl --list --online
  - name: Install WSL2 distro
    timeout-minutes: 3
    run: |
      # FIXME: At least one distro has to be installed here,
      # otherwise `wsl --list --verbose` (called from Lima) fails:
      # https://github.com/lima-vm/lima/pull/1826#issuecomment-1729993334
      # The distro image itself is not consumed by Lima.
      # ------------------------------------------------------------------
      # Ubuntu-22.04: gets stuck in some infinite loop during adduser
      # OracleLinux_9_1: almostly silently fails, and just prints "Usage: adduser [options] LOGIN"
      wsl --install -d openSUSE-Leap-15.5
      wsl --list --verbose
  - name: Set gitconfig
    run: |
      git config --global core.autocrlf false
      git config --global core.eol lf
  - uses: actions/checkout@v4
    with:
      fetch-depth: 1
  - uses: actions/setup-go@v5
    with:
      go-version: 1.21.x
  - name: Unit tests
    run: go test -v ./...
  - name: Make
    run: make
  - name: Smoke test
    # Make sure the path is set properly and then run limactl
    run: |
      $env:Path = 'C:\Program Files\Git\usr\bin;' + $env:Path
      Set-ItemProperty -Path 'Registry::HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\Environment' -Name PATH -Value $env:Path
      .\_output\bin\limactl.exe start template://experimental/wsl2
  # TODO: run the full integration tests
  - name: Debug
    if: always()
    run: type C:\Users\runneradmin\.lima\wsl2\ha.stdout.log
  - name: Debug
    if: always()
    run: type C:\Users\runneradmin\.lima\wsl2\ha.stderr.log
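
The FIXME in the workflow notes that `wsl --list --verbose` fails when no distro is installed. A minimal Python sketch of a local pre-flight check for that condition is below. It is not part of Lima. The column layout (NAME, STATE, VERSION, with `*` marking the default distro) and the UTF-16LE output encoding are assumptions about how `wsl.exe` prints its table, and the function names are hypothetical.

```python
import subprocess


def parse_wsl_list(text):
    """Parse the table printed by `wsl --list --verbose` into a list of dicts."""
    distros = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for line in lines[1:]:  # skip the assumed NAME/STATE/VERSION header row
        is_default = line.lstrip().startswith("*")
        fields = line.replace("*", " ", 1).split()
        if len(fields) != 3 or not fields[2].isdigit():
            continue  # ignore rows that do not match the assumed layout
        name, state, version = fields
        distros.append(
            {"name": name, "state": state, "version": int(version), "default": is_default}
        )
    return distros


def installed_distros():
    """Run wsl.exe and return the parsed distro list (Windows only)."""
    # Assumption: wsl.exe writes UTF-16LE unless configured otherwise.
    raw = subprocess.run(
        ["wsl.exe", "--list", "--verbose"], capture_output=True, check=True
    ).stdout
    return parse_wsl_list(raw.decode("utf-16-le"))
```

With `check=True`, a non-zero exit from `wsl.exe` raises `subprocess.CalledProcessError`, so a caller can print a hint to install a distro before running `limactl`.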

@arixmkii

I returned to experimenting with Lima + QEMU on Windows. The previous big blockers were the lack of mTLS (needed because of the intermediate TCP hop between cygwin-like runtimes and native Windows) and mux behaving differently (breaking some commands and making things ugly). There were hopes that AF_UNIX would come to the OpenSSH Windows builds, but there was barely any progress, and mux support was not even in the first batch to be added.

Since then I have changed my approach. Instead of cygwin/msys2/git shell, I am going full Linux with a minimalist Alpine-based VM running in WSL2 mirrored networking mode (to get the full localhost magic). This small distro covers all the networking and utilities (like id, wslpath, ssh, ssh-keygen), while QEMU runs the actual workloads.
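
For readers who want to try mirrored networking, here is a minimal Python sketch that renders the relevant `.wslconfig` fragment. It only assumes the documented `[wsl2]` section and `networkingMode` key. The helper name `render_wslconfig` is hypothetical, and whether the experimental builds set this file up themselves is not stated in this thread.

```python
import configparser
import io


def render_wslconfig(networking_mode="mirrored"):
    """Return .wslconfig text that selects a WSL2 networking mode."""
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep the camelCase key instead of lowercasing it
    cfg["wsl2"] = {"networkingMode": networking_mode}
    buf = io.StringIO()
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    # On Windows this text would go into %UserProfile%\.wslconfig,
    # followed by `wsl --shutdown` so that the setting takes effect.
    print(render_wslconfig())
```

Mirrored mode needs a recent Windows 11 and WSL release, so treat this as something to verify on your own machine.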

Why WSL2 VM instead of cygwin:

  • it is simpler to create a distribution based on Alpine than to bundle everything needed from msys2 (somewhat achievable via chroot-like options of the msys2 package manager, but definitely more involved, and Alpine feels more controllable);
  • OpenSSH just works as on other platforms - no need to work around discrepancies;
  • reverse-sshfs was an issue with some Windows paths (for example, when the username had spaces or non-ASCII characters);
  • smaller code changes compared to my prior experiments.

It is sort of stupid to have a lightweight VM next to a full-sized VM, but if it actually works, maybe it is not that stupid.

I have had some success: a web server with port forwarding runs and lima shell is operational. Next I will check whether reverse-sshfs will be a troublemaker.

@afbjorklund
Member

I think that containers work quite well in the "new" WSL of Windows 11, now that it has both cgroups v2 and systemd*.

But it would still be nice to have QEMU support on all platforms, and some kind of simple Hyper-V VM driver (without WSL)

* it even supports KVM and GUI, which was a bit surprising

Probably need the driver framework to be in place, though?

@arixmkii

This commit, arixmkii@16467b4, got me a usable QEMU setup with port forwarding, 9p, and reverse-sshfs. To hide the complexity of the WSL-hosted tools I wrote this tool: https://github.com/arixmkii/go-wsllinks So, under the extras directory in bin there are:

  • id.exe
  • realpath.exe
  • sftp-server.exe
  • ssh.exe
  • ssh-keygen.exe
  • sync_lima_file.exe
  • wslpath.exe

Lima can use them almost the same way as native tools (path translation is added where required).
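
To illustrate the idea of such wrappers (not the actual go-wsllinks implementation, which is written in Go and may work differently), here is a minimal Python sketch. It builds a `wsl.exe` command line for a tool hosted in the WSL instance and translates Windows drive paths to the default `/mnt/<drive>` mount points. The distro name `lima-infra` is taken from later in this thread. The helper names and the simple regex-based translation are assumptions; the real `wslpath` handles more cases.

```python
import re

DISTRO = "lima-infra"  # assumed name of the helper WSL instance

# Matches absolute drive paths such as C:\Users\me or C:/Users/me.
_DRIVE_PATH = re.compile(r"^([A-Za-z]):[\\/](.*)$")


def to_wsl_path(arg):
    """Map C:\\dir\\file to /mnt/c/dir/file; leave other arguments unchanged."""
    m = _DRIVE_PATH.match(arg)
    if not m:
        return arg
    drive, rest = m.groups()
    return "/mnt/" + drive.lower() + "/" + rest.replace("\\", "/")


def wsl_command(tool, args):
    """Build the argv that runs `tool` inside the WSL distro."""
    return ["wsl.exe", "-d", DISTRO, "--exec", tool] + [to_wsl_path(a) for a in args]
```

A wrapper named `ssh-keygen.exe` would then call something like `wsl_command("ssh-keygen", sys.argv[1:])` and pass the result to `subprocess.run`, forwarding stdin, stdout, and the exit code.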

I still need to finalize the build scripts for the WSL distro (for tests I manually imported Alpine and installed all the tools).

What still requires work is AF_UNIX socket forwarding. I checked that it is still possible to implement it through an intermediate TCP transport, but that will not be good without mTLS, so I hope to figure out another way.
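
The intermediate TCP transport mentioned above can be sketched as a small relay that accepts loopback TCP connections and forwards them to a Unix domain socket. This is a generic Python illustration and not the code from the experimental branch. The function names are hypothetical, and it has to run where Python exposes `AF_UNIX` (for example inside the WSL distro). It also shows the weakness: any local process can connect to the TCP port, which is why the relay is not acceptable without mTLS or some other authentication.

```python
import socket
import threading


def _pipe(src, dst):
    """Copy bytes from src to dst until src reaches EOF, then half-close dst."""
    try:
        while True:
            data = src.recv(65536)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def _handle(client, unix_path):
    """Relay one TCP client to the Unix socket in both directions."""
    upstream = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        upstream.connect(unix_path)
        t = threading.Thread(target=_pipe, args=(client, upstream), daemon=True)
        t.start()
        _pipe(upstream, client)
        t.join()
    except OSError:
        pass
    finally:
        client.close()
        upstream.close()


def serve_tcp_to_unix(unix_path, host="127.0.0.1", port=0):
    """Start the relay in background threads; return (listener, bound_port)."""
    listener = socket.create_server((host, port))

    def accept_loop():
        while True:
            try:
                client, _ = listener.accept()
            except OSError:
                return  # listener was closed
            threading.Thread(
                target=_handle, args=(client, unix_path), daemon=True
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener, listener.getsockname()[1]
```

Passing `port=0` lets the OS pick a free port. No authentication is performed at any point, so this is for experimentation only.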

I also need to test WSL driver support. I definitely broke it, so it will need fixing.

I plan to finalize the WSL distro work and then set up CI to create at least one test build for sharing. Then I will do another round of evaluating options for AF_UNIX forwarding support.

@arixmkii

I think that containers work quite well in the "new" WSL of Windows 11,

No doubts here. From my point of view, the powerful VM provisioning provided by Lima is as great as the container experience it gives. Having this option available would be beneficial.

@arixmkii

I built the very first artifacts from my experimental code with CI, and they are available here: https://github.com/arixmkii/qcw/releases/tag/v0.0.28 They are highly experimental and for evaluation purposes only; don't try them on your production or otherwise important systems. People interested should consult the README file for instructions. The list of code changes applied on top of the Lima sources is arixmkii@f97d2c5

@arixmkii

arixmkii commented Jan 31, 2025

Resolved the blocking issues with the WSL2 machine type in my rebuilds (containerd is experiencing issues after setup; this is yet to be investigated). Updated versions will be published here: https://github.com/arixmkii/qcw/releases/tag/v0.0.29 It is important to delete the previous version of the lima-infra WSL instance and install the new one. The build is still for evaluation purposes only and is not production ready.

Submitted some quick win fixes:

And created a backlog for other required changes:

Some other planned activities: test/evaluate with the msys2 userland instead of WSL2, and add at least basic testing to the current CI builds (the CI basics are done for the WSL flavor; I will add some QEMU variants once the thing runs against upstream QEMU, which needs #3176).
