Integrating native-proxy #501
Hi @jwindgassen , thanks for this contribution! Is the idea to replace jhsingle-native-proxy (so that code can be shared), or will that project continue on? If the latter then could there still be duplication of code? (will this need to periodically get resynced with jhsingle-native-proxy?) Or should shared code be factored out into something separate for each project to use? If there is documentation on using this, could that be integrated as well? |
That is a good question. I do not know the current state of the original project. Its most recent commit was made a few months ago, but I do not think it is abandoned. I will contact the original developer soon and ask him about the current status and his opinion on this. The proxies it uses are almost identical to the proxies here, and (I think) they were originally copied from an old version of the ones in this package. Its documentation currently lives in the README, but could be added to the docs here if we decide to add this feature.
This sounds like a great idea. I would suggest creating some kind of checklist of the features that are ported from jhsingle-native-proxy, and those that are not, during the course of this pull request (or a series of pull requests). We're using jhsingle-native-proxy heavily in jhub-apps and would love to move to just using
@ryanlovett I talked to the original developer. He welcomes the merging of his feature into Jupyter Server Proxy. There are currently no further plans for @aktech Sure. Here is a list with the features I have currently ported or plan to do so:
That's at least everything I can think of now. If you have other ideas or require something more, let me know and I will add it to the list :) |
@jwindgassen Thanks very much for asking! It sounds like this has the potential to consolidate development in the future. |
@ryanlovett @aktech I have been working on the feature over the last few weeks, and I am now happy to announce that the code is working 🎉 I welcome anyone to test the feature on their JupyterHub instance. Please let me know about any problems or errors you encounter when doing so 🙂
How to use
For testing, I like to use voila. The command to execute might look like this:
What else do you need from my side, besides the code, to be satisfied with this PR? I am currently writing a page for the docs, which I will commit soon.
@jwindgassen Thanks for the update! I'll try to test this week locally. How might a hub administrator typically configure use of this feature? For example would they set every user to launch voila via standalone proxy from Is the intent of standalone to essentially re-use the hub's auth, spawner, user storage, etc. but limit what apps users can invoke because it specifies just the one? (since jupyter server + jupyter-server-proxy enables users to launch an arbitrary number of proxied apps) |
In essence, yes. But since you need to overwrite |
Fascinating, thanks! More of an aside, but how are you customizing the spawner to launch the different applications? Edit: oh, is it jhub-apps as mentioned earlier? |
No. We have created our own custom Spawner, but it is similar to this. We have overwritten the start method, which will submit a new Job to our cluster. Inside the start script, we start jupyterhub singleuser at the end. But now that you mention it, jhub-apps might synergize quite well with the standalone feature.
The command line arguments are quite complicated, e.g. having to parse maps.
The alternative is more work, but if we were to refactor ServerProcess to be a Traitlets Configurable
jupyter-server-proxy/jupyter_server_proxy/config.py
Lines 31 to 49 in 41e588a
ServerProcess = namedtuple(
    "ServerProcess",
    [
        "name",
        "command",
        "environment",
        "timeout",
        "absolute_url",
        "port",
        "unix_socket",
        "mappath",
        "launcher_entry",
        "new_browser_tab",
        "request_headers_override",
        "rewrite_response",
        "update_last_activity",
        "raw_socket_proxy",
    ],
)
I think we could take advantage of Traitlets ability to automatically create all CLI args, and we'd have the benefit of being able to configure jupyter-standaloneproxy using an arbitrarily complex file.
What do you think? I'm happy to investigate converting ServerProxy to Traitlets.
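The idea of declaring each option once and deriving the command line from that declaration can be illustrated without Traitlets. The sketch below is a standard-library analogue using `dataclasses` and `argparse`; it is not the Traitlets mechanism itself. `ProcessOptions`, `build_parser` and `parse_options` are hypothetical names, and only three of the fields are shown, with defaults copied from the help text quoted later in this thread.

```python
import argparse
from dataclasses import dataclass, fields


@dataclass
class ProcessOptions:
    # Hypothetical subset of the ServerProcess fields (assumption, for illustration).
    timeout: int = 5
    absolute_url: bool = False
    port: int = 0


def build_parser(options_cls) -> argparse.ArgumentParser:
    """Derive one CLI option per declared field, so the two cannot drift apart."""
    parser = argparse.ArgumentParser(prog="standaloneproxy-sketch")
    for f in fields(options_cls):
        flag = "--" + f.name.replace("_", "-")
        if f.type is bool:
            # Boolean fields become switches that take no value.
            parser.add_argument(flag, dest=f.name, action="store_true", default=f.default)
        else:
            parser.add_argument(flag, dest=f.name, type=f.type, default=f.default)
    return parser


def parse_options(argv) -> ProcessOptions:
    namespace = build_parser(ProcessOptions).parse_args(argv)
    return ProcessOptions(**vars(namespace))
```

Adding a field to the dataclass adds the matching option with no further code, which is the maintenance benefit the comment above is after.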
Thanks for the suggestion, I really like that idea. This should also solve the issue with keeping CLI Arguments up with any new options added to the proxies. I will look into it :) |
I've made a start in #507 |
So I used your branch to create the CLI via traitlets. I'm no expert with traitlets, but I managed to get it working: jwindgassen@2d9eb5b I had to play around with aliases and the extra_args a bit, but I wanted to keep the CLI reasonably unchanged to before. If you have a better idea or otherwise comments on my changes over there, let me know! :) |
Sorry for the delay. Since this is a new addition to Jupyter server proxy, I think we should prioritise long-term maintenance over just replicating the previous CLI. If there's a better way to do things, we can use it as an opportunity to refactor. We also don't need to do everything in one go. For example, it's fine to initially focus on creating a functional standalone proxy that only supports standard traitlets configuration, and add additional flags in a follow-up PR.
I'm fine with how the CLI looks right now, so I'm happy to switch to this once your PR has been accepted. I am also almost done with Tests and Docs, they should be ready by the end of the week. |
Hey @jwindgassen Thanks a lot for working on this and for the ping. I'll play with it this week to provide some feedback. |
1f8855b
to
e4741d0
Compare
Ok, I have now also added Docs and Tests for the new feature. Writing proper tests for Login and Activity is a bit more difficult, since I would need to spawn a JupyterHub instance to gain full access to its API. For now, I limited it to only testing our code, since the classes I import from JupyterHub are tested over there. I also added a section to the docs, mostly targeted to developers, where I explain the different sections of the code and what features I needed to implement to make this work smoothly. If you think we need more tests for specific cases or want something to be explained in the docs in more detail, let me know. |
@jwindgassen So from your perspective this is ready to be merged. Great, and thank you! (As soon as #507 and #508 are merged, this native-proxy can be updated afterwards. But for now this PR is implemented to be independent of them. True?)
Yes. This is currently independent of #507 and functions without it. But in the future, once that has been merged, I would update the standalone feature to use @aktech Have you been able to get it running and did it work in your setup? I would highly appreciate any feedback or comments on this 🙂 @ryanlovett How do we continue for this PR? Is there anyone specific responsible for reviewing it? Is there still more you need here? Sorry for being a bit impatient, but we would like to get this feature running soon on our JupyterHub instance. |
I've left a few minor comments and suggested changes. What do you think?
In terms of next steps, I'm fine with merging if @manics thinks it's okay to go ahead and then address the related PRs later.
@jwindgassen Thanks a lot for working on this. This is really very useful. I tried replacing jhsingle-native-proxy with this branch and I was able to get it working partially (minus conda env and repo) for panel and voila apps.
To be able to completely use it, we would need ability to specify conda env and pull from repo, but I think this PR is a great start in that direction. These features can be contributed in later PRs and this PR looks complete enough to get merged IMO.
I tested for panel and voila. For panel I ran this:
jupyter standaloneproxy --debug --skip-authentication -- python -m bokeh_root_cmd.main ~/path/to/jhub-apps/jhub_apps/examples/panel_basic.py --port={port} --debug --allow-websocket-origin=127.0.0.1:8888 --server=panel
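The `{port}` placeholder in the command above is filled in before the process is launched. A minimal sketch of that substitution, assuming plain string replacement on each argument; `pick_free_port` and `render_command` are hypothetical helpers and not functions from jupyter-server-proxy:

```python
import socket


def pick_free_port() -> int:
    """Ask the OS for an unused TCP port on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def render_command(command, port, base_url="/"):
    """Replace the {port} and {base_url} placeholders in every argument."""
    replacements = {"{port}": str(port), "{base_url}": base_url}
    rendered = []
    for arg in command:
        for placeholder, value in replacements.items():
            arg = arg.replace(placeholder, value)
        rendered.append(arg)
    return rendered
```

For example, `render_command(["voila", "--port={port}", "--no-browser"], pick_free_port())` yields the argument list that would actually be executed.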
@ryanlovett If you're generally happy with this would you mind merging #507 first, then we can rebase this PR, and it'll make this PR smaller, and it'll be a lot clearer what we're adding. @aktech I don't think we should completely reproduce jhsingle-native-proxy since we have to maintain this long term- cloning git repos and setting up conda envs doesn't feel like it's in scope. However I'd hope the move to traitlets makes it much easier to extend this in a separate app, or perhaps it's as easy as wrapping it in another script? |
@aktech Very nice to hear that it worked for you out of the box. I needed to find and fix some bugs when I installed it on our system to make it work, so I am relieved it worked without much effort for you now. You currently have the Regarding the conda/env activation and git puller, I would suggest seeing how much demand there is for this feature in the future. I think it's probably quite niche, but maybe I am wrong and many people would like to use it. For now I would consider it out of scope for this PR.
@ryanlovett @manics Now that #507 is merged, should I rebase and tidy up the commits here? Or should I merge main into here and then append the required changes to the CLI? |
The `oauth_callback` requests were handled by the ProxyHandler, effectively causing the request to ping-pong between JupyterHub Login and `/oauth_callback`
d7681f1
to
941356f
Compare
Ok. So I rebased on top of the new #507 Merge. I had to refactor the There are still 2 minor changes I am thinking about implementing:
Any comments, on the changes or these ideas, are very welcome :) @manics @ryanlovett |
This looks great to me. Can it be merged? |
Sorry for the delay in reviewing this, it generally looks good to me, though I haven't yet managed to run it locally.
        pass

    def check_origin(self, origin: str = None):
        # Skip JupyterHandler.check_origin
Can you explain why it's OK to go straight to the WebSocketHandler check? Is it because https://jupyter-server.readthedocs.io/en/stable/api/jupyter_server.base.html#jupyter_server.base.handlers.JupyterHandler.check_origin implies we're skipping the more relaxed check?
The problem with JupyterHandler.check_origin is that it requires an IdentityProvider on the server to work correctly. This is currently not the case.
I looked a bit into the whole IdentityProvider structure, and we could integrate one into the StandaloneProxyServer. Additionally, JupyterHub provides a JupyterHubIdentityProvider, which can be used to add the Handler for the oauth_callback/ route automatically, while I am currently adding it manually.
But integrating it now would probably mean another restructuring and a lot of new testing.
tl;dr: I think it might be a good idea in the future to add an IdentityProvider to the StandaloneProxyServer and remove this skip, but it's not required right now.
P.S.: This is also skipped in jhsingle-native-proxy, since they removed the JupyterHandler inheritance from ProxyHandler
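For context, the plain WebSocketHandler check that the code falls back to is, as far as I understand Tornado's default, a same-host comparison between the `Origin` header and the `Host` header. A minimal standard-library sketch of that comparison follows; `same_origin` is a hypothetical name, and the real method reads the headers from the request object rather than taking them as arguments:

```python
from urllib.parse import urlparse


def same_origin(origin_header: str, host_header: str) -> bool:
    """Return True when the Origin's host[:port] equals the Host header."""
    if not origin_header or not host_header:
        return False
    # Only the network location matters; scheme and path are ignored.
    origin_host = urlparse(origin_header).netloc.lower()
    return origin_host == host_header.lower()
```

Under this rule a page served from another host cannot open a WebSocket to the proxied app, while requests coming from the hub's own host pass.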
        return WebSocketHandler.check_origin(self, origin)

    def check_xsrf_cookie(self):
        # Skip HubAuthenticated.check_xsrf_cookie
Can you explain why this is safe (assuming it is!)?
Like I mentioned below, this is also done in the original implementation in jhsingle-native-proxy here and in our implementation here.
I need to remove it here again because my inheritance order is different from the original one. There, the ProxyHandler inherits from HubOAuthenticated directly and the check is then overwritten, whereas I can only inherit from HubOAuthenticated in StandaloneHubProxyHandler and need to skip it again.
This does not mean that this is inherently safe, but I trust it enough for now. We are currently planning to have the whole standalone server checked by someone with more websec experience than me.
I think this is safe for now, but I will report back if we find it is not!
The shortened command line arguments have a mix of - and _, could we use - for all?
Wrap an arbitrary web service so it can be used in place of 'jupyterhub-
singleuser' in a JupyterHub setting.
Usage: jupyter standaloneproxy [options] -- <command>
For more details, see the jupyter-server-proxy documentation.
Options
=======
The options below are convenience aliases to configurable class-options,
as listed in the "Equivalent to" description-line of the aliases.
To see all configurable class-options for some <cmd>, use:
<cmd> --help-all
--debug
set log level to logging.DEBUG (maximize logging output)
Equivalent to: [--Application.log_level=10]
--show-config
Show the application's configuration (human-readable format)
Equivalent to: [--Application.show_config=True]
--show-config-json
Show the application's configuration (json format)
Equivalent to: [--Application.show_config_json=True]
--generate-config
generate default config file
Equivalent to: [--JupyterApp.generate_config=True]
-y
Answer yes to any questions instead of prompting.
Equivalent to: [--JupyterApp.answer_yes=True]
--absolute-url
Proxy requests default to being rewritten to ``/``. If this is True,
the absolute URL will be sent to the backend instead.
Equivalent to: [--ServerProcess.absolute_url=True]
--raw-socket-proxy
Proxy websocket requests as a raw TCP (or unix socket) stream.
In this mode, only websockets are handled, and messages are sent to the backend,
similar to running a websockify layer (https://github.com/novnc/websockify).
All other HTTP requests return 405 (and thus this will also bypass rewrite_response).
Equivalent to: [--ServerProcess.raw_socket_proxy=True]
--skip-authentication
Do not authenticate access to the server via JupyterHub. When set,
incoming requests will not be authenticated and anyone can access the
application.
WARNING: Disabling Authentication can be a major security issue.
Equivalent to: [--StandaloneProxyServer.skip_authentication=True]
--log-level=<Enum>
Set the log level by value or name.
Choices: any of [0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL']
Default: 30
Equivalent to: [--Application.log_level]
--config=<Unicode>
Full path of a config file.
Default: ''
Equivalent to: [--JupyterApp.config_file]
--absolute_url=<Bool>
Proxy requests default to being rewritten to ``/``. If this is True, the
absolute URL will be sent to the backend instead.
Default: False
Equivalent to: [--StandaloneProxyServer.absolute_url]
--environment=<Union>
A dictionary of environment variable mappings. As with the command traitlet,
``{port}``, ``{unix_socket}`` and ``{base_url}`` will be substituted.
Could also be a callable. It should return a dictionary.
Default: {}
Equivalent to: [--StandaloneProxyServer.environment]
--mappath=<Union>
Map request paths to proxied paths. Either a dictionary of request paths to
proxied paths, or a callable that takes parameter ``path`` and returns the
proxied path.
Default: {}
Equivalent to: [--StandaloneProxyServer.mappath]
--port=<Int>
The port where the proxy server can be accessed. The port is usually taken
from the `JUPYTERHUB_SERVICE_URL` environment variable or will default to
`8888`. Used to explicitely overwrite the port of the server.
Default: 0
Equivalent to: [--StandaloneProxyServer.port]
--raw_socket_proxy=<Bool>
Proxy websocket requests as a raw TCP (or unix socket) stream. In this mode,
only websockets are handled, and messages are sent to the backend, similar
to running a websockify layer (https://github.com/novnc/websockify). All
other HTTP requests return 405 (and thus this will also bypass
rewrite_response).
Default: False
Equivalent to: [--StandaloneProxyServer.raw_socket_proxy]
--request_headers_override=<Union>
A dictionary of additional HTTP headers for the proxy request. As with the
command traitlet, ``{port}``, ``{unix_socket}`` and ``{base_url}`` will be
substituted.
Default: {}
Equivalent to: [--StandaloneProxyServer.request_headers_override]
--timeout=<Int>
Timeout in seconds for the process to become ready, default 5s.
Default: 5
Equivalent to: [--StandaloneProxyServer.timeout]
--unix_socket=<Union>
If set, the service will listen on a Unix socket instead of a TCP port. Set
to True to use a socket in a new temporary folder, or a string path to a
socket. This overrides port.
Proxying websockets over a Unix socket requires Tornado >= 6.3.
Default: None
Equivalent to: [--StandaloneProxyServer.unix_socket]
--base_url=<Unicode>
Base URL where Requests will be received and proxied. Usually taken from the
"JUPYTERHUB_SERVICE_PREFIX" environment variable (or "/" when not set). Set
to overwrite.
When setting to "/foo/bar", only incoming requests starting with this prefix
will be answered by the server and proxied to the proxied app. Any other
requests will get a 404 response.
Default: ''
Equivalent to: [--StandaloneProxyServer.base_url]
--address=<Unicode>
The address where the proxy server can be accessed. The address is usually
taken from the `JUPYTERHUB_SERVICE_URL` environment variable or will default
to `127.0.0.1`. Used to explicitely overwrite the address of the server.
Default: ''
Equivalent to: [--StandaloneProxyServer.address]
--server_port=<Int>
Set the port that the service will listen on. The default is to
automatically select an unused port.
Default: 0
Equivalent to: [--StandaloneProxyServer.server_port]
--activity_interval=<Int>
Specify an interval to send regulat activity updated to the JupyterHub (in
Seconds). When enabled, the StandaloneProxy will try to send a POST request
to the JupyterHub API containing a timestamp and the name of the server. The
URL for the activity Endpoint needs to be specified in the
"JUPYTERHUB_ACTIVITY_URL" environment variable. This URL usually is
"/api/users/<user>/activity".
Set to 0 to disable activity notifications.
Default: 300
Equivalent to: [--StandaloneProxyServer.activity_interval]
--websocket_max_message_size=<Int>
Restrict the size of a message in a WebSocket connection (in bytes). Tornado
defaults to 10MiB.
Default: None
Equivalent to: [--StandaloneProxyServer.websocket_max_message_size]
command=<Union>
An optional list of strings that should be the full command to be executed.
The optional template arguments ``{port}``, ``{unix_socket}`` and
``{base_url}`` will be substituted with the port or Unix socket path the
process should listen on and the base-url of the notebook.
Could also be a callable. It should return a list.
If the command is not specified or is an empty list, the server process is
assumed to be started ahead of time and already available to be proxied to.
Default: traitlets.Undefined
Examples
--------
jupyter standaloneproxy -- voila --port={port} --no-browser /path/to/notebook.ipynb
To see all available configurables, use `--help-all`.
There are also three JupyterApp flags:
--generate-config
generate default config file
Equivalent to: [--JupyterApp.generate_config=True]
-y
Answer yes to any questions instead of prompting.
Equivalent to: [--JupyterApp.answer_yes=True]
--config=<Unicode>
Full path of a config file.
Default: ''
Equivalent to: [--JupyterApp.config_file]
-y: Is there any use for this? Otherwise add it to the exclusions.
--generate-config: doesn't work, it just starts the proxy. Easiest to add to the exclusions unless you think it's useful?
--config: silently continues if the config file is broken or non-existent. I think this is the default behaviour for Jupyter Apps but I don't think it's very helpful, especially for an app like this where the config is critical!
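One way to avoid the silent fallback would be to validate the path before handing it to the application. This is a minimal sketch only, assuming a hypothetical helper `require_config_file` that is not part of this PR or of Traitlets; it catches a missing file but not a file with broken contents:

```python
from pathlib import Path


def require_config_file(path: str) -> Path:
    """Return the config path, or fail loudly if the file does not exist."""
    config_path = Path(path).expanduser()
    if not config_path.is_file():
        # Stop early instead of silently running with default settings.
        raise FileNotFoundError(f"Config file not found: {config_path}")
    return config_path
```

Calling this on the value passed to `--config` before start-up would turn a mistyped path into an immediate error.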
    def _validate_prefix(self, proposal):
        return proposal["value"].removesuffix("/")

    skip_authentication = Bool(
I think we've used no-... for other flags in JupyterHub projects, do you think renaming this no_authentication is reasonable?
Let's do that in #521. I will need to refactor the traitlets and add the tag(config=True)s there anyway.
@manics Thank you for the review. Both Regarding the We are currently working on getting the standalone feature running on our proper JupyterHub instance. We have already found (and fixed) a few bugs in the last few days while doing that. So if it gives you more confidence, we can wait with merging this PR until the standalone feature is fully running on our hub to make sure it is working there :) |
025e540
to
9ad5c25
Compare
Great!
with this StandaloneProxyServer from our JupyterHub. |
For Xpra the start command is:
For OpenVSCode-Server the start command is:
|
@manics It would be great if this patch could be merged. |
@jwindgassen have you pushed all the fixes you're planning to make? |
At least for this PR, I am done, yes. Take a look at comments on the remaining open code reviews. IMHO, this is ready to be merged. There are still some minor things to finish (like #521), but not in here. |
@manics Any chance this can be merged? |
Hey everyone.
I was recently made aware of the desire to be able to create standalone proxies, similar to how it is done in jhsingle-native-proxy (see #1).
This would be immensely advantageous for us, so I started porting the code here recently.
There is still a lot to do and I needed to remove/comment out a few of the original features, but it is already fundamentally working as is. I am opening this PR to let you know of this and get an opinion on a few bits here and there. I will continue to improve on it in the next weeks.
In the meantime, any comments and ideas are welcome :)