diff --git a/docs/source/index.md b/docs/source/index.md index 1ece9d14..fe6f77fa 100644 --- a/docs/source/index.md +++ b/docs/source/index.md @@ -30,6 +30,7 @@ install server-process launchers arbitrary-ports-hosts +standalone ``` ## Convenience packages for popular applications diff --git a/docs/source/standalone.md b/docs/source/standalone.md new file mode 100644 index 00000000..4ad1caaa --- /dev/null +++ b/docs/source/standalone.md @@ -0,0 +1,194 @@ +(standalone)= + +# Spawning and proxying a web service from JupyterHub + +The `standalone` feature of Jupyter Server Proxy enables JupyterHub Admins to launch and proxy arbitrary web services +directly, instead of JupyterLab or Notebook. You can use Jupyter Server Proxy to spawn a single proxy, +without it being attached to a Jupyter server. The proxy securely authenticates and restricts access to authorized +users through JupyterHub, providing a unified way to access arbitrary applications securely. + +This works similarly to {ref}`proxying Server Processes `, where a server process is started and proxied. +The Proxy is usually started from the command line, often by modifying the `Spawner.cmd` in your +[JupyterHub Configuration](https://jupyterhub.readthedocs.io/en/stable/tutorial/getting-started/spawners-basics.html). + +This feature builds upon the work of [Dan Lester](https://github.com/danlester), who originally developed it in the +[jhsingle-native-proxy](https://github.com/ideonate/jhsingle-native-proxy) package. + +## Installation + +This feature has a dependency on JupyterHub and must be explicitly installed via an optional dependency: + +```shell +pip install jupyter-server-proxy[standalone] +``` + +## Usage + +The standalone proxy is controlled with the `jupyter standaloneproxy` command. You always need to specify the +{ref}`command ` of the web service that will be launched and proxied. 
Let's use +[voilà](https://github.com/voila-dashboards/voila) as an example here: + +```shell +jupyter standaloneproxy -- voila --no-browser --port={port} /path/to/some/Notebook.ipynb +``` + +Executing this command will spawn a new HTTP Server, creating the voilà dashboard and rendering the notebook. +Any template strings (like the `--port={port}`) inside the command will be automatically replaced when the command is +executed. + +The CLI has multiple advanced options to customize the proxy behavior. Execute `jupyter standaloneproxy --help` +to get a complete list of all arguments. + +### Specify the address and port + +The proxy will try to extract the address and port from the `JUPYTERHUB_SERVICE_URL` environment variable. This variable +will be set by JupyterHub. Otherwise, the server will be launched on `127.0.0.1:8888`. +You can also explicitly overwrite these values: + +```shell +jupyter standaloneproxy --address=localhost --port=8000 ... +``` + +### Disable Authentication + +For testing, it can be useful to disable the authentication with JupyterHub. Passing `--skip-authentication` will +not trigger the login process when accessing the application. + +```{warning} Disabling authentication will leave the application open to anyone! Be careful with it, +especially on multi-user systems. +``` + +### Configuration via traitlets + +Instead of using the commandline, a standalone proxy can also be configured via a `traitlets` configuration file. +The configuration file can be loaded by running `jupyter standaloneproxy --config path/to/config.py`. 
+ +The options mentioned above can also be configured in the config file: + +```python +# Specify the command to execute +c.StandaloneProxyServer.command = [ + "voila", "--no-browser", "--port={port}", "/path/to/some/Notebook.ipynb" +] + +# Specify address and port +c.StandaloneProxyServer.address = "localhost" +c.StandaloneProxyServer.port = 8000 + +# Disable authentication +c.StandaloneProxyServer.skip_authentication = True +``` + +A default config file can be emitted by running `jupyter standaloneproxy --generate-config` + +## Usage with JupyterHub + +To launch a standalone proxy with JupyterHub, you need to customize the `Spawner` inside the configuration +using `traitlets`: + +```python +c.Spawner.cmd = "jupyter-standaloneproxy" +c.Spawner.args = ["--", "voila", "--no-browser", "--port={port}", "/path/to/some/Notebook.ipynb"] +``` + +This will hard-code JupyterHub to launch voilà instead of `jupyterhub-singleuser`. In case you want to give the users +of JupyterHub the ability to select which application to launch (like selecting either JupyterLab or voilà), +you will want to make this configuration optional: + +```python +# Let users select which application start +c.Spawner.options_form = """ + + + """ + +def select_application(spawner): + application = spawner.user_options.get("application", ["lab"])[0] + if application == "voila": + spawner.cmd = "jupyter-standaloneproxy" + spawner.args = ["--", "voila", "--no-browser", "--port={port}", "/path/to/some/Notebook.ipynb"] + +c.Spawner.pre_spawn_hook = select_application +``` + +```{note} This is only a very basic implementation to show a possible approach. For a production setup, you can create +a more rigorous implementation by creating a custom `Spawner` and overwriting the appropriate functions and/or +creating a custom `spawner.html` page. +``` + +## Technical Overview + +The following section should serve as an explanation to developers of the standalone feature of jupyter-server-proxy. 
+It outlines the basic functionality and explains the different components of the code in more depth. + +### JupyterHub and jupyterhub-singleuser + +By default, JupyterHub will use the `jupyterhub-singleuser` executable when launching a new instance for a user. +This executable is usually a wrapper around the `JupyterLab` or `Notebook` application, with some +additions regarding authentication and multi-user systems. +In the standalone feature, we try to mimic these additions, but instead of using `JupyterLab` or `Notebook`, we +will wrap them around an arbitrary web application. +This will ensure direct, authenticated access to the application, without needing a Jupyter server to be running +in the background. The different additions will be discussed in more detail below. + +### Structure + +The standalone feature is built on top of the `SuperviseAndProxyHandler`, which will spawn a process and proxy +requests to this server. While this process is called _Server_ in the documentation, the term _Application_ will be +used here, to avoid confusion with the other server to which the `SuperviseAndProxyHandler` is attached. +When using jupyter-server-proxy, the proxies are attached to the Jupyter server and will proxy requests +to the application. +Since we do not want to use the Jupyter server here, we instead require an alternative server, which will be used +to attach the `SuperviseAndProxyHandler` and all the required additions from `jupyterhub-singleuser`. +For that, we use Tornado's `HTTPServer`. + +### Login and Authentication + +One central component is the authentication with the JupyterHub Server. +Any client accessing the application will need to authenticate with the JupyterHub API, which will ensure only +users themselves (or otherwise allowed users, e.g., admins) can access the application.
+The login process is started by deriving our `StandaloneHubProxyHandler` from +[jupyterhub.services.auth.HubOAuthenticated](https://github.com/jupyterhub/jupyterhub/blob/5.0.0/jupyterhub/services/auth.py#L1541) +and decorating any methods we want to authenticate with `tornado.web.authenticated`. +For the proxy, the `proxy` method delegates to `oauth_proxy`, which is decorated with `web.authenticated`; this authenticates all routes on all HTTP methods. +`HubOAuthenticated` will automatically provide the login URL for the authentication process and any +client accessing any path of our server will be redirected to the JupyterHub API. + +After a client has been authenticated with the JupyterHub API, they will be redirected back to our server. +This redirect will be received on the `/oauth_callback` path, from where we need to redirect the client back to the +root of the application. +We use the [HubOAuthCallbackHandler](https://github.com/jupyterhub/jupyterhub/blob/5.0.0/jupyterhub/services/auth.py#L1547), +another handler from the JupyterHub package, for this. +It will also cache the received OAuth state from the login so that we can skip authentication for the next requests +and do not need to go through the whole login process for each request. + +### SSL certificates + +In some JupyterHub configurations, the launched application will be configured to use an SSL certificate for requests +between the JupyterLab / Notebook and the JupyterHub API. The path of the certificate is given in the +`JUPYTERHUB_SSL_*` environment variables. We use these variables to create a new SSL Context for both +the `AsyncHTTPClient` (used for Activity Notification, see below) and the `HTTPServer`. + +### Activity Notifications + +The `jupyterhub-singleuser` will periodically send an activity notification to the JupyterHub API and inform it that +the currently running application is still active. The standalone proxy does the same, at the interval given by +`activity_interval` (300 seconds by default; 0 disables it). Whether this information is used or not depends on the specific +configuration of this JupyterHub.
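The activity notification is a small JSON document POSTed to the URL in `JUPYTERHUB_ACTIVITY_URL`. The sketch below builds only the request body, modelled on `notify_activity` in `jupyter_server_proxy/standalone/activity.py` from this change. `build_activity_body` is a hypothetical helper name used for illustration, and the timestamp format (naive UTC ISO 8601 with a trailing `Z`) is an assumption about what `jupyterhub.utils.isoformat` produces.

```python
import json
from datetime import datetime, timezone


def build_activity_body(server_name: str, now: datetime) -> str:
    """Return the JSON body of an activity notification.

    Hypothetical helper: it mirrors the payload assembled in
    notify_activity, but is not part of jupyter-server-proxy.
    """
    # Assumption: timestamps are sent as naive UTC ISO 8601 plus "Z".
    utc_naive = now.astimezone(timezone.utc).replace(tzinfo=None)
    timestamp = utc_naive.isoformat() + "Z"
    return json.dumps(
        {
            # Activity of the named server ...
            "servers": {server_name: {"last_activity": timestamp}},
            # ... and of the user as a whole.
            "last_activity": timestamp,
        }
    )


example = build_activity_body(
    "my-server", datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
)
print(example)
```

In the actual module this body is sent with an `Authorization: token ...` header built from `JUPYTERHUB_API_TOKEN`, and failed requests are retried with exponential backoff for up to a minute.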
+ +### Environment Variables + +JupyterHub uses a lot of environment variables to specify how the launched app should be run. +This list is a short overview of the variables used, what they contain, and what they are used for. + +| Variable | Explanation | Typical Value | +| --- | --- | --- | +| `JUPYTERHUB_SERVICE_URL` | URL where the server should be listening. Used to find the address and port to start the server on. | `http://127.0.0.1:5555` | +| `JUPYTERHUB_SERVICE_PREFIX` | A URL prefix under which the root of the launched application should be hosted. E.g., when set to `/user/name/`, then the root of the proxied application should be `/user/name/index.html` | `/services/service-name/` or `/user/name/` | +| `JUPYTERHUB_ACTIVITY_URL` | URL to which activity notifications are sent. | `$JUPYTERHUB_API_URL/users/name/activity` | +| `JUPYTERHUB_API_TOKEN` | Authorization token for requests to the JupyterHub API. | | +| `JUPYTERHUB_SERVER_NAME` | A name given to all apps launched by the JupyterHub. | | +| `JUPYTERHUB_SSL_KEYFILE`, `JUPYTERHUB_SSL_CERTFILE`, `JUPYTERHUB_SSL_CLIENT_CA` | Paths to keyfile, certfile and client CA for the SSL configuration | | +| `JUPYTERHUB_USER`, `JUPYTERHUB_GROUP` | Name and Group of the user for this application.
Required for Authentication | diff --git a/jupyter_server_proxy/config.py b/jupyter_server_proxy/config.py index 20d69070..f9b0800a 100644 --- a/jupyter_server_proxy/config.py +++ b/jupyter_server_proxy/config.py @@ -2,6 +2,8 @@ Traitlets based configuration for jupyter_server_proxy """ +from __future__ import annotations + import sys from textwrap import dedent, indent from warnings import warn @@ -263,60 +265,83 @@ def cats_only(response, path): """, ).tag(config=True) + def get_proxy_base_class(self) -> tuple[type | None, dict]: + """ + Return the appropriate ProxyHandler Subclass and its kwargs + """ + if self.command: + return ( + SuperviseAndRawSocketHandler + if self.raw_socket_proxy + else SuperviseAndProxyHandler + ), dict(state={}) + + if not (self.port or isinstance(self.unix_socket, str)): + warn( + f"""Server proxy {self.name} does not have a command, port number or unix_socket path. + At least one of these is required.""" + ) + return None, dict() + + return ( + RawSocketHandler if self.raw_socket_proxy else NamedLocalProxyHandler + ), dict() -def _make_proxy_handler(sp: ServerProcess): - """ - Create an appropriate handler with given parameters - """ - if sp.command: - cls = ( - SuperviseAndRawSocketHandler - if sp.raw_socket_proxy - else SuperviseAndProxyHandler - ) - args = dict(state={}) - elif not (sp.port or isinstance(sp.unix_socket, str)): - warn( - f"Server proxy {sp.name} does not have a command, port " - f"number or unix_socket path. At least one of these is " - f"required." 
- ) - return - else: - cls = RawSocketHandler if sp.raw_socket_proxy else NamedLocalProxyHandler - args = {} - - # FIXME: Set 'name' properly - class _Proxy(cls): - kwargs = args - - def __init__(self, *args, **kwargs): - super().__init__(*args, **kwargs) - self.name = sp.name - self.command = sp.command - self.proxy_base = sp.name - self.absolute_url = sp.absolute_url - if sp.command: - self.requested_port = sp.port - self.requested_unix_socket = sp.unix_socket - else: - self.port = sp.port - self.unix_socket = sp.unix_socket - self.mappath = sp.mappath - self.rewrite_response = sp.rewrite_response - self.update_last_activity = sp.update_last_activity - - def get_request_headers_override(self): - return self._realize_rendered_template(sp.request_headers_override) - - # these two methods are only used in supervise classes, but do no harm otherwise - def get_env(self): - return self._realize_rendered_template(sp.environment) - - def get_timeout(self): - return sp.timeout - - return _Proxy + def get_proxy_attributes(self) -> dict: + """ + Return the required attributes, which will be set on the proxy handler + """ + attributes = { + "name": self.name, + "command": self.command, + "proxy_base": self.name, + "absolute_url": self.absolute_url, + "mappath": self.mappath, + "rewrite_response": self.rewrite_response, + "update_last_activity": self.update_last_activity, + "request_headers_override": self.request_headers_override, + } + + if self.command: + attributes["requested_port"] = self.port + attributes["requested_unix_socket"] = self.unix_socket + attributes["environment"] = self.environment + attributes["timeout"] = self.timeout + else: + attributes["port"] = self.port + attributes["unix_socket"] = self.unix_socket + + return attributes + + def make_proxy_handler(self) -> tuple[type | None, dict]: + """ + Create an appropriate handler for this ServerProxy Configuration + """ + cls, proxy_kwargs = self.get_proxy_base_class() + if cls is None: + return None, 
proxy_kwargs + + # FIXME: Set 'name' properly + attributes = self.get_proxy_attributes() + + class _Proxy(cls): + def __init__(self, *args, **kwargs): + super().__init__(*args, **kwargs) + + for name, value in attributes.items(): + setattr(self, name, value) + + def get_request_headers_override(self): + return self._realize_rendered_template(self.request_headers_override) + + # these two methods are only used in supervise classes, but do no harm otherwise + def get_env(self): + return self._realize_rendered_template(self.environment) + + def get_timeout(self): + return self.timeout + + return _Proxy, proxy_kwargs def get_entrypoint_server_processes(serverproxy_config): @@ -332,21 +357,21 @@ def get_entrypoint_server_processes(serverproxy_config): return sps -def make_handlers(base_url, server_processes): +def make_handlers(base_url: str, server_processes: list[ServerProcess]): """ Get tornado handlers for registered server_processes """ handlers = [] - for sp in server_processes: - handler = _make_proxy_handler(sp) + for server in server_processes: + handler, kwargs = server.make_proxy_handler() if not handler: continue - handlers.append((ujoin(base_url, sp.name, r"(.*)"), handler, handler.kwargs)) - handlers.append((ujoin(base_url, sp.name), AddSlashHandler)) + handlers.append((ujoin(base_url, server.name, r"(.*)"), handler, kwargs)) + handlers.append((ujoin(base_url, server.name), AddSlashHandler)) return handlers -def make_server_process(name, server_process_config, serverproxy_config): +def make_server_process(name: str, server_process_config: dict, serverproxy_config): return ServerProcess(name=name, **server_process_config) diff --git a/jupyter_server_proxy/standalone/__init__.py b/jupyter_server_proxy/standalone/__init__.py new file mode 100644 index 00000000..117d4ac6 --- /dev/null +++ b/jupyter_server_proxy/standalone/__init__.py @@ -0,0 +1,9 @@ +from .app import StandaloneProxyServer + + +def main(): + StandaloneProxyServer.launch_instance() + + +if 
__name__ == "__main__": + main() diff --git a/jupyter_server_proxy/standalone/activity.py b/jupyter_server_proxy/standalone/activity.py new file mode 100644 index 00000000..8028ca7d --- /dev/null +++ b/jupyter_server_proxy/standalone/activity.py @@ -0,0 +1,75 @@ +import json +import os +from datetime import datetime + +from jupyterhub.utils import exponential_backoff, isoformat +from tornado import httpclient, ioloop +from tornado.log import app_log as log + + +async def notify_activity(): + """ + Regularly notify JupyterHub of activity. + See https://github.com/jupyterhub/jupyterhub/blob/4.x/jupyterhub/singleuser/extension.py#L389 + """ + + client = httpclient.AsyncHTTPClient() + last_activity_timestamp = isoformat(datetime.utcnow()) + failure_count = 0 + + activity_url = os.environ.get("JUPYTERHUB_ACTIVITY_URL") + server_name = os.environ.get("JUPYTERHUB_SERVER_NAME") + api_token = os.environ.get("JUPYTERHUB_API_TOKEN") + + if not (activity_url and server_name and api_token): + log.error( + "Could not find environment variables to send notification to JupyterHub" + ) + return + + async def notify(): + """Send Notification, return if successful""" + nonlocal failure_count + log.debug(f"Notifying Hub of activity {last_activity_timestamp}") + + req = httpclient.HTTPRequest( + url=activity_url, + method="POST", + headers={ + "Authorization": f"token {api_token}", + "Content-Type": "application/json", + }, + body=json.dumps( + { + "servers": { + server_name: {"last_activity": last_activity_timestamp} + }, + "last_activity": last_activity_timestamp, + } + ), + ) + + try: + await client.fetch(req) + return True + except httpclient.HTTPError as e: + failure_count += 1 + log.error(f"Error notifying Hub of activity: {e}") + return False + + # Try sending notification for 1 minute + await exponential_backoff( + notify, + fail_message="Failed to notify Hub of activity", + start_wait=1, + max_wait=15, + timeout=60, + ) + + if failure_count > 0: + log.info(f"Sent hub activity 
after {failure_count} retries") + + +def start_activity_update(interval): + pc = ioloop.PeriodicCallback(notify_activity, 1e3 * interval, 0.1) + pc.start() diff --git a/jupyter_server_proxy/standalone/app.py b/jupyter_server_proxy/standalone/app.py new file mode 100644 index 00000000..0b0fbfd8 --- /dev/null +++ b/jupyter_server_proxy/standalone/app.py @@ -0,0 +1,301 @@ +from __future__ import annotations + +import logging +import os +import re +import ssl +from textwrap import dedent +from urllib.parse import urlparse + +from jupyter_core.application import JupyterApp +from jupyterhub.services.auth import HubOAuthCallbackHandler +from jupyterhub.utils import make_ssl_context +from tornado import httpclient, httpserver, ioloop, web +from tornado.web import RedirectHandler +from traitlets.traitlets import Bool, Int, Unicode, default, validate + +from ..config import ServerProcess +from .activity import start_activity_update +from .proxy import make_standalone_proxy + + +class StandaloneProxyServer(JupyterApp, ServerProcess): + name = "jupyter-standalone-proxy" + description = """ + Wrap an arbitrary web service so it can be used in place of 'jupyterhub-singleuser' + in a JupyterHub setting. + + Usage: jupyter standaloneproxy [options] -- + + For more details, see the jupyter-server-proxy documentation. + """ + examples = "jupyter standaloneproxy -- voila --port={port} --no-browser /path/to/notebook.ipynb" + + base_url = Unicode( + help=""" + Base URL where Requests will be received and proxied. Usually taken from the + "JUPYTERHUB_SERVICE_PREFIX" environment variable (or "/" when not set). + Set to override. + + When setting to "/foo/bar", only incoming requests starting with this prefix will + be answered by the server and proxied to the proxied app. Any other requests will + get a 404 response. 
+ """, + ).tag(config=True) + + @default("base_url") + def _default_prefix(self): + # Python 3.8 does not support removesuffix + prefix = os.environ.get("JUPYTERHUB_SERVICE_PREFIX", "/") + # endswith() is also safe for an empty string, unlike prefix[-1] + if prefix.endswith("/"): + prefix = prefix[:-1] + return prefix + + @validate("base_url") + def _validate_prefix(self, proposal): + prefix = proposal["value"] + # endswith() is also safe for an empty string, unlike prefix[-1] + if prefix.endswith("/"): + prefix = prefix[:-1] + return prefix + + skip_authentication = Bool( + default_value=False, + help=""" + Do not authenticate access to the server via JupyterHub. When set, + incoming requests will not be authenticated and anyone can access the + application. + + WARNING: Disabling Authentication can be a major security issue. + """, + ).tag(config=True) + + address = Unicode( + help=""" + The address where the proxy server can be accessed. The address is usually taken from the `JUPYTERHUB_SERVICE_URL` + environment variable or will default to `127.0.0.1`. Used to explicitly override the address of the server. + """ + ).tag(config=True) + + @default("address") + def _default_address(self): + if os.environ.get("JUPYTERHUB_SERVICE_URL"): + url = urlparse(os.environ["JUPYTERHUB_SERVICE_URL"]) + if url.hostname: + return url.hostname + + return "127.0.0.1" + + port = Int( + help=""" + The port where the proxy server can be accessed. The port is usually taken from the `JUPYTERHUB_SERVICE_URL` + environment variable or will default to `8888`. Used to explicitly override the port of the server.
+ """ + ).tag(config=True) + + @default("port") + def _default_port(self): + if os.environ.get("JUPYTERHUB_SERVICE_URL"): + url = urlparse(os.environ["JUPYTERHUB_SERVICE_URL"]) + + if url.port: + return url.port + elif url.scheme == "http": + return 80 + elif url.scheme == "https": + return 443 + + return 8888 + + server_port = Int(default_value=0, help=ServerProcess.port.help).tag(config=True) + + activity_interval = Int( + default_value=300, + help=""" + Specify an interval to send regular activity updates to the JupyterHub (in seconds). + When enabled, the StandaloneProxy will try to send a POST request to the JupyterHub API + containing a timestamp and the name of the server. + The URL for the activity Endpoint needs to be specified in the "JUPYTERHUB_ACTIVITY_URL" + environment variable. This URL usually is "/api/users//activity". + + Set to 0 to disable activity notifications. + """, + ).tag(config=True) + + websocket_max_message_size = Int( + default_value=None, + allow_none=True, + help="Restrict the size of a message in a WebSocket connection (in bytes). 
Tornado defaults to 10MiB.", + ).tag(config=True) + + @default("command") + def _default_command(self): + # ToDo: Find a better way to do this + return self.extra_args + + def __init__(self, **kwargs): + super().__init__(**kwargs) + + # Flags for CLI + self.flags = { + **super().flags, + "absolute-url": ( + {"ServerProcess": {"absolute_url": True}}, + dedent(ServerProcess.absolute_url.help), + ), + "raw-socket-proxy": ( + {"ServerProcess": {"raw_socket_proxy": True}}, + dedent(ServerProcess.raw_socket_proxy.help), + ), + "skip-authentication": ( + {"StandaloneProxyServer": {"skip_authentication": True}}, + dedent(self.__class__.skip_authentication.help), + ), + } + + # Create an Alias to all Traits defined in ServerProcess, with some + # exceptions we do not need, for easier use of the CLI + # We don't need "command" here, as we will take it from the extra_args + ignore_traits = [ + "name", + "launcher_entry", + "new_browser_tab", + "rewrite_response", + "update_last_activity", + "command", + ] + server_process_aliases = { + trait: f"StandaloneProxyServer.{trait}" + for trait in ServerProcess.class_traits(config=True) + if trait not in ignore_traits and trait not in self.flags + } + + self.aliases = { + **super().aliases, + **server_process_aliases, + "base_url": "StandaloneProxyServer.base_url", + "address": "StandaloneProxyServer.address", + "port": "StandaloneProxyServer.port", + "server_port": "StandaloneProxyServer.server_port", + "activity_interval": "StandaloneProxyServer.activity_interval", + "websocket_max_message_size": "StandaloneProxyServer.websocket_max_message_size", + } + + def emit_alias_help(self): + yield from super().emit_alias_help() + yield "" + + # Manually yield the help for command, which we will get from extra_args + command_help = StandaloneProxyServer.class_get_trait_help( + ServerProcess.command + ).split("\n") + yield command_help[0].replace("--StandaloneProxyServer.command", "command") + yield from command_help[1:] + + def 
get_proxy_base_class(self) -> tuple[type | None, dict]: + cls, kwargs = super().get_proxy_base_class() + if cls is None: + return None, kwargs + + return make_standalone_proxy(cls, kwargs) + + def get_proxy_attributes(self) -> dict: + attributes = super().get_proxy_attributes() + + # The ProxyHandler will be listening on "{base_url}/" instead of "{base_url}/{name}". + # Needed for correct header generation of "X-Forwarded-Context", etc. + attributes["proxy_base"] = "/" + + attributes["requested_port"] = self.server_port + attributes["skip_authentication"] = self.skip_authentication + + return attributes + + def create_app(self) -> web.Application: + self.log.debug(f"Process will use port = {self.port}") + self.log.debug(f"Process will use unix_socket = {self.unix_socket}") + self.log.debug(f"Process environment: {self.environment}") + self.log.debug(f"Proxy mappath: {self.mappath}") + + settings = dict( + debug=self.log_level == logging.DEBUG, + base_url=self.base_url, + # Required for JupyterHub + hub_user=os.environ.get("JUPYTERHUB_USER", ""), + hub_group=os.environ.get("JUPYTERHUB_GROUP", ""), + cookie_secret=os.urandom(32), + ) + + if self.websocket_max_message_size: + self.log.debug( + f"Restricting WebSocket Messages to {self.websocket_max_message_size}" + ) + settings["websocket_max_message_size"] = self.websocket_max_message_size + + # Create the proxy class without arguments + proxy_handler, proxy_kwargs = self.make_proxy_handler() + + base_url = re.escape(self.base_url) + return web.Application( + [ + # Redirects from the JupyterHub might not contain a slash, so we add one here + (f"^{base_url}$", RedirectHandler, dict(url=f"{base_url}/")), + (f"^{base_url}/oauth_callback", HubOAuthCallbackHandler), + (f"^{base_url}/(.*)", proxy_handler, proxy_kwargs), + ], + **settings, + ) + + def _configure_ssl(self) -> dict | None: + # See https://github.com/jupyter-server/jupyter_server/blob/v2.0.0/jupyter_server/serverapp.py#L2053-L2073 + keyfile = 
os.environ.get("JUPYTERHUB_SSL_KEYFILE", "") + certfile = os.environ.get("JUPYTERHUB_SSL_CERTFILE", "") + client_ca = os.environ.get("JUPYTERHUB_SSL_CLIENT_CA", "") + + if not (keyfile or certfile or client_ca): + self.log.warning("Could not configure SSL") + return None + + ssl_options = {} + if keyfile: + ssl_options["keyfile"] = keyfile + if certfile: + ssl_options["certfile"] = certfile + if client_ca: + ssl_options["ca_certs"] = client_ca + + # PROTOCOL_TLS selects the highest ssl/tls protocol version that both the client and + # server support. When PROTOCOL_TLS is not available use PROTOCOL_SSLv23. + ssl_options["ssl_version"] = getattr(ssl, "PROTOCOL_TLS", ssl.PROTOCOL_SSLv23) + if ssl_options.get("ca_certs", False): + ssl_options["cert_reqs"] = ssl.CERT_REQUIRED + + # Configure HTTPClient to use SSL for Proxy Requests + ssl_context = make_ssl_context(keyfile, certfile, client_ca) + httpclient.AsyncHTTPClient.configure( + None, defaults={"ssl_options": ssl_context} + ) + + return ssl_options + + def start(self): + if self.skip_authentication: + self.log.warning("Disabling Authentication with JupyterHub Server!") + + app = self.create_app() + + ssl_options = self._configure_ssl() + http_server = httpserver.HTTPServer(app, ssl_options=ssl_options, xheaders=True) + http_server.listen(self.port, self.address) + + self.log.info(f"Starting standaloneproxy on '{self.address}:{self.port}'") + self.log.info(f"Base URL: {self.base_url!r}") + self.log.info(f"Command: {self.command}") + + # Periodically send JupyterHub Notifications, that we are still running + if self.activity_interval > 0: + self.log.info( + f"Sending Activity Notification to JupyterHub with interval={self.activity_interval}s" + ) + start_activity_update(self.activity_interval) + + ioloop.IOLoop.current().start() diff --git a/jupyter_server_proxy/standalone/proxy.py b/jupyter_server_proxy/standalone/proxy.py new file mode 100644 index 00000000..35c30991 --- /dev/null +++
b/jupyter_server_proxy/standalone/proxy.py @@ -0,0 +1,81 @@ +from __future__ import annotations + +from logging import Logger + +from jupyter_server.utils import ensure_async +from jupyterhub import __version__ as __jh_version__ +from jupyterhub.services.auth import HubOAuthenticated +from tornado import web +from tornado.log import app_log +from tornado.web import RequestHandler +from tornado.websocket import WebSocketHandler + +from ..handlers import SuperviseAndProxyHandler + + +def make_standalone_proxy( + base_proxy_class: type, proxy_kwargs: dict +) -> tuple[type | None, dict]: + if not issubclass(base_proxy_class, SuperviseAndProxyHandler): + app_log.error( + "Cannot create a 'StandaloneHubProxyHandler' from a class not inheriting from 'SuperviseAndProxyHandler'" + ) + return None, dict() + + class StandaloneHubProxyHandler(HubOAuthenticated, base_proxy_class): + """ + Base class for standalone proxies. + Will restrict access to the application by authentication with the JupyterHub API. 
+ """ + + def __init__(self, *args, **kwargs): + super().__init__(*args, **kwargs) + self.environment = {} + self.timeout = 60 + self.skip_authentication = False + + @property + def log(self) -> Logger: + return app_log + + @property + def hub_users(self): + if "hub_user" in self.settings: + return {self.settings["hub_user"]} + return set() + + @property + def hub_groups(self): + if "hub_group" in self.settings: + return {self.settings["hub_group"]} + return set() + + def set_default_headers(self): + self.set_header("X-JupyterHub-Version", __jh_version__) + + def prepare(self, *args, **kwargs): + pass + + def check_origin(self, origin: str = None): + # Skip JupyterHandler.check_origin + return WebSocketHandler.check_origin(self, origin) + + def check_xsrf_cookie(self): + # Skip HubAuthenticated.check_xsrf_cookie + pass + + def write_error(self, status_code: int, **kwargs): + # ToDo: Return proper error page, like in jupyter-server/JupyterHub + return RequestHandler.write_error(self, status_code, **kwargs) + + async def proxy(self, port, path): + if self.skip_authentication: + return await super().proxy(port, path) + else: + return await ensure_async(self.oauth_proxy(port, path)) + + @web.authenticated + async def oauth_proxy(self, port, path): + return await super().proxy(port, path) + + return StandaloneHubProxyHandler, proxy_kwargs diff --git a/pyproject.toml b/pyproject.toml index 9334e5e6..d70bcba6 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -52,6 +52,7 @@ dependencies = [ [project.optional-dependencies] test = [ + "jupyter-server-proxy[standalone]", "pytest", "pytest-asyncio", "pytest-cov", @@ -62,6 +63,9 @@ acceptance = [ "jupyter-server-proxy[test]", "robotframework-jupyterlibrary >=0.4.2", ] +standalone = [ + "jupyterhub" +] classic = [ "jupyter-server <2", "jupyterlab >=3.0.0,<4.0.0a0", @@ -79,6 +83,9 @@ Documentation = "https://jupyter-server-proxy.readthedocs.io" Source = "https://github.com/jupyterhub/jupyter-server-proxy" Tracker = 
"https://github.com/jupyterhub/jupyter-server-proxy/issues" +[project.scripts] +jupyter-standaloneproxy = "jupyter_server_proxy.standalone:main" + # hatch ref: https://hatch.pypa.io/latest/ # diff --git a/tests/test_standalone.py b/tests/test_standalone.py new file mode 100644 index 00000000..cc55548d --- /dev/null +++ b/tests/test_standalone.py @@ -0,0 +1,124 @@ +import logging +import sys +from pathlib import Path + +import pytest +from tornado import testing + +from jupyter_server_proxy.standalone import StandaloneProxyServer + +""" +Test if address and port are identified correctly +""" + + +def test_address_and_port_with_http_address(monkeypatch): + monkeypatch.setenv("JUPYTERHUB_SERVICE_URL", "http://localhost/") + proxy_server = StandaloneProxyServer() + + assert proxy_server.address == "localhost" + assert proxy_server.port == 80 + + +def test_address_and_port_with_https_address(monkeypatch): + monkeypatch.setenv("JUPYTERHUB_SERVICE_URL", "https://localhost/") + proxy_server = StandaloneProxyServer() + + assert proxy_server.address == "localhost" + assert proxy_server.port == 443 + + +def test_address_and_port_with_address_and_port(monkeypatch): + monkeypatch.setenv("JUPYTERHUB_SERVICE_URL", "http://localhost:7777/") + proxy_server = StandaloneProxyServer() + + assert proxy_server.address == "localhost" + assert proxy_server.port == 7777 + + +class _TestStandaloneBase(testing.AsyncHTTPTestCase): + runTest = None # Required for Tornado 6.1 + + unix_socket: bool + skip_authentication: bool + + def get_app(self): + command = [ + sys.executable, + str(Path(__file__).parent / "resources" / "httpinfo.py"), + "--port={port}", + "--unix-socket={unix_socket}", + ] + + proxy_server = StandaloneProxyServer( + command=command, + base_url="/some/prefix", + unix_socket=self.unix_socket, + timeout=60, + skip_authentication=self.skip_authentication, + log_level=logging.DEBUG, + ) + + return proxy_server.create_app() + + +class 
TestStandaloneProxyRedirect(_TestStandaloneBase): + """ + Ensure requests are proxied to the application. We need to disable authentication here, + as we do not want to be redirected to the JupyterHub Login. + """ + + unix_socket = False + skip_authentication = True + + def test_add_slash(self): + response = self.fetch("/some/prefix", follow_redirects=False) + + assert response.code == 301 + assert response.headers.get("Location") == "/some/prefix/" + + def test_wrong_prefix(self): + response = self.fetch("/some/other/prefix") + + assert response.code == 404 + + def test_on_prefix(self): + response = self.fetch("/some/prefix/") + assert response.code == 200 + + body = response.body.decode() + assert body.startswith("GET /") + assert "X-Forwarded-Context: /some/prefix/" in body + assert "X-Proxycontextpath: /some/prefix/" in body + + +@pytest.mark.skipif( + sys.platform == "win32", reason="Unix socket not supported on Windows" +) +class TestStandaloneProxyWithUnixSocket(_TestStandaloneBase): + unix_socket = True + skip_authentication = True + + def test_with_unix_socket(self): + response = self.fetch("/some/prefix/") + assert response.code == 200 + + body = response.body.decode() + assert body.startswith("GET /") + assert "X-Forwarded-Context: /some/prefix/" in body + assert "X-Proxycontextpath: /some/prefix/" in body + + +class TestStandaloneProxyLogin(_TestStandaloneBase): + """ + Ensure we redirect to JupyterHub login when authentication is enabled + """ + + unix_socket = False + skip_authentication = False + + def test_redirect_to_login_url(self): + response = self.fetch("/some/prefix/", follow_redirects=False) + + assert response.code == 302 + assert "Location" in response.headers
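The first three tests in `tests/test_standalone.py` pin down how the listening address and port are derived from `JUPYTERHUB_SERVICE_URL`. As a reading aid, the same rules can be sketched without traitlets. `resolve_listen_address` is a hypothetical name; the function only paraphrases `_default_address` and `_default_port` from `app.py` in this change and is not part of the package.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def resolve_listen_address(service_url: Optional[str]) -> Tuple[str, int]:
    """Hypothetical helper paraphrasing _default_address and _default_port."""
    # Fallbacks used when the variable is unset or lacks the information
    address, port = "127.0.0.1", 8888
    if service_url:
        url = urlparse(service_url)
        if url.hostname:
            address = url.hostname
        if url.port:
            port = url.port
        elif url.scheme == "http":
            port = 80  # scheme default when no explicit port is given
        elif url.scheme == "https":
            port = 443
    return address, port


for candidate in (None, "http://localhost/", "https://localhost/", "http://localhost:7777/"):
    print(candidate, "->", resolve_listen_address(candidate))
```

An explicit `--address` or `--port` on the command line overrides these defaults, since both are ordinary configurable traits.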