docs/source/operators/config-availability.md
21 additions & 21 deletions
@@ -1,6 +1,6 @@
# Availability modes
-
Enterprise Gateway can be optionally configured in one of two "availability modes": _single-instance_ or _multi-instance_. When configured, Enterprise Gateway can recover from failures and reconnect to any active remote kernels that were previously managed by the terminated EG instance. As such, both modes require that kernel session persistence also be enabled via `KernelSessionManager.enable_persistence=True`.
+
Enterprise Gateway can be optionally configured in one of two "availability modes": _standalone_ or _replication_. When configured, Enterprise Gateway can recover from failures and reconnect to any active remote kernels that were previously managed by the terminated EG instance. As such, both modes require that kernel session persistence also be enabled via `KernelSessionManager.enable_persistence=True`.
```{note}
Kernel session persistence will be automatically enabled whenever availability mode is configured.
@@ -16,13 +16,13 @@ Known issues include:
We hope to address these in future releases (depending on demand).
```
-
## Single-instance availability
+
## Standalone availability
-
_Single-instance availability_ assumes that, upon failure of the original EG instance, another EG instance will be started. Upon startup of the second instance (following the termination of the first), EG will attempt to load and reconnect to all kernels that were deemed active when the previous instance terminated. This mode is somewhat analogous to the classic HA/DR mode of _active-passive_ and is typically used when node resources are at a premium or the number of replicas (in the Kubernetes sense) must remain at 1.
+
_Standalone availability_ assumes that, upon failure of the original EG instance, another EG instance will be started. Upon startup of the second instance (following the termination of the first), EG will attempt to load and reconnect to all kernels that were deemed active when the previous instance terminated. This mode is somewhat analogous to the classic HA/DR mode of _active-passive_ and is typically used when node resources are at a premium or the number of replicas (in the Kubernetes sense) must remain at 1.
-
To enable Enterprise Gateway for 'single-instance' availability, configure `EnterpiseGatewayApp.availability_mode=single-instance` or set env `EG_AVAILABILITY_MODE=single-instance`.
+
To enable Enterprise Gateway for 'standalone' availability, configure `EnterpriseGatewayApp.availability_mode=standalone` or set the environment variable `EG_AVAILABILITY_MODE=standalone`.
-
Here's an example for starting Enterprise Gateway with single-instance availability:
+
Here's an example for starting Enterprise Gateway with standalone availability:
-
With _multi-instance availability_, multiple EG instances are operating at the same time, and fronted with some kind of reverse proxy or load balancer. Because state still resides within each `KernelManager` instance executing within a given EG instance, we strongly suggest configuring some form of _client affinity_ (a.k.a, "sticky session") to avoid node switches wherever possible since each node switch requires manual reconnection of the front-end (today).
+
With _replication availability_, multiple EG instances (or replicas) are operating at the same time, and fronted with some kind of reverse proxy or load balancer. Because state still resides within each `KernelManager` instance executing within a given EG instance, we strongly suggest configuring some form of _client affinity_ (a.k.a, "sticky session") to avoid node switches wherever possible since each node switch requires manual reconnection of the front-end (today).
```{tip}
Configuring client affinity is **strongly recommended**, otherwise functionality that relies on state within the servicing node (e.g., culling) can be affected upon node switches, resulting in incorrect behavior.
```
In this mode, when one node goes down, the subsequent request will be routed to a different node that doesn't know about the kernel. Prior to returning a `404` (not found) status code, EG will check its persisted store to determine if the kernel was managed and, if so, attempt to "hydrate" a `KernelManager` instance associated with the remote kernel. (Of course, if the kernel was running local to the downed server, chances are it cannot be _revived_.) Upon successful "hydration" the request continues as if on the originating node. Because _client affinity_ is in place, subsequent requests should continue to be routed to the "servicing node".
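The lookup order described above can be sketched in Python. This is a minimal illustration, not Enterprise Gateway's implementation: `GatewayNode`, `persisted_store`, `managers`, and `KernelNotFound` are hypothetical names, and the persisted store is modeled as a plain dictionary shared by all nodes.

```python
from dataclasses import dataclass, field


class KernelNotFound(Exception):
    """Stands in for the 404 (not found) response."""


@dataclass
class GatewayNode:
    """Hypothetical model of one EG instance in replication mode."""

    name: str
    persisted_store: dict  # shared store: kernel_id -> connection info
    managers: dict = field(default_factory=dict)  # kernels this node services

    def start_kernel(self, kernel_id: str, connection_info: dict) -> None:
        # Persist the session so that another node can pick it up later.
        self.persisted_store[kernel_id] = connection_info
        self.managers[kernel_id] = connection_info

    def get_kernel(self, kernel_id: str) -> dict:
        if kernel_id in self.managers:
            return self.managers[kernel_id]  # already serviced by this node
        # Unknown locally: consult the persisted store before giving up.
        info = self.persisted_store.get(kernel_id)
        if info is None:
            raise KernelNotFound(kernel_id)
        self.managers[kernel_id] = info  # "hydrate" a manager for the remote kernel
        return info


store = {}
node_a = GatewayNode("a", store)
node_b = GatewayNode("b", store)
node_a.start_kernel("k1", {"ip": "10.0.0.5", "shell_port": 50001})

# Node "a" goes down; the next request lands on node "b", which hydrates "k1".
hydrated = node_b.get_kernel("k1")
```

After the call, node "b" holds its own entry for `k1`, so later requests routed to it by client affinity no longer touch the persisted store.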
-
To enable Enterprise Gateway for 'multi-instance' availability, configure `EnterpiseGatewayApp.availability_mode=multi-instance` or set env `EG_AVAILABILITY_MODE=multi-instance`.
+
To enable Enterprise Gateway for 'replication' availability, configure `EnterpriseGatewayApp.availability_mode=replication` or set the environment variable `EG_AVAILABILITY_MODE=replication`.
```{attention}
-
To preserve backwards compatibility, if only kernel session persistence is enabled via `KernelSessionManager.enable_persistence=True`, the availability mode will be automatically configured to 'multi-instance' if `EnterpiseGatewayApp.availability_mode` is not configured.
+
To preserve backwards compatibility, if only kernel session persistence is enabled via `KernelSessionManager.enable_persistence=True` and `EnterpriseGatewayApp.availability_mode` is not configured, the availability mode will be automatically set to 'replication'.
```
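The rule above, combined with the earlier note that configuring an availability mode automatically enables kernel session persistence, implies a simple resolution order between the two settings. The function below is a minimal sketch of that order; `resolve_availability` and `VALID_MODES` are hypothetical names used for illustration and are not part of Enterprise Gateway.

```python
from typing import Optional, Tuple

VALID_MODES = ("standalone", "replication")


def resolve_availability(
    availability_mode: Optional[str], enable_persistence: bool
) -> Tuple[Optional[str], bool]:
    """Return the effective (availability_mode, enable_persistence) pair."""
    if availability_mode is not None:
        if availability_mode not in VALID_MODES:
            raise ValueError(f"unknown availability mode: {availability_mode!r}")
        # Configuring a mode implies kernel session persistence.
        return availability_mode, True
    if enable_persistence:
        # Backwards compatibility: persistence alone implies 'replication'.
        return "replication", True
    # Neither setting is configured: no availability mode, no persistence.
    return None, False
```

For example, `resolve_availability(None, True)` models a deployment that only sets `KernelSessionManager.enable_persistence=True` and therefore ends up in 'replication' mode.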
-
Here's an example for starting Enterprise Gateway with multi-instance availability:
+
Here's an example for starting Enterprise Gateway with replication availability:
Enabling kernel session persistence allows Jupyter Notebooks to reconnect to kernels when Enterprise Gateway is restarted and forms the basis for the _availability modes_ described above. Enterprise Gateway provides two ways of persisting kernel sessions: _File Kernel Session Persistence_ and _Webhook Kernel Session Persistence_, although others can be provided by subclassing `KernelSessionManager` (see below).
```{attention}
-
Due to its experimental nature, kernel session persistence is disabled by default. To enable this functionality, you must configure `KernelSessionManger.enable_persistence=True` or configure `EnterpriseGatewayApp.availability_mode` to either `single-instance` or `multi-instance`.
+
Due to its experimental nature, kernel session persistence is disabled by default. To enable this functionality, you must configure `KernelSessionManager.enable_persistence=True` or configure `EnterpriseGatewayApp.availability_mode` to either `standalone` or `replication`.
```
As noted above, the availability modes rely on the persisted information relative to the kernel. This information consists of the arguments and options used to launch the kernel, along with its connection information. In essence, it consists of any information necessary to re-establish communication with the kernel.
-
###File Kernel Session Persistence
+
## File Kernel Session Persistence
File Kernel Session Persistence stores kernel sessions as files in a specified directory. To enable this form of persistence, set the environment variable `EG_KERNEL_SESSION_PERSISTENCE=True` or configure `FileKernelSessionManager.enable_persistence=True`. To change the directory in which the kernel session file is being saved, either set the environment variable `EG_PERSISTENCE_ROOT` or configure `FileKernelSessionManager.persistence_root` to the directory. By default, the directory used to store a given kernel's session information is the `JUPYTER_DATA_DIR`.
```{note}
Because `FileKernelSessionManager` is the default class for kernel session persistence, configuring `EnterpriseGatewayApp.kernel_session_manager_class` to `enterprise_gateway.services.sessions.kernelsessionmanager.FileKernelSessionManager` is not necessary.
```
-
###Webhook Kernel Session Persistence
+
## Webhook Kernel Session Persistence
Webhook Kernel Session Persistence can store kernel sessions in any database. For this to work, you must create an API in front of that database. The API must include four endpoints:
@@ -112,15 +112,15 @@ To enable the webhook kernel session persistence, set the environment variable `
Because `WebhookKernelSessionManager` is not the default kernel session persistence class, an additional configuration step must be taken to instruct EG to use this class: `EnterpriseGatewayApp.kernel_session_manager_class = enterprise_gateway.services.sessions.kernelsessionmanager.WebhookKernelSessionManager`.
-
####Enabling Authentication
+
### Enabling Authentication
Enabling authentication is an option if the API requires it for requests. Set the environment variable `EG_AUTH_TYPE` or configure `WebhookKernelSessionManager.auth_type` to either `Basic` or `Digest`. If it is set to an empty string, authentication is not enabled.
Then set the environment variables `EG_WEBHOOK_USERNAME` and `EG_WEBHOOK_PASSWORD` or configure `WebhookKernelSessionManager.webhook_username` and `WebhookKernelSessionManager.webhook_password` to provide the username and password for authentication.
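When exercising a webhook API from a test script, the same three environment variables can be turned into a matching client-side credential handler. The sketch below uses only the Python standard library and is not how Enterprise Gateway itself sends credentials; `build_auth_handler` is a hypothetical helper, and matching `EG_AUTH_TYPE` case-insensitively is an assumption of this sketch.

```python
import os
import urllib.request


def build_auth_handler(url: str, env=os.environ):
    """Return a urllib auth handler for the webhook API, or None if auth is disabled."""
    auth_type = env.get("EG_AUTH_TYPE", "").strip().lower()
    if not auth_type:
        return None  # empty string: authentication is not enabled
    handlers = {
        "basic": urllib.request.HTTPBasicAuthHandler,
        "digest": urllib.request.HTTPDigestAuthHandler,
    }
    if auth_type not in handlers:
        raise ValueError(f"EG_AUTH_TYPE must be 'Basic' or 'Digest', got {auth_type!r}")
    # Register the credentials for the API's base URL, whatever realm it reports.
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(
        None,
        url,
        env.get("EG_WEBHOOK_USERNAME", ""),
        env.get("EG_WEBHOOK_PASSWORD", ""),
    )
    return handlers[auth_type](password_mgr)
```

A non-`None` result can be passed to `urllib.request.build_opener()` to obtain an opener that answers the API's authentication challenges.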
-
###Bring Your Own Kernel Session Persistence
+
## Bring Your Own Kernel Session Persistence
-
To introduce a different implementation, you must configure the kernel session manager class. Here's an example for starting Enterprise Gateway using a custom `KernelSessionManager` and 'single-instance' availability. Note that setting `--MyCustomKernelSessionManager.enable_persistence=True` is not necessary because an availability mode is specified, but displayed here for completeness:
+
To introduce a different implementation, you must configure the kernel session manager class. Here's an example for starting Enterprise Gateway using a custom `KernelSessionManager` and 'standalone' availability. Note that setting `--MyCustomKernelSessionManager.enable_persistence=True` is not necessary because an availability mode is specified, but displayed here for completeness:
Alternative persistence implementations using SQL and NoSQL databases would be ideal and, as always, contributions are welcome!
-
###Testing Kernel Session Persistence
+
## Testing Kernel Session Persistence
Once kernel session persistence has been enabled and configured, create a kernel by opening a Jupyter Notebook. Assign a value to a variable in that notebook, then terminate Enterprise Gateway using `kill -9 PID`, where `PID` is the process ID of the gateway. Restart Enterprise Gateway and refresh your notebook tab. If everything worked correctly, the variable should still be available without rerunning the cell.