Vizier Architecture
Oliver Kennedy edited this page Mar 13, 2023 · 3 revisions
The Vizier system consists of the components shown in the architecture diagram below.
Note: The following discussion pertains to Vizier 2.0
- UI: Vizier relies on an HTML/ScalaJS-based frontend for most user interactions. Code for the UI is located in vizier/ui (and joint API/UI code is located in vizier/shared).
- API: Vizier uses an API layer to manage notebook state and mediate between the components. The API may be accessed directly (e.g., by scripts) or via Vizier's UI. Code for the API is located in vizier/backend (and joint API/UI code is located in vizier/shared).
  - The Vizier API object manages the API layer, including spinning up an Akka-Http server to host it.
  - The api package contains handlers for every API call.
  - The routes file specifies all API routes.
  - The api.websocket package contains code implementing support for the project websocket.
  - The api.websocket package contains code implementing support for the spreadsheet websocket.
  - The catalog package implements the API's state model.
- Scheduler: A scheduler is responsible for evaluating dependencies between notebook cells and re-executing cells that are out of date (whether because the cell was updated or because one of its inputs changed in a new notebook version). Code for the scheduler is part of the API and is located in vizier/backend/src/info/vizierdb/scheduler.
  - The Scheduler is the main entry point to the scheduler. Its main role is to create and tear down...
  - A RunningWorkflow represents a workflow being actively executed. Its main role is to identify cells that need (re-)execution and create...
  - A RunningCell represents a cell being actively executed.
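The staleness rule described for the scheduler can be sketched in a few lines of Scala. This is a minimal illustration under stated assumptions, not Vizier's implementation: the `Cell` case class, its `reads` and `writes` fields, and the `stale` function are hypothetical names. The sketch also assumes that cells run in notebook order and that an artifact name always refers to its most recent writer.

```scala
// Hypothetical model of a notebook cell; these are NOT Vizier's actual classes.
final case class Cell(id: String, reads: Set[String], writes: Set[String])

object StaleCells {
  /** Returns the ids of cells needing (re-)execution, in notebook order.
    * A cell is stale if it was edited, or if it reads an artifact whose most
    * recent writer (earlier in the notebook) is itself stale.
    */
  def stale(cells: Seq[Cell], edited: Set[String]): Seq[String] = {
    // artifact name -> is the latest version seen so far produced by a stale cell?
    var dirty = Map.empty[String, Boolean]
    val result = Vector.newBuilder[String]
    for (cell <- cells) {
      val isStale =
        edited(cell.id) || cell.reads.exists(a => dirty.getOrElse(a, false))
      if (isStale) result += cell.id
      // This cell's outputs shadow any earlier writer of the same artifact.
      for (a <- cell.writes) dirty = dirty.updated(a, isStale)
    }
    result.result()
  }

  def main(args: Array[String]): Unit = {
    val notebook = Seq(
      Cell("load",   reads = Set.empty,   writes = Set("raw")),
      Cell("clean",  reads = Set("raw"),  writes = Set("tidy")),
      Cell("lookup", reads = Set.empty,   writes = Set("codes")),
      Cell("plot",   reads = Set("tidy"), writes = Set.empty)
    )
    // Editing "clean" marks "clean" and its downstream reader "plot" as stale.
    println(stale(notebook, edited = Set("clean")))
  }
}
```

In this example, editing the `clean` cell leaves `load` and `lookup` untouched, because neither reads anything that `clean` produces. Only `clean` and `plot` are selected for re-execution. Tracking staleness per artifact rather than per cell is one simple way to avoid re-running unrelated cells.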
- Datastore: Structured data (dataframes) and simple unstructured data (blobs) are stored in the Datastore layer. In addition to keeping track of this state, the datastore layer is responsible for managing fine-grained provenance relationships between data elements and for profiling dataset state. Vizier uses Apache Spark as a datastore layer.
  - An Artifact encodes virtually all types of data. Methods on the class provide structured representations in Scala.
- Filestore: Vizier uses a file storage layer to manage large unstructured data.
  - The Filestore provides basic functionality for translating Artifact identifiers to disk-backed storage.
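To make the Filestore's role concrete, the translation from an artifact identifier to disk-backed storage might look roughly like the sketch below. Everything here is an assumption for illustration: the `SimpleFilestore` class, its method names, the numeric identifier type, and the `artifact_<id>` naming scheme are all hypothetical. Vizier's actual on-disk layout may differ.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical file store; NOT Vizier's Filestore class.
// Assumed layout: every artifact lives at <root>/artifact_<id>.
final class SimpleFilestore(root: Path) {
  /** Translate an artifact identifier into its location on disk. */
  def pathFor(artifactId: Long): Path =
    root.resolve(s"artifact_$artifactId")

  /** Store the bytes for an artifact and return where they were written. */
  def write(artifactId: Long, data: Array[Byte]): Path = {
    Files.createDirectories(root)
    Files.write(pathFor(artifactId), data)
  }

  /** Load the bytes previously stored for an artifact. */
  def read(artifactId: Long): Array[Byte] =
    Files.readAllBytes(pathFor(artifactId))
}

object SimpleFilestoreDemo {
  def main(args: Array[String]): Unit = {
    val store = new SimpleFilestore(Files.createTempDirectory("filestore"))
    store.write(42L, "hello".getBytes(StandardCharsets.UTF_8))
    println(new String(store.read(42L), StandardCharsets.UTF_8))
  }
}
```

Keeping the identifier-to-path mapping in a single method means the rest of the system only ever handles artifact identifiers, so the storage layout can change without touching callers.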