Developer docs
These are things which are useful to refer back to. At some point in the future they might make their way into a proper docs page on RTD. These notes can be rough and might not always be up to date. If it's a quick answer then put it inline here; if it's a longer read then just link to it.
Source: https://github.com/mckinsey/vizro/pull/775
When serve_locally=True (the default), Dash serves component library resources (generally CSS and JS) through the Flask server using the _dash-component-suites route.
- For Vizro library components (currently just KPI cards), this should happen when Vizro is imported.
- For Vizro framework components (everything else), this should happen only when `Vizro()` is instantiated.
This makes our footprint as small as possible and reduces the risk of CSS name clashes when someone wants to use Vizro just as a library but doesn't instantiate `Vizro()` (not common at all now, but maybe it will be in the future).
When `serve_locally=False`, Dash switches to serving resources through `external_url`, where it's specified. For Vizro components we use jsDelivr as a CDN for this.
A few complications:
- files that aren't CSS/JS (e.g. fonts, maps) can still be served through the same routes but should not have a `<script>` or `<link>` in the HTML source. This is achieved with `dynamic=True` (see the illustrative sketch below)
- the CDN minifies CSS/JS files automatically, but some we have minified manually ourselves (currently just `vizro-bootstrap.min.css`)
- it's not possible to serve JS as modules this way, which means we can't easily do `import`/`export` in them
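As a rough illustration of what such resource entries look like (the file names, namespace and CDN URL below are placeholders and this is not Vizro's actual registration code), a Dash component library declares its CSS along these lines:

```python
# Illustrative sketch only: paths, namespace and URL are made up, not taken from Vizro.
_css_dist = [
    {
        "namespace": "vizro",
        "relative_package_path": "static/css/vizro-bootstrap.min.css",  # served via _dash-component-suites
        "external_url": "https://cdn.jsdelivr.net/...",  # used instead when serve_locally=False
    },
    {
        "namespace": "vizro",
        "relative_package_path": "static/css/fonts/some-font.woff2",
        "dynamic": True,  # still served through the same route, but no <link> tag in the HTML source
    },
]
```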
In future:
- when we release vizro-bootstrap, nothing changes for the Vizro framework. Pure Dash users would treat it like any other Bootstrap theme, i.e. set it through `external_stylesheets` pointing to the stylesheet on our CDN, or download the file to their own `assets` folder. We'd have a shortcut for this like `vizro.theme` or do it through Bootswatch if that's possible.
- ideally we would webpack our JS and ship the `.min.js` rather than just relying on the CDN to minify it. This would let us write "proper" rather than just in-browser JS and mean we benefit from tree-shaking etc. rather than just the minification that the CDN does. In reality the optimisations would make very little difference to performance, but it's kind of the "right" way to do things. It's more effort than it's worth to set up at the moment, but if we end up maintaining a bigger JS codebase we might do it
- the vizro-bootstrap.css map file + all the SASS should live in the repo so that it can be handled correctly through developer tools in a browser
The order Dash inserts stylesheets is always:
1. `external_stylesheets`
2. library stylesheets as we add on `import vizro`
3. stylesheets added through `append_css` as we do in `Vizro()`
4. user assets (these also go through `append_css` but only when the server gets its first request)
The problem was that figures.css was served in stage 2 and therefore could come before vizro-bootstrap.min.css. I hoped this wouldn't cause any issues but unfortunately it did...
So now what we do is remove the library stylesheets in `Vizro()` and then add them using the framework's `append_css` mechanism. This means that vizro-bootstrap.min.css always comes first for a framework user because we sort the stylesheets added in stage 3 to put it first (the rest are in alphabetical order). For a Dash user, it will be specified using `external_stylesheets` so will always come first anyway.
See https://github.com/mckinsey/vizro/pull/615, then altered slightly by https://github.com/mckinsey/vizro/pull/598#pullrequestreview-2200302196.
Here's the simplified callback flow when the page loads or refreshes:
1. HTTP: `page build`
    - All Vizro models return output from their `build` method.
    - This is fast because `build` doesn't rely on the data frame loading.
    - Dynamic components return a placeholder to be filled later by `on page load`.
    - They must return default original/initial values so persistence can work properly.
2. Client-side persistence is applied.
3. If there are `show_in_url` controls:
    - URL query parameters and control values update each other (URL params take precedence in this process).
4. If control values (like Slider/RangeSlider) come from the URL:
    - `update_slider_values`/`update_range_slider_values` sync the numeric text input fields.
5. Dropdown and checklist "select all" sync after the URL:
    - `update_dropdown_select_all` and `update_checklist_select_all` trigger to sync the `select-all` option in case the control `value` is changed through the URL.
6. HTTP: `on page load`
    - Dynamic placeholder components turn into a loading state.
    - All current control values are sent to the server.
    - All dynamic components get updated and replace the earlier placeholders.
    - Dynamic Vizro models return output from their `__call__` method.
    - This step may take time because it loads and processes the data frame.
    - It updates dynamic content (e.g. charts, cards, or control options).
7. Dropdown and checklist "select all" sync again after the on page load (OPL):
    - `update_dropdown_select_all` and `update_checklist_select_all` trigger to sync the `select-all` option in case the control `options` is changed.
Source: https://github.com/mckinsey/vizro/pull/580
Development
`Vizro().build(dashboard).run()` and then `python app.py`, which is what we do across our docs. This only works while you're developing, but I like recommending it as the first port of call for users because it's simple, quick and easy, like Vizro should be. There's no need to define `app`.
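A minimal sketch of that development pattern (the page content here is illustrative, not taken from the docs):

```python
# app.py
import vizro.models as vm
import vizro.plotly.express as px
from vizro import Vizro

page = vm.Page(
    title="Example page",
    components=[vm.Graph(figure=px.scatter(px.data.iris(), x="sepal_length", y="petal_width"))],
)
dashboard = vm.Dashboard(pages=[page])

# No separate `app` object needed while developing; run the dev server directly.
Vizro().build(dashboard).run()
```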
Deployment
```python
app = Vizro().build(dashboard)

# If you also want it to run during development with `python app.py` you also need this:
if __name__ == "__main__":
    app.run()
```
and then e.g. `gunicorn app:app`. The key change of this PR is that in this context there's no longer any need to define `server = app.dash.server` (although that will still work).
The integration tests in this repo do something a bit different but that's just due to some technicalities of how they run and so don't show a generally recommended pattern.
Source: https://github.com/mckinsey/vizro/pull/151
Here are the rules for how we should write code so that paths are always correctly formed (see the sketch after this list):
- always use `dash.get_relative_path` to link to pages with `href` (see the `_make_page_404_layout` example link)
- always use `dash.get_relative_path(f"/{STATIC_URL_PREFIX}/..")` to refer to built-in assets in the `static` folder (see the `_make_page_404_layout` example `html.Img`)
- always use `dash.get_asset_url` to refer to things in the user `assets` folder, e.g. the logo is done this way
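A minimal sketch of those three rules (the file names are made up, and `STATIC_URL_PREFIX` here is just a stand-in for the real constant):

```python
import dash
from dash import html

STATIC_URL_PREFIX = "vizro"  # assumption: placeholder for Vizro's actual static URL prefix constant

app = dash.Dash(__name__)  # the path helpers read the prefix from the app config, so an app must exist

link = html.A("Home", href=dash.get_relative_path("/"))  # rule 1: link to a page
built_in_img = html.Img(src=dash.get_relative_path(f"/{STATIC_URL_PREFIX}/images/logo.svg"))  # rule 2: built-in static asset
user_logo = html.Img(src=dash.get_asset_url("logo.png"))  # rule 3: file in the user's assets folder
```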
Source: https://github.com/mckinsey/vizro/pull/188
- prefer to use `None` over `html.Div(hidden=True)` in the case that we don't need to refer to the element in any way (basically whenever you don't set an `id`), e.g. `html.P(self.title) if self.title else None` (see the sketch after this list)
- prefer to use `html.Div(hidden=True)` over `None` in the case that we do need to refer to the element (basically when you do set an `id`), e.g. `html.Div(hidden=True, id="nav_panel_outer")`. Generally these can be identified by the fact that `build` return values have a type like `_NavBuildType`
- prefer to use `""` as the default value for optional fields which are `str`. These fields do not need to accept `None` values at all
Source: https://github.com/mckinsey/vizro/pull/367#issuecomment-1994052080
When it comes to using CapturedCallable we should always prefer to use the highest-level interface possible to avoid delving into private/protected things. There are basically three categories of attributes here (see the sketch after this list):
- dunder attributes like `__call__` and `__getitem__`: these are the main point of entry for any callers and should be used wherever possible
- protected attributes like `_function` and `_arguments`: OK to use if needed, but they will be removed or made into proper public things in due course, so put some thought into exactly what you're trying to do and whether you really need to use them or if you can already achieve it just with dunder attributes
- private attributes like `__arguments`: you should never need to use these
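A rough sketch of these preferences (the captured function, its arguments and the `"my_data"` data source name are made up):

```python
import pandas as pd
import plotly.graph_objects as go
from vizro.models.types import capture


@capture("graph")
def bar_chart(data_frame: pd.DataFrame, colour: str = "blue") -> go.Figure:
    return go.Figure(go.Bar(y=data_frame["y"], marker_color=colour))


captured = bar_chart(data_frame="my_data", colour="red")

print(captured["colour"])  # __getitem__: read a bound argument -> "red"
figure = captured(data_frame=pd.DataFrame({"y": [1, 2, 3]}))  # __call__: evaluate with the real data
# Prefer the two calls above to reaching into captured._function or captured._arguments.
```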
Source: https://vizro.readthedocs.io/en/stable/pages/user-guides/data/
This summary can help you quickly decide which Vizro data type and configuration to use in your future examples.
Static data:
- When to use: Use for data that does not need to be reloaded while the dashboard is running.
- Production ready: 🟢
- Performance: 🟢
- Limitations: Data can only be reloaded/refreshed by restarting the dashboard app.
- Use cases: Any time your data does not need to be reloaded while the dashboard is running.
Dynamic data:
- When to use: Use for data that does need to be reloaded while the dashboard is running.
- Production ready: 🟠 The reason is performance, which might degrade your app.
- Performance: 🟠 Your dashboard performance may suffer (use the cache to solve this) if:
    - loading your data is a slow operation, or
    - you have many figures that use dynamic data, or
    - many users use the app at the same time.
- Limitations: Performance
- Use cases: When loading your data is a fast operation and you strictly need the very latest results from the data source. For example:
    - Displaying the results of a just-finished workflow triggered by the user (e.g. some model interaction flow).
    - Repeatedly reading logs from a file.
    - Chat apps.
Dynamic data with cache:
- When to use: Use to improve app performance when dynamic data is used. Use it only when you don't need the very latest data to always be displayed.
- Production ready: SimpleCache: 🔴, FileSystemCache: 🟢, RedisCache: 🟢
- Performance: SimpleCache: 🟢, FileSystemCache: 🟠, RedisCache: 🟢
- Limitations: Loaded/displayed data can be up to the (user-specified) timeout number of seconds old, which means that the real data (at its source) and the displayed data can differ.
- Use cases: When loading your data is a slow operation or you don't need the very latest results from the data source. For example:
    - A forecast app (because you don't need the latest results).
    - Data that presents search engine results.
    - Reducing the number of external API calls (especially if you pay per API call).
Parametrised dynamic data:
- When to use: Use when the entire data source, or its version, or the chunk of it that will be loaded into the app should depend on the user's UI input.
- Production ready: Same as Dynamic data
- Performances: Same as Dynamic data
- Limitations: Same as Dynamic data + filter/parameter options are not natively updated according to the newly loaded data.
- Use cases: For example:
    - Selecting the source from which the data is going to be retrieved (e.g. selecting between linkedin_results, twitter_results and instagram_results options).
    - Displaying a certain version of the model interaction results.
    - Displaying only a chunk of big data (where the concrete chunk depends on the user's input).
P.S. Parametrised data loading is compatible with caching. The cache uses memoization, so the dynamic data function's arguments are included in the cache key. The pros/cons of using parametrised dynamic data with or without the cache are the same as the pros/cons of dynamic data with and without the cache presented above.
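As a rough sketch of how dynamic data, parametrisation and the cache fit together (the data source name, loading function and timeout below are illustrative):

```python
from flask_caching import Cache
import vizro.plotly.express as px
from vizro.managers import data_manager

# Configure the cache backend; SimpleCache and RedisCache are configured the same way.
data_manager.cache = Cache(config={"CACHE_TYPE": "FileSystemCache", "CACHE_DIR": "cache"})


def load_iris(number_of_rows: int = 150):
    # Parametrised dynamic data: the argument becomes part of the memoized cache key.
    return px.data.iris().head(number_of_rows)


data_manager["iris"] = load_iris
data_manager["iris"].timeout = 300  # displayed data may be up to 300 seconds older than the source
```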
See https://github.com/mckinsey/vizro/pull/195.
This question was posed and discussed in a PR that introduces the `vm.Title` model as a way to have an icon next to a title. The example, and the question discussed (and thus reused below), is whether a model like `vm.Dashboard` should have a field `title: Union[str, vm.Title]` or `title: vm.Title`, how before and after validators are allowed to coerce types, and how this would affect the Vizro schema.
- `title = Annotated[vm.Title, BeforeValidator(convert_str, json_schema_input_type=Union[str, vm.Title])]`, with the downside of being less clear to developers
- `title = Annotated[Union[str, vm.Title]]`, and we need to deal with the true consequences, i.e. that `title` can be both `str` and `vm.Title`
Ultimately I think we should follow pydantic, and they say:
In essence, Pydantic's primary goal is to assure that the resulting structure post-processing (termed "validation") precisely conforms to the applied type hints.
We definitely do not always obey this, but I think eventually we should. In the ideal case, our type hint in the above example would be vm.Title and this alone, and the user would need to configure this. This raises 2 important points:
- Should the user be allowed to configure additional types?
In our context this would be `str` - so what if we want to allow `Union[str, vm.Title]` (but ultimately convert to `vm.Title`) as input? I think the answer is given here: https://docs.pydantic.dev/latest/concepts/validators/#json-schema-and-field-validators
While the type hint for value is str, the cast_ints validator also allows integers. To specify the correct input type, the json_schema_input_type argument can be provided
So in our example, what we really should be doing is keeping the type hint at vm.Title, but having json_schema_input_type=Union[str, vm.Title].
Beware this is for before validators; after validators are a different story again. While pydantic recommends the above, I am not sure this is so good in our case. We are (atm) not working so much with JSON and JSON schema, and our models are the main configuration route, so I am not sure what the vm.Title type hint would do with a different JSON schema input type. Would IDEs know about this and the fact that str is also allowed? (Edit: see answer further below.) I don't think so... Vizro-AI would be happy though, and so would a schema-based GUI.
For after validators, there are some strong opinions flying about from the maestro himself - it "breaks type hints": https://github.com/pydantic/pydantic/discussions/3997#discussioncomment-3099169. I think we are doing this precise thing a fair bit lol. I agree with him actually, although of course our models are not really consumed much by anyone but us (and maybe vizro-ai) so it's not so problematic - except for the actions chain (see below)
- What if the final type is not a subset of the input type (think, for the above example, of just `json_schema_input_type=str`), or even worse, the coerced type is not even public (think of the actions chain in the case of `after` validators)?
This is problematic for the reasons described above, but it also concretely affects some of our code: `to_python`. We fixed it for the actions chain, but in essence the model you would serialize contains things that you can't re-instantiate the model with. Not fun.
In the str example, this could be hacked in the before validator by passing through any received vm.Title, but it's very ugly. In the case of the actions chain this would fail unless we hacked serialization (which we did).
So the two problematic patterns are:
- `AfterValidator`s that convert into things that are not public
- `BeforeValidator`s that convert into things that are not in the JSON schema
- It has always been our intention to, wherever practical, follow the pydantic guidance that the type hint guarantees the coerced type. Just to understand: the problem you point out with doing `title: Title` is that IDEs/mypy don't like it if you then specify `title` as a `str`? This is indeed quite annoying and I don't think there's much we can do about it, but let's double check with a search because surely (hopefully) someone must have wondered about this before(?). e.g. would this help? https://docs.pydantic.dev/latest/integrations/mypy/#init_typed
- The other problem I foresee with this is that if and when we generate API docs automatically, it's slightly confusing because the API docs here would give `vm.Title` without mentioning `str`, even though `str` is actually the most common user input. That's not a huge problem though so long as we explain it, especially since most of our narrative docs examples will make it clear that title can be a string, because that's what we'll do 99% of the time.
- It looks like without additional config, Pylance in VS Code at least complains about a `str` when it's not part of the field. Code for the screenshot below:
```python
import json
from typing import Annotated, Any, Optional, Union

from pydantic import BaseModel, BeforeValidator


class Title(BaseModel):
    text: str
    icon: Optional[str] = None


def convert_to_title(v: Any) -> Title:
    if isinstance(v, str):
        return Title(text=v)
    return v


class User(BaseModel):
    title: Annotated[
        Title,
        BeforeValidator(convert_to_title, json_schema_input_type=Union[str, Title]),
    ]


foo = User(title="Foo")
print(foo.model_dump_json())

bar = User(title=Title(text="Bar", icon="👋"))
print(bar.model_dump_json())

print(json.dumps(foo.model_json_schema(), indent=2))
```

Note this does not cause a problem with mypy by default thanks to the `init_typed` option being off by default.
See https://github.com/mckinsey/vizro/pull/1377 for examples.
Code
Any changes must include the word legacy and/or deprecat* (e.g. deprecation, deprecated, deprecate). This will make it easy to keep track of everything and ensure we remove everything we want to in the breaking version.
Warnings
Emit FutureWarning rather than DeprecationWarning to ensure users see the message. Warning message should be:
- short and sweet
- where possible, include one-line strategy for resolution in imperative tense ("replace x with y")
- state version where breaking change will be released
- link to relevant section of docs page on deprecations (unless this doesn't exist, which should only be the case for very insignificant deprecations)
How to emit these warnings:
- for a class/function/model, use `typing_extensions.deprecated`, which unlike `typing.deprecated` supports runtime checks - see PEP 702 (a sketch is shown after this list). If `typing_extensions` stops supporting this then we would instead use `Deprecated`. The message appears automatically in the API docs. We can use Markdown here so the docs version works nicely, so long as the warning is still readable when raised in the console. Use an absolute link to the deprecation docs section.
- for a field, use the `make_deprecated_field_warning` validator (message only shown in console, link to deprecation docs section is absolute). The message doesn't show in API docs so you must write one manually in the `description` and/or docstring using "❗Deprecated: " (message shown in docs so use Markdown, link to deprecation docs section is relative). Don't mark the field as `deprecated` since it only affects the JSON schema and raises unwanted warnings when looking through model attributes.
- for a warning message only shown at runtime and not in API docs, the message should be optimised for the console, and the link to the deprecation docs section is absolute.
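A minimal sketch of the class/model case (the model name, replacement and warning text are illustrative, not a real deprecation):

```python
from typing_extensions import deprecated


@deprecated(
    "`OldCard` is deprecated. Replace `OldCard` with `NewCard`. "
    "This will be removed in version 0.2.0. See the deprecations page in the docs.",
    category=FutureWarning,
)
class OldCard:
    """Legacy card model (illustrative only)."""
```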
Make sure you get the `stacklevel` right for warnings so the message raised in the console points to the right line of code that triggers the warning.
Tests
- Where feasible and worth the effort to do so, make tests of the legacy feature (see the sketch after this list):
    - Copy (or cut, depending on what the feature is) and paste tests that use the old API and label them as `legacy` (e.g. `legacy_layout`). Depending on how important/widespread the change is, this might require copying and pasting a single file or several tests spread throughout different files across the test codebase. Don't `pytest.mark.parametrize` tests to check old and new API, just copy and paste instead. This will make it easier to remove the legacy features in future without needing to rewrite tests.
    - Write new tests to check `FutureWarning` is emitted where expected.
    - Filter out `FutureWarning` to make legacy tests pass as before.
- Update all other tests to use the new API until no `FutureWarning`s are raised. Any ignores still required should be done in a minimal way and not using a global ignore.
- There's never any need to test for the absence of a deprecation warning, since tests will automatically fail if unexpected warnings are raised, as all warnings are elevated to errors by default.
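A minimal sketch of those testing conventions (test names and the warning match string are illustrative):

```python
import pytest


@pytest.mark.filterwarnings("ignore::FutureWarning")
def test_legacy_layout_still_works():
    """Copy of the pre-existing test, exercising the old API with the warning filtered out."""
    ...


def test_legacy_layout_emits_future_warning():
    # Dedicated test asserting the deprecation warning is raised for the old API.
    with pytest.warns(FutureWarning, match="deprecated"):
        ...  # construct the model using the old API here
```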
Examples
All examples should change to use the new API.
Documentation
- We have a single page listing deprecations/breaking changes in the API reference. Descriptions should mirror `FutureWarning` messages, but there's more space to expand if needed.
- Narrative docs can, but do not have to, mention deprecations/breaking changes and the resolution strategy where relevant. They should always link to the deprecations/breaking changes API reference page.
- Maybe in future we'll also have a more "how to" migration guide to ease the upgrade to 0.2.0, but I don't imagine it will be very difficult. Could maybe also have some "what's new in 0.2.0" post.
- Deprecated API remains in the API reference but links to the new one.
- Just like with Examples, all other references to the old API should be updated to use the new one.
Changelog
Entry goes in "Deprecate" category.
We follow GitHub flow. In short:
- `main` is the only long-lived branch and is always in a releasable state
- `main` is a protected branch and cannot be committed to directly
- contributions to `main` must go through a PR process
- contributions must be up to date with `main` before merging
- PRs are merged using Squash and merge
To keep your PR in sync with `main` you can use rebase or merge. There are pros and cons to each method. Rebasing rewrites commit history, which can make it cleaner but complicate things if there are multiple contributors to a branch. Ask yourself "do I understand what update by rebase does and think it's a good idea to use it here?". If yes, then do it. If no, then update by merge instead. So if in doubt, just use merge (the default).
You should try to avoid long-lived PRs. Tips to achieve this:
- keep code changes small. As a very general rule, it's better to have two small PRs rather than one big one. Consider basing one feature off another to break your work down into more manageable chunks
- make reviewers' lives easy, e.g. with a clear PR description, clean commit history (e.g. use rebase if you understand it), instructions on how to review
- reviewers should try to review quickly (e.g. within a day). PR authors should remind reviewers if required
- several long conversations on PRs and multiple rounds of reviewing can be slow and hard to follow. Consider just talking directly to the PR reviewers
- for complex changes, raise a draft PR early for visibility of your work and to get initial comments and feedback. Talk to PR reviewers and other developers before and while you do the work rather than just waiting for a single "big bang" review when it's complete
- consider merging a feature that's work in progress (e.g. code without tests) so long as you keep it undocumented and ideally private (use `_`). This allows an incomplete feature to be present in the codebase without being visible to users. Only do this sparingly or things get confusing though
Sometimes it's impossible to avoid long-lived PRs, e.g. for some big new features, large refactoring work, etc. This is ok. It just shouldn't be the norm.
Ideally, all the following happen on the same merge to main (as above, this doesn't prevent you opening multiple PRs that point to a feature branch):
- source code
- tests
- changelog
- docs
Sometimes it might not be feasible to achieve all of these in one merge to main. How then do we keep main always release-ready? The key is that a feature is publicly available only when it is visible in documentation or changelog. This is ultimately what defines our functionality, rather than the existence of source code or tests in our codebase. This means that it's ok to merge code to main that you are not yet happy for the general public to use, so long as it is not publicly documented and does not break existing functionality. If such code is released then this is fine because the feature isn't yet visible to users. The important thing is to not make documentation/changelog public until you are comfortable that the feature can be used.
- Only add an `id` when it's actually required
- Leave out `targets` if the control/action is supposed to apply to all components on screen anyway
- Don't add controls unless they're necessary to showcase the feature
- Keep component usage minimal: there's no need to include multiple components if they don't contribute to the example, and feel free to just use a Card if the components are not displayed
To enable global theming in the future, we need to take a consistent approach - both in how we use components and how we apply CSS variables.
Component Usage
Whenever possible, use Dash Bootstrap Components (DBC) when adding new models to our library. These components are automatically styled via the vizro-bootstrap stylesheet.
If no suitable DBC component exists, you may use alternatives such as dash-mantine-components. However, in that case, you must include the necessary CSS in our static folder so the components match the Vizro design. When writing custom CSS, always use Bootstrap variables only.
CSS Variables
In the coming weeks, we'll phase out the use of QB design variables and switch entirely to Bootstrap variables. This simplifies theming and ensures all variables are publicly documented and easier to work with. Going forward, only use Bootstrap variables. A mapping table from QB design variables to Bootstrap variables is provided below and will be updated regularly.
| QB Design Variable | Bootstrap Variable |
|---|---|
| --border-disabled | --bs-tertiary-color |
| --border-hover | --bs-primary |
| --border-selected | --bs-primary-text-emphasis |
| --border-subtleAlpha01 | --bs-border-color |
| --border-subtleAlpha02 | --bs-border-color-translucent |
| --dropdown-label-bg | --bs-tertiary-bg |
| --elevation-0 | --bs-box-shadow-sm |
| --elevation-1 | --bs-box-shadow |
| --field-enabled | --bs-primary-bg-subtle |
| --fill-active | --bs-primary |
| --fill-hoverSelected | --bs-primary-text-emphasis |
| --fill-primary | --bs-primary |
| --fill-secondary | --bs-secondary |
| --fill-subtle | --bs-tertiary-color |
| --focus | --bs-focus-ring-color |
| --focus-color | --bs-focus-ring-color |
| --primary-100 | --bs-gray-900 |
| --primary-900 | --bs-gray-100 |
| --stateOverlays-selected | --bs-primary-bg-subtle |
| --stateOverlays-selectedHover | --bs-border-color-translucent |
| --surfaces-bg01 | --bs-secondary-bg |
| --surfaces-bg02 | --bs-tertiary-bg |
| --surfaces-bg03 | --bs-body-bg |
| --surfaces-bg-card | --bs-primary-bg-subtle |