ecdf-plots.md

jupyter:
  jupytext:
    notebook_metadata_filter: all
    text_representation:
      extension: .md
      format_name: markdown
      format_version: 1.2
      jupytext_version: 1.4.2
  kernelspec:
    display_name: Python 3
    language: python
    name: python3
  language_info:
    codemirror_mode:
      name: ipython
      version: 3
    file_extension: .py
    mimetype: text/x-python
    name: python
    nbconvert_exporter: python
    pygments_lexer: ipython3
    version: 3.7.7
  plotly:
    description: How to add empirical cumulative distribution function (ECDF) plots.
    display_as: statistical
    language: python
    layout: base
    name: Empirical Cumulative Distribution Plots
    order: 16
    page_type: u-guide
    permalink: python/ecdf-plots/
    thumbnail: thumbnail/figure-labels.png

Overview

Empirical cumulative distribution function (ECDF) plots visualize the distribution of a variable, and Plotly Express has a built-in function, px.ecdf(), to generate them. Plotly Express is the easy-to-use, high-level interface to Plotly, which operates on a variety of types of data and produces easy-to-style figures.

Alternatives to ECDF plots for visualizing distributions include histograms, violin plots, box plots and strip charts.
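
To make the definition concrete, here is a minimal pure-Python sketch of the quantity an ECDF plot displays: for each observed value, the fraction of observations at or below it. The helper name ecdf_points is hypothetical and is not part of Plotly. It illustrates the idea only and makes no claim about how px.ecdf() is implemented internally.

```python
def ecdf_points(values):
    """Return (x, fraction of observations <= x) for each distinct value.

    Hypothetical helper for illustration; not a Plotly function.
    """
    n = len(values)
    points = []
    for i, x in enumerate(sorted(values), start=1):
        if points and points[-1][0] == x:
            # Tied value: keep a single point carrying the highest fraction.
            points[-1] = (x, i / n)
        else:
            points.append((x, i / n))
    return points


sample = [3, 1, 2, 4]
print(ecdf_points(sample))  # [(1, 0.25), (2, 0.5), (3, 0.75), (4, 1.0)]
```

Each step of the curve rises by 1/n at an observed value, which is why the curve always ends at 1 when the Y axis shows probability.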

Simple ECDF Plots

Providing a single column to the x variable yields a basic ECDF plot.

import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill")
fig.show()

Providing multiple columns leverages Plotly Express's wide-form data support to show multiple variables on the same plot.

import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x=["total_bill", "tip"])
fig.show()

It is also possible to map another variable to the color dimension of a plot.

import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex")
fig.show()

Configuring the Y axis

By default, the Y axis shows probability (ecdfnorm="probability"), but it can also show raw counts by setting the ecdfnorm argument to None, or percentages by setting it to "percent".

import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", ecdfnorm=None)
fig.show()
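
The three Y-axis scalings differ only by a constant factor. The sketch below, in plain Python, shows how cumulative counts would map to each setting. The function name scale_ecdf is hypothetical and only mirrors the ecdfnorm values named above. It is an illustration, not Plotly's own code.

```python
def scale_ecdf(cumulative_counts, ecdfnorm="probability"):
    """Rescale cumulative counts the way each ecdfnorm setting is described.

    Hypothetical helper for illustration; not a Plotly function.
    """
    total = cumulative_counts[-1]  # the last cumulative count is the total
    if ecdfnorm is None:
        return list(cumulative_counts)  # raw counts
    if ecdfnorm == "probability":
        return [c / total for c in cumulative_counts]
    if ecdfnorm == "percent":
        return [100 * c / total for c in cumulative_counts]
    raise ValueError(f"unknown ecdfnorm: {ecdfnorm!r}")


counts = [1, 2, 3, 4]  # cumulative counts for four distinct observations
print(scale_ecdf(counts, None))           # [1, 2, 3, 4]
print(scale_ecdf(counts, "probability"))  # [0.25, 0.5, 0.75, 1.0]
print(scale_ecdf(counts, "percent"))      # [25.0, 50.0, 75.0, 100.0]
```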

If a y value is provided, the Y axis shows the cumulative sum of y rather than a cumulative count.

import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", y="tip", color="sex", ecdfnorm=None)
fig.show()
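
As a rough sketch of what "sum of y" means here: rows are ordered by x, and the Y value at each row is the running total of y up to and including that row. The helper cumulative_sum_by_x and the toy numbers are hypothetical, chosen only to illustrate the idea with ecdfnorm=None.

```python
from itertools import accumulate


def cumulative_sum_by_x(xs, ys):
    """Order rows by x and return (x, running total of y) pairs.

    Hypothetical helper for illustration; not a Plotly function.
    """
    pairs = sorted(zip(xs, ys), key=lambda pair: pair[0])
    running_totals = accumulate(y for _, y in pairs)
    return [(x, total) for (x, _), total in zip(pairs, running_totals)]


bills = [20, 10, 30]  # toy stand-in for a column such as total_bill
tips = [2, 1, 3]      # toy stand-in for a column such as tip
print(cumulative_sum_by_x(bills, tips))  # [(10, 1), (20, 3), (30, 6)]
```

The last point carries the grand total of y, which is the role the total count plays when no y is given.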

Reversed and Complementary CDF plots

By default, the Y value represents the fraction of the data that is at or below the value on the X axis. Setting ecdfmode to "reversed" reverses this, with the Y axis representing the fraction of the data at or above the X value. Setting ecdfmode to "complementary" plots 1-ECDF, meaning that the Y values represent the fraction of the data above the X value.

In standard mode (the default), the right-most point is at 1 (or the total count/sum, depending on ecdfnorm) and the left-most point is above 0.

import plotly.express as px
# No data frame is needed here: the values are passed directly as a list.
fig = px.ecdf(x=[1,2,3,4], markers=True, ecdfmode="standard",
              title="ecdfmode='standard' (Y=fraction at or below X value, this is the default)")
fig.show()

In reversed mode, the left-most point is at 1 (or the total count/sum, depending on ecdfnorm) and the right-most point is above 0.

import plotly.express as px
# No data frame is needed here: the values are passed directly as a list.
fig = px.ecdf(x=[1,2,3,4], markers=True, ecdfmode="reversed",
              title="ecdfmode='reversed' (Y=fraction at or above X value)")
fig.show()

In complementary mode, the right-most point is at 0 and no points are at 1 (or the total count/sum). This follows from the definition of the CCDF as 1-ECDF: because the ECDF has no point at 0, the CCDF has no point at 1.

import plotly.express as px
# No data frame is needed here: the values are passed directly as a list.
fig = px.ecdf(x=[1,2,3,4], markers=True, ecdfmode="complementary",
              title="ecdfmode='complementary' (Y=fraction above X value)")
fig.show()
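
The three definitions given at the start of this section can be checked by hand on the same four values. The sketch below uses plain Python. The function ecdf_value is a hypothetical helper that applies each definition literally and is not taken from Plotly.

```python
def ecdf_value(values, x, ecdfmode="standard"):
    """Fraction of values at or below, at or above, or above x.

    Hypothetical helper for illustration; not a Plotly function.
    """
    n = len(values)
    if ecdfmode == "standard":
        return sum(v <= x for v in values) / n  # at or below x
    if ecdfmode == "reversed":
        return sum(v >= x for v in values) / n  # at or above x
    if ecdfmode == "complementary":
        return sum(v > x for v in values) / n   # strictly above x, i.e. 1 - ECDF
    raise ValueError(f"unknown ecdfmode: {ecdfmode!r}")


values = [1, 2, 3, 4]
for mode in ("standard", "reversed", "complementary"):
    print(mode, [ecdf_value(values, x, mode) for x in values])
# standard [0.25, 0.5, 0.75, 1.0]
# reversed [1.0, 0.75, 0.5, 0.25]
# complementary [0.75, 0.5, 0.25, 0.0]
```

Only the complementary curve reaches 0, and only the standard and reversed curves reach 1.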

Orientation

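By default, plots are oriented vertically (i.e. the variable is on the X axis and counted/summed upwards), but this can be overridden with the orientation argument. With orientation="h", the roles swap: the variable is on the Y axis and is counted/summed along the X axis.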

import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", y="tip", color="sex", ecdfnorm=None, orientation="h")
fig.show()

Markers and/or Lines

ECDF plots can be configured to show lines and/or markers with the markers and lines arguments.

# Lines with markers
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", markers=True)
fig.show()

# Markers only
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", markers=True, lines=False)
fig.show()

Marginal Plots

ECDF plots also support marginal plots, selected with the marginal argument (for example "histogram" or "rug").

# Marginal histogram
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", markers=True, lines=False, marginal="histogram")
fig.show()

# Marginal rug plot
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", marginal="rug")
fig.show()

Facets

ECDF plots also support faceting with the facet_row and facet_col arguments.

import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", facet_row="time", facet_col="day")
fig.show()