Merged
Changes from 8 commits
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-0.0.10
+0.0.11
69 changes: 69 additions & 0 deletions docs/data.md
@@ -0,0 +1,69 @@
# Data Generation

To create process curves with realistic drift behaviour, `driftbench`
synthesizes curves by solving nonlinear optimization problems. Given a function $f(w(t), t)$
and support points $x$ and $y$, it solves for the internal parameters $w(t)$ such that
all conditions imposed by the support points are satisfied. The scheme is explained in
the following section.
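
As an illustration of this idea, the following sketch solves for the eight coefficients of a
degree-7 polynomial from the support points used in the example spec on this page. It is not
`driftbench`'s solver: a polynomial is linear in `w`, so the conditions form a linear system,
which is solved here exactly with the Python standard library instead of a nonlinear optimizer.
The helper names `derivative_row`, `solve` and `evaluate` are hypothetical.

```python
from fractions import Fraction
from math import factorial

# Support points from the example spec: key i constrains the i-th derivative.
conditions = {
    0: ([0, 1, 3, 2, 4], [0, 4, 7, 5, 0]),
    1: ([1, 3], [0, 0]),
    2: ([1], [0]),
}
n_params = 8  # w[0] .. w[7]


def derivative_row(x, order):
    """Factors multiplying w[0..7] in the order-th derivative at x."""
    row = []
    for k in range(n_params):
        if k < order:
            row.append(Fraction(0))
        else:
            factor = factorial(k) // factorial(k - order)
            row.append(Fraction(factor) * Fraction(x) ** (k - order))
    return row


def solve(rows, rhs):
    """Gauss-Jordan elimination on exact fractions (assumes a regular system)."""
    n = len(rows)
    aug = [row + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def evaluate(w, x, order=0):
    """Evaluate the order-th derivative of the polynomial at x."""
    return sum(c * wk for c, wk in zip(derivative_row(x, order), w))


# One equation per support point: 5 + 2 + 1 = 8 conditions for 8 parameters.
rows, rhs = [], []
for order, (xs, ys) in conditions.items():
    for x, y in zip(xs, ys):
        rows.append(derivative_row(x, order))
        rhs.append(y)

w = solve(rows, rhs)
```

The number of conditions equals the number of parameters here, so the solution is unique. For a
function that is nonlinear in `w`, the same residuals would be passed to an iterative optimizer
starting from `w_init`.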

![Example curve](./figures/example_curve.png)

## Synthetization
In the first step, we define the latent information that encodes the shape of the curves
to synthesize. This is done by writing a spec in a `yaml` file.
For example, the following spec defines a polynomial of degree 7:
```yaml
example:
  N: 10000
  dimensions: 100
  x_scale: 0.2
  y_scale: 0.2
  func: w[7] * x**7 + w[6] * x**6 + w[5] * x**5 + w[4] * x**4 + w[3] * x**3 + w[2] * x**2 + w[1] * x + w[0]
  w_init: np.zeros(8)
  latent_information:
    !LatentInformation
    x0: [0, 1, 3, 2, 4]
    y0: [0, 4, 7, 5, 0]
    x1: [1, 3]
    y1: [0, 0]
    x2: [1]
    y2: [0]
  drifts:
    !DriftSequence
      - !LinearDrift
        start: 1000
        end: 1100
        feature: y0
        dimension: 2
        m: 0.002
```
The root key defines the name of the dataset, in this case `example`.
The other keys of this `yaml` structure are:

- `N`: The number of curves to synthesize.
- `dimensions`: The number of timesteps one curve consists of.
- `x_scale`: The scale of the random Gaussian noise applied to the `x`-latent information.
If set to 0, no noise is applied.
- `y_scale`: The scale of the random Gaussian noise applied to the `y`-latent information.
If set to 0, no noise is applied.
- `func`: The function which defines the shape of a curve. The internal parameters are denoted as
`w`, while the timesteps which are used to evaluate the curve are denoted as `x`.
- `w_init`: The initial guess for the internal parameters. Must match the number of internal
parameters defined in `func`.
- `latent_information`: Contains a `LatentInformation` structure, which holds the latent information
that defines the support points of the curves. The keys `xi` (here `x0`, `x1` and `x2`) hold the `x`-information for the `i`-th
derivative of `func`, while the keys `yi` hold the corresponding `y`-information.
- `drifts`: Contains a [`DriftSequence`][driftbench.data_generation.drifts.DriftSequence] structure, which in turn holds a list of drifts, for example
`LinearDrift` structures. Each drift is applied in the specified manner to the latent
information for every timestep between `start` and `end` within the `N` curves. The drift structure
defines the `feature` and the `dimension` it acts on, as well as its own parameters, such as
the slope `m` in this case.
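
To make the effect of the `LinearDrift` entry more concrete, here is a minimal sketch of one
possible reading of it. It is not taken from `driftbench`: the function `drifted_y0` is
hypothetical, `dimension` is assumed to be a zero-based index into `y0`, the window is assumed
to include both `start` and `end`, and curves outside the window are returned unchanged purely
for simplicity, since this text does not say how they are treated.

```python
# Values copied from the example spec.
start, end, m = 1000, 1100, 0.002
dimension = 2
base_y0 = [0, 4, 7, 5, 0]


def drifted_y0(t):
    """Return the y0 support values for curve index t (hypothetical helper)."""
    y0 = list(base_y0)
    if start <= t <= end:
        # Assumption: the offset grows linearly with slope m from the window start.
        y0[dimension] += m * (t - start)
    return y0


first = drifted_y0(start)  # offset 0 at the start of the window
last = drifted_y0(end)     # offset m * (end - start) at the end of the window
```

Under these assumptions the third support value rises from 7 at curve 1000 by `m * 100` at
curve 1100, while all other support values stay fixed.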

After setting up such an input, you can call the `sample_curves` function to retrieve the
coefficients, the respective latent information and the curve for each timestep.
```python
coefficients, latent_information, curves = sample_curves(dataset["example"], measurement_scale=0.1)
```
If a value for `measurement_scale` is specified, Gaussian noise with that scale is applied
to each value of every curve. By default, $5\%$ of the mean of the curves is used. If you want to
omit the noise, set it to `0.0` explicitly.
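
The noise step can be pictured with the following standard-library sketch. It is an
illustration only, not the code behind `sample_curves`: the function `add_measurement_noise`
and its `seed` parameter are hypothetical, and the default is interpreted here as 5% of the
mean over all values of all curves, which is an assumption about the sentence above.

```python
import random


def add_measurement_noise(curves, measurement_scale=None, seed=0):
    """Add zero-mean Gaussian noise to every value (hypothetical helper)."""
    rng = random.Random(seed)
    if measurement_scale is None:
        # Assumption: the default scale is 5% of the mean over all curve values.
        values = [v for curve in curves for v in curve]
        measurement_scale = abs(0.05 * sum(values) / len(values))
    if measurement_scale == 0.0:
        # A scale of 0.0 switches the noise off; return plain copies.
        return [list(curve) for curve in curves]
    return [
        [v + rng.gauss(0.0, measurement_scale) for v in curve]
        for curve in curves
    ]


example_curves = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
noisy = add_measurement_noise(example_curves)        # default scale
clean = add_measurement_noise(example_curves, 0.0)   # noise disabled
```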
Binary file added docs/figures/example_curve.png
4 changes: 4 additions & 0 deletions mkdocs.yml
@@ -39,6 +39,10 @@ markdown_extensions:
           class: mermaid
           format: !!python/name:pymdownx.superfences.fence_code_format
 
+extra_javascript:
+  - javascripts/mathjax.js
+  - https://unpkg.com/mathjax@3/es5/tex-mml-chtml.js
+
 nav:
   - About: index.md
   - Data generation: data.md