Documentation request: score interpretation for pm4py.analysis.simplicity_petri_net #517

afparsons · 2025-01-30T15:23:44Z

Hello,

I am seeking clarity on the behavior of the `pm4py.analysis.simplicity_petri_net()` function. The following are not immediately clear to me:

- Is the float score bounded between zero and one?
  - If so, are the bounds inclusive or exclusive?
- Score interpretation: do higher values mean simpler or more complex?
- Score interpretation: is it the same for all three algorithms?

I've skimmed the two cited papers, but I think it would be very helpful to include this information in the function docstring.



fit-alessandro-berti · 2025-01-31T06:45:19Z

Hi @afparsons, thanks for opening this issue!

Here’s some clarity on each of your questions:

Is the float score bounded between zero and one?
- Yes. In all three “simplicity” variants, the computed score lies within the interval $[0, 1]$ or $(0, 1]$. For example, in the arc_degree implementation, the formula is
  $$
  \text{simplicity} = \frac{1}{1 + \max(\text{mean\_degree} - k,\ 0)},
  $$
  which asymptotically approaches $0$ but never reaches it (so it’s in $(0, 1]$). In other variants (e.g., Extended Cardoso, Extended Cyclomatic), the formula may allow the score to hit exactly $0$ in edge cases. Either way, you will not see negative values or values greater than $1$.
Are the bounds inclusive or exclusive?
- For the arc_degree variant, the upper bound $1$ is inclusive (it is reached whenever $\text{mean\_degree} \le k$), but the lower bound $0$ is exclusive, because the score only approaches $0$ asymptotically for very large mean arc degrees.
- For the other two algorithms (Extended Cardoso, Extended Cyclomatic), the score is also designed to stay within $[0, 1]$, and it can reach exactly $0$ in corner cases.
Score interpretation: do higher values mean simpler or more complex?
- A higher value corresponds to a simpler Petri net (fewer arcs/edges relative to places and transitions). A lower value means a more complex net.
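
To make the arc_degree bounds concrete, here is a minimal sketch that only evaluates the formula quoted above. It is not the pm4py implementation: the helper name `arc_degree_simplicity` is hypothetical, and `k = 2.0` is just an illustrative threshold, not a statement about the library default.

```python
def arc_degree_simplicity(mean_degree: float, k: float = 2.0) -> float:
    """Evaluate 1 / (1 + max(mean_degree - k, 0)).

    Hypothetical helper for illustration; k = 2.0 is an assumed threshold.
    """
    return 1.0 / (1.0 + max(mean_degree - k, 0.0))


# At or below the threshold k the score is exactly 1 (inclusive upper bound).
print(arc_degree_simplicity(1.5))  # 1.0

# Above k the score drops: mean_degree = 3 and k = 2 give 1 / (1 + 1).
print(arc_degree_simplicity(3.0))  # 0.5

# Very large mean degrees push the score toward 0 without reaching it.
print(arc_degree_simplicity(1_000_000.0))
```

The `max(..., 0)` term is what clamps the score at 1 for nets whose mean arc degree is at or below `k`, so a higher score always reads as "simpler".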

We agree it would be very helpful to include these details directly in the docstring. Thanks again for the suggestion!

Hope that clarifies! Let us know if you have any further questions.
