Merge pull request MicrosoftDocs#1445 from MicrosoftDocs/main63849191…
…5108899291sync_temp

For protected branch, push strategy should use PR and merge to target branch method to work around git push error
learn-build-service-prod[bot] authored Apr 20, 2024
2 parents c12c129 + 5deea5a commit c8b4ebf
Showing 14 changed files with 301 additions and 102 deletions.
8 changes: 6 additions & 2 deletions articles/TOC.yml
@@ -802,8 +802,12 @@
href: process-mining-copilot-in-process-analytics.md
- name: Connect to SAP ERP from process mining (preview)
href: process-mining-sap-erp.md
- name: Use your own Azure Data Lake Storage Gen2
href: process-mining-byo-azure-data-lake.md
- name: Storage Gen2
items:
- name: Bring your own Azure Data Lake Storage Gen2
href: process-mining-byo-azure-data-lake.md
- name: Use your own network isolated Azure Data Lake Storage Gen2
href: process-mining-byo-azure-data-lake-private.md
- name: Transform and map data
href: process-mining-transform.md
- name: Visualize and gain insights from processes
18 changes: 9 additions & 9 deletions articles/minit/aggregations.md
@@ -8,7 +8,7 @@ contributors:
- v-aangie
ms.subservice: process-advisor
ms.topic: conceptual
ms.date: 04/18/2024
ms.date: 04/22/2024
ms.author: michalrosik
ms.reviewer: angieandrews
ms.custom: bap-template
@@ -381,13 +381,13 @@ Returns the first [value] that meets the [condition], grouped according to the [

- **[value]**: An attribute name, nested operation, or expression

Data type: INT, FLOAT, TIME

- **[default]**: Default value returned by operator when no element in defined [context] meets the [condition]
Data type: INT, FLOAT, TIME, STRING

Data type: INT, FLOAT, DATE, TIME
- **[default]**: Value to be returned, when condition is not met

**Output data type**: FLOAT, TIME
Data type: BOOL, INT, FLOAT, STRING, DATE, TIME

**Output data type**: BOOL, INT, FLOAT, STRING, DATE, TIME

## LAST([context],[value])

@@ -415,11 +415,11 @@ Returns the last value that meets the [condition], grouped according to the [con

Data type: INT, FLOAT, TIME

- **[default]**: Default value returned by operator when no element in defined [context] meets the [condition]
- **[default]**: Value to be returned, when condition is not met

Data type: INT, FLOAT, DATE, TIME
Data type: BOOL, INT, FLOAT, STRING, DATE, TIME

**Output data type**: FLOAT, TIME
**Output data type**: BOOL, INT, FLOAT, STRING, DATE, TIME

## SELFLOOP([context],[attributeName])

63 changes: 63 additions & 0 deletions articles/minit/process-mining-desktop-sizing.md
@@ -0,0 +1,63 @@
---
title: Power Automate Process Mining desktop application sizing guide
description: Learn about the sizing requirements to run the Power Automate Process Mining desktop app.
author: rosikm
contributors:
- rosikm
- v-aangie
ms.subservice: process-advisor
ms.topic: overview
ms.date: 04/16/2024
ms.author: michalrosik
ms.reviewer: angieandrews
search.audienceType:
- enduser
---

# Power Automate Process Mining Desktop Client Hardware Requirements

The **Power Automate Process Mining (PAPM)** desktop client application enables users to discover, analyze, and improve business processes from event log data. PAPM uses a stateful process mining engine that requires significant hardware resources, especially RAM, disk capacity and speed, and CPU cores. This document provides the recommended hardware specifications for running PAPM on different sizes of event log files. It also explains the factors that affect the performance of PAPM and how to optimize the hardware configuration for the best user experience.

## Hardware specifications

The following table summarizes the recommended hardware specifications for running PAPM on different sizes of event log files. These requirements assume that other applications have minimal impact on memory consumption and CPU utilization. If that isn't the case, increase the requirements to cover the demands of those applications. The table assumes that the event log files are in CSV format and are transformed into Process Model files before they're loaded into PAPM. The size of a Process Model file is typically 20-30% of the size of the original CSV file. For each event log size, the table lists minimal and optimal requirements. The minimal requirements are the lowest hardware specifications needed to run PAPM without errors or timeouts. The optimal requirements are the hardware specifications needed for fast and smooth performance.

|Event log size (CSV) |Process Model file size |Minimal requirements |Optimal requirements |
|---------|---------|---------|---------|
|0 - 10 GB |0 - 3 GB |<li>RAM: 8 GB</li><li>Disk: HDD (50 GB free)</li><li>CPU: 2 cores</li>|<li>RAM: 16 GB</li><li>Disk: SSD (100 GB free)</li><li>CPU: 4 cores</li>|
|10 - 50 GB |3 - 15 GB |<li>RAM: 16 GB</li><li>Disk: SSD (100 GB free)</li><li>CPU: 4 cores</li>|<li>RAM: 32 GB</li><li>Disk: NVMe SSD (200 GB free)</li><li>CPU: 8 cores</li>|
|50 - 100 GB |15 - 30 GB |<li>RAM: 32 GB</li><li>Disk: SSD (200 GB free)</li><li>CPU: 8 cores</li>|<li>RAM: 48 GB</li><li>Disk: NVMe SSD (400 GB free)</li><li>CPU: 16 cores</li>|
|100 - 150 GB |30 - 45 GB |<li>RAM: 48 GB</li><li>Disk: NVMe SSD (400 GB free)</li><li>CPU: 8 cores</li>|<li>RAM: 64 GB</li><li>Disk: NVMe SSD (600 GB free)</li><li>CPU: 16 cores</li>|

> [!NOTE]
>
> The previous table shows the minimal and optimal hardware configurations for running the PAPM desktop app. The minimal configuration is the lowest configuration that can run PAPM without crashing, but users might experience delays near the upper boundary of each size range. The optimal configuration runs the PAPM desktop app smoothly and efficiently. If the Process Model file size is at a boundary between two ranges, we suggest choosing the stronger hardware configuration.

## Performance factors

The performance of PAPM depends on several factors, such as the size and complexity of the event log data, the type and number of analyses performed by the user, and the hardware configuration of the machine running PAPM. The following sections explain how each of these factors affects the performance of PAPM and how to optimize them for the best user experience.

### Data size and complexity

The size and complexity of the event log data have a direct impact on the performance of PAPM. The larger and more complex the data, the more hardware resources are needed to process and analyze them. The size of the data is determined by the number of events, the number of attributes, and the cardinality of the attribute values. The complexity of the data is determined by the number of variants, the number of activities, and the degree of concurrency and loops in the process. The following are some general guidelines to reduce the size and complexity of the data:

- Filter out irrelevant or redundant events and attributes before data ingestion.
- Reduce the number of unique values of the attributes by grouping or aggregating them into meaningful categories.
- Use a suitable mining attribute that captures the main behavior of the process and avoids creating too many variants.
- Use a suitable time granularity that reflects the temporal dynamics of the process and avoids creating too many events.
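
The first two guidelines can be applied before ingestion with a small preprocessing script. The following is a minimal sketch and not part of PAPM: the function `reduce_event_log`, the column names (`CaseId`, `Activity`, `Timestamp`, `Supplier`, `DebugInfo`), and the `Other` category label are all hypothetical and chosen only for illustration. It drops attributes that aren't needed and groups rarely used attribute values into a single category to lower cardinality.

```python
import csv
import io
from collections import Counter


def reduce_event_log(rows, keep_columns, bucket_column, top_n):
    """Return a reduced copy of an event log (a list of dicts, one per event).

    - Only the attributes in keep_columns are kept (bucket_column must be one of them).
    - Only the top_n most frequent values of bucket_column are kept;
      every other value is replaced with the hypothetical label "Other".
    """
    rows = list(rows)
    counts = Counter(row[bucket_column] for row in rows)
    frequent = {value for value, _ in counts.most_common(top_n)}

    reduced = []
    for row in rows:
        slim = {name: row[name] for name in keep_columns}
        if slim[bucket_column] not in frequent:
            slim[bucket_column] = "Other"
        reduced.append(slim)
    return reduced


# Hypothetical sample data; a real event log would be read from a file.
SAMPLE_CSV = """CaseId,Activity,Timestamp,Supplier,DebugInfo
1,Create order,2024-04-16T08:00:00,Contoso,x
1,Approve order,2024-04-16T09:00:00,Contoso,y
2,Create order,2024-04-16T10:00:00,Fabrikam,z
3,Create order,2024-04-16T11:00:00,Northwind,w
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
reduced = reduce_event_log(
    rows,
    keep_columns=["CaseId", "Activity", "Timestamp", "Supplier"],
    bucket_column="Supplier",
    top_n=1,
)
```

In this sample, the `DebugInfo` attribute is removed and the `Supplier` attribute is reduced from three unique values to two. Whether a value is safe to group depends on the analysis you plan to run, so treat the threshold as a starting point.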

### Number and types of analyses

The type and number of analyses performed by the user also have an impact on the performance of PAPM. The more analyses the user performs, the more hardware resources are needed to compute and display them. The type of analysis determines the amount of data that needs to be accessed and processed, and the level of detail that needs to be shown. The following are some general guidelines to optimize the type and number of analyses:

- Use filters to focus on the most relevant or interesting cases, activities, or attributes for the analysis.
- Avoid creating custom metrics that aren't relevant for the current analysis. You can disable existing custom metrics without deleting them.
- Avoid performing too many analyses at the same time.

### Hardware configuration

The hardware configuration of the machine running PAPM is the most important factor that affects the performance of PAPM. The hardware configuration determines the amount of data that can be loaded into memory, the speed of reading data from disk, and the speed of processing data in parallel. The following are some general guidelines to optimize the hardware configuration:

- To load the data into memory, use a machine with enough RAM. Sufficient RAM significantly improves the performance of PAPM because it avoids streaming data from disk, which is slower. The recommended RAM size for each data size is shown in the table in the Hardware specifications section.
- To store and read the data, use a machine with a fast disk drive. A fast disk improves the performance of PAPM, especially if the data can't be loaded into memory. The recommended disk type and free space for each data size are shown in the table in the Hardware specifications section.
- To process the data in parallel, use a machine with enough CPU cores. More cores improve the performance of PAPM because PAPM can split the computation into multiple threads and use the full potential of the CPU. The recommended number of CPU cores for each data size is shown in the table in the Hardware specifications section.
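
As an illustration of how the table in the Hardware specifications section can be applied, the following minimal sketch looks up the recommended specifications for a CSV event log size and lists where a machine falls short. It isn't a PAPM tool: the names `TIERS`, `recommended_specs`, `estimated_model_size_gb`, and `shortfalls` are hypothetical. The numbers are copied from the table, the 20-30% estimate comes from the text above it, and a size exactly on a boundary is assigned to the stronger tier as one reading of the note's suggestion. Disk type (HDD, SSD, NVMe SSD) isn't checked, and the installed RAM is passed in by the caller because the Python standard library has no portable way to read it.

```python
import os
import shutil

# (upper CSV size in GB, minimal spec, optimal spec), copied from the table in this article.
TIERS = [
    (10, {"ram_gb": 8, "disk_free_gb": 50, "cpu_cores": 2},
         {"ram_gb": 16, "disk_free_gb": 100, "cpu_cores": 4}),
    (50, {"ram_gb": 16, "disk_free_gb": 100, "cpu_cores": 4},
         {"ram_gb": 32, "disk_free_gb": 200, "cpu_cores": 8}),
    (100, {"ram_gb": 32, "disk_free_gb": 200, "cpu_cores": 8},
          {"ram_gb": 48, "disk_free_gb": 400, "cpu_cores": 16}),
    (150, {"ram_gb": 48, "disk_free_gb": 400, "cpu_cores": 8},
          {"ram_gb": 64, "disk_free_gb": 600, "cpu_cores": 16}),
]


def recommended_specs(csv_size_gb):
    """Return (minimal, optimal) specs for a CSV event log size in GB.

    A size exactly on a boundary (for example, 10 GB) maps to the stronger tier.
    """
    if csv_size_gb < 0 or csv_size_gb > TIERS[-1][0]:
        raise ValueError("size is outside the range covered by the table")
    for upper, minimal, optimal in TIERS:
        if csv_size_gb < upper:
            return minimal, optimal
    return TIERS[-1][1], TIERS[-1][2]  # exactly 150 GB


def estimated_model_size_gb(csv_size_gb):
    """Return the (low, high) Process Model size estimate: 20-30% of the CSV size."""
    return csv_size_gb * 20 / 100, csv_size_gb * 30 / 100


def shortfalls(spec, ram_gb, disk_free_gb, cpu_cores):
    """Return the names of the resources that are below the given spec."""
    measured = {"ram_gb": ram_gb, "disk_free_gb": disk_free_gb, "cpu_cores": cpu_cores}
    return [name for name, required in spec.items() if measured[name] < required]


if __name__ == "__main__":
    minimal, optimal = recommended_specs(60)
    cores = os.cpu_count() or 1
    free_gb = shutil.disk_usage(".").free / 10**9
    # ram_gb=32 is a placeholder; replace it with the installed RAM of the machine.
    print("Below minimal:", shortfalls(minimal, ram_gb=32, disk_free_gb=free_gb, cpu_cores=cores))
    print("Below optimal:", shortfalls(optimal, ram_gb=32, disk_free_gb=free_gb, cpu_cores=cores))
```

The check reports only which resources are below a specification. As the Hardware specifications section states, the table assumes that other applications have minimal impact on memory and CPU, so a machine that passes this check can still need more resources in practice.
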
6 changes: 3 additions & 3 deletions articles/minit/requirements-for-application.md
@@ -7,7 +7,7 @@ contributors:
- v-aangie
ms.subservice: process-advisor
ms.topic: conceptual
ms.date: 07/18/2023
ms.date: 04/16/2024
ms.author: michalrosik
ms.reviewer: angieandrews
search.audienceType:
@@ -37,7 +37,7 @@ Following are the requirements for the statistics metric type:

- **Case Level Attribute:** Requires aggregation. Uses case context functions. It's not possible to access the values of event-level attributes. For example, `AVG(CaseEvents, PriceUSD)` returns the average value of the attribute **PriceUSD**.

- **Case Duration Influence:** Requires aggregation. Uses case context functions. It's not possible to access the values of event-level attributes. For example, `AVG(CasesPerAttribute,DURATION)1` returns the average duration of cases for selected case level attribute value.
- **Case Duration Influence:** Requires aggregation. Uses case context functions. It's not possible to access the values of event-level attributes. For example, `AVG(CasesPerAttribute,DURATION())` returns the average duration of cases for selected case level attribute value.

- **Case Overview:** Aggregation isn't needed since **Case Overview** displays results per individual cases. Uses functions valid for cases. If you want to calculate statistics of all cases and use them in a metric, you need to define the scope of aggregation. For example, `DURATION()/AVG(ViewCases,DURATION())` returns the ratio between the specific case duration to the average case duration.

@@ -59,5 +59,5 @@ Following are the requirements for the filter metric type:

Following are the requirements for the process root cause analysis metric type:

**RCA**: The requirements are the same as for Statistics - Case Overview in the [Statistics](#statistics) section in this topic.
**RCA**: The requirements are the same as for Statistics - Case Overview in the [Statistics](#statistics) section in this article.
