Merge pull request MicrosoftDocs#1445 from MicrosoftDocs/main63849191…

…5108899291sync_temp For protected branch, push strategy should use PR and merge to target branch method to work around git push error
carusyte · Apr 20, 2024 · c8b4ebf
2 parents c12c129 + 5deea5a
commit c8b4ebf
Showing 14 changed files with 301 additions and 102 deletions.
diff --git a/articles/TOC.yml b/articles/TOC.yml
@@ -802,8 +802,12 @@
       href: process-mining-copilot-in-process-analytics.md   
     - name: Connect to SAP ERP from process mining (preview)
       href: process-mining-sap-erp.md    
-    - name: Use your own Azure Data Lake Storage Gen2
-      href: process-mining-byo-azure-data-lake.md
+    - name: Storage Gen2
+      items:
+      - name: Bring your own Azure Data Lake Storage Gen2
+        href: process-mining-byo-azure-data-lake.md
+      - name: Use your own network isolated Azure Data Lake Storage Gen2
+        href: process-mining-byo-azure-data-lake-private.md
     - name: Transform and map data
       href: process-mining-transform.md
     - name: Visualize and gain insights from processes

diff --git a/...cles/media/process-mining-byo-azure-data-lake-private/azure-portal-settings.png b/...cles/media/process-mining-byo-azure-data-lake-private/azure-portal-settings.png
diff --git a/...cles/media/process-mining-byo-azure-data-lake-private/azure-portal-settings.svg b/...cles/media/process-mining-byo-azure-data-lake-private/azure-portal-settings.svg
diff --git a/articles/media/process-mining-byo-azure-data-lake-private/error.png b/articles/media/process-mining-byo-azure-data-lake-private/error.png
diff --git a/articles/media/process-mining-byo-azure-data-lake-private/prompt-ps.svg b/articles/media/process-mining-byo-azure-data-lake-private/prompt-ps.svg
diff --git a/articles/minit/aggregations.md b/articles/minit/aggregations.md
@@ -8,7 +8,7 @@ contributors:
   - v-aangie
 ms.subservice: process-advisor
 ms.topic: conceptual
-ms.date: 04/18/2024
+ms.date: 04/22/2024
 ms.author: michalrosik
 ms.reviewer: angieandrews
 ms.custom: bap-template
@@ -381,13 +381,13 @@ Returns the first [value] that meets the [condition], grouped according to the [
 
 - **[value]**: An attribute name, nested operation, or expression
 
-   Data type: INT, FLOAT, TIME
-
-- **[default]**: Default value returned by operator when no element in defined [context] meets the [condition]
+   Data type: INT, FLOAT, TIME, STRING
 
-   Data type: INT, FLOAT, DATE, TIME
+- **[default]**: Value returned when no element in the defined [context] meets the [condition]
 
-**Output data type**: FLOAT, TIME
+   Data type: BOOL, INT, FLOAT, STRING, DATE, TIME
+
+**Output data type**: BOOL, INT, FLOAT, STRING, DATE, TIME
 
 ## LAST([context],[value])
 
@@ -415,11 +415,11 @@ Returns the last value that meets the [condition], grouped according to the [con
 
    Data type: INT, FLOAT, TIME
 
-- **[default]**: Default value returned by operator when no element in defined [context] meets the [condition]
+- **[default]**: Value returned when no element in the defined [context] meets the [condition]
 
-   Data type: INT, FLOAT, DATE, TIME
+   Data type: BOOL, INT, FLOAT, STRING, DATE, TIME
 
-**Output data type**: FLOAT, TIME
+**Output data type**: BOOL, INT, FLOAT, STRING, DATE, TIME
 
 ## SELFLOOP([context],[attributeName])
 

diff --git a/articles/minit/process-mining-desktop-sizing.md b/articles/minit/process-mining-desktop-sizing.md
@@ -0,0 +1,63 @@
+---
+title: Power Automate Process Mining desktop application sizing guide
+description: Learn about the sizing requirements for running the Power Automate Process Mining desktop app.
+author: rosikm
+contributors:
+  - rosikm
+  - v-aangie
+ms.subservice: process-advisor
+ms.topic: overview
+ms.date: 04/16/2024
+ms.author: michalrosik
+ms.reviewer: angieandrews
+search.audienceType:
+- enduser
+---
+
+# Power Automate Process Mining desktop client hardware requirements
+
+The **Power Automate Process Mining (PAPM)** desktop client application enables users to discover, analyze, and improve business processes from event log data. PAPM uses a stateful process mining engine that requires significant hardware resources, especially RAM, disk capacity and speed, and CPU cores. This document provides the recommended hardware specifications for running PAPM on different sizes of event log files. It also explains the factors that affect PAPM performance and how to optimize the hardware configuration for the best user experience.
+
+## Hardware specifications
+
+The following table summarizes the recommended hardware specifications for running PAPM on different sizes of event log files. These requirements assume that other applications have minimal impact on memory consumption and CPU utilization. If other applications run alongside PAPM, increase the requirements to cover their demands. The table assumes that the event log files are in CSV format and that they're transformed into Process Model files before they're loaded into PAPM. The size of the Process Model file is typically 20-30% of the original CSV file. For each event log size, the table lists minimal and optimal requirements. The minimal requirements are the lowest hardware specifications needed to run PAPM without errors or timeouts. The optimal requirements are the hardware specifications needed to run PAPM with fast and smooth performance.
+
+|Event log size (CSV) |Process Model file size |Minimal requirements |Optimal requirements | 
+|---------|---------|---------|---------|
+|0 - 10 GB |0 - 3 GB |<li>RAM: 8 GB</li><li>Disk: HDD (50 GB free)</li><li>CPU: 2 cores</li>|<li>RAM: 16 GB</li><li>Disk: SSD (100 GB free)</li><li>CPU: 4 cores</li>|
+|10 - 50 GB |3 - 15 GB |<li>RAM: 16 GB</li><li>Disk: SSD (100 GB free)</li><li>CPU: 4 cores</li>|<li>RAM: 32 GB</li><li>Disk: NVMe SSD (200 GB free)</li><li>CPU: 8 cores</li>|
+|50 - 100 GB |15 - 30 GB |<li>RAM: 32 GB</li><li>Disk: SSD (200 GB free)</li><li>CPU: 8 cores</li>|<li>RAM: 48 GB</li><li>Disk: NVMe SSD (400 GB free)</li><li>CPU: 16 cores</li>|
+|100 - 150 GB |30 - 45 GB |<li>RAM: 48 GB</li><li>Disk: NVMe SSD (400 GB free)</li><li>CPU: 8 cores</li>|<li>RAM: 64 GB</li><li>Disk: NVMe SSD (600 GB free)</li><li>CPU: 16 cores</li>|
+
+> [!NOTE]
+>
+> The previous table shows the minimal and optimal hardware configurations for running the PAPM desktop app. The minimal configuration is the lowest configuration that can run the app without crashing, but users might experience delays when data volumes approach the upper boundary of a range. The optimal configuration runs the app smoothly and efficiently. If the process model size falls on a boundary between two ranges, we suggest choosing the stronger hardware configuration.
+
+## Performance factors
+
+The performance of PAPM depends on several factors, such as the size and complexity of the event log data, the type and number of analyses performed by the user, and the hardware configuration of the machine running PAPM. The following sections explain how each of these factors affects the performance of PAPM and how to optimize them for the best user experience.
+
+### Data size and complexity
+
+The size and complexity of the event log data have a direct impact on the performance of PAPM. The larger and more complex the data, the more hardware resources are needed to process and analyze them. The size of the data is determined by the number of events, the number of attributes, and the cardinality of the attribute values. The complexity of the data is determined by the number of variants, the number of activities, and the degree of concurrency and loops in the process. The following are some general guidelines to reduce the size and complexity of the data:
+
+- Filter out irrelevant or redundant events and attributes before data ingestion. 
+- Reduce the number of unique values of the attributes by grouping or aggregating them into meaningful categories.
+- Use a suitable mining attribute that captures the main behavior of the process and avoids creating too many variants.
+- Use a suitable time granularity that reflects the temporal dynamics of the process and avoids creating too many events.
+
+### Number and types of analyses
+
+The type and number of analyses performed by the user also have an impact on the performance of PAPM. The more analyses the user performs, the more hardware resources are needed to compute and display them. The type of analysis determines the amount of data that needs to be accessed and processed, and the level of detail that needs to be shown. The following are some general guidelines to optimize the type and number of analyses:
+
+- Use filters to focus on the most relevant or interesting cases, activities, or attributes for the analysis.
+- Avoid creating custom metrics that aren't relevant for the current analysis. You can disable existing custom metrics without deleting them.
+- Avoid performing too many analyses at the same time.
+
+### Hardware configuration
+
+The hardware configuration of the machine running PAPM is the most important factor that affects the performance of PAPM. The hardware configuration determines the amount of data that can be loaded into memory, the speed of reading data from disk, and the speed of processing data in parallel. The following are some general guidelines to optimize the hardware configuration:
+
+- To load the data into memory, use a machine with enough RAM. Sufficient RAM significantly improves the performance of PAPM because it avoids the need to stream data from disk, which is slower. The recommended RAM size for each data size is shown in the previous table.
+- To store and read the data, use a machine with a fast disk drive. A fast disk improves the performance of PAPM, especially if the data can't be fully loaded into memory. The recommended disk type and free space for each data size are shown in the previous table.
+- To process the data in parallel, use a machine with enough CPU cores. More cores improve the performance of PAPM because they enable PAPM to split the computation into multiple threads and use the full potential of the CPU. The recommended number of CPU cores for each data size is shown in the previous table.
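
Reviewer note (not part of the commit): the sizing table added in `process-mining-desktop-sizing.md` above can be read as a simple lookup. The following is a minimal Python sketch under stated assumptions, not anything PAPM ships. The names `Spec`, `TIERS`, `estimate_model_gb`, and `recommend` are hypothetical. The numbers are copied from the table, and a size that falls exactly on a boundary is assigned to the stronger tier, following the note under the table.

```python
from typing import NamedTuple, Optional, Tuple


class Spec(NamedTuple):
    ram_gb: int
    disk: str
    free_disk_gb: int
    cpu_cores: int


# (upper bound of CSV event log size in GB, minimal spec, optimal spec),
# transcribed from the table in the new article.
TIERS = [
    (10, Spec(8, "HDD", 50, 2), Spec(16, "SSD", 100, 4)),
    (50, Spec(16, "SSD", 100, 4), Spec(32, "NVMe SSD", 200, 8)),
    (100, Spec(32, "SSD", 200, 8), Spec(48, "NVMe SSD", 400, 16)),
    (150, Spec(48, "NVMe SSD", 400, 8), Spec(64, "NVMe SSD", 600, 16)),
]


def estimate_model_gb(csv_gb: float) -> Tuple[float, float]:
    """Estimate the Process Model file size range (20-30% of the CSV size)."""
    return 0.2 * csv_gb, 0.3 * csv_gb


def recommend(csv_gb: float) -> Optional[Tuple[Spec, Spec]]:
    """Return (minimal, optimal) specs, or None if the size exceeds the table."""
    if csv_gb < 0:
        raise ValueError("event log size must be non-negative")
    for upper_gb, minimal, optimal in TIERS:
        # Strict comparison: a boundary value moves up to the stronger tier.
        if csv_gb < upper_gb:
            return minimal, optimal
    if csv_gb <= TIERS[-1][0]:
        # Exactly at the top of the table: no stronger tier exists.
        return TIERS[-1][1], TIERS[-1][2]
    return None  # The table doesn't cover event logs larger than 150 GB.
```

For example, `recommend(10)` returns the 10 - 50 GB row rather than the 0 - 10 GB row, because 10 GB sits on a boundary. The table gives no guidance above 150 GB, so the sketch returns `None` there instead of extrapolating.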
diff --git a/articles/minit/requirements-for-application.md b/articles/minit/requirements-for-application.md
@@ -7,7 +7,7 @@ contributors:
   - v-aangie
 ms.subservice: process-advisor
 ms.topic: conceptual
-ms.date: 07/18/2023
+ms.date: 04/16/2024
 ms.author: michalrosik
 ms.reviewer: angieandrews
 search.audienceType:
@@ -37,7 +37,7 @@ Following are the requirements for the statistics metric type:
 
 - **Case Level Attribute:** Requires aggregation. Uses case context functions. It's not possible to access the values of event-level attributes. For example, `AVG(CaseEvents, PriceUSD)` returns the average value of the attribute **PriceUSD**.
 
-- **Case Duration Influence:** Requires aggregation. Uses case context functions. It's not possible to access the values of event-level attributes. For example, `AVG(CasesPerAttribute,DURATION)1` returns the average duration of cases for selected case level attribute value.
+- **Case Duration Influence:** Requires aggregation. Uses case context functions. It's not possible to access the values of event-level attributes. For example, `AVG(CasesPerAttribute,DURATION())` returns the average duration of cases for the selected case-level attribute value.
 
 - **Case Overview:** Aggregation isn't needed since  **Case Overview** displays results per individual cases. Uses functions valid for cases. If you want to calculate statistics of all cases and use them in a metric, you need to define the scope of aggregation. For example, `DURATION()/AVG(ViewCases,DURATION())` returns the ratio between the specific case duration to the average case duration.
 
@@ -59,5 +59,5 @@ Following are the requirements for the filter metric type:
 
 Following are the requirements for the process root cause analysis metric type:
 
-**RCA**: The requirements are the same as for Statistics - Case Overview in the [Statistics](#statistics) section in this topic.
+**RCA**: The requirements are the same as for Statistics - Case Overview in the [Statistics](#statistics) section in this article.
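
Reviewer note (not part of the commit): the Case Overview example in the context lines of `requirements-for-application.md` above, `DURATION()/AVG(ViewCases,DURATION())`, divides each case's duration by the average duration of all cases. The arithmetic can be sketched in plain Python as follows. This isn't the PAPM expression language. The function name `duration_ratios` and the sample data are hypothetical, and the sketch assumes that `ViewCases` means every case in the current view.

```python
from statistics import mean
from typing import Dict


def duration_ratios(case_durations: Dict[str, float]) -> Dict[str, float]:
    """Return each case's duration divided by the average duration of all cases."""
    if not case_durations:
        raise ValueError("at least one case is required")
    average = mean(case_durations.values())
    if average == 0:
        raise ValueError("average duration is zero; ratio is undefined")
    return {case: duration / average for case, duration in case_durations.items()}


# Hypothetical case durations in hours.
ratios = duration_ratios({"A": 2.0, "B": 4.0, "C": 6.0})
```

With these sample durations the average is 4.0 hours, so case A has a ratio of 0.5, case B a ratio of 1.0, and case C a ratio of 1.5. The average is computed once over all cases, while the numerator is evaluated per case.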