diff --git a/docs/user_guides/projects/jobs/pyspark_job.md b/docs/user_guides/projects/jobs/pyspark_job.md
index c0cb7e80..e329312f 100644
--- a/docs/user_guides/projects/jobs/pyspark_job.md
+++ b/docs/user_guides/projects/jobs/pyspark_job.md
@@ -217,7 +217,7 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | Field | Type | Description | Default |
 | ------------------------------------------ | -------------- |-----------------------------------------------------| -------------------------- |
 | `type` | string | Type of the job configuration | `"sparkJobConfiguration"` |
-| `appPath` | string | Project path to script (e.g `Resources/foo.py`) | `null` |
+| `appPath` | string | Project path to script (e.g `Resources/foo.py`)     | `null`                     |
 | `environmentName` | string | Name of the project spark environment | `"spark-feature-pipeline"` |
 | `spark.driver.cores` | number (float) | Number of CPU cores allocated for the driver | `1.0` |
 | `spark.driver.memory` | number (int) | Memory allocated for the driver (in MB) | `2048` |
@@ -229,6 +229,10 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | `spark.dynamicAllocation.maxExecutors` | number (int) | Maximum number of executors with dynamic allocation | `2` |
 | `spark.dynamicAllocation.initialExecutors` | number (int) | Initial number of executors with dynamic allocation | `1` |
 | `spark.blacklist.enabled` | boolean | Whether executor/node blacklisting is enabled | `false` |
+| `files` | string | HDFS path(s) to files to be provided to the Spark application. Multiple files can be included in a single string, separated by commas.<br>Example: `"hdfs:///Project//Resources/file1.py,hdfs:///Project//Resources/file2.txt"` | `null` |
+| `pyFiles` | string | HDFS path(s) to Python files to be provided to the Spark application. These will be added to the `PYTHONPATH` so they can be imported as modules. Multiple files can be included in a single string, separated by commas.<br>Example: `"hdfs:///Project//Resources/module1.py,hdfs:///Project//Resources/module2.py"` | `null` |
+| `jars` | string | HDFS path(s) to JAR files to be provided to the Spark application. These will be added to the classpath. Multiple files can be included in a single string, separated by commas.<br>Example: `"hdfs:///Project//Resources/lib1.jar,hdfs:///Project//Resources/lib2.jar"` | `null` |
+| `archives` | string | HDFS path(s) to archive files to be provided to the Spark application. Multiple files can be included in a single string, separated by commas.<br>Example: `"hdfs:///Project//Resources/archive1.zip,hdfs:///Project//Resources/archive2.tar.gz"` | `null` |
 
 ## Accessing project data
 
diff --git a/docs/user_guides/projects/jobs/spark_job.md b/docs/user_guides/projects/jobs/spark_job.md
index 6d0f0510..6345d5a6 100644
--- a/docs/user_guides/projects/jobs/spark_job.md
+++ b/docs/user_guides/projects/jobs/spark_job.md
@@ -230,7 +230,12 @@ The following table describes the JSON payload returned by `jobs_api.get_configu
 | `spark.dynamicAllocation.minExecutors` | number (int) | Minimum number of executors with dynamic allocation | `1` |
 | `spark.dynamicAllocation.maxExecutors` | number (int) | Maximum number of executors with dynamic allocation | `2` |
 | `spark.dynamicAllocation.initialExecutors` | number (int) | Initial number of executors with dynamic allocation | `1` |
 | `spark.blacklist.enabled` | boolean | Whether executor/node blacklisting is enabled | `false` |
+| `files` | string | HDFS path(s) to files to be provided to the Spark application. Multiple files can be included in a single string, separated by commas.<br>Example: `"hdfs:///Project//Resources/file1.py,hdfs:///Project//Resources/file2.txt"` | `null` |
+| `pyFiles` | string | HDFS path(s) to Python files to be provided to the Spark application. These will be added to the `PYTHONPATH` so they can be imported as modules. Multiple files can be included in a single string, separated by commas.<br>Example: `"hdfs:///Project//Resources/module1.py,hdfs:///Project//Resources/module2.py"` | `null` |
+| `jars` | string | HDFS path(s) to JAR files to be provided to the Spark application. These will be added to the classpath. Multiple files can be included in a single string, separated by commas.<br>Example: `"hdfs:///Project//Resources/lib1.jar,hdfs:///Project//Resources/lib2.jar"` | `null` |
+| `archives` | string | HDFS path(s) to archive files to be provided to the Spark application. Multiple files can be included in a single string, separated by commas.<br>Example: `"hdfs:///Project//Resources/archive1.zip,hdfs:///Project//Resources/archive2.tar.gz"` | `null` |
+
 
 ## Accessing project data
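For review context: a minimal Python sketch of what a configuration payload with the new fields might look like. The field names and defaults come from the tables in this diff; the concrete paths are placeholders (the double slash after `Project` mirrors the elided project name in the documented examples), and this is not part of the patched docs themselves.

```python
# Hypothetical payload assembling the documented fields, including the four
# new attached-file fields. Each multi-file field is one comma-separated string.
config = {
    "type": "sparkJobConfiguration",
    "appPath": "Resources/foo.py",
    "environmentName": "spark-feature-pipeline",
    "spark.driver.cores": 1.0,
    "spark.driver.memory": 2048,
    # New fields added by this change (placeholder paths):
    "files": "hdfs:///Project//Resources/file1.py,hdfs:///Project//Resources/file2.txt",
    "pyFiles": "hdfs:///Project//Resources/module1.py,hdfs:///Project//Resources/module2.py",
    "jars": "hdfs:///Project//Resources/lib1.jar,hdfs:///Project//Resources/lib2.jar",
    "archives": "hdfs:///Project//Resources/archive1.zip,hdfs:///Project//Resources/archive2.tar.gz",
}

# Splitting on commas recovers the individual HDFS paths.
py_files = config["pyFiles"].split(",")
print(len(py_files))  # 2
```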