Manage Spark jobs
$ snd spark request-status --id 54284cb9-8e58-4d92-93cb-6543
$ snd spark runtime-info --id 54284cb9-8e58-4d92-93cb-6543
$ snd spark logs --id 54284cb9-8e58-4d92-93cb-6543
$ snd spark submit --job-name configuration-name --overrides ./overrides.json
$ snd spark configuration --name configuration-name
-a, --auth-provider string Specify the OAuth provider name (default "azuread")
--custom-auth-url string Specify the auth service URI
--custom-service-url string Specify the service URL (default "https://beast.%s.sneaksanddata.com")
-e, --env string Target environment (default "awsd")
-h, --help help for spark
-i, --id string Specify the Job ID
--gen-docs Generate Markdown documentation for all commands
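These flags can be combined with any of the subcommands listed below. As a minimal sketch, checking a job's status against a non-default environment might look like the following; the environment name "test" is an illustrative placeholder, and the job ID is reused from the examples above:
$ snd spark request-status --id 54284cb9-8e58-4d92-93cb-6543 --env test --auth-provider azuread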
- snd - SnD CLI
- snd spark configuration - Get a deployed SparkJob configuration; the name of the SparkJob should be provided as an argument.
- snd spark encrypt - Encrypt a value from a file or stdin using the encryption key of the corresponding Spark Runtime
- snd spark logs - Get logs from a Spark Job
- snd spark request-status - Get the status of a Spark Job
- snd spark runtime-info - Get the runtime info of a Spark Job
- snd spark submit - Run the provided Beast V3 job with optional overrides
The overrides should be provided as a JSON file with the structure below.
If 'client_tag' is not provided, a random tag will be generated.
If 'extra_arguments', 'project_inputs', 'project_outputs', or 'expected_parallelism' are not provided, the job will run with the default arguments. A worked example follows the structure below.
{
  "client_tag": "<string> - A tag for the client making the submission",
  "extra_arguments": "<object> - Any additional arguments for the job",
  "project_inputs": [{
      "alias": "<string> - An alias for the input",
      "data_path": "<string> - The path to the input data",
      "data_format": "<string> - The format of the input data"
    }
    // More input objects can be added here
  ],
  "project_outputs": [{
      "alias": "<string> - An alias for the output",
      "data_path": "<string> - The path where the output data should be stored",
      "data_format": "<string> - The format of the output data"
    }
    // More output objects can be added here
  ],
  "expected_parallelism": "<integer> - The expected level of parallelism for the job"
}
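As an illustration only, a filled-in overrides file might look like the one below; every value (tag, extra argument, aliases, paths, formats, parallelism) is a placeholder and should be replaced with values from your own project and storage layout.
{
  "client_tag": "nightly-sales-run",
  "extra_arguments": {
    "write_mode": "overwrite"
  },
  "project_inputs": [{
    "alias": "sales",
    "data_path": "abfss://data@myaccount.dfs.core.windows.net/raw/sales",
    "data_format": "parquet"
  }],
  "project_outputs": [{
    "alias": "sales_aggregated",
    "data_path": "abfss://data@myaccount.dfs.core.windows.net/curated/sales_aggregated",
    "data_format": "delta"
  }],
  "expected_parallelism": 4
}
Saved as ./overrides.json, the file is passed to the submit command exactly as in the examples above:
$ snd spark submit --job-name configuration-name --overrides ./overrides.json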