Is it Possible to use schema from ExecutionInput into container_arguments of ProcessingStep? #167

Open
@MrDataPsycho

Hi,
Let's say I have an execution input schema as follows:

from stepfunctions.inputs import ExecutionInput

execution_input = ExecutionInput(
    schema={
        "PATH_INPUT": str,
        "DESTINATION_OUTPUT": str,
        "study_name": str,
        "ProcessingJobName": str,
        "input_code": str,
        "job_pk": str,
        "job_sk": str,
    }
)

How can I use the execution_input values in the container_arguments part below?

from stepfunctions import steps

processing_step = steps.ProcessingStep(
    "SageMakerProcessingJob1",
    processor=get_processing_container_config(),
    job_name=execution_input["ProcessingJobName"],
    inputs=input_meta,
    outputs=output_meta,
    container_arguments=[
        "--input_filename", "file.docx", 
        "--study_name", execution_input["study_name"]
    ],
    container_entrypoint=["python3", "/opt/ml/processing/code/main.py"]
)

The study name should come from the execution input schema. But when I try to render the workflow graph, it throws the following error, even though job_name accepts a value from ExecutionInput:

from stepfunctions.workflow import Workflow

workflow_graph = steps.Chain([<over complicated steps>])
workflow = Workflow(
    name="ProcessingJob3_v1",
    definition=workflow_graph,
    role=workflow_execution_role,
    execution_input=execution_input
)
workflow.render_graph()
workflow_arn = workflow.create()

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-26-17fe64d66aa4> in <module>()
----> 1 workflow.render_graph()
      2 workflow_arn = workflow.create()

/home/ec2-user/SageMaker/.persisted_conda/dosjobs/lib/python3.6/site-packages/stepfunctions/workflow/stepfunctions.py in render_graph(self, portrait)
    374             portrait (bool, optional): Boolean flag set to `True` if the workflow graph should be rendered in portrait orientation. Set to `False`, if the graph should be rendered in landscape orientation. (default: False)
    375         """
--> 376         widget = WorkflowGraphWidget(self.definition.to_json())
    377         return widget.show(portrait=portrait)
    378 

/home/ec2-user/SageMaker/.persisted_conda/dosjobs/lib/python3.6/site-packages/stepfunctions/steps/states.py in to_json(self, pretty)
     91             return json.dumps(self.to_dict(), indent=4)
     92 
---> 93         return json.dumps(self.to_dict())
     94 
     95     def __repr__(self):

/home/ec2-user/SageMaker/.persisted_conda/dosjobs/lib/python3.6/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    229         cls is None and indent is None and separators is None and
    230         default is None and not sort_keys and not kw):
--> 231         return _default_encoder.encode(obj)
    232     if cls is None:
    233         cls = JSONEncoder

/home/ec2-user/SageMaker/.persisted_conda/dosjobs/lib/python3.6/json/encoder.py in encode(self, o)
    197         # exceptions aren't as detailed.  The list call should be roughly
    198         # equivalent to the PySequence_Fast that ''.join() would do.
--> 199         chunks = self.iterencode(o, _one_shot=True)
    200         if not isinstance(chunks, (list, tuple)):
    201             chunks = list(chunks)

/home/ec2-user/SageMaker/.persisted_conda/dosjobs/lib/python3.6/json/encoder.py in iterencode(self, o, _one_shot)
    255                 self.key_separator, self.item_separator, self.sort_keys,
    256                 self.skipkeys, _one_shot)
--> 257         return _iterencode(o, 0)
    258 
    259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

/home/ec2-user/SageMaker/.persisted_conda/dosjobs/lib/python3.6/json/encoder.py in default(self, o)
    178         """
    179         raise TypeError("Object of type '%s' is not JSON serializable" %
--> 180                         o.__class__.__name__)
    181 
    182     def encode(self, o):

TypeError: Object of type 'ExecutionInput' is not JSON serializable
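
The same failure can be reproduced without the SDK, which may help narrow it down: json.dumps raises this TypeError for any object it does not know how to encode, including one nested inside a list such as container_arguments. Below is a minimal sketch using only the standard library. PlaceholderStandIn, its to_jsonpath() output format, and placeholder_default are hypothetical names made up for illustration; they are not the SDK's ExecutionInput or its actual rendering.

```python
import json


class PlaceholderStandIn:
    """Hypothetical stand-in for a value like execution_input["study_name"]."""

    def __init__(self, name):
        self.name = name

    def to_jsonpath(self):
        # Assumed rendering for illustration; the real SDK's format may differ.
        return "$$.Execution.Input['{}']".format(self.name)


def placeholder_default(obj):
    """json.dumps hook that renders stand-in placeholders as JSONPath strings."""
    if isinstance(obj, PlaceholderStandIn):
        return obj.to_jsonpath()
    raise TypeError(
        "Object of type {} is not JSON serializable".format(type(obj).__name__)
    )


container_arguments = [
    "--input_filename", "file.docx",
    "--study_name", PlaceholderStandIn("study_name"),
]
definition = {"ContainerArguments": container_arguments}

# Plain json.dumps fails on the nested object, as in the traceback above.
try:
    json.dumps(definition)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With a default hook, the same structure serializes.
rendered = json.dumps(definition, default=placeholder_default)
```

This only shows the serialization mechanics. Step Functions resolves a path at run time only when the enclosing key ends in ".$", so a JSONPath string placed as a plain list element would still be passed to the container literally.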


Labels: enhancement (New feature or request)
