Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove empty default values for inputs/outputs arguments #3744

Draft
wants to merge 8 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion docs/faq.rst
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ How can I make an App dependent on multiple inputs?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can pass any number of futures in to a single App either as positional arguments
or as a list of futures via the special keyword ``inputs=()``.
or as a list of futures via the special keyword ``inputs``.
The App will wait for all inputs to be satisfied before execution.


Expand Down
16 changes: 8 additions & 8 deletions docs/historical/changelog.rst
Original file line number Diff line number Diff line change
Expand Up @@ -301,7 +301,7 @@ New Functionality
.. code-block:: python

@bash_app
def cat(inputs=(), outputs=()):
def cat(inputs, outputs):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not confident, though, that Parsl treats args and kwargs the same with respect to these two special arguments. I'm going to dig a bit more and see if I can convince myself that these do not have to be keyword arguments in order to be recognized correctly, but I'm worried that the previous =() default argument habit might have been related to this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I expect that's where that style came from too - but it's not particularly nice and it would be good to figure it out and fix rather than behave not-like-normal-Python

return 'cat {} > {}'.format(inputs[0], outputs[0])

concat = cat(inputs=['hello-0.txt'],
Expand All @@ -314,7 +314,7 @@ New Functionality
from parsl import File

@bash_app
def cat(inputs=(), outputs=()):
def cat(inputs, outputs):
return 'cat {} > {}'.format(inputs[0].filepath, outputs[0].filepath)

concat = cat(inputs=[File('hello-0.txt')],
Expand All @@ -328,7 +328,7 @@ New Functionality
from parsl import File

@bash_app
def cat(inputs=(), outputs=()):
def cat(inputs, outputs):
return 'cat {} > {}'.format(inputs[0].filepath, outputs[0].filepath)


Expand Down Expand Up @@ -409,16 +409,16 @@ New Functionality

# The following example worked until v0.8.0
@bash_app
def cat(inputs=(), outputs=()):
def cat(inputs, outputs):
return 'cat {inputs[0]} > {outputs[0]}' # <-- Relies on Parsl auto formatting the string

# Following are two mechanisms that will work going forward from v0.9.0
@bash_app
def cat(inputs=(), outputs=()):
def cat(inputs, outputs):
return 'cat {} > {}'.format(inputs[0], outputs[0]) # <-- Use str.format method

@bash_app
def cat(inputs=(), outputs=()):
def cat(inputs, outputs):
return f'cat {inputs[0]} > {outputs[0]}' # <-- OR use f-strings introduced in Python3.6


Expand Down Expand Up @@ -522,12 +522,12 @@ New Functionality

# Old style: " ".join(inputs) is legal since inputs will behave like a list of strings
@bash_app
def concat(inputs=(), outputs=(), stdout="stdout.txt", stderr='stderr.txt'):
def concat(inputs, outputs, stdout="stdout.txt", stderr='stderr.txt'):
return "cat {0} > {1}".format(" ".join(inputs), outputs[0])

# New style:
@bash_app
def concat(inputs=(), outputs=(), stdout="stdout.txt", stderr='stderr.txt'):
def concat(inputs, outputs, stdout="stdout.txt", stderr='stderr.txt'):
return "cat {0} > {1}".format(" ".join(list(map(str,inputs))), outputs[0])

* Cleaner user app file log management.
Expand Down
4 changes: 2 additions & 2 deletions docs/userguide/apps/python.rst
Original file line number Diff line number Diff line change
Expand Up @@ -196,7 +196,7 @@ Some keyword arguments to the Python function are treated differently by Parsl
return x * 2

@python_app()
def reduce_app(inputs = ()):
def reduce_app(inputs):
return sum(inputs)
Comment on lines +199 to 200
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's one of the cases where an empty iterable would actually work:

>>> inputs = tuple()
>>> sum(inputs)
0

But I'm not sure that's useful behavior for this app.


map_futures = [map_app(x) for x in range(3)]
Expand All @@ -212,7 +212,7 @@ Some keyword arguments to the Python function are treated differently by Parsl
.. code-block:: python

@python_app()
def write_app(message, outputs=()):
def write_app(message, outputs):
"""Write a single message to every file in outputs"""
for path in outputs:
with open(path, 'w') as fp:
Expand Down
4 changes: 2 additions & 2 deletions docs/userguide/configuration/data.rst
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,7 @@ irrespective of where that app executes.
.. code-block:: python

@python_app
def print_file(inputs=()):
def print_file(inputs):
with open(inputs[0].filepath, 'r') as inp:
content = inp.read()
return(content)
Expand Down Expand Up @@ -189,7 +189,7 @@ The following example illustrates how the remote file is implicitly downloaded f
.. code-block:: python

@python_app
def convert(inputs=(), outputs=()):
def convert(inputs, outputs):
with open(inputs[0].filepath, 'r') as inp:
content = inp.read()
with open(outputs[0].filepath, 'w') as out:
Expand Down
4 changes: 2 additions & 2 deletions docs/userguide/configuration/execution.rst
Original file line number Diff line number Diff line change
Expand Up @@ -216,12 +216,12 @@ The following code snippet shows how apps can specify suitable executors in the

# A mock molecular dynamics simulation app
@bash_app(executors=["Theta.Phi"])
def MD_Sim(arg, outputs=()):
def MD_Sim(arg, outputs):
return "MD_simulate {} -o {}".format(arg, outputs[0])

# Visualize results from the mock MD simulation app
@bash_app(executors=["Cooley.GPU"])
def visualize(inputs=(), outputs=()):
def visualize(inputs, outputs):
bash_array = " ".join(inputs)
return "viz {} -o {}".format(bash_array, outputs[0])

2 changes: 1 addition & 1 deletion docs/userguide/workflows/futures.rst
Original file line number Diff line number Diff line change
Expand Up @@ -151,7 +151,7 @@ be created (``hello.outputs[0].result()``).
# This app echoes the input string to the first file specified in the
# outputs list
@bash_app
def echo(message, outputs=()):
def echo(message, outputs):
return 'echo {} &> {}'.format(message, outputs[0])

# Call echo specifying the output file
Expand Down
10 changes: 5 additions & 5 deletions docs/userguide/workflows/workflow.rst
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ Sequential workflows can be created by passing an AppFuture from one task to ano

# Write a message to a file
@bash_app
def save(message, outputs=()):
def save(message, outputs):
return 'echo {} &> {}'.format(message, outputs[0])

message = generate(10)
Expand Down Expand Up @@ -159,15 +159,15 @@ In other cases, it can be convenient to pass data in files, as in the following
parsl.load()

@bash_app
def generate(outputs=()):
def generate(outputs):
return 'echo $(( RANDOM % (10 - 5 + 1 ) + 5 )) &> {}'.format(outputs[0])

@bash_app
def concat(inputs=(), outputs=(), stdout='stdout.txt', stderr='stderr.txt'):
def concat(inputs, outputs, stdout='stdout.txt', stderr='stderr.txt'):
return 'cat {0} >> {1}'.format(' '.join(inputs), outputs[0])

@python_app
def total(inputs=()):
def total(inputs):
total = 0
with open(inputs[0].filepath, 'r') as f:
for l in f:
Expand Down Expand Up @@ -209,7 +209,7 @@ the sum of those results.

# Reduce function that returns the sum of a list
@python_app
def app_sum(inputs=()):
def app_sum(inputs):
return sum(inputs)

# Create a list of integers
Expand Down
2 changes: 1 addition & 1 deletion parsl/app/bash.py
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@ def open_std_fd(fdname):
# TODO : Add support for globs here

missing = []
for outputfile in kwargs.get('outputs', []):
for outputfile in kwargs.get('outputs') or []:
fpath = outputfile.filepath

if not os.path.exists(fpath):
Expand Down
2 changes: 1 addition & 1 deletion parsl/data_provider/ftp.py
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ def wrapper(*args, **kwargs):
return wrapper


def _ftp_stage_in(working_dir, parent_fut=None, outputs=[], _parsl_staging_inhibit=True):
def _ftp_stage_in(working_dir, *, parent_fut=None, outputs, _parsl_staging_inhibit=True):
file = outputs[0]
if working_dir:
os.makedirs(working_dir, exist_ok=True)
Expand Down
4 changes: 2 additions & 2 deletions parsl/data_provider/globus.py
Original file line number Diff line number Diff line change
Expand Up @@ -267,7 +267,7 @@ def _update_local_path(self, file, executor, dfk):
# this cannot be a class method, but must be a function, because I want
# to be able to use partial() on it - and partial() does not work on
# class methods
def _globus_stage_in(provider, executor, parent_fut=None, outputs=[], _parsl_staging_inhibit=True):
def _globus_stage_in(provider, executor, *, parent_fut=None, outputs, _parsl_staging_inhibit=True):
globus_ep = provider._get_globus_endpoint(executor)
file = outputs[0]
dst_path = os.path.join(
Expand All @@ -280,7 +280,7 @@ def _globus_stage_in(provider, executor, parent_fut=None, outputs=[], _parsl_sta
file.path, dst_path)


def _globus_stage_out(provider, executor, app_fu, inputs=[], _parsl_staging_inhibit=True):
def _globus_stage_out(provider, executor, *, app_fu, inputs, _parsl_staging_inhibit=True):
"""
Although app_fu isn't directly used in the stage out code,
it is needed as an input dependency to ensure this code
Expand Down
2 changes: 1 addition & 1 deletion parsl/data_provider/http.py
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ def wrapper(*args, **kwargs):
return wrapper


def _http_stage_in(working_dir, parent_fut=None, outputs=[], _parsl_staging_inhibit=True):
def _http_stage_in(working_dir, *, parent_fut=None, outputs, _parsl_staging_inhibit=True):
file = outputs[0]
if working_dir:
os.makedirs(working_dir, exist_ok=True)
Expand Down
2 changes: 1 addition & 1 deletion parsl/data_provider/zip.py
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@ def stage_in(self, dm, executor, file, parent_fut):
return app_fut._outputs[0]


def _zip_stage_out(zip_file, inside_path, working_dir, parent_fut=None, inputs=[], _parsl_staging_inhibit=True):
def _zip_stage_out(zip_file, inside_path, working_dir, *, parent_fut=None, inputs, _parsl_staging_inhibit=True):
file = inputs[0]

os.makedirs(os.path.dirname(zip_file), exist_ok=True)
Expand Down
8 changes: 4 additions & 4 deletions parsl/dataflow/dflow.py
Original file line number Diff line number Diff line change
Expand Up @@ -782,7 +782,7 @@ def _add_input_deps(self, executor: str, args: Sequence[Any], kwargs: Dict[str,
logger.debug("Not performing input staging")
return args, kwargs, func

inputs = kwargs.get('inputs', [])
inputs = kwargs.get('inputs') or []
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in several places below, the old code would assign None to inputs if the app's formal arguments contained inputs=None, and then the enumerate() below would fail. The new code will assign [] to inputs in this case.

for idx, f in enumerate(inputs):
(inputs[idx], func) = self.data_manager.optionally_stage_in(f, func, executor)

Expand All @@ -801,7 +801,7 @@ def _add_input_deps(self, executor: str, args: Sequence[Any], kwargs: Dict[str,

def _add_output_deps(self, executor: str, args: Sequence[Any], kwargs: Dict[str, Any], app_fut: AppFuture, func: Callable) -> Callable:
logger.debug("Adding output dependencies")
outputs = kwargs.get('outputs', [])
outputs = kwargs.get('outputs') or []
app_fut._outputs = []

# Pass over all possible outputs: the outputs kwarg, stdout and stderr
Expand Down Expand Up @@ -884,7 +884,7 @@ def check_dep(d: Any) -> None:
check_dep(dep)

# Check for futures in inputs=[<fut>...]
for dep in kwargs.get('inputs', []):
for dep in kwargs.get('inputs') or []:
check_dep(dep)

return depends
Expand Down Expand Up @@ -932,7 +932,7 @@ def append_failure(e: Exception, dep: Future) -> None:
append_failure(e, dep)

# Check for futures in inputs=[<fut>...]
if 'inputs' in kwargs:
if kwargs.get('inputs'):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above: If inputs=None is supplied, 'inputs' in kwargs will be True, but None will not be iterable below. Now, inputs has to be specified and be truthy (which None is not). Other bad but truthy values could still make it past this check, but that was previously the case, too.

new_inputs = []
for dep in kwargs['inputs']:
try:
Expand Down
2 changes: 1 addition & 1 deletion parsl/dataflow/memoization.py
Original file line number Diff line number Diff line change
Expand Up @@ -191,7 +191,7 @@ def make_hash(self, task: TaskRecord) -> str:
logger.debug("Ignoring kwarg %s", k)
del filtered_kw[k]

if 'outputs' in task['kwargs']:
if task['kwargs'].get('outputs'):
outputs = task['kwargs']['outputs']
del filtered_kw['outputs']
t.append(id_for_memo(outputs, output_ref=True))
Expand Down
4 changes: 2 additions & 2 deletions parsl/executors/radical/executor.py
Original file line number Diff line number Diff line change
Expand Up @@ -422,9 +422,9 @@ def task_translate(self, tid, func, parsl_resource_specification, args, kwargs):
task.use_mpi = False
task.function = self._pack_and_apply_message(func, args, kwargs)

task.input_staging = self._stage_files(kwargs.get("inputs", []),
task.input_staging = self._stage_files(kwargs.get("inputs") or [],
mode='in')
task.output_staging = self._stage_files(kwargs.get("outputs", []),
task.output_staging = self._stage_files(kwargs.get("outputs") or [],
mode='out')

task.input_staging.extend(self._stage_files(list(args), mode='in'))
Expand Down
4 changes: 2 additions & 2 deletions parsl/executors/taskvine/executor.py
Original file line number Diff line number Diff line change
Expand Up @@ -360,8 +360,8 @@ def submit(self, func, resource_specification, *args, **kwargs):

# Determine whether to stage input files that will exist at the workers
# Input and output files are always cached
input_files += [self._register_file(f) for f in kwargs.get("inputs", []) if isinstance(f, File)]
output_files += [self._register_file(f) for f in kwargs.get("outputs", []) if isinstance(f, File)]
input_files += [self._register_file(f) for f in kwargs.get("inputs") or [] if isinstance(f, File)]
output_files += [self._register_file(f) for f in kwargs.get("outputs") or [] if isinstance(f, File)]

# Also consider any *arg that looks like a file as an input:
input_files += [self._register_file(f) for f in args if isinstance(f, File)]
Expand Down
2 changes: 1 addition & 1 deletion parsl/executors/taskvine/install-taskvine.sh
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ TARBALL="cctools-$CCTOOLS_VERSION-x86_64-ubuntu20.04.tar.gz"

# If stderr is *not* a TTY, then disable progress bar and show HTTP response headers
[[ ! -t 1 ]] && NO_VERBOSE="--no-verbose" SHOW_HEADERS="-S"
wget "$NO_VERBOSE" "$SHOW_HEADERS" -O /tmp/cctools.tar.gz "https://github.com/cooperative-computing-lab/cctools/releases/download/release/$CCTOOLS_VERSION/$TARBALL"
wget $NO_VERBOSE $SHOW_HEADERS -O /tmp/cctools.tar.gz "https://github.com/cooperative-computing-lab/cctools/releases/download/release/$CCTOOLS_VERSION/$TARBALL"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wget, at least on my system, fails when NO_VERBOSE and SHOW_HEADERS are empty/unset, as the command like ends up looking like

wget '' '' -O /tmp/cctools.tar.gz "<the url>"

and e.g.

~ $ wget '' '' -O /dev/null https://google.com &>/dev/null; echo $?
1
~ $ wget -O /dev/null https://google.com &>/dev/null; echo $?
0

Due to the set -e above, the script then aborted. This change fixed it for me.


mkdir -p /tmp/cctools
tar -C /tmp/cctools -zxf /tmp/cctools.tar.gz --strip-components=1
Expand Down
4 changes: 2 additions & 2 deletions parsl/executors/workqueue/executor.py
Original file line number Diff line number Diff line change
Expand Up @@ -461,8 +461,8 @@ def submit(self, func, resource_specification, *args, **kwargs):
output_files = []

# Determine the input and output files that will exist at the workers:
input_files += [self._register_file(f) for f in kwargs.get("inputs", []) if isinstance(f, File)]
output_files += [self._register_file(f) for f in kwargs.get("outputs", []) if isinstance(f, File)]
input_files += [self._register_file(f) for f in kwargs.get("inputs") or [] if isinstance(f, File)]
output_files += [self._register_file(f) for f in kwargs.get("outputs") or [] if isinstance(f, File)]

# Also consider any *arg that looks like a file as an input:
input_files += [self._register_file(f) for f in args if isinstance(f, File)]
Expand Down
2 changes: 1 addition & 1 deletion parsl/tests/integration/test_parsl_load_default_config.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@


@python_app
def cpu_stress(inputs=[], outputs=[]):
def cpu_stress(inputs=None, outputs=None):
s = 0
for i in range(10**8):
s += i
Expand Down
2 changes: 1 addition & 1 deletion parsl/tests/test_aalst_patterns.py
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,7 @@ def slow_increment(x, dur=1):


@python_app
def join(inputs=[]):
def join(inputs):
return sum(inputs)


Expand Down
2 changes: 1 addition & 1 deletion parsl/tests/test_bash_apps/test_apptimeout.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@


@bash_app
def echo_to_file(inputs=(), outputs=(), walltime=0.01):
def echo_to_file(inputs=None, outputs=None, walltime=0.01):
return """echo "sleeping"; sleep 0.05"""


Expand Down
2 changes: 1 addition & 1 deletion parsl/tests/test_bash_apps/test_basic.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@


@bash_app
def echo_to_file(inputs=(), outputs=(), stderr=None, stdout=None):
def echo_to_file(inputs, outputs, stderr=None, stdout=None):
res = ""
for o in outputs:
for i in inputs:
Expand Down
4 changes: 2 additions & 2 deletions parsl/tests/test_bash_apps/test_memoize.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@


@bash_app(cache=True)
def fail_on_presence(outputs=()):
def fail_on_presence(outputs):
return 'if [ -f {0} ] ; then exit 1 ; else touch {0}; fi'.format(outputs[0])


Expand All @@ -23,7 +23,7 @@ def test_bash_memoization(tmpd_cwd, n=2):


@bash_app(cache=True)
def fail_on_presence_kw(outputs=(), foo=None):
def fail_on_presence_kw(outputs, foo=None):
return 'if [ -f {0} ] ; then exit 1 ; else touch {0}; fi'.format(outputs[0])


Expand Down
2 changes: 1 addition & 1 deletion parsl/tests/test_bash_apps/test_multiline.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@


@bash_app
def multiline(inputs=(), outputs=(), stderr=None, stdout=None):
def multiline(inputs, outputs, stderr=None, stdout=None):
return """echo {inputs[0]} &> {outputs[0]}
echo {inputs[1]} &> {outputs[1]}
echo {inputs[2]} &> {outputs[2]}
Expand Down
4 changes: 2 additions & 2 deletions parsl/tests/test_bash_apps/test_pipeline.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@


@bash_app
def increment(inputs=(), outputs=(), stdout=None, stderr=None):
def increment(inputs, outputs, stdout=None, stderr=None):
cmd_line = """
if ! [ -f {inputs[0]} ] ; then exit 43 ; fi
x=$(cat {inputs[0]})
Expand All @@ -16,7 +16,7 @@ def increment(inputs=(), outputs=(), stdout=None, stderr=None):


@bash_app
def slow_increment(dur, inputs=(), outputs=(), stdout=None, stderr=None):
def slow_increment(dur, inputs, outputs, stdout=None, stderr=None):
cmd_line = """
x=$(cat {inputs[0]})
echo $(($x+1)) > {outputs[0]}
Expand Down
Loading
Loading