exec: Cmd argument in non-shell mode should not split the string #794

hades · 2025-02-20T21:47:25Z

As discussed in cnfgmngmntcmp.

Currently, if shell is unset (the default), the cmd parameter is interpreted as a list of arguments separated by a non-empty sequence of spaces. (E.g. "ls -l /tmp" becomes ["ls", "-l", "/tmp"])[1].

While useful on the surface, this is an unusual behaviour, that I personally would least expect from this parameter, and most likely will be hard to make useful for complex commands.

Curiously, Ansible's ansible.builtin.command works in a similar way. However the mgmt implementation is not compatible: Ansible uses the Python's shlex module[2], which implements a parody of a working shell parser.

Additionally, if this ever works on Windows, this will become even more weird. Windows uses a single string to pass parameters to the processes, so the POSIX implementation is to concatenate the array of arguments and pass it as one string. So this mgmt resource will split the string, which then will be concatenated in libc, (which then most likely will be split again in the process itself).

My suggestions (in reverse order of preference) are:

replace Args, Cmd and Shell with the following parameters:
- Command: to specify a command as a list of arguments, where the first argument is a path to the executable
- ShellCommand: to specify a command as a string to be passed to a shell (sh -c "...")
- Shell: to override the shell implementation
  Command and (Shell, ShellCommand) would be mutually exclusive. Shell becomes optional, and ShellCommand by itself invokes the default shell.
similarly to Ansible, add a new resource ("shell") to specifically use shell for running the commands. Change the Cmd parameter of the exec resource to accept a list of arguments
use a shell-like parser to imitate parsing of the Cmd parameter with more fidelity
document the current behaviour in detail

I'd be happy to provide a path for whatever suggestion of your choice, as long as you choose correctly.

[1] https://mgmtconfig.com/docs/resources/#exec
[2] https://github.com/python/cpython/blob/3.13/Lib/shlex.py

The text was updated successfully, but these errors were encountered:

purpleidea · 2025-02-22T17:33:52Z

as long as you choose correctly

👀

We'll do our best. A few thoughts to aid the design discussion:

Let's not worry about Windows being weird. I know it is weird, and so as a result, we should focus on what's optimal for Linux users. If that happens to be compatible with Windows, then great. If it's not, then we can always add a windows:exec resource.
We want the default to be to avoid invoking the shell. We don't want that overheard by default if we can exec a subprocess directly.
It would be most helpful if you added some pseudo-code exec examples of what you think is the "correct" approach. This will make it a bit less confusing and less ambiguous so I don't misunderstand your wishes!

Thank you =D

hades · 2025-02-22T18:16:30Z

Here are some examples. For the current implementation:

# BAD
exec "exec1" {
  cmd => "/usr/bin/ls -l    /tmp",
}
exec "exec2" {
  cmd => "/usr/bin/ls -l    /tmp > /tmp/file_list.txt",
  shell => "/bin/bash",
}
exec "exec3" {
  cmd => "/usr/bin/ls -l    /tmp > /tmp/file_list.txt",
  shell => "/bin/zsh",
}

For the suggested changes:

# GOOD (Option 1)
exec "exec1" {
  command => ["/usr/bin/ls",  "-l", "/tmp",],
}
exec "exec2" {
  shellcommand => "/usr/bin/ls -l    /tmp > /tmp/file_list.txt",
}
exec "exec3" {
  shellcommand => "/usr/bin/ls -l    /tmp > /tmp/file_list.txt",
  shell => "/bin/zsh",
}

Or with the Ansible option:

# GOOD (Option 2)
exec "exec1" {
  cmd => ["/usr/bin/ls",  "-l", "/tmp",],
}
shellexec "exec2" {
  cmd => "/usr/bin/ls -l    /tmp > /tmp/file_list.txt",
}
shellexec "exec3" {
  cmd => "/usr/bin/ls -l    /tmp > /tmp/file_list.txt",
  shell => "/bin/zsh",
}

purpleidea · 2025-02-22T19:28:03Z

If I understand correctly, your main goal is that we express the "command" as a list rather than as a string, is that correct?

Keep in mind we still need a scenario where the user uses the "name" variable and doesn't override with a cmd param, so that should still have some useful semantics. (When we use cmd or args or whatever relevant params, then the name is only used as a unique identifier, and not a source of data for the resource.)

Lastly, assuming we agree on this change, are you willing to implement a polished version?

hades · 2025-02-22T20:04:13Z

If I understand correctly, your main goal is that we express the "command" as a list rather than as a string, is that correct?

Well, yes. I want to emphasize that "the command" is inherently a list. Currently this list is represented as a space-delimited string, and I would like to represent it as a list.

Keep in mind we still need a scenario where the user uses the "name" variable and doesn't override with a cmd param, so that should still have some useful semantics.

Ah. I'm not sure I'm a big fan of this feature, but, sure, the name could be interpreted as a list of one item, i.e. these two are equivalent:

exec "/bin/eject" {
}

and

exec "/bin/eject" {
  command => ["/bin/eject",],
}

Is that what you have in mind?

Lastly, assuming we agree on this change, are you willing to implement a polished version?

Sure.

purpleidea · 2025-02-23T20:52:52Z

The behaviour of:

exec "/usr/bin/ls -l /tmp" {

}

must still work. And I expect it would be identical to doing:

exec "whatever" {
    cmd => ["/usr/bin/ls", "-l", "/tmp",],
}

Having said that, I'm okay with this change, but I'd love it if @frebib had a look to ACK it first.

Looking through the golang code, the "important" golang exec API is https://pkg.go.dev/os/exec#CommandContext (or the non ctx version) which both have argv[0] separate from the remaining args... So I suppose that's why I did it that way,

Is there a good reason why it separated those two instead of just having a single argv? The only argument I can think of is that it prevents someone who controls the argv array from controlling the binary path.

hades · 2025-02-23T21:37:52Z

The behaviour of [...] must still work.

Well, all the arguments against the status quo would still apply to this shorthand, and we would need to decide what exactly is the semantics of that string. Is it Windows-like or Ansible-like? In other words, is

exec "/usr/bin/ls -l 'Documents and Settings'" {}

equivalent to

exec "whatever" {
    command => ["/usr/bin/ls", "-l", "'Documents", "and", "Settings'",],
}

or

exec "whatever" {
    command => ["/usr/bin/ls", "-l", "Documents and Settings",],
}

Neither is particularly attractive to me, and in general I personally see no value in making that shorthand work, especially given its unclear semantics.

Is there a good reason why it separated those two instead of just having a single argv?

golang works in mysterious ways.

frebib · 2025-02-23T23:12:43Z

The behaviour of [...] must still work.

Well, all the arguments against the status quo would still apply to this shorthand, and we would need to decide what exactly is the semantics of that string. Is it Windows-like or Ansible-like? In other words, is
exec "/usr/bin/ls -l 'Documents and Settings'" {}
equivalent to
exec "whatever" {
    command => ["/usr/bin/ls", "-l", "'Documents", "and", "Settings'",],
}
or
exec "whatever" {
    command => ["/usr/bin/ls", "-l", "Documents and Settings",],
}
Neither is particularly attractive to me, and in general I personally see no value in making that shorthand work, especially given its unclear semantics.

Echoing what I said in chat. I 100% agree with hades' take here. It's a footgun and a good piece of software will be opinionated enough to protect it's users even at the cost of apparent convenience.

hades mentioned this issue Feb 22, 2025

exec: inconsistent Args semantics #793

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

exec: Cmd argument in non-shell mode should not split the string #794

exec: Cmd argument in non-shell mode should not split the string #794

hades commented Feb 20, 2025

purpleidea commented Feb 22, 2025

hades commented Feb 22, 2025

purpleidea commented Feb 22, 2025

hades commented Feb 22, 2025

purpleidea commented Feb 23, 2025

hades commented Feb 23, 2025

frebib commented Feb 23, 2025

exec: Cmd argument in non-shell mode should not split the string #794

exec: Cmd argument in non-shell mode should not split the string #794

Comments

hades commented Feb 20, 2025

purpleidea commented Feb 22, 2025

hades commented Feb 22, 2025

purpleidea commented Feb 22, 2025

hades commented Feb 22, 2025

purpleidea commented Feb 23, 2025

hades commented Feb 23, 2025

frebib commented Feb 23, 2025