std::os::argparse module #1897

Open
wants to merge 2 commits into
base: master

alexveden
Contributor

std::os::argparse module

@hwchen
Contributor

hwchen commented Jan 27, 2025

I don't know if the API for std argparse has already been discussed (It's not obvious from a quick search of issues or looking at the test runner pr). If not, I've got opinions 😄 and code I'd be willing to donate. But if this has already been decided I don't want to derail.

@alexveden
Contributor Author

There are a bunch of argparse tests in the test runner PR, so you can get a sense of its API from them. Anyway, I'm open to ideas.

@lerno
Collaborator

lerno commented Jan 29, 2025

@hwchen did you have some feedback?

@hwchen
Contributor

hwchen commented Jan 31, 2025

Just want to be clear that I'm not really commenting on the current implementation. I'm more interested in whether there's a certain type of API we're looking for in an argparse module.

I come from Rust, and not C, so I'll explain in terms of those libraries.

  • Clap is very full-featured: help text generation, deriving a parser from struct attributes (tags), explicit subcommands, validation, and a built-in API for parsing common types.
  • lexopt is very minimal, it only provides a stream of values/options.

The ripgrep crate moved away from clap to lexopt, in part to reduce dependencies, and also because lexopt would end up providing more control over arg parsing (at the cost of having to implement more boilerplate).

I feel that the current PR API sits between the two (more towards simplicity). I think for stdlib, I'd prefer either extreme: if it's simpler, more complex parsers can be built on it, and if it has more features, it can be used easily as-is for more scenarios. Odin ended up with something more comprehensive (you can define opts using a struct with tags).
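To make the minimal end of that spectrum concrete, here is a rough C sketch of a lexopt-style token stream. It is not lexopt's API and not the API in this PR: `ArgStream`, `arg_stream_next` and `arg_stream_value` are hypothetical names, and the sketch assumes short options may be clustered (`-vv`) and that `--name=value` and a bare `--` need no special handling.

```c
#include <stddef.h>

/* Hypothetical token kinds for a lexopt-like stream. */
typedef enum { ARG_END, ARG_SHORT, ARG_LONG, ARG_VALUE } ArgKind;

typedef struct {
    ArgKind kind;
    char short_name;     /* set for ARG_SHORT */
    const char *text;    /* long name without "--", or a positional value */
} Arg;

typedef struct {
    int argc;
    char **argv;
    int index;           /* next argv slot to read */
    const char *cluster; /* unread rest of a "-abc" cluster, or NULL */
} ArgStream;

ArgStream arg_stream_init(int argc, char **argv)
{
    ArgStream s = { argc, argv, 1, NULL };
    return s;
}

/* Returns the next token; the caller decides what each token means. */
Arg arg_stream_next(ArgStream *s)
{
    Arg a = { ARG_END, 0, NULL };
    if (s->cluster && *s->cluster) {         /* still inside "-abc" */
        a.kind = ARG_SHORT;
        a.short_name = *s->cluster++;
        return a;
    }
    s->cluster = NULL;
    if (s->index >= s->argc) return a;       /* ARG_END */
    const char *arg = s->argv[s->index++];
    if (arg[0] == '-' && arg[1] == '-' && arg[2] != '\0') {
        a.kind = ARG_LONG;
        a.text = arg + 2;
    } else if (arg[0] == '-' && arg[1] != '\0' && arg[1] != '-') {
        a.kind = ARG_SHORT;
        a.short_name = arg[1];
        s->cluster = arg + 2;
    } else {
        a.kind = ARG_VALUE;                  /* positional, "-" or "--" */
        a.text = arg;
    }
    return a;
}

/* Value for the option just returned: the rest of the cluster ("-ofile")
   or the next argv entry ("-o file"). NULL if nothing is left. */
const char *arg_stream_value(ArgStream *s)
{
    if (s->cluster && *s->cluster) {
        const char *v = s->cluster;
        s->cluster = NULL;
        return v;
    }
    s->cluster = NULL;
    return (s->index < s->argc) ? s->argv[s->index++] : NULL;
}
```

A caller would loop on `arg_stream_next` and switch on the token, calling `arg_stream_value` for options that take a value. Nothing is allocated, and help text, validation and subcommands would all have to be built on top.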

Also, I believe that wherever we want to sit on the spectrum, it's good to be explicit about it.


As for my own biases, I've written an arg parsing library for c3 which follows the general structure of lexopt's API. I might prefer something like it in the std library, but I can also see the appeal of other approaches. And seeing as everybody ends up writing their own argparse, there's probably a lot of other opinions out there too :)

@tomaskallup
Contributor

tomaskallup commented Feb 1, 2025

I have a bit of feedback on this.

I feel like the API is fine; it's exactly what it says it is, an argument parser. If something more like a full-blown CLI app API were needed (to have zero-hassle subcommands and whatnot), it could live in another module, which would use argparse under the hood.

What I currently don't see is a way to provide an array option, since from the implementation it would seem that providing a single option multiple times would result in an error of "duplicated option". The value of the option could be handled by the callback function from the looks of it.

The only other thing that came to mind was a bit more "hackability". For example, if I wanted to implement validation of a parameter, I would have to do it myself after parsing, and I would also have to write the extra help info (if it was, for example, an enum). But again, this could be solved by the wrapping module, which would hold the user's hand a bit more. Although my view is similar to hwchen's above, I feel like the current implementation here is good enough, and if one wants to opt out of some of the features, they still can (for example, the help option is opt-in).

Edit: I see now that the callback function can return optional, which makes the hackability possible for validation or exclusivity of options.

@alexveden
Contributor Author

> What I currently don't see is a way to provide an array option, since from the implementation it would seem that providing a single option multiple times would result in an error of "duplicated option". The value of the option could be handled by the callback function from the looks of it.

This is the kind of thing I was thinking about. I think it's common for a CLI to have accumulated values, e.g. -vvvv for verbosity levels. I didn't implement arrays because I wanted argparse to be non-allocating. But I think it may be a good idea to support multiple values, or at least make that possible with callbacks.

So by design, the callback mechanism is the way to extend argparse to whatever is needed. I can refine callbacks and arrays of arguments after PR approval.

> The only other thing that came to mind was a bit more "hackability", for example if I wanted to somehow implement validation of a parameter, I would have to do it myself after the parsing and I would also have to write the extra help info (if it was for example an enum).

All hackability is implemented via callbacks, or via explicit param validation after parsing in main (or another function). The argparse module still does simple validation: if you expect an int for the option value and are given a string, it will raise a validation error. More complex cases should be handled by the program, either in an argparse callback or in regular code after parsing completes.
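As an illustration of the callback idea only, here is a minimal C sketch in which the parser stores nothing itself and calls a per-option callback on every occurrence, so counting flags and collecting repeated values both become caller-side code. This is not the API in this PR: `OptSpec`, `parse_with_callbacks`, `count_flag`, `append_value` and the fixed capacity of 8 are all assumptions made for the example, and only short options are handled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: receives the option value (NULL for plain flags)
   and a user pointer; returning false aborts parsing. */
typedef bool (*OptCallback)(const char *value, void *user);

typedef struct {
    char short_name;
    bool takes_value;
    OptCallback on_option;
    void *user;
} OptSpec;

/* Counts occurrences, so "-vvv" yields 3. */
bool count_flag(const char *value, void *user)
{
    (void)value;
    ++*(int *)user;
    return true;
}

/* Fixed-capacity list: repeated values without any allocation. */
typedef struct {
    const char *items[8];
    int count;
} StrList;

bool append_value(const char *value, void *user)
{
    StrList *list = user;
    if (list->count >= 8) return false;   /* callback-level validation */
    list->items[list->count++] = value;
    return true;
}

/* Handles short options only; positionals are out of scope here. */
bool parse_with_callbacks(int argc, char **argv,
                          const OptSpec *specs, size_t nspecs)
{
    for (int i = 1; i < argc; i++) {
        const char *arg = argv[i];
        if (arg[0] != '-' || arg[1] == '\0') return false;
        for (const char *c = arg + 1; *c; c++) {
            const OptSpec *spec = NULL;
            for (size_t k = 0; k < nspecs; k++) {
                if (specs[k].short_name == *c) { spec = &specs[k]; break; }
            }
            if (!spec) return false;          /* unknown option */
            const char *value = NULL;
            if (spec->takes_value) {
                /* Value must be the next argv entry ("-e KEY=VALUE"). */
                if (c[1] != '\0' || ++i >= argc) return false;
                value = argv[i];
            }
            if (!spec->on_option(value, spec->user)) return false;
            if (spec->takes_value) break;     /* option consumed this arg */
        }
    }
    return true;
}
```

With this shape, a repeated option is not an error by default; an option that must appear only once would enforce that in its own callback.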

@tomaskallup
Contributor

So for arrays, would the arg just need a simple "multiple" flag, while also requiring you to use the callback?

Since now it would call the callback once and then error. I'm fine with arrays not being available by default and requiring custom implementation.

@alexveden
Contributor Author

FYI, I've found array args impractical in most cases; I can barely remember any tool I've used that takes them, except maybe gcc :). For simple use, it's possible to use --flag followed by an array of arguments.

@tomaskallup
Contributor

That's what most tools do, single flag with values separated by some character. But sometimes you might want those values to be arbitrary strings and there might not be a feasible separator, like when specifying ENV variables for docker etc.

@lerno
Collaborator

lerno commented Feb 4, 2025

I am sorry this one hasn't been looked at yet. It's half past midnight and I don't have the time this lib deserves to check it. I'll need to push it to the weekend.

@lerno
Collaborator

lerno commented Feb 5, 2025

Maybe I'm not the audience for something like this, but for me a simpler design is more natural, as you might have guessed from the way build_options.c works.

It is quite simple: loop over the args and look at each one. If the arg starts with -, run it through a switch over the short options; if a second - follows, it is a long option and is checked against the long options instead.

This way checking is trivially stateful, which can be useful.

So the useful functionality is not parsing the arguments but rather:

  1. Skip an argument
  2. Check if a string (argument) is a valid file or directory
  3. Check if a string (argument) is an int
  4. Check if a string (argument) is one in a list of values, and return that index.

What are your thoughts?
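For reference, the switch-based approach and the four helpers listed above might look roughly like this in C. This is a sketch, not code from build_options.c: the `Options` struct, the helper names, and the `-v`, `-j`, `-C` and `--opt` options are all hypothetical, and the path check assumes POSIX `stat`.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical option state; these names are illustrative only. */
typedef struct {
    int verbosity;
    int jobs;
    int opt_level;        /* index into OPT_LEVELS, -1 if unset */
    const char *workdir;
    const char *input;
} Options;

static const char *const OPT_LEVELS[] = { "none", "size", "speed" };
enum { OPT_LEVEL_COUNT = 3 };

/* 1. Skip to the next argument; NULL when argv is exhausted. */
const char *next_arg(int argc, char **argv, int *i)
{
    return (*i + 1 < argc) ? argv[++*i] : NULL;
}

/* 2. Is the string an existing file or directory? */
bool arg_is_path(const char *arg)
{
    struct stat st;
    return stat(arg, &st) == 0
        && (S_ISREG(st.st_mode) || S_ISDIR(st.st_mode));
}

/* 3. Is the string a base-10 int? Stores the value on success. */
bool arg_is_int(const char *arg, int *out)
{
    char *end;
    errno = 0;
    long v = strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || errno == ERANGE) return false;
    if (v < INT_MIN || v > INT_MAX) return false;
    *out = (int)v;
    return true;
}

/* 4. Index of the string in a list of allowed values, or -1. */
int arg_index_of(const char *arg, const char *const *list, int count)
{
    for (int i = 0; i < count; i++) {
        if (strcmp(arg, list[i]) == 0) return i;
    }
    return -1;
}

/* Stateful by construction: each case can inspect and update `o`. */
bool parse_args(int argc, char **argv, Options *o)
{
    for (int i = 1; i < argc; i++) {
        const char *arg = argv[i];
        const char *val;
        if (arg[0] != '-' || arg[1] == '\0') {     /* positional */
            o->input = arg;
            continue;
        }
        if (arg[1] == '-') {                        /* long option */
            if (strcmp(arg + 2, "opt") != 0) return false;
            val = next_arg(argc, argv, &i);
            if (!val) return false;
            o->opt_level = arg_index_of(val, OPT_LEVELS, OPT_LEVEL_COUNT);
            if (o->opt_level < 0) return false;
            continue;
        }
        if (arg[2] != '\0') return false;  /* no clusters in this sketch */
        switch (arg[1]) {
        case 'v':
            o->verbosity++;
            break;
        case 'j':
            val = next_arg(argc, argv, &i);
            if (!val || !arg_is_int(val, &o->jobs)) return false;
            break;
        case 'C':
            val = next_arg(argc, argv, &i);
            if (!val || !arg_is_path(val)) return false;
            o->workdir = val;
            break;
        default:
            return false;
        }
    }
    return true;
}
```

The parsing loop itself stays in user code; a library would only need to supply the small helpers.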

4 participants