Partial / Pre Parsing a CLI #1880

kbknapp · 2020-04-29T15:38:15Z

In the short term, we now have IgnoreErrors.

Longer term, I feel like #1404 is a more general solution and has existing precedence (Python's argparse at least) compared to a get_partial_matches(arg_name) which only works in a narrow set of cases.

Make sure you completed the following tasks

Searched the discussions
Searched the closed issues

Describe your use case

A way to declare a subset of arguments to be parsed, then return from parsing early. With the expectation that parsing will be called again, and run to full completion.

This most often comes up when you want some argument to handle some aspect of the CLI itself (such as --color flags to control error messages during parsing, or in the above --arg-file which needs to determine a file to load prior to parsing the rest of the CLI).

Describe the solution you'd like

From a user stand point, nothing changes. The run the CLI and things just work. From the a clap consumer stand point, it essentially allows you to pull out some values from the CLI (ignoring everything else), change something about the CLI and continue parsing.

// .. already have some App struct
let pre_matches = app.get_partial_matches("color"); // Only parse the `--color` option, ignore all else
app = if let Some("never") = pre_matches.value_of("color") {
    app.setting(AppSettings::ColorNever)
} else {
    app
};

let matches = app.get_matches();

Maybe it's biting off more than we want to chew for now, but this issue seems to come up time and time again where we want to parse some particular subset of arguments, and use those values to inform how the CLI itself behaves.

Alternatives, if applicable

You can already reparse your CLI twice.

let matches = app.get_matches();
app = if let Some("never") = matches.value_of("color") {
    app.setting(AppSettings::ColorNever)
} else {
    app
};

let matches = app.get_matches();

But this doesn't help if there are errors in the initial parse, or options included included in the initial parse that the initial app isn't aware of. For example, an iptables style CLI where -m <foo> enables many other arguments and options.

Additional context

This could potentially be used in #1693 #1089 #1232 #568

The text was updated successfully, but these errors were encountered:

alerque · 2020-05-05T07:53:32Z

This feature would also be immensely useful for implementing #380, localizing a CLI interface.

You can already reparse your CLI twice.

This is not quite true. You already mentioned some exceptions such as in the case of parse errors, but there is one more road block that makes that technique unworkable. A number of things happen in get_matches() that have the potential to short circuit the rest of the application life-cycle. For example having a help subcommand. If this is detected it immediately outputs stuff to the terminal and quits. This means you don't get a chance to reliably parse the CLI a second time.

This makes the technique useless for localization because by the time we get information about a potential language flag the App has already spit out strings that may have needed localization.

CreepySkeleton · 2020-06-02T20:11:32Z

As a minimal viable solution, we could simply allow ignoring unknown or missing positionals and options.

// first "probe" app
let preprocess = App::new("app")
    .settings(IgnoreErrors) // also disables --help, version, and others
    .arg("--color <color> 'coloring policy: never/always'")
    .get_matches(); // will succeed no matter what

let mut app = App::new("main_app");

match preprocess.value_of("color") {
    Some("always") => app = app.setting(ColorAlways),
    Some("never") => app = app.setting(ColorNever),
    _ => {}
}

// real parsing starts here
app.get_matches()

This kind or partial parsing would allow for most, if not all, cases and would be fairly easy to implement.

alerque · 2020-06-02T21:17:49Z

Yes that does indeed sound like a MVS, and something along the lines of what I expected to be able to hack together already and failed. It's not the most elegant solution maybe, but it would get me by for my use case of localization.

wabain · 2020-06-07T14:10:09Z

@alerque It sounds like get_matches_safe could be what you're looking for? That would address the immediate problem of clap prematurely exiting with help or an error.

CreepySkeleton · 2020-06-07T15:10:59Z

@wabain But it wouldn't give you the partial mach, see the color example above

acheronfail · 2020-07-01T23:35:08Z

I am building an app which spawns another process, and needs to "sniff" if certain flags were passed to the subprocess. Having an AppSettings::IgnoreErrors would be invaluable, since I only want to match some specific args, and leave the rest alone.

Right now I believe this is impossible with clap, and I have to parse the args manually.

For some more detail, my app is used like this:

app <args sent to subprocess>

The subprocess supports hundreds of arguments, and I don't want to have to write code to match all of them if I'm just interested in 2 or 3 flags.

The get_matches_safe doesn't work here for me though, since that bails out and completely fails.

kenoss · 2020-07-15T20:06:11Z

Let me have some examples because my English is poor and I might miss points.

The original problem is "lax position of arguments."

# Good.  `--project` option belongs to `Arg("gcloud")` and `--zone` to `Arg("ssh")`.
$ gcloud --project=hoge compute ssh --zone=asia-northeast1-a instance-name

# Good.  `--project` option can descend to subcommands.
$ gcloud compute --project=hoge ssh --zone=asia-northeast1-a instance-name

# Also good.
$ gcloud compute ssh --project=hoge --zone=asia-northeast1-a instance-name
 
# Bad.  `--zone` option belongs to `ssh` subcommand.
$ gcloud --project=hoge compute --zone=asia-northeast1-a ssh instance-name

I think this is solved by adding option to Arg. No tricks.
There's possibility the implementation of parsing become little complicated, but developer using clap can use intuitively.
This is also glad with the viewpoint of #1232 . Tricks like alternative App and get_matches_safe make the systematic completion harder.

This approach will be match with the @acheronfail 's case. I guess there're two types of app:

time othercommand ...
cargo run -- othercommand-or-options ...

The former is very limited. time do not have subcommands because subcommands conflict with othercommand. It can have only root App and parse will terminate with othercommand. Hence the above approach has no ambiguity.

The later is similar. Parsing will terminate with first "--".

CreepySkeleton · 2020-07-17T16:55:08Z

@kenoss Thanks for looking into this, we appreciate that. I didn't quite get your comment, so let's try to get on the same page first.

The original problem is "lax position of arguments."

Could you please elaborate? I think this is where the misunderstanding comes from. The original problem in my understanding is to "loosely parse flags, options, and subcommands but ignore positional arguments entirely". The -- thing is another topic; I'm not sure why you brought it up.

Good. --project option belongs to Arg("gcloud") and --zone to Arg("ssh").

You have introduced "belongs to" relationship between two arguments. I can't wrap my head around it - arguments can't belong to each other. An argument can "belong" to a subcommand or the top-level app.

Maybe you meant App?

The complexity of implementation... Guys, I think we've all been missing the elephant in the room. The current parsing algorithm can't just "ignore" arguments, especially positional ones. Well, It can ignore, but it can't recover. I think this issue belongs to "postponed until after I finally rewrite the parser algorithm" category.

kenoss · 2020-08-05T15:39:37Z

Sorry for late reply.

The original problem in my understanding is to "loosely parse flags, options, and subcommands but ignore positional arguments entirely".

I read the issue again and I've got the point. Thanks to catch me up the argument.

The -- thing is another topic; I'm not sure why you brought it up.

Sorry. I misunderstood by get_matches_safe and app <args sent to subprocess>.

Maybe you meant App?

Yes.

I've found that this is a critical blocker of #1232 and I should parsing implementation.

…rors clap-rs#1880

@CreepySkeleton

Implemented as AppSetting::Ignore errors as suggested by @CreepySkeleton in clap-rs#1880 (comment). This is not a complete implementation but it works already in surprisingly many situations. clap-rs#1880

@CreepySkeleton

Implemented as AppSetting::Ignore errors as suggested by @CreepySkeleton in clap-rs#1880 (comment). This is not a complete implementation but it works already in surprisingly many situations. clap-rs#1880

@CreepySkeleton

Implemented as AppSetting::Ignore errors as suggested by @CreepySkeleton in clap-rs#1880 (comment). This is not a complete implementation but it works already in surprisingly many situations. clap-rs#1880

@CreepySkeleton

Implemented as AppSetting::Ignore errors as suggested by @CreepySkeleton in clap-rs#1880 (comment). This is not a complete implementation but it works already in surprisingly many situations. clap-rs#1880

epage · 2021-12-09T19:40:38Z

In the short term, we now have IgnoreErrors.

Longer term, I feel like #1404 is a more general solution and has existing precedence (Python's argparse at least) compared to a get_partial_matches(arg_name) which only works in a narrow set of cases.

If there is any concern with this, feel free to comment and we can re-open this.

alerque · 2021-12-14T11:08:56Z

I don't see anything in #1404 covering the use cases I commented on above. An option would need to be added to stop the default short-circuit behavior where the app prints something and exits when encountering some flags, e.g. --help.

epage · 2021-12-14T14:24:30Z

I don't see anything in #1404 covering the use cases I commented on above. An option would need to be added to stop the default short-circuit behavior where the app prints something and exits when encountering some flags, e.g. --help.

So if I understand, you would define one App with the localization flag(s), parse that to know how to handle it, and then do a full parse.

For the partial parse, couldn't you either set NoAutoHelp / NoAutoVersion or DisableHelpFlag / DisableVersionFlag / DisableHelpSubcommand?

alerque · 2021-12-14T14:57:14Z

For the partial parse, couldn't you either set NoAutoHelp / NoAutoVersion or DisableHelpFlag / DisableVersionFlag / DisableHelpSubcommand?

Honestly, I hadn't considered that — but it seems like quite a song and dance for what I want. First I'd have to set both of those (since they aren't the default), do one parsing pass, then unset both of them again for the second pass. I would suggest that should be an option on the parsing function (or an alternate function) to disable all current & future Auto____ functionality during a parse.

Should I open a new issue for that?

epage · 2021-12-14T15:04:00Z

For the partial parse, couldn't you either set NoAutoHelp / NoAutoVersion or DisableHelpFlag / DisableVersionFlag / DisableHelpSubcommand?

Honestly, I hadn't considered that — but it seems like quite a song and dance for what I want. First I'd have to set both of those (since they aren't the default), do one parsing pass, then unset both of them again for the second pass. I would suggest that should be an option on the parsing function (or an alternate function) to disable all current & future Auto____ functionality during a parse.

Should I open a new issue for that?

You can, it would be a better place to continue to discuss it. With that said, I am concerned about the number of parsing functions we already have today and would be preferring we find ways to reduce the number, rather than increase it. As long as we provide the building blocks, that is my main concern. How nice we make things past that is dependent on how prevalent the workflow is.

Already, you either have to be setting IgnoreErrors or calling the partial parsing proposed in #1404 with subset of arguments you want verified. This seems like something that is already going to be a specialized parse pass, so the level of burden of setting a couple more flags seems low.

scottidler · 2022-03-14T01:16:24Z

fn extract(item: (ContextKind, &ContextValue)) -> Option<&ContextValue> {                                                                                         
    let (k, v) = item;                                                                                                                                            
    if k == ContextKind::InvalidArg {                                                                                                                             
        return Some(v);                                                                                                                                           
    }                                                                                                                                                             
    None                                                                                                                                                          
}                                                                                                                                                                 
                                                                                                                                                                  
fn parse_known_args(cmd: &Command) -> Result<(ArgMatches, Vec<String>), Error> {                                                                                  
    let mut rem: Vec<String> = vec![];                                                                                                                            
    let mut args: Vec<String> = env::args().collect();                                                                                                            
    loop {                                                                                                                                                        
        match cmd.clone().try_get_matches_from(&args) {                                                                                                           
            Ok(matches) => {                                                                                                                                      
                return Ok((matches, rem));                                                                                                                        
            },                                                                                                                                                    
            Err(error) => {                                                                                                                                       
                match error.kind() {                                                                                                                              
                    ErrorKind::UnknownArgument => {                                                                                                               
                        let items = error.context().find_map(extract);                                                                                            
                        match items {                                                                                                                             
                            Some(x) => {                                                                                                                          
                                match x {                                                                                                                         
                                    ContextValue::String(s) => {                                                                                                  
                                        rem.push(s.to_owned());                                                                                                   
                                        args.retain(|a| a != s);                                                                                                  
                                    },                                                                                                                            
                                    _ => {                                                                                                                        
                                        return Err(error);                                                                                                        
                                    },                                                                                                                            
                                }                                                                                                                                 
                            },                                                                                                                                    
                            None => {                                                                                                                             
                                return Err(error);                                                                                                                
                            },                                                                                                                                    
                        }                                                                                                                                         
                    },                                                                                                                                            
                    _ => {                                                                                                                                        
                        return Err(error);                                                                                                                        
                    },                                                                                                                                            
                }                                                                                                                                                 
            },                                                                                                                                                    
        }                                                                                                                                                         
    }                                                                                                                                                             
}

fn main() {                                                                                                                                                       
    let cmd = Command::new("cmd")                                                                                                                                 
        .allow_hyphen_values(true)                                                                                                                                
        .trailing_var_arg(true)                                                                                                                                   
        .disable_help_flag(true)                                                                                                                                  
        .disable_version_flag(true)                                                                                                                               
        .arg(                                                                                                                                                     
            Arg::new("config")                                                                                                                                    
                .takes_value(true)                                                                                                                                
                .short('c')                                                                                                                                       
                .long("config")                                                                                                                                   
                .help("path to config"),                                                                                                                          
        );                                                                                                                                                        
    let result = parse_known_args(&cmd);                                                                                                                          
    let (matches, rem) = result.unwrap();                                                                                                                         
    println!("matches={:#?}", matches);                                                                                                                           
    println!("rem={:#?}", rem);   
    let matches = cmd.get_matches_from(&rem);
}

For anyone who comes across this and is looking for an equivalent to Python Argparse's parse_known_args, I hacked this code together. I am sure it could be improved as I am new to Rust. It would be nice to have some ergonomics in the Clap code base for this use case, but until then maybe this provides some usefulness to someone.

For anyone who doesn't understand why you would want this, it is a VERY common pattern for me to supply the config file in the first pass, that will either effect the default values if not alter the program's options entirely.

Clap bails parsing when an "error" is encountered, e.g. a subcommand is missing, "--help" is passed, or the "help" subcommand is invoked. This means that the current approach of parsing args does not handle flags like `--no-pager` or `--color` when an error is encountered. Fix this by separating early args into their own struct and preprocessing them using `ignore_errors` (per clap-rs/clap#1880). The early args are in a new `EarlyArgs` struct because of a known bug where `ignore_errors` causes default values not to be respected (clap-rs/clap#4391 specifically calls out bool, but strings may also be ignored), so when `ignore_errors` is given, the default values will be missing and parsing will fail unless the right arg types are used (e.g`. Option`). By parsing only early args (using the new struct) we only need to adjust `no_pager`, instead of adjusting all args with a default value.

novafacing · 2023-09-09T00:02:52Z

Sorry for the long time bump, I liked @scottidler's answer here so I've made it a little more generic. Slap this at the top of your cli code and you're good to go. My use case is for wrapping clang for LibAFL use -- I want to extract some arguments and pass everything else down to the clang command I'm wrapping.

use anyhow::{anyhow, bail, Result};
use clap::{
    error::{ContextKind, ContextValue, ErrorKind},
    ArgMatches, Command, CommandFactory, FromArgMatches, Parser,
};
use std::env::args;

#[derive(Debug, Clone)]
struct Known<T>
where
    T: FromArgMatches,
{
    matches: T,
    rest: Vec<String>,
}

impl<T> Known<T>
where
    T: FromArgMatches,
{
    pub fn new<I, S>(matches: ArgMatches, rest: I) -> Result<Self>
    where
        I: IntoIterator<Item = S>,
        S: AsRef<str>,
    {
        Ok(Self {
            matches: T::from_arg_matches(&matches)?,
            rest: rest.into_iter().map(|a| a.as_ref().to_string()).collect(),
        })
    }
}

trait ParseKnown: FromArgMatches {
    fn parse_known() -> Result<Known<Self>>;
}

impl<T> ParseKnown for T
where
    T: CommandFactory + FromArgMatches,
{
    fn parse_known() -> Result<Known<T>> {
        let command = Self::command();
        let mut rest = Vec::new();
        let mut args = args().collect::<Vec<_>>();

        loop {
            match command.clone().try_get_matches_from(&args) {
                Ok(matches) => {
                    return Known::new(matches, rest);
                }
                Err(e) if matches!(e.kind(), ErrorKind::UnknownArgument) => {
                    if let Some(ContextValue::String(v)) = e
                        .context()
                        .find_map(|(k, v)| matches!(k, ContextKind::InvalidArg).then_some(v))
                    {
                        let unknown = args
                            .iter()
                            .find(|a| a.starts_with(v))
                            .cloned()
                            .ok_or_else(|| anyhow!("No argument starts with {}", v))?;

                        if args.contains(&unknown) {
                            args.retain(|a| a != &unknown);
                            rest.push(unknown);
                        } else {
                            bail!("Got unknown argument {} but it is not in args at all.", v);
                        }
                    } else {
                        bail!("No string value found for unknown argument: {}", e);
                    }
                }
                Err(e) => bail!("Error getting matches from args '{:?}': {}", args, e),
            }
        }
    }
}

#[derive(Parser, Debug, Clone)]
#[clap(
    allow_hyphen_values = true,
    trailing_var_arg = true,
    disable_help_flag = true,
    disable_version_flag = true
)]
struct Args {
    #[arg(long)]
    libafl_no_link: bool,
    #[arg(long)]
    libafl: bool,
    #[arg(long)]
    libafl_ignore_configurations: bool,
    #[arg(long, value_delimiter = ',')]
    libafl_configurations: Vec<String>,
}

fn main() -> Result<()> {
    let args = Args::parse_known()?;
    println!("{:#?}", args);
    Ok(())
}

Handles e.g. cargo run -- -x --libafl -y -zexecstack -fsanitize=fuzzer-no-link -fno-common --libafl-ignore-configurations (the above example will fail with single-hyphen commands)

kbknapp added T: new feature A-parsing Area: Parser's logic and needs it changed somehow. labels Apr 29, 2020

kbknapp mentioned this issue May 1, 2020

Support or document localization of messages and flags #380

Open

kbknapp mentioned this issue May 11, 2020

Compute argument possible values based on other arguments #1910

Closed

CreepySkeleton mentioned this issue Jun 2, 2020

Dynamic completion support #1232

Closed

tpiekarski mentioned this issue Jun 5, 2020

"-c never" and NO_COLOR are not respected for -h and other usage messages sharkdp/fd#600

Closed

pksunkara mentioned this issue Mar 30, 2021

Partially parse args, capturing unknown args into a vec, rather than error #1404

Open

Urhengulas mentioned this issue Apr 5, 2021

Improve cli argument parsing knurling-rs/flip-link#31

Open

kolloch added a commit to kolloch/clap that referenced this issue May 13, 2021

AppSettings:IgnoreErrors: Allow partial parsing in the presence of er…

f4e2688

…rors clap-rs#1880

kolloch mentioned this issue May 13, 2021

setting: IgnoreErrors - Allow parsing despite missing option values #2480

Merged

pksunkara mentioned this issue May 26, 2021

Provide a Arg::custom_parse style function #1185

Closed

QuarticCat mentioned this issue Jun 5, 2021

Beautify help messages bootandy/dust#147

Merged

epage mentioned this issue Jul 30, 2021

Allow end users to indicate that specific arguments may be ignored if not recognized #2354

Closed

2 tasks

pksunkara removed the W: 3.x label Aug 13, 2021

This was referenced Dec 6, 2021

Dynamic completion support epage/clapng#92

Open

Partial / Pre Parsing a CLI epage/clapng#156

Open

Allow end users to indicate that specific arguments may be ignored if not recognized epage/clapng#179

Open

epage added C-enhancement Category: Raise on the bar on expectations and removed T: RFC / question labels Dec 8, 2021

epage closed this as completed Dec 9, 2021

epage added the S-wont-fix Status: Closed as there is no plan to fix this label Jan 11, 2022

chooglen mentioned this issue Dec 1, 2022

Paginate help output jj-vcs/jj#824

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Partial / Pre Parsing a CLI #1880

Partial / Pre Parsing a CLI #1880

kbknapp commented Apr 29, 2020 •

edited by epage

Loading

alerque commented May 5, 2020 •

edited

Loading

CreepySkeleton commented Jun 2, 2020

alerque commented Jun 2, 2020

wabain commented Jun 7, 2020

CreepySkeleton commented Jun 7, 2020

acheronfail commented Jul 1, 2020

kenoss commented Jul 15, 2020 •

edited

Loading

CreepySkeleton commented Jul 17, 2020

kenoss commented Aug 5, 2020

epage commented Dec 9, 2021

alerque commented Dec 14, 2021

epage commented Dec 14, 2021

alerque commented Dec 14, 2021

epage commented Dec 14, 2021

scottidler commented Mar 14, 2022

novafacing commented Sep 9, 2023 •

edited

Loading

Partial / Pre Parsing a CLI #1880

Partial / Pre Parsing a CLI #1880

Comments

kbknapp commented Apr 29, 2020 • edited by epage Loading

Make sure you completed the following tasks

Describe your use case

Describe the solution you'd like

Alternatives, if applicable

Additional context

alerque commented May 5, 2020 • edited Loading

CreepySkeleton commented Jun 2, 2020

alerque commented Jun 2, 2020

wabain commented Jun 7, 2020

CreepySkeleton commented Jun 7, 2020

acheronfail commented Jul 1, 2020

kenoss commented Jul 15, 2020 • edited Loading

CreepySkeleton commented Jul 17, 2020

kenoss commented Aug 5, 2020

epage commented Dec 9, 2021

alerque commented Dec 14, 2021

epage commented Dec 14, 2021

alerque commented Dec 14, 2021

epage commented Dec 14, 2021

scottidler commented Mar 14, 2022

novafacing commented Sep 9, 2023 • edited Loading

kbknapp commented Apr 29, 2020 •

edited by epage

Loading

alerque commented May 5, 2020 •

edited

Loading

kenoss commented Jul 15, 2020 •

edited

Loading

novafacing commented Sep 9, 2023 •

edited

Loading