refactor!: everything #85

Open: wants to merge 1 commit into base: main

@sirewix sirewix commented May 5, 2025

Excuse the bold title, but here it is

New "Command" language

Overall the syntax is different, but it is inspired by and somewhat resembles the current one. The idea is to separate the DSL from containers in both the parsing and processing stages. The new mdsh Command consists of these parts:

[langname] <out_cmd> <in_cmd> [data_line]
[data]

in_cmd defines how and where to source data, it can be one of three:

  • < — read file as is. The file path is taken from data_line. If data is available, it is read line by line as filenames, and each file is concatenated to the previous one.
  • $ — command execution. If data_line is available, it is executed as a shell command. If data is available, it is passed to the command via stdin (Closes add syntax for using existing snippet as input #57). If only data is available but not data_line, then the data is executed as a shell script.
  • "empty command" aka "use data as is", concatenating data_line and data. In practice this is useful only for setting env variables.
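The three in_cmd rules above can be read as a small decision table. The following is only a sketch of that table, not the PR's code: the names `InCmd`, `Source` and `plan_source` are hypothetical, and two details are assumptions (that `<` reads the data_line file first and then the files listed in data, and that the empty command joins data_line and data with a newline).

```rust
/// The three input commands (hypothetical names, not taken from mdsh).
#[derive(Debug, Clone, Copy, PartialEq)]
enum InCmd {
    ReadFile, // `<`
    Execute,  // `$`
    AsIs,     // empty command
}

/// What the input stage should do, before any I/O happens.
#[derive(Debug, PartialEq)]
enum Source {
    /// Read these files in order and concatenate them.
    Files(Vec<String>),
    /// Run `line` in a shell, optionally feeding `stdin`.
    Command { line: String, stdin: Option<String> },
    /// Run the data itself as a shell script.
    Script(String),
    /// Use the text as is.
    Raw(String),
}

fn plan_source(cmd: InCmd, data_line: &str, data: Option<&str>) -> Source {
    let line = data_line.trim();
    match cmd {
        InCmd::ReadFile => {
            // Assumption: data_line comes first, then one filename per data line.
            let mut files = Vec::new();
            if !line.is_empty() {
                files.push(line.to_string());
            }
            if let Some(d) = data {
                files.extend(
                    d.lines()
                        .map(str::trim)
                        .filter(|l| !l.is_empty())
                        .map(String::from),
                );
            }
            Source::Files(files)
        }
        InCmd::Execute => match (line.is_empty(), data) {
            // data_line present: it is the command, data (if any) is stdin.
            (false, stdin) => Source::Command {
                line: line.to_string(),
                stdin: stdin.map(String::from),
            },
            // Only data present: the data is the script.
            (true, Some(script)) => Source::Script(script.to_string()),
            (true, None) => Source::Script(String::new()),
        },
        InCmd::AsIs => {
            // Assumption: data_line and data are joined with a newline.
            let mut out = line.to_string();
            if let Some(d) = data {
                if !out.is_empty() {
                    out.push('\n');
                }
                out.push_str(d);
            }
            Source::Raw(out)
        }
    }
}
```

Keeping the decision separate from the actual file reads and process spawning makes the rules easy to test without touching the filesystem.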

out_cmd defines what to do with the data from in_cmd, it can be one of three:

  • > lang — produce code block with lang (similarly to current as lang statements).
  • > — produce raw markdown output fenced by comment-tags
  • ! — expand data to shell variables

With these 3 * 3 commands you get 9 combinations, for example:

  • > < include.md — read file and produce raw markdown
  • > py < script.py — read script.py and produce code block with language py
  • > yml $ ./script.py foo $bar — execute script.py foo $bar in shell and produce yml code block
  • >$ ./gen-md.py — execute gen-md.py and produce raw markdown
  • ! foo=$bar — use foo=$bar as "raw data" and expand env variables that can be used in the next shell executions
  • !< .env — read .env and eval shell vars
  • !$ ./gen-vars.py — execute gen-vars and treat output as the list of shell variable assignments

So it can do quite a lot of things with a pretty simple underlying model. It even allows some useless things: > hello would produce an empty code block with the hello language.
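To make the two-stage shape concrete, here is a minimal std-only sketch that parses the part of a Command after the optional langname into out_cmd, in_cmd and data_line. It is not the PR's nom-based parser. The names `OutCmd`, `InCmd`, `Command` and `parse_command` are hypothetical, and it assumes the first word after > (when it is not < or $) is the output language.

```rust
#[derive(Debug, PartialEq)]
enum OutCmd {
    CodeBlock(String), // `> lang`
    RawMarkdown,       // `>`
    ShellVars,         // `!`
}

#[derive(Debug, PartialEq)]
enum InCmd {
    ReadFile, // `<`
    Execute,  // `$`
    AsIs,     // empty command
}

#[derive(Debug, PartialEq)]
struct Command {
    out: OutCmd,
    input: InCmd,
    data_line: String,
}

/// Second stage: an optional `<` or `$`, then the rest is data_line.
fn split_in_cmd(rest: &str) -> (InCmd, String) {
    let rest = rest.trim_start();
    if let Some(r) = rest.strip_prefix('<') {
        (InCmd::ReadFile, r.trim().to_string())
    } else if let Some(r) = rest.strip_prefix('$') {
        (InCmd::Execute, r.trim().to_string())
    } else {
        (InCmd::AsIs, rest.trim().to_string())
    }
}

/// First stage: `!` or `>` (with an optional language), then the in_cmd.
fn parse_command(src: &str) -> Option<Command> {
    let src = src.trim();
    if let Some(rest) = src.strip_prefix('!') {
        let (input, data_line) = split_in_cmd(rest);
        return Some(Command { out: OutCmd::ShellVars, input, data_line });
    }
    let rest = src.strip_prefix('>')?.trim_start();
    // The language token ends at whitespace or at the in_cmd symbol.
    let lang_end = rest
        .find(|c: char| c.is_whitespace() || c == '<' || c == '$')
        .unwrap_or(rest.len());
    let (lang, rest) = rest.split_at(lang_end);
    let out = if lang.is_empty() {
        OutCmd::RawMarkdown
    } else {
        OutCmd::CodeBlock(lang.to_string())
    };
    let (input, data_line) = split_in_cmd(rest);
    Some(Command { out, input, data_line })
}
```

Under these assumptions, `parse_command("> py < script.py")` yields a code block with language `py` fed by a file read, and `parse_command("! foo=$bar")` yields shell variables from raw data, since the `$` inside the data_line is not in the in_cmd position.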

Containers

Commands can be put into containers, here's all of them:

  • inline code blocks: langname is skipped, parsing starts right from out_cmd, data is absent. Must start on a new line and end with a newline.

  • code blocks:

    ```[langname] <out_cmd> <in_cmd> [data_line]
    [data]
    ```
    

    Source env vars:

    ```env !
    foo=$bar
    ```

    Execute a script and produce a yaml block (you can even put a shebang at the top and use scripting languages other than bash).

    ```sh > yaml $
    echo 'foo: true'
    ```

    Run data_line as a oneline command and pass the code block content to it via stdin, producing raw markdown.

    ```> $ sed 's/.*/Hi, \0/'
    Bobby
    ```
  • oneline comments, similar to inline code blocks but hidden: <!-- >< LICENSE.md --> — includes LICENSE.md

  • multiline comment blocks, which behave similarly to code blocks, but langname is not needed

    <!-- > yml $
    echo 'hi: true'
    -->
  • links. These deviate slightly from the rest of the containers:

    [<out_cmd> <in_cmd> whatever here is ignored](<data_line>)
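As an illustration of how a container can be peeled off before the Command text is parsed, here is a small std-only sketch for the comment and link containers. The helper names `split_comment` and `split_link` are hypothetical and the PR itself uses nom parsers. The sketch also assumes the whole container has already been isolated as one string.

```rust
/// Splits a comment container into (command text, optional data).
/// A oneline comment has no data; in a multiline comment the first
/// line is the command and the remaining lines are the data.
fn split_comment(src: &str) -> Option<(String, Option<String>)> {
    let inner = src.trim().strip_prefix("<!--")?.strip_suffix("-->")?;
    match inner.split_once('\n') {
        Some((head, body)) => Some((head.trim().to_string(), Some(body.to_string()))),
        None => Some((inner.trim().to_string(), None)),
    }
}

/// Splits a link container into (label, target). The label carries
/// out_cmd and in_cmd (plus ignored text); the target is the data_line.
fn split_link(src: &str) -> Option<(String, String)> {
    let rest = src.trim().strip_prefix('[')?;
    let (label, rest) = rest.split_once("](")?;
    let target = rest.strip_suffix(')')?;
    Some((label.trim().to_string(), target.trim().to_string()))
}
```

With these helpers, `<!-- >< LICENSE.md -->` yields the command text `>< LICENSE.md` with no data, which is the same text an inline code block would carry.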
    

Key syntax design changes

  • Get rid of as lang constructions and move lang to the left so that source code in a code block can be properly highlighted
  • Keep old > $ ! with the same meaning but differently arranged, so a Command is a little pipeline of two stages now
  • Fix > behavior in code blocks: it was quite useless, since it produced raw markdown
  • Complement the set of supported containers with multiline comment blocks (Closes Support multiline commands in comments #76)

Key code changes

  • Rewrite all parsing with the nom library. The parsing code is now easier to read and maintain, and recursive parsing becomes possible. For example, in this PR comment tag fences can be nested in other comment tag fences, i.e.
    <!-- BEGIN mdsh -->
    <!-- BEGIN mdsh -->
    hello
    <!-- END mdsh -->
    <!-- END mdsh -->
    This would allow including files that are themselves processed by markdown, and it shouldn't break
  • Horizontal layering: parsers in one pile, executors in another
  • Consistent error handling (with anyhow). It is currently a mixture of io::Errors, unwraps, panics, expects and a custom die function.
  • Removed regexes, they are too hard to read and maintain compared to parser combinators
  • Extensive use of Write and Read that could potentially give some performance boost in some cases
  • Fence everything produced by mdsh with comment tags. It makes parsing much simpler, and also closes mdsh clobbers code block immediately following command substitution #29
  • Env var substitution changed from bash expansion to shellexpand. It supports simple expressions like ${foo-def}, but not arbitrary code execution like $(date). I think real scripts ($ command) should be used for that if it is needed
  • Added generation of test cases that also act as documentation (see spec.{clear,processed}.md)
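The nested comment tag fences mentioned above can be handled by tracking nesting depth. The sketch below is std-only and is not the PR's nom-based recursive parser. The function name `strip_generated` is hypothetical, and it assumes each fence tag sits on its own line and that everything inside the outermost fence is generated output that can be dropped before regeneration.

```rust
const BEGIN: &str = "<!-- BEGIN mdsh -->";
const END: &str = "<!-- END mdsh -->";

/// Removes every fenced region, including fences nested inside it.
/// Returns None when BEGIN and END tags are not balanced.
fn strip_generated(input: &str) -> Option<String> {
    let mut depth = 0usize;
    let mut out = String::new();
    for line in input.lines() {
        match line.trim() {
            BEGIN => depth += 1,
            // An END without a matching BEGIN is an error.
            END => depth = depth.checked_sub(1)?,
            // Outside any fence: keep the line.
            _ if depth == 0 => {
                out.push_str(line);
                out.push('\n');
            }
            // Inside a fence: generated output, drop it.
            _ => {}
        }
    }
    if depth == 0 {
        Some(out)
    } else {
        None
    }
}
```

Because only the depth counter decides what is kept, an included file that already contains its own BEGIN/END pair does not end the outer fence early.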

@sirewix sirewix force-pushed the refactor-all branch 9 times, most recently from c8de0a6 to 4462d76 Compare May 8, 2025 01:00
@sirewix sirewix marked this pull request as ready for review May 8, 2025 01:07