-
Notifications
You must be signed in to change notification settings - Fork 88
WIP: Rust rewrite of wordlistctl #32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Creating config file for rust rewrite of wordlistctl
Repo file replaced with more rust like toml file format, more human readable and easier to edit
Replace current manpage with manpage generated by `clap_mangen` -- https://crates.io/crates/clap_mangen Mangen automatically generates ROFF files according to attributes written in code
Unneeded in Rust projects
repo.json moved to config/repo.toml
Improper gitignore fixed
Unneeded `use` statement made to have access to enum `Units`
Going to test arg! vs Arg builder. Also fixed floating point number issues in units.rs
added repo structs auto-generated by `https://transform.tools/json-to-rust-serde` further edits should be done by hand by adding to the `repo.toml` and then added the needed structs in `repo.rs` follow same naming scheme as autogenerated
unneeded after further testing
Worked on reading config but gave up after too many lifetime issues, did a bunch of work on parsing data and getting stuff back from file, can now read file, made minor updates to units. Finished up most of the args for `fetch` removed mod which created redundant references such as `units::units::Units` it now just uses `units::Units`
Add code scanning GitHub action
no longer scans reference file `wordlistctl.py`
No longer uses builder format, instead uses derive from `clap` with multiple subcommands. Does compile, syntactically correct, however tests not yet written to test functions. Unit tests need to be implemented. Also need to finish implementing the rest of the arguments before incrementing minor version from 0.10.0-alpha.1.4.0 to 0.11.0-alpha.1.0.0 Haven't been consistent with updating semver each push, will be maintained from now on.
Merge dev into master to mark progress
|
NOTE: in case it wasn't clear in the original comment, don't merge this yet it is a breaking change and the tool will not work until everything is fully implemented, this is merely to track the progress in reference to the main repository, there is much to be done and there will be likely be breaking changes. We are still only on version |
There just so much crap that I did but still needs to be done. I fixed and simplified a lot of stuff in `repo.rs` but theres still more to be done. `repo.toml` file modified via regex find of `/^(size = )"(\d*.\d*)"/g` and replace regex of `/$1$2/` Get back to me in a few days after I've slept
|
A low of work has gone into this and we're almost there, we have to review the Like I stated in my commit I'm going to do a proper code review because it is late and I may just be a bit crazy right now! |
Started movement from Docker and local environments to nix so that development environments are reproducible as well as build environments. Cargo.lock was also added back so that crate versions can be locked.
|
As of the most recent commit (Sable-20@7520352) my rewrite has begun it's migration away from Docker towards Nix for reproducible dev environments as well as build environments |
Backtracking, moving away from derive feature so that mangen and complete can easily be generated, god knows Derive makes it a pain in the ass to work with but it just does, and unless I want a 600 line command file I have to switch to buider
I can't do anymore professional commit messages tonight
replaced completions with better ones and added a man page!
Did some refactoring and got stuff working, have to solve the issue of where to put files if you don't have root permissions, probably just warn user on where they will go and tell them to use the intended directory of `/usr/share/wordlists/` you have to be root
makes things look prettier and messed around with some functions again
|
It essentially works I'd like to ask for a review and for feedback on the idea from a maintainer, sorry for the highlight but @dualfade you're the only maintainer I happen to have noticed from the matrix |
|
Hey @Sable-20 I will try to have a look at it. This will take a bit of time, as it is quite a big one, but what I have seen so far looks very good. |
I am aware it's a lot of code, I tried to rewrite in a more modular way that allows for pulling and changing, while it is essentially working there will likely in the future (hopefully near) be minor refactoring to allow separation of modules in a more efficient and organized fashion than currently. After a few more changes I will update the version to A contributing file and clippy linters will also be implemented, again, fairly minor changes that should require little to no review. Thank you for taking the time to read my code, I apologize for how it's a tad bit messy. |
pidone
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi. First of all, thanks again for your PR and the rewrite in Rust and sorry that it took a while to review. It's pretty cool that the shell completions and manpage are generated automatically. Also that you directly considered docker and nixOS. I'm not really familiar with nixOS so I can't really comment on that part.
I have noted everything that I have noticed. But I would see the following things as essential:
- we should not rename the tool, it should rather function as a drop-in replacment
- The optional parameters are not really optional, because they are directly unwrapped.
- The file structure of the wordlists directory is different in the implementation than in the python version. This could lead to all possible wordlists having to be downloaded again.
- The config is currently being compiled in. This should be loaded at runtime.
- The cli interface should differ as little as possible from the python version.
| # sed command | ||
| # -i to change it in the file | ||
| # s for substitution | ||
| # \(.*\) detects the version and saves it to capture group 1 | ||
| # features = \[\(.*\)\] detects features and saves to capture group 2 | ||
| # replacement just saves the version, disables default features, and adds in features http2, and rustls-tls so that this can be performed in docker | ||
| RUN sed -i 's/reqwest = { version = "\(.*\)", features = \[\(.*\)\] }/reqwest = { version = "\1", default-features = false, features = \["http2", "rustls-tls", \2\] }/' Cargo.toml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice workaround. But you can simple add openssl and ca-certificates and it should work without the native rusttls
| [[wordlists]] | ||
| [wordlists.deepmagic] | ||
| name = "deepmagic" | ||
| url = "https://raw.githubusercontent.com/danielmiessler/SecLists/master/Discovery/DNS/deepmagic.com-prefixes-top50000.txt" | ||
| size = 605.75 | ||
| unit = "kb" | ||
| group = "discovery" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why you choice a list of an large table here? (in rust words a vec of a hashmap).
In my opinion simply a list of tables would be better:
[[wordlists]]
name = "..."
...
[[wordlists]]
name = "..."and so on.
This is a little bit more inefficient if you only search for a single list by name, but it won't matter here and the data format would be a bit nicer in my opinion.
| } | ||
|
|
||
| pub fn load_config() -> Result<Config> { | ||
| let config = toml::from_str::<Config>(include_str!("../config/config.toml")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This includes the config at compile-time. This means that if you change the config after compiling, it would have no effect, since it has already been baked into the binary. In my opinion, this misses the point of a config. Here the config should be loaded at runtime. Preferably from XDG_CONFIG_HOME. There is also the dirs crate that selects the correct config for you depending on the platform.
| assert!( | ||
| !std::path::Path::new(&format!("/{}/{}", base_dir, list.get_name())) | ||
| .try_exists() | ||
| .expect("File already exists") | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes it necessary to specify base-dir as an absolute path. Furthermore, the entire download aborts because the error is not handled in main.
| assert!( | |
| !std::path::Path::new(&format!("/{}/{}", base_dir, list.get_name())) | |
| .try_exists() | |
| .expect("File already exists") | |
| ); | |
| let file = Path::new(base_dir).join(list.get_name()); | |
| if file.exists() { | |
| info!("{} already exists -- skipping", list.get_name()); | |
| return Ok(()); | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the python version created subdirectories for the groups. We should keep the file structure as it was.
| for range in PartialRangeIter::new(0, (length - 1).try_into().unwrap(), chunk_size as usize)? { | ||
| trace!("Downloading range: {:?}", range); | ||
| let res = client | ||
| .get(list.get_url()) | ||
| .header(RANGE, range) | ||
| .send() | ||
| .await?; | ||
|
|
||
| let status = res.status(); | ||
| if !(status == StatusCode::OK || status == StatusCode::PARTIAL_CONTENT) { | ||
| return Err(eyre!( | ||
| "Failed to download file-- server respnse: {}", | ||
| status | ||
| )); | ||
| } | ||
| let bytes = res.bytes().await?; | ||
| std::io::copy(&mut bytes.as_ref(), &mut output_file)?; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice that you even split the download of a file so that it could be distributed to the workers. However, you have to be careful with the order. That's why I would only let the workers download one whole file at a time. And distribute the individual wordlists to the workers.
| "/usr/share/wordlists", | ||
| "rwordlistctl/0.1.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should also use base_dir and user_agent here.
| "/usr/share/wordlists", | ||
| "rwordlistctl/0.1.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
| pub fn get_size(&self) -> f64 { | ||
| self.size | ||
| } | ||
|
|
||
| pub fn get_unit(&self) -> &str { | ||
| &self.unit | ||
| } | ||
|
|
||
| pub fn get_url(&self) -> &str { | ||
| &self.url | ||
| } | ||
|
|
||
| pub fn get_group(&self) -> &str { | ||
| &self.group | ||
| } | ||
|
|
||
| pub fn get_name(&self) -> &str { | ||
| &self.name | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason why you are defining getter here and not making the fields public?
| } | ||
|
|
||
| fn load_repo() -> Result<Repo> { | ||
| let repo = toml::from_str::<Repo>(include_str!("../config/repo.toml")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same applies here as for the config. The repos are read in once during compilation and are then fixed in the binary. I don't see it as critical with the repo and could also understand the argument that we can only extend the repos using PRs. But we should be aware of this.
| (size, unit) | ||
| } | ||
|
|
||
| pub async fn convert_size(size: u64, unit: Units) -> u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Must this function be async?
Complete rewrite of
wordlistctlThis is a completely rust based rewrite of
wordlistctl.Original authors left in
Cargo.tomlas "blackarch" with me (the main contributor) written in separately.The goal of this rewrite is to improve the speed of
wordlistctland implement any missing features or fix existing issues.This will also improve code readability through Rust's strict type system along with strict formatting rules enforced by
rustfmtand the linters being run. Code security is also paramount and has a workflow specifically for managing code security.Manpage generation will also because automatic so that it can be automated and consistent.
Roadmap:
clap_mangen) generation into artifact generation OR pre-commit hooksrayonand itspar_iterstyle, considering usingTokioinstead ofrayonImplement custom error typesREMOVED FROM GOALSclapsubproject (probablyclap_complete)