-
-
Notifications
You must be signed in to change notification settings - Fork 67
Introduce a new zonefile module
#491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
partim
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks all great, no real notes.
You might want to try your hand on the DNSSEC record types, though, to see how smooth scanning more complicated tokens (like Base64) is.
|
|
||
| /// An error in scanning a zonefile record. | ||
| #[derive(Clone, Debug, PartialEq, Eq)] | ||
| pub enum RecordError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to leave the error types as (open) enums or rather wrap them away into an opaque type? I don’t think a user would want to match on the variants here?
|
Migrated some DNSSEC related commits to |
|
While testing this PR I noticed that on parsing failure the reported error gives the line number of the problematic record but says things like "could not parse the owner name: irregular character in domain name" without saying which character was considered irregular, or "could not parse the record data: unrecognized record type" without saying which record type. I appreciate that the reason for this is to be able to use For the irregular character case one cannot know which character is the problem without a trial and error / divide and conquer approach to change the record and reload, change the record and reload, which could be very frustrating. For the unrecognized record type the line number should be enough to make this clear, but only if parsing a file as input. If the input were XFR and not written out prior to parsing the reader of the error would have no idea which record caused the problem. |
src/new_base/name/reversed.rs
Outdated
| break; | ||
| } | ||
|
|
||
| if !c.is_ascii_alphanumeric() && !b"\\-_\"".contains(&c) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test fails for a record containing a wildcard such as:
*.w.example. 3600 IN MX 1 ai.example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I forgot to account for wildcards. Will fix.
|
@ximon18: I agree that the user experience here isn't the best. But the parsing system has a lot of inherent limitations with regards to error reporting. It can only report one error at a time, so a zonefile with multiple errors would still require many invocations to get working. It also provides very limited context: most parsing functions can't tell which part of a record they're being used in (e.g. Instead of making the parsing system more complicated, I think users are better served if we can show them the specification the parser is following. I've documented the zonefile spec (that I'm using) in this PR; perhaps we can print it or link to it from user-facing programs that perform zonefile parsing. For example, With regards to the correctness of the specification (i.e. that the actual implementation follows the documented specification correctly): perhaps we can use fuzzing here. It seems that grammar-based fuzzers do exist, and we could set that up. I'd like to do that in a separate PR, though -- I suspect it'll get pretty complicated. |
I'm concerned that referring operators to a spec document somewhere when there's an error due to something in the received zonefile bytes asking them to somehow work it out is going to be usable. In the previous parser implementation I added the missing context by passing out a String instead of &'static str, as the parser knew which value it just failed to parse. Is that not possible here? |
|
I think I misunderstood the use case you're thinking about. I was considering this API from the perspective of a user manually executing a program that is loading a zonefile (which is the perspective of us as developers trying to test out the zonefile functionality). What information would we need to provide DNS operators (who are presumably working with a large number of zones, and have an automated system that is loading zonefiles) when an error occurs? What would their immediate response to such a failure be? To answer the concrete question: it is possible to return a |
The use case is that a running server attempts to read a zone file from disk or records from an incoming XFR and for some reason is unable to parse the zone. The operator then needs to be able to know which record could not be parsed, which may require more than a line number and a generic error message not specific to the record in question. Am I missing something here that makes this wish seem strange? |
|
I think it makes sense. A parser should give at least the string value that could not be parsed, but ideally also the context surrounding that string. |
|
My concern is, DNS operators are probably not writing zonefiles by hand; if they're generating it from an automated system, and the output is malformed (at least according to this zonefile parser implementation), they can't just modify the zonefile by hand in response to the error message (although perhaps that is helpful for fixing the immediate issue). They will need to update their automated zonefile generation to avoid making the same mistake in the future. From that perspective, I'd imagine that having a formal specification of the acceptable grammar is far more useful than a thorough description of a single parse error. |
Indeed I don't expect operators to write zone files by hand. They will need to report the error to someone to get it solved in our software. We will then ask them "which record caused the error" and they will not know and may not wish to give out their raw data if they even have it. |
The parse error still reports the line number, so it's possible to find the failing record. Maybe this is harder if the zonefile is ephemeral, but I'd still expect it to be found on disk.
Suppose an operator does encounter a zonefile parsing issue, and they are unable to share the original record (or the specific component of it that appeared to fail) because it is sensitive data. If we were to include that information in the error message, doesn't that just make the error message sensitive data too? Still, I understand that having a bit more context in parse errors would be helpful. I'll update For providing additional context: calculating the character offset of a parse failure is surprisingly difficult because of parentheses. Parentheses are handled outside the |
|
|
||
| impl<R: io::BufRead> ZonefileScanner<R> { | ||
| /// Scan the next entry and return a reference to it. | ||
| pub fn scan(&mut self) -> Result<Option<Entry<'_>>, ZonefileError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work for me unless I move the alloc: bumpalo::Bump field to be a alloc: &bumpalo::Bump argument passed in to this function. I think it's because the &mut self causes the passed &self.alloc to retain a reference to &mut self via the returned Entry causing error scanner was mutably borrowed here in the previous iteration of the loop. It's fine if I don't use Entry, but once I try to store the record referenced by the returned Entry in the loop over scanner.scan() I hit that lifetime error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an intentional part of the design -- you can only access one Entry at a time, so that each Entry can use the same block of memory, and thus the total memory overhead is O(1). If you need to retain those entries, you would write them into a zone tree or similar structure (which I am yet to define), or just clone them into an external bump allocator. I need to make the latter case more convenient right now, but it can be done manually using RecordData::clone_to_bump() etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a simple zone I am storing the references into, just a BTreeMap. I can't write them into it because of the problem above, and I thought the idea was zero copy so I didn't try cloning them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, how do I ensure that the bump arena lives long enough, I have to make sure the scanner is not dropped (as it owns the bump arena), or clone entries...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll have to show me how to do this, after several different attempts also using an external bump allocator with clone_to_bump() I'm not getting anywhere, I still have the scanner was mutably borrowed here in the previous iteration of the loop error.
This is a manual squash of #491 for internal testing.
This is a manual squash of #491 for internal testing.
This is a manual squash of #491 for internal testing.
Record data usually needs 'Name' (e.g. for canonical ordering), while we continue using 'RevName' for record owner names.
This is a breaking change to 'domain::new', switching from 'RevName' to the more suitable 'Name' for representing domain names within record data.
- Also enhance 'FromStr' impls to parse more of the zonefile format, for parity with the existing 'Display' impls - Also impl conversions between 'NameBuf' and 'RevNameBuf'
Improves the 'FromStr' impls for the name types.
This is a work-in-progress PR for overhauling
domain::zonefilewith a brand-new, out-of-place zonefile parser. It integrates a decent number of unit tests, but is currently lacking many record types (becausenew-basedoesn't have them yet).new_zonefilefor parsing.new-base).The new zonefile parser has a few basic differences from the existing implementation:
The new zonefile parser has been designed to support parallelized zonefile parsing in the future. I am leaving this to a later PR.