Update fixes #2
Conversation
Doc comments will also be added to generated code.
…type! and choice_type!.
Walkthrough
This update introduces a comprehensive enhancement across the board for a JSON parser.
Changes
Review Status
Actionable comments generated: 12
Configuration used: CodeRabbit UI
Files selected for processing (21)
- derive/tests/json.pest (1 hunks)
- derive/tests/json.rs (1 hunks)
- generator/src/common.rs (1 hunks)
- generator/src/typed/attr.rs (1 hunks)
- generator/src/typed/config.rs (1 hunks)
- generator/src/typed/generator.rs (8 hunks)
- generator/tests/expected.rs (28 hunks)
- generator/tests/syntax.pest (1 hunks)
- meta/src/doc.rs (1 hunks)
- meta/src/grammar.pest (2 hunks)
- meta/src/parser.rs (8 hunks)
- meta/tests/pest3sample.pest (2 hunks)
- pest/src/choice.rs (1 hunks)
- pest/src/formatter.rs (1 hunks)
- pest/src/lib.rs (2 hunks)
- pest/src/sequence.rs (3 hunks)
- pest/src/typed/mod.rs (1 hunks)
- pest/src/typed/template/mod.rs (3 hunks)
- pest/src/typed/template/repetition.rs (1 hunks)
- pest/src/typed/traits.rs (6 hunks)
- pest/src/typed/wrapper.rs (1 hunks)
Additional comments: 35
meta/src/doc.rs (1)
- 5-6: The change from `line_docs` to `grammar_doc` in the `DocComment` struct shifts the documentation focus to the entire grammar file. This approach can provide a more structured overview, but consider the impact on detailed documentation for individual rules.

pest/src/typed/mod.rs (1)
- 1-1: The added documentation comment provides a clear overview of the module's purpose, enhancing readability and maintainability.
generator/src/typed/config.rs (1)
- 6-6: The addition of the `box_all_rules` field introduces a new configuration option, enhancing flexibility in rule handling. Consider documenting its purpose and impact for clarity.

pest/src/lib.rs (1)
- 20-20: The addition of the `std` module re-exporting items from the standard library could simplify usage, but verify the necessity and impact of this approach to avoid unnecessary dependencies or confusion.

pest/src/typed/wrapper.rs (2)
- 6-11: The addition of the `get_const` method to the `String` trait enhances usability by providing a convenient way to retrieve string constants.
- 16-18: The enhancements to the `Rule` trait, including associated types and constants, improve its expressiveness and utility. Consider adding tests or examples to demonstrate these features' usage and benefits.

generator/tests/syntax.pest (1)
- 1-1: The addition of the compatibility comment likely aids in understanding the test's purpose and context, enhancing clarity.
generator/src/typed/attr.rs (1)
- 24-27: Handling the "box_all_rules" attribute enhances configurability. Consider a more user-friendly approach for unrecognized attributes, such as detailed error messages, instead of panicking.
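As a sketch of that suggestion (names and structure are illustrative, not the generator's actual API), attribute handling could return a `Result` carrying a descriptive message instead of panicking:

```rust
/// Illustrative stand-in for the generator's configuration; only the
/// `box_all_rules` flag discussed in this PR is modeled here.
#[derive(Debug, Default)]
pub struct Config {
    pub box_all_rules: bool,
}

/// Apply one attribute by name, reporting unrecognized attributes with a
/// descriptive error value instead of panicking.
pub fn apply_attr(config: &mut Config, name: &str) -> Result<(), String> {
    match name {
        "box_all_rules" => {
            config.box_all_rules = true;
            Ok(())
        }
        other => Err(format!(
            "unrecognized attribute `{other}`; expected one of: `box_all_rules`"
        )),
    }
}

fn main() {
    let mut config = Config::default();
    apply_attr(&mut config, "box_all_rules").unwrap();
    assert!(config.box_all_rules);
    // An unknown attribute now yields an error value rather than a panic.
    let err = apply_attr(&mut config, "unknown").unwrap_err();
    println!("{err}");
}
```

The caller (here, the derive entry point) can then turn the error into a compile error with a proper span, which gives users far better diagnostics than a panic inside the proc macro.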
generator/src/common.rs (1)
- 45-55: Simplifying the handling of documentation comments in the `generate_rule_enum` function likely improves readability and maintainability. Verify that the simplification maintains the documentation's quality and detail.

derive/tests/json.rs (1)
- 1-57: The tests for the JSON parser cover basic scenarios and use assertions to validate outcomes, enhancing reliability. Consider adding more diverse test cases to cover edge cases and complex JSON structures.
meta/tests/pest3sample.pest (1)
- 83-89: The documentation updates for the rules `~` and `^` provide clear explanations of their usage differences, enhancing understanding for users.

meta/src/grammar.pest (1)
- 8-20: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [11-113]
Supporting `///` and `//!` comments for documenting grammar rules and the grammar file enhances documentation practices. Ensure compatibility with existing grammar files and documentation practices.

pest/src/typed/template/repetition.rs (1)
- 40-40: Renaming `parse_with` to `parse_with_partial` might clarify the function's purpose. Verify its impact on the codebase and ensure consistency in naming conventions.

pest/src/choice.rs (1)
- 1-25: > 📝 NOTE
This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [38-76]
The implementations of the `TypedNode` and `PairContainer` traits for the generated choice types are correctly done, ensuring that these types can be seamlessly integrated into the parsing system. The `TypedNode` implementation provides a method for partial parsing, which is essential for the flexibility of the parsing process. The `PairContainer` implementation ensures that each choice type can correctly handle its child pairs, which is crucial for the traversal and manipulation of the parsed data. These implementations are well-designed and adhere to the expected contracts of the traits they implement.

generator/src/typed/generator.rs (5)
- 47-56: The introduction of the `boxed` field in the `RuleInfo` struct and its determination logic within the `new` constructor is a significant change. This logic decides whether a rule should be boxed based on the `box_all_rules` configuration flag or the rule's reachability. This is a clever way to manage memory and performance by conditionally boxing rules, potentially reducing stack usage for deeply nested or recursive rules. However, ensure that all downstream code that interacts with `RuleInfo` instances properly handles the `boxed` attribute to avoid logic errors or performance regressions.
- 183-187: The conditional boxing of rule content in the `create_rule` function based on the `boxed` flag from `RuleInfo` is a critical update. This change allows for more flexible memory management, potentially improving performance for large or complex grammars by boxing only certain rule contents. Verify that this change does not introduce unintended side effects, such as increased heap allocations for grammars where boxing is not necessary or beneficial; testing with grammars of various sizes and complexities would be advisable.
- 403-409: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [406-423]
The addition of a `reachability` parameter to the `process_rule` function, used within the function to create a `RuleInfo` instance with reachability analysis, is a noteworthy change. This enhancement allows for more sophisticated rule processing based on the reachability of rules within the grammar, potentially optimizing the generated parser code. Ensure that the reachability analysis is accurate and that the resulting parser code remains correct while benefiting from any optimizations introduced by this change.
- 436-442: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [439-448]
The `process_rules` function now performs reachability analysis by calling `collect_reachability` before processing individual rules. This pre-processing step is essential for enabling optimizations based on rule reachability within the grammar. Verify that the analysis is performed efficiently and does not introduce significant overhead to code generation, especially for large grammars with many rules.
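The idea behind this reachability-based boxing can be sketched independently of the generator's actual types (all names below are illustrative): compute the set of rules reachable from each rule by following references, and box any rule that can reach itself, since a directly or mutually recursive rule would otherwise produce an infinitely sized struct.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// For each rule, collect the set of rules reachable from it by
/// repeatedly following rule references (an iterative DFS).
fn collect_reachability(
    rules: &BTreeMap<&'static str, Vec<&'static str>>,
) -> BTreeMap<&'static str, BTreeSet<&'static str>> {
    let mut reach = BTreeMap::new();
    for (&name, refs) in rules {
        let mut seen = BTreeSet::new();
        let mut stack: Vec<&'static str> = refs.clone();
        while let Some(next) = stack.pop() {
            if seen.insert(next) {
                if let Some(more) = rules.get(next) {
                    stack.extend(more.iter().copied());
                }
            }
        }
        reach.insert(name, seen);
    }
    reach
}

/// A rule is boxed if `box_all_rules` is set, or if it can reach itself:
/// a rule in a reference cycle would otherwise have infinite size.
fn should_box(
    name: &str,
    reach: &BTreeMap<&'static str, BTreeSet<&'static str>>,
    box_all_rules: bool,
) -> bool {
    box_all_rules || reach.get(name).map_or(false, |s| s.contains(name))
}

fn main() {
    // `value` -> `object` -> `pair` -> `value` forms a cycle; `string` does not.
    let mut rules = BTreeMap::new();
    rules.insert("value", vec!["object", "string"]);
    rules.insert("object", vec!["pair"]);
    rules.insert("pair", vec!["string", "value"]);
    rules.insert("string", vec![]);
    let reach = collect_reachability(&rules);
    assert!(should_box("value", &reach, false));
    assert!(!should_box("string", &reach, false));
    assert!(should_box("string", &reach, true)); // box_all_rules overrides
    println!("value boxed: {}", should_box("value", &reach, false));
}
```

This is the same structural argument the review makes: only rules on a cycle need indirection, so boxing can be limited to those (plus everything, when `box_all_rules` is set).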
- 520-534: The `collect_reachability` function introduces a mechanism for analyzing rule reachability within the grammar, which is used to determine whether certain rules should be boxed. This function plays a crucial role in the generated parser code by avoiding infinitely sized structs and managing memory more effectively. Ensure that it accurately captures all reachability relationships between rules and that the resulting reachability graph is used correctly throughout the code generation process.

pest/src/typed/template/mod.rs (2)
- 403-403: The addition of the `Default` derive to the `Empty` struct is a good practice for types that can be instantiated with default values, especially for types representing "empty" or "no-op" behavior. This makes it easier to use the `Empty` struct in generic contexts where a default instance might be required.
- 415-422: The implementation of the `NeverFailedTypedNode` trait for the `Empty` struct with the `parse_with_partial` method is logically sound. This method essentially returns the input position unchanged along with a new instance of `Empty`, which aligns with the expected behavior of a type that represents an "always successful" parse operation. This addition enhances the expressiveness of the parsing framework by providing a built-in way to represent operations that cannot fail.

generator/tests/expected.rs (6)
- 6-6: The documentation comment for the `Regular` enum variant is clear and concise, providing a straightforward description of its purpose.
- 8-8: The documentation comment for the `Atomic` enum variant is similarly clear and concise. It's good practice to keep documentation comments straightforward, aiding in maintainability and readability.
- 10-10: The documentation comment for the `NonAtomic` enum variant follows the established pattern of clarity and conciseness. Consistency in documentation style across the codebase is beneficial for overall project maintainability.
- 119-122: The addition of documentation attributes for the `content` and `span` fields in the `r#Regular` struct enhances the code's readability and maintainability by clearly describing the purpose and use of these fields. This is a good practice, especially in public or widely used structs.
- 155-158: The documentation for the `content` and `span` fields in the `r#Atomic` struct follows the same positive pattern as seen with the `r#Regular` struct. Consistent and clear documentation like this is crucial for future code maintenance and understanding.
- 191-194: The documentation for the `content` and `span` fields in the `r#NonAtomic` struct is clear and maintains consistency with the documentation style used throughout the file. This consistency in documentation style is commendable and should be maintained.

meta/src/parser.rs (8)
- 118-118: The addition of the `doc` field to the `ParseRule` struct is a positive change for supporting documentation comments. This aligns well with the objective of enhancing parsing capabilities and improving documentation practices.
- 334-334: The introduction of the `parse_grammar_doc` function to parse grammar documentation comments is a valuable addition. It enhances the parser's capability to handle documentation comments, which is crucial for generating more informative and readable code.
- 373-384: The modification in the `parse_rule` function to handle parsing of documentation comments is well-implemented. It ensures that documentation comments are correctly associated with their respective rules, improving the maintainability and readability of the parsed rules.
- 478-489: The implementation of the `parse_grammar_doc` function is concise and correctly extracts the content of grammar documentation comments. However, it's important to ensure that the comment syntax (`//!`) is consistently used and recognized throughout the project's documentation.
- 491-502: Similarly, the `parse_rule_doc` function effectively extracts rule documentation comments. This addition is crucial for maintaining well-documented parsing rules, which can significantly aid in understanding and maintaining the parser's logic.
- 850-856: The test `grammar_test` validates the parsing of grammar and rule documentation comments, which is essential for ensuring the new functionality works as expected. It's good practice to include such tests when introducing new parsing capabilities.
- 1472-1498: The tests for parsing grammar documentation comments are thorough and cover various scenarios, including handling multiple lines of comments. These tests are crucial for ensuring the robustness of the documentation parsing functionality.
- 1506-1524: The tests for parsing rule documentation comments are also comprehensive and ensure that the parsing functionality works as intended. It's important to maintain such tests to catch any potential issues early in the development process.
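The distinction these functions draw can be modeled in a few lines (a simplified sketch: the real parser works on the parsed grammar, not raw lines, and `split_docs` is an illustrative name): `//!` lines document the grammar file as a whole, while `///` lines document the rule that follows them.

```rust
/// Separate `//!` grammar-level doc comments from `///` rule-level doc
/// comments by scanning a grammar source line by line.
fn split_docs(grammar: &str) -> (Vec<String>, Vec<String>) {
    let mut grammar_doc = Vec::new();
    let mut rule_doc = Vec::new();
    for line in grammar.lines() {
        let line = line.trim_start();
        // Check `//!` before `///`; plain `//` comments match neither.
        if let Some(rest) = line.strip_prefix("//!") {
            grammar_doc.push(rest.trim().to_string());
        } else if let Some(rest) = line.strip_prefix("///") {
            rule_doc.push(rest.trim().to_string());
        }
    }
    (grammar_doc, rule_doc)
}

fn main() {
    let src = "//! JSON grammar.\n/// Matches a JSON value.\nvalue = { \"null\" }\n";
    let (grammar_doc, rule_doc) = split_docs(src);
    assert_eq!(grammar_doc, vec!["JSON grammar."]);
    assert_eq!(rule_doc, vec!["Matches a JSON value."]);
    println!("ok");
}
```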
@@ -112,8 +112,11 @@ impl<'i> Partition<'i> {

/// Formatter options for [Span](crate::Span).
pub struct FormatOption<SpanFormatter, MarkerFormatter, NumberFormatter> {
    /// Formatter for characters in span.
The comment for `span_formatter` is clear and concise, effectively explaining its purpose. However, it could be enhanced by specifying the expected behavior or format of the formatter for better clarity.
Consider expanding the comment to include examples or expected outcomes of the formatter's application.
pub span_formatter: SpanFormatter,
    /// Formatter for position marker.
The comment for `marker_formatter` is straightforward but lacks detail on what constitutes a "position marker."
It would be beneficial to provide examples or further explanation of what markers are used for, enhancing understanding for future contributors.
pub marker_formatter: MarkerFormatter,
    /// Formatter for line numbers and column numbers of start and end.
The comment for `number_formatter` is succinct but could be improved by detailing the types of numbers it formats (e.g., line and column numbers) and their context within the formatter's usage.
Adding context about where and how the numbers are used within the formatter could provide clearer guidance for developers.
#[doc = ::core::stringify!(A choice for $type.)]
$variant($type)
The documentation string generated for each variant uses `::core::stringify!` to include the type name in the documentation. This is a clever use of Rust's macro system to generate informative documentation dynamically. However, it might be more readable and informative to include both the variant name and the type in the documentation string to clarify the role of each variant in the generated choice type.
- #[doc = ::core::stringify!(A choice for $type.)]
+ #[doc = concat!("`", stringify!($variant), "`: A choice of type `", stringify!($type), "`.")]
pest/src/choice.rs
//! A [macro](crate::choice_type) to generate choice type
//! and generated choice types that has 2 to 15 variants,
//! which are named `Choice{i}`.

// TODO: peek to decide matched choice when possible.
/// Generate choice type.
///
/// ## Example
///
/// ```rust
/// pest::choice_type!(Choice2, (Choice0, T0), (Choice1, T1));
/// ```
#[macro_export]
macro_rules! choice_type {
    ( $name:ident, $( ( $variant:ident, $type:ident ) ),* $(,)? ) => {
        /// Choice type generated by `pest::choice_type!`.
        #[derive(Clone, Debug, Eq, Hash, PartialEq)]
        pub enum $name<$( $type ),*> {
            $( $variant($type) ),*
            $(
                #[doc = ::core::stringify!(A choice for $type.)]
                $variant($type)
            ),*
        }
        impl<'i, R, $($type, )*>
            $crate::typed::TypedNode<'i, R> for $name<$($type, )*>
📝 NOTE
This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [38-63]
The parsing logic within `try_parse_with_partial` is well-implemented, using snapshots to efficiently backtrack on failed parsing attempts. This approach allows for a clean and efficient way to try each choice in sequence until one succeeds or all fail. The use of `::core::option::Option::Some` and `::core::option::Option::None` is consistent and clear, ensuring that the function's return type is correctly handled. The logic here is solid, but adding comments to explain the parsing strategy and the role of snapshots could improve maintainability and readability for future contributors.
+ // Attempt to parse each choice in sequence, using stack snapshots for efficient backtracking.
$(
stack.snapshot();
match $type::try_parse_with_partial(input, stack, tracker) {
::core::option::Option::Some((input, choice)) => {
stack.clear_snapshot();
let res = Self::$variant(choice);
return ::core::option::Option::Some((input, res))
}
::core::option::Option::None => {
stack.restore();
}
}
)*
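Stripped of the macro machinery and pest's stack and tracker types, the ordered-choice strategy above reduces to the following minimal sketch (all names are illustrative; the unchanged position plays the role of the `snapshot`/`restore` pair):

```rust
/// A toy parser: given input and a position, either succeed with the new
/// position and a value, or fail with `None`.
type Parser = Box<dyn Fn(&str, usize) -> Option<(usize, String)>>;

/// Build a parser that matches an exact literal.
fn literal(lit: &'static str) -> Parser {
    Box::new(move |input, pos| {
        input[pos..]
            .starts_with(lit)
            .then(|| (pos + lit.len(), lit.to_string()))
    })
}

/// Try each alternative in order; on failure the position is simply not
/// advanced, which mirrors `stack.restore()` in the macro above. The first
/// alternative to succeed wins, so earlier variants shadow later ones.
fn choice(alts: &[Parser], input: &str, pos: usize) -> Option<(usize, String)> {
    for alt in alts {
        if let Some(ok) = alt(input, pos) {
            return Some(ok);
        }
    }
    None
}

fn main() {
    let boolean = vec![literal("true"), literal("false")];
    assert_eq!(choice(&boolean, "false]", 0), Some((5, "false".to_string())));
    assert_eq!(choice(&boolean, "null", 0), None);
    println!("ok");
}
```

The ordering matters: because the first success wins, a grammar author must list more specific alternatives before more general ones, which is exactly why the review suggests documenting the strategy in the macro.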
pest/src/typed/traits.rs
/// A pair that can be converted to a pair tree.
pub trait PairTree<R: RuleType>: PairContainer<R> + wrapper::Rule<R> {
    /// Get pair span.
    fn get_span(&self) -> (usize, usize);
    /// Convert `Self` to a pair tree.
The `PairTree` trait extends `PairContainer` with additional functionality specific to pair trees, such as getting the span of a pair and converting it into a pair tree. This trait is well-designed, providing essential operations for working with pair trees in a typed context. The method `as_pair_tree` is particularly noteworthy for its ability to convert a complex structure into a simple, navigable pair tree. Ensuring that the documentation for this trait and its methods is clear and comprehensive will be beneficial for maintainability and usability.
+ /// `PairTree` extends `PairContainer` with methods specific to pair trees, enabling operations like retrieving spans and converting to pair trees.
( $name:ident, $( ( $field:ident, $type:ident, $trivia:ident ) ),* $(,)? ) => {
    /// Sequence type generated by `pest::sequence_type!`.
    ///
    /// - `0`: No trivia.
    /// - `1`: Optional trivia.
    /// - `2`: Mandatory trivia.
    #[derive(Clone, Debug, Eq, Hash, PartialEq)]
    pub struct $name<$( $type, const $trivia: ::core::primitive::u8, )*> {
        $( $variant: $type ),*
        $(
            #[doc = ::core::stringify!(Field for $type.)]
            pub $field: $type,
        )*
    }
    impl<'i, $($type, const $trivia: ::core::primitive::u8, )*>
        $name<$($type, $trivia, )*> {
        /// As a tuple of references.
        pub fn as_tuple(&self) -> ($(&$type, )*) {
            ( $( &self.$field, )* )
        }
        /// Into a tuple of values.
        pub fn into_tuple(self) -> ($($type, )*) {
            ( $( self.$field, )* )
        }
The macro for generating sequence types is well-designed, providing a clear and flexible way to define sequence types with varying numbers of fields and trivia handling. The use of `const $trivia: ::core::primitive::u8` as a way to specify trivia handling for each field is a clever approach that allows for compile-time configuration of parsing behavior. However, it would be beneficial to expand the documentation to explain the trivia handling mechanism in more detail, including how the `0`, `1`, and `2` values affect parsing.
+ /// The `trivia` parameter for each field controls how whitespace and comments are handled:
+ /// - `0`: No trivia is considered.
+ /// - `1`: Optional trivia is allowed.
+ /// - `2`: Mandatory trivia must be present.
if i > 1 {
    input = $crate::typed::template::try_handle_trivia::<R, $trivia>(input, stack, tracker)?;
}
let (next, $variant) = $type::try_parse_with_partial(input, stack, tracker)?;
let (next, $field) = $type::try_parse_with_partial(input, stack, tracker)?;
input = next;
)*
let res = Self {
    $($variant, )*
    $($field, )*
};
::core::option::Option::Some((input, res))
}
📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [57-76]
The implementation of `try_parse_with_partial` within the macro parses each field in sequence, handling trivia according to the `$trivia` parameter specified for each field. This approach is efficient and flexible, allowing for precise control over the parsing process. However, adding comments to explain the role of the `trivia` parameter in controlling the parsing flow, and the importance of the order in which fields are parsed, could improve the readability and maintainability of this code.
+ // Parse each field in sequence, handling trivia according to the specified `trivia` parameter.
+ // The order of fields is important for correctly parsing the sequence.
$(
i += 1;
if i > 1 {
input = $crate::typed::template::try_handle_trivia::<R, $trivia>(input, stack, tracker)?;
}
let (next, $field) = $type::try_parse_with_partial(input, stack, tracker)?;
input = next;
)*
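The three trivia modes can be illustrated with a self-contained sketch. Here whitespace stands in for real trivia (which may also include comments), and `handle_trivia` is an illustrative function, not pest's `try_handle_trivia`; only the `u8` encoding mirrors the macro's const parameter:

```rust
/// Skip trivia (here: leading whitespace) between two sequence fields,
/// according to the mode: 0 = no trivia, 1 = optional trivia,
/// 2 = mandatory trivia. Returns the new position, or `None` when
/// mandatory trivia is missing.
fn handle_trivia(input: &str, pos: usize, mode: u8) -> Option<usize> {
    let rest = &input[pos..];
    let skipped = rest.len() - rest.trim_start().len();
    match mode {
        0 => Some(pos),                          // consume nothing
        1 => Some(pos + skipped),                // consume whatever is there
        2 if skipped > 0 => Some(pos + skipped), // must consume something
        _ => None,                               // mode 2 with no trivia fails
    }
}

fn main() {
    // Between "let" and "x", mandatory trivia (mode 2) requires at least
    // one whitespace character.
    assert_eq!(handle_trivia("let x", 3, 2), Some(4));
    assert_eq!(handle_trivia("letx", 3, 2), None);
    // Optional trivia (mode 1) accepts either form.
    assert_eq!(handle_trivia("letx", 3, 1), Some(3));
    println!("ok");
}
```

Mandatory trivia (mode 2) is what distinguishes, say, two adjacent identifiers that must be separated by whitespace from tokens that may legally touch, which is why it fails rather than silently consuming nothing.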
Sorry, I'll switch to that branch later.
@tomtau I've cherry-picked all the commits. But the module system hasn't been finished yet, so please merge this later.
Review Status
Actionable comments generated: 17
Configuration used: CodeRabbit UI
Files ignored due to path filters (1)
generator/Cargo.toml is excluded by: `!**/*.toml`
Files selected for processing (21)
- derive/tests/grammar.pest (1 hunks)
- derive/tests/sample.rs (1 hunks)
- generator/src/typed/attr.rs (1 hunks)
- generator/src/typed/generator.rs (13 hunks)
- generator/src/typed/module.rs (4 hunks)
- generator/src/typed/output.rs (1 hunks)
- generator/tests/.gitignore (1 hunks)
- generator/tests/expected.rs (30 hunks)
- generator/tests/generator.rs (2 hunks)
- generator/tests/syntax.pest (2 hunks)
- pest/src/choice.rs (2 hunks)
- pest/src/formatter.rs (2 hunks)
- pest/src/lib.rs (2 hunks)
- pest/src/sequence.rs (3 hunks)
- pest/src/typed/mod.rs (1 hunks)
- pest/src/typed/template/mod.rs (27 hunks)
- pest/src/typed/template/repetition.rs (2 hunks)
- pest/src/typed/tracker.rs (9 hunks)
- pest/src/typed/traits.rs (8 hunks)
- pest/src/typed/wrapper.rs (1 hunks)
- pest/tests/predefined_node.rs (4 hunks)
Files not summarized due to errors (1)
- generator/tests/expected.rs: Error: Message exceeds token limit
Files skipped from review due to trivial changes (1)
- generator/tests/.gitignore
Files skipped from review as they are similar to previous changes (7)
- generator/src/typed/attr.rs
- generator/tests/syntax.pest
- pest/src/choice.rs
- pest/src/formatter.rs
- pest/src/typed/mod.rs
- pest/src/typed/template/mod.rs
- pest/src/typed/wrapper.rs
Additional comments: 25
derive/tests/sample.rs (2)
- 10-15: The test function `comment` correctly tests partial and complete parsing of a string with a comment. It's good practice to include comments explaining the purpose of each test case for better maintainability.
- 17-23: The test function `skip_on_two_end` effectively tests parsing behavior with trailing spaces. It's well-structured and follows best practices for testing edge cases.

generator/tests/generator.rs (2)
- 22-57: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [10-37]
The test function `generated` correctly generates code, writes it to a file, formats it using `rustfmt`, and compares it with an expected output. It's a good practice to handle the output of external commands carefully and provide clear error messages, as done here.
- 22-57: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [39-58]
The test function `generated_sample` follows the same pattern as `generated` for testing code generation with a different input. It's well-implemented and follows best practices for testing and error handling.

pest/src/lib.rs (1)
- 30-68: The macro `unicode` for generating unicode property types is correctly defined and implemented. It follows Rust's conventions for macros, including documentation and trait implementations. This macro enhances the library's functionality by providing a structured way to handle unicode properties.

derive/tests/grammar.pest (1)
- 39-49: The updated stack manipulation operations using `pest::stack::push`, `pest::stack::pop`, and related functions are correctly implemented within the grammar definitions. These changes improve clarity and maintainability by using a more structured approach to stack manipulation.

pest/tests/predefined_node.rs (2)
- 16-20: Making the `Rule` enum public enhances the flexibility and usability of the library by allowing external modules to reference these rules directly.
- 56-69: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [31-66]
The addition of the `unicode!(MATH)` declaration and the implementation of the `check_with_partial` method for the `StrFoo` struct are correctly done. These changes expand the functionality and improve the library's capability to handle unicode properties and parsing checks.

pest/src/typed/template/repetition.rs (1)
- 99-132: > 📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [40-129]
The renaming of `parse_with` to `parse_with_partial` and the addition of the `check_with_partial` function are correctly implemented. These changes improve the clarity and consistency of the API, making it more explicit when partial parsing checks are performed.

generator/src/typed/output.rs (1)
- 168-170: Introducing the aliases `soi` and `eoi` for `SOI` and `EOI` improves readability and usability by providing shorter and more intuitive names for these commonly used entities. This change is a good practice for enhancing the developer experience.

pest/src/sequence.rs (2)
- 1-3: The file header provides a clear introduction to the purpose of the macro. It's concise and informative.
- 93-93: The method `for_each_child_pair` in the `PairContainer` trait implementation is a good addition for handling child pairs. It's well-implemented and follows the principle of least surprise.

pest/src/typed/traits.rs (1)
- 20-25: The method `try_parse_with_partial` in the `TypedNode` trait is a significant addition, allowing for partial parsing of input. Ensure that implementations of this method handle edge cases and partial inputs correctly to avoid parsing errors.

generator/src/typed/module.rs (1)
- 18-21: The addition of the `Rule { argc: usize }` variant to the `RuleGenerics` enum introduces a way to specify rules with a specific argument count. This is a positive change for flexibility, but ensure that the handling of `argc` is consistent and well-documented across the codebase.

pest/src/typed/tracker.rs (2)
- 172-174: The refinement of the `record_option_during_with` function signature simplifies the definition and ensures consistency in parameter types. This change enhances readability and maintainability of the code.
- 207-213: The addition of the `record_empty_during` function is a thoughtful enhancement that provides a clear and concise way to handle cases where no result is expected from the provided function. This addition improves the expressiveness of the API.

generator/src/typed/generator.rs (1)
- 179-183: The conditional logic for determining the
content_type
based on theboxed
flag is clear and concise. However, it's essential to ensure that boxing is used judiciously to avoid unnecessary heap allocations, which can impact performance.This change effectively allows for flexibility in handling rule content, enabling optimizations based on the specific needs of the grammar being processed.
generator/tests/expected.rs (8)
- 1-7: The use of `#[allow]` attributes at the beginning of the `Rule` enum is appropriate for suppressing compiler warnings that are expected in this context. This approach helps maintain clarity and focus on the essential parts of the code.
- 11-15: The documentation comments added to specific rules within the `Rule` enum (`Regular`, `Atomic`, `NonAtomic`) enhance the readability and maintainability of the code by clearly describing the purpose and usage of each rule. This is a good practice, especially in a parsing context where understanding the role of each rule is crucial.
- 48-51: The definition of `OptionalTrivia` and `MandatoryTrivia` types in the `trivia` module using `Rep` and `RepOnce` generics is a concise and clear way to express the concept of optional and mandatory repetitions in the parsing context. This approach leverages Rust's type system to enforce parsing rules at the type level, enhancing code safety and readability.
- 126-129: The `r#Regular` struct is well-documented, and its fields are clearly defined, aligning with the grammar specified in the documentation comment. However, it's important to ensure that the use of raw identifiers (e.g., `r#Regular`) for naming is consistent and justified across the codebase to avoid confusion.

Verification successful

The verification process has confirmed the consistent use of raw identifiers for struct names within the `generator/tests/expected.rs` file, as demonstrated by the multiple instances of struct declarations using the `r#` prefix. This consistency suggests that the naming convention is deliberate and aligns with the project's standards or requirements for naming structs in this context.

- 169-179: The `r#Atomic` struct's implementation and documentation clearly articulate its purpose and structure, following the grammar specified. The use of `Sequence3` to compose the rule's structure demonstrates effective use of generics to model complex parsing patterns. This pattern of defining parsing rules should be consistently applied throughout the codebase for maintainability.
- 233-241: The `r#NonAtomic` struct is another example of clear documentation and struct definition aligning with the specified grammar. The use of `Sequence2` here simplifies the representation of a sequence of parsing operations, showcasing the flexibility of the generics used in this parsing framework.
- 291-294: The `r#ExactString` struct is correctly documented and defined. It represents a specific parsing rule for an exact string match. This kind of explicit documentation and struct definition for each parsing rule significantly aids in understanding the parser's behavior and debugging potential issues.
- 1511-1515: The `generics` module re-exports from the `pest` crate, providing a centralized place for generic parsing constructs used throughout the file. This approach enhances modularity and reusability of parsing components. It's crucial to ensure that these re-exports are used consistently across the project to leverage their benefits fully.

Scripts Executed

The following scripts were executed for the analysis:

Script:

```shell
#!/bin/bash
# Verify the consistent use of raw identifiers for struct names across the codebase.
rg 'pub struct r#'
```

Length of output: 1610
```diff
 pub fn call(
     &self,
     name: &str,
     rule_ref: &RuleRef<'_>,
     args: Option<&ProcessedPathArgs>,
     root: &TokenStream,
 ) -> TokenStream {
     let generics = generics();
     let name = format_ident!("r#{}", name);
     match self {
-        Self::Rule => {
-            assert!(
-                rule_ref.args.is_none(),
-                "Unexpected arguments in `{}`.",
-                rule_ref,
-            );
-            let (path, name) = getter();
-            quote! { #root::rules::#( #path:: )* #name :: <'i> }
+        Self::Rule { argc } => match (argc, args) {
+            (0, None) => {
+                quote! { #root::rules::#name::<'i> }
+            }
+            (0, Some(_args)) => {
+                panic!("Unexpected arguments in `{}`.", rule_ref,);
+            }
+            (argc, None) => {
+                panic!("Expect {argc} arguments in `{rule_ref}`.");
+            }
+            (_argc, Some(ProcessedPathArgs::Slice(slice))) => {
+                panic!("Unexpected slice {slice} in `{rule_ref}`.");
+            }
+            (argc, Some(ProcessedPathArgs::Call(args))) => {
+                assert_eq!(
+                    *argc,
+                    args.len(),
+                    "Argument count not matched in `{}`.",
+                    rule_ref
+                );
+                quote! { #root::rules::#name::<'i, #(#args, )*> }
+            }
```
The `call` method has been significantly updated to handle different scenarios based on the rule type. It's crucial to ensure that the panic messages are clear and helpful for debugging. Additionally, the use of `assert_eq!` for argument count validation is appropriate, but consider adding more detailed error messages to assist in troubleshooting.

```diff
- panic!("Unexpected arguments in `{}`.", rule_ref,);
+ panic!("Unexpected arguments in `{rule_ref}`. No arguments expected for this rule type.");
- panic!("Expect {argc} arguments in `{rule_ref}`.");
+ panic!("Expected {argc} arguments in `{rule_ref}`, but none were provided.");
- panic!("Unexpected slice {slice} in `{rule_ref}`.");
+ panic!("Unexpected slice argument {slice} in `{rule_ref}`. Slice arguments are not supported for this rule type.");
- "Argument count not matched in `{}`.",
+ "Expected {argc} arguments, but received {} in `{rule_ref}`.", args.len()
```
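To illustrate the style of error reporting this suggestion is aiming for, here is a small standalone sketch. The function name, signature, and string-based errors are hypothetical (not the generator's actual API): it validates an argument list against an expected count and returns descriptive messages instead of terse panics.

```rust
// Hypothetical helper: validate an argument list against an expected count,
// producing error messages that name the rule and the exact mismatch.
fn check_argc(rule: &str, expected: usize, args: Option<&[&str]>) -> Result<(), String> {
    match (expected, args) {
        // No arguments expected, none given: fine.
        (0, None) => Ok(()),
        (0, Some(_)) => Err(format!(
            "Unexpected arguments in `{rule}`. No arguments expected for this rule type."
        )),
        (n, None) => Err(format!(
            "Expected {n} arguments in `{rule}`, but none were provided."
        )),
        (n, Some(a)) if a.len() != n => Err(format!(
            "Expected {n} arguments, but received {} in `{rule}`.",
            a.len()
        )),
        _ => Ok(()),
    }
}
```

Returning `Result` rather than panicking is a design choice for the sketch; the generator itself may legitimately panic, since a grammar error is unrecoverable at codegen time.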
```diff
 fn new(
     rule: &'g ParseRule,
     config: Config,
     reachability: &BTreeMap<&str, BTreeSet<&str>>,
 ) -> Self {
     let rule_name = rule.name.as_str();
-    let boxed = true;
+    let boxed = match config.box_all_rules {
+        true => true,
+        false => !reachability.contains_key(rule_name),
+    };
```
The constructor for `RuleInfo` now includes logic to determine the `boxed` field based on the `box_all_rules` configuration flag and the presence of the rule name in the `reachability` map. This is a critical change as it affects how rules are processed and generated, potentially impacting the performance and memory usage of the generated parser.

Consider adding documentation comments to explain the significance of the `boxed` field and the conditions under which a rule is boxed. This will improve maintainability and understandability of the code.

```diff
+ /// Determines whether the rule should be boxed based on configuration and reachability.
```
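The boxing decision described here can be sketched as a free function. This is an illustration of the stated logic only; `is_boxed` is a hypothetical helper, not the actual constructor code.

```rust
use std::collections::{BTreeMap, BTreeSet};

// A rule is boxed either because `box_all_rules` is set globally, or because
// the rule is absent from the reachability map (i.e. it was wrapped to break
// a cycle in the reference graph).
fn is_boxed(
    box_all_rules: bool,
    reachability: &BTreeMap<&str, BTreeSet<&str>>,
    rule_name: &str,
) -> bool {
    if box_all_rules {
        true
    } else {
        !reachability.contains_key(rule_name)
    }
}
```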
```rust
#[derive(Clone, Debug)]
enum ModuleNode<'g> {
    // MutableCollection(RefCell<HashMap<&'g str, Rc<ModuleNode<'g>>>>),
    Collection(HashMap<&'g str, Rc<ModuleNode<'g>>>),
    Map {
        bound: HashSet<String>,
        mapper: fn(&'g str) -> TokenStream,
    },
    Generics(RuleGenerics),
    Imported(Rc<Self>),
}

impl<'g> ModuleNode<'g> {
    /// [Err] here is actually not an error.
    fn child(&self, node: &'g str) -> &Self {
        match self {
            Self::Collection(map) => map.get(node).unwrap_or_else(|| {
                panic!(
                    "Node {} can't be resolved. Available nodes are {:?}.",
                    node,
                    map.keys()
                )
            }),
            Self::Map { bound, mapper } => {
                assert!(bound.contains(node), "{node} is out of bound.");
                panic!("No child found.")
            }
            Self::Generics(_generics) => panic!("No child found."),
            Self::Imported(rc) => rc.child(node),
        }
    }
    fn get(
        &self,
        root: &TokenStream,
        node: &'g str,
        rule_ref: &RuleRef<'g>,
        args: &Option<ProcessedPathArgs>,
    ) -> TokenStream {
        match self {
            Self::Collection(map) => panic!("Collection can't be called."),
            Self::Map { bound, mapper } => {
                assert!(args.is_none(), "Unexpected arguments.");
                mapper(node)
            }
            Self::Generics(generics) => generics.call(node, rule_ref, args.as_ref(), root),
            Self::Imported(rc) => rc.get(root, node, rule_ref, args),
        }
    }
}
```
The updates to the `ModuleNode` enum introduce new functionalities like handling collections, maps, and generics. Ensure that the panic messages in the `child` method are informative and guide the user towards resolving the issue. Additionally, the handling of generics in the `get` method appears to be well-implemented, but thorough testing is recommended to ensure that all edge cases are covered.

```diff
- panic!("Node {} can't be resolved. Available nodes are {:?}.", node, map.keys());
+ panic!("Node '{node}' can't be resolved. Available nodes are: {:?}", map.keys().collect::<Vec<_>>());
- panic!("No child found.");
+ panic!("No child found for the specified node. Ensure that the node name is correct and that it has been properly defined.");
```
```diff
 "~" => res.add_option_trivia(inter.typename),
 "^" => res.add_mandatory_trivia(inter.typename),
 _ => {
-    let rule_info = RuleInfo::from(rule);
+    let rule_info = RuleInfo::new(rule, config, reachability);
     res.insert_rule_struct(create_rule(
         config,
         &rule_config,
         &rule_info,
         inter.typename,
```
📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [394-413]

The addition of the `reachability` parameter to the `process_rule` function is a significant change that introduces the concept of reachability analysis into rule processing. This analysis can help optimize the generated parser by identifying which rules need to be boxed to prevent infinite recursion or excessively large structs.

It would be beneficial to add more detailed comments explaining how reachability analysis is performed and its impact on the generated parser. This will aid future maintainability and provide valuable context for developers working with this code.

```diff
+ /// Processes a single rule, taking into account reachability analysis to optimize boxing of rules.
```
```rust
/// Wrap some nodes in [std::boxed::Box] to avoid infinite size struct,
/// which can break the edges in the reference graph,
/// and then collect reachability.
///
/// Rules that are not in map keys are wrapped.
///
/// We won't promise anything on which nodes are boxed.
fn collect_reachability(rules: &[ParseRule]) -> BTreeMap<&str, BTreeSet<&str>> {
    let mut res = BTreeMap::new();
    let rule_trivia = rules.iter().find(|rule| rule.name == "~");
    for rule in rules {
        let entry = res.entry(rule.name.as_str()).or_default();
        collect_used_rule(rule, rule_trivia, entry);
    }
    // Length of any path is no more than `rules.len()`.
    for _ in 0..rules.len() {
        // Before the `i`-th iteration,
        // `res[a]` contains all nodes that can be reached from `a`
        // in no more than `i+1` steps.
```
The `collect_reachability` function is a new addition that performs reachability analysis to determine which rules should be boxed. This function is crucial for optimizing the generated parser and preventing issues with infinite recursion or excessively large structs.

While the function's implementation appears to be correct, adding more comprehensive comments explaining the algorithm and its purpose would greatly enhance the code's readability and maintainability.

```diff
+ /// Performs reachability analysis to identify rules that should be boxed to optimize the generated parser.
+ /// This analysis helps prevent infinite recursion and manage struct sizes by determining the reachability of rules.
```
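The iterative closure described in the hunk's comments can be sketched independently of the generator's types. This is a minimal illustration of the idea (after iteration `i`, each entry holds every rule reachable in at most `i + 1` steps, so `rules.len()` iterations suffice), not the actual `collect_reachability` implementation.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Compute the transitive closure of a direct-reference map by repeatedly
// unioning each node's targets with its targets' targets.
fn transitive_reachability<'a>(
    direct: &BTreeMap<&'a str, BTreeSet<&'a str>>,
) -> BTreeMap<&'a str, BTreeSet<&'a str>> {
    let mut res = direct.clone();
    for _ in 0..direct.len() {
        let snapshot = res.clone();
        for targets in res.values_mut() {
            let mut extra = BTreeSet::new();
            for &t in targets.iter() {
                if let Some(next) = snapshot.get(t) {
                    extra.extend(next.iter().copied());
                }
            }
            targets.extend(extra);
        }
    }
    res
}
```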
```diff
 /// Check whether the some leading part of the result can be accepted.
 fn check_with_partial(
     input: Position<'i>,
     stack: &mut Stack<Span<'i>>,
     tracker: &mut Tracker<'i, R>,
-) -> Option<(Position<'i>, Self)>;
-/// Try parse given string into a node.
+) -> Option<Position<'i>>;
```
The `check_with_partial` method adds the ability to check if a leading part of the input can be parsed without actually performing the parsing. This could be useful for lookahead operations. Documenting specific use cases and examples would enhance understanding.

```diff
+ /// This method checks if a leading part of the input can be parsed according to the node's rules.
+ /// It's useful for lookahead operations and determining if parsing should proceed.
```
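The check/parse distinction can be illustrated with a toy prefix matcher (a hypothetical function, not the pest3 trait method): checking only advances a position and builds no node, which is all a lookahead needs.

```rust
// Return the new position if a non-empty run of ASCII digits starts at `pos`,
// without constructing any node for the matched text.
fn check_digits_partial(input: &str, pos: usize) -> Option<usize> {
    let rest = &input[pos..];
    let len = rest.chars().take_while(|c| c.is_ascii_digit()).count();
    if len > 0 {
        Some(pos + len)
    } else {
        None // no digit prefix: the check fails without consuming input
    }
}
```

A full `try_parse_with_partial` counterpart would return the node alongside the position; dropping the node from the return type is exactly what makes the check variant cheaper.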
```rust
/// Struct for rules with full capacity.
pub trait FullRuleStruct<'i>: wrapper::Rule {
    /// Wrapped inner type.
    type Inner: TypedNode<'i, Self::Rule>;
    /// Wrapped content type.
    type Content: From<Self::Inner>;
    /// Create from span and content.
    fn new(content: Self::Content, span: Span<'i>) -> Self;
}
```
The `FullRuleStruct` trait and its method `new` provide a structured way to create rule structs with full capacity. It's well-defined, but including examples of implementing and using this trait would be beneficial for developers.

````diff
+ /// Implement this trait for rule structs that encapsulate a complete rule with its content.
+ /// Example implementation:
+ /// ```
+ /// impl<'i> FullRuleStruct<'i> for MyRule<'i> {
+ ///     type Inner = MyInnerNode<'i>;
+ ///     type Content = MyContent<'i>;
+ ///     fn new(content: Self::Content, span: Span<'i>) -> Self {
+ ///         MyRule { content, span }
+ ///     }
+ /// }
+ /// ```
````
```rust
/// A container of pairs.
pub trait PairContainer<R> {
    /// for `Self` as a pair if it is, otherwise for each child pair.
    fn for_self_or_for_each_child_pair(&self, f: &mut impl FnMut(Pair<R>)) {
        self.for_each_child_pair(f)
    }
    /// For each child pair.
    fn for_each_child_pair(&self, f: &mut impl FnMut(Pair<R>));
    /// Convert children pairs into a [Vec].
```
The `PairContainer` trait's methods `for_self_or_for_each_child_pair` and `vec_children_pairs` are well-designed for handling pairs and their children. Expanding the documentation to explain the purpose and usage of each method would help future developers.

```diff
+ /// This trait is used to abstract over entities that can contain pairs, allowing for operations on those pairs or their children.
+ /// It provides methods to iterate over child pairs and to collect them into a vector.
```
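A simplified model of this container pattern, with a default method built on the per-child visitor. The types here are illustrative stand-ins, not the real `PairContainer` or pest pairs.

```rust
// A toy tree node standing in for a parsed pair.
struct Node {
    children: Vec<Node>,
    label: u32,
}

trait Container {
    // Visit each child's label with the supplied callback.
    fn for_each_child(&self, f: &mut impl FnMut(u32));

    // Default method: collect children by reusing the visitor,
    // mirroring how `vec_children_pairs` can be built on `for_each_child_pair`.
    fn vec_children(&self) -> Vec<u32> {
        let mut out = Vec::new();
        self.for_each_child(&mut |x| out.push(x));
        out
    }
}

impl Container for Node {
    fn for_each_child(&self, f: &mut impl FnMut(u32)) {
        for c in &self.children {
            f(c.label);
        }
    }
}
```

Keeping the visitor as the single required method means implementers write one traversal and get collection for free.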
```rust
) -> Option<(Position<'i>, T)> {
    T::try_parse_with_partial(input, stack, tracker)
}
/// See [TypedNode::try_parse].
#[inline]
fn try_parse<'i, T: TypedNode<'i, R>>(input: &'i str) -> Result<T, Error<R>> {
    T::try_parse(input)
}
/// See [TypedNode::try_parse_partial].
#[inline]
fn try_parse_partial<'i, T: TypedNode<'i, R>>(
    input: &'i str,
```
📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [232-257]

The `TypedParser` trait introduces static methods for parsing that delegate to the corresponding methods in the `TypedNode` trait. Documenting the relationship between `TypedParser` and `TypedNode` would clarify the design intention and usage patterns.

```diff
+ /// `TypedParser` provides static methods for parsing that are designed to work with any type that implements `TypedNode`.
+ /// This trait serves as a utility for simplifying the invocation of parsing operations.
```
```rust
/// Parse leading part of given input into a typed node.
fn parse_with_partial(input: Position<'i>, stack: &mut Stack<Span<'i>>)
    -> (Position<'i>, Self);
/// Parse leading part of given input into a typed node.
fn parse_partial(input: &'i str) -> (Position<'i>, Self) {
    let input = Position::from_start(input);
    let mut stack = Stack::new();
    Self::parse_with_partial(input, &mut stack)
}
```
The modifications to the `NeverFailedTypedNode` trait, specifically the addition of methods for parsing with partial input, enhance the trait's utility by supporting more flexible parsing scenarios. Adding examples or more detailed documentation on how to implement and use these methods could improve usability.

```diff
+ /// These methods extend `NeverFailedTypedNode` with the ability to parse partial input, offering more flexibility in parsing operations.
+ /// Implementers should ensure that parsing can proceed without failure, adhering to the trait's guarantees.
```
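The never-failing guarantee can be sketched with a toy node that consumes optional leading whitespace (a hypothetical example, not the trait itself): since matching zero characters always succeeds, the partial parse can return `(position, node)` directly with no `Result`.

```rust
// A toy node recording how many whitespace bytes were consumed.
struct Whitespace(usize);

// Infallible partial parse: always returns a new position and a node,
// even when zero characters match.
fn parse_whitespace_partial(input: &str, pos: usize) -> (usize, Whitespace) {
    let n = input.as_bytes()[pos..]
        .iter()
        .take_while(|b| b.is_ascii_whitespace())
        .count();
    (pos + n, Whitespace(n))
}
```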
Review Status

Actionable comments generated: 2

Configuration used: CodeRabbit UI

Files selected for processing (2)
- pest/src/typed/template/repetition.rs (3 hunks)
- pest/src/typed/traits.rs (7 hunks)

Files skipped from review as they are similar to previous changes (1)
- pest/src/typed/traits.rs

Additional comments: 1

pest/src/typed/template/repetition.rs (1)
- 40-43: The function `parse_with_partial` has been correctly renamed and its signature appears to be consistent with the expected functionality. However, ensure that all references to this function throughout the codebase have been updated to reflect the new name.
```rust
fn check_with_partial(
    mut input: Position<'i>,
    stack: &mut Stack<Span<'i>>,
    tracker: &mut Tracker<'i, R>,
) -> Option<Position<'i>> {
    for i in 0..MAX {
        match restore_on_none(stack, |stack| {
            check_unit::<R, T, TRIVIA>(input, stack, tracker, i)
        }) {
            Some(next) => {
                input = next;
            }
            None => {
                if i < MIN {
                    return None;
                } else {
                    break;
                }
            }
        }
    }

    Some(input)
}
```
The addition of the `check_with_partial` function is a significant enhancement. It's important to ensure that this function is properly documented, explaining its purpose, parameters, and return value. This will help maintain the code's readability and understandability.

Consider adding comprehensive documentation comments for the `check_with_partial` function to explain its usage and parameters.
```rust
    Some((input, Self { content: vec }))
}
#[inline]
fn check_with_partial(
    mut input: Position<'i>,
    stack: &mut Stack<Span<'i>>,
    tracker: &mut Tracker<'i, R>,
) -> Option<Position<'i>> {
    for i in 0..MAX {
        match restore_on_none(stack, |stack| {
            check_unit::<R, T, TRIVIA>(input, stack, tracker, i)
        }) {
            Some(next) => {
                input = next;
            }
            None => {
                if i < MIN {
                    return None;
                } else {
                    break;
                }
            }
        }
    }

    Some(input)
}
}
impl<T, const TRIVIA: u8, const MIN: usize, const MAX: usize> IntoIterator
    for RepMinMax<T, TRIVIA, MIN, MAX>
```
📝 NOTE: This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [40-126].

The logic within both the `parse_with_partial` and `check_with_partial` functions seems to correctly implement the intended parsing behavior, with appropriate handling of the `MIN` and `MAX` constraints. However, it's crucial to add unit tests covering various edge cases, such as reaching the `MAX` limit, not meeting the `MIN` requirement, and handling trivia between parsed units.

Would you like assistance in creating unit tests for these functions to ensure their robustness and correctness?
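The `MIN`/`MAX` control flow under review can be exercised in isolation. The following is a minimal, self-contained sketch of that loop, not the actual pest3 code: `repeat_min_max` and its byte-slice input are hypothetical stand-ins for `check_unit` and pest's `Position`, chosen only so the edge cases (failing below `MIN`, stopping at `MAX`, stopping at the first mismatch) can be asserted directly.

```rust
// Standalone sketch of the bounded-repetition loop in `check_with_partial`.
// `unit` plays the role of one repeated sub-parser; the returned slice is
// the remaining input, mirroring `Some(input)` in the real function.
fn repeat_min_max<const MIN: usize, const MAX: usize>(
    mut input: &[u8],
    unit: u8,
) -> Option<&[u8]> {
    for i in 0..MAX {
        match input.split_first() {
            // One more unit matched: advance, as the real loop does
            // with `input = next`.
            Some((&b, rest)) if b == unit => input = rest,
            // No match: fail hard below MIN, otherwise accept what we have.
            _ => {
                if i < MIN {
                    return None;
                } else {
                    break;
                }
            }
        }
    }
    Some(input)
}

fn main() {
    // Not enough units to reach MIN=2.
    assert_eq!(repeat_min_max::<2, 4>(b"abb", b'a'), None);
    // Between MIN and MAX: stops at the first non-matching byte.
    assert_eq!(repeat_min_max::<1, 4>(b"aab", b'a'), Some(&b"b"[..]));
    // MAX=2 reached: further matching bytes are left unconsumed.
    assert_eq!(repeat_min_max::<0, 2>(b"aaaa", b'a'), Some(&b"aa"[..]));
    println!("ok");
}
```

Real tests would of course go through the generated typed parser (and a trivia rule) rather than a byte loop, but the same three cases apply.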
@tomtau and could you please explain how the module path (
@TheVeryDarkness the pest3 module system isn't yet properly defined; there are two aspects to it: syntax and semantics.

syntax

I saw two proposals for it:

I think the current pest3 implementation was leaning towards the latter (even though the former was in the original proposal by @dragostis), which seems fair: it's familiar to Rust users, it doesn't have issues with different system path notations on different operating systems, and can potentially be abstracted to "pest_vm".

If we go with that, I guess we can refer to the ancestor modules using the

semantics

There are multiple issues related to semantics, off the top of my head (there are likely more):

I think for this initial prototype, we can go with simple decisions, e.g. the grammar parser sees
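For the syntax side, the Rust-style `::` path proposal can be illustrated with a small resolver sketch. Everything below is an assumption for illustration only: the `resolve` function and its `super`/`crate` segment handling are hypothetical, not pest3's actual (still undefined) semantics.

```rust
// Hypothetical sketch: resolve a Rust-style grammar path against the
// module the reference appears in. `super` climbs one level, `crate`
// restarts from the grammar root, and any other segment descends.
fn resolve(current: &[&str], path: &str) -> Vec<String> {
    let mut module: Vec<String> = current.iter().map(|s| s.to_string()).collect();
    for seg in path.split("::") {
        match seg {
            "super" => {
                module.pop(); // ascend to the parent module
            }
            "crate" => module.clear(),             // back to the grammar root
            name => module.push(name.to_string()), // descend into a submodule
        }
    }
    module
}

fn main() {
    // From inside module `a::b`, `super::c` names a sibling module.
    assert_eq!(resolve(&["a", "b"], "super::c"), vec!["a", "c"]);
    // `crate::` paths ignore the current module entirely.
    assert_eq!(resolve(&["a", "b"], "crate::x::y"), vec!["x", "y"]);
    // A `pest::...` path could be routed to a built-in definitions module.
    assert_eq!(resolve(&[], "pest::ascii_digit"), vec!["pest", "ascii_digit"]);
    println!("ok");
}
```

A real implementation would also need to decide how such paths map to files and how re-exports behave, which is exactly the open semantics question above.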
@TheVeryDarkness I rebase-merged this one, so feel free to start a new branch off the latest master!

Sorry, I'm busy these weeks :(

No worries!
@TheVeryDarkness I previously merged that original PR by rebasing it, because I wanted to preserve a linear history without merge commits... but that made a merge conflict between those two master branches, so I created a new branch off the current master (`update-fixes`) and cherry-picked the new commits from your master.

How do you prefer to continue working on it going forward? Continue on that diverged master and I'll cherry-pick the new commits, or work directly on a branch?
Summary by CodeRabbit

New Features

- Renamed `parse_with` to `parse_with_partial` and added `check_with_partial` in `repetition.rs`.
- Added `try_parse_with_partial`, `try_parse_partial`, `check_with_partial`, `check_partial`, `parse_partial` to `TypedNode`.
- `FullRuleStruct` trait for rules with full capacity.
- `PairContainer` methods for handling pairs and children pairs.
- `PairTree` trait extending `PairContainer` with pair tree operation methods.
- `TypedParser` trait for typed syntax tree production with parsing methods.

Enhancements

- `box_all_rules` configuration flag for controlling rule boxing behavior in the parser.

Refactor

Documentation

Bug Fixes