Parser #7
@coderabbitai review this pr |
Tip: For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:
Length of output: 1736

@R5dan Hi there! I see you've mentioned me on PR #7 "Parser". This PR contains significant changes to the Better-MD package with 347 additions and 162 deletions across 14 files, mainly focusing on the elements directory. Would you like me to:
Let me know how I can best assist you! |
@coderabbitai review this pr |
🧩 Analysis chain

After examining the PR information, I'll run additional scripts to view the actual code changes for a detailed review.

🏁 Script executed:
Length of output: 2139

🏁 Script executed:
Length of output: 1414

🏁 Script executed:
Length of output: 4208

🏁 Script executed:
Length of output: 751

@R5dan I'll review this PR. Based on my analysis, PR #7 "Parser" makes significant changes to the Better-MD package, primarily in the elements directory. The PR includes 347 additions and 162 deletions across 14 files, with the most substantial changes to table.py, symbol.py, code.py, and a.py.

The Better-MD library appears to be a Python package for converting between HTML, Markdown, and RST formats using an object-oriented approach where each markup element has corresponding classes and conversion methods.

Without being able to view the complete diff content, I can see this PR is focused on enhancing or implementing parser functionality, particularly for complex elements like tables. The changes likely improve the conversion process between different markup formats.

For a more detailed review, I'll need to examine the specific code changes. Would you like me to focus on any particular aspects of the implementation?

✅ Actions performed

Review triggered.
|
Walkthrough

The changes update the BetterMD codebase by revising import statements, class definitions, and method signatures across multiple modules. The main module now uses a central Symbol class and introduces new static methods via HTML and MD classes, replacing outdated components. The elements folder has been extensively extended with numerous new classes that cover most HTML elements, each with defined properties for markup formats. The parsing modules have also been expanded to include a collection, new parsers, and support for Markdown extensions. Minor updates in the test, README, TODO, and setup files standardise naming and repository references.

Changes

Sequence Diagram(s)

sequenceDiagram
participant U as User
participant H as HTML Class
participant R as Requests Library
participant S as Symbol Class
U->>H: Call from_url(url)
H->>R: Fetch URL content
R-->>H: Return HTML content
H->>S: Invoke from_html(content)
S-->>H: Return HTML symbol
H-->>U: Return processed symbol
sequenceDiagram
participant C as Client
participant MP as MDParser
participant BE as BaseExtension
participant DOM as Document
C->>MP: parse(markdown_text)
MP->>BE: Initialise parser & extensions
BE->>MP: Process markdown blocks and inline elements
MP->>DOM: Build DOM structure
DOM-->>C: Return parsed document
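The first diagram's flow (fetch the page, then hand the content to the HTML parser) can be sketched roughly as follows. This is only an illustration under assumptions: `from_url`, `parse_html`, `fetch`, and `fetch_with_urllib` are hypothetical names, and the standard library's `urllib` stands in for the Requests library named in the diagram so that the sketch has no third-party dependency.

```python
from typing import Callable
from urllib.request import urlopen


def fetch_with_urllib(url: str) -> str:
    """Hypothetical stand-in for the Requests call in the diagram (stdlib only)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


def from_url(
    url: str,
    parse_html: Callable[[str], object],
    fetch: Callable[[str], str] = fetch_with_urllib,
) -> object:
    """Fetch the page, then delegate parsing (U -> H -> R -> S in the diagram)."""
    content = fetch(url)        # H ->> R: fetch URL content
    return parse_html(content)  # H ->> S: invoke from_html(content)
```

Passing the fetcher in as a parameter keeps the network call replaceable, so the flow can be exercised in tests without touching the network.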
|
🧩 Analysis chain

🏁 Script executed:
Length of output: 32615

🏁 Script executed:
Length of output: 604

🏁 Script executed:
Length of output: 2394

@R5dan I've completed my review of PR #7 "Parser". This PR makes significant changes to the Better-MD package, adding bidirectional conversion capabilities between HTML, Markdown, and RST formats.

Key Changes

Critical Issue

There's a significant problem with this PR: it imports

Recommendations

Would you like me to help create the missing

✅ Actions performed

Review triggered.
|
🧩 Analysis chain

🏁 Script executed:
Length of output: 8983

🏁 Script executed:
Length of output: 1030

🏁 Script executed:
Length of output: 44648

@R5dan After analyzing the PR, I'll provide a comprehensive review of the "Parser" implementation. This PR implements a robust parsing system for the Better-MD package, enabling conversion between HTML, Markdown, and RST formats. The key additions include:

Overall, this PR introduces a complete bidirectional conversion system between formats, making the library much more powerful. Each element now has proper parsing and serialization support. The code quality is good with careful handling of edge cases in complex elements like tables. The abstraction with

Would you like me to examine any specific aspect of the implementation in more detail?

✅ Actions performed

Review triggered.
|
Actionable comments posted: 8
🧹 Nitpick comments (14)
BetterMD/__init__.py (1)
2-2: Multiple imported modules are unused

The static analysis tool correctly identifies that `Collection`, `HTMLParser`, `MDParser`, and `RSTParser` are imported but not used in this file. This could lead to confusion about which modules are actually needed.

Either use these imports or remove them if they're not needed:

-from .parse import Collection, HTMLParser, MDParser, RSTParser
+from .parse import Collection  # Keep only what's needed or add to __all__

If these imports are meant to be re-exported as part of the package's public API, consider defining an `__all__` list to make this explicit.

🧰 Tools
🪛 Ruff (0.8.2)
2-2: `.parse.Collection` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
2-2: `.parse.HTMLParser` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
2-2: `.parse.MDParser` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
2-2: `.parse.RSTParser` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
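To illustrate the `__all__` suggestion, here is a small self-contained sketch of how `__all__` controls what a star import exposes. The module name `demo_pkg` and its contents are hypothetical stand-ins, not the actual BetterMD package layout.

```python
import sys
import types

# Hypothetical stand-in for a package __init__ that re-exports parser names.
source = '''
class Collection: ...
class HTMLParser: ...
_internal_helper = object()

__all__ = ["Collection", "HTMLParser"]
'''

# Build the module in memory and register it so it can be imported by name.
demo = types.ModuleType("demo_pkg")
exec(source, demo.__dict__)
sys.modules["demo_pkg"] = demo

# A star import only pulls in the names listed in __all__.
namespace = {}
exec("from demo_pkg import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Listing re-exported names in `__all__` also tells linters such as Ruff that the imports are intentional, which is what the F401 message hints at.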
BetterMD/markdown/custom_markdown.py (1)
18-18: Incomplete abstract method implementation

The `verify` method has an ellipsis implementation (`...`), which is typically used for abstract methods, but it's not marked with the `@abstractmethod` decorator. This could lead to confusion about whether subclasses need to implement this method.

Either mark it as abstract or provide a default implementation:

-    def verify(self, text) -> bool: ...
+    @abstractmethod
+    def verify(self, text) -> bool: ...

Or if a default implementation should be provided:

-    def verify(self, text) -> bool: ...
+    def verify(self, text) -> bool:
+        # Default implementation, optionally override in subclasses
+        return True

tests/test.py (2)
1-1: Unused import detected

The static analysis tool has correctly identified that `H2` is imported but not used in this file. Removing unused imports keeps the code cleaner and reduces potential confusion.

Remove the unused import:

-from BetterMD import H1, H2, Text, Div, LI, OL, UL, A, Strong, Table, Tr, Td, Th, Blockquote, Em, Input, THead
+from BetterMD import H1, Text, Div, LI, OL, UL, A, Strong, Table, Tr, Td, Th, Blockquote, Em, Input, THead

🧰 Tools
🪛 Ruff (0.8.2)
1-1: `BetterMD.H2` imported but unused. Remove unused import: `BetterMD.H2` (F401)
41-59: Table structure is inconsistent

The table implementation includes a `THead` for the header row but lacks a `TBody` for the content rows. For consistency and proper semantic structure, all body rows should be wrapped in a `TBody` element.

Consider updating the table structure:

 Table(
     inner=[
         THead(
             inner=[
                 Tr(
                     inner=[
                         Th(inner=[Text("Header 1")]),
                         Th(inner=[Text("Header 2")])
                     ],
                 )
             ]),
+        TBody(
+            inner=[
                 Tr(
                     inner=[
                         Td(inner=[Text("Cell 1")]),
                         Td(inner=[Text("Cell 2")])
                     ]
                 )
+            ]
+        )
     ]
 ).prepare(None).to_rst()

BetterMD/elements/__init__.py (5)
1-2: Consider removing or exporting the `Symbol` import if it's not used.
The static analysis flags `.symbol.Symbol` as potentially unused. If it's not required, you could remove this import to keep the file clean. Alternatively, if other modules rely on dynamic usage, consider adding it to `__all__` for clarity.

🧰 Tools
🪛 Ruff (0.8.2)
1-1: `.symbol.Symbol` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
4-8: Handle unused imports or add them to `__all__`.
Imports such as `Blockquote`, `Body`, `Br`, `Code`, and `Div` appear unused. If you plan to export them for downstream modules or dynamic usage, consider adding them to `__all__`. Otherwise, removing them will help avoid confusion and maintain a tidy import structure.

🧰 Tools
🪛 Ruff (0.8.2)
4-4: `.blockquote.Blockquote` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
5-5: `.body.Body` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
6-6: `.br.Br` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
7-7: `.code.Code` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
8-8: `.div.Div` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
11-14: Tidy up unused imports.
`Hr`, `HTML`, `Img`, and `Input` are not referenced in this file. Removing them or explicitly exporting them in `__all__` would address the static analysis warnings and make the file more maintainable.

🧰 Tools
🪛 Ruff (0.8.2)
11-11: `.hr.Hr` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
12-12: `.html.HTML` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
13-13: `.img.Img` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
14-14: `.input.Input` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
18-19: Confirm table-related imports' usage.
`THead`, `TBody`, `Strong`, and `Em` might be used dynamically, but if not, this would be an ideal time to remove or export them in `__all__`. This ensures clarity for other contributors.

🧰 Tools
🪛 Ruff (0.8.2)
18-18: `.table.Table` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
18-18: `.table.Tr` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
18-18: `.table.Td` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
18-18: `.table.Th` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
18-18: `.table.THead` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
18-18: `.table.TBody` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
19-19: `.text_formatting.Strong` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
19-19: `.text_formatting.Em` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
19-19: `.text_formatting.Strong` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
19-19: `.text_formatting.Em` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
20-21: Evaluate if `Text` and `Title` are truly utilised.
Since `Text` and `Title` are flagged as unused, consider removing them or adding them to `__all__`. Maintaining minimal imports aids quick comprehension.

🧰 Tools
🪛 Ruff (0.8.2)
20-20: `.text.Text` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
21-21: `.title.Title` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
BetterMD/elements/a.py (1)
42-52: Unimplemented reference methods.

The methods `md_refs`, `rst_refs`, and `html_refs` are declared but do not contain any implementation. If these methods are not used at present, consider removing them to reduce code clutter, or provide the intended logic so they can be meaningfully invoked.

Would you like help drafting the reference management logic, or shall we remove these methods for now?
BetterMD/elements/table.py (3)
8-75: Consider splitting into helper functions for better maintainability.

The logic for `TableMD.to_md` and `TableRST.to_rst` is quite extensive. Splitting the multi-pass approach into helper methods (e.g. for computing column widths, building rows, etc.) can improve readability and testability.

🧰 Tools
🪛 Ruff (0.8.2)
42-42: Multiple `isinstance` calls for `section`, merge into a single call. Merge `isinstance` calls for `section` (SIM101)
63-63: Loop control variable `i` not used within loop body. Rename unused `i` to `_i` (B007)
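As a rough illustration of the helper-function split suggested for the table conversion, the sketch below separates width computation, row formatting, and separator generation. It does not reproduce the PR's `TableMD.to_md`; the `(cells, is_header)` row shape and all function names are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical row shape: (cell texts, is_header)
Row = Tuple[List[str], bool]


def column_widths(rows: List[Row]) -> List[int]:
    """Widest cell per column, with a minimum of 3 for the separator dashes."""
    n_cols = max(len(cells) for cells, _ in rows)
    widths = [3] * n_cols
    for cells, _ in rows:
        for i, cell in enumerate(cells):
            widths[i] = max(widths[i], len(cell))
    return widths


def format_row(cells: List[str], widths: List[int]) -> str:
    """Pad short rows with empty cells and align every cell to its column width."""
    cells = cells + [""] * (len(widths) - len(cells))
    return "| " + " | ".join(c.ljust(w) for c, w in zip(cells, widths)) + " |"


def separator(widths: List[int]) -> str:
    """Header separator line made of dashes."""
    return "| " + " | ".join("-" * w for w in widths) + " |"


def table_to_md(rows: List[Row]) -> str:
    """Render rows as a Markdown table, emitting a separator after header rows."""
    if not rows:
        return ""
    widths = column_widths(rows)
    lines = []
    for cells, is_header in rows:
        lines.append(format_row(cells, widths))
        if is_header:
            lines.append(separator(widths))
    return "\n".join(lines)
```

Each helper can then be unit-tested on its own, which is the maintainability benefit the comment points at.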
42-42: Merge multiple `isinstance` calls into a single check.

Ruff suggests merging your two checks on `THead` and `TBody` into one:

-    if isinstance(section, THead) or isinstance(section, TBody):
+    if isinstance(section, (THead, TBody)):

🧰 Tools
🪛 Ruff (0.8.2)
42-42: Multiple `isinstance` calls for `section`, merge into a single call. Merge `isinstance` calls for `section` (SIM101)
63-63: Rename unused loop variable.

The loop index `i` is unused within the loop body. Consider renaming it to `_` or `_i` to make it clear you do not rely on its value.

-for i, (cells, is_header) in enumerate(all_rows):
+for _, (cells, is_header) in enumerate(all_rows):

🧰 Tools
🪛 Ruff (0.8.2)
63-63: Loop control variable `i` not used within loop body. Rename unused `i` to `_i` (B007)
BetterMD/elements/symbol.py (1)
40-40: Use `is None` for comparison.

Comparing with `== None` can be error-prone or ambiguous in certain contexts. Prefer `is None`.

-if inner == None:
+if inner is None:

🧰 Tools
🪛 Ruff (0.8.2)
40-40: Comparison to `None` should be `cond is None`. Replace with `cond is None` (E711)
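A short sketch of why `== None` can mislead: equality goes through `__eq__`, which any class may override, whereas `is None` checks identity and cannot be overridden. The `AlwaysEqual` class below is a hypothetical example, not something from the BetterMD codebase.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

# Equality is delegated to __eq__, so this reports True for a non-None object.
equals_none = (value == None)  # noqa: E711 - deliberately showing the pitfall

# Identity cannot be overridden, so this correctly reports False.
is_none = value is None
```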
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
BetterMD/__init__.py
(1 hunks)BetterMD/elements/__init__.py
(1 hunks)BetterMD/elements/a.py
(1 hunks)BetterMD/elements/code.py
(2 hunks)BetterMD/elements/h.py
(1 hunks)BetterMD/elements/input.py
(2 hunks)BetterMD/elements/p.py
(1 hunks)BetterMD/elements/symbol.py
(4 hunks)BetterMD/elements/table.py
(3 hunks)BetterMD/elements/text.py
(1 hunks)BetterMD/html/custom_html.py
(1 hunks)BetterMD/markdown/custom_markdown.py
(1 hunks)BetterMD/rst/custom_rst.py
(1 hunks)tests/test.py
(2 hunks)
🧰 Additional context used
🧬 Code Definitions (11)
BetterMD/markdown/custom_markdown.py (10)
BetterMD/elements/symbol.py (3) (3)
Symbol
(8-148)to_md
(89-94)prepare
(67-73)BetterMD/elements/a.py (1) (1)
to_md
(11-12)BetterMD/elements/code.py (1) (1)
to_md
(8-18)BetterMD/elements/img.py (1) (1)
to_md
(7-9)BetterMD/elements/input.py (1) (1)
to_md
(7-10)BetterMD/elements/li.py (2) (2)
to_md
(6-9)to_md
(21-24)BetterMD/elements/table.py (5) (5)
to_md
(9-30)to_md
(78-107)to_md
(115-123)to_md
(126-128)to_md
(136-137)BetterMD/elements/text.py (1) (1)
to_md
(20-21)BetterMD/elements/text_formatting.py (1) (1)
to_md
(6-8)BetterMD/html/custom_html.py (1) (1)
prepare
(13-13)
tests/test.py (4)
BetterMD/elements/h.py (2) (2)
H1
(16-20)H2
(22-26)BetterMD/elements/text.py (1) (1)
Text
(8-24)BetterMD/elements/li.py (3) (3)
LI
(33-36)OL
(38-41)UL
(43-46)BetterMD/elements/table.py (5) (5)
Table
(157-161)Tr
(163-166)Td
(168-171)Th
(173-176)THead
(178-181)
BetterMD/elements/input.py (3)
BetterMD/elements/symbol.py (3) (3)
get_prop
(134-135)to_html
(82-87)to_rst
(96-101)BetterMD/html/custom_html.py (1) (1)
to_html
(11-11)BetterMD/rst/custom_rst.py (1) (1)
CustomRst
(9-18)
BetterMD/elements/p.py (1)
BetterMD/elements/symbol.py (1) (1)
Symbol
(8-148)
BetterMD/html/custom_html.py (1)
BetterMD/elements/symbol.py (1) (1)
Symbol
(8-148)
BetterMD/elements/__init__.py (17)
BetterMD/elements/symbol.py (1) (1)
Symbol
(8-148)BetterMD/elements/a.py (1) (1)
A
(34-52)BetterMD/elements/blockquote.py (1) (1)
Blockquote
(9-13)BetterMD/elements/br.py (1) (1)
Br
(13-16)BetterMD/elements/code.py (2) (2)
Code
(65-69)HTML
(20-31)BetterMD/elements/div.py (1) (1)
Div
(3-6)BetterMD/elements/h.py (6) (6)
H1
(16-20)H2
(22-26)H3
(28-32)H4
(34-38)H5
(40-44)H6
(46-50)BetterMD/elements/head.py (1) (1)
Head
(3-6)BetterMD/elements/hr.py (1) (1)
Hr
(13-17)BetterMD/elements/img.py (2) (2)
HTML
(11-13)Img
(19-23)BetterMD/elements/input.py (1) (1)
Input
(18-39)BetterMD/elements/li.py (3) (3)
OL
(38-41)UL
(43-46)LI
(33-36)BetterMD/elements/p.py (1) (1)
P
(3-7)BetterMD/elements/span.py (1) (1)
Span
(3-6)BetterMD/elements/table.py (6) (6)
Table
(157-161)Tr
(163-166)Td
(168-171)Th
(173-176)THead
(178-181)TBody
(183-186)BetterMD/elements/text_formatting.py (2) (2)
Strong
(15-18)Em
(20-23)BetterMD/elements/text.py (1) (1)
Text
(8-24)
BetterMD/elements/h.py (1)
BetterMD/elements/symbol.py (1) (1)
Symbol
(8-148)
BetterMD/elements/code.py (2)
BetterMD/rst/custom_rst.py (2) (2)
CustomRst
(9-18)verify
(18-18)BetterMD/elements/symbol.py (2) (2)
to_rst
(96-101)Symbol
(8-148)
BetterMD/elements/a.py (2)
BetterMD/rst/custom_rst.py (2) (2)
CustomRst
(9-18)verify
(18-18)BetterMD/markdown/custom_markdown.py (2) (2)
CustomMarkdown
(9-18)verify
(18-18)
BetterMD/elements/table.py (3)
BetterMD/markdown/custom_markdown.py (2) (2)
CustomMarkdown
(9-18)to_md
(14-14)BetterMD/elements/symbol.py (3) (3)
to_md
(89-94)to_rst
(96-101)Symbol
(8-148)BetterMD/rst/custom_rst.py (2) (2)
CustomRst
(9-18)to_rst
(14-14)
BetterMD/elements/symbol.py (3)
BetterMD/elements/img.py (3) (3)
to_html
(12-13)to_md
(7-9)to_rst
(16-17)BetterMD/elements/text.py (2) (2)
to_html
(17-18)to_md
(20-21)BetterMD/__init__.py (2) (2)
from_html
(4-5)from_md
(7-8)
🪛 Ruff (0.8.2)
BetterMD/__init__.py
2-2: .parse.Collection
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
2-2: .parse.HTMLParser
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
2-2: .parse.MDParser
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
2-2: .parse.RSTParser
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
5-5: Symbol
may be undefined, or defined from star imports
(F405)
8-8: Symbol
may be undefined, or defined from star imports
(F405)
tests/test.py
1-1: BetterMD.H2
imported but unused
Remove unused import: BetterMD.H2
(F401)
BetterMD/elements/__init__.py
1-1: .symbol.Symbol
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
3-3: .a.A
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
4-4: .blockquote.Blockquote
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
5-5: .body.Body
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
6-6: .br.Br
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
7-7: .code.Code
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
8-8: .div.Div
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
9-9: .h.H1
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
9-9: .h.H2
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
9-9: .h.H3
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
9-9: .h.H4
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
9-9: .h.H5
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
9-9: .h.H6
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
10-10: .head.Head
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
11-11: .hr.Hr
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
12-12: .html.HTML
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
13-13: .img.Img
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
14-14: .input.Input
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
15-15: .li.OL
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
15-15: .li.UL
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
15-15: .li.LI
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
16-16: .p.P
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
17-17: .span.Span
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
18-18: .table.Table
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
18-18: .table.Tr
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
18-18: .table.Td
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
18-18: .table.Th
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
18-18: .table.THead
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
18-18: .table.TBody
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
19-19: .text_formatting.Strong
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
19-19: .text_formatting.Em
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
19-19: .text_formatting.Strong
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
19-19: .text_formatting.Em
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
20-20: .text.Text
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
21-21: .title.Title
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
BetterMD/elements/a.py
8-8: ..parse.Collection
imported but unused
Remove unused import: ..parse.Collection
(F401)
BetterMD/elements/table.py
6-6: itertools
imported but unused
Remove unused import: itertools
(F401)
42-42: Multiple isinstance
calls for section
, merge into a single call
Merge isinstance
calls for section
(SIM101)
63-63: Loop control variable i
not used within loop body
Rename unused i
to _i
(B007)
BetterMD/elements/symbol.py
6-6: `..parse.RSTParser` imported but unused. Remove unused import: `..parse.RSTParser` (F401)
32-32: Do not use mutable data structures for argument defaults. Replace with `None`; initialize within function (B006)
32-32: Do not use mutable data structures for argument defaults. Replace with `None`; initialize within function (B006)
32-32: Do not use mutable data structures for argument defaults. Replace with `None`; initialize within function (B006)
39-39: Do not use mutable data structures for argument defaults. Replace with `None`; initialize within function (B006)
39-39: Do not use mutable data structures for argument defaults. Replace with `None`; initialize within function (B006)
40-40: Comparison to `None` should be `cond is None`. Replace with `cond is None` (E711)
87-87: f-string without any placeholders. Remove extraneous `f` prefix (F541)
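The B006 findings above concern mutable default arguments. As a minimal sketch of the pitfall and the usual fix, with hypothetical function names unrelated to the actual `Symbol` signatures:

```python
def append_shared(item, bucket=[]):  # noqa: B006 - deliberately wrong
    """The default list is created once and reused across calls."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Use None as the sentinel and build a new list inside the function."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


# Copies are taken so each snapshot reflects the state at that call.
shared_first = list(append_shared("a"))
shared_second = list(append_shared("b"))  # the earlier "a" is still there

fresh_first = append_fresh("a")
fresh_second = append_fresh("b")  # independent of the first call
```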
🔇 Additional comments (25)
BetterMD/markdown/custom_markdown.py (1)
9-14: Good improvement to class architecture

Making `CustomMarkdown` an abstract base class with an `@abstractmethod` for `to_md` is a great enhancement. This enforces that all subclasses must implement the `to_md` method, ensuring a consistent interface and preventing potential errors from forgotten implementations.

tests/test.py (1)
43-51: Improved table structure with THead

The addition of the `THead` element to wrap the header row improves the structure of the table, making it more semantically correct and consistent with HTML standards. This change aligns with best practices for table markup.

BetterMD/html/custom_html.py (2)
2-2: Good addition of ABC module

Adding the abstract base class functionality is a positive architectural change that ensures proper implementation of the `to_html` method in all subclasses.

9-11: Sound implementation of abstract base class pattern

Converting `CustomHTML` to an abstract base class with an abstract `to_html` method enforces a clear contract that all subclasses must follow. This improves code quality by ensuring consistent implementation across the codebase.

BetterMD/elements/input.py (3)
9-9: Improved checkbox formatting in Markdown output

Adding a space in the checkbox representation when unchecked ensures consistent formatting and better readability in Markdown output. This aligns with standard Markdown conventions.

15-15: Better RST checkbox formatting

Similarly, the RST representation now includes a space for unchecked checkboxes, maintaining consistency between Markdown and RST outputs.

37-37: Simplified HTML representation

Changing from an instance of a `HTML` class to a string literal "input" represents a notable architectural shift. This simplification aligns with changes elsewhere in the codebase and likely works with updated HTML rendering logic in the `Symbol` class.

Please confirm that the `Symbol.to_html()` method now handles string literals properly, since the previous implementation used a `CustomHTML` subclass instance.

BetterMD/rst/custom_rst.py (4)
2-2: Good addition of ABC module

Adding the abstract base class functionality improves code quality by enforcing implementation contracts.

9-9: Sound implementation of abstract base class pattern

Converting `CustomRst` to an abstract base class ensures that all subclasses will properly implement the required methods, improving the robustness of the codebase.

13-14: Well-implemented abstract method with proper type hints

The `to_rst` method is correctly marked as abstract with appropriate return type annotation. This establishes a clear contract for subclasses.

18-18: Consistent type annotation

Adding quotes around `bool` in the return type annotation maintains consistency with the typing style used elsewhere in the class.

BetterMD/elements/h.py (1)
18-18: Improved Markdown heading syntax compliance

Adding a space after the hash symbols in all heading levels (`H1` through `H6`) properly conforms to the Markdown specification, which requires a space between the hash symbols and the heading text for proper rendering.

Also applies to: 24-24, 30-30, 36-36, 42-42, 48-48
BetterMD/elements/code.py (7)
5-5: Imported `CustomRst` reference check.
Confirm if `CustomRst` from `..rst` is indeed used only by `RST` below. This looks correct, as the new `RST` class depends on `CustomRst`.

23-23: Join inner elements with line breaks.
Replacing direct string conversion with a joined string of child elements clarifies HTML representation. This addition is sensible and consistent with multi-line code blocks.

26-26: Good practice for language-specific code.
This line conditionally adds the language class, which is helpful for syntax highlighting.

30-31: Useful `verify` method for `HTML`.
Verifying that text is "code" ensures consistency if there is code-specific handling. Well-structured approach.

33-45: RST class introduction.
The new `RST` class extends `CustomRst` and handles structured code block logic, including indentation. This is a strong architectural move that centralises RST-specific rules.

47-63: Robust handling for code blocks versus inline code.
Indenting content for RST code blocks and handling inline code with backticks is well-implemented. Good approach to keep code and language param logic in one place.

68-68: Switch to `RST` for the `Code` class.
Using the new dedicated `RST` class improves reusability and clarity. This aligns with the rest of the architecture.

BetterMD/elements/text.py (4)
9-11: Updated static strings for 'md', 'html', and 'rst'.
By assigning "text" to these properties, the class clarifies its usage as purely textual. Ensure no previous dynamic placeholders are needed.

13-15: Simplified constructor.
Removing extraneous parameters helps reduce complexity, and passing `**props` to the superclass is a streamlined pattern.

17-18: Readable `to_html` method with indentation logic.
Indicating indentation via repeated spaces is straightforward. This is helpful for nesting text in more complex structures.

23-24: Inline RST generation.
Returning `self.text` aligns with the usage seen in other classes, keeping it consistent. No issues spotted.

BetterMD/elements/a.py (1)
2-2: Looks good overall.

All these changes appear consistent with the revised structure of the codebase. The `verify` method in `MD` is well-designed for detecting multiple link types using straightforward patterns, and the transition to `CustomMarkdown` and `CustomRst['A']` follows a logical inheritance approach.

Also applies to: 4-4, 10-10, 13-27, 29-29, 39-40

BetterMD/elements/symbol.py (1)
24-29: Overall design looks sound.

Your addition of class-level parsers and the `__init_subclass__` logic helps streamline symbol registrations. The `from_html` and `from_md` methods are straightforward wrappers around the parser. Similarly, the `__str__` method, though quite detailed with nested formatting, works well for debugging.

Also applies to: 87-87, 104-131, 145-148
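For readers unfamiliar with the `__init_subclass__` registration pattern mentioned above, here is a minimal sketch of the general technique. It is not the PR's `Symbol` implementation: the class name `RegisteredSymbol`, the `registry` dictionary, and the `lookup` method are hypothetical names chosen for illustration.

```python
class RegisteredSymbol:
    """Hypothetical base class that records each subclass under its HTML tag."""

    registry = {}  # tag name -> subclass; intentionally shared by all subclasses
    html = ""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once per subclass definition, after the class body has been evaluated.
        if cls.html:
            RegisteredSymbol.registry[cls.html] = cls

    @classmethod
    def lookup(cls, tag):
        """Return the class registered for a tag, or None if unknown."""
        return cls.registry.get(tag)


class P(RegisteredSymbol):
    html = "p"


class Div(RegisteredSymbol):
    html = "div"
```

With this in place a parser can map a tag name to its element class via `RegisteredSymbol.lookup("p")` without maintaining a hand-written table.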
    html = "pre"
    md = ""
    rst = ""
🛠️ Refactor suggestion

New Pre class implementation appears incomplete

The new `Pre` class implementation has empty string values for the `md` and `rst` properties. In standard Markdown and reStructuredText, preformatted text typically requires special formatting (like code blocks with backticks or indentation). Without proper formatting, the conversion may not reflect preformatted text correctly in these formats.

Consider implementing proper Markdown and RST representations:

 class Pre(Symbol):
     html = "pre"
-    md = ""
-    rst = ""
+    md = "```\n"  # Triple backticks for code blocks in Markdown
+    rst = "::\n\n"  # Double colon with double newline for preformatted blocks in RST
+    nl = True  # Ensure a newline after the element like the P class
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-class Pre(Symbol):
-    html = "pre"
-    md = ""
-    rst = ""
+class Pre(Symbol):
+    html = "pre"
+    md = "
BetterMD/__init__.py
Outdated
    return Symbol.from_html(html)

def from_md(md:'str'):
    return Symbol.from_md(md)
🛠️ Refactor suggestion

Symbol undefined in parsing functions

Both the `from_html` and `from_md` functions reference `Symbol`, but it's not explicitly imported. While it might be included via the star import from `.elements`, this could lead to confusion and potential issues if the import structure changes.

Make the dependency on `Symbol` explicit:

 from .elements import *
+from .elements.symbol import Symbol  # Explicitly import Symbol
 from .parse import Collection, HTMLParser, MDParser, RSTParser

 def from_html(html:'str'):
     return Symbol.from_html(html)

 def from_md(md:'str'):
     return Symbol.from_md(md)
🧰 Tools
🪛 Ruff (0.8.2)
5-5: Symbol may be undefined, or defined from star imports (F405)
8-8: Symbol may be undefined, or defined from star imports (F405)
class RST(CustomRst['A']):
    def to_rst(self, inner, symbol, parent):
        return f"`{' '.join([e.to_rst() for e in inner])} <{symbol.get_prop('href')}>`_"

class A(Symbol):
    prop_list = ["href"]

    refs = {}
🛠️ Refactor suggestion
Use a class variable for shared references with caution.
The refs = {} dictionary here is shared among all instances of class A. Any modifications to refs on one instance will affect other instances. Consider changing the design if different instances of A should have separate dictionaries.
-class A(Symbol):
- ...
- refs = {}
+class A(Symbol):
+ ...
+ # If you need a unique dictionary on each instance, move refs to __init__
+ def __init__(self, *args, **kwargs):
+ super().__init__(*args, **kwargs)
+ self.refs = {}
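The sharing behaviour described here can be shown with a small standalone sketch. The class names `SharedRefs` and `OwnRefs` are hypothetical stand-ins and do not come from BetterMD.

```python
class SharedRefs:
    # Class attribute: a single dict object shared by every instance.
    refs = {}


class OwnRefs:
    def __init__(self):
        # Instance attribute: a fresh dict is created per instance.
        self.refs = {}


a, b = SharedRefs(), SharedRefs()
a.refs["home"] = "https://example.com"
shared_leaks = "home" in b.refs  # b sees the entry added through a

c, d = OwnRefs(), OwnRefs()
c.refs["home"] = "https://example.com"
own_leaks = "home" in d.refs  # d has its own, still empty, dict
```

If a document-wide reference registry is the intent, the shared class attribute may be deliberate; in that case a comment saying so would avoid the ambiguity.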
Actionable comments posted: 12
🧹 Nitpick comments (9)
BetterMD/elements/body.py (1)
1-7: The Body class looks simple and follows the project's design pattern
The Body class correctly inherits from Symbol and defines the necessary format-specific attributes for HTML, Markdown, and reStructuredText. The empty strings for md and rst make sense as these formats don't have an explicit body element equivalent.
I would recommend adding a docstring to the class to explain its purpose and behaviour:
 class Body(Symbol):
+    """Represents an HTML body element in the document structure.
+
+    HTML body elements don't have direct equivalents in Markdown or RST,
+    which is why the md and rst attributes are empty strings.
+    """
     html = "body"
     md = ""
     rst = ""
This will help other developers understand the purpose of this class and why certain design decisions were made.
🧰 Tools
🪛 Ruff (0.8.2)
1-1: No such file or directory (os error 2)
(E902)
BetterMD/elements/html.py (2)
3-6: Consider adding a docstring to clarify class purpose and usage
The HTML class seems to be a symbol representation for HTML content, but lacks documentation explaining its purpose, expected usage, and relationship with the conversion system. Adding a class-level docstring would significantly improve code maintainability and help future developers understand how it fits within the BetterMD architecture.
 class HTML(Symbol):
+    """
+    Symbol representing raw HTML content.
+
+    This class handles the representation of HTML content that should be preserved
+    as-is when converting between markup formats. The empty strings for `md` and `rst`
+    attributes suggest this content may be ignored or handled specially in non-HTML outputs.
+    """
     html = "html"
     md = ""
     rst = ""
3-6: Add type hints to class attributes
For better type safety and IDE support, consider adding type hints to the class attributes. This would make the expected types clear and improve developer experience.
 class HTML(Symbol):
-    html = "html"
-    md = ""
-    rst = ""
+    html: str = "html"
+    md: str = ""
+    rst: str = ""
BetterMD/elements/title.py (2)
7-26: Add docstrings to improve code documentation.
The classes and methods lack docstrings that would explain their purpose, parameters, return values, and potential exceptions.
Consider adding docstrings like:
class MD(CustomMarkdown):
    """Markdown renderer for Title elements."""
    def to_md(self, inner: list[Symbol], symbol: Symbol, parent: Symbol, **kwargs) -> str:
        """
        Convert a Title element to Markdown format.

        Args:
            inner: List of Symbol objects contained within the Title
            symbol: The Title Symbol being converted
            parent: The parent Symbol containing this Title
            **kwargs: Additional arguments

        Returns:
            A string containing the Title in Markdown format

        Raises:
            ValueError: If the Title does not contain exactly one Text element
        """
22-25: Consider adding a constructor to validate Title properties.
The Title class doesn't have any initialisation logic to validate its construction. Consider adding a constructor to ensure proper initialisation.
 class Title(Symbol):
     html = "title"
     md = MD()
     rst = RST()
+
+    def __init__(self, content=None):
+        """
+        Initialize a Title element.
+
+        Args:
+            content: Text content for the title
+        """
+        super().__init__()
+        if content is not None:
+            from .text import Text
+            self.append(Text(content))
BetterMD/parse/collection.py (2)
9-12: Consider clarifying type annotations for the symbols list.
You may wish to specify the type of self.symbols more explicitly, for example List[Symbol], to enhance readability and clarity of intent.
13-15: Rename the method to match its singular input parameter.
The method name add_symbols implies adding multiple symbols, yet it only accepts a single symbol. Renaming it to add_symbol would improve clarity.
BetterMD/parse/html.py (1)
16-188: Consider a more modular approach to the state machine.
The parsing logic encapsulated in this large while loop might become cumbersome to maintain. Splitting the state transition logic into smaller, dedicated methods or utilising an enum for states could improve clarity and reduce complexity.
BetterMD/parse/markdown.py (1)
8-24: Consider precompiling top-level regular expressions for better performance.
You are repeatedly invoking regular expressions from the top_level_tags dictionary across multiple parsing methods. Compiling these patterns once and reusing them can improve performance, especially when dealing with longer documents.
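A rough sketch of the precompilation idea, assuming a dictionary of patterns keyed by block type. The two patterns and the `classify` helper are hypothetical; the actual top_level_tags entries are not shown in this thread and may differ.

```python
import re

# Hypothetical patterns, compiled once at import time.
TOP_LEVEL_TAGS = {
    "h": re.compile(r"^(#{1,6})\s+(.*)$"),
    "blockquote": re.compile(r"^>\s?(.*)$"),
}


def classify(line: str) -> str:
    """Return the name of the first pattern that matches the line, else 'p'."""
    for name, pattern in TOP_LEVEL_TAGS.items():
        if pattern.search(line):
            return name
    return "p"
```

Python's `re` module already caches recently compiled patterns, so the measurable gain is often modest; the main benefit may be that each pattern is defined and named in one place.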
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
BetterMD/elements/body.py (1 hunks)
BetterMD/elements/html.py (1 hunks)
BetterMD/elements/title.py (1 hunks)
BetterMD/parse/__init__.py (1 hunks)
BetterMD/parse/collection.py (1 hunks)
BetterMD/parse/html.py (1 hunks)
BetterMD/parse/markdown.py (1 hunks)
BetterMD/parse/typing.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
BetterMD/elements/html.py
1-1: No such file or directory (os error 2)
(E902)
BetterMD/elements/body.py
1-1: No such file or directory (os error 2)
(E902)
BetterMD/parse/typing.py
1-1: No such file or directory (os error 2)
(E902)
BetterMD/parse/collection.py
1-1: No such file or directory (os error 2)
(E902)
BetterMD/parse/html.py
1-1: No such file or directory (os error 2)
(E902)
BetterMD/elements/title.py
1-1: No such file or directory (os error 2)
(E902)
BetterMD/parse/markdown.py
1-1: No such file or directory (os error 2)
(E902)
BetterMD/parse/__init__.py
1-1: No such file or directory (os error 2)
(E902)
🔇 Additional comments (6)
BetterMD/elements/title.py (1)
19-19: ❓ Verification inconclusive
Verify RST title format compliance.
The RST format :title: text uses a field list syntax, but it's worth confirming this is the intended format for titles in reStructuredText.
🌐 Web query:
What is the standard reStructuredText syntax for document titles?
💡 Result:
In reStructuredText, document titles are indicated by using a uniquely-adorned top-level section heading at the beginning of the document. Here's the standard approach:
No dedicated title syntax:
Unlike some markup languages, there's no explicit "title" command. Instead, the first uniquely adorned section heading becomes the document title[1][2][5].
Adornment rules:
- Use matching non-alphanumeric characters (typically = for titles)
- Length must equal or exceed the title text
- An underline is required; an overline is optional but, if used, must match the underline[1][4][12]
Example:
===================
Main Document Title
===================

Subtitle
--------
Uniqueness requirement:
The adornment style must be distinct from other section headings in the document. Subsequent sections with the same adornment will be treated as same-level headings[4][7].
Positioning:
Must be the first element in the document (no preceding text or blank lines)[2]. Any preceding content prevents the heading from becoming the title[2][12].
Common conventions:
Level | Adornment Style | Typical Use |
---|---|---|
Title | ####... (overline+underline) | Document title |
Chapter | ****... (overline+underline) | Major sections |
Section | ==== (underline only) | Content sections |
This structure is parsed into hierarchical document elements, with the title becoming the root node in the document tree[1][4]. Most documentation systems like Sphinx then style these titles differently in output formats (HTML, PDF, etc.)[7][12].
Citations:
- 1: https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html
- 2: https://docutils.sourceforge.io/docs/user/rst/quickstart.html
- 3: https://docs.beagleboard.org/intro/contribution/rst-cheat-sheet.html
- 4: https://documatt.com/restructuredtext-reference/element/section.html
- 5: https://docs.anaconda.com/restructuredtext/
- 6: https://developer.lsst.io/restructuredtext/style.html
- 7: https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html
- 8: https://www.cometvisu.org/CometVisu/en/0.11/manual/colab/doc/rst.html
- 9: https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html
- 10: https://docutils.sourceforge.io/docs/ref/rst/directives.html
- 11: https://docs.openstack.org/doc-contrib-guide/rst-conv/titles.html
- 12: https://canonical-documentation-with-sphinx-and-readthedocscom.readthedocs-hosted.com/style-guide/
- 13: https://docutils.sourceforge.io/docs/user/rst/quickref.html
ACTION: Revisit RST Title Format Implementation
At line 19 in BetterMD/elements/title.py the code returns a field entry with the syntax:
return f":title: {inner[0].to_rst()}"
Note that standard reStructuredText does not define document titles via a field list. Instead, a document title is normally indicated by a uniquely adorned section heading (with matching overline and underline). Please confirm whether this field list approach is being used deliberately for metadata purposes or if it should be updated to adhere to standard RST title conventions.
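If the adorned-heading convention is preferred, the output could be produced along these lines. This is only a sketch: `rst_title` is a hypothetical helper, not part of BetterMD, and it assumes the title is plain single-width text so that `len` gives the display width.

```python
def rst_title(text: str, char: str = "=") -> str:
    """Render text as an RST title with matching overline and underline."""
    line = char * len(text)  # adornment must be at least as long as the title
    return f"{line}\n{text}\n{line}\n"
```

For example, `rst_title("Main")` yields three lines: `====`, `Main`, `====`.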
BetterMD/parse/collection.py (1)
19-28: Verify removal error handling.
If remove_symbol is called with a symbol not present in self.symbols, a ValueError will be raised by the underlying list operation. Confirm that this behaviour is intended or add specific checks to provide a more descriptive error message.
BetterMD/parse/__init__.py (1)
1-6: Re-exports look correct.
All imported entities in __all__ reinforce a well-structured public interface. This promotes modular clarity and maintainability.
🧰 Tools
🪛 Ruff (0.8.2)
1-1: No such file or directory (os error 2)
(E902)
BetterMD/parse/typing.py (1)
1-16: Type definitions appear consistent and comprehensive.
The use of TypedDict and Protocol provides clear guidance on the structure and expected content of elements, helping maintain strong typing across the codebase.
🧰 Tools
🪛 Ruff (0.8.2)
1-1: No such file or directory (os error 2)
(E902)
BetterMD/parse/html.py (1)
209-211: Ensure robust handling of mismatched tags.
Currently, you only pop the stack if the top element matches tag_name. If there's a mismatch or a closing tag for a non-existent element, no explicit error is raised. Confirm whether you expect silent failure or if a more explicit check is needed.
BetterMD/parse/markdown.py (1)
98-98: Verify the return value correctness in handle_blockquote.
Returning len(new_text) - 1 might not reflect the actual number of source lines consumed, especially if multiple paragraphs or breaks are merged. This could result in skipping an incorrect number of lines when incrementing i in the main loop.
BetterMD/elements/html.py
Outdated
class HTML(Symbol):
    html = "html"
    md = ""
    rst = ""
💡 Verification agent
🧩 Analysis chain
Consider implementing conversion methods for consistency
The class doesn't appear to have any methods for handling the conversion between formats. Based on the PR context, most Symbol subclasses likely implement methods like to_html(), to_md(), or to_rst(). Consider adding these methods to maintain consistency with the rest of the codebase, even if they simply return the content unchanged.
🌐 Web query:
What methods are typically included in Symbol subclasses for format conversion?
💡 Result:
Symbol subclasses typically implement specialized methods for format conversion, leveraging programming language-specific mechanisms to control type coercion and string representation. These methods vary across ecosystems but share common patterns:
1. String Representation Overrides
- toString() / __str__(): Converts symbol to human-readable string[10][12]
Symbol('foo').toString(); // "Symbol(foo)"[10]
- toStringTag property (JS): Customizes Object.prototype.toString() output[10][12]
class CustomSymbol {
  get [Symbol.toStringTag]() { return 'MySymbol'; }
}
Object.prototype.toString.call(new CustomSymbol()); // "[object MySymbol]"[12]
2. Format Specification Handlers
- __format__ method (Python): Controls format() behavior with specifiers[3][4][6]
class MySymbol:
    def __format__(self, format_spec):
        return f"Formatted: {format_spec}"
3. Type Conversion Protocols
- Explicit conversion methods like toNumber(), toBoolean() (often throwing errors for invalid conversions)[10][14]
const sym = Symbol();
Number(sym); // TypeError: Cannot convert Symbol to number[10]
4. Custom Formatter Integration
- Conversion flags in format strings (!s, !r in Python)[4][6]
f"{symbol!r}" # Calls repr() instead of str()[4]
5. Parser Interaction Methods
- parse methods for reverse conversion (e.g., Java DecimalFormat)[9][13]
DecimalFormat df = new DecimalFormat("#,##0.00");
Number num = df.parse("1,234.56");[9]
6. Localization Support
- Culture-aware formatting through IFormatProvider (.NET)[7][13]
public string ToString(string format, IFormatProvider provider) {
    // Culture-specific formatting[13]
}
These methods enable precise control over symbol representation while maintaining type safety. JavaScript Symbols particularly restrict implicit conversions, requiring explicit toString() calls[10][12], whereas Python and .NET provide more flexible formatting hooks through standardized interfaces[3][4][13]. The implementation details often involve overriding base class methods or implementing formatting interfaces like IFormattable[13].
Citations:
- 1: https://doc.sagemath.org/html/en/reference/calculus/sage/symbolic/expression_conversions.html
- 2: https://press.rebus.community/programmingfundamentals/chapter/data-type-conversions/
- 3: https://docs.python.org/3/library/string.html
- 4: https://peps.python.org/pep-3101/
- 5: https://www.matillion.com/learn/blog/data-conversion
- 6: https://www.jython.org/jython-old-sites/docs/library/string.html
- 7: https://learn.microsoft.com/en-us/dotnet/standard/base-types/type-conversion
- 8: https://www.electronicofficesystems.com/2023/11/30/what-techniques-or-technologies-are-commonly-employed-in-document-conversion-with-a-commercial-scanner/
- 9: https://docs.oracle.com/javase/8/docs/api/java/text/DecimalFormat.html
- 10: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol
- 11: https://www.astera.com/type/blog/data-conversion/
- 12: https://dev.to/lioness100/javascript-symbols-classes-2b08
- 13: https://learn.microsoft.com/en-us/dotnet/standard/base-types/formatting-types
- 14: https://www.w3schools.com/c/c_type_conversion.php
- 15: https://www.uspto.gov/web/offices/pac/mpep/s905.html
- 16: https://docs.oracle.com/javase/specs/jls/se7/html/jls-5.html
- 17: https://help.alteryx.com/current/en/designer/functions/conversion-functions.html
- 18: https://logging.apache.org/log4j/1.x/apidocs/org/apache/log4j/PatternLayout.html
Conversion Methods Implementation Required
The HTML class in BetterMD/elements/html.py (lines 3–6) currently defines only the format attributes without any explicit conversion methods. Given that other Symbol subclasses in our codebase typically implement conversion methods to handle format transformations (for instance, methods such as to_html(), to_md(), and to_rst()), it is advisable to add similar methods here. This ensures consistency across the codebase and provides a clear API for format conversion, even if the methods simply return the corresponding attribute values.
Suggested changes:
class HTML(Symbol):
    html = "html"
    md = ""
    rst = ""

    def to_html(self):
        return self.html

    def to_md(self):
        return self.md

    def to_rst(self):
        return self.rst
BetterMD/elements/title.py
Outdated
@@ -0,0 +1,27 @@
from typing import Text
🛠️ Refactor suggestion
Clarify typing import confusion.
You're importing Text from the typing module, but you're also importing a Text class from .text on line 5. This creates a namespace conflict as both are identically named but serve different purposes. This confusion could lead to errors.
-from typing import Text
+from typing import List
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.8.2)
1-1: No such file or directory (os error 2)
(E902)
BetterMD/elements/title.py
Outdated
if not isinstance(inner[0], Text) or len(inner) != 1:
    raise ValueError("Title element must contain a single Text element")

return f'title: "{inner[0].to_md()}"'
💡 Verification agent
🧩 Analysis chain
Consider using a more standard Markdown title format.
The current Markdown format title: "text" does not appear to follow standard Markdown conventions for titles, which typically use # for headings or other standard syntax.
🌐 Web query:
What is the standard Markdown syntax for document titles?
💡 Result:
The standard Markdown syntax for document titles uses a single level-one heading (#), which serves as the top-level heading in a document. This practice is widely adopted across platforms and style guides:
Core Syntax
- Use # Title for the document title:
  # Document Title
- A single # represents the H1 heading, equivalent to HTML's <h1> tag[1][8][9]
- Must be the first content after metadata (in documents using YAML front matter)[4][6]
Key Requirements
- Space after #: Always include a space between # and title text for compatibility[1][5][8]:
  # Good Title
  #BAD TITLE
- Singular Usage: Each document should contain only one H1 heading as the title[4][5][6]
- Positioning: Place blank lines before and after the title heading for proper rendering[1][8]:
  <!-- YAML metadata (if present) -->

  # Document Title

  First paragraph content...
Platform-Specific Considerations
Platform | Additional Guidance |
---|---|
Microsoft Learn | Title must appear after YAML metadata block, no HTML headings allowed[4] |
Google Docs | Supports creating H1 via # syntax when Markdown mode is enabled[3] |
GitHub | Automatically uses first H1 heading as page title in generated TOC[9] |
R Markdown | H1 headings after YAML title help maintain proper section numbering[6] |
While some implementations support alternate H1 syntax using underlines (===), the # method is universally recommended for compatibility[8][9]. For documents with embedded metadata (like YAML front matter), the title heading should immediately follow this metadata block[4][6].
Citations:
- 1: https://www.codecademy.com/resources/docs/markdown/headings
- 2: https://experienceleague.adobe.com/en/docs/contributor/contributor-guide/writing-essentials/markdown
- 3: https://support.google.com/docs/answer/12014036
- 4: https://learn.microsoft.com/en-us/contribute/content/markdown-reference
- 5: https://google.github.io/styleguide/docguide/style.html
- 6: https://ucsbcarpentry.github.io/R-markdown/03-headings-lists/index.html
- 7: https://www.markdownguide.org/extended-syntax/
- 8: https://www.markdownguide.org/basic-syntax/
- 9: https://docs.github.com/articles/basic-writing-and-formatting-syntax
Action: Update Markdown Title Formatting
- Replace the current title format:
  return f'title: "{inner[0].to_md()}"'
  with the standard Markdown heading style:
  return f'# {inner[0].to_md()}'
- This change adheres to the widely accepted convention of using a single level-one heading (i.e. # Title) as the document title.
BetterMD/parse/markdown.py
Outdated
elif re.search(self.top_level_tags["h"], line):
    self.end_block()
    self.handle_h(line)
    i += 1
    continue
🛠️ Refactor suggestion
Duplicate check for headers.
You have already handled headers in lines 290–295. Repeating the same condition here (lines 308–313) introduces duplication and possible confusion. Consider removing or consolidating one of these checks.
🧩 Analysis chain
@R5dan I'll conduct a comprehensive review of your "Parser" PR. Let me examine the changes and provide detailed feedback on the parser implementation.
🏁 Script executed:
Length of output: 3401
@R5dan I've reviewed your "Parser" PR in detail. This is a significant enhancement to the Better-MD package that adds bidirectional conversion capabilities between different markup formats.
Key Additions
Recommendations
The architecture looks solid with a clean separation of concerns between different parser types. This will significantly enhance the library's capabilities by enabling conversion from existing HTML and Markdown content.
🐰 Thinking of ways to help test this new functionality
✅ Actions performed
Review triggered.
Docstrings generation was requested by @R5dan.
* #7 (comment)
The following files were modified:
* `BetterMD/__init__.py`
* `BetterMD/elements/a.py`
* `BetterMD/elements/code.py`
* `BetterMD/elements/input.py`
* `BetterMD/elements/symbol.py`
* `BetterMD/elements/table.py`
* `BetterMD/elements/text.py`
* `BetterMD/elements/title.py`
* `BetterMD/html/custom_html.py`
* `BetterMD/markdown/custom_markdown.py`
* `BetterMD/parse/collection.py`
* `BetterMD/parse/html.py`
* `BetterMD/parse/markdown.py`
* `BetterMD/parse/typing.py`
* `BetterMD/rst/custom_rst.py`
Note Generated docstrings for this pull request at #8
Note Generated docstrings for this pull request at #9 |
Note Generated docstrings for this pull request at #10 |
Actionable comments posted: 3
🧹 Nitpick comments (11)
BetterMD/elements/a.py (2)
10-23: Good addition of the verify method with regex pattern matching.
The implementation correctly identifies three common Markdown link formats: inline links, automatic links, and reference links. The regex patterns are well-formed for capturing the intended patterns.
However, the current regex patterns have a few limitations:
- They reject URLs that contain literal spaces (URLs that encode spaces as %20 are unaffected)
- They assume reference links must be immediately followed by their definition
- They are written as ordinary strings rather than raw strings, so escapes such as \[ rely on Python passing unknown escape sequences through
Consider enhancing the regex patterns to handle more edge cases:
-        if re.findall("\[([^\]]+)\]\((https?:\/\/[^\s)]+)\)", text):
+        if re.findall(r"\[([^\]]+)\]\((https?:\/\/[^)]+)\)", text):
             # Case 1: Inline link
             return True
-        elif re.findall("<(https?:\/\/[^\s>]+)>", text):
+        elif re.findall(r"<(https?:\/\/[^>]+)>", text):
             # Case 2: Automatic Links
             return True
-        elif re.findall("\[([^\]]+)\]\[([^\]]+)\]\s*\n?\[([^\]]+)\]:\s*(https?:\/\/[^\s]+)", text):
+        # Split this into two patterns to handle reference links with definitions elsewhere in the document
+        elif re.findall(r"\[([^\]]+)\]\[([^\]]*)\]", text) or re.findall(r"\[([^\]]+)\]:\s*(https?:\/\/[^\s]+)", text):
             # Case 3: Reference Links
             return True
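The three link forms can also be checked with precompiled raw-string patterns. The sketch below is illustrative only: the constant names and `has_link` are hypothetical, the patterns are simplified (HTTP and HTTPS URLs only, no titles or nested brackets), and reference uses and definitions are matched separately so they may appear anywhere in the text.

```python
import re

# Raw strings keep the regex escapes intact; patterns are simplified sketches.
INLINE = re.compile(r"\[([^\]]+)\]\((https?://[^\s)]+)\)")
AUTOLINK = re.compile(r"<(https?://[^\s>]+)>")
REF_USE = re.compile(r"\[([^\]]+)\]\[([^\]]*)\]")
REF_DEF = re.compile(r"^\[([^\]]+)\]:\s*(https?://\S+)", re.MULTILINE)


def has_link(text: str) -> bool:
    """Return True if text contains any of the supported link forms."""
    return any(p.search(text) for p in (INLINE, AUTOLINK, REF_USE, REF_DEF))
```

Using `search` rather than `findall` avoids building a list of every match when only a yes/no answer is needed.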
38-48: Empty class methods need implementation.
The newly added class methods md_refs, rst_refs, and html_refs are currently empty placeholder methods. These methods are likely intended to manage references in different markup formats but don't yet provide any functionality.
I can help implement these methods if you have specific requirements for how references should be handled in each format.
BetterMD/elements/table.py (2)
41-41: Optimize multiple isinstance checks.
The code uses two separate isinstance checks for the same variable. These can be combined for better readability and performance.
- if isinstance(section, THead) or isinstance(section, TBody):
+ if isinstance(section, (THead, TBody)):
🧰 Tools
🪛 Ruff (0.8.2)
41-41: Multiple isinstance calls for section, merge into a single call
Merge isinstance calls for section
(SIM101)
62-62: Remove unused loop variable.
The loop control variable i is not used within the loop body. Consider replacing it with an underscore to indicate it's intentionally unused. If the index is not needed at all, iterating over all_rows directly without enumerate is simpler still.
- for i, (cells, is_header) in enumerate(all_rows):
+ for _, (cells, is_header) in enumerate(all_rows):
🧰 Tools
🪛 Ruff (0.8.2)
62-62: Loop control variable i not used within loop body
Rename unused i to _i
(B007)
BetterMD/parse/markdown.py (3)
1-5: Remove unused imports.
The modules typing and threading are imported but never used in this file.
 import re
-import typing as t
 from .typing import ELEMENT, TEXT
-import threading as th
🧰 Tools
🪛 Ruff (0.8.2)
2-2: typing imported but unused
Remove unused import: typing
(F401)
4-4: threading imported but unused
Remove unused import: threading
(F401)
246-253: Use ternary operator for cell type determination.
A ternary operator would make this code more concise and easier to read.
- if current_section == thead:
-     cell_type = "th"
- else:
-     cell_type = "td"
+ cell_type = "th" if current_section == thead else "td"
🧰 Tools
🪛 Ruff (0.8.2)
246-249: Use ternary operator cell_type = "th" if current_section == thead else "td" instead of if-else-block
Replace if-else-block with cell_type = "th" if current_section == thead else "td"
(SIM108)
353-354: Handle head creation more explicitly.
The line head = self.head or self.create_element("head") assumes self.head is defined. It's safer to explicitly check if the attribute exists before using it.
- head = self.head or self.create_element("head")
+ head = getattr(self, 'head', None) or self.create_element("head")
BetterMD/elements/symbol.py (4)
32-39: Use proper is None comparisons instead of equality operators.
When checking for None, use the identity operator is rather than the equality operator == for better readability and correctness.
 def __init__(self, styles:'dict[str,str]'=None, classes:'list[str]'=None, inner:'list[Symbol]'=None, **props):
-    if styles == None:
+    if styles is None:
         styles = {}
-    if classes == None:
+    if classes is None:
         classes = []
-    if inner == None:
+    if inner is None:
         inner = []
🧰 Tools
🪛 Ruff (0.8.2)
33-33: Comparison to None should be cond is None
Replace with cond is None
(E711)
35-35: Comparison to None should be cond is None
Replace with cond is None
(E711)
37-37: Comparison to None should be cond is None
Replace with cond is None
(E711)
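The correctness argument for `is None` can be seen in a small standalone sketch. `AlwaysEqual` is a hypothetical class, not something from BetterMD; it only demonstrates that `==` can be overridden while identity cannot.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with anything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()
equals_none = obj == None  # noqa: E711 (deliberate, to show the pitfall)
is_none = obj is None      # identity check ignores __eq__
```

Here `equals_none` ends up true even though `obj` is a real object, whereas `is_none` is false, which is the behaviour a default-argument check needs.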
45-52: Use proper is None comparisons in the copy method.
Similar to the __init__ method, use is None instead of == None for better clarity and following Python conventions.
 def copy(self, styles:'dict[str,str]'=None, classes:'list[str]'=None, inner:'list[Symbol]'=None):
-    if inner == None:
+    if inner is None:
         inner = []
-    if styles == None:
+    if styles is None:
         styles = {}
-    if classes == None:
+    if classes is None:
         classes = []
🧰 Tools
🪛 Ruff (0.8.2)
46-46: Comparison to None should be cond is None
Replace with cond is None
(E711)
48-48: Comparison to None should be cond is None
Replace with cond is None
(E711)
50-50: Comparison to None should be cond is None
Replace with cond is None
(E711)
92-98: Improve readability of HTML generation.
The current HTML generation uses a very long and complex f-string that's difficult to read and maintain. Consider breaking it down into smaller, more manageable parts.
 def to_html(self, indent=1) -> 'str':
     if isinstance(self.html, CustomHTML):
         return self.html.to_html(self.children, self, self.parent)
     inner_HTML = f"\n{" "*indent}".join([e.to_html(indent+1) if not (len(self.children) == 1 and self.children[0].html == "text") else e.to_html(0) for e in self.children])
-    return f"<{self.html}{" " if self.styles or self.classes or self.props else ""}{f"class={'"'}{' '.join(self.classes) or ''}{'"'}" if self.classes else ""}{" " if (self.styles or self.classes) and self.props else ""}{f"style={'"'}{' '.join([f'{k}:{v}' for k,v in self.styles.items()]) or ""}{'"'}" if self.styles else ""}{" " if (self.styles or self.classes) and self.props else ""}{' '.join([f'{k}={'"'}{v}{'"'}' if v != "" else f'{k}' for k,v in self.props.items()])}{f">{"\n" if len(self.children) > 1 else ""}{inner_HTML}{"\n" if len(self.children) > 1 else ""}</{self.html}>" if inner_HTML else f" />"}"
+    # Build the HTML components step by step
+    attrs = []
+    if self.classes:
+        attrs.append(f'class="{" ".join(self.classes)}"')
+    if self.styles:
+        attrs.append(f'style="{" ".join([f"{k}:{v}" for k, v in self.styles.items()])}"')
+    for k, v in self.props.items():
+        attrs.append(f'{k}="{v}"' if v else k)
+
+    attrs_str = " " + " ".join(attrs) if attrs else ""
+
+    if inner_HTML:
+        if len(self.children) > 1:
+            return f"<{self.html}{attrs_str}>\n{inner_HTML}\n</{self.html}>"
+        else:
+            return f"<{self.html}{attrs_str}>{inner_HTML}</{self.html}>"
+    else:
+        return f"<{self.html}{attrs_str} />"
🧰 Tools
🪛 Ruff (0.8.2)
97-97: f-string without any placeholders
Remove extraneous f prefix
(F541)
155-158: Simplify the __str__ method with a similar approach to to_html.
The __str__ method has the same complexity issues as to_html. Consider refactoring it for better readability and maintainability.
 def __str__(self):
-    return f"<{self.html}{" " if self.styles or self.classes or self.props else ""}{f"class={'"'}{' '.join(self.classes) or ''}{'"'}" if self.classes else ""}{" " if (self.styles or self.classes) and self.props else ""}{f"style={'"'}{' '.join([f'{k}:{v}' for k,v in self.styles.items()]) or ""}{'"'}" if self.styles else ""}{" " if (self.styles or self.classes) and self.props else ""}{' '.join([f'{k}={'"'}{v}{'"'}' if v != "" else f'{k}' for k,v in self.props.items()])}{f">{"\n" if len(self.children) > 1 else ""}{"\n" if len(self.children) > 1 else ""}{len(self.children)}</{self.html}>"}"
+    # Build the representation components
+    attrs = []
+    if self.classes:
+        attrs.append(f'class="{" ".join(self.classes)}"')
+    if self.styles:
+        attrs.append(f'style="{" ".join([f"{k}:{v}" for k, v in self.styles.items()])}"')
+    for k, v in self.props.items():
+        attrs.append(f'{k}="{v}"' if v else k)
+
+    attrs_str = " " + " ".join(attrs) if attrs else ""
+
+    return f"<{self.html}{attrs_str}>{len(self.children)}</{self.html}>"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- BetterMD/__init__.py (1 hunks)
- BetterMD/elements/a.py (1 hunks)
- BetterMD/elements/symbol.py (4 hunks)
- BetterMD/elements/table.py (3 hunks)
- BetterMD/elements/title.py (1 hunks)
- BetterMD/parse/markdown.py (1 hunks)
🧰 Additional context used
🧬 Code Definitions (5)
BetterMD/__init__.py (2)
- BetterMD/elements/symbol.py (4): Symbol (8-158), parse (119-135), from_html (114-116), from_md (138-140)
- BetterMD/parse/markdown.py (2): parse (276-356), MDParser (6-356)
BetterMD/elements/title.py (2)
- BetterMD/elements/symbol.py (3): Symbol (8-158), to_md (99-104), to_rst (106-111)
- BetterMD/elements/a.py (4): MD (6-23), to_md (7-8), RST (26-28), to_rst (27-28)
BetterMD/elements/a.py (3)
- BetterMD/rst/custom_rst.py (3): CustomRst (9-18), verify (18-18), to_rst (14-14)
- BetterMD/markdown/custom_markdown.py (3): CustomMarkdown (9-18), to_md (14-14), verify (18-18)
- BetterMD/elements/symbol.py (3): to_md (99-104), to_rst (106-111), Symbol (8-158)
BetterMD/elements/table.py (3)
- BetterMD/elements/symbol.py (3): to_md (99-104), to_rst (106-111), Symbol (8-158)
- BetterMD/markdown/custom_markdown.py (2): to_md (14-14), CustomMarkdown (9-18)
- BetterMD/rst/custom_rst.py (2): CustomRst (9-18), to_rst (14-14)
BetterMD/elements/symbol.py (2)
- BetterMD/parse/markdown.py (2): parse (276-356), MDParser (6-356)
- BetterMD/__init__.py (2): from_html (5-6), from_md (8-9)
🪛 Ruff (0.8.2)
BetterMD/__init__.py
3-3: .parse.Collection imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: .parse.HTMLParser imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: .parse.MDParser imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: .parse.RSTParser imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
BetterMD/parse/markdown.py
2-2: typing imported but unused. Remove unused import: typing (F401)
4-4: threading imported but unused. Remove unused import: threading (F401)
16-16: Dictionary key literal "br" repeated (F601)
246-249: Use ternary operator cell_type = "th" if current_section == thead else "td" instead of if-else-block. Replace if-else-block with cell_type = "th" if current_section == thead else "td" (SIM108)
BetterMD/elements/table.py
41-41: Multiple isinstance calls for section, merge into a single call. Merge isinstance calls for section (SIM101)
62-62: Loop control variable i not used within loop body. Rename unused i to _i (B007)
BetterMD/elements/symbol.py
33-33: Comparison to None should be cond is None. Replace with cond is None (E711)
35-35: Comparison to None should be cond is None. Replace with cond is None (E711)
37-37: Comparison to None should be cond is None. Replace with cond is None (E711)
46-46: Comparison to None should be cond is None. Replace with cond is None (E711)
48-48: Comparison to None should be cond is None. Replace with cond is None (E711)
50-50: Comparison to None should be cond is None. Replace with cond is None (E711)
97-97: f-string without any placeholders. Remove extraneous f prefix (F541)
🔇 Additional comments (15)
BetterMD/elements/a.py (3)
33-33: Use a class variable for shared references with caution.
The refs = {} dictionary here is shared among all instances of class A. Any modifications to refs on one instance will affect other instances. Consider changing the design if different instances of A should have separate dictionaries.
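The sharing behaviour described for refs can be seen in a minimal sketch. The class names SharedRefs and OwnRefs are hypothetical and stand in for A; they are not part of BetterMD.

```python
class SharedRefs:
    # Class-level dict: a single object shared by every instance.
    refs = {}


class OwnRefs:
    def __init__(self):
        # Instance-level dict: each instance gets its own dictionary.
        self.refs = {}


a, b = SharedRefs(), SharedRefs()
a.refs["home"] = "https://example.com"
shared_leak = "home" in b.refs  # b sees the entry written through a

c, d = OwnRefs(), OwnRefs()
c.refs["home"] = "https://example.com"
own_leak = "home" in d.refs  # d is unaffected by c
```

If the shared registry is intentional (for example, one reference table per document), the class-level form is fine; otherwise creating the dictionary in __init__ avoids the cross-instance coupling.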
35-36: Simplified HTML representation from object to string.
Changing html from an instance of HTML class to a string literal "a" reflects an architectural shift toward simpler element representation. This is a good simplification if the complex HTML class functionality is no longer needed.
6-6: ✅ Verification successful
Change of generic type parameter may affect type checking.
The class now inherits from CustomMarkdown without the generic type parameter ['A']. This simplifies the class hierarchy but may impact static type checking if the generic parameter was used elsewhere in the code.
🏁 Script executed:
#!/bin/bash
# Check for other uses of generic type parameter in CustomMarkdown
rg "CustomMarkdown\[.+\]" -A 2 -B 2
Length of output: 37
Static type validation appears unaffected by the generic parameter removal.
I searched for instances of generic usage (e.g. CustomMarkdown[...]) across the repository and found no matches. Consequently, the removal of the generic parameter ['A'] in the class definition:
class MD(CustomMarkdown):
does not seem to impact static type checking in the current codebase. If this change was unexpected, I would recommend a manual re-check using your type checker to ensure no hidden dependencies exist.
BetterMD/__init__.py (2)
5-9: Added convenient public API functions for format conversion.
The from_html and from_md functions provide a clean public API for converting from HTML and Markdown to the library's internal representation. This simplifies usage for consumers of the library by removing the need to directly use the Symbol class.
2-3: ❓ Verification inconclusive
Explicit import of Symbol addresses previous review comment.
The explicit import of Symbol here addresses a previous review comment and improves clarity. However, the static analysis hints indicate that the imported parsers (Collection, HTMLParser, MDParser, RSTParser) are not used directly in this file.
🏁 Script executed:
#!/bin/bash
# Check for references to the parser classes elsewhere in the codebase
rg -A 1 "from BetterMD import (Collection|HTMLParser|MDParser|RSTParser)"
Length of output: 75
Re-exporting Parser Classes as Part of the Public API – Please Verify Usage
The explicit import of Symbol successfully addresses the previous concern and clarifies the module’s intent. Although the static analysis indicates that the parser classes (Collection, HTMLParser, MDParser, RSTParser) are not used directly in BetterMD/__init__.py, they appear to be re-exported as part of the public API. It would be beneficial to manually verify that these classes are indeed consumed by external modules or by users of the package, and that their export is intentional.
- Confirm that the parser classes are required to be exposed as part of the public API.
- If these re-exports are not used externally, consider whether they should be removed or refactored.
🧰 Tools
🪛 Ruff (0.8.2)
3-3: .parse.Collection imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: .parse.HTMLParser imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: .parse.MDParser imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: .parse.RSTParser imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
BetterMD/elements/title.py (3)
8-9: Good defensive programming with null check.
The validation logic correctly checks if the list is empty before attempting to access its first element, preventing potential IndexError exceptions. The validation also ensures that the content is a single Text element, maintaining expected structure.
20-23: Simple and clean Symbol subclass implementation.
The Title class follows the established pattern for Symbol subclasses, setting appropriate renderers for different formats. The HTML representation as a simple string "title" aligns with the simplified approach seen in other elements.
13-18: ❓ Verification inconclusive
The RST syntax for title may need verification.
The :title: directive isn't a standard reStructuredText construct for document titles. Standard RST typically uses overlines and underlines for section titles.
🌐 Web query:
What is the standard reStructuredText syntax for document titles?
💡 Result:
In reStructuredText, document titles are formatted using a uniquely-adorned top-level section title with matching overlines and underlines. Unlike standard section headers, titles are recognized by their position and unique adornment style at the beginning of the document:
Syntax:
Use the same non-alphanumeric character (e.g., =, #, or *) for both the overline and underline, extending at least as long as the title text. Example:
==================
Document Title
==================
Key rules:
- The title must be the first section in the document[1][2][4].
- Subtitle syntax follows immediately after, using a distinct adornment style[2][5]:
--------------
Document Subtitle
--------------
- Avoid mixing title styles within the document hierarchy[4][7].
Common conventions (though not enforced) suggest using # or = for titles and * for subtitles[3][4][12]. The parser assigns levels based on the order of encountered styles[1][7], so consistency is critical. Overline-only or mismatched underlines/overlines will cause errors[4][8].
For example, this structure is widely accepted:
####################
Main Title (H1)
####################
*************
Subtitle (H2)
*************
Section (H3)
============
Citations:
- 1: https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html
- 2: https://docutils.sourceforge.io/docs/user/rst/quickstart.html
- 3: https://docs.beagleboard.org/intro/contribution/rst-cheat-sheet.html
- 4: https://documatt.com/restructuredtext-reference/element/section.html
- 5: https://docs.anaconda.com/restructuredtext/
- 6: https://developer.lsst.io/restructuredtext/style.html
- 7: https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html
- 8: https://www.cometvisu.org/CometVisu/en/0.11/manual/colab/doc/rst.html
- 9: https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html
- 10: https://docutils.sourceforge.io/docs/ref/rst/directives.html
- 11: https://docs.openstack.org/doc-contrib-guide/rst-conv/titles.html
- 12: https://canonical-documentation-with-sphinx-and-readthedocscom.readthedocs-hosted.com/style-guide/
- 13: https://docutils.sourceforge.io/docs/user/rst/quickref.html
Attention: Verify RST Title Syntax
Please review the title formatting in BetterMD/elements/title.py (lines 13–18). The implementation currently returns a string in the format:
return f":title: {inner[0].to_rst()}"
Standard reStructuredText does not use a :title: directive for document titles. Instead, document titles should be specified using an overline and matching underline (using the same non-alphanumeric character) that spans the title text. For example:
==================
Document Title
==================
Kindly confirm whether the current implementation is an intentional custom extension, or whether it should be updated to adhere to standard RST syntax.
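If the standard form is wanted, the overline and underline can be generated from the title text. The following is a minimal sketch only: rst_title is a hypothetical helper, not part of BetterMD, and it assumes the title is a single line of plain text.

```python
def rst_title(text: str, char: str = "=") -> str:
    """Return text as an RST title with a matching overline and underline."""
    # The adornment must be at least as long as the title text.
    line = char * len(text)
    return f"{line}\n{text}\n{line}"


title = rst_title("Document Title")
```

Using len(text) keeps the adornment exactly as wide as the title, which satisfies the "at least as long" rule; wide or combining characters would need a display-width calculation instead.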
BetterMD/elements/table.py (5)
8-29: Improved table structure handling with THead and TBody separation.
The refactored implementation properly handles table structure by processing header and body sections separately. This makes the code more maintainable and better reflects HTML table semantics.
32-74: Well-designed two-pass table rendering approach.
The two-pass approach for RST tables is excellent. First collecting all cell widths ensures consistent column sizing, then generating the output with proper spacing creates professional-looking tables. This approach resolves alignment issues that would otherwise occur with varying content lengths.
🧰 Tools
🪛 Ruff (0.8.2)
41-41: Multiple isinstance calls for section, merge into a single call. Merge isinstance calls for section (SIM101)
62-62: Loop control variable i not used within loop body. Rename unused i to _i (B007)
76-106: Good implementation of Markdown table headers.
The THeadMD implementation follows best practices by first collecting all rows and calculating column widths before generating the final output. This ensures proper alignment of the separator row with the content.
139-145: Improved error handling with fallback mechanism.
The TdRST implementation now gracefully handles unexpected content by falling back to joining the elements instead of raising an error. This makes the code more robust when processing various input formats.
177-185: Comprehensive table structure with new THead and TBody classes.
Adding dedicated THead and TBody classes improves the structural representation of tables and enables more precise rendering control. This matches HTML table semantics and allows for better styling and formatting options.
BetterMD/parse/markdown.py (2)
11-11
: Potential mismatch for multi-line code blocks.The pattern
^```([A-Za-z]*)[^.](?:([^
]*)[^.])?```$may not reliably capture multi-line code blocks, as it lacks flags like
re.DOTALL` to handle newlines within the captured groups. This could lead to incomplete parsing of complex code samples.#!/bin/bash # Test the code block regex against a real-world example cat << 'EOF' > example_code_block.md ```python def hello(): print("Hello, world!") # This has multiple empty lines return TrueEOF
Use Python to test if the regex will match this example
python3 -c "
import re
pattern = r'^([A-Za-z]*)[^.](?:([^`]*)[^.])?
$'
with open('example_code_block.md', 'r') as f:
content = f.read()
match = re.match(pattern, content, re.DOTALL)
if match:
print('Match found!')
print('Language:', match.group(1))
print('Content length:', len(match.group(2)) if match.group(2) else 0)
else:
print('No match found with current regex')Test a potentially better pattern
better_pattern = r'^
([A-Za-z]*)\\n(.*?)\\n
$'
match = re.match(better_pattern, content, re.DOTALL)
if match:
print('\nMatch found with better pattern!')
print('Language:', match.group(1))
print('Content length:', len(match.group(2)))
else:
print('\nNo match found with better pattern either')
"--- `309-314`: **Duplicate check for headers.** You have already handled headers in lines 290-295. Repeating the same condition here (lines 309-314) introduces duplication and possible confusion. Consider removing or consolidating one of these checks. ```diff - elif re.search(self.top_level_tags["h"], line): - self.end_block() - self.handle_h(line) - i += 1 - continue -
BetterMD/elements/title.py
Outdated
class MD(CustomMarkdown):
    def to_md(self, inner: list[Symbol], symbol: Symbol, parent: Symbol, **kwargs) -> str:
        if not inner or not isinstance(inner[0], Text) or len(inner) != 1:
            raise ValueError("Title element must contain a single Text element")

        return f'title: "{inner[0].to_md()}"'
💡 Verification agent
🧩 Analysis chain
Consider using standard Markdown heading syntax.
The current Markdown title format title: "text" isn't standard Markdown. Most Markdown processors expect headings to use hash symbols (#).
Implement standard Markdown heading syntax:
- return f'title: "{inner[0].to_md()}"'
+ return f'# {inner[0].to_md()}'
🌐 Web query:
What is the standard Markdown syntax for document titles?
💡 Result:
The standard Markdown syntax for document titles uses level 1 headings with a single # symbol followed by a space and the title text. Key conventions include:
Basic Syntax
# Document Title
- Must start at the beginning of a line
- Requires a space between # and the title text for compatibility[1][8][10]
Best Practices
- Single H1 per document: Files should contain only one top-level title[4][5]
- Positioning: Place it as the first content after metadata (if present)[4][5]
- Styling:
- Use title case capitalization[5]
- Avoid Markdown formatting within titles[5]
- Include blank lines before/after the heading[1][8]
Platform-Specific Variations
Platform | Requirement | Example |
---|---|---|
Microsoft Learn | First content after YAML metadata[4] | # Title |
Google Docs | Matches filename[5] | # Style Guide |
GitHub | Automatically generates anchors[10] | # Basic Syntax |
For R Markdown, titles are declared in YAML metadata instead of standard Markdown headings[6]. Some editors like Google Docs allow alternative heading syntax with == underlines, but the # method remains universally compatible[1][8].
Citations:
- 1: https://www.codecademy.com/resources/docs/markdown/headings
- 2: https://experienceleague.adobe.com/en/docs/contributor/contributor-guide/writing-essentials/markdown
- 3: https://support.google.com/docs/answer/12014036
- 4: https://learn.microsoft.com/en-us/contribute/content/markdown-reference
- 5: https://google.github.io/styleguide/docguide/style.html
- 6: https://ucsbcarpentry.github.io/R-markdown/03-headings-lists/index.html
- 7: https://www.markdownguide.org/extended-syntax/
- 8: https://www.markdownguide.org/basic-syntax/
- 9: https://www.markdownguide.org/cheat-sheet/
- 10: https://docs.github.com/articles/basic-writing-and-formatting-syntax
Adopt Standard Markdown Heading Syntax
Please update the return statement in BetterMD/elements/title.py to use the recognised level 1 heading (a single # followed by a space) instead of the current non-standard format. For example:
- Replace: return f'title: "{inner[0].to_md()}"'
- With: return f'# {inner[0].to_md()}'
This change ensures standard Markdown compliance by using a heading as the document title.
Actionable comments posted: 2
🧹 Nitpick comments (3)
BetterMD/elements/title.py (1)
6-11: Consider adopting a standard Markdown title format or removing duplication
Currently, the method returns both a metadata line (title: "...") and a heading (# "..."), which can be confusing or redundant. If you intend to store title metadata, consider a separate mechanism (such as YAML front matter). If you prefer a proper heading in Markdown, a single line like # Title without quotes is more typical.
-return f'title: "{inner[0].to_md()}"\n# "{inner[0].to_md()}"'
+return f'# {inner[0].to_md()}'
BetterMD/parse/markdown.py (2)
2-4: Remove unused imports
The imports for typing as t and threading as th appear unused, as your code relies on direct type hints with built-in classes and does not reference threading. Removing them will reduce clutter.
-import typing as t
-import threading as th
🧰 Tools
🪛 Ruff (0.8.2)
2-2: typing imported but unused. Remove unused import: typing (F401)
4-4: threading imported but unused. Remove unused import: threading (F401)
245-248: Use a ternary operator for compactness
Replacing the if-else block with a single-liner makes the code shorter without sacrificing readability:
- if current_section == thead:
-     cell_type = "th"
- else:
-     cell_type = "td"
+ cell_type = "th" if current_section == thead else "td"
🧰 Tools
🪛 Ruff (0.8.2)
245-248: Use ternary operator cell_type = "th" if current_section == thead else "td" instead of if-else-block. Replace if-else-block with cell_type = "th" if current_section == thead else "td" (SIM108)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- BetterMD/elements/title.py (1 hunks)
- BetterMD/parse/markdown.py (1 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
BetterMD/elements/title.py (2)
- BetterMD/elements/symbol.py (3): Symbol (8-158), to_md (99-104), to_rst (106-111)
- BetterMD/elements/a.py (4): MD (6-23), to_md (7-8), RST (26-28), to_rst (27-28)
🪛 Ruff (0.8.2)
BetterMD/parse/markdown.py
2-2: typing imported but unused. Remove unused import: typing (F401)
4-4: threading imported but unused. Remove unused import: threading (F401)
245-248: Use ternary operator cell_type = "th" if current_section == thead else "td" instead of if-else-block. Replace if-else-block with cell_type = "th" if current_section == thead else "td" (SIM108)
🔇 Additional comments (4)
BetterMD/elements/title.py (3)
1-4: Imports look fine
No issues identified in these imports. Relative import usage is consistent with the rest of the project.
20-24: Ensure consistent usage and testing for 'Title' class
Everything appears logically consistent. Consider adding tests for multi-word or multiline titles, to confirm that your parser and output formats handle them cleanly.
13-18: ❓ Verification inconclusive
Recheck reStructuredText usage for titles
Although :title: can serve as a directive-like approach, typical reST headings use underline styles (e.g., ====). If a metadata line is intended, ensure downstream processes handle it as expected.
🏁 Script executed:
#!/bin/bash
# Verify references in project scanning for ':title:' usage outside this file
rg ":title:" -A 3
Length of output: 212
Action Required: Reassess reStructuredText Title Directive
The use of :title: in the BetterMD/elements/title.py file currently deviates from standard reST heading styles, which typically utilise underlining (e.g. ====). Although this metadata-like approach appears to be confined to this file, as confirmed by the reference scan, it is important to verify that any downstream processes rely on and correctly interpret this format. Please review whether this output is intended as a metadata line or if a conventional heading format would be more appropriate in this context.
BetterMD/parse/markdown.py (1)
10-10: Regex may fail on multi-line code blocks
The pattern ^```([A-Za-z]*)[^.](?:([^`]*)[^.])?```$ could break for multi-line code. A single-regex approach might be unreliable. Consider line-by-line scanning or using a multiline-aware pattern with DOTALL.
BetterMD/parse/markdown.py
Outdated
def handle_code(self, text: 'list[str]'):
    self.end_block()
    match = re.match(self.top_level_tags["code"], "\n".join(text))
    assert match is not None, "Code block not found"

    lang = match.group(1)
    content = match.group(2)
🛠️ Refactor suggestion
Handle multi-line code blocks more robustly
This joins the entire input into one string and then applies re.match. If your code blocks span multiple lines, this approach may fail or capture content incorrectly. A line-by-line approach with explicit detection of the start and end markers often proves more reliable.
BetterMD/parse/markdown.py
Outdated
elif re.search(self.top_level_tags["h"], line):
    self.end_block()
    self.handle_h(line)
    i += 1
    continue
🛠️ Refactor suggestion
Duplicate heading check
You handle headings at lines 290–294 and again at 308–312. Consolidating this logic prevents confusion or inconsistencies in the final output.
Actionable comments posted: 0
🧹 Nitpick comments (5)
BetterMD/parse/markdown.py (5)
12-12: Using br as r"\s" might cause unintended matches.
Relying on \s for line breaks could match single spaces or any whitespace, potentially triggering unwanted breaks. Consider refining it to detect actual empty lines or explicit break markers.
97-101: Avoid relying on an assertion for code block detection.
If the regex unexpectedly fails, an AssertionError will halt parsing. Consider gracefully handling unmatched code blocks to provide a more user-friendly error or fallback.
110-115: Clarify whitespace line handling in handle_br.
Currently, only empty lines produce <br/>, while lines with spaces do not. This behaviour might conflict with the defined br pattern and lead to inconsistent rendering.
241-244: Use a ternary operator for simpler logic.
Following the static analysis hint, you can replace the if-else block with:
- if current_section == thead:
-     cell_type = "th"
- else:
-     cell_type = "td"
+ cell_type = "th" if current_section == thead else "td"
🧰 Tools
🪛 Ruff (0.8.2)
241-244: Use ternary operator cell_type = "th" if current_section == thead else "td" instead of if-else-block. Replace if-else-block with cell_type = "th" if current_section == thead else "td" (SIM108)
268-268: Handle empty or no title text.
If match.group(1) is None, you might generate a <title> element containing None. Consider defaulting to an empty string for a cleaner experience.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- BetterMD/parse/markdown.py (1 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
BetterMD/parse/markdown.py (2)
- BetterMD/parse/typing.py (3): ELEMENT (8-12), TEXT (3-6), parse (16-16)
- BetterMD/parse/html.py (3): reset (8-14), parse (16-193), handle_text (213-218)
🪛 Ruff (0.8.2)
BetterMD/parse/markdown.py
241-244: Use ternary operator cell_type = "th" if current_section == thead else "td" instead of if-else-block. Replace if-else-block with cell_type = "th" if current_section == thead else "td" (SIM108)
🔇 Additional comments (2)
BetterMD/parse/markdown.py (2)
7-7: Potential mismatch for multi-line code blocks.
This pattern might fail to capture multi-line code blocks correctly without using flags like re.DOTALL, leading to partial parsing.
103-103: Confirm that match.group(2) is not None.
A code block with no captured content might return None, which can lead to an error or a text node containing None. Validate or default to an empty string:
- content = match.group(2)
+ content = match.group(2) or ""
Actionable comments posted: 28
🧹 Nitpick comments (122)
BetterMD/elements/abbr.py (2)
3-4: Class definition is appropriate, but lacks documentation.
The Abbr class properly inherits from Symbol and defines the title property which is standard for HTML abbr elements. Consider adding a docstring to explain the purpose of this class and how it functions within the BetterMD framework.
 class Abbr(Symbol):
+    """
+    Represents an HTML abbreviation element (<abbr>).
+
+    The title property typically contains the full description of the abbreviation.
+    """
     prop_list = ["title"]
6-8: Format representations are defined, but could use clarification.
The class attributes define how this element is represented in different markup formats. While HTML is correctly set to "abbr", the empty strings for md and rst might benefit from a comment explaining why there's no direct equivalent in those formats.
md = ""  # No direct equivalent in Markdown
html = "abbr"
rst = ""  # No direct equivalent in reStructuredText
BetterMD/typing.py (3)
3-3: Type annotations could be more specific
The current definition uses bare list and dict types, which are not as type-safe as they could be. Consider using more specific annotations like list[typing.Any] and dict[typing.Any, typing.Any] to better indicate these are containers that can hold any type.
-ATTR_TYPES = t.Union[str, bool, int, float, list, dict]
+ATTR_TYPES = t.Union[str, bool, int, float, t.List[t.Any], t.Dict[t.Any, t.Any]]
5-8: Improve type annotation format for better maintainability
There are a few style improvements that could make this type definition more maintainable:
- The type annotations inside the TypedDict are strings rather than direct types, which works because of forward references but could be clearer.
- The TypedDict is named "ATTRS", the same as the variable it's being assigned to, which could lead to confusion.
Here's an improved version:
-ATTRS = t.Union[t.TypedDict("ATTRS", {
-    "style": 'dict[str, ATTR_TYPES]',
-    "class": 'list[str]'
-}), 'dict[str, ATTR_TYPES]']
+class ElementAttributes(t.TypedDict, total=False):
+    style: t.Dict[str, ATTR_TYPES]
+    class_: t.List[str]
+
+ATTRS = t.Union[ElementAttributes, t.Dict[str, ATTR_TYPES]]
This approach:
- Creates a properly named TypedDict class
- Uses direct type annotations rather than strings
- Uses total=False to indicate not all fields are required
- Renames "class" to "class_" to avoid Python keyword conflicts
1-8: Consider using Python's newer typing features
This file implements important type definitions, but could benefit from newer typing features if you're using Python 3.9+:
- Union types can be expressed with the | operator in Python 3.10+
- TypedDict supports inheritance and documentation
- Consider adding docstrings to explain the purpose of these types
Example for Python 3.10+:
from typing import TypedDict, Dict, List, Any, Union

# Type for attribute values
ATTR_TYPES = str | bool | int | float | List[Any] | Dict[Any, Any]

class BaseAttributes(TypedDict, total=False):
    """Base attributes that can be applied to HTML elements."""
    style: Dict[str, ATTR_TYPES]
    class_: List[str]

# Full attributes type that can be either the specific BaseAttributes or any dictionary
ATTRS = BaseAttributes | Dict[str, ATTR_TYPES]
BetterMD/parse/typing.py (1)
3-3: Remove unused import
The static analysis tool has correctly identified that ATTR_TYPES is imported but never used in this file. Consider removing this unused import to keep the codebase clean.
-from ..typing import ATTR_TYPES, ATTRS
+from ..typing import ATTRS
🧰 Tools
🪛 Ruff (0.8.2)
3-3: ..typing.ATTR_TYPES imported but unused. Remove unused import: ..typing.ATTR_TYPES (F401)
BetterMD/elements/nav.py (1)
3-6: Add a docstring for the Nav class
The class implementation is correct, but it would benefit from a docstring explaining its purpose and how it functions within the Symbol hierarchy. This would help other developers understand how to use it properly.
 class Nav(Symbol):
+    """
+    Symbol representing an HTML navigation element.
+
+    Has no direct equivalent in Markdown or reStructuredText.
+    """
     md = ""
     html = "nav"
     rst = ""
BetterMD/elements/head.py (1)
4-5: Document the purpose of prop_list and reconsider "profile" property
The added prop_list attribute needs documentation to explain its purpose and usage within the Symbol hierarchy. Additionally, be aware that the "profile" attribute is obsolete in HTML5. If this library targets modern HTML, consider updating the properties list or documenting this limitation.
 class Head(Symbol):
-    prop_list = ["profile"]
+    """
+    Symbol representing an HTML head element.
+
+    Has no direct equivalent in Markdown or reStructuredText.
+    """
+    prop_list = ["profile"]  # Note: 'profile' attribute is obsolete in HTML5
     md = ""
     html = "head"
     rst = ""
BetterMD/elements/u.py (1)
3-6: Add docstring and consider Markdown/RST equivalents
The class implementation is correct, but would benefit from a docstring. Also consider that while basic Markdown doesn't have an underline syntax, some Markdown flavours support HTML tags or have extensions for underline. Similarly, RST can represent underlined text through custom roles.
 class U(Symbol):
+    """
+    Symbol representing an HTML underline element (<u>).
+
+    Standard Markdown and reStructuredText don't have direct equivalents,
+    though some flavours may support inline HTML or extensions.
+    """
     html = "u"
     md = ""
     rst = ""
BetterMD/elements/mark.py (1)
3-6
: Consider adding documentation to explain element conversion behaviour
The empty strings for the md and rst attributes suggest there might not be direct equivalents in Markdown or reStructuredText for the HTML <mark> element. It would be helpful to add a docstring explaining:
- The purpose of this class
- How content within this element is handled during conversion to Markdown and RST
- Any special parsing considerations
class Mark(Symbol): + """Represents the HTML <mark> element used for highlighting text. + + No direct equivalent exists in Markdown or reStructuredText. + During conversion, the highlighting effect may be lost but the content is preserved. + """ md = "" html = "mark" rst = ""BetterMD/elements/center.py (1)
3-6
: Add documentation and consider HTML5 compatibility notesThe
<center>
element is deprecated in HTML5. It would be beneficial to add documentation explaining:
- The purpose of this class
- The deprecated status of this element in HTML5
- How content within this element is handled during conversion to Markdown and RST
- Potential recommended alternatives
class Center(Symbol): + """Represents the HTML <center> element used for centering content. + + Note: This element is deprecated in HTML5. Consider using CSS text-align + property in modern web development. + + No direct equivalent exists in Markdown or reStructuredText. + During conversion, the centering effect may be lost but content is preserved. + """ md = "" html = "center" rst = ""BetterMD/elements/hgroup.py (1)
3-6
: Add documentation and note on element status
The <hgroup> element has a complex history in HTML specifications: it was dropped from the W3C HTML5 Recommendation but remains part of the WHATWG HTML Living Standard. Consider adding documentation that:
- Explains the purpose of this class
- Notes the element's status in current HTML specifications
- Describes how this element is handled during conversion to Markdown and RST
class HGroup(Symbol): + """Represents the HTML <hgroup> element used for grouping heading elements. + + Note: This element has had varying support across HTML specifications. + + No direct equivalent exists in Markdown or reStructuredText. + During conversion, the grouping semantics may be lost but content is preserved. + """ md = "" html = "hgroup" rst = ""BetterMD/elements/section.py (1)
3-6
: Consider adding documentation to explain semantic conversionThe
<section>
element is an important structural element in HTML5. It would be valuable to add documentation explaining:
- The purpose of this class
- How semantic sectioning is handled during conversion to Markdown and RST
- Any special considerations for nested sections
class Section(Symbol): + """Represents the HTML <section> element used to define sections in a document. + + The <section> element represents a standalone section which doesn't have a more + specific semantic element to represent it. + + No direct equivalent exists in Markdown or reStructuredText, though section + content may be preserved with appropriate heading structure during conversion. + """ md = "" html = "section" rst = ""BetterMD/elements/article.py (2)
1-6
: Consider adding docstring documentationThe
Article
class implementation is concise and follows the project's pattern for HTML elements, but lacks documentation. A docstring would help users understand how this semantic HTML element should be used within the BetterMD framework.from .symbol import Symbol class Article(Symbol): + """ + Represents an HTML <article> element. + + The <article> element specifies independent, self-contained content that can be + distributed and reused independently. There is no direct equivalent in Markdown + or reStructuredText. + """ md = "" html = "article" rst = ""
4-6
: Consider implementing custom conversion methods for Markdown and reStructuredTextWhile there's no direct equivalent for
<article>
in Markdown or reStructuredText, you might want to implement custom conversion methods to handle the content appropriately when converting between formats.For example, you could represent an article as a section with a title in Markdown, or use a specific admonition in reStructuredText.
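The section-with-a-title idea above could look roughly like the following. This is only a sketch: `article_to_md` and `article_to_rst` are hypothetical helper names, not part of BetterMD, and a real implementation would plug into whatever conversion hooks `Symbol` exposes.

```python
def article_to_md(title: str, body: str, level: int = 2) -> str:
    """Render an article as a Markdown heading followed by its body."""
    return f"{'#' * level} {title}\n\n{body.strip()}\n"


def article_to_rst(title: str, body: str, underline: str = "-") -> str:
    """Render an article as an RST section: title, matching underline, body."""
    return f"{title}\n{underline * len(title)}\n\n{body.strip()}\n"
```

The RST underline is built from the title length because reStructuredText requires the underline to be at least as long as the title text.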
BetterMD/elements/address.py (2)
1-6
: Consider adding docstring documentationThe
Address
class implementation is correct but lacks documentation. A docstring would help users understand the purpose and usage of this semantic HTML element within the BetterMD framework.from .symbol import Symbol class Address(Symbol): + """ + Represents an HTML <address> element. + + The <address> element provides contact information for a person or organisation. + There is no direct equivalent in Markdown or reStructuredText. + """ md = "" html = "address" rst = ""
4-6
: Consider implementing formatting for Markdown and reStructuredText conversionWhile there's no direct tag equivalent for
<address>
in Markdown or reStructuredText, you might want to implement custom formatting (like italics or a specific block style) to visually distinguish address information when converting between formats.BetterMD/elements/search.py (1)
1-6
: Consider adding docstring documentationThe
Search
class implementation is concise but lacks documentation. A docstring would help users understand how this HTML5 element should be used within the BetterMD framework.from .symbol import Symbol class Search(Symbol): + """ + Represents an HTML <search> element. + + The <search> element represents a part of a document or application + that contains tools for searching. There is no direct equivalent in + Markdown or reStructuredText. + """ md = "" html = "search" rst = ""BetterMD/elements/header.py (2)
1-6
: Missing docstring for the Header classThe Header class lacks a docstring explaining its purpose and how it represents header elements across different markup formats. Adding documentation would improve code maintainability and help other developers understand how to use this class properly.
from .symbol import Symbol class Header(Symbol): + """ + Represents an HTML header element and its equivalent in other markup formats. + + Attributes: + md (str): Markdown representation (currently not implemented). + html (str): HTML tag name. + rst (str): reStructuredText representation (currently not implemented). + """ md = "" html = "header" rst = ""
4-6
: Consider implementing Markdown and reStructuredText equivalents
The md and rst attributes are empty strings, suggesting that their representations aren't implemented yet. For a complete representation, consider adding appropriate Markdown and reStructuredText equivalents for header elements.
Would you like me to suggest implementations for the Markdown and reStructuredText representations of headers?
BetterMD/elements/cite.py (2)
1-6
: Missing docstring for the Cite classThe Cite class lacks a docstring explaining its purpose and how it represents citation elements across different markup formats. Adding documentation would improve code maintainability and help other developers understand how to use this class properly.
from .symbol import Symbol class Cite(Symbol): + """ + Represents an HTML citation element and its equivalent in other markup formats. + + Attributes: + md (str): Markdown representation (currently not implemented). + html (str): HTML tag name. + rst (str): reStructuredText representation (currently not implemented). + """ md = "" html = "cite" rst = ""
4-6
: Consider implementing Markdown and reStructuredText equivalents
The md and rst attributes are empty strings, which suggests their representations aren't implemented yet. For completeness, consider adding appropriate Markdown and reStructuredText equivalents for citation elements if they exist in those formats.
Would you like me to suggest implementations for the Markdown and reStructuredText representations of citations?
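As one possible answer for citations, a title can be rendered as emphasis in Markdown and with the docutils `title-reference` role in reStructuredText. The sketch below is illustrative only: `CITE_TEMPLATES` and `render_cite` are hypothetical names, and the `"{}"` placeholder convention is borrowed from other suggestions in this review rather than confirmed against how BetterMD's `Symbol` consumes `md` and `rst`.

```python
# Hypothetical per-format templates for a citation title.
CITE_TEMPLATES = {
    "md": "*{}*",                     # Markdown: emphasis
    "rst": ":title-reference:`{}`",   # RST: standard docutils role
}


def render_cite(text: str, fmt: str) -> str:
    """Wrap the cited title in the template for the requested format."""
    return CITE_TEMPLATES[fmt].format(text)
```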
BetterMD/elements/del_.py (2)
3-5
: Missing docstring for the Del class and its propertiesThe Del class lacks a docstring explaining its purpose and how it represents deleted text across different markup formats. Additionally, the
prop_list
attribute should be documented to explain the purpose of the "cite" and "datetime" properties.from .symbol import Symbol class Del(Symbol): + """ + Represents an HTML deleted text element and its equivalent in other markup formats. + + Attributes: + prop_list (list): List of HTML attributes supported by the del element. + md (str): Markdown representation (currently not implemented). + html (str): HTML tag name. + rst (str): reStructuredText representation (currently not implemented). + """ prop_list = ["cite", "datetime"] md = "" html = "del" rst = ""
5-6
: Remove unnecessary blank lineThere's an extra blank line between the property list and the tag attributes. This doesn't follow consistent formatting with other similar classes.
from .symbol import Symbol class Del(Symbol): prop_list = ["cite", "datetime"] - md = "" html = "del" rst = ""
BetterMD/elements/dir.py (3)
3-5
: Missing docstring for the Dir classThe Dir class lacks a docstring explaining its purpose and how it represents directory list elements across different markup formats. Additionally, the
prop_list
attribute should be documented to explain what the "compact" property does.from .symbol import Symbol class Dir(Symbol): + """ + Represents an HTML directory list element and its equivalent in other markup formats. + Note: The <dir> element is deprecated in HTML5. + + Attributes: + prop_list (list): List of HTML attributes supported by the dir element. + md (str): Markdown representation (currently not implemented). + html (str): HTML tag name. + rst (str): reStructuredText representation (currently not implemented). + """ prop_list = ["compact"] md = "" html = "dir" rst = ""
5-6
: Remove extra whitespace lineThere's unnecessary whitespace on line 5. This doesn't follow consistent formatting and should be removed for better code style.
from .symbol import Symbol class Dir(Symbol): prop_list = ["compact"] - md = "" html = "dir" rst = ""
6-8
: Consider adding a deprecation warningThe
<dir>
element is deprecated in HTML5 and has been removed from the HTML specification. Consider adding a deprecation warning or note in the code to inform developers that this element should be avoided in modern HTML.from .symbol import Symbol +import warnings class Dir(Symbol): prop_list = ["compact"] + def __init__(self, *args, **kwargs): + warnings.warn( + "The <dir> element is deprecated in HTML5 and should not be used in modern web development. " + "Consider using <ul> instead.", + DeprecationWarning, + stacklevel=2 + ) + super().__init__(*args, **kwargs) + md = "" html = "dir" rst = ""BetterMD/elements/var.py (1)
1-6
: Consider adding docstrings for better documentationThe implementation is clean and minimal, accurately representing the HTML
<var>
element with appropriate format mappings. However, adding a class docstring would improve clarity about the element's purpose and usage within the BetterMD framework.from .symbol import Symbol class Var(Symbol): + """Represents an HTML <var> element for variables in a document. + + The <var> element doesn't have direct equivalents in Markdown or reStructuredText. + """ html = "var" md = "" rst = ""BetterMD/elements/fieldset.py (1)
1-8
: Add docstring and remove unnecessary blank lineThe implementation appropriately includes the HTML attributes for a fieldset element, but would benefit from a docstring to explain its purpose and usage. Also, there's an unnecessary blank line on line 5 that could be removed for better consistency.
from .symbol import Symbol class Fieldset(Symbol): + """Represents an HTML <fieldset> element used to group related form controls. + + This element doesn't have direct equivalents in Markdown or reStructuredText. + """ prop_list = ["disabled", "form", "name"] - md = "" html = "fieldset" rst = ""BetterMD/elements/track.py (1)
1-8
:❓ Verification inconclusive
Add docstring for clarity and consider element's closing behaviour
The implementation correctly includes the relevant attributes for the HTML <track> element. Adding a docstring would improve documentation. Also, it would be beneficial to verify how the Symbol class handles self-closing elements, as <track> is typically a void/self-closing element in HTML.
 from .symbol import Symbol

 class Track(Symbol):
+    """Represents an HTML <track> element used to specify text tracks for media elements.
+
+    The <track> element is used with <audio> and <video> elements and provides subtitles,
+    captions, descriptions, chapters, or metadata. It doesn't have direct equivalents
+    in Markdown or reStructuredText.
+    """
     prop_list = ["default", "kind", "label", "src", "srclang"]
     html = "track"
     md = ""
     rst = ""
🏁 Script executed:
#!/bin/bash
# Check how Symbol class handles self-closing/void HTML elements
echo "Checking Symbol class implementation for void elements handling:"
grep -r "void\|self-closing\|empty" --include="*.py" BetterMD/elements/
Length of output: 309
Action Required: Add a detailed docstring and verify void element handling
The new docstring addition is beneficial for clarity. However, there is still a concern regarding how void/self-closing elements are managed. The inspection of the codebase did not reveal any explicit handling in the Symbol base class for void elements, so please verify that the implementation correctly renders the
<track>
element as self-closing, in line with HTML standards.
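For reference, void-element handling of the kind described above can be sketched as follows. This is not BetterMD's implementation: `VOID_ELEMENTS` and `render_element` are hypothetical names, and the tag set is the list of void elements from the HTML Living Standard.

```python
import html

# Void elements per the HTML Living Standard: they never take a closing tag.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
})


def render_element(tag, attrs=None, inner=""):
    """Render a tag, omitting the closing tag for void elements."""
    attr_text = "".join(
        f' {name}="{html.escape(str(value), quote=True)}"'
        for name, value in (attrs or {}).items()
    )
    if tag in VOID_ELEMENTS:
        # Any inner content is ignored: void elements cannot have children.
        return f"<{tag}{attr_text}>"
    return f"<{tag}{attr_text}>{inner}</{tag}>"
```

With a check like this in the base class, elements such as `Track`, `Source`, `Embed`, `Meta`, `Base` and `WBR` would render without a stray closing tag.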
- File: BetterMD/elements/track.py (lines 1-8)
- Suggestion:
  - Insert the provided docstring to document the purpose and usage of the <track> element.
  - Confirm that the Symbol class (referenced from BetterMD/elements/symbol.py
) treats void elements appropriately without requiring a closing tag.from .symbol import Symbol class Track(Symbol): + """Represents an HTML <track> element used to specify text tracks for media elements. + + The <track> element is used with <audio> and <video> elements and provides subtitles, + captions, descriptions, chapters, or metadata. It doesn't have direct equivalents + in Markdown or reStructuredText. + """ prop_list = ["default", "kind", "label", "src", "srclang"] html = "track" md = "" rst = ""BetterMD/elements/summary.py (2)
3-6
: Consider adding docstring for better documentation.
The Summary class implementation is correct, representing the HTML <summary> element which is typically used within <details> elements. However, a docstring would improve code clarity and provide context about its usage and purpose.
 class Summary(Symbol):
+    """
+    Represents an HTML summary element used within details elements.
+
+    The summary element specifies a visible heading for the details element.
+    It can be clicked to view/hide the details.
+    """
     html = "summary"
     md = ""
     rst = ""
5-6
: Consider implementing Markdown/reStructuredText equivalents.Currently, Markdown and reStructuredText representations are empty. While these formats don't have direct equivalents, consider implementing a sensible fallback representation (such as using bold text or a heading) to ensure content remains visible when converting between formats.
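A bold-text fallback is convenient here because `**text**` is strong emphasis in both Markdown and reStructuredText, so one helper covers both targets. The sketch below uses a hypothetical `summary_fallback` function and is not tied to BetterMD's actual conversion API.

```python
def summary_fallback(text: str) -> str:
    """Return the summary as bold text, valid in both Markdown and RST."""
    # Collapse internal whitespace so the markers hug the text, which
    # both formats require for the emphasis to be recognised.
    cleaned = " ".join(text.split())
    return f"**{cleaned}**" if cleaned else ""
```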
BetterMD/elements/canvas.py (2)
3-5
: Property list is correctly defined but could be more complete.The Canvas class correctly implements the prop_list for HTML canvas properties. However, consider adding other standard attributes like "style", "class", and "id" that are common across HTML elements, as well as additional canvas-specific attributes like "crossorigin".
class Canvas(Symbol): - prop_list = ["height", "moz-opaque", "width"] + prop_list = ["height", "moz-opaque", "width", "style", "class", "id", "crossorigin"]
6-8
: Consider adding docstring and fallback representations.
The class correctly defines HTML representation but lacks documentation explaining its purpose and usage. Additionally, consider implementing fallback representations for Markdown and reStructuredText to handle format conversions gracefully.
 class Canvas(Symbol):
+    """
+    Represents an HTML canvas element for drawing graphics.
+
+    Canvas elements require JavaScript to manipulate and aren't directly
+    supported in Markdown or reStructuredText.
+    """
     prop_list = ["height", "moz-opaque", "width"]
     md = ""
     html = "canvas"
     rst = ""
BetterMD/elements/samp.py (1)
3-6
: Add docstring and consider Markdown/reStructuredText equivalents.
The Samp class implementation is correct for the HTML <samp> element (used to display sample output from computer programs). However, it lacks documentation and fallback representations for Markdown and reStructuredText conversions.
For Markdown, a reasonable fallback could be to use code formatting with backticks, and for reStructuredText, an inline literal (double backticks) might work.
 class Samp(Symbol):
+    """
+    Represents an HTML samp element for sample output from computer programs.
+    """
-    md = ""
+    md = "`{}`"  # Using code formatting as fallback
     html = "samp"
-    rst = ""
+    rst = "``{}``"  # Using inline literal as fallback
BetterMD/elements/source.py (2)
3-5
: Property list is well-defined but missing documentation.
The Source class correctly implements the prop_list with appropriate HTML source attributes. Consider adding a docstring to explain that this element is used within media elements like <audio>, <video>, or <picture> to specify alternative media resources.
 class Source(Symbol):
+    """
+    Represents an HTML source element, which specifies multiple media resources
+    for media elements like <picture>, <video>, and <audio>.
+
+    The source element doesn't have direct equivalents in Markdown or
+    reStructuredText formats.
+    """
     prop_list = ["type", "src", "srcset", "sizes", "media", "width"]
6-8
: Consider handling media content in Markdown/reStructuredText conversions.Currently, Markdown and reStructuredText representations are empty. Consider implementing a fallback strategy when converting from HTML to these formats, such as extracting and preserving the URL from the "src" attribute as a link or image.
class Source(Symbol): prop_list = ["type", "src", "srcset", "sizes", "media", "width"] - md = "" + # For conversion to Markdown, could extract and use the src attribute + # This would need custom conversion logic + md = "" html = "source" - rst = "" + # Similar approach needed for reStructuredText + rst = ""BetterMD/elements/kbd.py (2)
1-6
: Add docstring to explain class purpose and usageThe
Kbd
class doesn't include any documentation. Consider adding a docstring to explain its purpose, expected behaviour, and how it's meant to be used within the BetterMD ecosystem.from .symbol import Symbol class Kbd(Symbol): + """ + Represents a keyboard input element. + + This class handles the conversion between HTML kbd tags and their + representations in Markdown and reStructuredText. + """ md = "" html = "kbd" rst = ""
4-6
: Remove trailing whitespace and implement Markdown and RST representationsThere's a trailing whitespace on line 6, and the Markdown and reStructuredText representations are empty. Consider implementing appropriate representations for these formats to ensure complete functionality.
md = "" html = "kbd" - rst = "" + rst = ""For Markdown, you might consider using a backtick or double backtick format, which is commonly used for keyboard inputs.
BetterMD/elements/acronym.py (2)
1-8
: Add docstring to explain class purpose and usageThe
Acronym
class lacks documentation explaining its purpose and usage. Consider adding a docstring to clarify how it handles acronyms and the significance of the "title" property.from .symbol import Symbol class Acronym(Symbol): + """ + Represents an acronym element. + + This class handles the conversion between HTML acronym tags and their + representations in Markdown and reStructuredText. The 'title' property + is used to store the full form of the acronym. + """ prop_list = ["title"] md = "" html = "acronym" rst = ""
6-8
: Implement Markdown and RST representations for acronymsBoth Markdown and reStructuredText representations are empty. Consider implementing appropriate syntax for these formats to ensure proper conversion between different markup languages.
For Markdown, you might want to use a format like
*[ACRONYM]: Full Form
which is supported by some Markdown flavours, or implement a custom representation that aligns with the library's design philosophy.BetterMD/elements/caption.py (2)
1-8
: Add docstring to explain class purpose and usageThe
Caption
class would benefit from documentation explaining its purpose and how the "align" property affects rendering in different formats.from .symbol import Symbol class Caption(Symbol): + """ + Represents a caption element, typically used with tables and figures. + + This class handles the conversion between HTML caption tags and their + representations in Markdown and reStructuredText. The 'align' property + determines the horizontal alignment of the caption. + """ prop_list = ["align"] md = "" html = "caption" rst = ""
6-8
: Implement Markdown and RST representations for captionsBoth Markdown and reStructuredText representations are empty. Consider implementing appropriate syntax for these formats to ensure complete functionality. Note that standard Markdown doesn't have a dedicated caption syntax, so you might need to create a custom representation or rely on HTML for complex cases.
For reStructuredText, captions are typically implemented using directives like
.. figure::
or table-specific syntax.BetterMD/elements/picture.py (2)
1-6
: Add docstring and consider implementing prop_list
The Picture class lacks documentation and doesn't define a prop_list attribute that might be needed for HTML picture elements. Unlike the other elements, pictures typically have several attributes that should be supported.
 from .symbol import Symbol

 class Picture(Symbol):
+    """
+    Represents an HTML picture element, which contains multiple source elements
+    and one img element to offer alternative versions of an image.
+
+    This class handles the conversion between HTML picture tags and their
+    representations in Markdown and reStructuredText.
+    """
+    prop_list = ["media", "type", "sizes"]
     md = ""
     html = "picture"
     rst = ""
4-6
: Implement Markdown and RST representations for picture elementsBoth Markdown and reStructuredText representations are empty. Consider implementing appropriate syntax for these formats, though it's worth noting that standard Markdown doesn't have a direct equivalent to HTML's picture element. You might need to implement a custom representation or fall back to standard image syntax.
For complex HTML elements like picture, you may need to implement special conversion logic in the parent Symbol class to handle the transformation between formats.
BetterMD/elements/ins.py (1)
1-8
: HTML Insertion element correctly implemented
The Ins class properly inherits from Symbol and defines appropriate HTML attributes for the insertion element. The prop_list correctly includes "cite" and "datetime", which are valid attributes for the HTML <ins> tag.
However, I notice that the md and rst attributes are empty strings. Consider adding documentation explaining whether this indicates no equivalent representation exists in those formats or if implementation is planned for the future.
There's also trailing whitespace at line 8 after rst = "".
BetterMD/elements/base.py (1)
1-8
: Base HTML element looks good
The Base class correctly inherits from Symbol and defines appropriate HTML attributes. The prop_list contains "href" and "target", which are the standard attributes for the HTML <base> element.
Consider adding documentation about why the md and rst attributes are empty strings: is this because there's no direct equivalent in these formats, or is implementation planned for the future?
BetterMD/elements/frameset.py (1)
1-8
: Obsolete HTML element implemented correctly
The Frameset class properly inherits from Symbol and defines appropriate HTML attributes. The prop_list correctly includes "cols" and "rows", which are valid attributes for the HTML <frameset> tag.
Note that the <frameset> element is obsolete in HTML5. Consider adding a comment or documentation note about this fact for developers using this library.
Additionally, there's trailing whitespace at line 8 after rst = "".
BetterMD/elements/output.py (1)
1-8
: Add docstrings to enhance code readabilityThe implementation correctly defines the Output element with its standard attributes in the prop_list. The class properties are appropriately set up for HTML conversion with empty values for markdown and reStructuredText.
Consider adding docstrings to explain:
- The purpose of this class
- How it interacts with the Symbol parent class
- How the empty md and rst values impact format conversion
- Usage examples for developers
BetterMD/elements/sup.py (1)
3-6
: Consider adding Markdown representation
The implementation is correct for HTML, but the md attribute is empty. For completeness, consider adding a Markdown representation for superscript (e.g., using the ^text^ syntax, which is a Pandoc extension rather than standard Markdown, or similar).
-    md = ""
+    md = "^{}^"
Similarly for RST, you might want to consider adding the standard representation:
-    rst = ""
+    rst = ":sup:`{}`"
BetterMD/elements/script.py (1)
1-8
: Implementation for Script class is minimal but functional.The class correctly defines the HTML tag name and provides a comprehensive list of attributes for the script element. I noticed that you've included "none" in the prop_list which is unusual and might not be a standard HTML script attribute. Similarly, "attributionsrc" appears to be non-standard.
Since Markdown and reStructuredText don't have direct equivalents for script elements, the empty strings are appropriate, but it might be worth documenting this explicitly.
from .symbol import Symbol class Script(Symbol): + """Represents an HTML script element with no direct equivalent in Markdown or reStructuredText.""" prop_list = ["async", "attributionsrc", "blocking", "crossorigin", "defer", "fetchpriority", "integrity", "nomodule", "none", "referrerpolicy", "src", "type", "charset", "language"] md = "" html = "script" rst = ""
BetterMD/elements/meter.py (1)
1-8
: Implementation for Meter class is minimal but functional.The class correctly defines the HTML tag name and provides a comprehensive list of attributes for the meter element. Since Markdown and reStructuredText don't have direct equivalents for meter elements, the empty strings are appropriate.
Consider adding a docstring to explain the purpose of this class and how it should be used within the BetterMD framework.
 from .symbol import Symbol

 class Meter(Symbol):
+    """Represents an HTML meter element with no direct equivalent in Markdown or reStructuredText."""
     prop_list = ["value", "min", "max", "low", "high", "optimum", "form"]
     md = ""
     html = "meter"
     rst = ""
+
Note: I've also added a newline at the end of the file, which is a standard practice to ensure proper file handling in various environments.
BetterMD/elements/wbr.py (1)
1-6
: Implementation for WBR class is minimal but could be enhanced.The class correctly defines the HTML tag for word break opportunity, but doesn't include a prop_list unlike other element classes. This is acceptable if the WBR element doesn't have specific attributes.
While Markdown and reStructuredText don't have direct equivalents for the WBR element, you might consider documenting how word breaks should be handled when converting between formats.
 from .symbol import Symbol

 class WBR(Symbol):
+    """Represents an HTML word break opportunity element (<wbr>).
+
+    This element has no direct equivalent in Markdown or reStructuredText.
+    During conversion, it may be represented as a space or ignored depending on context.
+    """
     html = "wbr"
     md = ""
     rst = ""
+
I've added a docstring and a newline at the end of the file for better documentation and standard file handling.
BetterMD/elements/audio.py (1)
1-8
: Implementation correctly represents the HTML audio element
The Audio class is well-structured for an HTML audio element with all the essential properties in prop_list. The HTML tag name is properly set, but both Markdown and reStructuredText representations are empty as these formats don't have native audio elements.
Consider implementing fallback representations for non-HTML formats:
- For Markdown: Perhaps use a link with descriptive text
- For RST: Consider a directive or substitution pattern
This would improve cross-format conversion for audio content.
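The link-based fallback could be as small as the following sketch. The function names and the default "Audio" label are assumptions made for illustration; neither exists in BetterMD.

```python
def audio_to_md(src: str, label: str = "Audio") -> str:
    """Markdown inline link pointing at the audio resource."""
    return f"[{label}]({src})"


def audio_to_rst(src: str, label: str = "Audio") -> str:
    """RST external hyperlink with an embedded URI."""
    return f"`{label} <{src}>`_"
```

A link keeps the resource reachable in formats that cannot embed a player, at the cost of losing attributes such as `controls` or `loop`.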
BetterMD/elements/slot.py (1)
1-8
: HTML slot element representation is minimal but correct
The Slot class correctly represents the HTML slot element with its only element-specific attribute, "name". This element is specific to HTML templates and web components with no direct equivalent in Markdown or RST.
Suggestions for improvement:
- Add a docstring explaining the purpose of this specialized element and its role in web components
- Consider implementing a conversion strategy for Markdown and RST (perhaps as a comment or special notation)
- Document how HTML-only elements should be handled during format conversion
This would improve code understandability and format conversion consistency.
BetterMD/elements/meta.py (1)
1-8
: Meta element implementation with property naming consideration
The Meta class correctly represents the HTML meta element with appropriate properties. However, note that "httpequiv" doesn't match the typical HTML attribute name "http-equiv". If this is an intentional naming convention across the library, it's fine, but consistency should be maintained.
The metadata representation varies significantly between formats:
- HTML: Uses meta tags
- Markdown: Often uses YAML frontmatter
- RST: Uses field lists or directives
Consider implementing a strategy for bidirectional conversion of metadata between these formats to ensure information isn't lost during conversion.
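One direction of that conversion, from name/content pairs taken from meta tags to YAML front matter or an RST field list, might look like the sketch below. Both function names are hypothetical, and the YAML output relies on the assumption that a JSON-quoted string is also a valid YAML double-quoted scalar, which holds for ordinary text values.

```python
import json


def meta_to_front_matter(meta: dict[str, str]) -> str:
    """Emit name/content pairs as a YAML front matter block for Markdown."""
    lines = ["---"]
    for name, content in meta.items():
        # json.dumps quotes and escapes the value so that characters
        # such as ':' do not break the YAML mapping.
        lines.append(f"{name}: {json.dumps(content)}")
    lines.append("---")
    return "\n".join(lines) + "\n"


def meta_to_field_list(meta: dict[str, str]) -> str:
    """Emit name/content pairs as an RST field list."""
    return "".join(f":{name}: {content}\n" for name, content in meta.items())
```

The reverse direction (front matter or field list back to meta tags) would need a real YAML or docutils parser and is left out here.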
BetterMD/elements/embed.py (1)
1-8
: Embed element implementation with code style note
The Embed class correctly represents the HTML embed element with its essential attributes. There is trailing whitespace at the end of line 8 that should be removed for consistency with coding style.
Similar to audio and other HTML-specific elements, embedded content has no direct representation in Markdown or RST. Consider implementing:
- Markdown: Perhaps use a special link syntax or HTML pass-through
- RST: Consider a directive approach
Adding consistent conversion strategies for HTML-only elements would make the library more robust when handling mixed-format content.
BetterMD/elements/label.py (1)
1-8
: Looks good with suggestions for enhancement
The implementation for the Label class is minimal but correct. It inherits from Symbol and properly defines the HTML representation as "label".
Consider enhancing this implementation with:
- Additional common HTML label attributes in prop_list such as "form" which allows linking a label to a form even when outside the form's hierarchy.
- While labels are primarily HTML elements without direct equivalents in Markdown and RST (explaining the empty strings), you might consider implementing custom rendering logic that would generate appropriate text representations when converting to these formats.
 class Label(Symbol):
-    prop_list = ["for"]
+    prop_list = ["for", "form"]
     md = ""
     html = "label"
     rst = ""
BetterMD/elements/iframe.py (1)
1-12
: Implementation is comprehensive with good attribute coverage
The Iframe class implementation is well-structured with an impressively thorough list of supported HTML attributes. All standard and modern iframe properties are present in the prop_list.
As iframes are HTML-specific elements without direct equivalents in Markdown or RST formats, the empty string values for these representations are appropriate. However, you might consider:
- Adding custom rendering logic for Markdown and RST that could represent iframes as links or code blocks with a note about the embedded content, which would provide a better user experience when converting between formats.
- Implementing a fallback representation method that could generate something like [Embedded content: {src}] when converting to text-based formats.
BetterMD/elements/form.py (1)
1-11
: Good implementation with room for additional attributesThe
Form
class is well-structured and follows the project's pattern for element classes. The current attribute list covers most common form properties.Consider enhancing the implementation with:
Additional form-related attributes that might be useful:
- "formaction", "formenctype", "formmethod", "formnovalidate", and "formtarget" (attributes for form controls that can override the form's attributes)
Similar to other HTML-specific elements, forms don't have direct equivalents in Markdown or RST, so the empty strings are appropriate. However, you might consider adding conversion logic if there's a specific use case for representing forms in these formats.
class Form(Symbol): prop_list = [ "accept", "accept-charset", "autocapitalize", "autocomplete", "name", "rel", - "action", "enctype", "method", "novalidate", "target", + "action", "enctype", "method", "novalidate", "target", + "formaction", "formenctype", "formmethod", "formnovalidate", "formtarget" ] md = "" html = "form" rst = ""BetterMD/elements/button.py (1)
1-8
: New Button class looks good but could benefit from docstringsThe Button class is well-structured and follows the established pattern for Symbol-derived elements. The prop_list comprehensively covers HTML button attributes, including newer ones like "popovertarget".
Consider adding docstrings to explain the class purpose and functionality, particularly how it handles conversions between formats.
from .symbol import Symbol class Button(Symbol): + """ + Represents an HTML button element. + + The Button class handles the conversion between HTML, Markdown, and reStructuredText + for button elements. + """ prop_list = ["autofocus", "command", "commandfor", "disabled", "form", "formaction", "formenctype", "formmethod", "formnovalidate", "formtarget", "name", "popovertarget", "popovertargetaction", "type", "value"] md = "" html = "button" rst = ""BetterMD/elements/bd.py (1)
1-15
: BDI and BDO classes need documentation and possibly more propertiesThe classes for bidirectional text isolation (BDI) and bidirectional text override (BDO) are correctly implemented but lack documentation explaining their purpose. The BDI and BDO elements are specialised HTML elements for handling bidirectional text, which would benefit from clear explanations in docstrings.
Additionally, while "dir" is indeed the primary attribute, consider whether any other common HTML attributes should be included in the prop_list.
from .symbol import Symbol class BDI(Symbol): + """ + Represents an HTML bidirectional text isolation element. + + The BDI class handles text that might be formatted in a different direction + from the surrounding text, isolating it from the surrounding text's + directional setting. + """ prop_list = ["dir"] md = "" html = "bdi" rst = "" class BDO(Symbol): + """ + Represents an HTML bidirectional text override element. + + The BDO class explicitly overrides the bidirectional algorithm for its contents, + forcing the directionality of the text to be displayed as specified by the dir attribute. + """ prop_list = ["dir"] md = "" html = "bdo" rst = ""BetterMD/elements/textarea.py (1)
1-8
: Textarea class is well-defined but could benefit from docstringsThe Textarea class implementation is solid with a comprehensive list of properties. However, like the other new element classes, it lacks documentation explaining its purpose and usage.
Also note that "autocorrect" is a non-standard attribute that's not part of HTML5 specifications, though it is supported by some browsers. This may be intentional for cross-browser compatibility, but worth mentioning.
from .symbol import Symbol class Textarea(Symbol): + """ + Represents an HTML textarea element. + + The Textarea class handles multi-line text input controls and their conversion + between HTML, Markdown, and reStructuredText formats. + """ prop_list = ["autocapitalize", "autocomplete", "autocorrect", "autofocus", "cols", "dirname", "disabled", "form", "maxlength", "minlength", "name", "placeholder", "readonly", "required", "rows", "spellcheck", "wrap"] html = "textarea" md = "" rst = ""BetterMD/elements/template.py (1)
1-8
: Add documentation to Template classSimilar to other element classes, the Template class would benefit from docstrings explaining its purpose, particularly since it involves Shadow DOM concepts that are more advanced and might require explanation.
from .symbol import Symbol class Template(Symbol): + """ + Represents an HTML template element. + + The Template class handles HTML templates that contain content not rendered when the page loads, + but can be instantiated later using JavaScript. Supports Shadow DOM properties for + encapsulating content. + """ prop_list = ["shadowrootmode", "shadowrootclonable", "shadowrootserializable", "shadowrootdelegatesfocus"] html = "template" md = "" rst = ""BetterMD/elements/dialog.py (2)
4-4: Improve documentation for the comment
The comment "Dont use tabindex" provides a restriction without explaining the reasoning. Consider expanding this comment to explain why tabindex shouldn't be used with the dialog element, which would help future contributors understand the design decision.
6-8
: Consider implementing rendering methods for Markdown and RSTThe
md
andrst
attributes are empty strings, indicating no direct representation in Markdown or reStructuredText. Since the dialog element is a UI component with semantics that might need preservation across formats, consider implementing custom rendering logic to represent dialog functionality in non-HTML formats.BetterMD/elements/progress.py (2)
4-4
: Consider adding additional HTML5 progress attributesThe current implementation only includes "max" and "value" attributes. Consider adding the "form" attribute which allows the progress element to be associated with a form even when not a descendant of the form element.
6-8
: Implement equivalent representations for Markdown/RSTThe
md
andrst
attributes are empty strings, but progress information could be useful in documents across formats. Consider implementing fallback representations, such as a text-based progress indicator for Markdown and RST formats.BetterMD/elements/object.py (1)
6-8
: Implement fallback representations for Markdown/RSTThe
md
andrst
attributes are empty strings, indicating no direct representations in these formats. Consider implementing fallback strategies for representing object elements in Markdown and reStructuredText, such as using image links with descriptions or custom syntax extensions.TODO.md (1)
10-14: Fix Markdown list indentation
The newly added tasks are well-structured, but the list indentation doesn't follow the rule enforced by the static analysis tool (MD007, 2 spaces per nesting level). For proper nesting and consistent rendering:
- Top-level items should have no indentation
- Second-level items should have 2 spaces
- Third-level items should have 4 spaces
 - [ ] Add parsing support
-    - [x] Add HTML parsing
-    - [x] Add MD parsing
-        - [x] Add inline support
-        - [x] Add escape chars (\\)
-    - [ ] Add RST parsing
+  - [x] Add HTML parsing
+  - [x] Add MD parsing
+    - [x] Add inline support
+    - [x] Add escape chars (\\)
+  - [ ] Add RST parsing
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
10-10: Unordered list indentation
Expected: 2; Actual: 4(MD007, ul-indent)
11-11: Unordered list indentation
Expected: 2; Actual: 4(MD007, ul-indent)
12-12: Unordered list indentation
Expected: 4; Actual: 8(MD007, ul-indent)
13-13: Unordered list indentation
Expected: 4; Actual: 8(MD007, ul-indent)
14-14: Unordered list indentation
Expected: 2; Actual: 4(MD007, ul-indent)
BetterMD/elements/col.py (2)
1-15
: Add docstrings to improve code documentationThe new
Colgroup
andCol
classes lack docstrings that describe their purpose, usage, and parameters. Adding proper documentation would improve code maintainability and help users understand these classes.from .symbol import Symbol class Colgroup(Symbol): + """ + Represents an HTML <colgroup> element which defines a group of columns within a table. + + Attributes: + prop_list: List of supported HTML attributes for the colgroup element + md: Markdown representation (empty as there's no direct equivalent) + html: HTML tag name + rst: reStructuredText representation (empty as there's no direct equivalent) + """ prop_list = ["span", "align", "bgcolor", "char", "charoff", "valign", "width"] md = "" html = "colgroup" rst = "" class Col(Symbol): + """ + Represents an HTML <col> element which defines column properties within a <colgroup>. + + Attributes: + prop_list: List of supported HTML attributes for the col element + md: Markdown representation (empty as there's no direct equivalent) + html: HTML tag name + rst: reStructuredText representation (empty as there's no direct equivalent) + """ prop_list = ["span", "align", "bgcolor", "char", "charoff", "valign", "width"] md = "" html = "col" rst = ""
3-15
: Consider common base class for similar elementsBoth
Colgroup
andCol
share identicalprop_list
attributes. If this pattern of shared properties appears in multiple elements throughout the codebase, consider refactoring to a common base class or using a shared constant.from .symbol import Symbol +# Shared properties for column-related elements +COL_PROPERTIES = ["span", "align", "bgcolor", "char", "charoff", "valign", "width"] + + class Colgroup(Symbol): - prop_list = ["span", "align", "bgcolor", "char", "charoff", "valign", "width"] + prop_list = COL_PROPERTIES md = "" html = "colgroup" rst = "" class Col(Symbol): - prop_list = ["span", "align", "bgcolor", "char", "charoff", "valign", "width"] + prop_list = COL_PROPERTIES md = "" html = "col" rst = ""BetterMD/elements/d.py (1)
7-7
: Remove extra whitespaceThere's unnecessary whitespace at the end of this line, which should be removed for consistency with the rest of the file.
class DD(Symbol): md = "" html = "dd" rst = "" - + class DT(Symbol):BetterMD/elements/marquee.py (1)
3-8
: Consider adding common HTML attributesThe
prop_list
includes only marquee-specific attributes, but common HTML attributes like "id", "class", and "style" are missing. Consider adding these common attributes to ensure full compatibility with HTML.class Marquee(Symbol): - prop_list = ["behavior", "bgcolor", "direction", "height", "hspace", "loop", "scrollamount", "scrolldelay", "truespeed", "vspace", "width"] + prop_list = ["behavior", "bgcolor", "direction", "height", "hspace", "loop", "scrollamount", "scrolldelay", "truespeed", "vspace", "width", + "id", "class", "style", "title", "lang", "dir"] md = "" html = "marquee" rst = ""BetterMD/elements/i.py (4)
13-16
: Class name 'I' is ambiguous and doesn't follow PEP 8 naming conventions.The class name 'I' is a single-letter identifier which can be confusing and difficult to search for in the codebase. Consider renaming it to a more descriptive name like 'ItalicElement' or 'ItalicTag' to improve code readability and maintainability.
-class I(Symbol): +class ItalicElement(Symbol): html = "i" md = MD() rst = RST()🧰 Tools
🪛 Ruff (0.8.2)
13-13: Ambiguous class name:
I
(E742)
5-8: Improve error handling in the to_md method.
The current implementation assumes all elements in the 'inner' list have a to_md() method. Consider adding error handling so that a missing method produces a clear message while preserving the original traceback.
def to_md(self, inner, symbol, parent):
-    return f"*{''.join([e.to_md() for e in inner])}*"
+    try:
+        return f"*{''.join([e.to_md() for e in inner])}*"
+    except AttributeError as e:
+        raise AttributeError(f"An element in 'inner' does not implement to_md(): {e}") from e
9-12: Improve error handling in the to_rst method.
Similar to the to_md method, consider adding error handling to the to_rst method to handle cases where elements in 'inner' might not implement this method.
def to_rst(self, inner, symbol, parent):
-    return f"*{''.join([e.to_rst() for e in inner])}*"
+    try:
+        return f"*{''.join([e.to_rst() for e in inner])}*"
+    except AttributeError as e:
+        raise AttributeError(f"An element in 'inner' does not implement to_rst(): {e}") from e
1-16
: Add docstrings to improve code documentation.The classes and methods lack documentation. Consider adding docstrings to describe the purpose and usage of each class and method, which would improve code readability and maintainability.
from .symbol import Symbol from ..markdown import CustomMarkdown from ..rst import CustomRst class MD(CustomMarkdown): + """Custom Markdown formatter for italic text.""" def to_md(self, inner, symbol, parent): + """ + Convert the inner elements to Markdown-formatted italic text. + + Args: + inner: List of elements to be converted + symbol: The symbol instance + parent: The parent element + + Returns: + str: Markdown-formatted italic text + """ return f"*{''.join([e.to_md() for e in inner])}*" class RST(CustomRst): + """Custom reStructuredText formatter for italic text.""" def to_rst(self, inner, symbol, parent): + """ + Convert the inner elements to reStructuredText-formatted italic text. + + Args: + inner: List of elements to be converted + symbol: The symbol instance + parent: The parent element + + Returns: + str: reStructuredText-formatted italic text + """ return f"*{''.join([e.to_rst() for e in inner])}*" class I(Symbol): + """Represents an italic element in HTML, Markdown, and reStructuredText.""" html = "i" md = MD() rst = RST()🧰 Tools
🪛 Ruff (0.8.2)
13-13: Ambiguous class name:
I
(E742)
BetterMD/elements/s.py (3)
3-6
: Class name 'S' is ambiguous and doesn't follow PEP 8 naming conventions.The class name 'S' is a single-letter identifier which can be confusing and difficult to search for in the codebase. Consider renaming it to a more descriptive name like 'StrikethroughElement' or 'StrikeTag' to improve code readability and maintainability.
-class S(Symbol): +class StrikethroughElement(Symbol): md = "" html = "s" rst = ""
4-6: Empty strings for md and rst attributes suggest incomplete implementation.
The empty strings for the md and rst attributes suggest that strikethrough formatting is not supported in Markdown and reStructuredText, or that the implementation is incomplete. If these formats support strikethrough (e.g., '~~text~~' in GitHub Flavored Markdown), consider implementing appropriate formatters.
class S(Symbol):
-    md = ""
+    md = CustomMarkdown()  # Implement a proper Markdown formatter for strikethrough
     html = "s"
-    rst = ""
+    rst = CustomRst()  # Implement a proper RST formatter for strikethrough
For Markdown implementation:
class MDStrikethrough(CustomMarkdown): def to_md(self, inner, symbol, parent): return f"~~{''.join([e.to_md() for e in inner])}~~"
1-6
: Add docstrings to improve code documentation.The class lacks documentation. Consider adding docstrings to describe the purpose and usage of the class, which would improve code readability and maintainability.
from .symbol import Symbol class S(Symbol): + """ + Represents a strikethrough element in HTML. + + Note: Currently, the Markdown and reStructuredText representations + are not implemented (empty strings). + """ md = "" html = "s" rst = ""BetterMD/elements/video.py (2)
4-6
: Empty strings for md and rst attributes suggest incomplete implementation.The empty strings for md and rst attributes suggest that video elements are not supported in Markdown and reStructuredText, or that the implementation is incomplete. Consider implementing appropriate formatters or adding comments to explain why these formats are not supported.
class Video(Symbol): html = "video" - md = "" + md = "" # Markdown does not natively support video elements - rst = "" + rst = "" # reStructuredText does not natively support video elements
1-6
: Add docstrings to improve code documentation.The class lacks documentation. Consider adding docstrings to describe the purpose and usage of the class, which would improve code readability and maintainability.
from .symbol import Symbol class Video(Symbol): + """ + Represents a video element in HTML. + + Note: Markdown and reStructuredText do not natively support video elements, + so their representations are empty strings. + """ html = "video" md = "" rst = ""BetterMD/elements/small.py (2)
4-6
: Empty strings for md and rst attributes suggest incomplete implementation.The empty strings for md and rst attributes suggest that small text formatting is not supported in Markdown and reStructuredText, or that the implementation is incomplete. Consider implementing appropriate formatters or adding comments to explain why these formats are not supported.
class Small(Symbol): - md = "" + md = "" # Markdown does not have a direct equivalent for HTML's <small> element html = "small" - rst = "" + rst = "" # reStructuredText does not have a direct equivalent for HTML's <small> element
1-6
: Add docstrings to improve code documentation.The class lacks documentation. Consider adding docstrings to describe the purpose and usage of the class, which would improve code readability and maintainability.
from .symbol import Symbol class Small(Symbol): + """ + Represents a small text element in HTML. + + Note: Markdown and reStructuredText do not have direct equivalents + for HTML's <small> element, so their representations are empty strings. + """ md = "" html = "small" rst = ""BetterMD/elements/plaintext.py (1)
1-6
: Consider adding docstrings for clarity and documentationThe
Plaintext
class is straightforward, but lacks docstrings explaining its purpose and how it's meant to be used within the broader BetterMD system. Adding class and attribute documentation would enhance understanding for developers using or maintaining this code.from .symbol import Symbol class Plaintext(Symbol): + """ + Represents a plaintext element for conversion between markup formats. + + Attributes: + md (str): Markdown representation (empty as plaintext has no special formatting in Markdown) + html (str): HTML tag name used for representation + rst (str): reStructuredText representation (empty as plaintext has no special formatting in RST) + """ md = "" html = "plaintext" rst = ""BetterMD/elements/legend.py (1)
1-6
: Add docstrings and provide equivalent formats where applicableThe
Legend
class currently lacks documentation. Additionally, consider whether there are equivalent representations in Markdown and reStructuredText formats that could be specified, even if they're approximations of HTML's legend element.from .symbol import Symbol class Legend(Symbol): + """ + Represents a legend element for conversion between markup formats. + + HTML's <legend> element is typically used with <fieldset> to provide a caption. + + Attributes: + md (str): Markdown representation (currently unsupported) + html (str): HTML tag name used for representation + rst (str): reStructuredText representation (currently unsupported) + """ md = "" html = "legend" rst = ""BetterMD/elements/main.py (1)
3-6
: Add docstring to document class purpose and usageThe
Main
class structure is clear and follows the established pattern forSymbol
subclasses. However, it lacks documentation that would help users understand its purpose and how it represents the HTML<main>
element.class Main(Symbol): + """ + Represents the HTML <main> element, which specifies the main content of a document. + + HTML: <main> + Markdown: No direct equivalent + reStructuredText: No direct equivalent + """ md = "" html = "main" rst = ""BetterMD/elements/fencedframe.py (2)
3-8
: Document the FencedFrame class and clarify its purposeThe class follows the established pattern for
Symbol
subclasses, but lacks documentation explaining its purpose and usage. Additionally, "fencedframe" is not a standard HTML element, so it's unclear what this class is intended to represent.class FencedFrame(Symbol): + """ + Represents a fenced frame element, allowing embedded content with customisable properties. + + Properties: + - allow: Specifies feature policy for the frame + - height: Specifies the height of the frame + - width: Specifies the width of the frame + + HTML: Custom element or possibly an iframe with specific rendering + Markdown: No direct equivalent + reStructuredText: No direct equivalent + """ prop_list = ["allow", "height", "width"] - md = "" html = "fencedframe" rst = ""
4-4
: Consider adding validation logic for property valuesThe properties "height" and "width" typically require validation to ensure they contain valid CSS dimensions.
Consider implementing property validation methods, either in this class or in the parent
Symbol
class, to validate that height and width values conform to CSS dimension syntax.BetterMD/elements/xmp.py (2)
3-6
: Add docstring and note deprecation status of XMP elementThe class needs documentation to explain its purpose. It's also important to note that the
<xmp>
element is deprecated in HTML5.class XMP(Symbol): + """ + Represents the HTML <xmp> element, which displays text between the start and end tags without + interpreting the HTML in between. + + Note: The <xmp> element is deprecated in HTML5. Consider using <pre> or <code> instead. + + HTML: <xmp> + Markdown: No direct equivalent + reStructuredText: No direct equivalent + """ html = "xmp" md = "" rst = ""
5-5
: Remove trailing whitespaceThere is a trailing whitespace after the empty string assignment.
html = "xmp" - md = "" + md = "" rst = ""BetterMD/elements/time.py (2)
3-8
: Add docstring to document class purpose and datetime propertyThe
Time
class needs documentation to explain its purpose and how the datetime property should be formatted.class Time(Symbol): + """ + Represents the HTML <time> element, which defines a specific time or datetime. + + Properties: + - datetime: Specifies the date/time in ISO 8601 format (YYYY-MM-DDThh:mm:ssTZD) + + HTML: <time> + Markdown: No direct equivalent + reStructuredText: No direct equivalent + """ prop_list = ["datetime"] html = "time" md = "" rst = ""
7-7
: Remove trailing whitespaceThere is a trailing whitespace after the empty string assignment.
html = "time" - md = "" + md = "" rst = ""BetterMD/elements/data.py (1)
1-8
: Well structured class implementationThe
Data
class correctly inherits fromSymbol
and defines the appropriate property list and format indicators. The HTML representation corresponds to the HTML<data>
element, which is used to add machine-readable translations of content.Consider adding a docstring to explain the purpose of this class and how it should be used in the context of format conversion.
BetterMD/utils.py (1)
6-14
: Well-structured List implementation for Symbol collections.The
List
class extends the built-in list with methods to convert collections of Symbol elements to different formats. The implementation is clean, using list comprehensions for transformations and properly passing the indent parameter to the HTML conversion method.Consider adding docstrings to explain the purpose of each method.
class List(list['Symbol']): def to_html(self, indent=1): + """Convert all elements in the list to HTML representation.""" return [elm.to_html(indent) for elm in self] def to_md(self): + """Convert all elements in the list to Markdown representation.""" return [elm.to_md() for elm in self] def to_rst(self): + """Convert all elements in the list to reStructuredText representation.""" return [elm.to_rst() for elm in self]BetterMD/elements/select.py (1)
1-2: Consider using a more specific import.
The current relative import pulls Symbol from the package rather than from the module that defines it. Importing from the defining module is more explicit.
-from . import Symbol
+from .symbol import Symbol
BetterMD/elements/img.py (1)
20-20
: Comprehensive image property list is a good addition.The expanded property list greatly enhances the
Img
class by supporting modern HTML5 attributes (likeloading
,decoding
,fetchpriority
) as well as legacy attributes. This makes the parser more robust when handling various image element configurations.Consider extending the
to_html
,to_md
, andto_rst
methods in the future to leverage some of these additional properties, particularly for the HTML output which could benefit from supporting attributes likewidth
,height
, andloading
.BetterMD/parse/markdown/typing.py (1)
3-5: Consider removing the unused MDParser import.
The MDParser symbol is imported but not used in this file, as flagged by the static analysis tool. Removing it will help maintain a clean and focused codebase.
-if t.TYPE_CHECKING:
-    from .parser import MDParser
-    from ..typing import ELEMENT, TEXT
+if t.TYPE_CHECKING:
+    from ..typing import ELEMENT, TEXT
🧰 Tools
🪛 Ruff (0.8.2)
4-4:
.parser.MDParser
imported but unusedRemove unused import:
.parser.MDParser
(F401)
BetterMD/elements/input.py (1)
21-55
: Validate additional input attributes.You've added numerous input attributes like
"accept"
,"form"
,"popovertarget"
, etc. Consider adding short docstrings or usage examples for these attributes, ensuring developers understand their purpose and supported input types.Would you like an example section in the docstrings or the README clarifying how these attributes are to be used?
BetterMD/elements/style.py (1)
56-63: Simplify nested selector logic.
Pseudo-classes differ from the other selector types only in omitting the separating space. The three remaining conditions for ID, class, and default all produce f"{selector} {key}". Merging these branches would simplify the code.
-if key.startswith(':'):  # Pseudo-class
-    nested_selector = f"{selector}{key}"
-elif key.startswith('#'):  # ID
-    nested_selector = f"{selector} {key}"
-elif key.startswith('.'):  # Class
-    nested_selector = f"{selector} {key}"
-else:  # Element or custom
-    nested_selector = f"{selector} {key}"
+if key.startswith(':'):
+    nested_selector = f"{selector}{key}"
+else:
+    nested_selector = f"{selector} {key}"
🧰 Tools
🪛 Ruff (0.8.2)
58-61: Combine
if
branches using logicalor
operatorCombine
if
branches(SIM114)
BetterMD/elements/table.py (4)
9-29
: Consider handling multiple THeads or TBodies more gracefully.At present, each THead is stored in a single variable, so multiple table headers would overwrite the
thead_content
. Similarly, multiple TBodies are appended in a list but are not separated by an explicit delimiter. If your specification permits multiple THeads or TBodies, you may want to either concatenate all of them or raise a warning if more than one is present.
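A minimal sketch of the "concatenate and warn" option, assuming the sections have already been rendered to text. The function split_sections and the (kind, text) pair representation are hypothetical and do not mirror the actual table.py code.

```python
import warnings


def split_sections(sections: list[tuple[str, str]]) -> tuple[str, list[str]]:
    """Separate header and body sections.

    ``sections`` is a hypothetical list of (kind, rendered_text) pairs,
    where kind is "thead" or "tbody".
    """
    heads = [text for kind, text in sections if kind == "thead"]
    bodies = [text for kind, text in sections if kind == "tbody"]
    if len(heads) > 1:
        # Keep every header instead of silently overwriting earlier ones.
        warnings.warn("table has more than one thead; concatenating them", stacklevel=2)
    return "\n".join(heads), bodies
```

Returning the bodies as a list leaves the choice of delimiter between TBodies to the caller.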
41-41: Simplify multiple isinstance checks.
You can merge the two isinstance calls into one to improve readability:
- if isinstance(section, THead) or isinstance(section, TBody):
+ if isinstance(section, (THead, TBody)):
🧰 Tools
🪛 Ruff (0.8.2)
41-41: Multiple
isinstance
calls forsection
, merge into a single callMerge
isinstance
calls forsection
(SIM101)
62-62: Rename unused loop variable from i to _i.
Since i is not used anywhere within the loop, renaming it to _i clarifies that it is only a placeholder variable.
- for i, (cells, is_header) in enumerate(all_rows):
+ for _i, (cells, is_header) in enumerate(all_rows):
🧰 Tools
🪛 Ruff (0.8.2)
62-62: Loop control variable
i
not used within loop bodyRename unused
i
to_i
(B007)
76-107
: Consider adding cell alignment handling or multi-line cell support.For more advanced tables in Markdown, some users may require alignment options or support for multi-line cells. You could allow specifying alignment or splitting lines with care.
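One possible shape for alignment support, sketched as a standalone function. The name md_table and its signature are hypothetical. The colon markers in the delimiter row follow GitHub Flavored Markdown table syntax. Flattening multi-line cells with a br tag is an assumption that only works in renderers that allow inline HTML.

```python
def md_table(header: list[str], rows: list[list[str]], align: list[str]) -> str:
    """Render a pipe table; ``align`` holds "left", "center" or "right" per column."""
    markers = {"left": ":---", "center": ":---:", "right": "---:"}

    def line(cells: list[str]) -> str:
        # Pipe tables allow one text line per row, so newlines become <br>.
        return "| " + " | ".join(c.replace("\n", "<br>") for c in cells) + " |"

    separator = line([markers[a] for a in align])
    return "\n".join([line(header), separator, *(line(r) for r in rows)])
```

For example, md_table(["a", "b"], [["1", "x\ny"]], ["left", "right"]) yields a three-line table whose delimiter row is "| :--- | ---: |".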
BetterMD/parse/markdown/parser.py (4)
22-27
: Quick extension lookup by name.This loop-driven lookup is simple but might become slower with many extensions. Consider a dictionary for lookups if unique names are guaranteed.
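A sketch of the dictionary-based alternative, assuming extension names are unique. ExtensionRegistry is a hypothetical class, not the parser's existing API.

```python
class ExtensionRegistry:
    """Hypothetical registry mapping extension names to extension objects."""

    def __init__(self) -> None:
        self._by_name: dict[str, object] = {}

    def register(self, name: str, extension: object) -> None:
        # Enforce the uniqueness assumption that makes dict lookup safe.
        if name in self._by_name:
            raise ValueError(f"duplicate extension name: {name!r}")
        self._by_name[name] = extension

    def get(self, name: str) -> object:
        try:
            return self._by_name[name]
        except KeyError:
            raise KeyError(f"unknown extension: {name!r}") from None
```

Lookup becomes constant time on average, and duplicate names are caught at registration rather than silently shadowing each other.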
43-54
: Re-initialising extensions.Calling
extension.init(self)
again updates the parser reference within each extension. If any per-extension state must persist across resets, consider providing a more selective re-initialisation process.
55-77
: Consider data class usage for structured elements.Currently, elements and text are stored as dictionaries. Employing typed classes or dataclasses could improve maintainability, type validation, and clarity.
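To make the dataclass suggestion concrete, here is a minimal sketch. The class names Text and Element and their field names are assumptions for illustration; the parser's actual dictionary keys may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Text:
    """Hypothetical replacement for a text dictionary."""
    content: str


@dataclass
class Element:
    """Hypothetical replacement for an element dictionary."""
    name: str
    attributes: "dict[str, str]" = field(default_factory=dict)
    # default_factory gives every Element its own list.
    children: "list[Element | Text]" = field(default_factory=list)
```

A type checker can then flag a misspelt field name, which a plain dictionary key would not reveal until runtime.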
155-156: Use a clearer variable name instead of l.
The variable l in the tuple assignment can be mistaken for the digit 1. Choose something more descriptive, such as match_length.
- v, elm, l = handle(pattern, handler["handler"])
+ v, elm, match_length = handle(pattern, handler["handler"])
- i += l
+ i += match_length
Also applies to: 169-170
🧰 Tools
🪛 Ruff (0.8.2)
156-156: Ambiguous variable name:
l
(E741)
BetterMD/parse/markdown/extensions/extension.py (1)
8-37
: Initialisation pattern inExtension
.If
parser_class
is only used to create extension instances, you could simplify the design by removing the intermediate storage if not strictly necessary.BetterMD/elements/symbol.py (4)
8-8: Remove unused import ATTRS.
The ATTRS import is unused. Consider removing it to maintain a clean import list.
- from ..typing import ATTRS, ATTR_TYPES
+ from ..typing import ATTR_TYPES
🧰 Tools
🪛 Ruff (0.8.2)
8-8:
..typing.ATTRS
imported but unusedRemove unused import:
..typing.ATTRS
(F401)
11-17
: Avoid class-level mutable default attributes.
Declaring dictionaries and lists at the class level can lead to shared mutable state among all instances ofSymbol
, possibly causing unexpected side effects when one instance modifies these attributes.- styles: 'dict[str, str]' = {} - classes: 'list[str]' = [] - props: 'dict[str, t.Union[str, bool, int, float, list, dict]]' = {} - children:'list[Symbol]' = [] ... + def __init__(self, styles=None, classes=None, props=None, children=None, ...): + if styles is None: + styles = {} + if classes is None: + classes = [] + if props is None: + props = {} + if children is None: + children = [] + ...
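The following toy example shows the failure mode in isolation. Shared and Isolated are illustrative classes, not the real Symbol.

```python
class Shared:
    children: list = []  # one list object shared by every instance

    def add(self, child):
        self.children.append(child)


class Isolated:
    def __init__(self, children=None):
        # A fresh list per instance unless the caller supplies one.
        self.children = [] if children is None else children

    def add(self, child):
        self.children.append(child)


a, b = Shared(), Shared()
a.add("x")
print(b.children)  # ['x'], because b sees the child added through a

c, d = Isolated(), Isolated()
c.add("x")
print(d.children)  # []
```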
36-40: Use is None checks instead of == None.
Python best practice is to compare with None using is None or is not None. This avoids potential issues with operator overloading and clarifies intent.
- if styles == None:
+ if styles is None:
- if classes == None:
+ if classes is None:
- if inner == None:
+ if inner is None:
- if inner == None:
+ if inner is None:
- if styles == None:
+ if styles is None:
- if classes == None:
+ if classes is None:
Also applies to: 49-53
🧰 Tools
🪛 Ruff (0.8.2)
36-36: Comparison to
None
should becond is None
Replace with
cond is None
(E711)
38-38: Comparison to
None
should becond is None
Replace with
cond is None
(E711)
40-40: Comparison to
None
should becond is None
Replace with
cond is None
(E711)
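The operator-overloading concern behind E711 can be shown with a toy class. AlwaysEqual is purely illustrative and unrelated to BetterMD.

```python
class AlwaysEqual:
    """Toy class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()
print(value == None)  # True: the overloaded __eq__ is consulted
print(value is None)  # False: identity checks cannot be overridden
```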
117-117: Remove the extraneous f-string marker in the else clause.
The f prefix is unnecessary in f" />" because there are no placeholders. Only the prefix should go; the string itself, including its leading space, stays the same.
- ... else f" />"}"
+ ... else " />"}"
🧰 Tools
🪛 Ruff (0.8.2)
117-117: f-string without any placeholders
Remove extraneous
f
prefix(F541)
BetterMD/parse/markdown/extensions/base.py (6)
7-7
: Remove unused importsELEMENT
andTEXT
.
They are not being used in this file, so consider removing them to reduce clutter.- from ...typing import ELEMENT, TEXT + from ...typing import ...🧰 Tools
🪛 Ruff (0.8.2)
7-7:
...typing.ELEMENT
imported but unusedRemove unused import
(F401)
7-7:
...typing.TEXT
imported but unusedRemove unused import
(F401)
8-8
: Remove unused importsOL_TYPE
andUL_TYPE
.
These are not referenced in the current code, so you can remove them to keep imports minimal.- from ..typing import ELM_TYPE_W_END, ELM_TYPE_WO_END, OL_LIST, UL_LIST, LIST_ITEM, LIST_TYPE, OL_TYPE, UL_TYPE + from ..typing import ELM_TYPE_W_END, ELM_TYPE_WO_END, OL_LIST, UL_LIST, LIST_ITEM, LIST_TYPE🧰 Tools
🪛 Ruff (0.8.2)
8-8:
..typing.OL_TYPE
imported but unusedRemove unused import
(F401)
8-8:
..typing.UL_TYPE
imported but unusedRemove unused import
(F401)
21-24: Combine the if branches with a logical “or”.
These checks perform effectively the same operation with different quote types, so merging them simplifies the control flow.
-if text.startswith('"') and text.endswith('"'):
-    return text[1:-1]
-elif text.startswith("'") and text.endswith("'"):
-    return text[1:-1]
+if (text.startswith('"') and text.endswith('"')) or (text.startswith("'") and text.endswith("'")):
+    return text[1:-1]
🧰 Tools
🪛 Ruff (0.8.2)
21-24: Combine
if
branches using logicalor
operatorCombine
if
branches(SIM114)
204-208: Combine the if branches with a logical “or”.
These conditions can be merged to handle both scenarios uniformly.
🧰 Tools
🪛 Ruff (0.8.2)
204-208: Combine if branches using logical or operator. Combine if branches (SIM114)
327-331: Combine the if branches with a logical “or”.
Merging these closely related conditions will improve readability.
🧰 Tools
🪛 Ruff (0.8.2)
327-331: Combine if branches using logical or operator. Combine if branches (SIM114)
448-448: Use is not None for clarity.
When checking against None, it’s more idiomatic in Python to use is not None.
- input = match.group(3) != None
+ input = match.group(3) is not None
- input = match.group(4) != None
+ input = match.group(4) is not None
Also applies to: 466-466
🧰 Tools
🪛 Ruff (0.8.2)
448-448: Comparison to None should be cond is not None. Replace with cond is not None (E711)
BetterMD/parse/markdown/extensions/s.py (2)
149-246: Be cautious of potential XSS vulnerabilities when constructing HTML from user input.
This code creates HTML elements directly from parsed Markdown without applying additional sanitisation or escaping to user-supplied fields such as alt, href, or title. In scenarios where malicious input might be processed, an attacker could potentially inject harmful HTML or scripts.
If you need secure sanitisation, I can help integrate a well-tested library or provide a reference implementation to mitigate XSS risks.
447-449: Avoid shadowing the built-in names “type” and “input”.
Here, local variables named type and input shadow the built-in Python functions. This can lead to confusion or bugs. Consider renaming these variables to more descriptive alternatives like list_type and has_input.
Below is a sample diff illustrating the name change:
- type = match.group(2)
- input = match.group(3) != None
+ list_type = match.group(2)
+ has_input = match.group(3) is not None
...
- type = match.group(3)
- input = match.group(4) != None
+ list_type = match.group(3)
+ has_input = match.group(4) is not None
Also applies to: 463-468
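A small sketch of how shadowing can bite in practice. Both functions are hypothetical and unrelated to the actual parser code; they only contrast a local named `type` with a renamed local.

```python
def describe_shadowed(value):
    type = "ul"  # local name now hides the built-in type() in this function
    try:
        return type(value).__name__
    except TypeError as exc:
        return f"TypeError: {exc}"


def describe_renamed(value):
    list_type = "ul"  # distinct name, so the built-in stays reachable
    return f"{list_type}: {type(value).__name__}"


print(describe_shadowed(3))
print(describe_renamed(3))
```

The first function fails as soon as later code in the same scope tries to call the built-in, which is the kind of bug the rename avoids.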
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (111)
- BetterMD/__init__.py (1 hunks)
- BetterMD/elements/__init__.py (1 hunks)
- BetterMD/elements/a.py (1 hunks)
- BetterMD/elements/abbr.py (1 hunks)
- BetterMD/elements/acronym.py (1 hunks)
- BetterMD/elements/address.py (1 hunks)
- BetterMD/elements/area.py (1 hunks)
- BetterMD/elements/article.py (1 hunks)
- BetterMD/elements/aside.py (1 hunks)
- BetterMD/elements/audio.py (1 hunks)
- BetterMD/elements/base.py (1 hunks)
- BetterMD/elements/bd.py (1 hunks)
- BetterMD/elements/big.py (1 hunks)
- BetterMD/elements/button.py (1 hunks)
- BetterMD/elements/canvas.py (1 hunks)
- BetterMD/elements/caption.py (1 hunks)
- BetterMD/elements/center.py (1 hunks)
- BetterMD/elements/cite.py (1 hunks)
- BetterMD/elements/col.py (1 hunks)
- BetterMD/elements/d.py (1 hunks)
- BetterMD/elements/data.py (1 hunks)
- BetterMD/elements/datalist.py (1 hunks)
- BetterMD/elements/del_.py (1 hunks)
- BetterMD/elements/details.py (1 hunks)
- BetterMD/elements/dialog.py (1 hunks)
- BetterMD/elements/dir.py (1 hunks)
- BetterMD/elements/embed.py (1 hunks)
- BetterMD/elements/fencedframe.py (1 hunks)
- BetterMD/elements/fieldset.py (1 hunks)
- BetterMD/elements/figure.py (1 hunks)
- BetterMD/elements/font.py (1 hunks)
- BetterMD/elements/footer.py (1 hunks)
- BetterMD/elements/form.py (1 hunks)
- BetterMD/elements/frame.py (1 hunks)
- BetterMD/elements/frameset.py (1 hunks)
- BetterMD/elements/h.py (1 hunks)
- BetterMD/elements/head.py (1 hunks)
- BetterMD/elements/header.py (1 hunks)
- BetterMD/elements/hgroup.py (1 hunks)
- BetterMD/elements/hr.py (1 hunks)
- BetterMD/elements/html.py (1 hunks)
- BetterMD/elements/i.py (1 hunks)
- BetterMD/elements/iframe.py (1 hunks)
- BetterMD/elements/img.py (1 hunks)
- BetterMD/elements/input.py (1 hunks)
- BetterMD/elements/ins.py (1 hunks)
- BetterMD/elements/kbd.py (1 hunks)
- BetterMD/elements/label.py (1 hunks)
- BetterMD/elements/legend.py (1 hunks)
- BetterMD/elements/li.py (1 hunks)
- BetterMD/elements/link.py (1 hunks)
- BetterMD/elements/main.py (1 hunks)
- BetterMD/elements/map.py (1 hunks)
- BetterMD/elements/mark.py (1 hunks)
- BetterMD/elements/marquee.py (1 hunks)
- BetterMD/elements/menu.py (1 hunks)
- BetterMD/elements/meta.py (1 hunks)
- BetterMD/elements/meter.py (1 hunks)
- BetterMD/elements/nav.py (1 hunks)
- BetterMD/elements/no.py (1 hunks)
- BetterMD/elements/object.py (1 hunks)
- BetterMD/elements/output.py (1 hunks)
- BetterMD/elements/p.py (1 hunks)
- BetterMD/elements/param.py (1 hunks)
- BetterMD/elements/picture.py (1 hunks)
- BetterMD/elements/plaintext.py (1 hunks)
- BetterMD/elements/progress.py (1 hunks)
- BetterMD/elements/q.py (1 hunks)
- BetterMD/elements/ruby.py (1 hunks)
- BetterMD/elements/s.py (1 hunks)
- BetterMD/elements/samp.py (1 hunks)
- BetterMD/elements/script.py (1 hunks)
- BetterMD/elements/search.py (1 hunks)
- BetterMD/elements/section.py (1 hunks)
- BetterMD/elements/select.py (1 hunks)
- BetterMD/elements/slot.py (1 hunks)
- BetterMD/elements/small.py (1 hunks)
- BetterMD/elements/source.py (1 hunks)
- BetterMD/elements/strike.py (1 hunks)
- BetterMD/elements/style.py (1 hunks)
- BetterMD/elements/sub.py (1 hunks)
- BetterMD/elements/summary.py (1 hunks)
- BetterMD/elements/sup.py (1 hunks)
- BetterMD/elements/symbol.py (4 hunks)
- BetterMD/elements/table.py (1 hunks)
- BetterMD/elements/template.py (1 hunks)
- BetterMD/elements/text.py (1 hunks)
- BetterMD/elements/text_formatting.py (1 hunks)
- BetterMD/elements/textarea.py (1 hunks)
- BetterMD/elements/time.py (1 hunks)
- BetterMD/elements/title.py (1 hunks)
- BetterMD/elements/track.py (1 hunks)
- BetterMD/elements/tt.py (1 hunks)
- BetterMD/elements/u.py (1 hunks)
- BetterMD/elements/var.py (1 hunks)
- BetterMD/elements/video.py (1 hunks)
- BetterMD/elements/wbr.py (1 hunks)
- BetterMD/elements/xmp.py (1 hunks)
- BetterMD/parse/collection.py (1 hunks)
- BetterMD/parse/markdown/__init__.py (1 hunks)
- BetterMD/parse/markdown/extensions/__init__.py (1 hunks)
- BetterMD/parse/markdown/extensions/base.py (1 hunks)
- BetterMD/parse/markdown/extensions/extension.py (1 hunks)
- BetterMD/parse/markdown/extensions/s.py (1 hunks)
- BetterMD/parse/markdown/parser.py (1 hunks)
- BetterMD/parse/markdown/typing.py (1 hunks)
- BetterMD/parse/typing.py (1 hunks)
- BetterMD/typing.py (1 hunks)
- BetterMD/utils.py (1 hunks)
- README.md (1 hunks)
- TODO.md (1 hunks)
✅ Files skipped from review due to trivial changes (4)
- BetterMD/elements/menu.py
- README.md
- BetterMD/elements/figure.py
- BetterMD/elements/ruby.py
🚧 Files skipped from review as they are similar to previous changes (6)
- BetterMD/elements/html.py
- BetterMD/elements/p.py
- BetterMD/elements/a.py
- BetterMD/elements/title.py
- BetterMD/elements/text.py
- BetterMD/elements/h.py
🧰 Additional context used
🧬 Code Definitions (7)
BetterMD/elements/text_formatting.py (1)
- BetterMD/elements/symbol.py (1): Symbol (10-188)
BetterMD/elements/hr.py (4)
- BetterMD/elements/input.py (2): MD (6-10), RST (12-16)
- BetterMD/elements/li.py (2): MD (5-9), RST (11-18)
- BetterMD/elements/br.py (2): MD (5-7), RST (9-11)
- BetterMD/elements/h.py (1): RST (4-14)
BetterMD/elements/li.py (1)
- BetterMD/elements/symbol.py (1): Symbol (10-188)
BetterMD/elements/input.py (4)
- BetterMD/elements/symbol.py (5): get_prop (174-175), to_md (119-133), to_html (95-117), to_rst (135-140), Symbol (10-188)
- BetterMD/markdown/custom_markdown.py (1): to_md (14-14)
- BetterMD/html/custom_html.py (1): to_html (11-11)
- BetterMD/rst/custom_rst.py (2): CustomRst (9-18), to_rst (14-14)
BetterMD/elements/symbol.py (5)
- BetterMD/elements/text.py (3): to_html (17-18), to_md (20-21), to_rst (23-24)
- BetterMD/elements/img.py (3): to_html (12-13), to_md (7-9), to_rst (16-17)
- BetterMD/elements/a.py (2): to_md (6-7), to_rst (10-11)
- BetterMD/elements/title.py (2): to_md (7-11), to_rst (14-18)
- BetterMD/elements/input.py (2): to_md (7-10), to_rst (13-16)
BetterMD/elements/__init__.py (11)
- BetterMD/elements/symbol.py (1): Symbol (10-188)
- BetterMD/elements/a.py (1): A (13-27)
- BetterMD/elements/blockquote.py (1): Blockquote (9-13)
- BetterMD/elements/br.py (1): Br (13-16)
- BetterMD/elements/div.py (1): Div (3-6)
- BetterMD/elements/hr.py (1): Hr (13-19)
- BetterMD/elements/li.py (3): OL (40-44), UL (46-51), LI (33-38)
- BetterMD/elements/p.py (1): P (3-7)
- BetterMD/elements/span.py (1): Span (3-6)
- BetterMD/elements/table.py (7): Table (158-165), Tr (167-170), Td (172-179), Th (181-189), THead (191-197), TBody (199-205), TFoot (207-213)
- BetterMD/elements/text_formatting.py (3): Strong (15-18), Em (25-28), B (20-23)
BetterMD/parse/typing.py (1)
- BetterMD/parse/html.py (1): parse (16-193)
🪛 markdownlint-cli2 (0.17.2)
TODO.md
10-10: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
11-11: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
12-12: Unordered list indentation
Expected: 4; Actual: 8
(MD007, ul-indent)
13-13: Unordered list indentation
Expected: 4; Actual: 8
(MD007, ul-indent)
14-14: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
🪛 Ruff (0.8.2)
BetterMD/parse/markdown/__init__.py
1-1: from .extensions import *
used; unable to detect undefined names
(F403)
4-4: BaseExtension
may be undefined, or defined from star imports
(F405)
BetterMD/__init__.py
3-3: .parse.Collection
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
3-3: .parse.HTMLParser
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
3-3: .parse.MDParser
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
BetterMD/parse/markdown/typing.py
4-4: .parser.MDParser
imported but unused
Remove unused import: .parser.MDParser
(F401)
BetterMD/elements/i.py
13-13: Ambiguous class name: I
(E742)
BetterMD/elements/style.py
58-61: Combine if
branches using logical or
operator
Combine if
branches
(SIM114)
BetterMD/parse/markdown/parser.py
156-156: Ambiguous variable name: l
(E741)
169-169: Ambiguous variable name: l
(E741)
BetterMD/elements/table.py
41-41: Multiple isinstance
calls for section
, merge into a single call
Merge isinstance
calls for section
(SIM101)
62-62: Loop control variable i
not used within loop body
Rename unused i
to _i
(B007)
BetterMD/elements/symbol.py
8-8: ..typing.ATTRS
imported but unused
Remove unused import: ..typing.ATTRS
(F401)
36-36: Comparison to None
should be cond is None
Replace with cond is None
(E711)
38-38: Comparison to None
should be cond is None
Replace with cond is None
(E711)
40-40: Comparison to None
should be cond is None
Replace with cond is None
(E711)
49-49: Comparison to None
should be cond is None
Replace with cond is None
(E711)
51-51: Comparison to None
should be cond is None
Replace with cond is None
(E711)
53-53: Comparison to None
should be cond is None
Replace with cond is None
(E711)
117-117: f-string without any placeholders
Remove extraneous f
prefix
(F541)
BetterMD/parse/markdown/extensions/base.py
7-7: ...typing.ELEMENT
imported but unused
Remove unused import
(F401)
7-7: ...typing.TEXT
imported but unused
Remove unused import
(F401)
8-8: ..typing.OL_TYPE
imported but unused
Remove unused import
(F401)
8-8: ..typing.UL_TYPE
imported but unused
Remove unused import
(F401)
21-24: Combine if
branches using logical or
operator
Combine if
branches
(SIM114)
155-155: Loop control variable i
not used within loop body
(B007)
204-208: Combine if
branches using logical or
operator
Combine if
branches
(SIM114)
278-278: Loop control variable i
not used within loop body
(B007)
327-331: Combine if
branches using logical or
operator
Combine if
branches
(SIM114)
448-448: Comparison to None
should be cond is not None
Replace with cond is not None
(E711)
466-466: Comparison to None
should be cond is not None
Replace with cond is not None
(E711)
BetterMD/parse/markdown/extensions/s.py
491-491: SyntaxError: Expected index or slice expression
491-492: SyntaxError: Expected an indented block after elif
clause
BetterMD/elements/__init__.py
1-1: .symbol.Symbol
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
3-3: .a.A
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
4-4: .abbr.Abbr
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
5-5: .acronym.Acronym
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
6-6: .address.Address
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
7-7: .area.Area
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
8-8: .article.Article
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
9-9: .aside.Aside
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
10-10: .audio.Audio
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
12-12: .base.Base
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
13-13: .bd.BDI
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
13-13: .bd.BDO
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
14-14: .big.Big
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
15-15: .blockquote.Blockquote
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
16-16: .body.Body
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
17-17: .br.Br
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
18-18: .button.Button
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
20-20: .canvas.Canvas
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
21-21: .caption.Caption
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
22-22: .center.Center
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
23-23: .cite.Cite
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
24-24: .code.Code
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
25-25: .col.Col
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
25-25: .col.Colgroup
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
27-27: .d.DD
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
27-27: .d.DFN
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
27-27: .d.DL
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
27-27: .d.DT
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
28-28: .data.Data
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
29-29: .datalist.DataList
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
30-30: .del_.Del
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
31-31: .details.Details
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
32-32: .dialog.Dialog
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
33-33: .dir.Dir
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
34-34: .div.Div
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
36-36: .embed.Embed
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
38-38: .fencedframe.FencedFrame
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
39-39: .fieldset.Fieldset
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
40-40: .figure.FigCaption
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
40-40: .figure.Figure
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
41-41: .font.Font
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
42-42: .footer.Footer
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
43-43: .form.Form
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
44-44: .frame.Frame
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
45-45: .frameset.Frameset
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
47-47: .h.H1
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
47-47: .h.H2
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
47-47: .h.H3
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
47-47: .h.H4
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
47-47: .h.H5
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
47-47: .h.H6
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
48-48: .head.Head
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
49-49: .header.Header
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
50-50: .hgroup.HGroup
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
51-51: .hr.Hr
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
52-52: .html.HTML
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
55-55: .iframe.Iframe
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
56-56: .img.Img
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
57-57: .input.Input
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
58-58: .ins.Ins
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
60-60: .kbd.Kbd
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
62-62: .label.Label
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
63-63: .legend.Legend
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
64-64: .li.OL
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
64-64: .li.UL
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
64-64: .li.LI
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
65-65: .link.Link
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
67-67: .main.Main
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
68-68: .map.Map
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
69-69: .mark.Mark
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
70-70: .marquee.Marquee
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
71-71: .menu.Menu
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
72-72: .meta.Meta
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
73-73: .meter.Meter
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
75-75: .nav.Nav
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
76-76: .no.NoFrames
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
76-76: .no.NoScript
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
76-76: .no.NoBr
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
76-76: .no.NoEmbed
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
78-78: .object.Object
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
79-79: .output.Output
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
81-81: .p.P
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
82-82: .param.Param
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
83-83: .picture.Picture
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
84-84: .plaintext.Plaintext
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
85-85: .progress.Progress
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
87-87: .q.Q
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
89-89: .ruby.RB
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
89-89: .ruby.RP
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
89-89: .ruby.RT
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
89-89: .ruby.RTC
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
91-91: .s.S
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
92-92: .samp.Samp
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
93-93: .script.Script
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
94-94: .search.Search
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
95-95: .section.Section
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
96-96: .select.Select
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
97-97: .slot.Slot
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
98-98: .small.Small
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
99-99: .source.Source
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
100-100: .span.Span
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
101-101: .strike.Strike
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
102-102: .style.Style
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
103-103: .sub.Sub
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
104-104: .summary.Summary
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
105-105: .sup.Sup
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
107-107: .table.Table
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
107-107: .table.Tr
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
107-107: .table.Td
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
107-107: .table.Th
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
107-107: .table.THead
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
107-107: .table.TBody
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
107-107: .table.TFoot
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
108-108: .template.Template
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
109-109: .text_formatting.Strong
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
109-109: .text_formatting.Em
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
109-109: .text_formatting.Strong
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
109-109: Redefinition of unused I
from line 54
(F811)
109-109: .text_formatting.Em
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
110-110: .text.Text
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
111-111: .textarea.Textarea
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
112-112: .time.Time
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
113-113: .title.Title
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
114-114: .track.Track
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
115-115: .tt.TT
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
117-117: .u.U
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
119-119: .var.Var
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
120-120: .video.Video
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
122-122: .wbr.WBR
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
124-124: .xmp.XMP
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
BetterMD/parse/collection.py
25-28: Combine if
branches using logical or
operator
Combine if
branches
(SIM114)
BetterMD/parse/markdown/extensions/__init__.py
1-1: .base.BaseExtension
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
2-2: .extension.Extension
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
BetterMD/parse/typing.py
3-3: ..typing.ATTR_TYPES
imported but unused
Remove unused import: ..typing.ATTR_TYPES
(F401)
🔇 Additional comments (74)
BetterMD/elements/abbr.py (1)
1-2: Import looks correct.
The import statement correctly brings in the Symbol base class from the same directory, which is then used for inheritance in the Abbr class.
BetterMD/parse/typing.py (1)
16-18: Good use of runtime_checkable for Protocol definition
The @t.runtime_checkable decorator is appropriately used here, allowing for runtime verification of protocol implementation. This is particularly useful for a parser interface where different implementations might need to be validated at runtime.
BetterMD/elements/mark.py (1)
1-6: Class structure aligns with Symbol hierarchy pattern
The Mark class follows the standard pattern for Symbol subclasses in the codebase. It correctly represents the HTML <mark> element, which is used for highlighting text.
BetterMD/elements/center.py (1)
1-6: Class structure follows Symbol hierarchy pattern correctly
The Center class follows the established pattern for Symbol subclasses in the codebase. It represents the HTML <center> element used for centering content.
BetterMD/elements/hgroup.py (1)
1-6: Class structure aligns with Symbol hierarchy pattern
The HGroup class correctly follows the standard pattern for Symbol subclasses in the codebase. It represents the HTML <hgroup> element used for grouping heading elements.
BetterMD/elements/section.py (1)
1-6: Class structure follows Symbol hierarchy pattern
The Section class correctly implements the standard pattern for Symbol subclasses in the codebase. It properly represents the HTML <section> element used to define sections in a document.
BetterMD/elements/search.py (1)
4-6: ✅ Verification successful
Verify browser support for the search element
The <search> element is relatively new in HTML. Ensure that your conversion functionality handles browser compatibility issues or provides appropriate fallbacks for browsers that don't support this element.
🌐 Web query:
What is the current browser support for the HTML search element?
💡 Result:
The <search> HTML element, introduced to semantically mark search or filtering sections, has gained significant browser support as of late 2023. Here's the current compatibility landscape:
Desktop Browser Support
| Browser | Supported Versions | Notes |
|---|---|---|
| Chrome | 118+ (October 2023) | Includes Chromium-based Edge |
| Firefox | 118+ (October 2023) | Initial stable implementation |
| Safari | 17.0+ (September 2023) | macOS and iOS |
| Edge | 118+ (October 2023) | Chromium-based versions |
| Opera | 104+ (October 2023) | Based on Chromium support |
Mobile Browser Support
- Safari iOS: 17.0+ (September 2023)
- Chrome Android: 129+
- Firefox Android: 130+
- Samsung Internet: 25+
Key Details
- Accessibility: Properly mapped to the search landmark role in supported browsers, improving screen reader navigation[1][7].
- Legacy Exclusion: No support in Internet Explorer or pre-2023 browser versions[3][4].
- Polyfills: Not typically required for basic functionality, but developers must ensure semantic markup for unsupported browsers[1][7].
Implementation Example
<search>
  <form action="/search">
    <label for="query">Search:</label>
    <input type="search" id="query" name="q">
    <button type="submit">Go</button>
  </form>
</search>
This structure works in modern browsers while degrading gracefully in older ones[1][9].
Testing Recommendations
- Use BrowserStack Live to validate cross-browser behavior[2][8]
- Verify accessibility tree output in developer tools[1][7]
Support stabilized in late 2023 across major browsers, making it viable for production use with appropriate testing.
Citations:
- 1: https://www.scottohara.me/blog/2023/03/24/search-element.html
- 2: https://www.browserstack.com/guide/browser-compatibility-of-semantic-html
- 3: https://caniuse.com/mdn-html_elements_search
- 4: https://webreference.com/html/tags/search/
- 5: https://caniuse.com/?search=HTML
- 6: https://www.wufoo.com/html5/
- 7: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/search
- 8: https://www.browserstack.com/guide/html5-browser-compatibility-test
- 9: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/search
- 10: https://html.spec.whatwg.org
Review Update: Browser Compatibility for the <search> Element Confirmed
The <search> element now enjoys robust support in modern browsers, including Chrome 118+, Firefox 118+, Safari 17+, Edge (Chromium-based) and Opera 104+. This means that the current implementation in BetterMD/elements/search.py (lines 4-6) is well suited for projects targeting up-to-date environments.
- Reminder: Legacy browsers (e.g. Internet Explorer or pre-2023 versions) do not support <search>. If your user base includes such browsers, please ensure that your conversion functionality incorporates appropriate fallbacks or polyfills.
BetterMD/elements/summary.py (1)
1-2: Import looks correct.
The import statement correctly imports the Symbol class from the sibling module, following the project's typical pattern for element classes.
BetterMD/elements/canvas.py (1)
1-2: Import looks correct.
The import statement correctly imports the Symbol class from the sibling module, following the project's typical pattern for element classes.
BetterMD/elements/samp.py (1)
1-2: Import looks correct.
The import statement correctly imports the Symbol class from the sibling module, following the project's pattern.
BetterMD/elements/source.py (1)
1-2: Import looks correct.
The import statement correctly imports the Symbol class from the sibling module, following the project's pattern for element classes.
BetterMD/elements/details.py (1)
3-9
: Implementation looks goodThe
Details
class correctly maps to the HTML<details>
element with appropriate properties and event handling for the toggle functionality.BetterMD/elements/hr.py (2)
14-15
: Good addition of HTML attributes for the Hr elementThe new
prop_list
attribute correctly includes HTML attributes relevant to horizontal rules. It's worth noting that several of these attributes ("align", "color", "noshade") are considered obsolete in HTML5 but including them is sensible for backward compatibility and completeness.
19-19
: Consistent architectural change fromnl
toblock
Replacing the
nl
attribute withblock = True
appears to be part of a broader architectural refactoring across the codebase to improve how block-level elements are defined and rendered. This change aligns with similar modifications in other classes and provides a more semantic approach to element classification.TODO.md (1)
6-7
: Spelling and capitalisation corrections are accurateThe corrections to "HTML elements" (from "elelemts") and "PYPI" (from "Pypi") improve the clarity and accuracy of the TODO list.
BetterMD/elements/link.py (2)
3-8: Well structured class but potential issues with attribute names
The Link class appears well implemented with a comprehensive list of HTML attributes. However, there are a few concerns:
- The property list includes "as", which is a Python keyword. This could cause issues when accessing this property programmatically. Consider renaming this property or ensuring your implementation handles this special case.
- Some attributes like "charset" and "rev" are deprecated in HTML5. You might want to mark these as deprecated or include a note about their status.
- Consider adding a docstring to explain the purpose of this class and how it should be used in the context of format conversion.
4-4: ❓ Verification inconclusive
Verify proper handling of the "as" attribute
Since "as" is a Python keyword, it may cause issues when accessed as an attribute. Python won't allow direct attribute access like obj.as.
🏁 Script executed:
#!/bin/bash
# Check how the Symbol class handles attributes to ensure "as" is properly managed
# Look for attribute access implementations in the Symbol class
echo "Checking Symbol class implementation for attribute handling:"
rg -A 5 -B 5 "def __getattr__|def __setattr__|def __getattribute__" --glob "*/elements/symbol.py"
# Look for any usage of the "as" attribute in the codebase
echo "Checking for usage of 'as' attribute in the codebase:"
rg -A 2 -B 2 "\\.as[^a-zA-Z0-9_]" --glob "*.py"
Length of output: 402
Verify Reserved Keyword Handling of "as" in Link Elements
It appears that the search for custom attribute access in the Symbol class (e.g. implementations of __getattr__, __setattr__, or __getattribute__) returned no results. This suggests that reserved keywords like "as" in the prop_list for BetterMD/elements/link.py might not be handled via specialised getters or setters. Please manually verify that:
- Reserved keywords (e.g. "as") are either not accessed using dot notation or are safely managed via alternative access methods (such as dictionary-style access like obj["as"]).
- There are no unintended syntax issues or attribute conflicts arising from using "as" as a property name.
If a safe handling mechanism is not already in place, consider implementing custom attribute access or renaming properties to avoid conflicts with Python keywords.
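The keyword concern can be illustrated with a small sketch. The `Props` class below is hypothetical and is not the BetterMD `Symbol` class; it only shows three ways a property named "as" could be reached without writing `obj.as`, which is a syntax error. The trailing-underscore alias (`as_`) is an assumed convention, not something the PR defines.

```python
import keyword


class Props:
    """Hypothetical attribute holder; not the BetterMD Symbol class."""

    def __init__(self, **props):
        self._props = dict(props)

    def __getitem__(self, name):
        # Dictionary-style access works for any string key, keywords included.
        return self._props[name]

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        # Map `as_` to "as" (and likewise for any other Python keyword).
        stripped = name[:-1]
        key = stripped if name.endswith("_") and keyword.iskeyword(stripped) else name
        try:
            return self._props[key]
        except KeyError:
            raise AttributeError(name) from None


# Keyword-named properties have to be passed via dict unpacking.
link = Props(**{"as": "style", "href": "main.css"})

via_item = link["as"]            # dictionary-style access
via_getattr = getattr(link, "as")  # getattr accepts any string
via_alias = link.as_             # assumed trailing-underscore alias
```

Whether an alias, dictionary access, or a rename is preferable depends on how `prop_list` entries are consumed elsewhere in the codebase, which this sketch does not assume.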
BetterMD/elements/text_formatting.py (1)
20-23: Good implementation of the HTML bold element.
The new B class correctly implements the HTML <b> element, reusing the existing SMD() formatter and "**" RST syntax that's used for the Strong class. This makes sense as both represent bold text but have different semantic meanings in HTML.
BetterMD/elements/no.py (4)
1-6: Proper implementation of NoScript element.
The NoScript class correctly implements the HTML <noscript> element with empty strings for Markdown and RST representations, as these formats don't have direct equivalents.
8-11: Proper implementation of NoFrames element.
The NoFrames class correctly implements the HTML <noframes> element with appropriate HTML mapping and empty strings for formats without equivalents.
13-16: ✅ Verification successful
Proper implementation of NoBr element.
The NoBr class correctly implements the HTML <nobr> element. Note that this element is non-standard and deprecated in HTML5.
🌐 Web query:
Is the HTML nobr element still supported in modern browsers?
💡 Result:
The <nobr> HTML element, though technically still functional in most modern browsers, is considered obsolete and non-standard. Here's a concise breakdown:
Current browser support status
Most modern browsers (Chrome, Firefox, Safari, Edge) retain support for <nobr> due to backward compatibility requirements[8], but all major documentation sources explicitly warn against using it.
Key considerations:
- Deprecated in HTML specifications and never standardized[8]
- Considered "obsolete" by MDN Web Docs and W3C standards[7][8]
- Browser compatibility tables show full support but with explicit deprecation warnings[8]
- Creates accessibility issues by forcing horizontal scrolling[2][7]
Modern implementation recommendation:
<span style="white-space: nowrap;">Your unbroken text</span>
This CSS-based approach achieves the same effect while maintaining standards compliance[7][8]. The white-space: nowrap property is supported in all modern browsers without compatibility concerns[8].
Citations:
- 1: https://www.tutorialspoint.com/html/html_nobr_tag.htm
- 2: https://html.com/tags/nobr/
- 3: https://www.sitepoint.com/community/t/deprecated-elements-or-not/28016
- 4: http://mvark.blogspot.com/2012/07/nobr-non-standard-html-tag-that-all.html
- 5: https://caniuse.com/mdn-html_elements_nobr
- 6: https://support.google.com/blogger/thread/175300342/nobr-tags
- 7: https://udn.realityripple.com/docs/Web/HTML/Element/nobr
- 8: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/nobr
- 9: https://developer.mozilla.org/en-US/docs/Web/HTML/Element
- 10: https://caniuse.com
NoBr Element Implementation Confirmation
The NoBr class in BetterMD/elements/no.py (lines 13–16) correctly implements the HTML <nobr> element. As verified, modern browsers continue to support <nobr> for backward compatibility; however, it remains non-standard and deprecated in HTML5. Developers might consider utilising CSS (e.g. <span style="white-space: nowrap;">...</span>) for a more modern and compliant approach.
18-21: ✅ Verification successful
Proper implementation of NoEmbed element.
The NoEmbed class correctly implements the HTML <noembed> element. Worth noting that this element is also considered obsolete in HTML5.
🌐 Web query:
Is the HTML noembed element still supported in modern browsers?
💡 Result:
The HTML <noembed> element is obsolete and not supported in modern web standards. While some browsers might still recognize it due to backward compatibility, its use is strongly discouraged and considered deprecated since HTML 4.01[1][4][7].
Key points:
- Deprecated status: <noembed> was replaced by standard fallback methods using <object> tags. Modern HTML5 explicitly avoids it[2][3][7].
- Non-standard feature: Never part of official HTML specifications, making it unreliable for cross-browser compatibility[1][3][7].
- Modern alternatives: Use <object> with nested fallback content, or CSS/JavaScript-based detection for unsupported content[1][4].
- Browser behavior: Most browsers still render <noembed> content if present, but this should not be interpreted as active support; it reflects legacy code handling rather than intentional compatibility[1][7].
Example replacement:
<object data="video.mp4" type="video/mp4"> <p>Your browser doesn’t support this video format.</p> </object>
For embedding modern multimedia, use standardized elements like <video>, <audio>, or <iframe> instead of <embed>/<noembed>[8]. Always prioritize W3C-validated approaches for long-term compatibility.
Citations:
- 1: https://udn.realityripple.com/docs/Web/HTML/Element/noembed
- 2: https://www.tutorialspoint.com/html/html_noembed_tag.htm
- 3: https://codeproject.com/ref/html/element%2Fnoembed
- 4: https://html.com/tags/noembed/
- 5: https://developer.mozilla.org/en-US/docs/Web/HTML/Element
- 6: https://caniuse.com
- 7: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/noembed
- 8: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/embed
- 9: https://github.com/mdn/content/blob/main/files/en-us/web/html/element/noembed/index.md?plain=1
- 10: https://caniuse.com/mdn-html_elements_noembed
Refined Implementation of Obsolete Element – No Further Changes Required
The NoEmbed class correctly maps to the <noembed> HTML element. Please note that, as verified, the <noembed> element is now considered obsolete and is not supported by modern web standards. If retaining its use for legacy reasons is intentional, it would be beneficial to include a comment in the code explaining this decision. Otherwise, consider modern alternatives (e.g. using an <object> tag with fallback content).
- File: BetterMD/elements/no.py (Lines 18-21)
- Implementation: Correct representation of the obsolescent <noembed> element.
- Note: <noembed> is deprecated and not recommended for new projects.
BetterMD/utils.py (1)
3-4: Good use of conditional import for type checking.
The conditional import prevents circular dependencies whilst ensuring proper type checking.
BetterMD/elements/select.py (3)
3-8: Good implementation of Select HTML element.
The Select class correctly implements the HTML <select> element with appropriate properties listed. The empty strings for Markdown and RST representations make sense as form elements don't have direct equivalents in these formats.
10-15: Good implementation of Option HTML element.
The Option class correctly implements the HTML <option> element with the appropriate properties.
17-22: Good implementation of Optgroup HTML element.
The Optgroup class correctly implements the HTML <optgroup> element with appropriate properties for grouping options in a select dropdown.
BetterMD/elements/li.py (3)
34-34: Property list for list item looks good.
The addition of the prop_list attribute with ["value", "type"] properly defines the standard HTML attributes for list items, enabling proper parsing and serialization of these properties.
41-41: Property list for ordered list looks good.
The prop_list with ["reversed", "start", "type"] correctly includes all the standard HTML attributes for ordered lists, supporting proper serialization of OL elements.
47-48: Property list for unordered list looks good.
The prop_list containing ["compact", "type"] appropriately defines the HTML attributes for unordered lists, enabling proper handling of UL properties during parsing and rendering.
BetterMD/__init__.py (3)
2-2: Good addition of explicit Symbol import.
This addresses the previous review comment about Symbol being undefined in parsing functions. Explicit imports make dependencies clearer and prevent potential issues if the import structure changes.
5-13: New HTML class provides a cleaner API.
The static methods for parsing HTML from different sources provide a nice abstraction and make the library more user-friendly.
21-24: New MD class provides a clean parsing interface.
The MD class with its static method for parsing markdown strings creates a consistent API alongside the HTML class.
BetterMD/parse/collection.py (2)
8-12: Collection class initialization looks good.
The Collection class provides a useful abstraction for managing symbols. The initialization is clean and includes proper logging setup.
13-17: Symbol management methods are well-implemented.
The methods for adding and removing symbols from the collection are straightforward and effective.
BetterMD/parse/markdown/typing.py (2)
7-10: Clarify the return type of the handler field.
Using None | t.NoReturn as a return type for the handler can be confusing. If the handler may exit the program or raise an exception, consider either limiting this to a single return type or adding clear documentation indicating that this function might not return normally.
Do you want to confirm how these handlers are actually used throughout the codebase to ensure the correct type annotation?
44-45: Ensure correct usage of parent references.
The OL_TYPE and UL_TYPE TypedDicts define a parent as dict[str, LIST_TYPE]. Confirm that all references to parent are constructed properly at runtime, since any mismatch may cause key errors or type inconsistencies.
Would you like a script to verify how the parent dictionary is built and accessed?
BetterMD/elements/input.py (3)
9-9: Retaining a blank space for unchecked checkboxes is sensible.
The Markdown representation now includes a space instead of an empty string when the checkbox is not checked. This change helps preserve layout consistency.
15-15: Consistent logic for reStructuredText checkboxes.
Likewise, returning a space for unchecked checkboxes aligns the output with the Markdown version.
58-58: Confirm the change from a custom HTML object to the literal string "input".
Previously, attributes or methods in CustomHTML could handle complex rendering logic for inputs. Replacing it with a string literal might reduce extensibility.
Do you want to confirm that this new approach aligns with the rest of the codebase, especially if there were functionalities in CustomHTML?
BetterMD/elements/style.py (1)
24-25: Style verification logic appears sound.
Checking for "style" in lowercase as the canonical trigger is a straightforward approach and likely meets your needs.
BetterMD/elements/table.py (20)
36-51: Good job collecting column widths in a separate pass.
This two-pass approach ensures that column widths are determined in advance, leading to consistent table formatting in RST output.
🧰 Tools
🪛 Ruff (0.8.2)
41-41: Multiple isinstance calls for section, merge into a single call
Merge isinstance calls for section
(SIM101)
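The SIM101 hint refers to the fact that `isinstance` accepts a tuple of classes. A minimal sketch of the before and after, using stand-in classes rather than the real `table.py` symbols (which classes the original check tests is an assumption here):

```python
# Stand-in classes for illustration only; not the BetterMD definitions.
class THead: ...
class TBody: ...
class Tr: ...


def is_section_chained(section) -> bool:
    # The pattern Ruff flags: two isinstance calls on the same object.
    return isinstance(section, THead) or isinstance(section, TBody)


def is_section_merged(section) -> bool:
    # Equivalent check with a single call and a tuple of types.
    return isinstance(section, (THead, TBody))
```

Both functions return the same result for every input; the merged form is simply shorter and evaluates the argument once.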
63-72: Well-structured RST row generation.
This pass effectively ensures each row is aligned to the computed column widths and handles headers with = separators for clarity.
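The two-pass idea described in these comments can be sketched as follows. This is a minimal illustration of the technique, not the PR's `TableRST` code: the function name `render_rst_grid_table`, the plain list-of-strings input, and the choice of the RST grid-table layout are all assumptions.

```python
def render_rst_grid_table(header, rows):
    """Render a simple RST grid table from lists of cell strings."""
    all_rows = [header, *rows]

    # Pass 1: find the widest cell in each column.
    widths = [max(len(row[i]) for row in all_rows) for i in range(len(header))]

    def border(fill):
        # One padding character on each side of every cell.
        return "+" + "+".join(fill * (w + 2) for w in widths) + "+"

    def line(cells):
        return "|" + "|".join(f" {c.ljust(w)} " for c, w in zip(cells, widths)) + "|"

    # Pass 2: emit rows padded to the precomputed widths.
    out = [border("-"), line(header), border("=")]  # "=" marks the header row
    for row in rows:
        out.append(line(row))
        out.append(border("-"))
    return "\n".join(out)


table = render_rst_grid_table(["Name", "Qty"], [["apple", "3"], ["kiwi", "12"]])
```

Computing the widths before emitting anything is what keeps every border and row the same length, which RST requires for a grid table to parse.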
110-111: Appropriate stub for RST table headers.
Delegating the logic to TableRST keeps header rendering consistent across the codebase.
113-123: Simple body handling looks good.
This straightforward approach concatenates all row markdown logic. If needed, consider spacing or additional delimiters between rows.
124-128: Concise row formatting in Markdown.
Using | separators for cells is standard in Markdown table rows.
132-134: RST row rendering delegated to TableRST.
This stub properly defers row-level logic to the overarching table class for consistent RST generation.
137-138: Cell content joined with whitespace.
This merges multiple child elements into a single string. If preserving newlines is needed, consider a more advanced approach.
145-146: Fallback approach improves robustness.
By returning a joined string instead of raising an error, the function gracefully handles multiple or non-text elements in RST cells.
149-151: Header cell RST conversion.
This simple approach aggregates child elements. If advanced styling for headers is required, consider applying distinctive markup.
154-156: Delegation to TableRST.
As with THeadRST, TBodyRST defers all rendering logic for table bodies, promoting consistency.
159-161: Explicitly enumerating deprecated properties.
Documenting these attributes clarifies which features are discouraged or intended for backward compatibility.
165-165: Block-level table.
Marking the table element as block aligns with typical HTML semantics for tables.
170-170: RST rendering set to TrRST.
Associating the Tr symbol with its related RST class preserves correct row formatting in RST.
173-175: Merging legacy and standard properties.
Combining contemporary attributes with deprecated ones helps retain backwards compatibility for table cells.
179-179: RST cell class association.
Linking Td with TdRST ensures correct generation for table cells in RST.
183-185: Extending Th properties.
Supporting both current and deprecated properties in Th broadens attribute coverage.
188-190: Verify if the Markdown rendering for Th should differ from Td.
Presently, Th uses TdMD for its Markdown representation, which might be fine if header and regular cells share identical syntax. Otherwise, consider adding a dedicated ThMD class.
191-198: THead symbol setup.
Providing both MD and RST classes ensures consistent and complete table header rendering across multiple formats.
199-206: TBody symbol definition.
Linking TBody to its specialised classes supports consistent table body rendering in both Markdown and RST.
207-213: Consider a distinct class for TFoot.
TFoot is currently reusing TBodyMD and TBodyRST. If footers require special handling or formatting, you may want separate classes.
BetterMD/parse/markdown/parser.py (9)
1-13: Typed dictionaries and annotation.
Defining dictionaries for top_level_tags and text_tags clarifies parser architecture and helps ensure correctness.
14-21: Flexible extension registration.
The ability to add and remove parser extensions fosters a pluggable, extensible approach to Markdown parsing.
28-38: self.exts might require resetting.
Each call to refresh_extensions appends newly created extension instances to self.exts, which could accumulate duplicates if invoked repeatedly. Verify whether self.exts should be cleared first.
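A minimal sketch of the concern and one possible remedy, assuming a parser that keeps registered extension classes in a class-level list and instances in `self.exts`. The `Extension` and `Parser` classes here are simplified stand-ins; only the names `exts` and `refresh_extensions` come from the review comment.

```python
class Extension:
    """Stand-in extension that just remembers its parser."""

    def __init__(self, parser):
        self.parser = parser


class Parser:
    # Registered extension classes (assumed to be a class-level registry).
    extensions = [Extension]

    def __init__(self):
        self.exts = []
        self.refresh_extensions()

    def refresh_extensions(self):
        # Rebuild the list instead of appending to it, so calling this
        # method repeatedly never leaves stale or duplicate instances.
        self.exts = [ext(self) for ext in self.extensions]


parser = Parser()
parser.refresh_extensions()
parser.refresh_extensions()
instance_count = len(parser.exts)
```

Rebinding `self.exts` to a fresh list is equivalent to calling `self.exts.clear()` first, unless other objects hold a reference to the old list; which of the two is appropriate depends on the real parser.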
39-42: Initialisation with a fresh extension list.
The constructor sets up the parser with an empty extension list and then invokes reset(), which is a sensible initial approach.
78-89: Ending a block with optional parsing.
Appending buffered text to the DOM only when parse is True adds flexibility, and the final cleanup logic is concise.
90-94: Block management.
Ending the prior block before starting a new one prevents leftover buffered text from leaking into subsequent blocks.
97-103: Buffer-based text handling.
Using self.buffer to accumulate text lines is straightforward and helps with paragraph separation later in the parsing process.
104-135: Incremental parsing approach.
Testing each line against registered block patterns is flexible. For very large markdown inputs, consider advanced techniques if performance becomes an issue.
188-195: from_file method streamlines file-based parsing.
Reading file contents and constructing a minimal HTML structure (<head> and <body>) is convenient. If you need more robust file handling, consider adding exception handling for file operations.
BetterMD/parse/markdown/extensions/extension.py (5)
1-3: Abstract base for Markdown extensions.
Importing ABC and abstractmethod supports a clear contract for building extensions.
4-7: Reducing circular imports using TYPE_CHECKING.
Referencing types only during type checking helps avoid import cycles while retaining type hints in the editor or at compile time.
38-53: Exposing parser state via properties.
These properties allow extensions to manipulate parser-level statuses (block, parsing) in a clean, object-oriented manner.
54-68: Delegation methods.
These one-line wrappers ensure that extensions can create elements, handle text, and manage blocks without duplicating logic.
69-74: Delegated parsing calls.
Empowering extensions to reuse the parser’s functionality can be helpful for advanced or nested parsing scenarios.
class ELEMENT(t.TypedDict):
    type: 't.Literal["element"]'
    name: 'str'
    attributes: 'ATTRS'
    children: 'list[t.Union[ELEMENT, TEXT]]'
🛠️ Refactor suggestion
Fix inconsistent type annotations in ELEMENT TypedDict
The type annotations in the ELEMENT class are inconsistently using string literals, while the TEXT class uses direct typing references. This inconsistency makes the code harder to maintain and may impact type checking effectiveness.
class ELEMENT(t.TypedDict):
- type: 't.Literal["element"]'
- name: 'str'
- attributes: 'ATTRS'
- children: 'list[t.Union[ELEMENT, TEXT]]'
+ type: t.Literal["element"]
+ name: str
+ attributes: ATTRS
+ children: list[t.Union[ELEMENT, TEXT]]
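This change will make the type annotations consistent throughout the file and improve static type checking. One caveat: children refers to ELEMENT inside its own class body, so on Python versions that evaluate annotations eagerly (before 3.14) that annotation must remain a string, or the module must use from __future__ import annotations; otherwise defining the class raises NameError. The same applies to ATTRS and TEXT if they are defined later in the module.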
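A sketch of how the unquoted style can coexist with the recursive reference. The definitions of `ATTRS` and `TEXT` below are assumptions made so the example is self-contained (the PR's actual definitions are not shown in this thread); the point is only that the self-referential `children` annotation stays a forward-reference string.

```python
import typing as t

# Assumed shape for illustration; the PR's ATTRS may differ.
ATTRS = dict[str, str]


class TEXT(t.TypedDict):
    type: t.Literal["text"]
    content: str  # assumed field name


class ELEMENT(t.TypedDict):
    type: t.Literal["element"]
    name: str
    attributes: ATTRS
    # ELEMENT is not bound yet while its own body runs, so this one
    # annotation is kept as a string (a forward reference).
    children: "list[t.Union[ELEMENT, TEXT]]"


tree: ELEMENT = {
    "type": "element",
    "name": "p",
    "attributes": {"class": "intro"},
    "children": [{"type": "text", "content": "hello"}],
}
```

Adding `from __future__ import annotations` at the top of the module would allow the last annotation to be unquoted as well, at the cost of making every annotation in the file lazily evaluated.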
md = ""
html = "big"
rst = ""
🛠️ Refactor suggestion
Consider implementing a conversion strategy for the deprecated element
Since <big> is deprecated, you should implement conversion methods that use modern alternatives when converting to HTML, such as wrapping the content in a span with appropriate CSS.
For Markdown, you might want to use syntax like **bold** or custom formatting to represent larger text.
from .symbol import Symbol

class Big(Symbol):
    md = ""
    html = "big"
    rst = ""
Consider adding a deprecation warning for the Big element
The <big> element is deprecated in HTML5. Consider adding a deprecation warning in the docstring and suggesting CSS alternatives for increasing text size.
from .symbol import Symbol
class Big(Symbol):
+ """
+ Represents an HTML <big> element.
+
+ WARNING: The <big> element is deprecated in HTML5. Consider using CSS alternatives
+ like 'font-size' instead.
+
+ There is no direct equivalent in Markdown or reStructuredText.
+ """
md = ""
html = "big"
rst = ""
md = ""
html = "del"
rst = ""
🛠️ Refactor suggestion
Consider implementing Markdown and reStructuredText equivalents
The md and rst attributes are empty strings, suggesting that their representations aren't implemented yet. Markdown typically uses ~~strikethrough~~ for deleted text. Consider adding appropriate representations for both formats.
prop_list = ["cite", "datetime"]
- md = ""
+ md = "~~"
html = "del"
- rst = ""
+ rst = ":del:`text`" # Custom role approach for RST
BetterMD/__init__.py
Outdated
from .elements import Symbol
from .parse import Collection, HTMLParser, MDParser
Unused imports detected.
The static analysis indicates that Collection, HTMLParser, and MDParser are imported but not used in this file. Consider removing these imports or making their usage explicit.
-from .parse import Collection, HTMLParser, MDParser
+from .parse import Collection # Remove HTMLParser, MDParser if they're not used
🧰 Tools
🪛 Ruff (0.8.2)
3-3: .parse.Collection imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: .parse.HTMLParser imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: .parse.MDParser imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
BetterMD/elements/__init__.py
Outdated
from .embed import Embed | ||
|
||
from .fencedframe import FencedFrame | ||
from .fieldset import Fieldset | ||
from .figure import FigCaption, Figure | ||
from .font import Font | ||
from .footer import Footer | ||
from .form import Form | ||
from .frame import Frame | ||
from .frameset import Frameset | ||
|
||
from .h import H1,H2,H3,H4,H5,H6 | ||
from .head import Head | ||
from .header import Header | ||
from .hgroup import HGroup | ||
from .hr import Hr | ||
from .html import HTML | ||
|
||
from .i import I | ||
from .iframe import Iframe | ||
from .img import Img | ||
from .input import Input | ||
from .ins import Ins | ||
|
||
from .kbd import Kbd | ||
|
||
from .label import Label | ||
from .legend import Legend | ||
from .li import OL, UL, LI | ||
from .text import Text | ||
from .div import Div | ||
from .link import Link | ||
|
||
from .main import Main | ||
from .map import Map | ||
from .mark import Mark | ||
from .marquee import Marquee | ||
from .menu import Menu | ||
from .meta import Meta | ||
from .meter import Meter | ||
|
||
from .nav import Nav | ||
from .no import NoFrames, NoScript, NoBr, NoEmbed | ||
|
||
from .object import Object | ||
from .output import Output | ||
|
||
from .p import P | ||
from .param import Param | ||
from .picture import Picture | ||
from .plaintext import Plaintext | ||
from .progress import Progress | ||
|
||
from .q import Q | ||
|
||
from .ruby import RB, RP, RT, RTC | ||
|
||
from .s import S | ||
from .samp import Samp | ||
from .script import Script | ||
from .search import Search | ||
from .section import Section | ||
from .select import Select | ||
from .slot import Slot | ||
from .small import Small | ||
from .source import Source | ||
from .span import Span | ||
from .img import Img | ||
from .text_formatting import Strong, Em, Code | ||
from .br import Br | ||
from .blockquote import Blockquote | ||
from .hr import Hr | ||
from .table import Table, Tr, Td, Th | ||
from .input import Input | ||
from .code import Code No newline at end of file | ||
from .strike import Strike | ||
from .style import Style | ||
from .sub import Sub | ||
from .summary import Summary | ||
from .sup import Sup | ||
|
||
from .table import Table, Tr, Td, Th, THead, TBody, TFoot | ||
from .template import Template | ||
from .text_formatting import Strong, Em, Strong as B, Em as I | ||
from .text import Text | ||
from .textarea import Textarea | ||
from .time import Time | ||
from .title import Title | ||
from .track import Track | ||
from .tt import TT | ||
|
||
from .u import U | ||
|
||
from .var import Var | ||
from .video import Video | ||
|
||
from .wbr import WBR | ||
|
||
from .xmp import XMP |
💡 Verification agent
🧩 Analysis chain
Clean up unused imports or add them to __all__.
Many of these imported symbols (e.g. Symbol, A, Abbr, etc.) are not referenced within this file. If they are intended for external usage, consider adding them to __all__; otherwise, remove them to avoid clutter.
Feel free to add all relevant symbols and refine the search patterns as needed.
🏁 Script executed:
#!/bin/bash
# Description: Verify if any of these imported symbols are actually referenced outside __init__.py.
fd -t f -e py | while read -r filepath; do
rg -A 2 "Symbol" "$filepath"
rg -A 2 "Abbr" "$filepath"
rg -A 2 "Acronym" "$filepath"
# ...and so on for each imported symbol.
done
Length of output: 38143
Action Required: Explicit Public API Definition via __all__ in BetterMD/elements/__init__.py
Our investigation indicates that the symbols imported in this file (e.g. Symbol, A, Abbr, Acronym, etc.) are indeed defined in their respective modules and are used externally (as confirmed by their appearances in various parts of the codebase). Since the primary role of this __init__.py is to re-export these elements for external consumption, it is advisable to define an explicit __all__ list. This not only documents the public API more clearly but also prevents unexpected symbols from being imported (especially via wildcard imports).
- If these symbols are intended to be public: Please add them to the __all__ list to make the public interface explicit, e.g.
__all__ = [
    "Symbol", "A", "Abbr", "Acronym", "Address", "Area", "Article", "Aside", "Audio",
    # … and so on for the remaining symbols
]
- If some symbols are not meant for public exposure: Consider removing the corresponding import from __init__.py to reduce clutter.
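The effect of `__all__` on wildcard imports can be seen in a small self-contained sketch. It builds a throwaway in-memory module named `demo_elements` (a made-up name, standing in for a package `__init__.py`) rather than touching BetterMD itself.

```python
import sys
import types

# Build a throwaway module in memory so the example needs no files on disk.
demo = types.ModuleType("demo_elements")
exec(
    "import os\n"                   # an incidental import we do not want re-exported
    "class Symbol: pass\n"
    "class A(Symbol): pass\n"
    "class _Helper: pass\n"
    "__all__ = ['Symbol', 'A']\n",  # the explicit public API
    demo.__dict__,
)
sys.modules["demo_elements"] = demo

# Simulate `from demo_elements import *` in a fresh namespace.
namespace = {}
exec("from demo_elements import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
```

Without the `__all__` line, the wildcard import would also bring in `os`, since every public (non-underscore) name in the module is exported by default. Listing names in `__all__` is also one of the remedies Ruff's F401 message itself suggests for re-exporting modules.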
🧰 Tools
🪛 Ruff (0.8.2)
1-1: .symbol.Symbol imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: .a.A imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
4-4: .abbr.Abbr imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
5-5: .acronym.Acronym imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
6-6: .address.Address imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
7-7: .area.Area imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
8-8: .article.Article imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
9-9: .aside.Aside imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
10-10: .audio.Audio imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
12-12: .base.Base imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
13-13: .bd.BDI imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
13-13: .bd.BDO imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
14-14: .big.Big imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
15-15: .blockquote.Blockquote imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
16-16: .body.Body imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
17-17: .br.Br imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
18-18: .button.Button imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
20-20: .canvas.Canvas imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
21-21: .caption.Caption imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
22-22: .center.Center imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
23-23: .cite.Cite imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
24-24: .code.Code imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
25-25: .col.Col imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
25-25: .col.Colgroup imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
27-27: .d.DD imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
27-27: .d.DFN imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
27-27: .d.DL imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
27-27: .d.DT imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
28-28: .data.Data imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
29-29: .datalist.DataList imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
30-30: .del_.Del imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
31-31: .details.Details imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
32-32: .dialog.Dialog imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
33-33: .dir.Dir imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
34-34: .div.Div imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
36-36: .embed.Embed imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
38-38: .fencedframe.FencedFrame imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
39-39: .fieldset.Fieldset imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
40-40: .figure.FigCaption imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
40-40: .figure.Figure imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
41-41: .font.Font imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
42-42: .footer.Footer imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
43-43: .form.Form imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
44-44: .frame.Frame imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
45-45: .frameset.Frameset imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
47-47: .h.H1 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
47-47: .h.H2 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
47-47: .h.H3 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
47-47: .h.H4 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
47-47: .h.H5 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
47-47: .h.H6 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
48-48: .head.Head imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
49-49: .header.Header imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
50-50: .hgroup.HGroup imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
51-51: .hr.Hr imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
52-52: .html.HTML imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
55-55: .iframe.Iframe imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
56-56: .img.Img imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
57-57: .input.Input imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
58-58: .ins.Ins imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
60-60: .kbd.Kbd imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
62-62: .label.Label imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
63-63: .legend.Legend imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
64-64: .li.OL imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
64-64: .li.UL imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
64-64: .li.LI imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
65-65: .link.Link imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
67-67: .main.Main imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
68-68: .map.Map imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
69-69: .mark.Mark imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
70-70: .marquee.Marquee imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
71-71: .menu.Menu imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
72-72: .meta.Meta imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
73-73: .meter.Meter imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
75-75: .nav.Nav imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
76-76: .no.NoFrames imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
76-76: .no.NoScript imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
76-76: .no.NoBr imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
76-76: .no.NoEmbed imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
78-78: .object.Object imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
79-79: .output.Output imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
81-81: .p.P imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
82-82: .param.Param imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
83-83: .picture.Picture imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
84-84: .plaintext.Plaintext imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
85-85: .progress.Progress imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
87-87: .q.Q imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
89-89: .ruby.RB imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
89-89: .ruby.RP imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
89-89: .ruby.RT imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
89-89: .ruby.RTC imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
91-91: .s.S imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
92-92: .samp.Samp imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
93-93: .script.Script imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
94-94: .search.Search imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
95-95: .section.Section imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
96-96: .select.Select imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
97-97: .slot.Slot imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
98-98: .small.Small imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
99-99: .source.Source imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
100-100: .span.Span imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
101-101: .strike.Strike imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
102-102: .style.Style imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
103-103: .sub.Sub imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
104-104: .summary.Summary imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
105-105: .sup.Sup imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
107-107: .table.Table imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
107-107: .table.Tr imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
107-107: .table.Td imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
107-107: .table.Th imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
107-107: .table.THead imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
107-107: .table.TBody imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
107-107: .table.TFoot imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
108-108: .template.Template imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
109-109: .text_formatting.Strong imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
109-109: .text_formatting.Em imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
109-109: .text_formatting.Strong imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
109-109: Redefinition of unused I from line 54 (F811)
109-109: .text_formatting.Em imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
110-110: .text.Text imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
111-111: .textarea.Textarea imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
112-112: .time.Time imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
113-113: .title.Title imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
114-114: .track.Track imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
115-115: .tt.TT imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
117-117: .u.U imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
119-119: .var.Var imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
120-120: .video.Video imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
122-122: .wbr.WBR imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
124-124: .xmp.XMP imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
return self.create_element(
    "li",
    {"class": ["list", f"list-{item["data"][""]}"]},
Possible invalid key usage in dictionary access.
The string interpolation f"list-{item["data"][""]}" suggests you are trying to access the "" key in item["data"], which is likely to cause a KeyError. If you intend to reference the list type, consider using item["data"]["list"] or a similar valid key.
Apply a fix similar to:
- {"class": ["list", f"list-{item["data"][""]}"]}
+ {"class": ["list", f"list-{item["data"]["list"]}"]}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- {"class": ["list", f"list-{item["data"][""]}"]},
+ {"class": ["list", f"list-{item["data"]["list"]}"]},
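Separately from the key name, reusing double quotes inside a double-quoted f-string is a syntax error on Python versions before 3.12. A hedged sketch that sidesteps both problems is to look the value up first and format it afterwards. The function name list_item_classes, the "list" key (taken from the suggestion above), and the "unknown" fallback are illustrative assumptions, not BetterMD's actual API.

```python
def list_item_classes(item: dict) -> list[str]:
    """Build the CSS class list for a list item without nested f-string quotes."""
    # "list" is the key proposed in the review; the real key may differ.
    # .get() avoids a KeyError when "data" or "list" is missing.
    list_type = item.get("data", {}).get("list", "unknown")
    return ["list", f"list-{list_type}"]


print(list_item_classes({"data": {"list": "ordered"}}))  # ['list', 'list-ordered']
print(list_item_classes({"data": {}}))                   # ['list', 'list-unknown']
```

Whether a silent fallback or a raised error is preferable depends on how the parser is expected to treat malformed items.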
Actionable comments posted: 1
♻️ Duplicate comments (2)
BetterMD/elements/__init__.py (1)
1-125: 🛠️ Refactor suggestion
Define an explicit __all__ list for better API documentation.
This file imports a large number of symbols from various modules but doesn't define an __all__ list. This makes it unclear which symbols are intended to be part of the public API. Consider adding an __all__ list to explicitly document the public API.
 from .symbol import Symbol
 from .text_formatting import Strong, Em, B
 # ... existing imports ...
 from .xmp import XMP
+__all__ = [
+    "Symbol", "A", "Abbr", "Acronym", "Address", "Area", "Article", "Aside", "Audio",
+    "Base", "BDI", "BDO", "Big", "Blockquote", "Body", "Br", "Button",
+    "Canvas", "Caption", "Center", "Cite", "Code", "Col", "Colgroup",
+    "DD", "DFN", "DL", "DT", "Data", "DataList", "Del", "Details", "Dialog", "Dir", "Div",
+    "Embed", "FencedFrame", "Fieldset", "FigCaption", "Figure", "Font", "Footer", "Form", "Frame", "Frameset",
+    "H1", "H2", "H3", "H4", "H5", "H6", "Head", "Header", "HGroup", "Hr", "HTML", "I", "Iframe", "Img", "Input", "Ins",
+    "Kbd", "Label", "Legend", "LI", "Link", "Main", "Map", "Mark", "Marquee", "Menu", "Meta", "Meter",
+    "Nav", "NoFrames", "NoScript", "NoBr", "NoEmbed", "Object", "OL", "Output",
+    "P", "Param", "Picture", "Plaintext", "Progress", "Q", "RB", "RP", "RT", "RTC",
+    "S", "Samp", "Script", "Search", "Section", "Select", "Slot", "Small", "Source", "Span", "Strike", "Strong", "Style", "Sub", "Summary", "Sup",
+    "Table", "Tr", "Td", "Th", "THead", "TBody", "TFoot", "Template", "Text", "Textarea", "Time", "Title", "Track", "TT",
+    "U", "UL", "Var", "Video", "WBR", "XMP",
+    "B", "Em"
+]
🧰 Tools
🪛 Ruff (0.8.2)
1-1: .symbol.Symbol imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: .text_formatting.Strong imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: .text_formatting.Em imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: .text_formatting.B imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
5-5: .a.A imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
6-6: .abbr.Abbr imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
7-7: .acronym.Acronym imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
8-8: .address.Address imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
9-9: .area.Area imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
10-10: .article.Article imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
11-11: .aside.Aside imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
12-12: .audio.Audio imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
14-14: .base.Base imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
15-15: .bd.BDI imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
15-15: .bd.BDO imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
16-16: .big.Big imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
17-17: .blockquote.Blockquote imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
18-18: .body.Body imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
19-19: .br.Br imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
20-20: .button.Button imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
22-22: .canvas.Canvas imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
23-23: .caption.Caption imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
24-24: .center.Center imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
25-25: .cite.Cite imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
26-26: .code.Code imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
27-27: .col.Col imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
27-27: .col.Colgroup imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
29-29: .d.DD imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
29-29: .d.DFN imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
29-29: .d.DL imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
29-29: .d.DT imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
30-30: .data.Data imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
31-31: .datalist.DataList imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
32-32: .del_.Del imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
33-33: .details.Details imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
34-34: .dialog.Dialog imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
35-35: .dir.Dir imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
36-36: .div.Div imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
38-38: .embed.Embed imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
40-40: .fencedframe.FencedFrame imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
41-41: .fieldset.Fieldset imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
42-42: .figure.FigCaption imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
42-42: .figure.Figure imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
43-43: .font.Font imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
44-44: .footer.Footer imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
45-45: .form.Form imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
46-46: .frame.Frame imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
47-47: .frameset.Frameset imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
49-49: .h.H1 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
49-49: .h.H2 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
49-49: .h.H3 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
49-49: .h.H4 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
49-49: .h.H5 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
49-49: .h.H6 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
50-50: .head.Head imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
51-51: .header.Header imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
52-52: .hgroup.HGroup imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
53-53: .hr.Hr imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
54-54: .html.HTML imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
56-56: .i.I imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
57-57: .iframe.Iframe imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
58-58: .img.Img imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
59-59: .input.Input imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
60-60: .ins.Ins imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
62-62: .kbd.Kbd imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
64-64: .label.Label imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
65-65: .legend.Legend imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
66-66: .li.OL imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
66-66: .li.UL imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
66-66: .li.LI imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
67-67: .link.Link imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
69-69: .main.Main imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
70-70: .map.Map imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
71-71: .mark.Mark imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
72-72: .marquee.Marquee imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
73-73: .menu.Menu imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
74-74: .meta.Meta imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
75-75: .meter.Meter imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
77-77: .nav.Nav imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
78-78: .no.NoFrames imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
78-78: .no.NoScript imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
78-78: .no.NoBr imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
78-78: .no.NoEmbed imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
80-80: .object.Object imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
81-81: .output.Output imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
83-83: .p.P imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
84-84: .param.Param imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
85-85: .picture.Picture imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
86-86: .plaintext.Plaintext imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
87-87: .progress.Progress imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
89-89: .q.Q imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
91-91: .ruby.RB imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
91-91: .ruby.RP imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
91-91: .ruby.RT imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
91-91: .ruby.RTC imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
93-93: .s.S imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
94-94: .samp.Samp imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
95-95: .script.Script imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
96-96: .search.Search imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
97-97: .section.Section imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
98-98: .select.Select imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
99-99: .slot.Slot imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
100-100: .small.Small imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
101-101: .source.Source imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
102-102: .span.Span imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
103-103: .strike.Strike imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
104-104: .style.Style imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
105-105: .sub.Sub imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
106-106: .summary.Summary imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
107-107: .sup.Sup imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
109-109: .table.Table imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
109-109: .table.Tr imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
109-109: .table.Td imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
109-109: .table.Th imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
109-109: .table.THead imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
109-109: .table.TBody imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
109-109: .table.TFoot imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
110-110: .template.Template imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
111-111: .text.Text imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
112-112: .textarea.Textarea imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
113-113: .time.Time imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
114-114: .title.Title imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
115-115: .track.Track imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
116-116: .tt.TT imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
118-118: .u.U imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
120-120: .var.Var imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
121-121: .video.Video imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
123-123: .wbr.WBR imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
125-125: .xmp.XMP imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
BetterMD/__init__.py (1)
5-19: ⚠️ Potential issue
Remove arbitrary text slicing in from_url method.
The from_url method contains hardcoded string slicing that seems arbitrary and could break with different URL responses.
 @staticmethod
 def from_url(url):
     import requests as r
-    t = r.get(url).text[15:]
+    t = r.get(url).text
     return Symbol.from_html(t)
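One possible explanation for the slice, offered as a guess rather than something the PR confirms: the string <!DOCTYPE html> is exactly 15 characters long, so [15:] may have been meant to skip the doctype. If that is the intent, a sketch like the following strips a leading doctype only when one is present, instead of assuming a fixed offset. strip_doctype is a hypothetical helper, not part of BetterMD.

```python
import re

# Matches an optional run of leading whitespace followed by a doctype declaration.
_DOCTYPE = re.compile(r"\A\s*<!DOCTYPE[^>]*>", re.IGNORECASE)


def strip_doctype(html: str) -> str:
    """Remove a leading doctype declaration, if any; otherwise return html unchanged."""
    return _DOCTYPE.sub("", html, count=1)


print(strip_doctype("<!DOCTYPE html><p>x</p>"))  # <p>x</p>
print(strip_doctype("<p>x</p>"))                 # <p>x</p>
```

If the HTML parser already tolerates a doctype, dropping the slice entirely, as the suggestion does, is the simpler fix.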
🧹 Nitpick comments (11)
BetterMD/parse/markdown/__init__.py (1)
1-4: Improve initialisation file documentation and remove unused import
The file correctly initialises the Markdown parser by registering BaseExtension with MDParser. However, there are a few improvements that could be made:
- The Extension import appears to be unused in this file according to static analysis.
- Adding a module docstring at the top of the file would help explain the purpose of this initialisation file.
+"""Initialise the Markdown parser with the base extension.
+
+This module configures the MDParser by registering the BaseExtension,
+which provides core Markdown parsing functionality.
+"""
+
-from .extensions import BaseExtension, Extension
+from .extensions import BaseExtension
 from .parser import MDParser

 MDParser.add_extension(BaseExtension)
🧰 Tools
🪛 Ruff (0.8.2)
1-1: .extensions.Extension imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
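One detail worth keeping in mind when applying this: Python only treats a string literal as the module docstring if it is the first statement in the file. The sketch below uses the standard-library ast module to show the difference; the two source strings are made-up stand-ins for an __init__.py.

```python
import ast

# Docstring placed first: recognised as the module docstring.
docstring_first = '"""Initialise the parser."""\nimport os\n'
# Same string placed after an import: just an unused expression statement.
docstring_late = 'import os\n"""Initialise the parser."""\n'

first_doc = ast.get_docstring(ast.parse(docstring_first))
late_doc = ast.get_docstring(ast.parse(docstring_late))

print(first_doc)  # Initialise the parser.
print(late_doc)   # None
```

A string placed after the imports therefore never reaches __doc__ or help() output, so documentation tools would not pick it up.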
BetterMD/elements/big.py (1)
1-3: Remove unused import.
The CustomMarkdown class is imported but not used in this file.
 from .symbol import Symbol
-from ..markdown import CustomMarkdown
🧰 Tools
🪛 Ruff (0.8.2)
2-2: ..markdown.CustomMarkdown imported but unused
Remove unused import: ..markdown.CustomMarkdown (F401)
BetterMD/parse/typing.py (1)
1-4: Remove unused import.
The ATTR_TYPES import is not used in this file.
 import typing as t

-from ..typing import ATTR_TYPES, ATTRS
+from ..typing import ATTRS
🧰 Tools
🪛 Ruff (0.8.2)
3-3: ..typing.ATTR_TYPES imported but unused. Remove unused import: ..typing.ATTR_TYPES (F401)
BetterMD/__init__.py (1)
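20-24: Consider adding file and URL methods to MD class for consistency.
The HTML class provides methods for converting from strings, files, and URLs, but the MD class only provides a method for strings. Consider adding similar methods for files and URLs to maintain consistency.
 class MD:
     @staticmethod
     def from_string(md:'str'):
         return Symbol.from_md(md)
+
+    @staticmethod
+    def from_file(file):
+        return Symbol.from_md(file)
+
+    @staticmethod
+    def from_url(url):
+        import requests as r
+        t = r.get(url).text
+        return Symbol.from_md(t)
BetterMD/parse/collection.py (1)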
class only provides a method for strings. Consider adding similar methods for files and URLs to maintain consistency.class MD: @staticmethod def from_string(md:'str'): return Symbol.from_md(md) + + @staticmethod + def from_file(file): + return Symbol.from_md(file) + + @staticmethod + def from_url(url): + import requests as r + t = r.get(url).text + return Symbol.from_md(t)BetterMD/parse/collection.py (1)
19-19
: Simplify the return type annotation.The return type annotation is unnecessarily complex and can be simplified to use
Optional[type[Symbol]]
instead of nested unions.- def find_symbol(self, name:'str', raise_errors:'bool'=False) -> 't.Union[type[Symbol], t.Union[None, t.NoReturn]]': + def find_symbol(self, name:'str', raise_errors:'bool'=False) -> 't.Optional[type[Symbol]]':BetterMD/elements/code.py (1)
15-15: Consider clearer code block formatting.
The added newlines improve readability, but putting the language on its own line after the opening fence makes it part of the code content rather than the fence's info string. Common Markdown implementations expect the language immediately after the opening fence, so consider adjusting the output to match or documenting the choice.
-        return f"```\n{language}\n{inner}\n```\n"
+        return f"```{language}\n{inner}\n```\n"
BetterMD/elements/symbol.py (5)
8-8: Remove unused import.
The ATTRS import from typing is not used in this file.
-    from ..typing import ATTRS, ATTR_TYPES
+    from ..typing import ATTR_TYPES
🧰 Tools
🪛 Ruff (0.8.2)
8-8: ..typing.ATTRS imported but unused. Remove unused import: ..typing.ATTRS (F401)
36-41: Use 'is None' instead of '== None' for identity checks.
When checking whether a variable is None, it is more Pythonic and more robust to use the 'is' operator rather than an equality comparison, because '==' can be overridden by a class's __eq__ method.
-        if styles == None:
+        if styles is None:
-        if classes == None:
+        if classes is None:
-        if inner == None:
+        if inner is None:
🧰 Tools
🪛 Ruff (0.8.2)
36-36: Comparison to None should be cond is None. Replace with cond is None (E711)
38-38: Comparison to None should be cond is None. Replace with cond is None (E711)
40-40: Comparison to None should be cond is None. Replace with cond is None (E711)
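To illustrate why E711 matters beyond style, here is a small sketch with a deliberately contrived class (`AlwaysEqual` is hypothetical and not part of BetterMD) whose `__eq__` makes `== None` give the wrong answer, while `is None` stays correct:

```python
class AlwaysEqual:
    """Contrived object that claims to be equal to everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

# Equality delegates to AlwaysEqual.__eq__, so it reports True.
equality_says_none = (value == None)  # noqa: E711

# Identity cannot be overridden; value is not the None singleton.
identity_says_none = (value is None)
```

Here `equality_says_none` ends up `True` even though `value` is a real object, and `identity_says_none` is `False`.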
49-54: Use 'is None' for consistency with __init__.
As with the __init__ method, use 'is None' for identity checks.
-        if inner == None:
+        if inner is None:
-        if styles == None:
+        if styles is None:
-        if classes == None:
+        if classes is None:
🧰 Tools
🪛 Ruff (0.8.2)
49-49: Comparison to None should be cond is None. Replace with cond is None (E711)
51-51: Comparison to None should be cond is None. Replace with cond is None (E711)
53-53: Comparison to None should be cond is None. Replace with cond is None (E711)
117-117: Consider breaking down long string formatting.
The string formatting in the to_html method is quite complex and hard to read. Consider breaking it down into smaller parts for better readability and maintainability.
-        return f"<{self.html}{" " if self.styles or self.classes or self.props else ""}{f"class={'"'}{' '.join(self.classes) or ''}{'"'}" if self.classes else ""}{" " if (self.styles or self.classes) and self.props else ""}{f"style={'"'}{' '.join([f'{k}:{v}' for k,v in self.styles.items()]) or ""}{'"'}" if self.styles else ""}{" " if (self.styles or self.classes) and self.props else ""}{handle_props(self.props)}{f">{"\n" if len(self.children) > 1 else ""}{inner_HTML}{"\n" if len(self.children) > 1 else ""}</{self.html}>" if inner_HTML else f" />"}"
+        # Build tag parts separately
+        tag_attrs = []
+        if self.classes:
+            tag_attrs.append(f"class=\"{' '.join(self.classes) or ''}\"")
+        if self.styles:
+            tag_attrs.append(f"style=\"{' '.join([f'{k}:{v}' for k,v in self.styles.items()]) or ''}\"")
+        if self.props:
+            tag_attrs.append(handle_props(self.props))
+
+        # Join attributes with spaces
+        attrs_str = " " + " ".join(tag_attrs) if tag_attrs else ""
+
+        # Handle opening/closing or self-closing tag
+        if inner_HTML:
+            newline = "\n" if len(self.children) > 1 else ""
+            return f"<{self.html}{attrs_str}>{newline}{inner_HTML}{newline}</{self.html}>"
+        else:
+            return f"<{self.html}{attrs_str} />"
🧰 Tools
🪛 Ruff (0.8.2)
117-117: f-string without any placeholders. Remove extraneous f prefix (F541)
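The same idea can be sketched as a standalone function, which makes it easy to test in isolation. This is an illustrative variant, not the project's code: `render_tag` and its parameters are hypothetical, attribute values are escaped with the standard `html.escape`, and style declarations are joined with `"; "` (an assumption on my part, since CSS separates declarations with semicolons, whereas the reviewed code joins them with spaces).

```python
from html import escape


def render_tag(tag, classes=(), styles=None, props=None, inner=""):
    """Render one HTML element from already-rendered inner HTML (hypothetical helper)."""
    attrs = []
    if classes:
        class_value = escape(" ".join(classes), quote=True)
        attrs.append(f'class="{class_value}"')
    if styles:
        declarations = "; ".join(f"{k}: {v}" for k, v in styles.items())
        attrs.append(f'style="{escape(declarations, quote=True)}"')
    for key, value in (props or {}).items():
        # An empty value is emitted as a bare boolean attribute.
        attrs.append(key if value == "" else f'{key}="{escape(str(value), quote=True)}"')

    attrs_str = (" " + " ".join(attrs)) if attrs else ""
    if inner:
        return f"<{tag}{attrs_str}>{inner}</{tag}>"
    return f"<{tag}{attrs_str} />"
```

Building the attribute list first removes the need for the repeated `" " if ... else ""` separators, and avoiding nested same-type quotes keeps the code valid on Python versions before 3.12.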
184-186: Consider breaking down the __str__ method similar to to_html.
The string representation in the __str__ method suffers from the same readability issues as the to_html method. Consider refactoring it to improve readability.
-    def __str__(self):
-        return f"<{self.html}{" " if self.styles or self.classes or self.props else ""}{f"class={'"'}{' '.join(self.classes) or ''}{'"'}" if self.classes else ""}{" " if (self.styles or self.classes) and self.props else ""}{f"style={'"'}{' '.join([f'{k}:{v}' for k,v in self.styles.items()]) or ""}{'"'}" if self.styles else ""}{" " if (self.styles or self.classes) and self.props else ""}{' '.join([f'{k}={'"'}{v}{'"'}' if v != "" else f'{k}' for k,v in self.props.items()])}{f">{"\n" if len(self.children) > 1 else ""}{"\n" if len(self.children) > 1 else ""}{len(self.children)}</{self.html}>"}"
+    def __str__(self):
+        # Build tag parts separately
+        tag_attrs = []
+        if self.classes:
+            tag_attrs.append(f"class=\"{' '.join(self.classes) or ''}\"")
+        if self.styles:
+            tag_attrs.append(f"style=\"{' '.join([f'{k}:{v}' for k,v in self.styles.items()]) or ''}\"")
+        if self.props:
+            props_str = ' '.join([f'{k}=\"{v}\"' if v != "" else f'{k}' for k,v in self.props.items()])
+            tag_attrs.append(props_str)
+
+        # Join attributes with spaces
+        attrs_str = " " + " ".join(tag_attrs) if tag_attrs else ""
+
+        # For __str__, just show the number of children
+        newline = "\n" if len(self.children) > 1 else ""
+        return f"<{self.html}{attrs_str}>{newline}{newline}{len(self.children)}</{self.html}>"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- BetterMD/__init__.py (1 hunk)
- BetterMD/elements/__init__.py (1 hunk)
- BetterMD/elements/area.py (1 hunk)
- BetterMD/elements/big.py (1 hunk)
- BetterMD/elements/code.py (2 hunks)
- BetterMD/elements/d.py (1 hunk)
- BetterMD/elements/frame.py (1 hunk)
- BetterMD/elements/param.py (1 hunk)
- BetterMD/elements/q.py (1 hunk)
- BetterMD/elements/symbol.py (4 hunks)
- BetterMD/elements/template.py (1 hunk)
- BetterMD/elements/tt.py (1 hunk)
- BetterMD/parse/collection.py (1 hunk)
- BetterMD/parse/markdown/__init__.py (1 hunk)
- BetterMD/parse/markdown/extensions/__init__.py (1 hunk)
- BetterMD/parse/typing.py (1 hunk)
- setup.py (1 hunk)
✅ Files skipped from review due to trivial changes (1)
- setup.py
🚧 Files skipped from review as they are similar to previous changes (8)
- BetterMD/elements/tt.py
- BetterMD/elements/template.py
- BetterMD/elements/q.py
- BetterMD/parse/markdown/extensions/__init__.py
- BetterMD/elements/d.py
- BetterMD/elements/area.py
- BetterMD/elements/param.py
- BetterMD/elements/frame.py
🧰 Additional context used
🧬 Code Definitions (6)
BetterMD/elements/code.py (3)
- BetterMD/rst/custom_rst.py (3): CustomRst (9-18), verify (18-18), to_rst (14-14)
- BetterMD/elements/text.py (1): to_rst (23-24)
- BetterMD/markdown/custom_markdown.py (1): verify (18-18)
BetterMD/parse/collection.py (2)
- BetterMD/elements/symbol.py (1): Symbol (10-187)
- BetterMD/elements/code.py (1): verify (30-31)
BetterMD/elements/symbol.py (2)
- BetterMD/parse/typing.py (3): parse (18-18), ELEMENT (10-14), TEXT (5-8)
- BetterMD/parse/collection.py (3): Collection (8-27), add_symbols (13-14), find_symbol (19-27)
BetterMD/elements/__init__.py (26)
- BetterMD/elements/symbol.py (1): Symbol (10-187)
- BetterMD/elements/text_formatting.py (3): Strong (15-18), Em (25-28), B (20-23)
- BetterMD/elements/a.py (1): A (13-27)
- BetterMD/elements/area.py (1): Area (3-8)
- BetterMD/elements/big.py (1): Big (4-7)
- BetterMD/elements/blockquote.py (1): Blockquote (9-13)
- BetterMD/elements/br.py (1): Br (13-16)
- BetterMD/elements/code.py (2): Code (65-68), HTML (20-31)
- BetterMD/elements/d.py (4): DD (3-6), DFN (18-23), DL (13-16), DT (8-11)
- BetterMD/elements/div.py (1): Div (3-6)
- BetterMD/elements/frame.py (1): Frame (3-10)
- BetterMD/elements/h.py (6): H1 (16-20), H2 (22-26), H3 (28-32), H4 (34-38), H5 (40-44), H6 (46-50)
- BetterMD/elements/head.py (1): Head (3-8)
- BetterMD/elements/hr.py (1): Hr (13-19)
- BetterMD/__init__.py (1): HTML (5-18)
- BetterMD/elements/img.py (2): HTML (11-13), Img (19-23)
- BetterMD/elements/input.py (1): Input (18-60)
- BetterMD/elements/li.py (3): OL (40-44), UL (46-51), LI (33-38)
- BetterMD/elements/p.py (1): P (3-7)
- BetterMD/elements/param.py (1): Param (3-8)
- BetterMD/elements/q.py (1): Q (3-8)
- BetterMD/elements/span.py (1): Span (3-6)
- BetterMD/elements/table.py (7): Table (158-165), Tr (167-170), Td (172-179), Th (181-189), THead (191-197), TBody (199-205), TFoot (207-213)
- BetterMD/elements/template.py (1): Template (3-8)
- BetterMD/elements/text.py (1): Text (8-29)
- BetterMD/elements/tt.py (1): TT (3-6)
BetterMD/elements/big.py (1)
- BetterMD/elements/symbol.py (1): Symbol (10-187)
BetterMD/parse/typing.py (2)
- BetterMD/parse/html.py (1): parse (16-193)
- BetterMD/elements/symbol.py (1): parse (153-171)
🪛 Ruff (0.8.2)
BetterMD/parse/collection.py
21-22: SyntaxError: Expected an expression
22-22: SyntaxError: Invalid annotated assignment target
22-23: SyntaxError: Expected an expression
BetterMD/elements/symbol.py
8-8: ..typing.ATTRS imported but unused. Remove unused import: ..typing.ATTRS (F401)
36-36: Comparison to None should be cond is None. Replace with cond is None (E711)
38-38: Comparison to None should be cond is None. Replace with cond is None (E711)
40-40: Comparison to None should be cond is None. Replace with cond is None (E711)
49-49: Comparison to None should be cond is None. Replace with cond is None (E711)
51-51: Comparison to None should be cond is None. Replace with cond is None (E711)
53-53: Comparison to None should be cond is None. Replace with cond is None (E711)
117-117: f-string without any placeholders. Remove extraneous f prefix (F541)
BetterMD/elements/__init__.py
1-1: .symbol.Symbol
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
3-3: .text_formatting.Strong
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
3-3: .text_formatting.Em
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
3-3: .text_formatting.B
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
5-5: .a.A
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
6-6: .abbr.Abbr
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
7-7: .acronym.Acronym
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
8-8: .address.Address
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
9-9: .area.Area
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
10-10: .article.Article
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
11-11: .aside.Aside
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
12-12: .audio.Audio
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
14-14: .base.Base
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
15-15: .bd.BDI
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
15-15: .bd.BDO
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
16-16: .big.Big
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
17-17: .blockquote.Blockquote
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
18-18: .body.Body
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
19-19: .br.Br
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
20-20: .button.Button
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
22-22: .canvas.Canvas
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
23-23: .caption.Caption
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
24-24: .center.Center
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
25-25: .cite.Cite
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
26-26: .code.Code
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
27-27: .col.Col
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
27-27: .col.Colgroup
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
29-29: .d.DD
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
29-29: .d.DFN
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
29-29: .d.DL
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
29-29: .d.DT
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
30-30: .data.Data
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
31-31: .datalist.DataList
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
32-32: .del_.Del
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
33-33: .details.Details
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
34-34: .dialog.Dialog
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
35-35: .dir.Dir
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
36-36: .div.Div
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
38-38: .embed.Embed
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
40-40: .fencedframe.FencedFrame
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
41-41: .fieldset.Fieldset
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
42-42: .figure.FigCaption
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
42-42: .figure.Figure
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
43-43: .font.Font
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
44-44: .footer.Footer
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
45-45: .form.Form
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
46-46: .frame.Frame
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
47-47: .frameset.Frameset
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
49-49: .h.H1
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
49-49: .h.H2
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
49-49: .h.H3
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
49-49: .h.H4
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
49-49: .h.H5
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
49-49: .h.H6
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
50-50: .head.Head
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
51-51: .header.Header
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
52-52: .hgroup.HGroup
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
53-53: .hr.Hr
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
54-54: .html.HTML
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
56-56: .i.I
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
57-57: .iframe.Iframe
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
58-58: .img.Img
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
59-59: .input.Input
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
60-60: .ins.Ins
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
62-62: .kbd.Kbd
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
64-64: .label.Label
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
65-65: .legend.Legend
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
66-66: .li.OL
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
66-66: .li.UL
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
66-66: .li.LI
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
67-67: .link.Link
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
69-69: .main.Main
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
70-70: .map.Map
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
71-71: .mark.Mark
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
72-72: .marquee.Marquee
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
73-73: .menu.Menu
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
74-74: .meta.Meta
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
75-75: .meter.Meter
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
77-77: .nav.Nav
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
78-78: .no.NoFrames
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
78-78: .no.NoScript
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
78-78: .no.NoBr
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
78-78: .no.NoEmbed
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
80-80: .object.Object
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
81-81: .output.Output
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
83-83: .p.P
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
84-84: .param.Param
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
85-85: .picture.Picture
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
86-86: .plaintext.Plaintext
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
87-87: .progress.Progress
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
89-89: .q.Q
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
91-91: .ruby.RB
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
91-91: .ruby.RP
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
91-91: .ruby.RT
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
91-91: .ruby.RTC
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
93-93: .s.S
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
94-94: .samp.Samp
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
95-95: .script.Script
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
96-96: .search.Search
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
97-97: .section.Section
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
98-98: .select.Select
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
99-99: .slot.Slot
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
100-100: .small.Small
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
101-101: .source.Source
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
102-102: .span.Span
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
103-103: .strike.Strike
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
104-104: .style.Style
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
105-105: .sub.Sub
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
106-106: .summary.Summary
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
107-107: .sup.Sup
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
109-109: .table.Table
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
109-109: .table.Tr
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
109-109: .table.Td
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
109-109: .table.Th
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
109-109: .table.THead
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
109-109: .table.TBody
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
109-109: .table.TFoot
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
110-110: .template.Template
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
111-111: .text.Text
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
112-112: .textarea.Textarea
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
113-113: .time.Time
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
114-114: .title.Title
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
115-115: .track.Track
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
116-116: .tt.TT
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
118-118: .u.U
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
120-120: .var.Var
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
121-121: .video.Video
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
123-123: .wbr.WBR
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
125-125: .xmp.XMP
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
BetterMD/elements/big.py
2-2: ..markdown.CustomMarkdown imported but unused. Remove unused import: ..markdown.CustomMarkdown (F401)
BetterMD/parse/markdown/__init__.py
1-1: .extensions.Extension imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
BetterMD/parse/typing.py
3-3: ..typing.ATTR_TYPES imported but unused. Remove unused import: ..typing.ATTR_TYPES (F401)
🔇 Additional comments (14)
BetterMD/elements/big.py (1)
4-7: Consider adding a deprecation warning for the Big element.
The <big> element is deprecated in HTML5. Consider adding a deprecation warning in the docstring and suggesting CSS alternatives for increasing text size.
 from .symbol import Symbol

 class Big(Symbol):
+    """
+    Represents an HTML <big> element.
+
+    WARNING: The <big> element is deprecated in HTML5. Consider using CSS alternatives
+    like 'font-size' instead.
+
+    There is no direct equivalent in Markdown or reStructuredText.
+    """
     md = ""
     html = "big"
     rst = ""
BetterMD/parse/typing.py (3)
5-9: LGTM - TEXT TypedDict definition.
The TEXT TypedDict definition correctly specifies the structure for text content.
10-14: Fix inconsistent type annotations in ELEMENT TypedDict.
The ELEMENT class writes every annotation as a string literal, as the TEXT class does. Consider standardising the annotation style across the codebase. The self-referential children field should stay quoted (or the module should use from __future__ import annotations), because on Python versions before 3.14 the name ELEMENT is not yet defined while its own class body executes.
 class ELEMENT(t.TypedDict):
-    type: 't.Literal["element"]'
-    name: 'str'
-    attributes: 'ATTRS'
+    type: t.Literal["element"]
+    name: str
+    attributes: ATTRS
     children: 'list[t.Union[ELEMENT, TEXT]]'
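A note of caution on unquoting every annotation: a recursive TypedDict refers to itself, and on Python versions before 3.14 an unquoted self-reference is evaluated while the class body runs, which raises NameError. The sketch below shows one arrangement that works, with only the recursive field quoted. The TEXT fields and the `dict[str, str]` stand-in for ATTRS are assumptions for illustration, since the review does not show their real definitions.

```python
import typing as t


class TEXT(t.TypedDict):
    # Field names here are assumed for illustration only.
    type: t.Literal["text"]
    content: str


class ELEMENT(t.TypedDict):
    type: t.Literal["element"]
    name: str
    # Stand-in for the project's ATTRS alias (assumption).
    attributes: dict[str, str]
    # Quoted: ELEMENT is not yet bound while its own class body executes.
    children: "list[t.Union[ELEMENT, TEXT]]"


# A small tree that a type checker would accept for the definitions above.
node: ELEMENT = {
    "type": "element",
    "name": "p",
    "attributes": {"class": "intro"},
    "children": [{"type": "text", "content": "hello"}],
}
```

Adding `from __future__ import annotations` at the top of the module is another way to allow fully unquoted annotations.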
16-18: LGTM - Parser protocol definition.
The runtime-checkable Parser protocol correctly defines the interface for parsers.
BetterMD/__init__.py (2)
1-4: LGTM - Explicit import of Symbol for clarity.
While Symbol might already be imported via the star import, making the dependency explicit enhances code clarity.
🧰 Tools
🪛 Ruff (0.8.2)
1-1: from .elements import * used; unable to detect undefined names (F403)
25-25: LGTM - Explicit __all__ definition.
Defining an explicit __all__ list is a good practice for documenting the public API.
BetterMD/elements/code.py (3)
33-63: Well-implemented RST class for code blocks.
The new RST class provides comprehensive support for converting code elements to reStructuredText format, handling both inline code and code blocks with proper indentation and directives.
23-23: Good use of list comprehension with join.
The change to join the HTML output of inner elements is a good practice for handling collections.
67-68: Good transition from string to class instance.
Replacing the string representation with an instance of the RST class aligns with the object-oriented approach used for HTML and MD, providing more flexibility and consistency.
BetterMD/elements/symbol.py (5)
31-34: Great use of __init_subclass__ for automatic registration.
Using __init_subclass__ to automatically register subclasses in the collection is an elegant design pattern that ensures all Symbol subclasses are available for parsing.
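For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of registration through `__init_subclass__`. The `Registry`, `Node`, and `Paragraph` classes are hypothetical stand-ins; the method names `add_symbols` and `find_symbol` mirror those listed for `Collection`, but their real signatures are not shown in this review and are assumed here.

```python
class Registry:
    """Hypothetical stand-in for the project's Collection class."""

    def __init__(self):
        self.symbols = []

    def add_symbols(self, symbol):
        self.symbols.append(symbol)

    def find_symbol(self, name):
        # Case-insensitive lookup by HTML tag name.
        for symbol in self.symbols:
            if symbol.html.lower() == name.lower():
                return symbol
        return None


class Node:
    """Hypothetical base class playing the role of Symbol."""

    html = ""
    registry = Registry()

    def __init_subclass__(cls, **kwargs):
        # Runs once for every subclass, right after its class body executes.
        super().__init_subclass__(**kwargs)
        cls.registry.add_symbols(cls)


class Paragraph(Node):
    html = "p"
```

Defining `Paragraph` is enough to make it discoverable: `Node.registry.find_symbol("P")` returns the `Paragraph` class without any explicit registration call.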
99-114: Good implementation of property handling.
The handle_props nested function elegantly handles different property types, providing appropriate formatting for each type. This approach is both readable and maintainable.
123-132: Good handling of block elements in Markdown conversion.
The new approach to handling block elements and newlines in the to_md method provides better formatting control, ensuring proper whitespace in the generated Markdown.
142-151: Excellent addition of parsing methods.
The new class methods from_html and from_md provide a convenient way to parse HTML and Markdown content into Symbol instances, enhancing the package's usability.
153-172: Well-implemented parse method.
The parse class method effectively handles different element types, extracting styles, classes, and attributes from the input. The nested handle_element function is a good approach to process elements recursively.
BetterMD/parse/collection.py
Outdated
    def find_symbol(self, name:'str', raise_errors:'bool'=False) -> 't.Union[type[Symbol], t.Union[None, t.NoReturn]]':
        for symbol in self.symbols:
            if (isinstance(symbol.html, CustomHTML) and symbol.html.verify(name)) or
            (isinstance(symbol.html, str) and symbol.html.lower() == name.lower()):
                return symbol
Fix syntax error in the conditional statement.
The condition in the find_symbol method is split across two lines without a line-continuation backslash or enclosing parentheses, which causes a syntax error.
- if (isinstance(symbol.html, CustomHTML) and symbol.html.verify(name)) or
- (isinstance(symbol.html, str) and symbol.html.lower() == name.lower()):
+ if (isinstance(symbol.html, CustomHTML) and symbol.html.verify(name)) or \
+ (isinstance(symbol.html, str) and symbol.html.lower() == name.lower()):
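As an alternative to the backslash, the whole condition can be wrapped in parentheses, which PEP 8 generally prefers for long expressions. A self-contained sketch follows; the `CustomHTML` class here is a hypothetical stand-in with an assumed `verify` behaviour, and `matches` is an illustrative helper rather than project code.

```python
class CustomHTML:
    """Hypothetical stand-in: accepts any of a fixed set of tag names."""

    def __init__(self, *names):
        self.names = {n.lower() for n in names}

    def verify(self, name):
        return name.lower() in self.names


def matches(html, name):
    """Return True if `html` (a CustomHTML or a plain tag string) matches `name`."""
    # Parentheses allow the condition to span lines without a backslash.
    return (
        (isinstance(html, CustomHTML) and html.verify(name))
        or (isinstance(html, str) and html.lower() == name.lower())
    )
```

Inside brackets Python ignores newlines, so this form is not sensitive to trailing whitespace the way a backslash continuation is.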
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-    def find_symbol(self, name:'str', raise_errors:'bool'=False) -> 't.Union[type[Symbol], t.Union[None, t.NoReturn]]':
-        for symbol in self.symbols:
-            if (isinstance(symbol.html, CustomHTML) and symbol.html.verify(name)) or
-            (isinstance(symbol.html, str) and symbol.html.lower() == name.lower()):
-                return symbol
+    def find_symbol(self, name:'str', raise_errors:'bool'=False) -> 't.Union[type[Symbol], t.Union[None, t.NoReturn]]':
+        for symbol in self.symbols:
+            if (isinstance(symbol.html, CustomHTML) and symbol.html.verify(name)) or \
+               (isinstance(symbol.html, str) and symbol.html.lower() == name.lower()):
+                return symbol
🧰 Tools
🪛 Ruff (0.8.2)
21-22: SyntaxError: Expected an expression
22-22: SyntaxError: Invalid annotated assignment target
22-23: SyntaxError: Expected an expression
Actionable comments posted: 11
🧹 Nitpick comments (28)
BetterMD/elements/svg.py (6)
1-5: Add docstring to explain file purpose and usage.
The file lacks a module-level docstring that explains its purpose, its role within the BetterMD package, and how these SVG element classes should be used. Adding one would improve developer experience and code maintainability. The docstring has to come before the import for Python to treat it as the module docstring.
+"""
+SVG element classes for BetterMD.
+
+This module provides classes representing SVG elements for conversion between HTML,
+Markdown, and reStructuredText formats. Each class defines the element's tag name and
+relevant properties.
+"""
+
 from .symbol import Symbol

 # Check prop lists before use
 # MDN Docs: https://developer.mozilla.org/en-US/docs/Web/SVG/Reference/Element/svg
3-4: Clarify comment about property lists.
The comment "Check prop lists before use" is ambiguous: it is not clear whether it is a to-do item, a warning, or an instruction for developers. Consider replacing it with more specific guidance.
-# Check prop lists before use
+# NOTE: Property lists may not be comprehensive. Refer to MDN documentation for complete lists.
 # MDN Docs: https://developer.mozilla.org/en-US/docs/Web/SVG/Reference/Element/svg
23-27: Consider adding common SVG style properties to Circle class.
The property list for Circle includes geometric properties but omits commonly used style properties like 'fill', 'stroke', 'stroke-width', etc. These are important for SVG rendering and manipulation.
 class Circle(Symbol):
-    prop_list = ["cx", "cy", "r"]
+    prop_list = ["cx", "cy", "r", "fill", "stroke", "stroke-width", "opacity"]
     html = "circle"
     md = ""
     rst = ""
191-221: Add warning about deprecated font elements.
The comment notes that font elements are deprecated but included. Consider enhancing this with a more explicit warning or explanation about why these elements are included despite being deprecated, and how developers should approach them.
-# Font elements (deprecated but included)
+# Font elements
+# WARNING: These elements are deprecated in SVG 2.0 and HTML5.
+# They are included for compatibility with legacy content.
248-252: Update 'href' attribute in Image and Use classes.
The property lists for Image and Use classes include 'href', but in SVG 1.1 this would typically be 'xlink:href'. While SVG 2.0 has moved to using 'href', you may want to support both for backwards compatibility.
 class Image(Symbol):
-    prop_list = ["href", "x", "y", "width", "height"]
+    prop_list = ["href", "xlink:href", "x", "y", "width", "height"]
     html = "image"
     md = ""
     rst = ""
Similarly for the Use class:
 class Use(Symbol):
-    prop_list = ["href", "x", "y", "width", "height"]
+    prop_list = ["href", "xlink:href", "x", "y", "width", "height"]
     html = "use"
     md = ""
     rst = ""
Also applies to: 385-389
324-328: Inconsistent class naming convention.
Three classes (SVGScript, SVGSymbol, and SVGText) use a different naming convention than the rest of the classes. While SVGSymbol likely needs this prefix to avoid conflict with the imported Symbol class, the reasons for the other prefixes are unclear.
Consider renaming SVGScript to Script for consistency, or add a comment explaining why these specific classes use a different naming convention:
-class SVGScript(Symbol):
+class Script(Symbol):
     html = "script"
     md = ""
     rst = ""
Or add a comment explaining the inconsistency:
+# Prefixed with SVG to avoid conflicts with standard Python types/modules
 class SVGScript(Symbol):
     html = "script"
     md = ""
     rst = ""
Also applies to: 355-359, 360-364
BetterMD/elements/img.py (1)
16-16: Expand test coverage for newly introduced properties.
The extended property list may benefit from unit tests or integration tests to verify that each property is accurately handled and included in any conversions to or from other formats (e.g. Markdown, HTML, RST).
BetterMD/parse/html.py (4)
1-4: Add docstrings or type hints for clarity.
The import statements look fine. However, including explicit docstrings for each imported definition or more detailed type hints would improve maintainability.
15-30: Consider adding a docstring to reset().
Providing a docstring describing the parser's initial state, including each significant member variable, would help maintainers quickly understand how the parser resets.
54-78: Avoid committing commented debug statements.
There is a commented-out #print(char, end="") that might be left over from debugging. Consider removing or replacing it with a formal logging mechanism if needed.
103-107: Return a copy of parsed DOM if immutability is needed.
If your parser's output may be manipulated later, returning a deep copy or a new list ensures users of this API cannot alter the parser's internal state accidentally. If that is not a concern, this is fine.
src/better_md/__init__.py (1)
1-2: Add a docstring for the main() function.
It may be beneficial to provide future maintainers with a short explanation of what main() is intended to do, e.g., whether it could one day parse command-line arguments or integrate the library's functionality.
BetterMD/elements/comment.py (2)
5-6: Consider handling all inner elements in the comment
The current implementation only uses the last element of the inner list. Is this intentional? If multiple elements should be included in the comment, consider joining them all.
-    def to_html(self, inner, symbol, parent):
-        return f"<!--{inner[-1].to_html()}-->"
+    def to_html(self, inner, symbol, parent):
+        comment_content = "".join([i.to_html() for i in inner])
+        return f"<!--{comment_content}-->"
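To make the difference concrete, here is a minimal sketch comparing the two behaviours. `Text`, `comment_last_only`, and `comment_all` are hypothetical names introduced only for this illustration; the real `to_html` method also receives `symbol` and `parent`.

```python
class Text:
    """Minimal stand-in for a BetterMD text node (assumption)."""

    def __init__(self, text):
        self.text = text

    def to_html(self):
        return self.text


def comment_last_only(inner):
    # Mirrors the current behaviour: only the final child is rendered.
    return f"<!--{inner[-1].to_html()}-->"


def comment_all(inner):
    # Mirrors the suggested behaviour: every child is rendered and joined.
    return f"<!--{''.join(node.to_html() for node in inner)}-->"


nodes = [Text("first "), Text("second")]
```

If a comment is always parsed into a single text child, both versions give the same result; they diverge only when `inner` holds more than one element.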
11-14: Consider implementing Markdown and RST comment representations
Both md and rst are empty strings, suggesting that comments might not be represented in Markdown or reStructuredText formats. Consider implementing proper comment syntax for these formats if needed.
For Markdown, you could use HTML comments as they're supported in most Markdown parsers. For RST, you might want to use the standard comment format (.. comment text).
-    md = ""
+    md = CustomMarkdown()  # or implement an appropriate Markdown comment class
-    rst = ""
+    rst = CustomRst()  # or implement an appropriate RST comment class
BetterMD/utils.py (2)
7-15: Consider adding join method variants
The current methods return lists of strings. Consider adding variants that join the strings together for convenience.
 class List(list['Symbol']):
     def to_html(self):
         return [elm.to_html() for elm in self]

     def to_md(self):
         return [elm.to_md() for elm in self]

     def to_rst(self):
         return [elm.to_rst() for elm in self]
+
+    def join_html(self, separator=""):
+        return separator.join(self.to_html())
+
+    def join_md(self, separator=""):
+        return separator.join(self.to_md())
+
+    def join_rst(self, separator=""):
+        return separator.join(self.to_rst())
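A short usage sketch of the proposed join variant, reduced to the HTML case. `Node` is a hypothetical stand-in for a Symbol instance, and this `List` subclasses plain `list` rather than `list['Symbol']` so the snippet runs on its own.

```python
class Node:
    """Hypothetical element that renders itself as a paragraph."""

    def __init__(self, text):
        self.text = text

    def to_html(self):
        return f"<p>{self.text}</p>"


class List(list):
    def to_html(self):
        # Existing behaviour: one rendered string per element.
        return [elm.to_html() for elm in self]

    def join_html(self, separator=""):
        # Proposed convenience: the same strings, joined into one.
        return separator.join(self.to_html())


items = List([Node("a"), Node("b")])
```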
17-21: Add docstrings to recursion limit functions
These utility functions lack documentation. Add docstrings to explain when and why a user would need to modify the recursion limit.
 def set_recursion_limit(limit):
+    """
+    Set the maximum recursion depth for the Python interpreter.
+
+    This is useful when processing deeply nested structures that might exceed
+    the default recursion limit.
+
+    Args:
+        limit: The new recursion limit to set
+    """
     sys.setrecursionlimit(limit)

 def get_recursion_limit():
+    """
+    Get the current maximum recursion depth for the Python interpreter.
+
+    Returns:
+        int: The current recursion limit
+    """
     return sys.getrecursionlimit()
BetterMD/parse/collection.py (4)
3-3: Remove unused import
The CustomHTML import is not being used in this file.
-from ..html import CustomHTML
🧰 Tools
🪛 Ruff (0.8.2)
3-3: ..html.CustomHTML imported but unused
Remove unused import: ..html.CustomHTML
(F401)
13-14: Add type annotations to method return values
Consider adding return type annotations to the add_symbols method for consistency with other methods in the class.
-    def add_symbols(self, symbol:'type[Symbol]'):
+    def add_symbols(self, symbol:'type[Symbol]') -> None:
         self.symbols.append(symbol)
16-17: Add error handling to remove_symbol method
The remove_symbol method will raise a ValueError if the symbol is not in the list. Consider adding error handling or a parameter to control this behaviour.
-    def remove_symbol(self, symbol:'type[Symbol]'):
-        self.symbols.remove(symbol)
+    def remove_symbol(self, symbol:'type[Symbol]', raise_errors:bool=True) -> bool:
+        """
+        Remove a symbol from the collection.
+
+        Args:
+            symbol: The symbol to remove
+            raise_errors: Whether to raise an error if the symbol is not found
+
+        Returns:
+            bool: True if the symbol was removed, False otherwise
+        """
+        try:
+            self.symbols.remove(symbol)
+            return True
+        except ValueError:
+            if raise_errors:
+                raise ValueError(f"Symbol {symbol.__qualname__} not found in collection")
+            return False
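The behaviour of the suggested `raise_errors` flag can be exercised with a stripped-down collection. This is only a sketch: `Collection` here holds plain classes, and `A` and `B` are placeholder symbols invented for the example.

```python
class Collection:
    def __init__(self, *symbols):
        self.symbols = list(symbols)

    def remove_symbol(self, symbol, raise_errors=True):
        try:
            self.symbols.remove(symbol)
        except ValueError:
            if raise_errors:
                # Re-raise with a message naming the missing symbol.
                raise ValueError(
                    f"Symbol {symbol.__qualname__} not found in collection"
                ) from None
            return False
        return True


class A:
    pass


class B:
    pass


collection = Collection(A)
```

Callers that do not care whether the symbol was present can pass `raise_errors=False` and inspect the boolean result instead of wrapping the call in `try`/`except`.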
19-27: Remove extra empty line and improve find_symbol method
There's an unnecessary empty line at line 24. Also, consider adding a more descriptive docstring to explain the search logic.
     def find_symbol(self, name:'str', raise_errors:'bool'=False) -> 't.Optional[type[Symbol]]':
+        """
+        Find a symbol in the collection by its qualified name.
+
+        Args:
+            name: The qualified name to search for (case-insensitive)
+            raise_errors: Whether to raise an error if the symbol is not found
+
+        Returns:
+            The found symbol class or None if not found and raise_errors is False
+
+        Raises:
+            ValueError: If the symbol is not found and raise_errors is True
+        """
         for symbol in self.symbols:
             if symbol.__qualname__.lower() == name.lower():
                 return symbol
-
         if raise_errors:
             raise ValueError(f"Symbol `{name}` not found in collection, if using default symbols it may not be supported.")
         return None
BetterMD/elements/code.py (1)
33-63: Well-structured RST implementation
The new RST class is well-implemented with comprehensive handling of different code formats. A few suggestions:
- Consider adding comments explaining the RST syntax requirements
- The duplicate indentation logic could be extracted to a helper method
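The second suggestion, extracting the duplicated indentation logic, could also be sketched with the standard library's `textwrap.indent` instead of a hand-written join. The function names `indent_block` and `to_code_block` are hypothetical and exist only for this example; they are not part of BetterMD.

```python
import textwrap


def indent_block(text: str, width: int = 3) -> str:
    # textwrap.indent prefixes every non-blank line; unlike a manual
    # join over split("\n"), whitespace-only lines are left unprefixed.
    return textwrap.indent(text.strip(), " " * width)


def to_code_block(content: str, language: str = "") -> str:
    body = indent_block(content)
    if language:
        # Language-specific block via the code-block directive.
        return f".. code-block:: {language}\n\n{body}\n\n"
    # Plain literal block when no language is given.
    return f"::\n\n{body}\n\n"
```

Keeping the indentation in one helper means the directive branch and the literal-block branch cannot drift apart if the indent width changes.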
 class RST(CustomRst):
     def to_rst(self, inner, symbol, parent):
         language = symbol.get_prop("language", "")

         # Handle inner content
         if isinstance(inner, list):
             content = "".join([
                 i.to_rst() if isinstance(i, Symbol) else str(i)
                 for i in inner
             ])
         else:
             content = inner.to_rst() if isinstance(inner, Symbol) else str(inner)

         # If it's a code block (has language or multiline)
         if language or "\n" in content:
+            # Helper function to indent content according to RST requirements (3 spaces)
+            def indent_content(text):
+                return "\n".join(f"   {line}" for line in text.strip().split("\n"))
+
             # Use code-block directive for language-specific blocks
             if language:
-                # Indent the content by 3 spaces (RST requirement)
-                indented_content = "\n".join(f"   {line}" for line in content.strip().split("\n"))
+                indented_content = indent_content(content)
                 return f".. code-block:: {language}\n\n{indented_content}\n\n"

             # Use simple literal block for language-less blocks
-            # Indent the content by 3 spaces (RST requirement)
-            indented_content = "\n".join(f"   {line}" for line in content.strip().split("\n"))
+            indented_content = indent_content(content)
             return f"::\n\n{indented_content}\n\n"

         # Inline code
         # Escape backticks if they exist in content
         if "`" in content:
             return f"``{content}``"
         return f"`{content}`"
BetterMD/__init__.py (2)
1-1: Avoid using wildcard import
Consider replacing the wildcard import with explicit imports of only the symbols you need. This makes dependencies clearer and prevents unintended name conflicts.
-from .elements import *
+from .elements import Symbol  # Already imported explicitly on the next line
🧰 Tools
🪛 Ruff (0.8.2)
1-1: from .elements import * used; unable to detect undefined names
(F403)
19-20: Improve DOCTYPE handling
The current DOCTYPE handling is simplistic and only handles one specific format. DOCTYPE declarations can vary in format and length. Consider using a more robust approach:
-    if text.startswith("<!DOCTYPE html>"):
-        text = text[15:]
+    import re
+    # Remove any DOCTYPE declaration more robustly
+    text = re.sub(r'<!DOCTYPE[^>]*>', '', text, flags=re.IGNORECASE)
BetterMD/elements/style.py (1)
43-50: Combine conditional branches with similar behavior
The current code has separate if/elif branches for handling ID, class, and element selectors, but they all perform the same operation. Combine them into a single condition for better readability.
-        if key.startswith(':'):  # Pseudo-class
-            nested_selector = f"{selector}{key}"
-        elif key.startswith('#'):  # ID
-            nested_selector = f"{selector} {key}"
-        elif key.startswith('.'):  # Class
-            nested_selector = f"{selector} {key}"
-        else:  # Element or custom
-            nested_selector = f"{selector} {key}"
+        if key.startswith(':'):  # Pseudo-class
+            nested_selector = f"{selector}{key}"
+        else:  # ID, Class, Element or custom
+            nested_selector = f"{selector} {key}"
🧰 Tools
🪛 Ruff (0.8.2)
45-48: Combine if branches using logical or operator
Combine if branches
(SIM114)
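The combined branching can be checked in isolation. The following is a minimal sketch that pulls the selector logic into a hypothetical standalone helper, `nested_selector`, which does not exist in style.py under that name.

```python
def nested_selector(selector: str, key: str) -> str:
    # Pseudo-classes attach directly to the parent selector.
    if key.startswith(":"):
        return f"{selector}{key}"
    # IDs, classes, elements, and custom keys are all descendant
    # selectors, so one branch with a separating space covers them.
    return f"{selector} {key}"
```

Because the three removed branches produced identical strings, the two-branch version returns the same selector for every key.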
BetterMD/elements/symbol.py (2)
10-10: Consider making the recursion limit configurable
Setting a high recursion limit might be necessary for deep document trees, but hard-coding it could cause issues in some environments. Consider making this configurable:
-set_recursion_limit(10000)
+def configure_recursion_limit(limit=10000):
+    """Configure Python's recursion limit for deep document parsing."""
+    set_recursion_limit(limit)
+
+configure_recursion_limit()
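A further option, not part of the review suggestion, is to raise the limit only while parsing and restore it afterwards. The sketch below uses a context manager for that; the name `recursion_limit` is hypothetical and chosen only for this example.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit: int):
    """Temporarily set the interpreter's recursion limit."""
    previous = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        yield
    finally:
        # Always restore the earlier limit, even if parsing raised.
        sys.setrecursionlimit(previous)
```

A caller would wrap only the deep operation, for example `with recursion_limit(10000): ...`, so importing the package no longer changes a process-wide setting as a side effect.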
90-105: Consider adding docstring to handle_props method
This important method deserves a docstring explaining its purpose, parameters, and return value. This would improve maintainability:
     def handle_props(self, p):
+        """
+        Format HTML attributes from properties, styles, and classes.
+
+        Args:
+            p: If True, prints debug information.
+
+        Returns:
+            str: Formatted HTML attributes string, with a leading space if non-empty.
+        """
         props = {**({"class": self.classes} if self.classes else {}), **({"style": self.styles} if self.styles else {}), **self.props}
         # Rest of method...
BetterMD/elements/__init__.py (2)
3-3: Avoid wildcard imports
Replace the wildcard import with explicit imports of only the symbols you need from the svg module. This makes dependencies clearer and prevents unintended name conflicts.
-from .svg import *
+from .svg import (
+    SVG, Circle, Ellipse, Line, Path, Polygon, Polyline,
+    Rect, Stop, Text as SVGText  # Rename if there's a conflict
+)
🧰 Tools
🪛 Ruff (0.8.2)
3-3: from .svg import * used; unable to detect undefined names
(F403)
1-6: Add module docstring explaining the file's purpose
Add a docstring at the module level to explain the purpose of this file and how it should be used:
+"""
+Core elements for the BetterMD library.
+
+This module imports and re-exports all Symbol subclasses for convenient access.
+Users can import these symbols directly from BetterMD.elements rather than from
+their individual modules.
+"""
 from .symbol import Symbol
 from .comment import Comment
 from .svg import *
🧰 Tools
🪛 Ruff (0.8.2)
1-1: .symbol.Symbol imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
2-2: .comment.Comment imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: from .svg import * used; unable to detect undefined names (F403)
5-5: .text_formatting.Strong imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
5-5: .text_formatting.Em imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
5-5: .text_formatting.B imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
BetterMD/__init__.py (1 hunks)
BetterMD/elements/__init__.py (1 hunks)
BetterMD/elements/big.py (1 hunks)
BetterMD/elements/code.py (2 hunks)
BetterMD/elements/comment.py (1 hunks)
BetterMD/elements/img.py (1 hunks)
BetterMD/elements/style.py (1 hunks)
BetterMD/elements/svg.py (1 hunks)
BetterMD/elements/symbol.py (3 hunks)
BetterMD/elements/text.py (1 hunks)
BetterMD/parse/collection.py (1 hunks)
BetterMD/parse/html.py (1 hunks)
BetterMD/parse/typing.py (1 hunks)
BetterMD/utils.py (1 hunks)
src/better_md/__init__.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- BetterMD/elements/big.py
- BetterMD/parse/typing.py
- BetterMD/elements/text.py
🧰 Additional context used
🧬 Code Definitions (9)
BetterMD/elements/comment.py (3)
- BetterMD/elements/symbol.py (1): to_html (107-119)
- BetterMD/elements/text.py (1): to_html (17-18)
- BetterMD/utils.py (1): to_html (8-9)
BetterMD/utils.py (1)
- BetterMD/elements/symbol.py (4): Symbol (12-217), to_html (107-119), to_md (122-136), to_rst (138-143)
BetterMD/elements/img.py (1)
- BetterMD/elements/symbol.py (3): get_prop (203-204), to_rst (138-143), Symbol (12-217)
BetterMD/elements/code.py (2)
- BetterMD/rst/custom_rst.py (3): CustomRst (9-18), verify (18-18), to_rst (14-14)
- BetterMD/elements/symbol.py (2): to_rst (138-143), Symbol (12-217)
BetterMD/__init__.py (4)
- BetterMD/elements/symbol.py (4): Symbol (12-217), parse (160-201), from_html (146-150), from_md (155-157)
- BetterMD/parse/typing.py (1): parse (18-18)
- BetterMD/parse/html.py (2): parse (54-106), HTMLParser (5-210)
- BetterMD/parse/collection.py (1): Collection (8-27)
BetterMD/elements/symbol.py (15)
- BetterMD/parse/typing.py (3): parse (18-18), ELEMENT (10-14), TEXT (5-8)
- BetterMD/parse/html.py (3): parse (54-106), HTMLParser (5-210), children (12-13)
- BetterMD/parse/collection.py (3): Collection (8-27), add_symbols (13-14), find_symbol (19-27)
- BetterMD/utils.py (2): set_recursion_limit (17-18), to_html (8-9)
- BetterMD/elements/code.py (3): to_html (21-28), to_md (8-18), to_rst (34-63)
- BetterMD/elements/text.py (3): to_html (17-18), to_md (20-21), to_rst (23-24)
- BetterMD/elements/style.py (1): to_html (56-63)
- BetterMD/elements/a.py (2): to_md (6-7), to_rst (10-11)
- BetterMD/elements/table.py (11): to_md (8-29), to_md (77-106), to_md (114-122), to_md (125-127), to_md (137-138), to_rst (32-74), to_rst (109-111), to_rst (132-134), to_rst (141-147), to_rst (150-151), to_rst (154-156)
- BetterMD/elements/title.py (2): to_md (7-11), to_rst (14-18)
- BetterMD/elements/img.py (2): to_md (7-9), to_rst (12-13)
- BetterMD/elements/hr.py (2): to_md (6-7), to_rst (10-11)
- BetterMD/elements/li.py (3): to_md (6-9), to_md (21-24), to_rst (12-18)
- BetterMD/elements/text_formatting.py (1): to_md (6-8)
- BetterMD/elements/h.py (1): to_rst (9-14)
BetterMD/elements/__init__.py (6)
- BetterMD/elements/comment.py (2): Comment (11-14), HTML (4-9)
- BetterMD/elements/text_formatting.py (3): Strong (15-18), Em (25-28), B (20-23)
- BetterMD/elements/svg.py (2): Font (217-220), Style (339-342)
- BetterMD/elements/h.py (1): H1 (16-20)
- BetterMD/elements/style.py (2): HTML (10-63), Style (66-85)
- BetterMD/elements/table.py (2): Table (158-165), Tr (167-170)
BetterMD/elements/style.py (4)
- BetterMD/elements/symbol.py (2): Symbol (12-217), to_html (107-119)
- BetterMD/elements/comment.py (3): HTML (4-9), verify (8-9), to_html (5-6)
- BetterMD/elements/code.py (3): HTML (20-31), verify (30-31), to_html (21-28)
- BetterMD/elements/svg.py (1): Style (339-342)
BetterMD/parse/collection.py (1)
- BetterMD/elements/symbol.py (1): Symbol (12-217)
🪛 Ruff (0.8.2)
BetterMD/elements/__init__.py
1-1: .symbol.Symbol imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
2-2: .comment.Comment imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
3-3: from .svg import * used; unable to detect undefined names (F403)
5-5: .text_formatting.Strong imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
5-5: .text_formatting.Em imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
5-5: .text_formatting.B imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
7-7: .a.A imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
8-8: .abbr.Abbr imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
9-9: .acronym.Acronym imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
10-10: .address.Address imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
11-11: .area.Area imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
12-12: .article.Article imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
13-13: .aside.Aside imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
14-14: .audio.Audio imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
16-16: .base.Base imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
17-17: .bd.BDI imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
17-17: .bd.BDO imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
18-18: .big.Big imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
19-19: .blockquote.Blockquote imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
20-20: .body.Body imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
21-21: .br.Br imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
22-22: .button.Button imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
24-24: .canvas.Canvas imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
25-25: .caption.Caption imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
26-26: .center.Center imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
27-27: .cite.Cite imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
28-28: .code.Code imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
29-29: .col.Col imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
29-29: .col.Colgroup imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
31-31: .d.DD imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
31-31: .d.DFN imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
31-31: .d.DL imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
31-31: .d.DT imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
32-32: .data.Data imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
33-33: .datalist.DataList imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
34-34: .del_.Del imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
35-35: .details.Details imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
36-36: .dialog.Dialog imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
37-37: .dir.Dir imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
38-38: .div.Div imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
40-40: .embed.Embed imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
42-42: .fencedframe.FencedFrame imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
43-43: .fieldset.Fieldset imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
44-44: .figure.FigCaption imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
44-44: .figure.Figure imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
45-45: .font.Font imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
46-46: .footer.Footer imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
47-47: .form.Form imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
48-48: .frame.Frame imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
49-49: .frameset.Frameset imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
51-51: .h.H1 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
51-51: .h.H2 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
51-51: .h.H3 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
51-51: .h.H4 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
51-51: .h.H5 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
51-51: .h.H6 imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
52-52: .head.Head imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
53-53: .header.Header imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
54-54: .hgroup.HGroup imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
55-55: .hr.Hr imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
56-56: .html.HTML imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
58-58: .i.I imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
59-59: .iframe.Iframe imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
60-60: .img.Img imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
61-61: .input.Input imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
62-62: .ins.Ins imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
64-64: .kbd.Kbd imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
66-66: .label.Label imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
67-67: .legend.Legend imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
68-68: .li.OL imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
68-68: .li.UL imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
68-68: .li.LI imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
69-69: .link.Link imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
71-71: .main.Main imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
72-72: .map.Map imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
73-73: .mark.Mark imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
74-74: .marquee.Marquee imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
75-75: .menu.Menu imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
76-76: .meta.Meta imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
77-77: .meter.Meter imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
79-79: .nav.Nav imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
80-80: .no.NoFrames imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
80-80: .no.NoScript imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
80-80: .no.NoBr imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
80-80: .no.NoEmbed imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
82-82: .object.Object imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
83-83: .output.Output imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
85-85: .p.P imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
86-86: .param.Param imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
87-87: .picture.Picture imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
88-88: .plaintext.Plaintext imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
89-89: .progress.Progress imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
91-91: .q.Q imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
93-93: .ruby.RB imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
93-93: .ruby.RP imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
93-93: .ruby.RT imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
93-93: .ruby.RTC imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
95-95: .s.S
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
96-96: .samp.Samp
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
97-97: .script.Script
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
98-98: .search.Search
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
99-99: .section.Section
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
100-100: .select.Select
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
101-101: .slot.Slot
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
102-102: .small.Small
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
103-103: .source.Source
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
104-104: .span.Span imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
105-105: .strike.Strike imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
106-106: .style.Style imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
107-107: .sub.Sub imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
108-108: .summary.Summary imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
109-109: .sup.Sup imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
111-111: .table.Table imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
111-111: .table.Tr imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
111-111: .table.Td imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
111-111: .table.Th imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
111-111: .table.THead imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
111-111: .table.TBody imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
111-111: .table.TFoot imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
112-112: .template.Template imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
113-113: .text.Text imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
114-114: .textarea.Textarea imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
115-115: .time.Time imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
116-116: .title.Title imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
117-117: .track.Track imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
118-118: .tt.TT imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
120-120: .u.U imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
122-122: .var.Var imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
123-123: .video.Video imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
125-125: .wbr.WBR imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
127-127: .xmp.XMP imported but unused; consider removing, adding to __all__, or using a redundant alias (F401)
BetterMD/elements/style.py
45-48: Combine if branches using logical or operator
Combine if branches (SIM114)
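The SIM114 finding above can be illustrated with a small sketch. The actual code at lines 45-48 of style.py is not shown in this review, so the function name `classify` and its conditions are hypothetical and only demonstrate the shape of the fix:

```python
def classify(key: str) -> str:
    """Hypothetical example of the SIM114 fix, not BetterMD code."""
    # Before (the shape SIM114 flags): two branches with identical bodies.
    #   if key.startswith(":"):
    #       return "selector"
    #   elif key.startswith("#"):
    #       return "selector"
    # After: a single branch joined with a logical `or`.
    if key.startswith(":") or key.startswith("#"):
        return "selector"
    return "property"
```

Merging the branches keeps the behaviour identical while making it obvious that both conditions lead to the same outcome.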
BetterMD/parse/collection.py
3-3: ..html.CustomHTML imported but unused
Remove unused import: ..html.CustomHTML (F401)
🔇 Additional comments (25)
BetterMD/elements/img.py (1)
18-18: ❓ Verification inconclusive
Confirm that replacing the HTML instance with a string is acceptable.
Switching from html = HTML() to html = "img" could break any code that previously relied on methods from the old HTML class. Please conduct a thorough check for uses of the html attribute to confirm this change will not introduce regressions.
🏁 Script executed:
#!/bin/bash
rg -A 3 'Img' | rg '\.html'
Length of output: 27
Replacement of HTML instance appears safe, but please verify any dependent usage manually.
The current search did not reveal any subsequent calls (e.g. method invocations via html.) that would indicate reliance on the old HTML object. This suggests that switching from html = HTML() to html = "img" is unlikely to cause regressions. However, because the automated search returned no output, I recommend a manual review of the surrounding codebase, particularly any areas that might indirectly depend on methods previously available on the HTML class, to ensure no unintended side effects.
BetterMD/parse/html.py (8)
5-7: Class-level constant usage seems clear.
Defining NON_PARSING_TAGS at the class level is straightforward and logical.
8-10: Initialise attributes inside constructor effectively.
Calling self.reset() within __init__ is a clean approach. This ensures the parser’s internal state is consistently initialised.
11-14: children property is well-structured.
Exposing the current tag's children via a property is convenient for external usage.
31-45: create_element structure is flexible.
Returning a dictionary of attributes with references to parent is an effective approach. The function is clear and likely easy to extend.
46-53: create_text is straightforward.
Creating text nodes as dictionaries keeps the parser consistent with the element structure.
108-184: Check potential off-by-one for is_self_closing.
At line 175, you reference html[i-1] to detect self-closing tags. If i ever becomes 0, this would be invalid. You may wish to guard against i == 0 or restructure the logic.
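A minimal sketch of the guard suggested above, assuming the parser scans a string html with an index i that points at the closing ">" of a tag. The helper name `is_self_closing_at` is hypothetical and not part of BetterMD. Note that in Python, html[i - 1] with i == 0 does not raise; it silently reads the last character of the string, which is why an explicit bounds check is useful here:

```python
def is_self_closing_at(html: str, i: int) -> bool:
    """Return True if the '>' at index i is immediately preceded by '/'.

    Hypothetical helper for illustration. The bounds check prevents
    html[i - 1] from wrapping around to html[-1] when i == 0.
    """
    if i <= 0 or i >= len(html):
        return False
    return html[i] == ">" and html[i - 1] == "/"
```

With the guard in place, malformed input that starts with ">" is reported as not self-closing instead of depending on whatever character happens to end the document.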
185-195: Closing tag handling is straightforward.
The method cleanly updates the parser state and returns the new index.
196-210: Comment handling is effective.
Creating a “comment” element is helpful for debugging or maintaining DOM fidelity. Overall, this approach is well-designed.
BetterMD/elements/comment.py (1)
8-9: Verify method checks only part of HTML comment syntax
The verify method checks only for "!--", which is part of an HTML comment tag but not the complete opening tag ("<!--"). This might be intentional if your parser separates these components, but it could lead to incorrect matching if not.
Please confirm that your HTML parser separates the "<" from "!--" when processing comment tags. If not, consider updating this method to check for the complete comment start tag.
BetterMD/parse/collection.py (1)
21-22: ❓ Verification inconclusive
Consider enhancing symbol search logic
The current implementation only checks the qualified name. Based on past review comments, previous versions might have checked both the HTML attribute and CustomHTML.verify method. Consider adding these checks back if they're still needed.
🏁 Script executed:
#!/bin/bash
# Search for find_symbol method implementations in previous versions
rg -A 10 -B 2 "def find_symbol" --no-ignore
Length of output: 794
Enhanced Symbol Search Logic Evaluation
I've verified that the current implementation in BetterMD/parse/collection.py (lines 21-22) strictly compares symbol.__qualname__.lower() with the provided name. The earlier checks based on HTML attributes and the CustomHTML.verify method, as noted in previous review comments, are no longer present in the codebase. If these additional verifications are still required based on the intended symbol resolution logic, please reassess their necessity and reinstate them as needed.
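To make the trade-off concrete, here is a minimal sketch of a lookup that matches on the qualified name first and then falls back to a string-valued html attribute. The `Collection` class below is a hypothetical stand-in written for illustration, and the fallback pass is an assumption about what the earlier logic might have looked like, not the project's actual code:

```python
class Collection:
    """Hypothetical stand-in for BetterMD's Collection, for illustration only."""

    def __init__(self):
        self.symbols = []

    def add_symbols(self, symbol):
        self.symbols.append(symbol)

    def find_symbol(self, name):
        wanted = name.lower()
        # First pass: compare the class's qualified name, as the PR does.
        for symbol in self.symbols:
            if symbol.__qualname__.lower() == wanted:
                return symbol
        # Second pass (assumed fallback): compare a string-valued `html` attribute.
        for symbol in self.symbols:
            tag = getattr(symbol, "html", None)
            if isinstance(tag, str) and tag.lower() == wanted:
                return symbol
        return None
```

A fallback like this matters for classes whose names differ from their tag names, such as SVGScript with html = "script"; whether that resolution is desirable depends on how name clashes between HTML and SVG elements should be handled.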
- The method currently only checks the __qualname__ and returns the symbol on a match.
- No evidence was found of prior logic involving HTML attributes or the CustomHTML.verify method.
- Action required: Confirm whether the expanded symbol matching is necessary. If so, consider reintroducing the former logic to verify symbols using HTML attributes and/or the CustomHTML.verify method.
BetterMD/elements/code.py (4)
23-23: LGTM! Improved inner content handling
This change properly joins multiple inner elements into a single string, which is more robust than previous implementations.
26-26: Good change - simpler HTML output
The code now returns a simplified <code> tag instead of a <pre><code> combination for language-specific blocks, which is cleaner and more appropriate for modern HTML.
30-31: LGTM! Verify method implementation
Adding the verify method improves the object-oriented design by allowing identification of "code" elements during parsing.
68-68: Good update to use RST class
Replacing the string representation with the RST class instance improves code consistency and better aligns with the object-oriented design of the codebase.
3-3
: Unused imports included in__all__
The imported classes
Collection
,HTMLParser
, andMDParser
aren't used directly in this file, but they are included in__all__
to expose them to users of the module. While this is not incorrect, it might be clearer to add a comment explaining why they're imported.-from .parse import Collection, HTMLParser, MDParser +# Import these classes to expose them via __all__ +from .parse import Collection, HTMLParser, MDParser
39-39
: Good use of all for explicit exportsExplicitly defining the module's public API with
__all__
is a good practice. It clearly communicates what consumers of your module should use and helps control what gets imported withfrom module import *
.BetterMD/elements/style.py (3)
7-8: Well-defined type aliases for CSS styles
The type definitions for StyleValue and StyleDict are clear and properly handle the recursive nature of CSS styles. This is a good approach to typing CSS data structures.
20-54: Well-implemented CSS processing logic
The _process_styles method effectively handles nested CSS selectors and properties. The approach of separating properties from nested selectors and then recursively processing the nested ones is elegant and maintainable.
🧰 Tools
🪛 Ruff (0.8.2)
45-48: Combine if branches using logical or operator
Combine if branches (SIM114)
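The approach described for _process_styles (split plain properties from nested selectors, then recurse into the nested ones) can be sketched as follows. The type aliases mirror the names mentioned above, but their exact definitions in style.py are not shown in this review, so both the aliases and the `render_styles` function are assumptions made for illustration:

```python
from typing import Dict, List, Union

# Assumed shape of the aliases: a value is either a CSS value or a nested block.
StyleValue = Union[str, "StyleDict"]
StyleDict = Dict[str, StyleValue]


def render_styles(selector: str, styles: StyleDict) -> List[str]:
    """Flatten nested style dictionaries into a list of CSS rules (hypothetical)."""
    properties = []
    nested = []
    for key, value in styles.items():
        if isinstance(value, dict):
            nested.append((key, value))  # nested selector, handled recursively
        else:
            properties.append(f"{key}: {value};")

    rules = []
    if properties:
        rules.append(f"{selector} {{ {' '.join(properties)} }}")
    for key, value in nested:
        # Pseudo-selectors attach directly; other keys become descendant selectors.
        child = f"{selector}{key}" if key.startswith(":") else f"{selector} {key}"
        rules.extend(render_styles(child, value))
    return rules
```

For example, render_styles("a", {"color": "red", ":hover": {"color": "blue"}}) produces one rule for a and one for a:hover.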
66-80: Well-documented Style class with clear parameters
The Style class is well-documented with proper docstrings explaining the purpose of each parameter. This is excellent practice for maintaining code readability and aiding future developers.
BetterMD/elements/symbol.py (4)
23-25: Excellent use of __init_subclass__ for symbol registration
Using __init_subclass__ to automatically register all Symbol subclasses is an elegant pattern. This ensures that all symbols are properly registered with the collection without requiring manual registration calls.
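The registration pattern praised above can be sketched in a few lines. This is a simplified stand-in, assuming a registry object with an `add` method; the `Registry` class and its attribute names are hypothetical and do not reproduce BetterMD's actual Symbol or Collection classes:

```python
class Registry:
    """Hypothetical minimal registry used only for this illustration."""

    def __init__(self):
        self.symbols = []

    def add(self, cls):
        self.symbols.append(cls)


class Symbol:
    collection = Registry()

    def __init_subclass__(cls, **kwargs):
        # Called automatically each time a subclass of Symbol is defined.
        super().__init_subclass__(**kwargs)
        cls.collection.add(cls)


class Rect(Symbol):
    html = "rect"


class Svg(Symbol):
    html = "svg"
```

Because the hook runs at class-definition time, merely importing a module that defines a subclass is enough to register it, and the base class itself is never added.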
32-42: Proper handling of mutable default arguments
This implementation correctly handles mutable default arguments by using None as the default and initializing empty containers inside the method body. This addresses previous issues with shared mutable state.
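As a brief illustration of why the None-default idiom matters, consider this sketch. The parameter names `styles`, `classes` and `inner` are assumptions chosen for the example and may not match the real Symbol constructor:

```python
class Symbol:
    """Hypothetical constructor showing the None-default idiom."""

    def __init__(self, styles=None, classes=None, inner=None):
        # A fresh container is created per instance. A literal default such as
        # `classes=[]` would be evaluated once and shared by every instance.
        self.styles = {} if styles is None else styles
        self.classes = [] if classes is None else classes
        self.inner = [] if inner is None else inner


a = Symbol()
b = Symbol()
a.classes.append("highlight")  # does not leak into b
```

Each instance owns its own containers, so mutating one symbol's classes or styles cannot affect another.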
160-171: Well-structured recursive parsing logic
The recursive element handling in the parse method is well-implemented, with proper class lookup and error handling. The use of assertions ensures that expected behavior occurs and helps with debugging.
214-217: Good implementation of string representation methods
Adding proper __str__ and __repr__ methods makes debugging easier and improves the overall developer experience when working with Symbol objects.
BetterMD/elements/__init__.py (1)
1-127: Define __all__ to explicitly specify the public API
As suggested in a previous review, you should define an __all__ list to explicitly specify which symbols are part of your public API. This makes your module's interface clear and controls what gets imported with from BetterMD.elements import *.
__all__ = [
    "Symbol", "Comment", "A", "Abbr", "Acronym", "Address", "Area", "Article", "Aside", "Audio",
    "Base", "BDI", "BDO", "Big", "Blockquote", "Body", "Br", "Button",
    # ... Include all other classes you want to expose
]
class Polygon(Symbol):
    prop_list = ["points"]
    html = "polygon"
    md = ""
    rst = ""

class Polyline(Symbol):
    prop_list = ["points"]
    html = "polyline"
    md = ""
    rst = ""

class RadialGradient(Symbol):
    html = "radialGradient"
    md = ""
    rst = ""

class Rect(Symbol):
    prop_list = ["x", "y", "width", "height", "rx", "ry"]
    html = "rect"
    md = ""
    rst = ""

class SVGScript(Symbol):
    html = "script"
    md = ""
    rst = ""

class Set(Symbol):
    html = "set"
    md = ""
    rst = ""

class Stop(Symbol):
    html = "stop"
    md = ""
    rst = ""

class Style(Symbol):
    html = "style"
    md = ""
    rst = ""

class Svg(Symbol):
    prop_list = ["width", "height", "viewBox"]
    html = "svg"
    md = ""
    rst = ""

class Switch(Symbol):
    html = "switch"
    md = ""
    rst = ""

class SVGSymbol(Symbol):
    html = "symbol"
    md = ""
    rst = ""

class SVGText(Symbol):
    html = "text"
    md = ""
    rst = ""

class TextPath(Symbol):
    html = "textPath"
    md = ""
    rst = ""

class Title(Symbol):
    html = "title"
    md = ""
    rst = ""

class TRef(Symbol):
    html = "tref"
    md = ""
    rst = ""

class TSpan(Symbol):
    html = "tspan"
    md = ""
    rst = ""

class Use(Symbol):
    prop_list = ["href", "x", "y", "width", "height"]
    html = "use"
    md = ""
    rst = ""

class View(Symbol):
    html = "view"
    md = ""
    rst = ""

class VKern(Symbol):
    html = "vkern"
    md = ""
    rst = ""
🛠️ Refactor suggestion
Consider implementing common methods for SVG elements.
The file defines numerous SVG element classes but doesn't include any methods for manipulating them or converting them to/from different formats. Consider adding methods that would be useful for working with SVG elements in the context of the BetterMD package.
For example, you might add a method to convert an SVG element to its HTML representation:
def to_html(self, attributes=None, content=None):
    """Convert the SVG element to its HTML representation."""
    attrs = ""
    if attributes:
        for key, value in attributes.items():
            attrs += f' {key}="{value}"'
    if content:
        return f"<{self.html}{attrs}>{content}</{self.html}>"
    else:
        return f"<{self.html}{attrs} />"
This could be added to the Symbol base class or implemented specifically for SVG elements.
Is the default to_html in symbol not good enough for svgs?
🧩 Analysis chain
@R5dan You're absolutely right. After examining the Symbol class implementation, the default to_html method should be sufficient for SVG elements. The method already:
- Properly handles HTML attributes and properties
- Manages element nesting and content rendering
- Processes styles and classes correctly
- Supports indentation for formatted output
Your SVG classes inherit this functionality from Symbol and properly define their HTML tag names and property lists. The existing implementation should render SVG elements correctly without needing any special methods.
My earlier suggestion was unnecessary in this case, as the architecture you've designed already handles SVG elements appropriately through inheritance.
🏁 Script executed:
#!/bin/bash
# Let's examine the Symbol class and its to_html method
cat BetterMD/elements/symbol.py
Length of output: 7429
@R5dan You're absolutely right. After examining the Symbol class implementation, the default to_html method is perfectly sufficient for SVG elements. The existing implementation:
- Properly handles HTML tag rendering with appropriate opening and closing tags
- Processes various attribute types correctly through the handle_props method
- Handles nested elements and inner content appropriately
- Distinguishes between elements with content and self-closing elements
The SVG classes you've defined already set the correct html attribute values and appropriate prop_list items. Since they inherit from Symbol, they automatically get all this HTML generation functionality.
My earlier suggestion wasn't needed - your architecture already elegantly handles SVG elements through the base Symbol class's implementation.
class Svg(Symbol):
    prop_list = ["width", "height", "viewBox"]
    html = "svg"
    md = ""
    rst = ""
🛠️ Refactor suggestion
Enhance SVG root element properties.
The Svg class property list includes basic dimensions but omits other important attributes like 'preserveAspectRatio' and 'xmlns'. The xmlns attribute is particularly important for stand-alone SVG documents.
 class Svg(Symbol):
-    prop_list = ["width", "height", "viewBox"]
+    prop_list = ["width", "height", "viewBox", "preserveAspectRatio", "xmlns", "xmlns:xlink"]
     html = "svg"
     md = ""
     rst = ""
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-class Svg(Symbol):
-    prop_list = ["width", "height", "viewBox"]
-    html = "svg"
-    md = ""
-    rst = ""
+class Svg(Symbol):
+    prop_list = ["width", "height", "viewBox", "preserveAspectRatio", "xmlns", "xmlns:xlink"]
+    html = "svg"
+    md = ""
+    rst = ""
class HTML(CustomHTML):
    def to_html(self, inner, symbol, parent):
        return f"<img src={symbol.get_prop('src')} alt={symbol.get_prop('alt', '')} />"
        return f"![{symbol.get_prop('alt', '')}]({symbol.get_prop('src')})"
💡 Verification agent
🧩 Analysis chain
Consider validating the image properties.
While the syntax for returning the Markdown image reference is correct, it may be prudent to handle cases where the src property is missing or empty, to prevent broken images being rendered.
Could you search the codebase references of Img to ensure we never call get_prop('src') when src is likely to be None or empty?
🏁 Script executed:
#!/bin/bash
rg -A 3 'Img'
Length of output: 610
Action: Validate the 'src' Property in BetterMD/elements/img.py
The current implementation in line 9 directly uses symbol.get_prop('src') without checking if the value is missing or empty, which could result in broken image links. Our search of Img references did not reveal any existing safeguards against this scenario. It is therefore recommended to add validation to ensure that the src property is non-empty before rendering the Markdown image.
- In BetterMD/elements/img.py, consider updating the method to perform a check on the src property. For example:
  src = symbol.get_prop('src')
  if not src:
      # Handle the missing or empty src appropriately (e.g. return a fallback, log an error, or skip rendering)
      return ""
  return f"![{symbol.get_prop('alt', '')}]({src})"
- Ensuring that the src property is valid will help prevent broken images in the rendered Markdown.
text = r.get(url).text

if text.startswith("<!DOCTYPE html>"):
    text = text[15:]

return Symbol.from_html(text)
🛠️ Refactor suggestion
Improve error handling in URL requests
The `from_url` method doesn't handle potential network errors when fetching content. Add try-except blocks to gracefully handle connection errors, timeouts, and other HTTP issues.
 @staticmethod
 def from_url(url):
     import requests as r
-    text = r.get(url).text
+    try:
+        response = r.get(url, timeout=10)
+        response.raise_for_status()  # Raise an exception for HTTP errors
+        text = response.text
+    except Exception as e:
+        raise ValueError(f"Failed to fetch URL content: {e}")
     if text.startswith("<!DOCTYPE html>"):
         text = text[15:]
     return Symbol.from_html(text)
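The snippet above strips the doctype only when the text begins with the exact string `<!DOCTYPE html>`. A minimal sketch of a more tolerant check, assuming that lower-case or legacy doctypes and leading whitespace should also be handled; `strip_doctype` is a hypothetical helper, not part of the PR.

```python
import re

# Leading whitespace followed by a doctype declaration in any letter case.
_DOCTYPE_RE = re.compile(r"\s*<!doctype[^>]*>", re.IGNORECASE)


def strip_doctype(text):
    """Remove a leading doctype declaration, if present; otherwise return text unchanged."""
    match = _DOCTYPE_RE.match(text)
    return text[match.end():] if match else text
```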
import requests as r | ||
text = r.get(url).text | ||
return Symbol.from_md(text) |
🛠️ Refactor suggestion
Add consistent error handling to MD.from_url
For consistency with the suggested improvements to HTML.from_url, implement the same error handling here:
 @staticmethod
 def from_url(url):
     import requests as r
-    text = r.get(url).text
-    return Symbol.from_md(text)
+    try:
+        response = r.get(url, timeout=10)
+        response.raise_for_status()  # Raise an exception for HTTP errors
+        text = response.text
+        return Symbol.from_md(text)
+    except Exception as e:
+        raise ValueError(f"Failed to fetch URL content: {e}")
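Since the same error handling is suggested for both `from_url` methods, one option is a shared helper. The sketch below is an assumption, not code from the PR: `fetch_text` is a hypothetical name, and the `get` parameter is an injection point (standing in for `requests.get`) so the example can run without network access.

```python
def fetch_text(url, get, timeout=10):
    """Fetch `url` using `get` and return the response body as text.

    `get` is expected to behave like `requests.get`: it accepts a `timeout`
    keyword and returns an object with `raise_for_status()` and `.text`.
    """
    try:
        response = get(url, timeout=timeout)
        response.raise_for_status()  # For a requests.Response this raises on 4xx/5xx
        return response.text
    except Exception as e:
        # Chain the original error so the traceback keeps the root cause.
        raise ValueError(f"Failed to fetch URL content: {e}") from e
```

With something like this in place, both methods could reduce to a call such as `fetch_text(url, requests.get)` followed by the relevant `Symbol.from_html` or `Symbol.from_md` call.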
Actionable comments posted: 2
♻️ Duplicate comments (1)
BetterMD/parse/html.py (1)
85-85: ⚠️ Potential issue
Add a boundary check when accessing `html[i + 1]`.
Accessing `html[i + 1]` risks going out of bounds if `i` is `len(html) - 1`. Consider verifying `i + 1 < len(html)` before this access.
-elif html[i + 1] == '/':
+elif (i + 1 < len(html)) and html[i + 1] == '/':
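A small sketch of the same lookahead written so that it cannot raise `IndexError`. The helper name `is_closing_tag_at` is hypothetical; it relies on `str.startswith` accepting a start offset and returning `False` when the string is too short.

```python
def is_closing_tag_at(html, i):
    """Return True if a closing tag ("</") starts at index i of html."""
    # startswith with an offset never indexes past the end of the string.
    return html.startswith("</", i)
```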
🧹 Nitpick comments (2)
BetterMD/elements/symbol.py (1)
159-163: Consider removing or replacing the debug file write.
Temporary debug operations (writing to `t2.json`) may lead to performance or security concerns in production environments. If this is purely for debugging, consider guarding it with a flag or removing it altogether.
159 with open("t2.json", "w") as f:
160
161     d = j.dumps([handle(elm) for elm in parsed.copy()])
162     f.write(d)
163 + # Consider removing or guarding the debug file write for production.
BetterMD/elements/style.py (1)
66-79: Document the usage of the `style` and `raw` parameters.
The constructor neatly initialises embedded style rules and raw CSS text. Consider adding clarifying docstrings or inline comments explaining how these fields integrate with other elements (e.g. how `raw` appends extra CSS text to the generated `<style>` block).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- BetterMD/elements/style.py (1 hunks)
- BetterMD/elements/symbol.py (3 hunks)
- BetterMD/parse/html.py (1 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
BetterMD/elements/style.py (4)
- BetterMD/elements/symbol.py (2): Symbol (12-230), to_html (108-120)
- BetterMD/html/custom_html.py (1): CustomHTML (9-15)
- BetterMD/elements/comment.py (3): HTML (4-9), verify (8-9), to_html (5-6)
- BetterMD/elements/svg.py (1): Style (339-342)
BetterMD/elements/symbol.py (5)
- BetterMD/parse/html.py (3): parse (54-104), HTMLParser (5-205), children (12-13)
- BetterMD/parse/typing.py (3): parse (18-18), ELEMENT (10-14), TEXT (5-8)
- BetterMD/parse/markdown/parser.py (1): MDParser (9-197)
- BetterMD/parse/collection.py (2): Collection (8-27), find_symbol (19-27)
- BetterMD/utils.py (1): set_recursion_limit (17-18)
🪛 Ruff (0.8.2)
BetterMD/parse/html.py
65-65: SyntaxError: Expected an indented block after `if` statement
BetterMD/elements/style.py
45-48: Combine `if` branches using logical `or` operator. Combine `if` branches (SIM114)
🔇 Additional comments (1)
BetterMD/elements/style.py (1)
43-50: Refine optional compound checks.
This chain of `if`/`elif` blocks is perfectly readable, although the static analysis tool suggests merging them with logical operators. If you prefer brevity, you could apply a single check with `(key.startswith(':') or key.startswith('#') or ...)`. However, the current approach is also clear and explicit.
🧰 Tools
🪛 Ruff (0.8.2)
45-48: Combine `if` branches using logical `or` operator. Combine `if` branches (SIM114)
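If the branches are merged, `str.startswith` also accepts a tuple of prefixes, which avoids the chained `or`. The sketch below is illustrative only: the prefix set and the name `is_nested_selector` are assumptions, since the actual prefixes checked in `style.py` are not shown in this review.

```python
# Hypothetical prefix set; the real list in style.py is elided in the review.
SELECTOR_PREFIXES = (":", "#", ".", "@")


def is_nested_selector(key):
    """Return True if key starts with any of the selector prefixes."""
    # A tuple argument makes startswith test each prefix in turn.
    return key.startswith(SELECTOR_PREFIXES)
```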
BetterMD/parse/html.py
Outdated
if self.current_non_parsing_tag.lower() == "script": | ||
# Found closing tag, create element with unparsed content | ||
self.children.append(self.create_text(self.non_parsing_content)) |
Indent the block under the `if` statement.
The code below causes a syntax error (as flagged by static analysis): the body on lines 64-65 is not indented. It must be indented under the `if` condition to avoid a `SyntaxError`.
 63 if self.current_non_parsing_tag.lower() == "script":
-64 # Found closing tag, create element with unparsed content
-65 self.children.append(self.create_text(self.non_parsing_content))
+64     # Found closing tag, create element with unparsed content
+65     self.children.append(self.create_text(self.non_parsing_content))
Actionable comments posted: 0
♻️ Duplicate comments (1)
BetterMD/elements/symbol.py (1)
92-107: Style attribute formatting logic could be improved.
While the `handle_props` method properly handles different property types, the style dictionary formatting in lines 102-103 could be improved for clarity and maintainability.
Consider using a more readable approach for style dictionary formatting:
- elif isinstance(v, dict):
-     prop_list.append(f'{k}="{"; ".join([f"{k}:{v}" for k,v in v.items()])}"')
+ elif isinstance(v, dict):
+     style_pairs = [f"{sk}:{sv}" for sk, sv in v.items()]
+     joined_styles = "; ".join(style_pairs)
+     prop_list.append(f'{k}="{joined_styles}"')
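To show the suggested formatting in isolation, here is a minimal standalone sketch. `format_style_attr` is a hypothetical function name; it mirrors the three-step version above and avoids both the shadowed `k`/`v` names and the nested quotes inside a single f-string.

```python
def format_style_attr(name, styles):
    """Render a dict of CSS declarations as an HTML attribute string."""
    pairs = [f"{prop}:{value}" for prop, value in styles.items()]
    joined = "; ".join(pairs)
    return f'{name}="{joined}"'
```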
🧹 Nitpick comments (2)
BetterMD/elements/symbol.py (2)
21-23: Class-level parser initialization may cause unexpected behavior.
Creating class-level instances of Collection, HTMLParser, and MDParser means all Symbol subclasses will share the same parsers. While this makes sense for Collection (which needs to be shared), the parsers might need configuration that differs between subclasses.
Consider using lazy initialization of parsers or allowing subclasses to provide their own parser instances:
- collection = Collection()
- html_parser = HTMLParser()
- md_parser = MDParser()
+ collection = Collection()
+ _html_parser = None
+ _md_parser = None
+
+ @classmethod
+ def get_html_parser(cls):
+     if cls._html_parser is None:
+         cls._html_parser = HTMLParser()
+     return cls._html_parser
+
+ @classmethod
+ def get_md_parser(cls):
+     if cls._md_parser is None:
+         cls._md_parser = MDParser()
+     return cls._md_parser
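One subtlety of lazy class-level caching is that a subclass inherits the cached attribute once the base class has created it. The sketch below is a hedged variation that checks only the class's own namespace so each subclass gets its own instance; `FakeParser`, `Node`, and `Child` are stand-ins invented for the example, not BetterMD classes.

```python
class FakeParser:
    """Stand-in for HTMLParser/MDParser (hypothetical)."""
    instances = 0

    def __init__(self):
        FakeParser.instances += 1


class Node:
    _parser = None

    @classmethod
    def get_parser(cls):
        # Look only at this class's own attributes, not inherited ones,
        # so every subclass lazily creates and caches its own parser.
        if cls.__dict__.get("_parser") is None:
            cls._parser = FakeParser()
        return cls._parser


class Child(Node):
    pass
```

If sharing one parser across all subclasses is actually the desired behaviour, the plain `cls._parser is None` check from the suggestion is closer to that, provided the base class is initialised first.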
159-200: Parse method has complex nested structure.
The `parse` class method includes a nested function and complex parsing logic. While functionally correct, this approach increases cognitive load.
Consider extracting the `handle_element` nested function to a separate class method to improve readability and testability:
+ @classmethod
+ def handle_element(cls, element:'ELEMENT|TEXT'):
+     if element['type'] == 'text':
+         text = cls.collection.find_symbol("text", raise_errors=True)
+         assert text is not None, "`collection.find_symbol` is broken"
+         return text(element['content'])
+
+     symbol_cls = cls.collection.find_symbol(element['name'], raise_errors=True)
+     assert symbol_cls is not None, "`collection.find_symbol` is broken"
+
+     return symbol_cls.parse(element)

  @classmethod
  def parse(cls, text:'ELEMENT|TEXT') -> 'Symbol':
-     def handle_element(element:'ELEMENT|TEXT'):
-         if element['type'] == 'text':
-             text = cls.collection.find_symbol("text", raise_errors=True)
-             assert text is not None, "`collection.find_symbol` is broken"
-             return text(element['content'])
-
-         symbol_cls = cls.collection.find_symbol(element['name'], raise_errors=True)
-         assert symbol_cls is not None, "`collection.find_symbol` is broken"
-
-         return symbol_cls.parse(element)
-
      if text["type"] == "text":
          return cls.collection.find_symbol("text", raise_errors=True)(text["content"])
      # ... rest of method ...
-     inner=[handle_element(elm) for elm in text["children"]]
+     inner=[cls.handle_element(elm) for elm in text["children"]]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- BetterMD/elements/symbol.py (3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
BetterMD/elements/symbol.py (4)
- BetterMD/parse/html.py (3): parse (54-104), HTMLParser (5-205), children (12-13)
- BetterMD/parse/typing.py (3): parse (18-18), ELEMENT (10-14), TEXT (5-8)
- BetterMD/parse/markdown/parser.py (1): MDParser (9-197)
- BetterMD/parse/collection.py (3): Collection (8-27), add_symbols (13-14), find_symbol (19-27)
🔇 Additional comments (10)
BetterMD/elements/symbol.py (10)
6-8: Imports have been updated to support new parsing functionality.
The addition of parsing-related imports (HTMLParser, MDParser, ELEMENT, TEXT, Collection) along with utility classes like List and type definitions supports the new parsing capabilities added to the Symbol class.
18-19: New flag attributes enhance markup control.
The addition of `block` and `self_closing` flags provides better control over HTML output formatting and Markdown rendering. These attributes are properly used in the rendering methods.
25-27: Automatic subclass registration improves usability.
The `__init_subclass__` method automatically registers subclasses in the collection, which simplifies the library's usage and reduces chances of registration errors.
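The registration pattern described here can be sketched in a few lines. `Registry`, `Base`, `Bold`, and `Italic` are hypothetical names used only to illustrate how `__init_subclass__` can populate a shared collection; they are not the BetterMD classes.

```python
class Registry:
    """Minimal stand-in for a shared symbol collection."""

    def __init__(self):
        self.symbols = []

    def add_symbols(self, *symbols):
        self.symbols.extend(symbols)


class Base:
    collection = Registry()

    def __init_subclass__(cls, **kwargs):
        # Runs once for every subclass at class-definition time.
        cls.collection.add_symbols(cls)
        super().__init_subclass__(**kwargs)


class Bold(Base):
    pass


class Italic(Base):
    pass
```

Because the hook fires when the class body is executed, merely importing a module that defines a subclass is enough to register it.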
29-44: Mutable default arguments issue has been properly addressed.
The constructor now uses `None` as the default value for mutable arguments (`styles`, `classes`, `inner`) and properly initializes them to empty collections if not provided. This addresses the previous review comments about mutable defaults.
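For readers unfamiliar with the pitfall, the following sketch contrasts the two styles. The function names are hypothetical and unrelated to the `Symbol` constructor; they only demonstrate why `None` is used as the sentinel.

```python
def make_children_shared(inner=[]):
    # The default list is created once, at definition time, and reused on every call.
    inner.append("x")
    return inner


def make_children_fresh(inner=None):
    # A new list is created on each call when no argument is given.
    if inner is None:
        inner = []
    inner.append("x")
    return inner
```

Calling `make_children_shared()` repeatedly keeps returning the same, growing list, whereas `make_children_fresh()` returns an independent list each time.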
46-53: Copy method also addresses mutable defaults issue.
The `copy` method now also follows best practices by using `None` as the default value for mutable arguments and initializing them appropriately.
108-121: HTML output method properly handles self-closing tags.
The updated `to_html` method now correctly renders self-closing tags when appropriate and includes proper handling of child elements.
127-137: Markdown output method carefully handles element types.
The updated `to_md` method now properly formats blocks, handles newlines, and ensures correct rendering of different element types in Markdown.
147-150: Debug file write was removed as requested.
The `from_html` method no longer contains the debugging code that wrote parsing results to 't2.json', addressing a previous review comment.
213-216: String representation enhances debugging.
The addition of `__str__` and `__repr__` methods provides helpful string representations of Symbol instances, which improves debugging and logging capabilities.
10-10: ❓ Verification inconclusive
High recursion limit may indicate deeper issues.
Setting a high recursion limit (10000) is generally a symptom of deeply nested structures or recursive algorithms that could be optimized. Consider reviewing the parsing algorithms to ensure they're not overly recursive.
🏁 Script executed:
#!/bin/bash
# Search for recursive calls in the parsing code
echo "Searching for recursive patterns in the parsing code:"
rg -A 3 -B 3 "def parse.*:" BetterMD/parse/
Length of output: 2108
Caution Regarding High Recursion Limit in Parsing Algorithms
The recent grep search of the parsing modules revealed only standard definitions of various `parse` methods without any clear evidence of deep recursive calls. Nonetheless, the high recursion limit specified on line 10 of `BetterMD/elements/symbol.py` might still be masking underlying issues. Please double-check that this limit is truly necessary, either by confirming that the parsing logic cannot be restructured into a less recursion-intensive implementation or by adding a comment that documents why such a high limit is justified.
- File: `BetterMD/elements/symbol.py`, Line 10: `set_recursion_limit(10000)`
- Observation: Definitions across `BetterMD/parse/` do not visibly demonstrate problematic recursion; however, manual verification and further testing are advised.
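One way to avoid depending on a raised recursion limit is to walk the parsed tree with an explicit stack. The sketch below assumes the node shape visible elsewhere in this review (dicts with `type`, `content`, and `children` keys); `collect_text` is a hypothetical helper and not part of BetterMD.

```python
def collect_text(root):
    """Return the content of every text node under root, in document order."""
    out = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node["type"] == "text":
            out.append(node["content"])
        else:
            # Push children reversed so the leftmost child is processed first.
            stack.extend(reversed(node["children"]))
    return out
```

The same pattern could in principle replace recursive conversion from parsed dicts to `Symbol` objects, at the cost of somewhat more bookkeeping.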
Actionable comments posted: 1
♻️ Duplicate comments (1)
BetterMD/parse/html.py (1)
84-85: Prevent indexing risk when checking for closing tags.
Accessing `html[i + 1]` can cause an out-of-bounds error if `i` points to the last character of the string. A similar concern was raised in an earlier review.
-elif html[i + 1] == '/':
+elif (i + 1 < len(html)) and html[i + 1] == '/':
     # Closing tag
     i = self.handle_closing_tag(html, i + 2)
🧹 Nitpick comments (3)
BetterMD/parse/html.py (3)
5-6: Consider adding a class-level docstring.
Providing a short, descriptive docstring for the `HTMLParser` class will help other developers quickly understand the parser's purpose and usage.
105-178: Refactor the `handle_opening_tag` method for reduced complexity.
This method handles multiple responsibilities (tag name extraction, attribute parsing, self-closing detection), making it lengthy. Consider splitting out attribute parsing into its own helper method to enhance readability and maintainability.
190-205: Add an option to preserve or discard comments at parse time.
Currently, comments are parsed and included in the DOM, which is valid. However, some use-cases may require discarding HTML comments. Providing a toggle or method parameter to strip comments would make the parser more flexible.
if self.buffer: | ||
self.dom.append(self.create_text(self.buffer)) | ||
|
||
return self.dom | ||
|
🛠️ Refactor suggestion
Consider handling unmatched or partially opened tags at EOF.
When the parser reaches the end of the HTML but still has an unclosed tag, the code does not currently handle it. This can leave the DOM in an inconsistent state if the input is malformed.
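A rough sketch of how unclosed tags could be detected at end of input. It operates on a simplified, hypothetical event stream of `("open", name)` and `("close", name)` pairs rather than on the parser's real internal state, which is not shown in this review.

```python
def unclosed_tags(events):
    """Return the names of tags still open after all events are consumed."""
    stack = []
    for kind, name in events:
        if kind == "open":
            stack.append(name)
        elif kind == "close" and name in stack:
            # Pop back to the matching opener, implicitly closing anything nested in it.
            while stack.pop() != name:
                pass
        # A close event with no matching opener is ignored in this sketch.
    return stack
```

What to do with the leftover names (auto-close them, raise, or warn) is a policy choice the parser would need to make.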
Actionable comments posted: 10
♻️ Duplicate comments (4)
BetterMD/__init__.py (3)
1-7: Symbol should be explicitly imported.
The code uses `Symbol` at multiple places but it's only imported through a star import. This could lead to confusion and potential issues if the import structure changes.
 from .elements import *
+from .elements.symbol import Symbol  # Explicitly import Symbol
 from .html import CustomHTML
 from .markdown import CustomMarkdown
 from .rst import CustomRst
 from .parse import HTMLParser, MDParser, Collection
🧰 Tools
🪛 Ruff (0.8.2)
2-2: `from .elements import *` used; unable to detect undefined names (F403)
17-25: Improve error handling in URL requests.
The `from_url` method doesn't handle potential network errors when fetching content, which could cause unexpected failures for users.
 @staticmethod
 def from_url(url):
     import requests as r
-    text = r.get(url).text
+    try:
+        response = r.get(url, timeout=10)
+        response.raise_for_status()  # Raise an exception for HTTP errors
+        text = response.text
+    except Exception as e:
+        raise ValueError(f"Failed to fetch HTML content: {e}")
     if text.startswith("<!DOCTYPE html>"):
         text = text[15:]
     return Symbol.from_html(text)
🧰 Tools
🪛 Ruff (0.8.2)
25-25: `Symbol` may be undefined, or defined from star imports (F405)
36-40: Add error handling to MD.from_url method.
Similar to the HTML.from_url method, this method should handle potential network errors for consistency.
 @staticmethod
 def from_url(url):
     import requests as r
-    text = r.get(url).text
-    return Symbol.from_md(text)
+    try:
+        response = r.get(url, timeout=10)
+        response.raise_for_status()  # Raise an exception for HTTP errors
+        text = response.text
+        return Symbol.from_md(text)
+    except Exception as e:
+        raise ValueError(f"Failed to fetch Markdown content: {e}")
🧰 Tools
🪛 Ruff (0.8.2)
40-40: `Symbol` may be undefined, or defined from star imports (F405)
BetterMD/elements/symbol.py (1)
92-106: 🛠️ Refactor suggestion
Improve style attribute formatting in handle_props method.
The current implementation of style attribute formatting might produce incorrect CSS if the input dictionary keys or values contain spaces or special characters. Consider using a more robust approach.
 elif isinstance(v, dict):
-    prop_list.append(f'{k}="{"; ".join([f"{k}:{v}" for k,v in v.items()])}"')
+    style_parts = []
+    for style_key, style_val in v.items():
+        # Trim whitespace and ensure proper formatting
+        clean_key = style_key.strip()
+        clean_val = str(style_val).strip()
+        if clean_key and clean_val:
+            style_parts.append(f"{clean_key}: {clean_val}")
+    prop_list.append(f'{k}="{"; ".join(style_parts)}"')
🧹 Nitpick comments (17)
TODO.md (2)
10-14: Good addition of detailed parsing tasks.
The new subtasks provide better granularity for tracking parsing implementation progress. However, the indentation is inconsistent with markdown standards.
Consider standardising indentation to 2 spaces per level rather than 4:
-    - [x] Add HTML parsing
-    - [x] Add MD parsing
-        - [x] Add inline support
-        - [x] Add escape chars (\\)
-    - [ ] Add RST parsing
+  - [x] Add HTML parsing
+  - [x] Add MD parsing
+    - [x] Add inline support
+    - [x] Add escape chars (\\)
+  - [ ] Add RST parsing
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
10-10: Unordered list indentation. Expected: 2; Actual: 4 (MD007, ul-indent)
11-11: Unordered list indentation. Expected: 2; Actual: 4 (MD007, ul-indent)
12-12: Unordered list indentation. Expected: 4; Actual: 8 (MD007, ul-indent)
13-13: Unordered list indentation. Expected: 4; Actual: 8 (MD007, ul-indent)
14-14: Unordered list indentation. Expected: 2; Actual: 4 (MD007, ul-indent)
18-21: Typo in section heading and temporary content.
There is a typo in "Inportant" (should be "Important"). Also, this section is marked for deletion, suggesting it's temporary.
Consider fixing the typo and addressing the tasks in this section:
-## Inportant
+## Important
Note that the instruction to "Delete this section" indicates this is temporary content that should be removed after addressing the table.py issue.
tests/test.py (1)
1-3: Updated imports and commented out debug mode.
The import statement now includes newer components like B, THead, and TBody. Note that several imports are unused according to static analysis.
Consider removing unused imports to improve code clarity:
-from BetterMD import H1, H2, Text, Div, LI, OL, UL, A, B, Table, Tr, Td, Th, THead, TBody, Blockquote, I, Input, CustomRst, CustomHTML, CustomMarkdown, enable_debug_mode
+from BetterMD import H1, Text, Div, LI, OL, UL, A, B, Table, Tr, Td, Th, THead, TBody, Blockquote, I, Input
Also, should the `enable_debug_mode()` call remain commented out? If it's needed for development, consider documenting why it's commented.
🧰 Tools
🪛 Ruff (0.8.2)
1-1: `BetterMD.H2` imported but unused. Remove unused import (F401)
1-1: `BetterMD.CustomRst` imported but unused. Remove unused import (F401)
1-1: `BetterMD.CustomHTML` imported but unused. Remove unused import (F401)
1-1: `BetterMD.CustomMarkdown` imported but unused. Remove unused import (F401)
1-1: `BetterMD.enable_debug_mode` imported but unused. Remove unused import (F401)
BetterMD/utils.py (3)
10-13: Implement the empty hook methods.
The `on_set` and `on_append` methods are using empty implementations (`...`). These appear to be hook methods for subclasses, but should include basic implementations or docstrings explaining their purpose.
 class List(list['Symbol'], t.Generic[T]):
-    def on_set(self, key, value): ...
+    def on_set(self, key, value):
+        """Hook method called before setting an item in the list.
+
+        Subclasses can override this method to implement custom behavior.
+        """
+        pass
-    def on_append(self, object: 'T'): ...
+    def on_append(self, object: 'T'):
+        """Hook method called before appending an item to the list.
+
+        Subclasses can override this method to implement custom behavior.
+        """
+        pass
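To make the hook idea concrete, here is a minimal sketch of a list whose mutators call overridable hooks. `HookedList` and `LoggingList` are hypothetical classes written for this example; how BetterMD's own `List` wires its hooks into `append` is not shown in this review.

```python
class HookedList(list):
    def on_append(self, item):
        """Hook called before an item is appended; override in subclasses."""

    def on_set(self, key, value):
        """Hook called before an item is assigned; override in subclasses."""

    def append(self, item):
        self.on_append(item)
        super().append(item)

    def __setitem__(self, key, value):
        self.on_set(key, value)
        super().__setitem__(key, value)


class LoggingList(HookedList):
    """Example subclass that records every hook invocation."""

    def __init__(self, *args):
        super().__init__(*args)
        self.log = []

    def on_append(self, item):
        self.log.append(("append", item))

    def on_set(self, key, value):
        self.log.append(("set", key, value))
```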
35-42: Consider adding docstrings to conversion methods.
The `to_html`, `to_md`, and `to_rst` methods lack documentation. Adding docstrings would improve code maintainability and help users understand how these methods are intended to be used.
 def to_html(self):
+    """Convert all elements in the list to HTML and return the results as a list."""
     return [elm.to_html() for elm in self]

 def to_md(self):
+    """Convert all elements in the list to Markdown and return the results as a list."""
     return [elm.to_md() for elm in self]

 def to_rst(self):
+    """Convert all elements in the list to reStructuredText and return the results as a list."""
     return [elm.to_rst() for elm in self]
44-48: Add docstrings to recursion limit functions.
The recursion limit functions would benefit from docstrings explaining their purpose and when they should be used.
 def set_recursion_limit(limit):
+    """Set the maximum recursion depth for the Python interpreter.
+
+    This is useful when processing deeply nested structures.
+
+    Args:
+        limit: The maximum recursion depth.
+    """
     sys.setrecursionlimit(limit)

 def get_recursion_limit():
+    """Get the current maximum recursion depth for the Python interpreter.
+
+    Returns:
+        The current maximum recursion depth.
+    """
     return sys.getrecursionlimit()
.github/workflows/publish.yml (1)
11-11: Remove trailing whitespace.
There is a trailing space at the end of this line.
- id-token: write
+ id-token: write
🧰 Tools
🪛 YAMLlint (1.35.1)
[error] 11-11: trailing spaces (trailing-spaces)
BetterMD/elements/table.py (3)
158-158: Use "from" with exception re-raising.
When re-raising exceptions in an except block, use "from" to maintain the exception chain for better debugging.
- raise ImportError("`tables` extra is required to use `to_pandas`")
+ raise ImportError("`tables` extra is required to use `to_pandas`") from e
🧰 Tools
🪛 Ruff (0.8.2)
158-158: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
186-186: Use "from" with exception re-raising here too.
This is another instance where the exception chain should be preserved.
- raise ImportError("`tables` extra is required to use `from_pandas`")
+ raise ImportError("`tables` extra is required to use `from_pandas`") from e
🧰 Tools
🪛 Ruff (0.8.2)
186-186: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
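Note that `from e` only works if the `except` clause binds the exception to a name. A minimal sketch of the pattern, using a hypothetical `load_optional` helper rather than the actual `to_pandas` code:

```python
import importlib


def load_optional(name):
    """Import an optional dependency, re-raising with a clearer message."""
    try:
        return importlib.import_module(name)
    except ImportError as e:
        # "from e" records the original error as __cause__ on the new exception.
        raise ImportError(f"`{name}` is required to use this feature") from e
```

With the chain preserved, the traceback shows both the friendly message and the underlying import failure.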
406-407: Fix code style issues.
This line contains multiple statements on a single line and uses star-argument unpacking after a keyword argument, which is discouraged.
- if head: self.head = head
- return super().prepare(parent, table=table, row=self, *args, **kwargs)
+ if head:
+     self.head = head
+ kwargs['row'] = self
+ return super().prepare(parent, table=table, **kwargs)
🧰 Tools
🪛 Ruff (0.8.2)
406-406: Multiple statements on one line (colon) (E701)
407-407: Star-arg unpacking after a keyword argument is strongly discouraged (B026)
BetterMD/elements/symbol.py (7)
6-8: Consider importing specific classes from typing module instead of the entire module.
At line 1, you're importing the entire typing module as `t`, but then at line 8 importing `ATTR_TYPES` directly. Consider being consistent with your import strategy by either importing specific types or using the module alias throughout.
-import typing as t
+from typing import Union, List as TypeList, Optional
 from ..markdown import CustomMarkdown
 from ..html import CustomHTML
 from ..rst import CustomRst
 from ..parse import HTMLParser, MDParser, ELEMENT, TEXT, Collection
 from ..utils import List, set_recursion_limit
-from ..typing import ATTR_TYPES
+from ..typing import ATTR_TYPES
21-23: Class attributes should use type annotations.
Add type annotations to class attributes for better code clarity and to aid static type checkers.
- collection = Collection()
- html_parser = HTMLParser()
- md_parser = MDParser()
+ collection: Collection = Collection()
+ html_parser: HTMLParser = HTMLParser()
+ md_parser: MDParser = MDParser()
25-27: Add docstring to explain the purpose of `__init_subclass__`.
The `__init_subclass__` method automatically registers subclasses with the collection, which is a key architectural feature. This should be documented for clarity.
 def __init_subclass__(cls, **kwargs) -> None:
+    """
+    Automatically register all Symbol subclasses with the collection.
+    This allows the parser to find and instantiate the appropriate Symbol subclass
+    when parsing HTML or Markdown.
+    """
     cls.collection.add_symbols(cls)
     super().__init_subclass__(**kwargs)
112-120: Consider adding parameter to control indentation of nested HTML elements.
The `to_html` method accepts an `indent` parameter but doesn't use it to format the output with proper indentation. This would improve readability of the generated HTML.
 def to_html(self, indent=0) -> 'str':
     if isinstance(self.html, CustomHTML):
         return self.html.to_html(self.children, self, self.parent)
+    indent_str = " " * indent
+    next_indent = indent + 1
+    next_indent_str = " " * next_indent
+
     inner_HTML = "\n".join([
-        e.to_html(0) if not (len(self.children) == 1 and isinstance(e.html, str) and e.html == "text")
-        else e.to_html(0) for e in self.children
+        next_indent_str + e.to_html(next_indent) if not (len(self.children) == 1 and isinstance(e.html, str) and e.html == "text")
+        else e.to_html(next_indent) for e in self.children
     ])
     if inner_HTML or not self.self_closing:
-        return f"<{self.html}{self.handle_props(False)}>{inner_HTML}</{self.html}>"
+        return f"{indent_str}<{self.html}{self.handle_props(False)}>{inner_HTML}\n{indent_str}</{self.html}>"
     else:
-        return f"<{self.html}{self.handle_props(False)} />"
+        return f"{indent_str}<{self.html}{self.handle_props(False)} />"
181-189: Use CSS parser for more robust style handling.
For a more robust handling of CSS styles, consider using a specialized CSS parser library instead of a simple string split approach, which may not handle all valid CSS syntax correctly.
Run the following command to check available CSS parsing libraries:
#!/bin/bash
# Check available CSS parsing libraries
pip search tinycss2 | grep -i "css parser"
pip search cssutils | grep -i "css parser"
Then update your code to use the selected library for style parsing:
# Example using tinycss2 (you'll need to add it to your dependencies)
import tinycss2

# Inside the parse method:
if "style" in attributes:
    style_str = attributes["style"]
    if isinstance(style_str, str):
        # Parse inline CSS
        style_tokens = tinycss2.parse_declaration_list(style_str)
        for token in style_tokens:
            if token.type == 'declaration' and token.value is not None:
                # token.value is a list of component values; serialize it back to a string
                styles[token.name] = tinycss2.serialize(token.value).strip()
    elif isinstance(style_str, dict):
        styles = style_str
    del attributes["style"]
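For comparison, the "simple string split approach" mentioned above can be sketched with the standard library alone. `parse_inline_style` is a hypothetical helper, and it deliberately shares the limitation the comment warns about: a semicolon inside a quoted string or `url(...)` value would be split incorrectly.

```python
def parse_inline_style(style):
    """Parse an inline style string such as "color: red; margin: 0" into a dict."""
    styles = {}
    for declaration in style.split(";"):
        # Split on the first colon only, so values containing ":" survive.
        name, sep, value = declaration.partition(":")
        if sep and name.strip() and value.strip():
            styles[name.strip()] = value.strip()
    return styles
```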
210-213: Make `__str__` and `__repr__` more informative.
The current string representation only shows the HTML tag and properties but doesn't indicate the type of the Symbol or its children. This could make debugging more difficult.
 def __str__(self):
-    return f"<{self.html}{self.handle_props()} />"
+    children_count = len(self.children)
+    return f"<{self.__class__.__name__} html='{self.html}'{self.handle_props()} children={children_count} />"

- __repr__ = __str__
+ def __repr__(self):
+     return self.__str__()
158-170: Add documentation for the element-handling logic.
The `handle_element` nested function in the `parse` method contains critical logic for processing different element types, but lacks documentation explaining its purpose and approach.
 @classmethod
 def parse(cls, text:'ELEMENT|TEXT') -> 'Symbol':
+    """
+    Parse an ELEMENT or TEXT into a Symbol instance.
+
+    Args:
+        text: An ELEMENT or TEXT dictionary representing HTML or Markdown content.
+
+    Returns:
+        A Symbol instance representing the parsed content.
+    """
     def handle_element(element:'ELEMENT|TEXT'):
+        """
+        Process an element or text node and convert it to the appropriate Symbol instance.
+
+        Args:
+            element: An ELEMENT or TEXT dictionary to process.
+
+        Returns:
+            A Symbol instance representing the element.
+        """
         if element['type'] == 'text':
             text = cls.collection.find_symbol("text", raise_errors=True)
             assert text is not None, "`collection.find_symbol` is broken"
             return text(element['content'])
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
- .github/workflows/publish.yml (1 hunks)
- BetterMD/__init__.py (1 hunks)
- BetterMD/__main__.py (1 hunks)
- BetterMD/elements/symbol.py (3 hunks)
- BetterMD/elements/table.py (1 hunks)
- BetterMD/html/custom_html.py (1 hunks)
- BetterMD/markdown/custom_markdown.py (1 hunks)
- BetterMD/rst/custom_rst.py (1 hunks)
- BetterMD/typing.py (1 hunks)
- BetterMD/utils.py (1 hunks)
- README.md (1 hunks)
- TODO.md (1 hunks)
- setup.py (3 hunks)
- tests/test.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- BetterMD/__main__.py
🚧 Files skipped from review as they are similar to previous changes (4)
- README.md
- BetterMD/typing.py
- setup.py
- BetterMD/rst/custom_rst.py
🧰 Additional context used
🧬 Code Graph Analysis (4)
tests/test.py (4)
- BetterMD/elements/text_formatting.py (1): B (20-23)
- BetterMD/elements/table.py (11): Table (132-192), Tr (337-407), THead (194-241), TBody (243-296), prepare (191-192), prepare (237-241), prepare (292-296), prepare (331-335), prepare (403-407), prepare (429-437), prepare (475-483)
- BetterMD/elements/i.py (1): I (13-16)
- BetterMD/__init__.py (1): enable_debug_mode (42-46)
BetterMD/utils.py (7)
- BetterMD/elements/symbol.py (4): Symbol (12-213), to_html (108-120), to_md (122-136), to_rst (138-143)
- BetterMD/html/custom_html.py (1): to_html (11-11)
- BetterMD/elements/style.py (1): to_html (56-62)
- BetterMD/elements/text.py (3): to_html (17-18), to_md (20-21), to_rst (23-24)
- BetterMD/markdown/custom_markdown.py (1): to_md (14-14)
- BetterMD/elements/img.py (2): to_md (7-9), to_rst (12-13)
- BetterMD/rst/custom_rst.py (1): to_rst (14-14)
BetterMD/elements/table.py (3)
- BetterMD/utils.py (5): List (10-42), to_md (38-39), append (15-17), get (19-23), to_rst (41-42)
- BetterMD/elements/text.py (3): Text (8-29), to_md (20-21), to_rst (23-24)
- BetterMD/elements/symbol.py (4): to_md (122-136), to_rst (138-143), Symbol (12-213), prepare (80-86)
BetterMD/elements/symbol.py (4)
- BetterMD/parse/html.py (3): parse (54-103), HTMLParser (5-204), children (12-13)
- BetterMD/parse/typing.py (3): parse (18-18), ELEMENT (10-14), TEXT (5-8)
- BetterMD/parse/markdown/parser.py (1): MDParser (9-197)
- BetterMD/parse/collection.py (3): Collection (8-27), add_symbols (13-14), find_symbol (19-27)
🪛 Ruff (0.8.2)
tests/test.py
1-1: BetterMD.H2 imported but unused. Remove unused import (F401)
1-1: BetterMD.CustomRst imported but unused. Remove unused import (F401)
1-1: BetterMD.CustomHTML imported but unused. Remove unused import (F401)
1-1: BetterMD.CustomMarkdown imported but unused. Remove unused import (F401)
1-1: BetterMD.enable_debug_mode imported but unused. Remove unused import (F401)
BetterMD/__init__.py
2-2: from .elements import * used; unable to detect undefined names (F403)
11-11: Symbol may be undefined, or defined from star imports (F405)
15-15: Symbol may be undefined, or defined from star imports (F405)
25-25: Symbol may be undefined, or defined from star imports (F405)
30-30: Symbol may be undefined, or defined from star imports (F405)
34-34: Symbol may be undefined, or defined from star imports (F405)
40-40: Symbol may be undefined, or defined from star imports (F405)
48-48: Symbol may be undefined, or defined from star imports (F405)
BetterMD/elements/table.py
123-123: Redefinition of unused ThRST from line 115 (F811)
127-127: Redefinition of unused TBodyRST from line 99 (F811)
158-158: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
167-167: pandas imported but unused; consider using importlib.util.find_spec to test for availability (F401)
186-186: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
192-192: Star-arg unpacking after a keyword argument is strongly discouraged (B026)
241-241: Star-arg unpacking after a keyword argument is strongly discouraged (B026)
274-274: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
296-296: Star-arg unpacking after a keyword argument is strongly discouraged (B026)
329-329: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
335-335: Star-arg unpacking after a keyword argument is strongly discouraged (B026)
363-363: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
385-385: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
406-406: Multiple statements on one line (colon) (E701)
407-407: Star-arg unpacking after a keyword argument is strongly discouraged (B026)
437-437: Star-arg unpacking after a keyword argument is strongly discouraged (B026)
455-455: Do not use mutable data structures for argument defaults. Replace with None; initialize within function (B006)
455-455: Do not use mutable data structures for argument defaults. Replace with None; initialize within function (B006)
455-455: Do not use mutable data structures for argument defaults. Replace with None; initialize within function (B006)
483-483: Star-arg unpacking after a keyword argument is strongly discouraged (B026)
🪛 YAMLlint (1.35.1)
.github/workflows/publish.yml
[error] 11-11: trailing spaces (trailing-spaces)
🪛 markdownlint-cli2 (0.17.2)
TODO.md
10-10: Unordered list indentation. Expected: 2; Actual: 4 (MD007, ul-indent)
11-11: Unordered list indentation. Expected: 2; Actual: 4 (MD007, ul-indent)
12-12: Unordered list indentation. Expected: 4; Actual: 8 (MD007, ul-indent)
13-13: Unordered list indentation. Expected: 4; Actual: 8 (MD007, ul-indent)
14-14: Unordered list indentation. Expected: 2; Actual: 4 (MD007, ul-indent)
🔇 Additional comments (13)
TODO.md (1)
6-7: Fixed typos and improved capitalization.
The corrections to "HTML elements" and "PYPI" improve the clarity and accuracy of the TODO list.
BetterMD/html/custom_html.py (3)
2-2: Good addition of ABC module for proper abstract class implementation.
Importing the Abstract Base Class and abstractmethod decorator from the abc module allows for proper object-oriented design with enforced method implementations in subclasses.
9-11: Enhanced class design with abstract base class pattern.
Making CustomHTML an abstract base class with an explicitly marked abstract method enforces that all subclasses must implement the to_html method. This improves API stability and prevents runtime errors.
13-13: Added flexible parameter handling to prepare method.
The addition of *args and **kwargs parameters to the prepare method increases flexibility and compatibility with subclasses that may need additional parameters. The explicit return type annotation improves type safety.
tests/test.py (4)
39-39: Updated text formatting using B component.
Changed from Strong to B component for bold text, which aligns with the updated API.
42-80: Enhanced table structure with proper THead and TBody components.
The table example now uses semantic table components (THead and TBody) with proper header cells (Th) and styling. This is a significant improvement in structure and HTML semantics.
Note that the complexity of this example makes it a good test case for the table handling capabilities, particularly with:
- Text alignment through styles
- Empty cells
- Varying content lengths
- Multiple rows in both header and body
82-86: Improved output formatting.
The addition of explicit newline and separator control in the print statement allows for better control of the output format.
94-96: Updated text formatting using B and I components.
Changed from Strong/Em to B/I components for formatting, which aligns with the updated API design.
BetterMD/markdown/custom_markdown.py (4)
2-2: Good addition of ABC module for proper abstract class implementation.
Importing the Abstract Base Class and abstractmethod decorator from the abc module enables proper object-oriented design with enforced method implementations in subclasses.
9-9: Enhanced class design with abstract base class pattern.
Making CustomMarkdown an abstract base class improves API stability by requiring subclasses to implement necessary methods.
13-14: Enforced implementation of to_md method in subclasses.
The addition of the @abstractmethod decorator ensures that all subclasses must implement the to_md method, preventing runtime errors and improving code quality.
16-18: Improved method signatures with flexible parameters and type annotations.
The addition of *args and **kwargs parameters to the prepare method increases flexibility, while the explicit return type annotations for both prepare and verify methods improve type safety.
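To make the abstract base class comments concrete, here is a minimal sketch of how abc enforces an abstract method. The names Renderer, Incomplete, and Bold are hypothetical stand-ins and are not taken from BetterMD; only ABC and abstractmethod come from the standard library.

```python
from abc import ABC, abstractmethod


class Renderer(ABC):
    """Hypothetical stand-in for a base class such as CustomMarkdown."""

    @abstractmethod
    def to_md(self, inner):
        ...


class Incomplete(Renderer):
    # Does not override to_md, so it stays abstract.
    pass


class Bold(Renderer):
    def to_md(self, inner):
        return f"**{inner}**"


# Instantiating a subclass that lacks to_md fails immediately with TypeError.
try:
    Incomplete()
    instantiated = True
except TypeError:
    instantiated = False

rendered = Bold().to_md("text")
```

Note that the check happens when the class is instantiated, not when it is defined, so the error still surfaces at runtime, only earlier and with a clearer message than a missing-method failure deep inside a conversion.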
BetterMD/__init__.py (1)
48-48: ❓ Verification inconclusive
Ensure all exported entities are properly defined.
The __all__ list includes several entities, but some might not be properly defined or imported in this file.
🏁 Script executed:
#!/bin/bash
# Check if all entities in __all__ are properly defined or imported
grep -n "^__all__" BetterMD/__init__.py
grep -n "class Symbol" BetterMD/elements/symbol.py
grep -n "class Collection\|class HTMLParser\|class MDParser" BetterMD/parse.py
Length of output: 387
Action Required: Verify Exported Entities in __init__.py
- The Symbol entity is confirmed as defined in BetterMD/elements/symbol.py.
- The file BetterMD/parse.py (which was expected to contain Collection, HTMLParser, and MDParser) does not exist. Please verify if these entities have been moved to a different file or if the export list in __init__.py needs updating.
- Ensure that all other exported entities in the __all__ list (e.g. HTML, MD, CustomHTML, CustomMarkdown, CustomRst, enable_debug_mode) are properly defined or imported.
🧰 Tools
🪛 Ruff (0.8.2)
48-48: Symbol may be undefined, or defined from star imports (F405)
- name: Publish to Test PyPI
  uses: pypa/[email protected]
  with:
    user: __token__
    password: ${{ secrets.PYPI_API_TOKEN }}
    repository_url: https://test.pypi.org/legacy/
🛠️ Refactor suggestion
Use separate tokens for PyPI and Test PyPI.
Using the same token for both PyPI and Test PyPI is a security risk. Create and use separate tokens for each repository.
- name: Publish to Test PyPI
uses: pypa/[email protected]
with:
user: __token__
- password: ${{ secrets.PYPI_API_TOKEN }}
+ password: ${{ secrets.TEST_PYPI_API_TOKEN }}
repository_url: https://test.pypi.org/legacy/
super().__init__(styles, classes, dom, inner, **props)
🛠️ Refactor suggestion
Avoid mutable default arguments.
Using mutable objects (like dictionaries and lists) as default arguments is a Python anti-pattern because they are created only once and shared between function calls.
- def __init__(self, styles: dict[str, str] = {}, classes: list[str] = [], dom: bool = True, inner: list[Symbol] = [], **props):
- super().__init__(styles, classes, dom, inner, **props)
+ def __init__(self, styles: dict[str, str] = None, classes: list[str] = None, dom: bool = True, inner: list[Symbol] = None, **props):
+ styles = styles or {}
+ classes = classes or []
+ inner = inner or []
+ super().__init__(styles, classes, dom, inner, **props)
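The pitfall behind this suggestion can be reproduced in isolation. The sketch below uses two hypothetical helper functions (append_shared and append_fresh are illustrative names, not BetterMD code) to show the difference between a mutable default and a None sentinel.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and reused by every call that omits the argument.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A None sentinel gives each call its own list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first_shared = append_shared("a")
second_shared = append_shared("b")  # same list object as first_shared

first_fresh = append_fresh("a")
second_fresh = append_fresh("b")    # independent list
```

One design note: `styles = styles or {}` also replaces an empty dict that the caller passed in deliberately, whereas an explicit `is None` check keeps the caller's object. Which behaviour is wanted here is a judgment call for the maintainers.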
🧰 Tools
🪛 Ruff (0.8.2)
455-455: Do not use mutable data structures for argument defaults. Replace with None; initialize within function (B006)
455-455: Do not use mutable data structures for argument defaults. Replace with None; initialize within function (B006)
455-455: Do not use mutable data structures for argument defaults. Replace with None; initialize within function (B006)
class TBodyRST(CustomRst):
    def to_rst(self, inner, symbol, parent):
        # This is now handled by TableRST
        return ""
Remove duplicate TBodyRST class definition.
There's a duplicate definition of the TBodyRST class. The second definition will override the first one, which is likely not intended.
-class TBodyRST(CustomRst):
- def to_rst(self, inner, symbol, parent):
- # This is now handled by TableRST
- return ""
🧰 Tools
🪛 Ruff (0.8.2)
127-127: Redefinition of unused TBodyRST from line 99 (F811)
class ThRST(CustomRst):
    def to_rst(self, inner, symbol, parent):
        return " ".join([e.to_rst() for e in inner])
Remove duplicate ThRST class definition.
There's a duplicate definition of the ThRST class. The second definition will override the first one, which could lead to unexpected behavior.
-class ThRST(CustomRst):
- def to_rst(self, inner, symbol, parent):
- return " ".join([e.to_rst() for e in inner])
🧰 Tools
🪛 Ruff (0.8.2)
123-123: Redefinition of unused ThRST from line 115 (F811)
@classmethod
def from_pandas(cls, series:'pd.Series', head:'t.Literal[False]'=False): ...

@t.overload
@classmethod
def from_pandas(cls, series:'pd.Index', head:'t.Literal[True]'): ...

@classmethod
def from_pandas(cls, series:'pd.Series | pd.Index', head:'bool'=False):
    try:
        self = cls()

        if head:
            self.extend_children([Th(inner=[Text(d)]) for d in series])

        self.extend_children([Td(inner=[Text(d)]) for d in series])

        return self
    except ImportError:
        raise ImportError("`tables` extra is required to use `from_pandas`")
Fix incorrect behavior in Tr.from_pandas implementation.
The implementation of Tr.from_pandas adds both Th and Td elements when head=True, which seems incorrect. It should add only one type of element, depending on the head parameter.
@classmethod
def from_pandas(cls, series:'pd.Series | pd.Index', head:'bool'=False):
try:
self = cls()
if head:
self.extend_children([Th(inner=[Text(d)]) for d in series])
-
- self.extend_children([Td(inner=[Text(d)]) for d in series])
+ else:
+ self.extend_children([Td(inner=[Text(d)]) for d in series])
return self
except ImportError:
raise ImportError("`tables` extra is required to use `from_pandas`")
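The effect of the missing else branch can be shown without pandas or BetterMD. In this sketch, build_row_buggy and build_row are hypothetical functions that return plain (tag, text) tuples in place of Th and Td symbols.

```python
def build_row_buggy(values, head=False):
    # Mirrors the reported behaviour: the td cells are always appended,
    # so a header row ends up with every value twice.
    cells = []
    if head:
        cells.extend(("th", str(v)) for v in values)
    cells.extend(("td", str(v)) for v in values)
    return cells


def build_row(values, head=False):
    # Mirrors the suggested fix: exactly one cell per value.
    tag = "th" if head else "td"
    return [(tag, str(v)) for v in values]


buggy_header = build_row_buggy(["name", "age"], head=True)
fixed_header = build_row(["name", "age"], head=True)
fixed_body = build_row([1, 2])
```

With two input values, the buggy variant produces four cells for a header row while the fixed variant produces two.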
🧰 Tools
🪛 Ruff (0.8.2)
385-385: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
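As a rough illustration of what B904 asks for, the sketch below chains the original exception with `raise ... from err`. The helper name load_optional and the error message wording are assumptions for the example, not BetterMD code; importlib.import_module is the standard library call.

```python
import importlib


def load_optional(module_name, extra):
    """Import an optional dependency, keeping the original error as the cause."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # "from err" records the original failure in __cause__, so the
        # traceback shows both the missing module and this friendlier hint.
        raise ImportError(
            f"`{extra}` extra is required to use `{module_name}`"
        ) from err


try:
    load_optional("module_that_does_not_exist_xyz", "tables")
    cause = None
except ImportError as exc:
    cause = exc.__cause__
```

Using `from None` instead would hide the original error, which is appropriate only when it adds nothing for the user.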
def from_list(cls, data:'list[list[str]]'):
    try:
        self = cls()

        for row in data:
            self.add_child(Tr.from_list(row))

    except Exception as e:
        logger.error(f"Exception occurred in `from_list`: {e}")

def to_list(self):
Fix missing return statement in from_list.
The from_list method in the TBody class doesn't return anything, which is likely an oversight.
@classmethod
def from_list(cls, data:'list[list[str]]'):
try:
self = cls()
for row in data:
self.add_child(Tr.from_list(row))
+
+ return self
except Exception as e:
logger.error(f"Exception occurred in `from_list`: {e}")
+ raise
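A stripped-down sketch of the same mistake follows, using hypothetical stand-in classes (Row, BodyWithoutReturn, BodyWithReturn) rather than the real TBody and Tr. It shows that a constructor-style classmethod without a return statement hands None back to the caller.

```python
class Row:
    def __init__(self, cells):
        self.cells = list(cells)


class BodyWithoutReturn:
    def __init__(self):
        self.rows = []

    @classmethod
    def from_list(cls, data):
        self = cls()
        for row in data:
            self.rows.append(Row(row))
        # No return statement: the caller receives None.


class BodyWithReturn(BodyWithoutReturn):
    @classmethod
    def from_list(cls, data):
        self = cls()
        for row in data:
            self.rows.append(Row(row))
        return self


missing = BodyWithoutReturn.from_list([["a", "b"]])
body = BodyWithReturn.from_list([["a", "b"], ["c", "d"]])
```

The added bare `raise` in the suggestion matters for the same reason: logging the exception and then falling through would also return None and hide the failure from the caller.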
import pandas as pd
if len(self.data) == 0:
    pass # Return undefined

elif len(self.data) == 1:
    return pd.Index([d.data for d in self.data])
Fix incomplete method implementation.
The to_pandas method in the THead class is incomplete: it returns a value only when len(self.data) == 1, and every other case falls through without an explicit return.
def to_pandas(self) -> 'pd.Index':
import pandas as pd
if len(self.data) == 0:
pass # Return undefined
elif len(self.data) == 1:
- return pd.Index([d.data for d in self.data])
+ return pd.Index([d.data for d in self.data[0].data])
+
+ # Add a default return or raise an appropriate exception
+ return pd.Index([])
from ..utils import List, set_recursion_limit
from ..typing import ATTR_TYPES

set_recursion_limit(10000)
💡 Verification agent
🧩 Analysis chain
Document the reason for setting a high recursion limit.
Setting a high recursion limit (10000) might be necessary for parsing deeply nested structures, but without documentation, it's unclear why this specific limit was chosen and what the implications are.
🏁 Script executed:
#!/bin/bash
# Find examples of deeply nested content in markdown or HTML files that would require high recursion limits
find . -type f -name "*.md" -o -name "*.html" | xargs wc -l | sort -nr | head -5
Length of output: 125
Document the rationale behind raising the recursion limit
The call to set_recursion_limit(10000) on line 10 in BetterMD/elements/symbol.py currently lacks an explanation. While a search for deeply nested markdown or HTML files did not reveal any obvious cases requiring such a high limit, please add documentation explaining:
- Why a recursion limit of 10000 is necessary.
- Under which conditions or parsing scenarios (e.g. potential deeply nested structures) this increased limit is expected to be beneficial.
- Any potential implications or risks associated with such a high limit.
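One way to document and contain such a limit is to raise it only around the recursive work. The sketch below is an assumption-laden illustration: set_recursion_limit from BetterMD/utils.py is not shown in this review, so the example uses the standard library's sys.setrecursionlimit instead, and recursion_limit, depth, and nested are hypothetical names.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the interpreter's recursion limit, then restore it."""
    previous = sys.getrecursionlimit()
    sys.setrecursionlimit(max(previous, limit))
    try:
        yield
    finally:
        sys.setrecursionlimit(previous)


def depth(node):
    # Plain recursive walk over a nested {"children": [...]} structure.
    deepest = 0
    for child in node["children"]:
        deepest = max(deepest, depth(child))
    return 1 + deepest


def nested(levels):
    # Build a chain of `levels` nodes iteratively, innermost first.
    node = {"children": []}
    for _ in range(levels - 1):
        node = {"children": [node]}
    return node


before = sys.getrecursionlimit()
with recursion_limit(10000):
    measured = depth(nested(2000))
after = sys.getrecursionlimit()
```

A module-level call changes the limit for the whole process as soon as the package is imported, which is the implication worth documenting. A very high limit can also let runaway recursion exhaust the C stack and crash the interpreter instead of raising RecursionError.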
@classmethod
def from_html(cls, text:'str') -> 'List[Symbol]':
    parsed = cls.html_parser.parse(text)
    return List([cls.collection.find_symbol(elm['name'], raise_errors=True).parse(elm) for elm in parsed])

@classmethod
def from_md(cls, text: str) -> 'List[Symbol]':
    parsed = cls.md_parser.parse(text)
    return List([cls.collection.find_symbol(elm['name'] , raise_errors=True).parse(elm) for elm in parsed])
🛠️ Refactor suggestion
Add error handling for parsing failures.
The from_html and from_md methods might raise exceptions if parsing fails or if a required symbol is not found. Consider adding proper error handling to make the code more robust.
@classmethod
def from_html(cls, text:'str') -> 'List[Symbol]':
- parsed = cls.html_parser.parse(text)
- return List([cls.collection.find_symbol(elm['name'], raise_errors=True).parse(elm) for elm in parsed])
+ try:
+ parsed = cls.html_parser.parse(text)
+ result = []
+ for elm in parsed:
+ try:
+ symbol_cls = cls.collection.find_symbol(elm['name'], raise_errors=True)
+ result.append(symbol_cls.parse(elm))
+ except ValueError as e:
+ import logging
+ logging.warning(f"Could not find symbol for {elm['name']}: {e}")
+ # Consider a fallback option here, e.g., using a generic Symbol
+ return List(result)
+ except Exception as e:
+ import logging
+ logging.error(f"Error parsing HTML: {e}")
+ return List([])
Similar changes should be applied to the from_md method for consistency.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@classmethod
def from_html(cls, text:'str') -> 'List[Symbol]':
    try:
        parsed = cls.html_parser.parse(text)
        result = []
        for elm in parsed:
            try:
                symbol_cls = cls.collection.find_symbol(elm['name'], raise_errors=True)
                result.append(symbol_cls.parse(elm))
            except ValueError as e:
                import logging
                logging.warning(f"Could not find symbol for {elm['name']}: {e}")
                # Consider a fallback option here, e.g., using a generic Symbol
        return List(result)
    except Exception as e:
        import logging
        logging.error(f"Error parsing HTML: {e}")
        return List([])

@classmethod
def from_md(cls, text: str) -> 'List[Symbol]':
    try:
        parsed = cls.md_parser.parse(text)
        result = []
        for elm in parsed:
            try:
                symbol_cls = cls.collection.find_symbol(elm['name'], raise_errors=True)
                result.append(symbol_cls.parse(elm))
            except ValueError as e:
                import logging
                logging.warning(f"Could not find symbol for {elm['name']}: {e}")
                # Consider a fallback option here, e.g., using a generic Symbol
        return List(result)
    except Exception as e:
        import logging
        logging.error(f"Error parsing MD: {e}")
        return List([])
def parse(cls, text:'ELEMENT|TEXT') -> 'Symbol':
    def handle_element(element:'ELEMENT|TEXT'):
        if element['type'] == 'text':
            text = cls.collection.find_symbol("text", raise_errors=True)
            assert text is not None, "`collection.find_symbol` is broken"
            return text(element['content'])

        symbol_cls = cls.collection.find_symbol(element['name'], raise_errors=True)
        assert symbol_cls is not None, "`collection.find_symbol` is broken"

        return symbol_cls.parse(element)

    if text["type"] == "text":
        return cls.collection.find_symbol("text", raise_errors=True)(text["content"])

    # Extract attributes directly from the attributes dictionary
    attributes = text["attributes"]

    # Handle class attribute separately if it exists
    classes = []
    if "class" in attributes:
        classes = attributes["class"].split() if isinstance(attributes["class"], str) else attributes["class"]
        del attributes["class"]

    # Handle style attribute separately if it exists
    styles = {}
    if "style" in attributes:
        style_str = attributes["style"]
        if isinstance(style_str, str):
            styles = dict(item.split(":") for item in style_str.split(";") if ":" in item)
        elif isinstance(style_str, dict):
            styles = style_str
        del attributes["style"]

    inner=[handle_element(elm) for elm in text["children"]]

    return cls(
        styles=styles,
        classes=classes,
        inner=inner,
        **attributes
    )
🛠️ Refactor suggestion
Add validation for malformed style attributes.
The style attribute parsing in the parse method assumes that each style declaration has a colon separator. If malformed style attributes are encountered (missing colons, empty parts, etc.), this could raise exceptions or produce unexpected results.
# Handle style attribute separately if it exists
styles = {}
if "style" in attributes:
style_str = attributes["style"]
if isinstance(style_str, str):
- styles = dict(item.split(":") for item in style_str.split(";") if ":" in item)
+ for item in style_str.split(";"):
+ item = item.strip()
+ if ":" in item:
+ key, value = item.split(":", 1)
+ key = key.strip()
+ value = value.strip()
+ if key and value: # Only add non-empty key-value pairs
+ styles[key] = value
elif isinstance(style_str, dict):
styles = style_str
del attributes["style"]
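The suggested loop can be tried on its own as a small function. In this sketch, parse_style and parse_style_naive are hypothetical names and the sample string is made up for illustration; the naive variant copies the one-line dict(...) expression from the original code.

```python
def parse_style_naive(style):
    # Same expression as the original one-liner in parse().
    return dict(item.split(":") for item in style.split(";") if ":" in item)


def parse_style(style):
    """Parse an inline style string, skipping malformed declarations."""
    styles = {}
    for item in style.split(";"):
        if ":" not in item:
            continue
        # Split on the first colon only, so values such as URLs survive.
        key, value = item.split(":", 1)
        key, value = key.strip(), value.strip()
        if key and value:  # only keep non-empty key-value pairs
            styles[key] = value
    return styles


sample = "color: red; background: url(http://example.com/a.png); ; broken; :empty"

try:
    parse_style_naive(sample)
    naive_failed = False
except ValueError:
    # A value containing ":" splits into three parts, which dict() rejects.
    naive_failed = True

parsed = parse_style(sample)
```

Besides tolerating malformed input, the loop strips whitespace, so keys come out as "color" rather than " color" for every declaration after the first.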
Replaced by #11
Summary by CodeRabbit
New Features
- Body, Header, Footer, and many others to support diverse content structures.
- MDParser class for improved Markdown parsing capabilities.
- Search, Dialog, Summary, and others.
- BetterMD framework, enhancing overall functionality.
Chores