Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor the the ParsedTypeDocstring #874

Draft
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

tristanlatr
Copy link
Contributor

@tristanlatr tristanlatr commented Feb 2, 2025

Refactor the the ParsedTypeDocstring - used in numpy and google docformat as well as when --process-types option is passed.

This gets rid of some unfortunate code that was not doing it's thing right. We replace that dirty processing by legitimately creating a docutils tree for these types are rendering correctly. In order to do that, the linker had to be adjusted not to wrap links inside <code> tags anymore (see the preliminary refactor commit ec6c907).

Make possible for PR #723 to be complete
Fixes #581
Fixes #873

Before:
Capture d’écran, le 2025-02-02 à 15 31 32

After:
Capture d’écran, le 2025-02-02 à 15 31 57

The warning diff was unexpected... it appears that pydoctor was reporting link not found at several unrelated places, increasing the number of warnings uselessly.

…de> tags are now added by the html translator when the document is a docstring. Otherwise it does not add the enclosing <code> tags because we're already in the middle of a code tag or similar <span class="rst-literal">.

Adjust the themes so the <code> tags and <span class="rst-literal"> are really equivalent.
Comment on lines -100 to -143

combined_tokens: list[tuple[Any, TokenType]] = []

open_parenthesis = 0
open_square_braces = 0

for _token, _type in tokens:
# The actual type of_token is str | Tag | Node.

if (_type is TokenType.DELIMITER and _token in ('[', '(', ')', ']')) \
or _type is TokenType.OBJ:
if _token == "[": open_square_braces += 1
elif _token == "(": open_parenthesis += 1

if _type is TokenType.OBJ:
_token = docstring_linker.link_xref(
_token, _token, self._lineno)

if open_square_braces + open_parenthesis > 0:
try: last_processed_token = combined_tokens[-1]
except IndexError:
combined_tokens.append((_token, _type))
else:
if last_processed_token[1] is TokenType.OBJ \
and isinstance(last_processed_token[0], Tag):
# Merge with last Tag
if _type is TokenType.OBJ:
assert isinstance(_token, Tag)
last_processed_token[0](*_token.children)
else:
last_processed_token[0](_token)
else:
combined_tokens.append((_token, _type))
else:
combined_tokens.append((_token, _type))

if _token == "]": open_square_braces -= 1
elif _token == ")": open_parenthesis -= 1

else:
# the token will be processed in _convert_type_spec_to_stan() method.
combined_tokens.append((_token, _type))

return combined_tokens
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the unfortunate code

Copy link

codecov bot commented Feb 2, 2025

Codecov Report

Attention: Patch coverage is 96.22642% with 2 lines in your changes missing coverage. Please review.

Project coverage is 92.77%. Comparing base (3357f21) to head (7eb22b7).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
pydoctor/node2stan.py 71.42% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #874      +/-   ##
==========================================
- Coverage   92.79%   92.77%   -0.02%     
==========================================
  Files          47       47              
  Lines        8468     8448      -20     
  Branches     1550     1542       -8     
==========================================
- Hits         7858     7838      -20     
  Misses        350      350              
  Partials      260      260              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

This comment has been minimized.

This comment has been minimized.

1 similar comment

This comment has been minimized.

This comment has been minimized.

This reverts commit c6ffcef.

This comment has been minimized.

@tristanlatr
Copy link
Contributor Author

The warning diff is OK, I've double checked it and it appears that pydoctor was reporting link not found at several unrelated places, increasing the number of warnings uselessly.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

docs/source/conf.py Outdated Show resolved Hide resolved

This comment has been minimized.

1 similar comment

This comment has been minimized.

Comment on lines +188 to +193
['defaultdict', ', and '],
['defaultdict', ', or '],
['defaultdict', ' of '],
['defaultdict', ' of ', 'x', ' to '],
['defaultdict', ', ', 'and'],
['defaultdict', ', ', 'or'],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are edge cases where a natural language delimiter is trailing in a type expression from docstrings.

It is not valid english so the token should not be interpreted as a delimiter when it expect a non-existent following-up token. It should end up in the same token just like 'defaultdict of' below. Though, emitting more unknown tokens kinds should be done with care since it will alter the behaviour of numpy and google style docstring's Return section and friends. But these input should really be treated as a single unknown kind of token instead of being tokenized wrongly. The only exception is the comma delimiter that is special because the adding a trailing coma should not make the preceding token to be merged with the comma.

This can be addressed in a following-up PR.

test:36: bad rendering of class signature: SAXParseException: <unknown>.+ undefined entity
'''.splitlines()

# Some how the type processing get rid of the non breaking spaces, but it's more an implementation
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this implementation detail is now gone.

else:
if isinstance(resolved, str):
xref = intersphinx_link(label, url=resolved)
else:
xref = taglink(resolved, self.page_url, label)

return tags.code(xref)
return xref
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is rather important: it makes the linker always produce a <a> tag only and never wrap it in a <code> tag. This is good because having the code tag is rather a presentation matter and the linker should,.. well link. This transfers the presentation responsibility to the node2stan.py module that will include <code> or not based on the source of the document.

Comment on lines -292 to +299
return tags.transparent(label)
return tags.a(label)

def link_xref(self, target: str, label: "Flattenable", lineno: int) -> Tag:
return tags.code(label)
return tags.a(label)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This explains the major HTML diffs in the tests. It's Important that the NotFoundLinker emits a <a> tag (in this case with no href attribute) rather than a transparent or code tag because it's closer to what the actual linker does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna dig for a better reproducer of the issue and remove this huge file from the tests.

- fix the linenumber of reported type docstring issues by 1.
- the nested warnings for unknown token are now properly propagated and reported.
…m:twisted/pydoctor into 873-implement-parsedtypedocstring.to_node

This comment has been minimized.

…inenumber correct for epytext and restructuredtext.

This comment has been minimized.

This comment has been minimized.

1 similar comment

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor Author

@tristanlatr tristanlatr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a hell of a refactor... But it's in the right direction.


TokenType.CONTROL: lambda _token, _, __: \
nodes.emphasis(_token, _token),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

This comment has been minimized.

This comment has been minimized.

Copy link

Diff from pydoctor_primer, showing the effect of this PR on open source code:

numpy (https://github.com/numpy/numpy)
- /projects/numpy/numpy/_core/_ufunc_config.py:389: bad docstring: invalid value set (missing closing brace): {divide
+ /projects/numpy/numpy/_core/_ufunc_config.py:388: bad docstring: invalid value set (missing closing brace): {divide
- /projects/numpy/numpy/_core/_ufunc_config.py:389: bad docstring: invalid value set (missing opening brace): invalid}
+ /projects/numpy/numpy/_core/_ufunc_config.py:388: bad docstring: invalid value set (missing opening brace): invalid}
- /projects/numpy/numpy/_core/fromnumeric.py:387: bad docstring: invalid value set (missing closing brace): {'raise'(
+ /projects/numpy/numpy/_core/fromnumeric.py:386: bad docstring: invalid value set (missing closing brace): {'raise'(
- /projects/numpy/numpy/_core/fromnumeric.py:387: bad docstring: invalid value set (missing opening brace): }
+ /projects/numpy/numpy/_core/fromnumeric.py:386: bad docstring: invalid value set (missing opening brace): }
- /projects/numpy/numpy/_core/fromnumeric.py:387: bad docstring: unbalanced parenthesis in type expression
+ /projects/numpy/numpy/_core/fromnumeric.py:386: bad docstring: unbalanced parenthesis in type expression
- /projects/numpy/numpy/_core/fromnumeric.py:3899: bad docstring: invalid value set (missing closing brace): {int
+ /projects/numpy/numpy/_core/fromnumeric.py:3898: bad docstring: invalid value set (missing closing brace): {int
- /projects/numpy/numpy/_core/fromnumeric.py:3899: bad docstring: invalid value set (missing opening brace): float}
+ /projects/numpy/numpy/_core/fromnumeric.py:3898: bad docstring: invalid value set (missing opening brace): float}
- /projects/numpy/numpy/_core/fromnumeric.py:3926: bad docstring: invalid value set (missing closing brace): {int
+ /projects/numpy/numpy/_core/fromnumeric.py:3925: bad docstring: invalid value set (missing closing brace): {int
- /projects/numpy/numpy/_core/fromnumeric.py:3926: bad docstring: invalid value set (missing opening brace): float}
+ /projects/numpy/numpy/_core/fromnumeric.py:3925: bad docstring: invalid value set (missing opening brace): float}
- /projects/numpy/numpy/_core/fromnumeric.py:4103: bad docstring: invalid value set (missing closing brace): {int
+ /projects/numpy/numpy/_core/fromnumeric.py:4102: bad docstring: invalid value set (missing closing brace): {int
- /projects/numpy/numpy/_core/fromnumeric.py:4103: bad docstring: invalid value set (missing opening brace): float}
+ /projects/numpy/numpy/_core/fromnumeric.py:4102: bad docstring: invalid value set (missing opening brace): float}
- /projects/numpy/numpy/_core/fromnumeric.py:4130: bad docstring: invalid value set (missing closing brace): {int
+ /projects/numpy/numpy/_core/fromnumeric.py:4129: bad docstring: invalid value set (missing closing brace): {int
- /projects/numpy/numpy/_core/fromnumeric.py:4130: bad docstring: invalid value set (missing opening brace): float}
+ /projects/numpy/numpy/_core/fromnumeric.py:4129: bad docstring: invalid value set (missing opening brace): float}
- /projects/numpy/numpy/_core/einsumfunc.py:776: bad docstring: invalid value set (missing closing brace): {bool
+ /projects/numpy/numpy/_core/einsumfunc.py:775: bad docstring: invalid value set (missing closing brace): {bool
- /projects/numpy/numpy/_core/einsumfunc.py:776: bad docstring: invalid value set (missing opening brace): }
+ /projects/numpy/numpy/_core/einsumfunc.py:775: bad docstring: invalid value set (missing opening brace): }
- /projects/numpy/numpy/_core/einsumfunc.py:1090: bad docstring: invalid value set (missing closing brace): {data-type
+ /projects/numpy/numpy/_core/einsumfunc.py:1089: bad docstring: invalid value set (missing closing brace): {data-type
- /projects/numpy/numpy/_core/einsumfunc.py:1090: bad docstring: invalid value set (missing opening brace): None}
+ /projects/numpy/numpy/_core/einsumfunc.py:1089: bad docstring: invalid value set (missing opening brace): None}
- /projects/numpy/numpy/_core/einsumfunc.py:1114: bad docstring: invalid value set (missing closing brace): {False
+ /projects/numpy/numpy/_core/einsumfunc.py:1113: bad docstring: invalid value set (missing closing brace): {False
- /projects/numpy/numpy/_core/einsumfunc.py:1114: bad docstring: invalid value set (missing opening brace): }
+ /projects/numpy/numpy/_core/einsumfunc.py:1113: bad docstring: invalid value set (missing opening brace): }
- /projects/numpy/numpy/lib/_datasource.py:175: bad docstring: invalid value set (missing closing brace): {None
+ /projects/numpy/numpy/lib/_datasource.py:174: bad docstring: invalid value set (missing closing brace): {None
- /projects/numpy/numpy/lib/_datasource.py:175: bad docstring: invalid value set (missing opening brace): str}
+ /projects/numpy/numpy/lib/_datasource.py:174: bad docstring: invalid value set (missing opening brace): str}
- /projects/numpy/numpy/lib/_datasource.py:177: bad docstring: invalid value set (missing closing brace): {None
+ /projects/numpy/numpy/lib/_datasource.py:176: bad docstring: invalid value set (missing closing brace): {None
- /projects/numpy/numpy/lib/_datasource.py:177: bad docstring: invalid value set (missing opening brace): str}
+ /projects/numpy/numpy/lib/_datasource.py:176: bad docstring: invalid value set (missing opening brace): str}
+ /projects/numpy/numpy/lib/_datasource.py:499: bad docstring: invalid value set (missing closing brace): {None
+ /projects/numpy/numpy/lib/_datasource.py:499: bad docstring: invalid value set (missing opening brace): str}
- /projects/numpy/numpy/lib/_datasource.py:500: bad docstring: invalid value set (missing closing brace): {None
+ /projects/numpy/numpy/lib/_datasource.py:501: bad docstring: invalid value set (missing closing brace): {None
- /projects/numpy/numpy/lib/_datasource.py:500: bad docstring: invalid value set (missing opening brace): str}
+ /projects/numpy/numpy/lib/_datasource.py:501: bad docstring: invalid value set (missing opening brace): str}
- /projects/numpy/numpy/lib/_datasource.py:502: bad docstring: invalid value set (missing closing brace): {None
- /projects/numpy/numpy/lib/_datasource.py:502: bad docstring: invalid value set (missing opening brace): str}
- /projects/numpy/numpy/lib/_datasource.py:669: bad docstring: invalid value set (missing closing brace): {None
+ /projects/numpy/numpy/lib/_datasource.py:668: bad docstring: invalid value set (missing closing brace): {None
- /projects/numpy/numpy/lib/_datasource.py:669: bad docstring: invalid value set (missing opening brace): str}
+ /projects/numpy/numpy/lib/_datasource.py:668: bad docstring: invalid value set (missing opening brace): str}
- /projects/numpy/numpy/lib/_datasource.py:671: bad docstring: invalid value set (missing closing brace): {None
+ /projects/numpy/numpy/lib/_datasource.py:670: bad docstring: invalid value set (missing closing brace): {None
- /projects/numpy/numpy/lib/_datasource.py:671: bad docstring: invalid value set (missing opening brace): str}
+ /projects/numpy/numpy/lib/_datasource.py:670: bad docstring: invalid value set (missing opening brace): str}
- /projects/numpy/numpy/lib/_npyio_impl.py:333: bad docstring: invalid value set (missing closing brace): {None
+ /projects/numpy/numpy/lib/_npyio_impl.py:332: bad docstring: invalid value set (missing closing brace): {None
- /projects/numpy/numpy/lib/_npyio_impl.py:333: bad docstring: invalid value set (missing opening brace): }
+ /projects/numpy/numpy/lib/_npyio_impl.py:332: bad docstring: invalid value set (missing opening brace): }
- /projects/numpy/numpy/lib/_npyio_impl.py:1455: bad docstring: invalid value set (missing closing brace): {None
+ /projects/numpy/numpy/lib/_npyio_impl.py:1454: bad docstring: invalid value set (missing closing brace): {None
- /projects/numpy/numpy/lib/_npyio_impl.py:1455: bad docstring: invalid value set (missing opening brace): str}
+ /projects/numpy/numpy/lib/_npyio_impl.py:1454: bad docstring: invalid value set (missing opening brace): str}
- /projects/numpy/numpy/lib/_npyio_impl.py:1809: bad docstring: invalid value set (missing closing brace): {None
+ /projects/numpy/numpy/lib/_npyio_impl.py:1808: bad docstring: invalid value set (missing closing brace): {None
- /projects/numpy/numpy/lib/_npyio_impl.py:1809: bad docstring: invalid value set (missing opening brace): sequence}
+ /projects/numpy/numpy/lib/_npyio_impl.py:1808: bad docstring: invalid value set (missing opening brace): sequence}
- /projects/numpy/numpy/lib/_npyio_impl.py:1827: bad docstring: invalid value set (missing closing brace): {True
+ /projects/numpy/numpy/lib/_npyio_impl.py:1826: bad docstring: invalid value set (missing closing brace): {True
- /projects/numpy/numpy/lib/_npyio_impl.py:1827: bad docstring: invalid value set (missing opening brace): }
+ /projects/numpy/numpy/lib/_npyio_impl.py:1826: bad docstring: invalid value set (missing opening brace): }
- /projects/numpy/numpy/lib/_function_base_impl.py:1005: bad docstring: invalid value set (missing closing brace): {1
+ /projects/numpy/numpy/lib/_function_base_impl.py:1004: bad docstring: invalid value set (missing closing brace): {1
- /projects/numpy/numpy/lib/_function_base_impl.py:1005: bad docstring: invalid value set (missing opening brace): 2}
+ /projects/numpy/numpy/lib/_function_base_impl.py:1004: bad docstring: invalid value set (missing opening brace): 2}
- /projects/numpy/numpy/lib/_function_base_impl.py:3935: bad docstring: invalid value set (missing closing brace): {int
+ /projects/numpy/numpy/lib/_function_base_impl.py:3934: bad docstring: invalid value set (missing closing brace): {int
- /projects/numpy/numpy/lib/_function_base_impl.py:3935: bad docstring: invalid value set (missing opening brace): None}
+ /projects/numpy/numpy/lib/_function_base_impl.py:3934: bad docstring: invalid value set (missing opening brace): None}
- /projects/numpy/numpy/lib/_function_base_impl.py:4096: bad docstring: invalid value set (missing closing brace): {int

... (truncated 502 lines) ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant