[singlehtml] add docname to section anchor to make them unique #13739
base: master
e6b65fb
to
5117057
Compare
Hi @gastmaier - I'm a former semi-regular volunteer contributor here, although I have been less active recently. Thanks for the pull request; and sorry that I did not notice the toctree constructor problem, as you mention in #13717. I am reading both #13717 and this PR #13739 to try to understand the different approaches and reasons for them. Also: do you have a test case that we could add under |
Hi @jayaddison, maybe by extending tests/test_builders/test_build_html_tocdepth.py
@gastmaier that sounds perfect, yep! (I'd forgotten about those tests)
Align with purpose
c50ba56
to
910de47
Compare
Hi, @jayaddison and @akhilsmokie7-cloud I rebased and added the test to check for duplicated ids. I added the test to the bottom, checking out fe728f4 will fail at
as expected, since at f5457f1 On, "the html build also changes the images src path during the write step", this is what I am talking about CI note:
is due to 2e51b787680cefdfe56b3438d809e6476600a47e Thanks, |
To assert unique ids in singlehtml builder. Signed-off-by: Jorge Marques <[email protected]>
Since the singlehtml builder aggregates all doc files into a single HTML page during the write step, and ids must be unique for proper link anchoring, add a test that collects all ids in the page and checks that they are unique, by comparing the length of the list against its length as a set.
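The uniqueness check described in this commit message can be sketched with the standard library alone. This is not the test added in the PR; `IdCollector` and `duplicated_ids` are hypothetical names, and the sketch reports the duplicated ids rather than only comparing a list length against a set length.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect the value of every id attribute found in an HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.ids: list[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name == 'id' and value is not None:
                self.ids.append(value)


def duplicated_ids(html: str) -> list[str]:
    """Return every id that occurs more than once, sorted."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(key for key, count in counts.items() if count > 1)


# Two sections sharing the same id, as happens when per-document ids
# are merged into one page without a document prefix.
page = '<section id="id1"></section><section id="id1"></section><p id="x"></p>'
all_ids_unique = not duplicated_ids(page)
```

Returning the duplicates instead of a boolean makes a failing assertion easier to read, since the offending ids show up in the message.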
And use modified id directly, modified in the previous commit to ensure unique id in singlehtml.
910de47
to
3a92a34
Compare
@@ -110,7 +110,7 @@ def assemble_toc_secnumbers(self) -> dict[str, dict[str, tuple[int, ...]]]:
         new_secnumbers: dict[str, tuple[int, ...]] = {}
         for docname, secnums in self.env.toc_secnumbers.items():
             for id, secnum in secnums.items():
-                alias = f'{docname}/{id}'
+                alias = f'{docname}{id}'
What kind of values are possible for the docname and id?
(also: I guess people shouldn't have written hyperlinks or saved bookmarks with the assumption that these aliases are stable? but, even so - if we change the format, I guess we would break those?)
@gastmaier in fact: I'm not sure where these / separator characters appear. What does this code relate to?
For singlehtml, at the assemble toctree step, the href is built from a tuple of docname and refid:
#document-path/to/#id1
This was done to try to avoid the refid confliction in singlehtml mode problem, which didn't work because it patched the toctree while the content body still had the non-unique ids.
My PR changes the toctree href format from
#document-path/to/#id1
to
#document-path/to#id1
(removes the trailing slash), and the content ids from
#id1
to
#document-path/to#id1
(adds the doc prefix to make them unique).
The new template is therefore:
#document-{doc}#{id}
a direct tuple of docname and refid, without the slash.
These are valid HTML anchors, but they do require escaping when manipulated with CSS:
#document-test\/extra\#test {color: #f00;}
and JavaScript:
document.querySelector('#document-test\\/extra\\#test')
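As a rough illustration of the template and the escaping above, here is a minimal Python sketch. Both helpers are hypothetical and not part of Sphinx; `escape_for_selector` is a simplified stand-in for the browser's `CSS.escape` and assumes the identifier contains no whitespace and does not start with a digit.

```python
import re


def singlehtml_anchor(docname: str, refid: str) -> str:
    """Build an id following the '#document-{doc}#{id}' template.

    The leading '#' of the URL fragment is not part of the id itself.
    """
    return f'document-{docname}#{refid}'


def escape_for_selector(ident: str) -> str:
    """Backslash-escape every character outside [A-Za-z0-9_-].

    Simplified assumption: this covers '/' and '#', but it is not a
    complete implementation of CSS identifier escaping.
    """
    return re.sub(r'([^A-Za-z0-9_-])', r'\\\1', ident)


anchor = singlehtml_anchor('test/extra', 'test')
selector = '#' + escape_for_selector(anchor)
```

Here `anchor` is the raw id as it would appear in the HTML attribute, and `selector` is the form needed in a stylesheet. Inside a JavaScript string literal each backslash has to be doubled again.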
singlehtml.zip
here is a singlehtml build with the patch
@@ -497,6 +498,15 @@ def depart_term(self, node: Element) -> None:

         self.body.append('</dt>')

     def visit_section(self, node: section) -> None:
         if self.builder.name == 'singlehtml' and node['ids']:
We don't seem to use many @property methods in the Sphinx writers, but maybe this singlehtml condition is getting to the point where it makes sense (this is the third potential callsite, I think?).
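To make the suggestion concrete, a minimal sketch of what such a property could look like follows. `FakeBuilder`, `TranslatorSketch`, `is_singlehtml` and `section_ids` are all hypothetical names used for illustration; this is not the Sphinx translator API, and the id format simply follows the template discussed in this thread.

```python
from dataclasses import dataclass


@dataclass
class FakeBuilder:
    """Stand-in for a builder object; only the name matters here."""
    name: str


class TranslatorSketch:
    """Toy translator showing one shared property instead of repeated checks."""

    def __init__(self, builder: FakeBuilder, docname: str) -> None:
        self.builder = builder
        self.docname = docname

    @property
    def is_singlehtml(self) -> bool:
        # Single place for the condition that several call sites would share.
        return self.builder.name == 'singlehtml'

    def section_ids(self, ids: list[str]) -> list[str]:
        """Return ids with the first one doc-prefixed when building singlehtml."""
        if self.is_singlehtml and ids:
            return [f'document-{self.docname}#{ids[0]}', *ids[1:]]
        return list(ids)


single = TranslatorSketch(FakeBuilder('singlehtml'), 'path/to')
plain = TranslatorSketch(FakeBuilder('html'), 'path/to')
```

The sketch returns a new list rather than editing the node's ids in place, which is one way to sidestep the question of modifying the tree.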
Maybe pedantic of me to mention, but: running the test code without the fix in place does confirm that the test case fails (duplication of |
(I attempted that to reassure myself and to learn slightly more about how the fix works) |
Drafting again, I spotted more links using the non-doc-prefixed anchor in the body.
are not being prefixed, but their links to it are correct ( I will give it yet another try, but this time traversing the pickled to patch all ids early on, instead of patching at the node visit. |
Purpose
Follow-up to #13717, inverting the logic: instead of patching the toctree to yield "#id1" instead of "#document-path/to#id1", have the section id be docname-prefixed, which solves the non-unique ids in singlehtml.
This allows removing post Sphinx transforms like in here
Top level overview of current behavior
Approach taken
Based on the LaTeX builder solution.
The sphinx/writers/latex.py#hypertarget[withdoc=True] method prefixes the docutils id with the docname.
In my implementation I edit ids[0] directly so as not to have to override the whole visit_section method, but I understand if I am asked not to modify the tree and to override the method instead.
On the format #document-test/extra#id1
It is compatible with HTML anchoring and with CSS and JavaScript selectors, but requires escaping:
Tests
The following tests are relevant:
References