attr_list: add option to preserve attributes without value as-is #1501
Comments
IIRC it is simply how etree works. I don't think Python Markdown especially cares about how it is output; they are just limited by the behavior of etree. Attributes are defined with a key and a value. You have to define something for the value:
>>> tag = etree.Element('div', {'download': ''})
>>> etree.tostring(tag)
b'<div download="" />'
>>> tag = etree.Element('div', {'download': 'download'})
>>> etree.tostring(tag)
b'<div download="download" />'
What's more correct? What would you suggest? The only thing I can think of is having some sort of postprocessor to strip out values if desired, but then you have to know how to identify whether the value is meant to be stripped or not. Anyway, I don't know that this is likely to change unless the document handler were changed to support what you want, and I doubt Python Markdown will be making such a massive change as replacing the HTML handler at this point. |
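A minimal sketch of the postprocessing idea mentioned above, not anything Python Markdown provides. The function name `strip_self_named_values` and the `names` parameter are hypothetical. The caller has to say explicitly which attributes may be collapsed, which is exactly the identification problem described in the comment. The regex is naive and would also rewrite matching text outside of tags, so treat it as an illustration only.

```python
import re


def strip_self_named_values(html, names):
    """Collapse name="name" to a bare name, but only for the listed names.

    `names` is supplied by the caller; nothing here can guess which
    attributes were meant to be valueless.
    """
    for name in names:
        # Match whitespace followed by name="name" (value equal to the key).
        pattern = re.compile(
            r'(\s)' + re.escape(name) + r'="' + re.escape(name) + r'"'
        )
        # Use a function as the replacement so `name` is inserted literally.
        html = pattern.sub(lambda m, n=name: m.group(1) + n, html)
    return html


rendered = '<p><a download="download" href="file.txt">download</a></p>'
print(strip_self_named_values(rendered, {"download"}))
```

An attribute that is not in `names` (for example `title="title"`) is left untouched, since there is no way to tell whether that value was written deliberately.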
This is dependent on the output format:
>>> markdown.markdown('[download](file.txt){download}', extensions=['attr_list'])
'<p><a download="download" href="file.txt">download</a></p>'
>>> markdown.markdown('[download](file.txt){download}', extensions=['attr_list'], output_format='html')
'<p><a download href="file.txt">download</a></p>'
There is an argument to be made that we could change the default here. However, we have avoided that because XHTML is still valid HTML5, but HTML5 is not always valid XHTML. If any users are still using this lib to generate strict XHTML pages, then a change to the default would break their pages. Leaving the default as-is breaks nothing, while allowing those users who care to change the output to match their desired format. |
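To illustrate why a valueless attribute would matter for strict XHTML users, here is a small check using only the standard library's XML parser. It feeds both outputs from the example above to `xml.etree.ElementTree`; the helper name `is_well_formed_xml` is made up for this sketch. It only tests XML well-formedness, which is a necessary condition for XHTML, not full XHTML validity.

```python
import xml.etree.ElementTree as etree


def is_well_formed_xml(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        etree.fromstring(fragment)
    except etree.ParseError:
        return False
    return True


# The two outputs shown in the comment above.
xhtml_out = '<p><a download="download" href="file.txt">download</a></p>'
html_out = '<p><a download href="file.txt">download</a></p>'

print(is_well_formed_xml(xhtml_out))  # attribute has a value
print(is_well_formed_xml(html_out))   # attribute has no value
```

XML requires every attribute to have a value, so the minimized form is rejected by an XML parser even though it is fine as HTML.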
@facelessuser as a reminder, we use our own custom serializer, not etree's built-in one. And that allows us to address things like this. markdown/markdown/serializers.py Lines 151 to 155 in 4260e7b
|
Oh, right, I forgot about that. I'd say that these days XHTML isn't used nearly as much anymore, but I get the desire not to change the status quo. |
This inconsistency seems quite problematic. Modern browsers treat I argue that for consistency the default value of an attribute without value in XHTML should be the blank string ( |
@FeldrinH can you point to where in the XHTML spec this behavior is specified? Or is this a situation where the browser acts correctly only if strict XHTML is used (proper headers and doc type forces browser to recognize a page as strict XHTML such that the page fails to render if the XHTML is not valid XML)? And is this the behavior for any key-only attribute or only the |
I specifically meant that modern browsers treat For XHTML attributes without values are not allowed, so my proposal is that |
I understood you the first time. However, you are pointing at the HTML spec. If we are going to change the output for XHTML (as you are suggesting), then I want to see where the XHTML spec supports the proposed change. However, if you want valid HTML, then you should be specifying |
I don't quite understand what you are asking for. I am certain even without reading the spec that XHTML supports |
So far you have not demonstrated that our output is invalid. If you can point to where the XHTML spec indicates that |
Both are valid. My problem is that currently switching from XHTML output to HTML output changes the meaning of the generated HTML/XHTML document. |
|
Browsers only recognize XHTML as XHTML if the proper doc type is defined. Have you tested this properly? I don't know because you keep making assertions with nothing to back them up. |
AFAIK in modern browsers XHTML and HTML just have syntax and validation differences. (I couldn't find an explicit source for this, but any sources I've found listing the differences between HTML and XHTML only mention extra syntax restrictions and validation in XHTML.) The meaning of an attribute value shouldn't change regardless of what the browser recognizes the document as. So
I did test that Here is my test setup: xhtmltest.zip. Serve it with any static file server and try for yourself. In my tests |
I actually had some time to look at this today. The issue is that I note that the XHTML spec lists all known boolean attributes ( Why not just change all booleans to There is one additional complication. When we are building the tree in our parser, we immediately assign In other words, this is much more complicated than it may seem at first. In the short term, I will say that using |
I think it's worth noting that this issue already exists for the HTML output format:
>>> markdown.markdown('[download](file.txt){download="download"}', extensions=['attr_list'], output_format='html')
'<p><a download href="file.txt">download</a></p>' |
An argument could be made that a user setting That said, if we were going to do anything, I think the only reasonable solution would be to incorporate a list of known boolean attributes (which may be different depending on output format but would not include So how would we implement that? After all, internally we have no way to represent |
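A rough sketch of what a serializer driven by a list of known boolean attributes could look like. This is not Python Markdown's serializer. `BOOLEAN_ATTRS` and `render_attrs` are hypothetical names, and the set below is a small illustrative subset of HTML boolean attributes, not a list the project has agreed on.

```python
import xml.etree.ElementTree as etree
from html import escape

# Illustrative subset only (an assumption for this sketch).
BOOLEAN_ATTRS = {"checked", "disabled", "readonly", "selected"}


def render_attrs(el, output_format="xhtml"):
    """Serialize an element's attributes, minimizing only known booleans."""
    parts = []
    for key, value in el.items():
        if key in BOOLEAN_ATTRS and value in ("", key):
            if output_format == "html":
                parts.append(f" {key}")          # minimized form
            else:
                parts.append(f' {key}="{key}"')  # XML-safe form
        else:
            # Everything else keeps its value exactly as given.
            parts.append(f' {key}="{escape(value, quote=True)}"')
    return "".join(parts)


el = etree.Element("input", {"disabled": "disabled", "download": "download"})
print(render_attrs(el, "html"))
print(render_attrs(el, "xhtml"))
```

With this approach, an attribute outside the list keeps whatever value it was given in either output format, so switching formats changes only the spelling of known booleans. It does not solve the internal representation problem raised above, since etree still needs some value stored for a key-only attribute.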
Currently, if you have an attribute without value then attr_list defaults the value to the name of the attribute. For example:
is rendered as
I would like to have an option to preserve it as simply download without value, so the output would be this:
This would make the attribute list in Markdown more consistent with the produced HTML.
PS: What was the reasoning behind the current behavior? I can't think of a scenario where I would want the attribute to have its name as the value.