fix(markdown): add support for HTML content #855
Merge Protections
Your pull request matches the following merge protections and will not be merged until they are valid.
🟢 Enforce conventional commit
Wonderful, this rule succeeded. Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
🟢 Require two reviewer for test updates
Wonderful, this rule succeeded. When test data is updated, we require two reviewers
🚀
Signed-off-by: Panos Vagenas <[email protected]>
I suggest opening a reminder issue about HTML features that cannot be converted to Markdown. For example, we will have to deal with tables that have merged cells, which are only possible via HTML code blocks.
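To make the merged-cells concern concrete, here is a minimal sketch of how such tables could be spotted in an HTML fragment. It uses only Python's standard `html.parser`. The names `SpanDetector` and `table_needs_html` are hypothetical and not part of docling or this PR.

```python
from html.parser import HTMLParser


class SpanDetector(HTMLParser):
    """Hypothetical helper: flags <td>/<th> cells with colspan or rowspan > 1."""

    def __init__(self):
        super().__init__()
        self.has_merged_cells = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag not in ("td", "th"):
            return
        for name, value in attrs:
            if name not in ("colspan", "rowspan"):
                continue
            text = (value or "").strip()
            # Ignore missing or non-numeric span values instead of raising.
            if text.isdigit() and int(text) > 1:
                self.has_merged_cells = True


def table_needs_html(html: str) -> bool:
    """Return True if the fragment contains a cell spanning several rows or columns."""
    detector = SpanDetector()
    detector.feed(html)
    detector.close()
    return detector.has_merged_cells
```

A plain Markdown pipe table has no way to express such spans, so a check like this is one possible signal that a table would lose structure on export.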
Signed-off-by: Panos Vagenas <[email protected]>
@dolfim-ibm can you elaborate? Unless you are referring to Markdown export in general, this PR does not involve HTML-to-Markdown conversion. What it does (when needed) is Markdown-to-HTML conversion followed by HTML parsing, in order to streamline HTML content processing.
I'm referring to Markdown documents using HTML. We would enhance the export to Markdown in the DoclingDocument with an option to output HTML tables if we detect that the Markdown format would be "too lossy". Something like
Addresses #734.