Remove UseSoftlineBreakAsHardlineBreak from MarkdownParser #510

Open: wants to merge 4 commits into main

Mpdreamz
Member

@Mpdreamz Mpdreamz commented Feb 14, 2025

This change ensures that soft line breaks are no longer treated as hard line breaks during markdown parsing. It enhances compatibility with standard markdown behavior.

This implements the request made here: #343

This should align better with how writers write as per: https://asciidoctor.org/docs/asciidoc-recommended-practices/#one-sentence-per-line

NOTE: this has a high impact on asciidocalypse and docs-content.

Introduces a new `soft_line_endings` configuration option in docset.yml to control line break behavior in Markdown parsing. When enabled, soft line endings are converted to hard HTML breaks (`<br />`).

NOTE: This config option defaults to false, so soft line breaks in the Markdown are not turned into hard breaks in the HTML.
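
To make the two modes concrete, here is a minimal Python sketch of the behavior described above. It is not the docs-builder implementation: `render_paragraphs` is a hypothetical helper that handles plain paragraphs only, and the `soft_line_endings` parameter simply mirrors the name of the new option.

```python
import html
import re


def render_paragraphs(markdown: str, soft_line_endings: bool = False) -> str:
    """Render plain Markdown paragraphs to HTML (paragraphs only, no other syntax)."""
    # A blank line (optionally containing whitespace) separates paragraphs.
    blocks = re.split(r"\n\s*\n", markdown.strip())
    # A single newline inside a paragraph is a soft line break.
    # Hypothetical flag: when set, each soft break is emitted as a hard break.
    joiner = "<br />\n" if soft_line_endings else "\n"
    rendered = []
    for block in blocks:
        lines = [html.escape(line.strip()) for line in block.splitlines()]
        rendered.append("<p>" + joiner.join(lines) + "</p>")
    return "\n".join(rendered)


sample = "One sentence per line.\nStill the same paragraph.\n\nA new paragraph."
print(render_paragraphs(sample))                          # default: soft breaks stay soft
print(render_paragraphs(sample, soft_line_endings=True))  # opt-in: soft breaks become <br />
```

With the default, both sentences of the first paragraph end up in one `<p>` with no `<br />`, which is what lets writers use one sentence per line.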

cc @karenzone @shainaraskas

Introduces a new `soft_line_endings` configuration option to control line break behavior in Markdown parsing. When enabled, soft line endings are converted to hard HTML breaks (`<br />`). Updated relevant parser logic and documentation to reflect this addition.
@bmorelli25
Member

Well that was fast 😅. I'll check this out and test against docs-content later today!


@karenzone karenzone left a comment

Fantastic news! Thanks for making this happen--and so quickly!

@bmorelli25
Member

So I must be doing something wrong. I don't notice any differences in the output when testing this against docs-content. I also tried to write a script that diffs the HTML output of building with this change vs building without this change. After correcting for changes in edit_this_page links, the HTML diff is empty. Does that sound possible?
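
For anyone who wants to repeat that comparison, a rough Python sketch of such a diff follows. It is not the script used above: `diff_builds` and `normalize` are hypothetical names, and the `EDIT_LINK` pattern is an assumption about what the edit_this_page links look like, so adjust it to the real markup.

```python
import difflib
import re
from pathlib import Path

# Assumption: edit links are href attributes whose URL contains "/edit/".
EDIT_LINK = re.compile(r'href="[^"]*/edit/[^"]*"')


def normalize(html_text: str) -> str:
    """Replace edit links with a fixed placeholder so they never show up as a diff."""
    return EDIT_LINK.sub('href="EDIT_LINK"', html_text)


def _read(path: Path) -> list[str]:
    # A file missing from one build is treated as empty, so it appears as a full diff.
    if not path.is_file():
        return []
    return normalize(path.read_text(encoding="utf-8")).splitlines()


def diff_builds(before: Path, after: Path) -> dict[str, list[str]]:
    """Return a unified diff for every HTML file whose normalized content differs."""
    names = {p.relative_to(before).as_posix() for p in before.rglob("*.html")}
    names |= {p.relative_to(after).as_posix() for p in after.rglob("*.html")}
    changed = {}
    for name in sorted(names):
        delta = list(
            difflib.unified_diff(
                _read(before / name),
                _read(after / name),
                f"before/{name}",
                f"after/{name}",
                lineterm="",
            )
        )
        if delta:
            changed[name] = delta
    return changed
```

Calling `diff_builds(Path("out-main"), Path("out-pr"))` on two build output directories (both directory names are placeholders) returns an empty dict when the builds match after normalization.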

@shainaraskas
Contributor

shainaraskas commented Feb 20, 2025

@bmorelli25 it sounds possible to me - the reason why this change is relatively safe is because the migration tool imported paragraphs as lines, and put hard line breaks between paragraphs, and we as writers just kept following the pattern.

consider trying to point it at a branch that you have altered to have the "problematic" content, e.g.

This should appear as two paragraphs.
Because soft line breaks are turned off.
overkill explanation and examples

current behavior

To integrate with Active Directory, you configure an `active_directory` realm and map Active Directory users and groups to roles in the role mapping file.
TEST TEST!

becomes

image

and

To integrate with Active Directory, you configure an `active_directory` realm and map Active Directory users and groups to roles in the role mapping file.

TEST TEST!

becomes

image

what we want is

To integrate with Active Directory, you configure an `active_directory` realm and map Active Directory users and groups to roles in the role mapping file.
TEST TEST!

to be

image

and

To integrate with Active Directory, you configure an `active_directory` realm and map Active Directory users and groups to roles in the role mapping file. 

TEST TEST!

to be

image

going to quickly regex again for this case - when I did it before, the only lines sitting directly beside each other were inside of code blocks (which this change does not impact).

@shainaraskas
Contributor

shainaraskas commented Feb 20, 2025

yep, quick scan shows we only have consecutive lines of text in:

  • front matter
  • code blocks
  • lists
  • intros to lists
  • directives + their contents
  • definition lists

used `[a-z,0-9,\.]\n{1}[^\d\W]`

none of these would be impacted by this change afaik
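
For reference, a small Python sketch of how the same scan could be run. It uses the regex quoted above unchanged; `consecutive_text_lines` is a hypothetical helper, and it does not filter out front matter, code blocks, lists, or directives, so those hits still need to be triaged by hand as in the list above.

```python
import re

# The pattern quoted above: a line ending in a lowercase letter, digit, comma,
# or period, followed by exactly one newline and then a line that starts with
# a letter or underscore.
CONSECUTIVE_TEXT = re.compile(r"[a-z,0-9,\.]\n{1}[^\d\W]")


def consecutive_text_lines(markdown: str) -> list[int]:
    """Return 1-based numbers of lines that are directly followed by another text line."""
    return [
        markdown.count("\n", 0, match.start()) + 1
        for match in CONSECUTIVE_TEXT.finditer(markdown)
    ]


sample = (
    "First sentence of a paragraph.\n"
    "Second sentence on the next line.\n"
    "\n"
    "A separate paragraph.\n"
)
print(consecutive_text_lines(sample))  # reports only line 1
```

To scan a docs tree, the helper can be applied to the text of each `.md` file, for example via `pathlib.Path.rglob("*.md")`.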

5 participants