confluence2markdown is built on a broken tokenizer #2

oberlies · 2017-03-21T16:06:24Z

Floby's node-tokenizer is fundamentally broken - see Floby/node-tokenizer#15

This problem is also not resolved in your fork. This is a shame because otherwise this would be a very useful tool...

pborenstein · 2017-03-21T16:26:46Z

My fork was an ugly hack to get a project finished. (You can see how ugly when I set the max token from 128 to 1024 characters.) I haven't really come back to it since then. If you can make it more robust, that would be great.

oberlies · 2017-03-22T16:49:50Z

Well, the issue has been present in Floby's node-tokenizer for years and there hasn't even been a bug report for this up to now. Appears to me that the project is dead. Not something where I'd want to invest...

Parsing the old (3.5) Confluence markup is quite complicated. (After all, this is why they abandoned the format.) So I have now switched to an approach that avoids this problem, or rather leaves it to Atlassian:

1. Create an empty wiki page in a current version of Confluence and save it.
2. Edit the page again.
3. Go to Insert > Markup and insert the Confluence markup.
4. Click "Open in source editor" to show the content in the new XML-based storage format.
5. Copy the content to an XML file.
6. Add the missing header and footer (see highsource/confluence-to-markdown-converter#8, "Conversion fails for (non well-formed) XML obtained from "View Source"").
7. Convert to Markdown using the c2md.xsl XSLT transformation from https://github.com/highsource/confluence-to-markdown-converter
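
The "add the missing header and footer" step can be scripted. The following is only a minimal sketch, assuming the header and footer amount to a wrapper element that declares the `ac:` and `ri:` prefixes used by the storage format. The wrapper name `root`, the helper names, and both namespace URIs are placeholders of mine, not the values from the linked issue; take the real header and footer from there.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions for illustration only).
# Replace them with the URIs given in the linked issue, since the
# stylesheet will only match elements in the namespaces it expects.
AC_NS = "http://example.com/ac"
RI_NS = "http://example.com/ri"


def wrap_storage_fragment(fragment: str, root_tag: str = "root") -> str:
    """Surround a copied storage-format fragment with a header and footer.

    The header opens a single root element that declares the ac: and ri:
    prefixes; the footer closes it. The root tag name is hypothetical.
    """
    header = f'<{root_tag} xmlns:ac="{AC_NS}" xmlns:ri="{RI_NS}">'
    footer = f"</{root_tag}>"
    return f"{header}\n{fragment}\n{footer}\n"


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as namespace-aware, well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

For example, a fragment such as `<p><ac:link><ri:page ri:content-title="Home"/></ac:link></p>` is rejected by `is_well_formed` on its own because its prefixes are unbound, but passes after `wrap_storage_fragment`. Content copied from the source editor may also contain HTML entities such as `&nbsp;`, which this sketch does not handle.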
