Migrating LoopBack Docs to Markdown for use with Jekyll
GOAL: To create an open-source site similar to Express docs. High-level tasks:
- Export content of APIC space to HTML. (This space now contains the source documentation for LoopBack, which is duplicated in pages with the same title in the LB space.)
- Convert/strip HTML to markdown using script.
- Add pages to this repo, edit as necessary
When site goes live, replace Confluence pages with redirects to here.
NOTE: Although the long-term plan is to have both LoopBack 2.x and 3.0 docs, initially we should focus on 2.x, since 3.0 is not released yet. As the 3.0 release approaches, we can "clone" the 2.0 docs into `/docs/lb3`, and then add/modify the content as needed.
Article title is in this block:
<h1 id="title-heading" class="pagetitle">
<span id="title-text">... Article title ... </span>
</h1>
Use the contents of the `<span id="title-text">` tag as the value for the `title` property in the article front-matter.
NOTE: If the title includes a colon character (:), Jekyll requires the `title` property to be quoted. In the Confluence export, these articles will have file names that are numbers instead of text.
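The title step above could be scripted roughly as follows. This is a minimal sketch, not the project's actual conversion script: the function names `extract_title` and `front_matter_title` are hypothetical, and it assumes the exported page contains the `<span id="title-text">` element exactly as shown above.

```python
import re
from html import unescape

# Matches the title span shown above; DOTALL lets the title wrap across lines.
TITLE_RE = re.compile(r'<span id="title-text">(.*?)</span>', re.DOTALL)


def extract_title(page_html):
    """Return the text inside <span id="title-text">, or None if it is absent."""
    match = TITLE_RE.search(page_html)
    if match is None:
        return None
    # Decode HTML entities and collapse whitespace left over from the export.
    return " ".join(unescape(match.group(1)).split())


def front_matter_title(title):
    """Format the front-matter title line, quoting titles that contain a colon."""
    if ":" in title:
        # Escape backslashes and double quotes so the quoted YAML string stays valid.
        escaped = title.replace("\\", "\\\\").replace('"', '\\"')
        return f'title: "{escaped}"'
    return f"title: {title}"
```

A regular expression is enough here only because the title span has a fixed `id`; a full HTML parser would be a safer choice if the export format varies.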
Every markdown file must start with some Jekyll front-matter that looks like this:
---
title: The article title goes here
layout: page
keywords: LoopBack
tags:
sidebar: lb2_sidebar
permalink: /doc/lb2/The-file-name-goes-here.html
summary:
---
NOTE: The three dashes before and after front-matter are required.
In general, we don't have a consistent summary for every article, so we'll leave the `summary` property blank.
Confluence export apparently does not include "labels" data, so we'll also leave the `tags` property blank. This seems pretty lame on the part of Confluence (Atlassian).
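As an illustration, the front-matter described above could be generated with a small helper like the one below. It is a sketch under stated assumptions: `build_front_matter` is a hypothetical name, and `file_name` is assumed to be the hyphenated page name without the `.html` extension.

```python
def build_front_matter(title, file_name):
    """Build the Jekyll front-matter block for one converted article.

    `tags` and `summary` are intentionally left blank, as described above.
    """
    if ":" in title:
        # Jekyll requires titles containing a colon to be quoted.
        escaped = title.replace("\\", "\\\\").replace('"', '\\"')
        title_value = f'"{escaped}"'
    else:
        title_value = title
    lines = [
        "---",  # the opening and closing dashes are required
        f"title: {title_value}",
        "layout: page",
        "keywords: LoopBack",
        "tags:",
        "sidebar: lb2_sidebar",
        f"permalink: /doc/lb2/{file_name}.html",
        "summary:",
        "---",
    ]
    return "\n".join(lines) + "\n"
```

For example, `build_front_matter("Creating model relations", "Creating-model-relations")` returns a block whose permalink is `/doc/lb2/Creating-model-relations.html`.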
The actual article content is in:
... Content here ...
</div>
Everything above and below this, i.e. outside of this tag, can be discarded.
### Other stuff that should be discarded.
Some pages may have these, which should just be discarded.
#### Injected CSS
Discard injected CSS: `<style type='text/css'>/*<![CDATA[*/ .... /*]]>*/</style>`
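
One possible way to drop these blocks in the conversion script is a regular expression over the page HTML. This is only a sketch: `strip_injected_css` is a hypothetical name, and it assumes the `<style>` elements are not nested and always have a closing tag.

```python
import re

# Matches a whole <style ...>...</style> element, including the CDATA wrapper inside it.
STYLE_RE = re.compile(r"<style\b[^>]*>.*?</style\s*>", re.DOTALL | re.IGNORECASE)


def strip_injected_css(page_html):
    """Remove every <style> element from the exported HTML."""
    return STYLE_RE.sub("", page_html)
```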
#### Confluence-generated TOC
Since our Jekyll theme has its own [automatically generated TOCs](http://idratherbewriting.com/documentation-theme-jekyll/mydoc_pages.html#automatic-mini-tocs), we should discard this HTML (which occurs only in some pages):
The class selector `rbtoc1470354523244` varies by page.
We need to process links whose `href` destination URL begins with `https://docs.strongloop.com/display/APIC/` so they link to the new page here instead of the old page. All other links should be left "as is".
Convert
<a href="https://docs.strongloop.com/display/APIC/Creating+model+relations" rel="nofollow">
Creating model relations
</a>
To
[Creating model relations](/doc/lb2/Creating-model-relations.html)
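The link rewrite above might look something like this in a script. It is a sketch, not a definitive implementation: `convert_apic_links` is a hypothetical name, and the rule of replacing `+` with `-` in the page name is inferred from the single example above. Links with a `#fragment` or query string, and page names with percent-encoded characters, are not handled specially and would need review.

```python
import re
from html import unescape

OLD_PREFIX = "https://docs.strongloop.com/display/APIC/"
NEW_PREFIX = "/doc/lb2/"

# Captures the page name after the old prefix and the link text.
# hrefs containing '#' or '?' do not match and are left untouched.
LINK_RE = re.compile(
    r'<a\s+[^>]*?href="' + re.escape(OLD_PREFIX) + r'([^"#?]+)"[^>]*>(.*?)</a>',
    re.DOTALL,
)


def convert_apic_links(page_html):
    """Rewrite links into the APIC space as markdown links to the new site."""

    def to_markdown(match):
        # Assumption: the new file name is the old page name with '+' replaced by '-'.
        page = match.group(1).replace("+", "-")
        text = " ".join(unescape(match.group(2)).split())
        return f"[{text}]({NEW_PREFIX}{page}.html)"

    return LINK_RE.sub(to_markdown, page_html)
```

Links to any other destination do not match the pattern, so they pass through unchanged for the general HTML-to-markdown step to handle.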
Convert headings as follows:
| Confluence HTML | Markdown |
|---|---|
| `<h2> .. </h2>` | `##` |
| `<h3> .. </h3>` | `###` |
| `<h4> .. </h4>` | `###` |
| `<h5> .. </h5>` | `###` |
| `<h6> .. </h6>` | `###` |
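
The heading mapping in the table could be applied with a lookup like the one below. This is a rough sketch: `convert_headings` is a hypothetical name, the prefixes are copied from the table as written, and inline markup inside a heading is simply stripped.

```python
import re

# Heading prefixes copied from the table above.
HEADING_PREFIX = {"h2": "##", "h3": "###", "h4": "###", "h5": "###", "h6": "###"}

# Matches <h2>..</h2> through <h6>..</h6>, with optional attributes on the opening tag.
HEADING_RE = re.compile(r"<(h[2-6])\b[^>]*>(.*?)</\1\s*>", re.DOTALL | re.IGNORECASE)


def convert_headings(page_html):
    """Replace HTML headings h2-h6 with markdown headings."""

    def to_markdown(match):
        tag = match.group(1).lower()
        # Drop any inline tags inside the heading and collapse whitespace.
        text = " ".join(re.sub(r"<[^>]+>", "", match.group(2)).split())
        return f"{HEADING_PREFIX[tag]} {text}"

    return HEADING_RE.sub(to_markdown, page_html)
```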
I'm assuming we can convert the HTML to markdown without too much trouble, but I'm keeping this here for reference in case we need it.
It may turn out to be easier to export to Word and then convert the Word files to markdown; see "How can doc/docx files be converted to markdown or structured text?".
Other references:
- https://domchristie.github.io/to-markdown/ Online HTML to MD converter.
- http://pandoc.org/
References: