refactor: parse markdown on the server for content pages#411
refactor: parse markdown on the server for content pages#411eatyourgreens wants to merge 1 commit intomainfrom
Conversation
36e89fc to
95516c9
Compare
|
I think this will be my experiment for the hack day on Friday. It builds without errors but I’m not sure that it works. |
43e929a to
f1764f2
Compare
- Add a `markdown2Html` helper that converts markdown to HTMl with `remark`. - Add a `material2Html` helper that runs `markdown2Html` with the appropriate Remark plugins for course material. - Add a `html2React` helper that converts HTML strings to a React component tree with `rehype-react`. - Parse markdown in `getStaticProps` and add a `html` prop to teaching material pages. - Update the event API to add a `html` property to events and event groups. - Update `Content` so that it accepts `html` as a prop, instead of `markdown`. - Update the `Markdown` component to use `/lib/markdown`. - Parse HTML strings client-side with `rehype` and add interactivity.
f1764f2 to
552f519
Compare
|
I had assumed the primary purpose of this is to reduce the props, it only reduces NEXT_DATA from 290kb to 274kb, This is more a indication of how disgusting the size the props is, this is some of the very oldest code on gutenberg alongside the nlp stuff (easily predating me) and I think the intention was that the actual markdown was removed from any further drilling but obviously it was always needed to be present to rehydrate the content on the content component. In terms of page load, there has been no difference in the hydration speed, obviously in both cases the SSG page is fully loaded near instantly and the js in my tests is done in 600ms with the re-parsing the react and 605 with this one, so functionally identical. In both cases as well the SSG page is completely sufficient prior to the JS loading in as well, the whole point of next really. In terms of problems: I think if this is a good idea to change to then there needs to be some kind of effort to sanitise the html though, the existing code santises with react markdown but this would presumably just fully include <script> tags or embedded frames? While we have some kiknd of protection of our own deploy via the markdown repo it isnt nearly as strong as the protection on the gutenberg app and other deployments may point to sources they dont control and could have arbitrary code injected. I also am concerned about turning html into react components so our process is becoming markdown to html to react and also markdown to react and both our react doms have to be consistent. Like it is significantly more confusing and i'm worried if we add functionality to components then will it just always work, which the current solution is almost guaranteed to do. |
|
This is the same markdown processor that
Remark should sanitise output by default, but I'll double-check that. It's loosely based on similar work I did for Zooniverse's Markdown parser a couple of years ago: seperating the markdown parser from the custom React component processor so that they can be run separately. |
|
Yeah that is fine if it is still safe, just obviously r-m will ignore <script> or <iframe> or whatever but if remark does not explicitly strip those then does rehype? I mean I know it isnt setting dangerously innerhtml its just worth checking. |
Good question. I think so, but it's been a couple of years since I've worked with Rehype in detail. React Markdown runs the Remark output through I assumed that remark-rehype sanitises output, but I'm not 100% sure about that. |
markdown2Htmlhelper that converts markdown to HTML withremark.material2Htmlhelper that runsmarkdown2Htmlwith the appropriate Remark plugins for course material.html2Reacthelper that converts HTML strings to a React component tree withrehype-react.getStaticPropsand add ahtmlprop to teaching material pages.htmlproperty to events and event groups.Contentso that it acceptshtmlas a prop, instead ofmarkdown.Markdowncomponent to use/lib/markdown.rehypeand add interactivity.