
Configurable cell split #63

Open
rossant opened this issue Jul 15, 2015 · 1 comment

Comments

@rossant
Owner

rossant commented Jul 15, 2015

Issue raised by @bollwyvl. The question is: how can we customize the way cells are split during Markdown => ipynb conversion?

There are several ways of doing it; here's one.

By default, the current behavior is kept: the smallest possible cells are created (i.e. basically one for every blank line \n\n), based on the observation that it's easier to merge cells than to split them.

Then, there is a dedicated class, for example CellMerger, that is responsible for merging consecutive cells. It implements a method to_merge(cell_0, cell_1) that takes two consecutive ipymd cells as input and returns True or False. When converting from Markdown to ipynb, this class is called at the end of the conversion. It starts from the first cell, decides whether it should be merged with the next one, performs the merge if necessary, and moves on to the next pair of cells, until the end. (This should probably only concern Markdown cells; a code cell will never be merged with a Markdown cell!)

Always returning True means that we'll end up with as few Markdown cells as possible (a single big cell if there's no code cell).

Always returning False means that the current behavior is kept.
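
To make the proposal concrete, here is a minimal sketch of what such a merger could look like. Nothing here exists in ipymd yet: the merge() method, the AlwaysMerge subclass and the cell layout (plain dicts with a 'cell_type' key and, for Markdown cells, a 'source' key) are assumptions made for illustration only.

```python
class CellMerger(object):
    """Hypothetical post-processor that merges consecutive Markdown cells."""

    def to_merge(self, cell_0, cell_1):
        # Default in this sketch: never merge, i.e. keep the smallest cells.
        return False

    def merge(self, cells):
        """Walk the cells in order and merge neighbours when to_merge says so."""
        merged = []
        for cell in cells:
            previous = merged[-1] if merged else None
            if (previous is not None
                    and previous['cell_type'] == 'markdown'
                    and cell['cell_type'] == 'markdown'
                    and self.to_merge(previous, cell)):
                # Re-join the two sources with the blank line that split them.
                # Note that `previous` may already be the result of a merge.
                previous['source'] = previous['source'] + '\n\n' + cell['source']
            else:
                # Copy the cell so that the input list is left untouched.
                merged.append(dict(cell))
        return merged


class AlwaysMerge(CellMerger):
    """The other extreme: as few Markdown cells as possible."""

    def to_merge(self, cell_0, cell_1):
        return True


cells = [
    {'cell_type': 'markdown', 'source': 'First paragraph.'},
    {'cell_type': 'markdown', 'source': 'Second paragraph.'},
    {'cell_type': 'code', 'input': 'print(1)'},
    {'cell_type': 'markdown', 'source': 'Third paragraph.'},
]
never = CellMerger().merge(cells)
always = AlwaysMerge().merge(cells)
```

With the default class the four cells come back unchanged; with AlwaysMerge the first two Markdown cells are joined, and the code cell still acts as a barrier between Markdown cells.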

In the middle, we can customize how cells should be created out of Markdown text. For example, we could decide that two Markdown cells should be merged if neither starts with a header.
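
As an illustration of the header rule, the predicate could be sketched as follows. The helper name starts_with_header and the dict-based cells with a 'source' key are assumptions, and only ATX-style headers (# Title) are detected; Setext headers (underlined with === or ---) are ignored in this sketch.

```python
import re

# One to six '#' characters followed by whitespace or the end of the text.
_HEADER = re.compile(r'^#{1,6}(\s|$)')


def starts_with_header(cell):
    """Return True if the Markdown source of the cell begins with an ATX header."""
    return bool(_HEADER.match(cell['source'].lstrip()))


def to_merge(cell_0, cell_1):
    """Merge two Markdown cells only if neither of them starts with a header."""
    return not (starts_with_header(cell_0) or starts_with_header(cell_1))


paragraphs = to_merge({'source': 'Some text.'}, {'source': 'More text.'})
after_header = to_merge({'source': '# Title'}, {'source': 'Some text.'})
before_header = to_merge({'source': 'Some text.'}, {'source': '## Section'})
```

Under this rule a header always sits in a cell of its own, and the paragraphs between two headers are candidates for merging.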

The default CellMerger should implement reasonable heuristics between the two extremes.

Another thing: it may be possible to force a split with something like a *** in Markdown, but I don't think it should be mandatory (i.e. not writing any *** should not result in a single big cell for the whole document!)

@bollwyvl
Contributor

Data point: The empty metadata --- will force a split.

I like the configurable postprocessor idea... It might require some reparsing
if you toss away all the tokens, but it's probably better than the writers
having to think about it.

Splitting on headers is good.

Consecutive list bullets should be joined, probably, as should indented
things inside lists (like example code in lists in README).
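
A rough sketch of that list rule, written as a merge predicate: all names are hypothetical, cells are assumed to be dicts with a 'source' key, and it assumes that the leading indentation of a block survives the initial split, which may not hold in practice.

```python
import re

# A bullet ('*', '+', '-') or an ordered marker ('1.', '2)'), then whitespace.
_LIST_ITEM = re.compile(r'^([*+-]|\d+[.)])\s+')


def first_line(cell):
    """Return the first line of the Markdown source of the cell."""
    return cell['source'].split('\n', 1)[0]


def is_list_item(line):
    return bool(_LIST_ITEM.match(line))


def is_indented(line):
    # Four spaces or a tab: a continuation or code block nested in a list item.
    return line.startswith(('    ', '\t'))


def to_merge(cell_0, cell_1):
    """Join a list cell with a following list item or indented block."""
    if not is_list_item(first_line(cell_0)):
        return False
    line = first_line(cell_1)
    return is_list_item(line) or is_indented(line)


bullets = to_merge({'source': '* one'}, {'source': '* two'})
nested_code = to_merge({'source': '1. Install it:'}, {'source': '    pip install ipymd'})
plain_text = to_merge({'source': '* one'}, {'source': 'A new paragraph.'})
```

Only the first line of each cell is inspected, so a cell that already holds several merged bullets is still recognized as a list.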

Multi-line inline HTML/math will be tricky, perhaps: on nbviewer we are
constantly running into problems and bothering @lepture about it :) Another
reason to get mainline mistune in the mix...

