Proposed AI changes adding support for fit-content, font-feature-settings, ::first-letter, ::first-line

Following up here (so the technical bits are remembered and searchable in the right repo) on the discussion started at https://github.com/koreader/koreader/discussions/14848 and some changes made with the help of AI by @reuerendo:
> Hi. Recently, I managed to implement several improvements in crengine that I really missed: support for things like fit-content, ::first-letter, ::first-line, and font-feature-settings: "ss##".
The problem is that I'm not a programmer, and all the changes were made using AI. I can't assess the quality of this code. I only know that everything works for me.
Perhaps one of the developers could explain to me how to propose my changes for inclusion in the official build? And is it even worth proposing such things without being a programmer?

> It was like that: fit-content -> first-letter -> first-line -> sets. Each branch contains the previous ones

https://github.com/koreader/crengine/compare/master...reuerendo:crengine:sets
(I have kept a backup of your changes in case you mess with or delete your branch, there are some interesting ideas.)

(Sorry about the long write-up, maybe you spent less time co-piloting the AI than I did writing all this :/)

### General thoughts about AI work and submissions

I'm not certain you are not a programmer :) There are bits here and there that feel like you may have written them (some comments in cyrillic, some bits a bit clumsy that feel not AI :)).
But there is a sure thing: you (and probably most new contributors) can't know about all crengine pecularities. It's somehow fragile, and if there are stuff not done in a walk-along-these-pecularities way, it may generate lots of others unrelated future issues (ie. rendering hash mismatch causing a full re-rendering on each document re-opening, or full-reparsing when we change a style via styletweak) that I may not see yet and that I (not the AI) will have to solve.
The AI may get a better feeling about all these pecularities with the help of my comments added along the years, but it's possible I didn't add comments about all of these, or that the codebase being so huge, the AI can't really grasp the interrelations of everything. But it is surely better at that than any new-at-crengine human can be :)

So, a newcomer using AI can surely get things done and working. But along the way, the AI must have suggested various options (I see in the changes: `// ::first-line (Variant A): affect line breaking/metrics.` - so it must have proposed other variants and possible solutions. It probably came with pros/cons AI text, and you were able to decide for one of the variants. But I'm a bit unconfortable as a reviewer (and probably the person who knows the most about crengine) just seeing the final result with decisions made by a person that is not. May be the cons for the variant you chose were really a long-term problem, or the AI didn't see one - and may be the other variants that felt to you like not the one to chose were actually the most promising one, may be with some tweaks that I could have proposed to the AI if I were the one doing the prompts :)

In my single experience with coding with AI: #645, the AI initially proposed 4 options, all were complicated - I then just suggested some other way, and it turned its vision completely, and went with my idea. And along the way, there were 4 times where I had to refocus it on some way of doing it.
So, It feels we may get somewhere more easily and quicker with AI, but (except for simple things like your `fit-content`) it feels it would have to be driven by somebody with more experience with crengine (that is: me :/).

Just curious: can you tell us about what AI tool/model you used and how ? Was there many many prompts, or you just told it "Implement this" and let it do its thing - maybe testing it and mentionning issues, and letting it solve them ? Or were you more involved in undestanding the changes and you made some change yourself?.

### About fit-content

This one is the less harmful change, it's just some new property named value. The lvstsheet parsing feels ok (except the AI needlessely filled all the `parse_number_value()` default parameters with their default value of `false`).
I haven't really looked if the handling in lvrend.cpp is per-specs and right, but I will if this is made a standalone PR.
So, feel free to make a PR for this one (a standalone branch that you make a PR with).

### About font-feature-settings

The AI has probably made the most annoying thing right: the parsing :)
In the implementation, the result is handled as a string, with lots of `lString8::empty_str` that I don't really like (maybe there are no other solution, dunno, I would have liked asking the AI :))
Also, if we were to support that, maybe we should get rid of the font features bitmap I added to support `font-variant`, which is redundant and more restricted (only 32 bits, so only 31 possible options, while having a string gives unlimited and support for unknown Opentype tags). So, it would need some thinking and possibly a rewrite of the `font-variant` support. And see what the specs says about interactions/precedence/inheritance between `font-feature-settings` and `font-variant` (the AI can probably tell us about that :)).
So, I'd rather you not pursue on this one.
Maybe I'll have a look in the coming months.

Just asking: why do you feel the need for that one? You have real books using these that don't have a font-variant equivalent (I guess a good publisher would use them for older eReader software that don't know about font-feature-settings).

### About ::first-letter

The AI went with mimicking support for pseudo elements `::before` and `::after`, which feel like the most obvious solution.
But there is a fundamental flaw: it's modifying the DOM, wrapping the first letter in a new node, so cutting the first text node.
https://github.com/koreader/koreader/issues/9142#issuecomment-1140400225
> But even for :first-letter, it's complicated if there are inline nodes and the first letter is in some nested inline node [You or the AI solved that one]. Moreover, then, any highlight/xpointer made later in that paragraph would have their index shifted by 1 (as the first letter is no longer in the same text node). Quite too complicated to give it a go, for something that is just cosmetics :/

But you or the AI has overlooked the second one, which is a blocker. If people on previous books have made some highlights, they will be shifted by one char. And it will also happen if a person toggle Embedded styles or add a cosmetic styletweak to add that `::first-letter`: their highlights in a paragraph with `::first-letter` will be messed up.
This is something that the solution should handle. I would have been curious to see if the AI in its suggestions/variants had mention that :) and if not, I would have told it to see what alternative solutions it would come up with.

Also, it feels that the parsing/extraction of what is the first letter (per-specs, bringing dots around that first letter, that you did) miss bringing any diacritics following that base letter.

Anyway, your AI was quite good as grasping all of lvrend and lvtinydom, and it even had the idea of going with `ldomDocument::saveFirstLineStylesData()` (similar with how we handle list items numbers), holding some stuff in some independant datastructures that are saved in the cache, something I would have never thought about ! It's some idea I will try to keep in the back of my mind as a possible helper for some problems. Even if it feels it may not be necessary for `::first-letter`.

Noting some thoughts about a possible idea to investigate to support `::first-letter`:
To not mess with text nodes, may be add some attribute in the parent node (or create an empty node) flagging it `hasFirstLetter`, and maybe store in it the indice of the text node where the first-letter stops. Then, on rendering, build some "generated content" (like list items bullets and numbers, or pseudoelement `::before` content) with that "first letter" char**s**. Render that with the style of the `::first-letter` (so, need to create that empty node, like for `::before`). Then, in lvtextfm, check if there is the attribute `hasFirstLetter`, and if yes, when measuring/rendering that text node, skip the first chars before that indice just as if there were not there in the text node.
This wouldn't break/shift highlights later in the paragraph, may be just highlights including that first letter would just not get that first letter part of the highlight (as it is generated content), which could be considered a cosmetic issue.

So, I'd rather you not pursue on this one. Maybe I'll have a look in the coming months.

### About ::first-line

Your AI had some very clever idea I never thought of :)

```c
        // ::first-line (Variant A): affect line breaking/metrics.
        // We first apply the overlay to the whole paragraph to let the existing
        // line-breaking logic determine the exact first line end with correct widths.
        // Right after the first line is emitted, we restore base metrics for the rest.
        bool firstLineOverlayActive = false;
        LVArray<src_text_fragment_t*> base_srcs;
        base_srcs.reserve(m_length);
```
Instead of bothering with a wrapping node or a generated content node (like it did for :first-letter) and the complexity to know up to which char or inline node (that will end that first line) we should wrap (and we could not even wrap if we are in the middle of a nested inline node), just have this handled by the text rendering code: copy all the text fragments (lower level datastructure than text nodes), apply to them the `::first-line` font style (not sure how it handles different original styles, ie. bits in regular, bits in italic in the original text, and the first-line adding only font-weight, do we keep the italic?), measure all the paragraph text size with that font, render it (so, the first-line gets that style), and when done with the first line, restore the original fragments, re-measure the text, and go along rendering the next lines with the original font ! Very clever ! (but quite ugly, but the ugliness is quite localized, so I could bear it - and the overhead of measuring twice the paragraph).

But the `::first-line` element, per-specs, allows for other CSS properties https://developer.mozilla.org/en-US/docs/Web/CSS/Reference/Selectors/::first-line#allowable_properties like color, background-color, that this solution can't handle (as it can just tweak the properties stored in the `src_text_fragment_t` that are all mostly just about the font). I'm curious if the AI mentionned that to you when proposing the variants, do you remember ?
May be this limitation would be just fine, but it's something to discuss - and I would have liked to see the AI other variants, may be it had some ideas on how to support more of the properties (and I would have decided if the added complexity is worth it or not).

So, I'd rather you not pursue on this one. Maybe I'll have a look in the coming months.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposed AI changes adding support for fit-content, font-feature-settings, ::first-letter, ::first-line #646

General thoughts about AI work and submissions

About fit-content

About font-feature-settings

About ::first-letter

About ::first-line

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Proposed AI changes adding support for fit-content, font-feature-settings, ::first-letter, ::first-line #646

Description

General thoughts about AI work and submissions

About fit-content

About font-feature-settings

About ::first-letter

About ::first-line

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions