Background tree-sitter incremental parsing to improve TextArea responsiveness. #5645

paul-ollis · 2025-03-13T11:43:41Z

Tree-sitter's incremental parsing is not always fast enough to be
executed for every editing keystroke without providing a very poor user
experience. For example:

"""Docstring with closing quotes not yet added <cursor here>

import textual
...

Typing into the above docstring can become very slow. For a 25,000 line
Python file on my laptop, each change causes a reparse time of about
0.2--0.3 seconds: editing is painful.

This change decouples incremental parsing from TextArea edits, using
tree-sitter's timeout mechanism and a task to effectivley run parsing in
the background, on a snapshot of the TextAreas's contents. While parsing
is in progress, editing of the TextArea text continues in a responsive
manner.

Edits to the tree-siiter parse tree are buffered until the background
parser is able to process them, while edits to the displayed text are
applied as they occur.

Please review the following checklist.

[x ] Docstrings on all new or modified functions / classes
[ n/a] Updated documentation
[ x] Updated CHANGELOG.md (where appropriate)

paul-ollis · 2025-03-13T11:44:51Z

This incorporates changes from #5642 and will need rebasing if and when that PR
is merged or rejected.

I can imaging that this might inspire ideas for possible different approaches
and am more than happy to discuss and help update this PR with any you might have.

Things I am not entirely happy about.

In TextArea._handle_syntax_tree_update() forces a refresh of the entire
visible area, which feels heavy handed.

I think issue Investigate reducing TextArea updates #4086 might be relevant here.
A small number of the BackgroundSyntaxParser.trigger_syntax_tree_update tasks
do not get cleaned up when the tests are executed, causing a variable number
of 'Task was destroyed but it is pending!' messages to be printed.

I have not yet convinced myself that this will not occur for normal
applications.

paul-ollis · 2025-03-13T17:32:26Z

I have updated this to make some use of Tree.updated_ranges().

This only reduces the additional redrawing that can (perhaps always) occur as the result of parsing completing in the background.

So as it stands, I think this PR exacerbates the problem that #4086 is trying to solve. I will perform some measurements and report them here - probably tomorrow.

paul-ollis · 2025-03-14T10:35:22Z

I ran some tests (with instrumented code) and this PRs TextArea does not seem to obviously call render_line more often than the 'main' version.

BUT Then I realised that the code is almost certainly broken for a TextArea using soft wrapping. I think more automatic tests are needed. So, as it stands, this PR is not in a fit state to merge.

I will spend some more time over the next week looking at this. My LaTeX editor really needs and I would prefer to be able to use an unmodified version of textual.

darrenburns · 2025-03-14T10:43:04Z

The "wrapped_document" attribute in TextArea lets you translate from a position in the unwrapped document to the corresponding position in the wrapped view of the document - hopefully that can be useful in your investigation.

paul-ollis · 2025-04-01T11:54:06Z

@darrenburns. My apologies for being so quiet for this pull request.

I was not happy with the extra potential load that might result from decoupling syntax parsing from applying editing changes. Potentially, the change could increase the terminal I/O, although this is not easy to verify.

Tree-sitter's changed_ranges method does not help, so I started to look at ways to reduce the amount of redrawing and hence terminal I/O (which will make a bit difference to my LaTeX editor application - I can explain why if you are interested).

And, I think, I have been successful. But the journey has been rather twisty and turny, revealing various corner cases that only matter as a result of limiting the amount of redrawing. I have code that now drastically reduces the amount of I/O and seems very responsive. I am currently working on fixing remaining corner cases, adding tests suggested by corner cases and then refactoring.

I hope to push some changes over the next 2-3 days.

paul-ollis · 2025-04-03T20:16:27Z

I have some videos showing the TextArea performance improvements here: https://github.com/paul-ollis/textual/wiki

This needs some more work (possibly quite a lot) to become an acceptable pull request, but I wanted to give you a chance to review my approach. It is now a big and invasive change, so I appreciate that it may be a while before you can spend time on this. I also realise that you might not wish to progress with this, so I do not plan to spend much more time on this until you are able to indicate whether the effort is worth while. (This version of code is very useful to me for my project, but further effort is not worth it for that project.)

I will rebase this so that the checks run.

paul-ollis · 2025-04-03T20:45:38Z

How my changes work

The TextArea widget no longer uses refresh to indicate broad repaint regions. Instead it simply flags that repainting is required and uses a new hook, which is triggered just before repainting, to:

build a 'pre-render' image of the text area.
compare the image to the previous image and generate more precise repaint regions.

This adds a _pre_render_lines method which is a bit of beast and nests functions 3-deep. I was writing with a mix of readability and performance in mind. As a result the render_line method is shorter and simpler.

(A simpler approach might be to create a full set of render strips and use them for the comparison. I did not investigate this approach, but am open to spending some time on it).

For use edits and within line cursor movement this results in greatly reduced terminal I/O.

Any operation that involves text scrolling will result in many or all lines being very different. So line comparisons only try to find common leading and trailing text (typically runs of space characters) producing a single region covering a central part where the lines differ.

The TextArea widget code queries the entires syntax tree for each edit, using the tree-sitter Query.captures method. This has potential scaling issues, but such issues are exacerbated by the fact the Query.captures method scales very poorly with the number of line it is asked to generate captures for. It appears to be quadratic or something similar I think - I strongly suspect a bug in tree-sitter or its python bindings. On my laptop, this makes editing a 25,000 line Python file painfully unresponsive. A 25,000 lines Python file is large, but not entirely unreasonable. I actually became aware of this behaviour developing a LaTeX editor, which showed symptoms after about 1,000 lines. This change replaces the plain TextArea._highlights dictionary with a dictonary-like class that lazily performs item access to build hilghlight information for small blocks of (50) lines at a time. As well as keeping the TextArea very much more responsive, it will reduce the average memory requirements for larger documents. During regression testing, I discovered that the per-line highlights are not necessarily in the correct (best) order for syntax highlighting. For example, highlighting within string expressions can lost. So I added suitable sorting. This required that the snapshots for some tests to be updated.

Tree-sitter's incremental parsing is not always fast enough to be executed for every editing keystroke without providing a very poor user experience. For example: ```python """Docstring with closing quotes not yet added <cursor here> import textual ... ``` Typing into the above docstring can become very slow. For a 25,000 line Python file on my laptop, each change causes a reparse time of about 0.2 - 0.3 seconds: editing is painful. This change decouples incremental parsing from TextArea edits, using tree-sitter's timeout mechanism and a task to effectivley run parsing in the background, on a snapshot of the TextAreas's contents. While parsing is in progress, editing of the TextArea text continues in a responsive manner. Edits to the tree-siiter parse tree are buffered until the background parser is able to process them, while edits to the displayed text are applied as they occur.

A first attempt at using changed_ranges to reduce the amount of redrawing triggered by completion of a tree-sitter incremental parse.

1. ChopsUpdate docstring was not correct. 2. ChopsUpdate.render_segments was generating longer sequences than necessary because it always cropped strips with a start of zero.

Replaced looped calculation with simple len(...).

Now capturing a pre-rendered version of the TextArea window and using changes to the pre-rendered version to generate change regions.

Drop 3.9 from supported tree-sitter Python versions. This allows predicates in SCM files to work.

paul-ollis changed the title ~~Bacground tree-sitter increment parsing to improve TextArea responsiveness.~~ Bacground tree-sitter incremental parsing to improve TextArea responsiveness. Mar 13, 2025

paul-ollis changed the title ~~Bacground tree-sitter incremental parsing to improve TextArea responsiveness.~~ Background tree-sitter incremental parsing to improve TextArea responsiveness. Mar 13, 2025

paul-ollis force-pushed the textarea-speedup-2 branch 2 times, most recently from cf7a350 to a14a765 Compare April 7, 2025 18:14

paul-ollis marked this pull request as draft April 9, 2025 12:22

This was referenced Apr 11, 2025

Make syntax highlight generation scale better. #5642

Draft

Tree-sitter callback-parsing prevents correct query processing. #5746

Open

paul-ollis force-pushed the textarea-speedup-2 branch 2 times, most recently from e68da9d to dc46901 Compare April 25, 2025 14:47

paul-ollis force-pushed the textarea-speedup-2 branch from dc46901 to ef989ff Compare April 28, 2025 18:01

paul-ollis added 8 commits April 28, 2025 19:04

Update with suggestions by Darren Burns.

603ba67

Use tree-sitter Tree.changed_ranges.

d9398f6

A first attempt at using changed_ranges to reduce the amount of redrawing triggered by completion of a tree-sitter incremental parse.

Compositor fix.

24ee972

1. ChopsUpdate docstring was not correct. 2. ChopsUpdate.render_segments was generating longer sequences than necessary because it always cropped strips with a start of zero.

Fix slow property in wrapped document.

3d339eb

Replaced looped calculation with simple len(...).

Appear to have working render-based diff method.

329f77d

Now capturing a pre-rendered version of the TextArea window and using changes to the pre-rendered version to generate change regions.

Figured out I need to mark my tests as 'syntax'.

eb07f86

paul-ollis force-pushed the textarea-speedup-2 branch from ef989ff to aa040db Compare April 28, 2025 18:04

Run tree_sitter.parse with whole text.

154483a

Drop 3.9 from supported tree-sitter Python versions. This allows predicates in SCM files to work.

paul-ollis force-pushed the textarea-speedup-2 branch from aa040db to 154483a Compare May 1, 2025 07:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Background tree-sitter incremental parsing to improve TextArea responsiveness. #5645

Background tree-sitter incremental parsing to improve TextArea responsiveness. #5645

Uh oh!

paul-ollis commented Mar 13, 2025 •

edited

Loading

Uh oh!

paul-ollis commented Mar 13, 2025

Uh oh!

paul-ollis commented Mar 13, 2025

Uh oh!

paul-ollis commented Mar 14, 2025

Uh oh!

darrenburns commented Mar 14, 2025

Uh oh!

paul-ollis commented Apr 1, 2025

Uh oh!

paul-ollis commented Apr 3, 2025

Uh oh!

paul-ollis commented Apr 3, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Background tree-sitter incremental parsing to improve TextArea responsiveness. #5645

Are you sure you want to change the base?

Background tree-sitter incremental parsing to improve TextArea responsiveness. #5645

Uh oh!

Conversation

paul-ollis commented Mar 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

paul-ollis commented Mar 13, 2025

Uh oh!

paul-ollis commented Mar 13, 2025

Uh oh!

paul-ollis commented Mar 14, 2025

Uh oh!

darrenburns commented Mar 14, 2025

Uh oh!

paul-ollis commented Apr 1, 2025

Uh oh!

paul-ollis commented Apr 3, 2025

Uh oh!

paul-ollis commented Apr 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

How my changes work

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

paul-ollis commented Mar 13, 2025 •

edited

Loading

paul-ollis commented Apr 3, 2025 •

edited

Loading