Use ESP32-S3 vector extensions for LUT processing and diffing #281
@martinberlin @schuhumi would you mind trying the branch and checking whether things work for you?
@vroland I got it to work! Only my waveform has lots of ghosting now, I'll need to tweak it again. Does the Also, are there any speed gains to be expected from using |
My first test was to take my 9.7" proto-board with v7 and update it to this branch. The only change I made was to switch: And PSRAM speed to 120 MHz. Flashing the dragon example, I got this via serial:
Switching back to a PSRAM speed of 80 MHz I got exactly the same error. It seems it is not the PSRAM speed that is triggering this.
@schuhumi It should only be used when drawing is initiated, I can have a look again. @martinberlin Indeed, looks like there is some alignment problem with some display resolutions, I'll look into it :) |
New test with a 6" display:
But independent of the PSRAM speed, I cannot see any better performance with this display. Or is the time difference only noticeable when you update an existing framebuffer with a new image?
@martinberlin Sorry for the late response, life had me quite busy. Initially I only tried to get it to work and did not run any benchmarks. But you got me curious, so I just had a closer look. I wrapped my draw_base call this way:
// esp_timer_get_time() returns microseconds; divide by 1000 to get milliseconds
uint32_t t1 = esp_timer_get_time() / 1000;
epd_draw_base(
    epd_full_screen(),
    fb,
    epd_full_screen(),
    MODE_DU | MODE_PACKING_1PPB_DIFFERENCE,
    temperature,
    NULL, // drawn_lines,
    NULL, // drawn columns (only when testing vector extensions)
    epd_get_display()->default_waveform
);
uint32_t t2 = esp_timer_get_time() / 1000;
// cast so the format specifier matches the argument type
printf("[<With/No> vector extensions] actual draw took %lums.\n", (unsigned long)(t2 - t1));
For a fair comparison I made sure that in both cases:
Without vector extensions:
With vector extensions:
So for me it's hardly any difference too... (this is on the ED133UT2 display) |
Hello @vroland Test in main branch: Test in vector-extension branch: It is 20 ms faster, which is about a 5% speed increase. It would be interesting to test it with a bigger epaper as well. Nice optimization! Now on to testing with other display sizes: I'm afraid this check won't work for color epapers like WT-F DES or Eink Kaleido:
EpdRect epd_difference_image_base(
    int fb_width, [...] ) {
    printf("fb_width:%d\n", fb_width);
    assert(fb_width % 16 == 0); // --> Not all display widths are multiples of 16
Two examples are the last 2 definitions you can find in the s3_color_implementation branch: display GDEW101C01: 2232 modulo 16 = 8
Hi, yes I'm back and un-jetlagged again ;) The branch should now work with the 9.7" display. Afaik, displays with width % 16 != 0 never really worked before either. I think as a workaround we have to virtually increase resolution. I have no such display to test though, with your color display it just worked? @schuhumi @martinberlin Regarding the speed: With the LCD peripheral the output speed is fixed, and the computation has to keep up with whatever is set. To actually see a faster speed you have to increase the bus speed by calling |
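The "virtually increase resolution" workaround mentioned above can be illustrated with a small sketch. It only shows the arithmetic of padding a width to the next multiple of 16; the helper name `align_up` is hypothetical and not part of epdiy, and how the padding pixels would be handled by the driver is not covered here.

```c
#include <assert.h>

/* Hypothetical helper (not an epdiy function): round `width` up to the
 * next multiple of `align`. `align` must be a power of two. */
static int align_up(int width, int align) {
    assert(align > 0 && (align & (align - 1)) == 0);
    return (width + align - 1) & ~(align - 1);
}

/* Number of padding pixels needed to reach the aligned width. */
static int padding_pixels(int width, int align) {
    return align_up(width, align) - width;
}
```

For the GDEW101C01 width from this thread, `align_up(2232, 16)` gives 2240, i.e. 8 padding pixels per row; a width that is already a multiple of 16, such as 1872, is returned unchanged.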
Hi @vroland great, I will try to increase the speed.
About this: I can confirm that the 2 models mentioned, whose width is in fact not a multiple of 16, work perfectly with the main branch but do not work with this PR branch. Tested with the GDEW101C01, which has a row width of 2232 pixels.
Small review to optimize dirty lines, plus a warning that not all panel widths are divisible by 16
Oh I see, now that's impressive! I was able to go from 23MHz to 30, at 32 I sometimes get line buffer underruns. That decreases the time for epd_draw_base from 95ms to ~70ms! |
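As a rough sanity check on the numbers above, the sketch below computes the relative reduction in draw time and the time one would expect if the draw were purely limited by the bus clock. The inverse-proportional scaling is an assumption made for illustration, and both function names are hypothetical.

```c
#include <assert.h>

/* Percentage by which `after_ms` is shorter than `before_ms`. */
static double time_reduction_percent(double before_ms, double after_ms) {
    assert(before_ms > 0.0);
    return 100.0 * (before_ms - after_ms) / before_ms;
}

/* Expected draw time after a bus clock change, ASSUMING the draw time
 * scales inversely with the bus clock (purely bus-bound output). */
static double scaled_time_ms(double before_ms, double old_mhz, double new_mhz) {
    assert(old_mhz > 0.0 && new_mhz > 0.0);
    return before_ms * old_mhz / new_mhz;
}
```

Going from 95 ms to about 70 ms is a reduction of roughly 26%. Under the bus-bound assumption, raising the clock from 23 MHz to 30 MHz would predict about 73 ms, which is close to the reported figure and consistent with the output speed being set by the bus clock rather than by the computation.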
@martinberlin I added support for unaligned diffing and LUT lookup, i.e., all displays with width % 8 == 0 should now work. Can you test again? |
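One common way to support line lengths that are not a multiple of the vector block size is a block-wise main loop followed by a scalar tail. The sketch below shows that pattern in portable C for a simple dirty-line check. It is not the code from this PR: the function names and the 16-byte block size are assumptions, and `memcmp` stands in for what a vector comparison would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES 16 /* assumed block size, e.g. one 128-bit vector register */

/* Returns true if the two lines differ. `len` need not be a multiple of
 * BLOCK_BYTES. */
static bool line_differs(const uint8_t *from, const uint8_t *to, size_t len) {
    size_t i = 0;
    /* Main loop: whole blocks, the part a vector implementation would cover. */
    for (; i + BLOCK_BYTES <= len; i += BLOCK_BYTES) {
        if (memcmp(from + i, to + i, BLOCK_BYTES) != 0) {
            return true;
        }
    }
    /* Tail: leftover bytes that do not fill a whole block. */
    for (; i < len; i++) {
        if (from[i] != to[i]) {
            return true;
        }
    }
    return false;
}

/* Sets dirty_lines[y] for every line that changed; returns how many did. */
static int mark_dirty_lines(const uint8_t *from, const uint8_t *to,
                            int width_bytes, int height, bool *dirty_lines) {
    assert(width_bytes > 0 && height >= 0);
    int count = 0;
    for (int y = 0; y < height; y++) {
        size_t off = (size_t)y * (size_t)width_bytes;
        dirty_lines[y] = line_differs(from + off, to + off, (size_t)width_bytes);
        count += dirty_lines[y] ? 1 : 0;
    }
    return count;
}
```

With lines of 20 bytes, for example, bytes 0 to 15 go through the block loop and bytes 16 to 19 through the tail, so a change in byte 18 is still detected.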
This looks very good @vroland |
Testing with this display, Kaleido 6":
const EpdDisplay_t EC060KH5 = {
    .width = 1448, .height = 1072,
    .bus_width = 8, .bus_speed = 20,
    .default_waveform = &epdiy_ED097TC2,
    .display_type = DISPLAY_TYPE_GENERIC,
    .display_color_filter = DISPLAY_CFA_KALEIDO
};
Using the main branch, these are the timings to draw the dragon: (Note that I have a big dragon, 1600x1100, since I made it to test 13.3" displays.) I have now merged the vector branch into my s3_color_implementation so I can later test color as well and rule out any problem with widths where width % 8 == 0.
epdiy: Using optimized vector implementation on the ESP32-S3, only 1k of 65536 LUT in use!
The speed is actually quite similar, but now I can tune up the display clock. In both cases PSRAM is at 120 MHz. The display can now be set to 30 MHz, which is actually what is on the datasheet (I think):
epdiy: diff: 36ms, draw: 450ms, buffer update: 17ms, total: 503ms
Now the total time is 200 ms faster. I will do some additional color tests later to confirm that the color part is still working as expected. |
Nice, thanks for testing! Once you approve I'll merge. |
Awesome just checking file by file before approving. Keep this URL to test with your DES color epaper (also %8 width) |
Great performance improvement
Adds optimized versions for 1bpp difference LUT lookup, high-level framebuffer diffing, and output line masking.
Combined with 120MHz PSRAM (activate in the experimental options), we now get sub-second updates for a 1872x1404 display using epdiy V7.