@maettuu maettuu commented Aug 6, 2025

This test was automatically generated and could serve as a regression test for PR #19184. The added test:

  • fails on the codebase prior to the PR
  • passes on the codebase after the PR
  • still passes against today’s pdf.js master branch

It verifies that ToUnicodeMap maps CJK Extension B characters, which JavaScript strings store as surrogate pairs, to their full Unicode code points by using codePointAt.
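
For context, the snippet below is a minimal sketch of why codePointAt matters for such characters. It is not taken from the PR or from pdf.js, and the variable names are purely illustrative. It uses U+20000, the same code point the test expects, and compares what charCodeAt and codePointAt report for it.

```javascript
// U+20000 is the first code point of CJK Unified Ideographs Extension B.
const ch = String.fromCodePoint(0x20000);

// In a JavaScript (UTF-16) string the character takes two code units,
// i.e. a surrogate pair.
const length = ch.length;

// charCodeAt(0) only sees the first code unit, the high surrogate.
const highSurrogate = ch.charCodeAt(0);

// codePointAt(0) combines both surrogates into the full code point.
const codePoint = ch.codePointAt(0);

console.log(length, highSurrogate.toString(16), codePoint.toString(16));
// prints: 2 d840 20000
```

A map that reports only the first code unit would therefore yield 0xD840 rather than 0x20000 for this character, which is the kind of mismatch a regression test like this one is meant to catch.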

This is part of our research at the ZEST group of University of Zurich in collaboration with Mozilla.
If you have any suggestions, questions, or simply want to learn more, feel free to contact us at [email protected] and [email protected].

Verifies that ToUnicodeMap correctly maps Extension B characters to their full Unicode code points using codePointAt

See PR mozilla#19184
@@ -0,0 +1,34 @@
/* Copyright 2022 Mozilla Foundation

s/2022/2025

const expected = 0x20000; // Unicode code point for the character
let actual;
toUnicodeMap.forEach((charCode, unicode) => {
if (charCode === "32") {

We could also use 0x20.toString() here to make it a little bit clearer where the value comes from and that it matches the key in the cmap variable. That'd also allow for removing the comment below.
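
A small sketch of what this suggestion relies on, assuming the test's cmap is keyed by the numeric char code 0x20. The `cmap` object here is a hypothetical stand-in, not the one from the PR. Numeric object keys are stored as decimal strings, so `0x20.toString()` produces the same `"32"` that the callback compares against.

```javascript
// Hypothetical stand-in for the test's cmap: char code 0x20 mapped to U+20000.
const cmap = { 0x20: String.fromCodePoint(0x20000) };

// A hex literal can be followed directly by a method call;
// toString() defaults to base 10.
const key = 0x20.toString();

// Property keys written as numbers are stored as decimal strings.
const storedKey = Object.keys(cmap)[0];

console.log(key, storedKey, key === storedKey);
// prints: 32 32 true
```

Writing the comparison this way keeps the hexadecimal value visible at both the definition and the lookup, so no explanatory comment is needed to connect `"32"` to `0x20`.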
