Skip to content

Sourcemap improvements #23741

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 25 commits into from
Mar 13, 2025
Merged

Conversation

lvlte
Copy link
Contributor

@lvlte lvlte commented Feb 24, 2025

Provide source map settings to emcc (resolves #23686, resolves #22189) :

  • for embedding the sources content in the source map : -gsource-map=inline.
  • for applying path prefix substitution : -sSOURCE_MAP_PREFIXES=["<old>=<new>"].

Update documentation accordingly.

Fix source file resolver :

  • Always fall back to the given filepath if prefix not provided or doesn't match.
  • Fix source content loading when no --load-prefix is given/needed.
  • Fix relative path in "sources" field when--prefix is given but doesn't match the given file.
  • Cache filepaths with no/unmatched prefix as well.
  • Resolve deterministic prefix when loading source content (related: Resolve synthetic compilation dir /emsdk/emscripten to obtain actual path for better source map #20779).
  • Don't emit relative paths for sources with a deterministic prefix.

Improve existing test for wasm-sourcemap.py :

  • Parameterize test_wasm_sourcemap() and do proper checks according to the combinations of options given.
  • Fix regex for checking the "mappings" field (was checking only the first character).

Add test for emcc covering the basic use cases where :

  • no option is given (users do path substitution as needed via their client or server configuration).
  • different prefixes are provided for source files and emscripten dependencies.
  • source content is embedded in the sourcemap.

- Always fall back to the given filepath if prefix not provided or doesn't match.
- Fix source content loading when no `--load-prefix` is given/needed.
- Fix relative path in "sources" field when`--prefix` is given but doesn't match the given file.
- Cache filepaths with no/unmatched prefix as well.
- Parameterize `test_wasm_sourcemap()` and do proper checks according to the combinations of options given.
- Fix regex for checking the  "mappings" field (was checking only the first character).
This allows to provide sourcemap settings to emcc via -s options. Each setting maps to an option used in tools/wasm-sourcemap.py :
- `INLINE_SOURCES` => `--sources`
- `SOURCE_MAP_PREFIXES` => `--prefix`
- `INLINE_SOURCES_PREFIXES` => `--load-prefix`
This covers the basic use cases where :
- no option is given (users do path substitution as needed via their client or server configuration)
- different prefixes are provided for source files and emscripten dependencies
- source content is embedded in the sourcemap
I realized while doing tests that we won't ever need to provide this option (along with `INLINE_SOURCES` for loading sources content in the sourcemap), since all sources are known by emcc and can be resolved properly, including paths with deterministic prefix.
Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks awesome, thanks for working on it!

The only part that gives me some hesitation is that addition of two new settings. If possible we try to avoid adding new settings but this might be a case when they are needed.

I wonder if INLINE_SOURCES could be replaced with -gsource-map=inline?

I also wonder if this is really two different uses? How common do you thing the embedding of sources will be compared to the use of prefixes? What kind of use case fo you see for both of them? Could we split this PR into two for these two features perhaps?

(Also, thanks for fixing the typo, we can probably land that as its own cleanup PR right away).

@lvlte
Copy link
Contributor Author

lvlte commented Feb 24, 2025

I wonder if INLINE_SOURCES could be replaced with -gsource-map=inline?

Sure, I chose the -s option because it was easier to implement. I will do that tomorrow.

I also wonder if this is really two different uses?

Yes, if we embed the sources content, we don't need to use path substitution, and conversely if we use prefixes it implies we don't have inline content.

How common do you thing the embedding of sources will be compared to the use of prefixes?

I tried both :

  • embedding is easier because it works out of the box now that the resolver is fixed.
  • using prefixes might be preferred for large projects but requires some tweaking, it works fine just like if we do path substitution via server or client configuration.

What kind of use case fo you see for both of them?

The tests might covers some combinations since the options are not mutually exclusive, but yes using both doesn't make sense.

@lvlte
Copy link
Contributor Author

lvlte commented Feb 24, 2025

Could we split this PR into two for these two features perhaps?

I would prefer not honestly. Both features require the path resolver fix to work properly, which represents most of the code change (if we ignore the tests). The tests also share the same code for different options.

@sbc100
Copy link
Collaborator

sbc100 commented Feb 24, 2025

Could we split this PR into two for these two features perhaps?

I would prefer not honestly. Both features require the path resolver fix to work properly, which represents most of the code change (if we ignore the tests). The tests also share the same code for different options.

Fair enough. I think we can land this as one change then, once we have finished bikeshedding the settings naming.

BTW, gcc and clang already have -ffile-prefix-map=old=new.. i wonder if we could re-use that as a linker flag here instead of adding a new setting? Maybe too magical?

@lvlte
Copy link
Contributor Author

lvlte commented Feb 25, 2025

I'm not sure about -ffile-prefix-map=old=new (to be honest my understanding of gcc/clang is very limited, and I'm not sure to understand what you mean by reusing it as a linker flag).

But I noticed we use that to replace the emscripten path with the synthetic path here :

source_dir = utils.path_from_root()
relative_source_dir = os.path.relpath(source_dir, self.build_dir)
cflags += [f'-ffile-prefix-map={source_dir}={DETERMINISITIC_PREFIX}',
f'-ffile-prefix-map={relative_source_dir}={DETERMINISITIC_PREFIX}',
f'-fdebug-compilation-dir={DETERMINISITIC_PREFIX}']

and, for example when SDL is required, I noticed most of the lib files that are cached won't have their path prefix replaced. If I provide mappings to emcc via -ffile-prefix-map, it won't work either for most of the lib files (cached ports and other lib files).

@sbc100 sbc100 requested a review from dschuff February 25, 2025 19:21
@sbc100
Copy link
Collaborator

sbc100 commented Feb 25, 2025

@dschuff @kripken WDYT?

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Please document -gsource-map=inline option in emcc --help, which is in site/source/docs/tools_reference/emcc.rst. May as well also mention that SOURCE_MAP_PREFIXES may be useful, there.

@dschuff
Copy link
Member

dschuff commented Feb 25, 2025

Sorry I'm late to this party, I have a few questions. Mostly I'm concerned about making the source map behavior as close to or consistent with the DWARF behavior as possible, and as deterministic as possible (aside of course from the usual considerations already mentioned). I think -gsource-map=inline is fine. For flags, I'm wondering if it might it make sense to get the desired behavior using one of the DWARF flags (e.g. -ffile-prefix-map/-fdebug-prefix-map/-fdebug-compilation-dir). These change the output that is used to create the source map, but we could potentially even use them to also control the behavior of the source map generator as well.

@sbc100
Copy link
Collaborator

sbc100 commented Feb 26, 2025

Sorry I'm late to this party, I have a few questions. Mostly I'm concerned about making the source map behavior as close to or consistent with the DWARF behavior as possible, and as deterministic as possible (aside of course from the usual considerations already mentioned). I think -gsource-map=inline is fine. For flags, I'm wondering if it might it make sense to get the desired behavior using one of the DWARF flags (e.g. -ffile-prefix-map/-fdebug-prefix-map/-fdebug-compilation-dir). These change the output that is used to create the source map, but we could potentially even use them to also control the behavior of the source map generator as well.

I think that is what I was suggesting in #23741 (comment). i'm not sure about extending the meaning of those existing flags (which I think today are no link time, but only compile time flags). Maybe its a good idea? Maybe too magical?

@sbc100
Copy link
Collaborator

sbc100 commented Feb 26, 2025

@lvlte I'm curious what you think main use case for SOURCE_MAP_PREFIXES might be? How are you planning on using it, for example?

@lvlte
Copy link
Contributor Author

lvlte commented Feb 26, 2025

The usage would be exactly the same as for the --prefix option in tools/wasm-sourcemap.py, one flag for any number of prefix replacements, for example :

emcc ... -sSOURCE_MAP_PREFIXES=["/emsdk/emscripten/=scheme://host/path/to/emscripten/", "=scheme://host/"]

How would users obtain the same output using -ffile-prefix-map/-fdebug-prefix-map/-fdebug-compilation-dir exactly ?

Speaking about this, anyone knows why most of the cached lib files don't have a deterministic prefix ?

@sbc100
Copy link
Collaborator

sbc100 commented Feb 26, 2025

The usage would be exactly the same as for the --prefix option in tools/wasm-sourcemap.py,

But we don't currently use that flag do we? I guess I'm curious how folks have been using source maps without this, assuming its necessary? Or have source maps been broken since we switch to the using a deterministic prefix?

one flag for any number of prefix replacements, for example :

emcc ... -sSOURCE_MAP_PREFIXES=["/emsdk/emscripten/=scheme://host/path/to/emscripten/", "=scheme://host/"]

How would users obtain the same output using -ffile-prefix-map/-fdebug-prefix-map/-fdebug-compilation-dir exactly ?

I think the idea would be that you would write -ffile-prefix-map=/emsdk/emscripten/=scheme://host/path/to/emscripten at link time instead of -sSOURCE_MAP_PREFIXES=.. but it would have the same effect. And since -ffile-prefix-map= is already a known compiler flag we wouldn't be adding any new flag. However, emcc doesn't currently parse this flag (it just passes it directly to clang). Its also not clear if re-using this existing flag makes sense.

Speaking about this, anyone knows why most of the cached lib files don't have a deterministic prefix ?

They should all use the prefix I think. Can you give an example of one that does not using the prefix?

@sbc100
Copy link
Collaborator

sbc100 commented Mar 4, 2025

I think maybe we should a ChangeLog entry for this?

@sbc100 sbc100 enabled auto-merge (squash) March 4, 2025 17:59
auto-merge was automatically disabled March 6, 2025 15:38

Head branch was pushed to by a user without write access

@sbc100
Copy link
Collaborator

sbc100 commented Mar 12, 2025

Sorry, can you resolve conflict and merge/rebase main one more time?

@lvlte
Copy link
Contributor Author

lvlte commented Mar 12, 2025

No problem, it should be ok now :)

@sbc100 sbc100 enabled auto-merge (squash) March 12, 2025 19:53
@lvlte
Copy link
Contributor Author

lvlte commented Mar 13, 2025

Oops, I realize now that version 4.0.5 was already released.. I'm gonna update the changelog accordingly. Sorry you will need to re-enable auto-merge again.

auto-merge was automatically disabled March 13, 2025 12:13

Head branch was pushed to by a user without write access

@sbc100 sbc100 merged commit c70b9ff into emscripten-core:main Mar 13, 2025
27 of 28 checks passed
JoeOsborn pushed a commit to JoeOsborn/emscripten that referenced this pull request Mar 14, 2025
Provide source map settings to emcc (resolves emscripten-core#23686, resolves emscripten-core#22189) :
- for embedding the sources content in the source map :
`-gsource-map=inline`.
- for applying path prefix substitution :
`-sSOURCE_MAP_PREFIXES=["<old>=<new>"]`.

Update documentation accordingly. 

Fix source file resolver : 
- Always fall back to the given filepath if prefix not provided or
doesn't match.
- Fix source content loading when no `--load-prefix` is given/needed.
- Fix relative path in "sources" field when`--prefix` is given but
doesn't match the given file.
- Cache filepaths with no/unmatched prefix as well.
- Resolve deterministic prefix when loading source content (related:
emscripten-core#20779).
- Don't emit relative paths for sources with a deterministic prefix.

Improve existing test for wasm-sourcemap.py :
- Parameterize `test_wasm_sourcemap()` and do proper checks according to
the combinations of options given.
- Fix regex for checking the "mappings" field (was checking only the
first character).

Add test for emcc covering the basic use cases where :
- no option is given (users do path substitution as needed via their
client or server configuration).
- different prefixes are provided for source files and emscripten
dependencies.
- source content is embedded in the sourcemap.
@sbc100
Copy link
Collaborator

sbc100 commented Mar 14, 2025

Looks like this caused a failure on the windows emscripten-releases builder:

https://ci.chromium.org/ui/p/emscripten-releases/builders/try/win/b8720422682063060497/overview


======================================================================
FAIL: test_emcc_sourcemap_options_prefix (test_other.other)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\chrome-bot\AppData\Local\vpython-root.0\store\cpython+dunqimq8p43kdruh73ujf8fpl8\contents\bin\Lib\unittest\case.py", line 60, in testPartExecutor
    yield
  File "C:\Users\chrome-bot\AppData\Local\vpython-root.0\store\cpython+dunqimq8p43kdruh73ujf8fpl8\contents\bin\Lib\unittest\case.py", line 676, in run
    self._callTestMethod(testMethod)
  File "C:\Users\chrome-bot\AppData\Local\vpython-root.0\store\cpython+dunqimq8p43kdruh73ujf8fpl8\contents\bin\Lib\unittest\case.py", line 633, in _callTestMethod
    method()
  File "C:\b\s\w\ir\x\w\install\emscripten\test\common.py", line 895, in resulting_test
    return func(self, *args)
  File "C:\b\s\w\ir\x\w\install\emscripten\test\test_other.py", line 10495, in test_emcc_sourcemap_options
    self.assertEqual(src_file_url, 'file:///path/to/src/hello_123.c')
  File "C:\Users\chrome-bot\AppData\Local\vpython-root.0\store\cpython+dunqimq8p43kdruh73ujf8fpl8\contents\bin\Lib\unittest\case.py", line 912, in assertEqual
    assertion_func(first, second, msg=msg)
  File "C:\Users\chrome-bot\AppData\Local\vpython-root.0\store\cpython+dunqimq8p43kdruh73ujf8fpl8\contents\bin\Lib\unittest\case.py", line 1292, in assertMultiLineEqual
    self.fail(self._formatMessage(msg, standardMsg))
  File "C:\Users\chrome-bot\AppData\Local\vpython-root.0\store\cpython+dunqimq8p43kdruh73ujf8fpl8\contents\bin\Lib\unittest\case.py", line 753, in fail
    raise self.failureException(msg)
AssertionError: 'hello_123.c' != 'file:///path/to/src/hello_123.c'
- hello_123.c
+ file:///path/to/src/hello_123.c

@lvlte
Copy link
Contributor Author

lvlte commented Mar 14, 2025

Ok I will look into it tomorrow.

@sbc100
Copy link
Collaborator

sbc100 commented Mar 14, 2025

Probably just some kind of windows pathname issue.

sbc100 added a commit to sbc100/emscripten that referenced this pull request Mar 14, 2025
This test was recently add in emscripten-core#23741 but currently fails on windows.
sbc100 added a commit that referenced this pull request Mar 14, 2025
This test was recently add in #23741 but currently fails on windows.
sbc100 pushed a commit that referenced this pull request Mar 15, 2025
Normalize prefixes instead of expecting windows users to do it. 

See
#23741 (comment)

The tests now pass on windows.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow to pass more sourcemap options to emcc Embed source files in source map file
4 participants