
Commit c70b9ff

Sourcemap improvements (#23741)
Provide source map settings to emcc (resolves #23686, resolves #22189):
- for embedding the sources content in the source map: `-gsource-map=inline`.
- for applying path prefix substitution: `-sSOURCE_MAP_PREFIXES=["<old>=<new>"]`.

Update documentation accordingly.

Fix source file resolver:
- Always fall back to the given filepath if prefix not provided or doesn't match.
- Fix source content loading when no `--load-prefix` is given/needed.
- Fix relative path in "sources" field when `--prefix` is given but doesn't match the given file.
- Cache filepaths with no/unmatched prefix as well.
- Resolve deterministic prefix when loading source content (related: #20779).
- Don't emit relative paths for sources with a deterministic prefix.

Improve existing test for wasm-sourcemap.py:
- Parameterize `test_wasm_sourcemap()` and do proper checks according to the combinations of options given.
- Fix regex for checking the "mappings" field (was checking only the first character).

Add test for emcc covering the basic use cases where:
- no option is given (users do path substitution as needed via their client or server configuration).
- different prefixes are provided for source files and emscripten dependencies.
- source content is embedded in the sourcemap.
1 parent 4cbb47d commit c70b9ff

10 files changed: +184 -55 lines changed


ChangeLog.md

Lines changed: 4 additions & 0 deletions
@@ -20,6 +20,10 @@ See docs/process.md for more on how version tagging works.

 4.0.6 (in development)
 ----------------------
+- Added support for applying path prefix substitution to the sources of the
+  source map : use `-sSOURCE_MAP_PREFIXES=["<old>=<new>"]` with `-gsource-map`.
+  Alternatively, you can now embed the sources content into the source map file
+  using `-gsource-map=inline`. (#23741)

 4.0.5 - 03/12/25
 ----------------

docs/emcc.txt

Lines changed: 15 additions & 7 deletions
@@ -181,15 +181,23 @@ Options that are modified or new in *emcc* are listed below:
    alongside the wasm object files. This option must be used together
    with "-c".

-"-gsource-map"
+"-gsource-map[=inline]"
    [link] Generate a source map using LLVM debug information (which
    must be present in object files, i.e., they should have been
-   compiled with "-g"). When this option is provided, the **.wasm**
-   file is updated to have a "sourceMappingURL" section. The resulting
-   URL will have format: "<base-url>" + "<wasm-file-name>" + ".map".
-   "<base-url>" defaults to being empty (which means the source map is
-   served from the same directory as the Wasm file). It can be changed
-   using --source-map-base.
+   compiled with "-g").
+
+   When this option is provided, the **.wasm** file is updated to have
+   a "sourceMappingURL" section. The resulting URL will have format:
+   "<base-url>" + "<wasm-file-name>" + ".map". "<base-url>" defaults
+   to being empty (which means the source map is served from the same
+   directory as the Wasm file). It can be changed using --source-map-
+   base.
+
+   Path substitution can be applied to the referenced sources using
+   the "-sSOURCE_MAP_PREFIXES" (link). If "inline" is specified, the
+   sources content is embedded in the source map (in this case you
+   don't need path substitution, but it comes with the cost of having
+   a large source map file).

 "-g<level>"
    [compile+link] Controls the level of debuggability. Each level
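The URL format described in this option, and the way the URL is stored in the `sourceMappingURL` custom section (as asserted by `test_check_sourcemapurl_default` in test/test_other.py), can be sketched roughly as follows. This is an illustration only: `source_map_url` and `section_payload` are hypothetical helper names, and the local `to_leb` is a stand-in written under the assumption that `webassembly.to_leb` performs unsigned LEB128 encoding.

```python
def to_leb(value):
    """Encode a non-negative integer as unsigned LEB128.

    Assumption: this mirrors what tools/webassembly.to_leb does.
    """
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def source_map_url(wasm_file_name, base_url=''):
    """<base-url> + <wasm-file-name> + '.map' (hypothetical helper)."""
    return base_url + wasm_file_name + '.map'


def section_payload(url):
    """Length-prefixed section name followed by the length-prefixed URL."""
    name = b'sourceMappingURL'
    data = url.encode('utf-8')
    return to_leb(len(name)) + name + to_leb(len(data)) + data


if __name__ == '__main__':
    print(section_payload(source_map_url('a.wasm')))
```

With an empty base URL the map is expected next to the Wasm file; passing a non-empty `base_url` only changes the string stored in the section.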

emcc.py

Lines changed: 2 additions & 2 deletions
@@ -1196,8 +1196,8 @@ def consume_arg_file():
           else:
             settings.SEPARATE_DWARF = True
           settings.GENERATE_DWARF = 1
-        elif requested_level == 'source-map':
-          settings.GENERATE_SOURCE_MAP = 1
+        elif requested_level in ['source-map', 'source-map=inline']:
+          settings.GENERATE_SOURCE_MAP = 1 if requested_level == 'source-map' else 2
           settings.EMIT_NAME_SECTION = 1
           newargs[i] = '-g'
         else:
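The mapping from the `-g` argument to the internal `GENERATE_SOURCE_MAP` value can be sketched in isolation as below. This is a minimal illustration, not emcc's actual argument parsing, which handles many other `-g` levels; `source_map_level` and `parse_g_flag` are hypothetical names, and returning 0 for anything else is an assumption based on the new default in src/settings_internal.js.

```python
def source_map_level(requested_level):
    """Return 1 for 'source-map', 2 for 'source-map=inline', otherwise 0."""
    if requested_level == 'source-map':
        return 1
    if requested_level == 'source-map=inline':
        return 2
    return 0  # assumed default: no source map requested


def parse_g_flag(arg):
    """Strip the leading '-g' and classify the remainder."""
    if not arg.startswith('-g'):
        raise ValueError('not a -g flag: ' + arg)
    return source_map_level(arg[len('-g'):])


if __name__ == '__main__':
    for flag in ['-gsource-map', '-gsource-map=inline', '-g3']:
        print(flag, '->', parse_g_flag(flag))
```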

site/source/docs/tools_reference/emcc.rst

Lines changed: 8 additions & 1 deletion
@@ -172,16 +172,23 @@ Options that are modified or new in *emcc* are listed below:

 .. _emcc-gsource-map:

-``-gsource-map``
+``-gsource-map[=inline]``
   [link]
   Generate a source map using LLVM debug information (which must
   be present in object files, i.e., they should have been compiled with ``-g``).
+
   When this option is provided, the **.wasm** file is updated to have a
   ``sourceMappingURL`` section. The resulting URL will have format:
   ``<base-url>`` + ``<wasm-file-name>`` + ``.map``. ``<base-url>`` defaults
   to being empty (which means the source map is served from the same directory
   as the Wasm file). It can be changed using :ref:`--source-map-base <emcc-source-map-base>`.

+  Path substitution can be applied to the referenced sources using the
+  ``-sSOURCE_MAP_PREFIXES`` (:ref:`link <source_map_prefixes>`).
+  If ``inline`` is specified, the sources content is embedded in the source map
+  (in this case you don't need path substitution, but it comes with the cost of
+  having a large source map file).
+
 .. _emcc-gN:

 ``-g<level>``

site/source/docs/tools_reference/settings_reference.rst

Lines changed: 15 additions & 0 deletions
@@ -3131,6 +3131,21 @@ This is enabled automatically when using -gsource-map with sanitizers.

 Default value: false

+.. _source_map_prefixes:
+
+SOURCE_MAP_PREFIXES
+===================
+
+List of path substitutions to apply in the "sources" field of the source map.
+Corresponds to the ``--prefix`` option used in ``tools/wasm-sourcemap.py``.
+Must be used with ``-gsource-map``.
+
+This setting allows to map path prefixes to the proper ones so that the final
+(possibly relative) URLs point to the correct locations :
+``-sSOURCE_MAP_PREFIXES=/old/path=/new/path``
+
+Default value: []
+
 .. _default_to_cxx:

 DEFAULT_TO_CXX

src/settings.js

Lines changed: 11 additions & 0 deletions
@@ -2046,6 +2046,17 @@ var USE_OFFSET_CONVERTER = false;
 // This is enabled automatically when using -gsource-map with sanitizers.
 var LOAD_SOURCE_MAP = false;

+// List of path substitutions to apply in the "sources" field of the source map.
+// Corresponds to the ``--prefix`` option used in ``tools/wasm-sourcemap.py``.
+// Must be used with ``-gsource-map``.
+//
+// This setting allows to map path prefixes to the proper ones so that the final
+// (possibly relative) URLs point to the correct locations :
+// ``-sSOURCE_MAP_PREFIXES=/old/path=/new/path``
+//
+// [link]
+var SOURCE_MAP_PREFIXES = [];
+
 // Default to c++ mode even when run as ``emcc`` rather than ``emc++``.
 // When this is disabled ``em++`` is required linking C++ programs. Disabling
 // this will match the behaviour of gcc/g++ and clang/clang++.

src/settings_internal.js

Lines changed: 4 additions & 1 deletion
@@ -204,7 +204,10 @@ var USE_READY_PROMISE = true;
 // If true, building against Emscripten's wasm heap memory profiler.
 var MEMORYPROFILER = false;

-var GENERATE_SOURCE_MAP = false;
+// Set automatically to :
+// - 1 when using `-gsource-map`
+// - 2 when using `-gsource-map=inline` (embed sources content in source map)
+var GENERATE_SOURCE_MAP = 0;

 var GENERATE_DWARF = false;

test/test_other.py

Lines changed: 71 additions & 11 deletions
@@ -50,6 +50,7 @@
 import line_endings
 from tools import webassembly
 from tools.settings import settings
+from tools.system_libs import DETERMINISTIC_PREFIX

 scons_path = shutil.which('scons')
 emmake = shared.bat_suffix(path_from_root('emmake'))
@@ -10432,25 +10433,84 @@ def test_check_sourcemapurl_default(self, *args):
     source_mapping_url_content = webassembly.to_leb(len('sourceMappingURL')) + b'sourceMappingURL' + webassembly.to_leb(len('a.wasm.map')) + b'a.wasm.map'
     self.assertIn(source_mapping_url_content, output)

-  def test_wasm_sourcemap(self):
-    # The no_main.c will be read (from relative location) due to specified "-s"
+  @parameterized({
+    '': ([], [], []),
+    'prefix_wildcard': ([], ['--prefix', '=wasm-src://'], []),
+    'prefix_partial': ([], ['--prefix', '/emscripten/=wasm-src:///emscripten/'], []),
+    'sources': (['--sources'], [], ['--load-prefix', '/emscripten/test/other/wasm_sourcemap=.'])
+  })
+  @parameterized({
+    '': ('/',),
+    'basepath': ('/emscripten/test',)
+  })
+  def test_wasm_sourcemap(self, sources, prefix, load_prefix, basepath):
+    # The no_main.c will be read from relative location if necessary (depends
+    # on --sources and --load-prefix options).
     shutil.copy(test_file('other/wasm_sourcemap/no_main.c'), '.')
+    DW_AT_decl_file = '/emscripten/test/other/wasm_sourcemap/no_main.c'
     wasm_map_cmd = [PYTHON, path_from_root('tools/wasm-sourcemap.py'),
-                    '--sources', '--prefix', '=wasm-src://',
-                    '--load-prefix', '/emscripten/test/other/wasm_sourcemap=.',
+                    *sources, *prefix, *load_prefix,
                     '--dwarfdump-output',
                     test_file('other/wasm_sourcemap/foo.wasm.dump'),
                     '-o', 'a.out.wasm.map',
                     test_file('other/wasm_sourcemap/foo.wasm'),
-                    '--basepath=' + os.getcwd()]
+                    '--basepath=' + basepath]
     self.run_process(wasm_map_cmd)
     output = read_file('a.out.wasm.map')
-    # has "sources" entry with file (includes also `--prefix =wasm-src:///` replacement)
-    self.assertIn('wasm-src:///emscripten/test/other/wasm_sourcemap/no_main.c', output)
-    # has "sourcesContent" entry with source code (included with `-s` option)
-    self.assertIn('int foo()', output)
-    # has some entries
-    self.assertRegex(output, r'"mappings":\s*"[A-Za-z0-9+/]')
+    # "sourcesContent" contains source code iff --sources is specified.
+    self.assertIn('int foo()' if sources else '"sourcesContent":[]', output)
+    if prefix: # "sources" contains URL with prefix path substitution if provided
+      sources_url = 'wasm-src:///emscripten/test/other/wasm_sourcemap/no_main.c'
+    else: # otherwise a path relative to the given basepath.
+      sources_url = utils.normalize_path(os.path.relpath(DW_AT_decl_file, basepath))
+    self.assertIn(sources_url, output)
+    # "mappings" contains valid Base64 VLQ segments.
+    self.assertRegex(output, r'"mappings":\s*"(?:[A-Za-z0-9+\/]+[,;]?)+"')
+
+  @parameterized({
+    '': ([], 0),
+    'prefix': ([
+      '<cwd>=file:///path/to/src',
+      DETERMINISTIC_PREFIX + '=file:///path/to/emscripten',
+    ], 0),
+    'sources': ([], 1)
+  })
+  def test_emcc_sourcemap_options(self, prefixes, sources):
+    wasm_sourcemap = importlib.import_module('tools.wasm-sourcemap')
+    cwd = os.getcwd()
+    src_file = shutil.copy(test_file('hello_123.c'), cwd)
+    lib_file = DETERMINISTIC_PREFIX + '/system/lib/libc/musl/src/stdio/fflush.c'
+    if prefixes:
+      prefixes = [p.replace('<cwd>', cwd) for p in prefixes]
+      self.set_setting('SOURCE_MAP_PREFIXES', prefixes)
+    args = ['-gsource-map=inline' if sources else '-gsource-map']
+    self.emcc(src_file, args=args, output_filename='test.js')
+    output = read_file('test.wasm.map')
+    # Check source file resolution
+    p = wasm_sourcemap.Prefixes(prefixes, base_path=cwd)
+    self.assertEqual(len(p.prefixes), len(prefixes))
+    src_file_url = p.resolve(utils.normalize_path(src_file))
+    lib_file_url = p.resolve(utils.normalize_path(lib_file))
+    if prefixes:
+      self.assertEqual(src_file_url, 'file:///path/to/src/hello_123.c')
+      self.assertEqual(lib_file_url, 'file:///path/to/emscripten/system/lib/libc/musl/src/stdio/fflush.c')
+    else:
+      self.assertEqual(src_file_url, 'hello_123.c')
+      self.assertEqual(lib_file_url, '/emsdk/emscripten/system/lib/libc/musl/src/stdio/fflush.c')
+    # "sources" contains resolved filepath.
+    self.assertIn(f'"{src_file_url}"', output)
+    self.assertIn(f'"{lib_file_url}"', output)
+    # "sourcesContent" contains source code iff -gsource-map=inline is specified.
+    if sources:
+      p = wasm_sourcemap.Prefixes(prefixes, preserve_deterministic_prefix=False)
+      for filepath in [src_file, lib_file]:
+        resolved_path = p.resolve(utils.normalize_path(filepath))
+        sources_content = json.dumps(read_file(resolved_path))
+        self.assertIn(sources_content, output)
+    else:
+      self.assertIn('"sourcesContent":[]', output)
+    # "mappings" contains valid Base64 VLQ segments.
+    self.assertRegex(output, r'"mappings":\s*"(?:[A-Za-z0-9+\/]+[,;]?)+"')

   def test_wasm_sourcemap_dead(self):
     wasm_map_cmd = [PYTHON, path_from_root('tools/wasm-sourcemap.py'),
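The difference between the old and new "mappings" checks in this test can be seen by running both patterns against small hand-written inputs. The two patterns are copied from the diff; the sample strings are made up for illustration and are not real compiler output. `re.search` is used because `assertRegex` performs a search rather than a full match.

```python
import re

# Old check: only requires one Base64 character after the opening quote.
OLD_PATTERN = r'"mappings":\s*"[A-Za-z0-9+/]'
# New check: the whole quoted value must consist of Base64 segments
# separated by ',' or ';'.
NEW_PATTERN = r'"mappings":\s*"(?:[A-Za-z0-9+\/]+[,;]?)+"'

# Hypothetical sample inputs.
good = '{"mappings":"AAAA;CACA,EAAE"}'
bad = '{"mappings":"AAAA;C?CA"}'  # '?' is not a Base64 character


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


if __name__ == '__main__':
    for label, text in [('good', good), ('bad', bad)]:
        print(label, 'old:', matches(OLD_PATTERN, text),
              'new:', matches(NEW_PATTERN, text))
```

The old pattern accepts both samples, while the new one rejects the malformed value. Note that the new pattern, as written, requires at least one Base64 character per segment, so an empty segment such as `;;` would not match.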

tools/building.py

Lines changed: 7 additions & 0 deletions
@@ -1137,6 +1137,13 @@ def emit_wasm_source_map(wasm_file, map_file, final_wasm):
                    '--dwarfdump=' + LLVM_DWARFDUMP,
                    '-o', map_file,
                    '--basepath=' + base_path]
+
+  if settings.SOURCE_MAP_PREFIXES:
+    sourcemap_cmd += ['--prefix', *settings.SOURCE_MAP_PREFIXES]
+
+  if settings.GENERATE_SOURCE_MAP == 2:
+    sourcemap_cmd += ['--sources']
+
   check_call(sourcemap_cmd)

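How the two settings translate into extra `wasm-sourcemap.py` arguments can be sketched as a small pure function. This is only an illustration of the hunk above: `sourcemap_extra_args` is a hypothetical name, and the real code appends to `sourcemap_cmd` in place using the global `settings` object.

```python
def sourcemap_extra_args(source_map_prefixes, generate_source_map):
    """Build the optional arguments added by this change.

    source_map_prefixes: list of '<old>=<new>' strings (SOURCE_MAP_PREFIXES).
    generate_source_map: 1 for -gsource-map, 2 for -gsource-map=inline.
    """
    args = []
    if source_map_prefixes:
        # A single --prefix flag is followed by all substitutions.
        args += ['--prefix', *source_map_prefixes]
    if generate_source_map == 2:
        # Inline mode asks the tool to embed the sources content.
        args += ['--sources']
    return args


if __name__ == '__main__':
    print(sourcemap_extra_args(['/old/path=/new/path'], 2))
```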

tools/wasm-sourcemap.py

Lines changed: 47 additions & 33 deletions
@@ -25,6 +25,10 @@
 sys.path.insert(0, __rootdir__)

 from tools import utils
+from tools.system_libs import DETERMINISTIC_PREFIX
+from tools.shared import path_from_root
+
+EMSCRIPTEN_PREFIX = utils.normalize_path(path_from_root())

 logger = logging.getLogger('wasm-sourcemap')

@@ -46,42 +50,57 @@ def parse_args():


 class Prefixes:
-  def __init__(self, args):
+  def __init__(self, args, base_path=None, preserve_deterministic_prefix=True):
     prefixes = []
     for p in args:
       if '=' in p:
         prefix, replacement = p.split('=')
         prefixes.append({'prefix': prefix, 'replacement': replacement})
       else:
-        prefixes.append({'prefix': p, 'replacement': None})
+        prefixes.append({'prefix': p, 'replacement': ''})
+    self.base_path = base_path
+    self.preserve_deterministic_prefix = preserve_deterministic_prefix
     self.prefixes = prefixes
     self.cache = {}

   def resolve(self, name):
     if name in self.cache:
       return self.cache[name]

+    source = name
+    if not self.preserve_deterministic_prefix and name.startswith(DETERMINISTIC_PREFIX):
+      source = EMSCRIPTEN_PREFIX + utils.removeprefix(name, DETERMINISTIC_PREFIX)
+
+    provided = False
     for p in self.prefixes:
-      if name.startswith(p['prefix']):
-        if p['replacement'] is None:
-          result = utils.removeprefix(name, p['prefix'])
-        else:
-          result = p['replacement'] + utils.removeprefix(name, p['prefix'])
+      if source.startswith(p['prefix']):
+        source = p['replacement'] + utils.removeprefix(source, p['prefix'])
+        provided = True
         break
-    self.cache[name] = result
-    return result
+
+    # If prefixes were provided, we use that; otherwise if base_path is set, we
+    # emit a relative path. For files with deterministic prefix, we never use
+    # a relative path, precisely to preserve determinism, and because it would
+    # still point to the wrong location, so we leave the filepath untouched to
+    # let users map it to the proper location using prefix options.
+    if not (source.startswith(DETERMINISTIC_PREFIX) or provided or self.base_path is None):
+      try:
+        source = os.path.relpath(source, self.base_path)
+      except ValueError:
+        source = os.path.abspath(source)
+      source = utils.normalize_path(source)
+
+    self.cache[name] = source
+    return source


 # SourceMapPrefixes contains resolver for file names that are:
 # - "sources" is for names that output to source maps JSON
 # - "load" is for paths that used to load source text
 class SourceMapPrefixes:
-  def __init__(self, sources, load):
-    self.sources = sources
-    self.load = load
-
-  def provided(self):
-    return bool(self.sources.prefixes or self.load.prefixes)
+  def __init__(self, sources, load, base_path):
+    self.sources = Prefixes(sources, base_path=base_path)
+    self.load = Prefixes(load, preserve_deterministic_prefix=False)


 def encode_vlq(n):
@@ -259,15 +278,20 @@ def read_dwarf_entries(wasm, options):
   return sorted(entries, key=lambda entry: entry['address'])


-def build_sourcemap(entries, code_section_offset, prefixes, collect_sources, base_path):
+def build_sourcemap(entries, code_section_offset, options):
+  base_path = options.basepath
+  collect_sources = options.sources
+  prefixes = SourceMapPrefixes(options.prefix, options.load_prefix, base_path)
+
   sources = []
-  sources_content = [] if collect_sources else None
+  sources_content = []
   mappings = []
   sources_map = {}
   last_address = 0
   last_source_id = 0
   last_line = 1
   last_column = 1
+
   for entry in entries:
     line = entry['line']
     column = entry['column']
@@ -277,20 +301,11 @@ def build_sourcemap(entries, code_section_offset, prefixes, collect_sources, bas
     # start at least at column 1
     if column == 0:
       column = 1
+
     address = entry['address'] + code_section_offset
-    file_name = entry['file']
-    file_name = utils.normalize_path(file_name)
-    # if prefixes were provided, we use that; otherwise, we emit a relative
-    # path
-    if prefixes.provided():
-      source_name = prefixes.sources.resolve(file_name)
-    else:
-      try:
-        file_name = os.path.relpath(file_name, base_path)
-      except ValueError:
-        file_name = os.path.abspath(file_name)
-      file_name = utils.normalize_path(file_name)
-      source_name = file_name
+    file_name = utils.normalize_path(entry['file'])
+    source_name = prefixes.sources.resolve(file_name)
+
     if source_name not in sources_map:
       source_id = len(sources)
       sources_map[source_name] = source_id
@@ -316,6 +331,7 @@ def build_sourcemap(entries, code_section_offset, prefixes, collect_sources, bas
     last_source_id = source_id
     last_line = line
     last_column = column
+
   return {'version': 3,
           'sources': sources,
           'sourcesContent': sources_content,
@@ -334,10 +350,8 @@ def main():

   code_section_offset = get_code_section_offset(wasm)

-  prefixes = SourceMapPrefixes(sources=Prefixes(options.prefix), load=Prefixes(options.load_prefix))
-
   logger.debug('Saving to %s' % options.output)
-  map = build_sourcemap(entries, code_section_offset, prefixes, options.sources, options.basepath)
+  map = build_sourcemap(entries, code_section_offset, options)
   with open(options.output, 'w') as outfile:
     json.dump(map, outfile, separators=(',', ':'))