Commit

This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'main' into cw-audio-memory64
cwoffenden authored Jan 27, 2025
2 parents 21f7761 + db69527 commit 54b171d
Showing 26 changed files with 3,222 additions and 113 deletions.
8 changes: 7 additions & 1 deletion ChangeLog.md
Original file line number Diff line number Diff line change
@@ -18,11 +18,17 @@ to browse the changes between the tags.

See docs/process.md for more on how version tagging works.

-4.0.1 (in development)
+4.0.2 (in development)
----------------------
- Added support for compiling AVX2 intrinsics, 256-bit wide intrinsic is emulated
on top of 128-bit Wasm SIMD instruction set. (#23035). Pass `-msimd128 -mavx2`
to enable targeting AVX2.
- The system JS libraries in `src/` were renamed from `library_foo.js` to
`lib/libfoo.js`. They are still included via the same `-lfoo.js` flag so
this should not be a user-visible change. (#23348)

4.0.1 - 01/17/25
----------------
- The minimum version of node required to run emscripten was bumped from v16.20
to v18. Version 4.0 was mistakenly shipped with a change that required v20,
but that was reverted. (#23410)
9 changes: 6 additions & 3 deletions emcc.py
@@ -76,7 +76,7 @@
'fetchSettings'
]

-SIMD_INTEL_FEATURE_TOWER = ['-msse', '-msse2', '-msse3', '-mssse3', '-msse4.1', '-msse4.2', '-msse4', '-mavx']
+SIMD_INTEL_FEATURE_TOWER = ['-msse', '-msse2', '-msse3', '-mssse3', '-msse4.1', '-msse4.2', '-msse4', '-mavx', '-mavx2']
SIMD_NEON_FLAGS = ['-mfpu=neon']
LINK_ONLY_FLAGS = {
'--bind', '--closure', '--cpuprofiler', '--embed-file',
@@ -474,6 +474,9 @@ def array_contains_any_of(hay, needles):
if array_contains_any_of(user_args, SIMD_INTEL_FEATURE_TOWER[7:]):
cflags += ['-D__AVX__=1']

+if array_contains_any_of(user_args, SIMD_INTEL_FEATURE_TOWER[8:]):
+cflags += ['-D__AVX2__=1']

if array_contains_any_of(user_args, SIMD_NEON_FLAGS):
cflags += ['-D__ARM_NEON__=1']
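
The slices in this hunk rely on the ordering of `SIMD_INTEL_FEATURE_TOWER`: each flag is treated as implying every flag before it, so `-mavx2` also turns on the AVX define. A minimal standalone sketch of that pattern follows. It assumes a simplified reimplementation of `array_contains_any_of` and a hypothetical wrapper `feature_defines` (neither is copied from emcc.py), and it covers only the two defines shown in this hunk.

```python
# Ordered from oldest to newest; a later flag implies all earlier ones.
SIMD_INTEL_FEATURE_TOWER = ['-msse', '-msse2', '-msse3', '-mssse3',
                            '-msse4.1', '-msse4.2', '-msse4', '-mavx', '-mavx2']


def array_contains_any_of(hay, needles):
    # Simplified stand-in for the helper of the same name in emcc.py.
    return any(needle in hay for needle in needles)


def feature_defines(user_args):
    # Hypothetical wrapper: collects the defines for the two tiers in this hunk.
    cflags = []
    # Index 7 is '-mavx', so the slice matches '-mavx' or anything newer.
    if array_contains_any_of(user_args, SIMD_INTEL_FEATURE_TOWER[7:]):
        cflags += ['-D__AVX__=1']
    # Index 8 is '-mavx2', so the slice matches only '-mavx2'.
    if array_contains_any_of(user_args, SIMD_INTEL_FEATURE_TOWER[8:]):
        cflags += ['-D__AVX2__=1']
    return cflags
```

Under this sketch, passing `-mavx2` yields both defines, `-mavx` yields only `-D__AVX__=1`, and an SSE-only flag yields neither.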

@@ -738,11 +741,11 @@ def phase_parse_arguments(state):


def separate_linker_flags(state, newargs):
-"""Process argument list separating out intput files, compiler flags
+"""Process argument list separating out input files, compiler flags
and linker flags.
- Linker flags are stored in state.link_flags
-- Input files and compiler-only flags are return as two separate lists.
+- Input files and compiler-only flags are returned as two separate lists.
Both linker flags and input files are stored as pairs of (i, entry) where
`i` is the orginal index in the command line arguments. This allow the two
2 changes: 1 addition & 1 deletion emscripten-version.txt
@@ -1 +1 @@
-4.0.1-git
+4.0.2-git
1 change: 0 additions & 1 deletion eslint.config.mjs
@@ -54,7 +54,6 @@ export default [{
'src/settings_internal.js',
'src/growableHeap.js',
'src/emrun_prejs.js',
-'src/arrayUtils.js',
'src/deterministic.js',
'src/base64Decode.js',
'src/proxyWorker.js',
Original file line number Diff line number Diff line change
@@ -860,7 +860,7 @@ Class properties can be defined several ways as seen below.
class_<Person>("Person")
.constructor<>()
// Bind directly to a class member with automatically generated getters/setters using a
-// reference return policy so the object does not need to be deleted JS.
+// reference return policy so the object does not need to be deleted from JS.
.property("location", &Person::location, return_value_policy::reference())
// Same as above, but this will return a copy and the object must be deleted or it will
// leak!
87 changes: 86 additions & 1 deletion site/source/docs/porting/simd.rst
@@ -12,7 +12,7 @@ Emscripten supports the `WebAssembly SIMD <https://github.com/webassembly/simd/>
1. Enable LLVM/Clang SIMD autovectorizer to automatically target WebAssembly SIMD, without requiring changes to C/C++ source code.
2. Write SIMD code using the GCC/Clang SIMD Vector Extensions (``__attribute__((vector_size(16)))``)
3. Write SIMD code using the WebAssembly SIMD intrinsics (``#include <wasm_simd128.h>``)
-4. Compile existing SIMD code that uses the x86 SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2 or AVX intrinsics (``#include <*mmintrin.h>``)
+4. Compile existing SIMD code that uses the x86 SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX or AVX2 intrinsics (``#include <*mmintrin.h>``)
5. Compile existing SIMD code that uses the ARM NEON intrinsics (``#include <arm_neon.h>``)

These techniques can be freely combined in a single program.
@@ -153,6 +153,7 @@ Emscripten supports compiling existing codebases that use x86 SSE instructions b
* **SSE4.1**: pass ``-msse4.1`` and ``#include <smmintrin.h>``. Use ``#ifdef __SSE4_1__`` to gate code.
* **SSE4.2**: pass ``-msse4.2`` and ``#include <nmmintrin.h>``. Use ``#ifdef __SSE4_2__`` to gate code.
* **AVX**: pass ``-mavx`` and ``#include <immintrin.h>``. Use ``#ifdef __AVX__`` to gate code.
+* **AVX2**: pass ``-mavx2`` and ``#include <immintrin.h>``. Use ``#ifdef __AVX2__`` to gate code.

Currently only the SSE1, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, and AVX instruction sets are supported. Each of these instruction sets add on top of the previous ones, so e.g. when targeting SSE3, the instruction sets SSE1 and SSE2 are also available.

@@ -1145,6 +1146,90 @@ The following table highlights the availability and expected performance of diff

Only the 128-bit wide instructions from AVX instruction set are listed. The 256-bit wide AVX instructions are emulated by two 128-bit wide instructions.

The following table highlights the availability and expected performance of different AVX2 intrinsics. Refer to `Intel Intrinsics Guide on AVX2 <https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#avxnewtechs=AVX2>`_.

.. list-table:: x86 AVX2 intrinsics available via #include <immintrin.h> and -mavx2
:widths: 20 30
:header-rows: 1

* - Intrinsic name
- WebAssembly SIMD support
* - _mm_broadcastss_ps
- 💡 emulated with a general shuffle
* - _mm_broadcastsd_pd
- 💡 emulated with a general shuffle
* - _mm_blend_epi32
- 💡 emulated with a general shuffle
* - _mm_broadcastb_epi8
- 💡 emulated with a general shuffle
* - _mm_broadcastw_epi16
- 💡 emulated with a general shuffle
* - _mm_broadcastd_epi32
- 💡 emulated with a general shuffle
* - _mm_broadcastq_epi64
- 💡 emulated with a general shuffle
* - _mm256_permutevar8x32_epi32
- ❌ scalarized
* - _mm256_permute4x64_pd
- 💡 emulated with two general shuffle
* - _mm256_permutevar8x32_ps
- ❌ scalarized
* - _mm256_permute4x64_epi64
- 💡 emulated with two general shuffle
* - _mm_maskload_epi32
- ❌ scalarized
* - _mm_maskload_epi64
- ❌ scalarized
* - _mm_maskstore_epi32
- ❌ scalarized
* - _mm_maskstore_epi64
- ❌ scalarized
* - _mm_sllv_epi32
- ❌ scalarized
* - _mm_sllv_epi64
- ❌ scalarized
* - _mm_srav_epi32
- ❌ scalarized
* - _mm_srlv_epi32
- ❌ scalarized
* - _mm_srlv_epi64
- ❌ scalarized
* - _mm_mask_i32gather_pd
- ❌ scalarized
* - _mm_mask_i64gather_pd
- ❌ scalarized
* - _mm_mask_i32gather_ps
- ❌ scalarized
* - _mm_mask_i64gather_ps
- ❌ scalarized
* - _mm_mask_i32gather_epi32
- ❌ scalarized
* - _mm_mask_i64gather_epi32
- ❌ scalarized
* - _mm_mask_i32gather_epi64
- ❌ scalarized
* - _mm_mask_i64gather_epi64
- ❌ scalarized
* - _mm_i32gather_pd
- ❌ scalarized
* - _mm_i64gather_pd
- ❌ scalarized
* - _mm_i32gather_ps
- ❌ scalarized
* - _mm_i64gather_ps
- ❌ scalarized
* - _mm_i32gather_epi32
- ❌ scalarized
* - _mm_i64gather_epi32
- ❌ scalarized
* - _mm_i32gather_epi64
- ❌ scalarized
* - _mm_i64gather_epi64
- ❌ scalarized

All the 128-bit wide instructions from AVX2 instruction set are listed.
Only a small part of the 256-bit AVX2 instruction set are listed, most of the
256-bit wide AVX2 instructions are emulated by two 128-bit wide instructions.
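
The note above says that most 256-bit AVX2 operations are emulated by two 128-bit operations. A conceptual sketch of that splitting in Python follows. It assumes a vector is modelled as a plain list of 32-bit integer lanes; the function names are hypothetical, and this is an illustration of the idea only, not Emscripten's header code.

```python
MASK32 = 0xFFFFFFFF  # keep each lane within 32 bits (wrapping arithmetic)


def add_epi32_128(a, b):
    """Lane-wise wrapping add of two 4-lane vectors of 32-bit integers."""
    assert len(a) == len(b) == 4
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def add_epi32_256(a, b):
    """Emulate an 8-lane (256-bit) add with two 4-lane (128-bit) adds."""
    assert len(a) == len(b) == 8
    lo = add_epi32_128(a[:4], b[:4])  # lower 128 bits
    hi = add_epi32_128(a[4:], b[4:])  # upper 128 bits
    return lo + hi  # concatenate the halves back into one 8-lane result
```

Lane-wise operations split cleanly this way because no lane depends on another. Operations that move data across the 128-bit boundary, such as the permutes in the table, need extra shuffles or scalar code, which is consistent with how they are listed there.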

======================================================
Compiling SIMD code targeting ARM NEON instruction set
29 changes: 0 additions & 29 deletions src/arrayUtils.js

This file was deleted.

26 changes: 22 additions & 4 deletions src/lib/libstrings.js
@@ -4,8 +4,6 @@
* SPDX-License-Identifier: MIT
*/

-#include "arrayUtils.js"

addToLibrary({
// TextDecoder constructor defaults to UTF-8
#if TEXTDECODER == 2
@@ -256,8 +254,28 @@ addToLibrary({
$intArrayFromString__docs: '/** @type {function(string, boolean=, number=)} */',
$intArrayFromString__deps: ['$lengthBytesUTF8', '$stringToUTF8Array'],
-$intArrayFromString: intArrayFromString,
-$intArrayToString: intArrayToString,
+$intArrayFromString: (stringy, dontAddNull, length) => {
+var len = length > 0 ? length : lengthBytesUTF8(stringy)+1;
+var u8array = new Array(len);
+var numBytesWritten = stringToUTF8Array(stringy, u8array, 0, u8array.length);
+if (dontAddNull) u8array.length = numBytesWritten;
+return u8array;
+},
+$intArrayToString: (array) => {
+var ret = [];
+for (var i = 0; i < array.length; i++) {
+var chr = array[i];
+if (chr > 0xFF) {
+#if ASSERTIONS
+assert(false, `Character code ${chr} (${String.fromCharCode(chr)}) at offset ${i} not in 0x00-0xFF.`);
+#endif
+chr &= 0xFF;
+}
+ret.push(String.fromCharCode(chr));
+}
+return ret.join('');
+},

// Given a pointer 'ptr' to a null-terminated ASCII-encoded string in the
// emscripten HEAP, returns a copy of that string as a Javascript String
9 changes: 1 addition & 8 deletions src/preamble.js
@@ -225,9 +225,6 @@ function initRuntime() {
function preMain() {
#if STACK_OVERFLOW_CHECK
checkStackCookie();
#endif
-#if PTHREADS
-if (ENVIRONMENT_IS_PTHREAD) return; // PThreads reuse the runtime from the main thread.
-#endif
<<< ATMAINS >>>
callRuntimeCallbacks(__ATMAIN__);
@@ -630,11 +627,7 @@ function getBinarySync(file) {
async function getWasmBinary(binaryFile) {
#if !SINGLE_FILE
// If we don't have the binary yet, load it asynchronously using readAsync.
-if (!wasmBinary
-#if SUPPORT_BASE64_EMBEDDING
-|| isDataURI(binaryFile)
-#endif
-) {
+if (!wasmBinary) {
// Fetch the binary using readAsync
try {
var response = await readAsync(binaryFile);
2 changes: 1 addition & 1 deletion src/shell.js
@@ -122,7 +122,7 @@ if (ENVIRONMENT_IS_NODE) {
#endif // ENVIRONMENT_MAY_BE_NODE

#if WASM_WORKERS
-var ENVIRONMENT_IS_WASM_WORKER = Module['$ww'];
+var ENVIRONMENT_IS_WASM_WORKER = !!Module['$ww'];
#endif

// --pre-jses are emitted after the Module integration code, so that they can
2 changes: 1 addition & 1 deletion src/shell_minimal.js
@@ -70,7 +70,7 @@ var ENVIRONMENT_IS_WEB = !ENVIRONMENT_IS_NODE;
#endif // ASSERTIONS || PTHREADS

#if WASM_WORKERS
-var ENVIRONMENT_IS_WASM_WORKER = Module['$ww'];
+var ENVIRONMENT_IS_WASM_WORKER = !!Module['$ww'];
#endif

#if ASSERTIONS && ENVIRONMENT_MAY_BE_NODE && ENVIRONMENT_MAY_BE_SHELL