Skip to content

Fix: Include <features.h> in wasi/api.h to define _Noreturn in C++ mode #24420

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

anutosh491
Copy link

Hi,

I maintain xeus-cpp-lite, a Jupyter Kernel that allows users to run C++ directly in the browser through Jupyterlite
Feel free to try it out here https://compiler-research.org/xeus-cpp/lab/index.html (check demo notebook for what all can be achieved)

  1. We use emsdk 3.1.73
  2. We obviously preload the sysroot for standard headers and other 3rd party lib headers (SDL etc)
  3. This is what I observe

image

i) trying to include wasi/api.h fails with

Aborted(Assertion failed: D->getLexicalDeclContext() == this && "Decl inserted into wrong lexical context", at: /home/runner/work/recipes/recipes/output/bld/rattler-build_llvm_1748329150/work/clang/lib/AST/DeclBase.cpp,1758,addHiddenDecl)

ii) Looking into api.h and pasting the code from there into the cell and running it, I observe this is the minimal block replicating the error

#ifndef __wasi_api_h
#define __wasi_api_h

#include <stddef.h>
#include <stdint.h>

#pragma push_macro("_Static_assert")
#undef _Static_assert
#define _Static_assert(X, Y)

_Static_assert(_Alignof(int8_t) == 1, "non-wasi data layout");
_Static_assert(_Alignof(uint8_t) == 1, "non-wasi data layout");
_Static_assert(_Alignof(int16_t) == 2, "non-wasi data layout");
_Static_assert(_Alignof(uint16_t) == 2, "non-wasi data layout");
_Static_assert(_Alignof(int32_t) == 4, "non-wasi data layout");
_Static_assert(_Alignof(uint32_t) == 4, "non-wasi data layout");
_Static_assert(_Alignof(int64_t) == 8, "non-wasi data layout");
_Static_assert(_Alignof(uint64_t) == 8, "non-wasi data layout");
_Static_assert(_Alignof(void*) == 4, "non-wasi data layout");

#ifdef __cplusplus
extern "C" {
#endif

/**
 * Exit code generated by a process when exiting.
 */
typedef uint32_t __wasi_exitcode_t;

_Static_assert(sizeof(__wasi_exitcode_t) == 4, "witx calculated size");
_Static_assert(_Alignof(__wasi_exitcode_t) == 4, "witx calculated align");

_Noreturn void __wasi_proc_exit(
    /**
     * The exit code returned by the process.
     */
    __wasi_exitcode_t rval
) __attribute__((
    __import_module__("wasi_snapshot_preview1"),
    __import_name__("proc_exit")));


/** @} */

#ifdef __cplusplus
}
#endif

#pragma pop_macro("_Static_assert")

#endif

Gives the same error

image

@anutosh491
Copy link
Author

anutosh491 commented May 28, 2025

What works is

 void __wasi_proc_exit(
    /**
     * The exit code returned by the process.
     */
    __wasi_exitcode_t rval
) __attribute__((
    __import_module__("wasi_snapshot_preview1"),
    __import_name__("proc_exit")));
    

telling me _Noreturn stands undefined there and i see it being taken care of in features.h

#if __STDC_VERSION__ >= 201112L
#elif defined(__GNUC__)
#define _Noreturn __attribute__((__noreturn__))
#else
#define _Noreturn
#endif

I see this also helps with _Noreturn in other files like for eg stdlib.h

_Noreturn void abort (void);
int atexit (void (*) (void));
_Noreturn void exit (int);
_Noreturn void _Exit (int);
int at_quick_exit (void (*) (void));
_Noreturn void quick_exit (int);

@anutosh491
Copy link
Author

Pasting the above whole file into xeus-cpp-lite works (obviously with the change in the pr)

As can be seen below, I get the corresponding wasm binary to the first cell where I have just pasted the whole of api.h

image

@anutosh491
Copy link
Author

@anutosh491
Copy link
Author

anutosh491 commented May 28, 2025

ahh there is some failure I suppose.

The feature.h, I was interested in is this https://github.com/emscripten-core/emscripten/blob/9cc922aff5f2fcac6f6f5ff375faf86fb8cc964e/system/lib/libc/musl/include/features.h

but I geuss the one being fetched is not the public one (I guess the internal one)

https://github.com/emscripten-core/emscripten/blob/9cc922aff5f2fcac6f6f5ff375faf86fb8cc964e/system/lib/libc/musl/src/include/features.h

Maybe we can not include features.h and just take care of _NoReturn separately ? Not sure I should touch __main_void.c here !

@sbc100
Copy link
Collaborator

sbc100 commented May 28, 2025

The internal musl headers such as system/lib/libc/musl/src/include/features.h should not be visible to user code.

How are you constructing your sysroot? Are you perhaps copying the files directly from the system/.. tree? That would be wrong. Instead you should use emcc or embuilder to build your sysroot (it lives in the cache directory) and then copy the sysroot.

@anutosh491
Copy link
Author

anutosh491 commented May 28, 2025

The internal musl headers such as system/lib/libc/musl/src/include/features.h should not be visible to user code.

Yeah that's not what I intended to do. I just wanted to include the features.h that sysroot has to provide (seems like its this file https://github.com/emscripten-core/emscripten/blob/9cc922aff5f2fcac6f6f5ff375faf86fb8cc964e/system/lib/libc/musl/include/features.h)

I just want to use features.h that "sysroot/include" provided just like other files are including.

How are you constructing your sysroot? Are you perhaps copying the files directly from the system/.. tree? That would be wrong. Instead you should use emcc or embuilder to build your sysroot (it lives in the cache directory) and then copy the sysroot.

So yeah we use emsdk, activate 3.173 that we're using currently and then we have this

export SYSROOT_PATH=$HOME/emsdk/upstream/emscripten/cache/sysroot

emcmake cmake \
......
more configs
.......
    -DSYSROOT_PATH=$SYSROOT_PATH \
   

And we preload it in our cmake

    target_link_options(xcpp
       ......
        PUBLIC "SHELL: --preload-file ${SYSROOT_PATH}/include@/include"
        ......
        

which produces the xcpp.data and hence xeus-cpp-lite has access to all standard headers in jupyterlite

@anutosh491
Copy link
Author

Technically the problem starts from here

image

  1. stdio.h has this

#ifdef __EMSCRIPTEN__
#include <wasi/api.h>
#endif
#ifdef __cplusplus
extern "C" {
#endif
#include <features.h>

  1. and wasi/api.h has a _NoReturn which stands undefined.
  2. now stdio.h does include features.h but after inlcuding api.h .... and hence the problem

@sbc100
Copy link
Collaborator

sbc100 commented May 28, 2025

So are you still having this issue, even with the correct features.h?

The minimal reproducer above doesn't fail for me when I build locally.

(BTW, is there some reason you are still using emscripten 3.1.73 rather than the latest version?)

@sbc100
Copy link
Collaborator

sbc100 commented May 28, 2025

Are you able to reproduce the issue with emcc itself? Or is this limited to the browser environment you have built?

@anutosh491
Copy link
Author

anutosh491 commented May 28, 2025

So are you still having this issue, even with the correct features.h?

Yess, this works (see that I've added features.h below)

#ifndef __wasi_api_h
#define __wasi_api_h

#include <stddef.h>
#include <stdint.h>
#include <features.h> // added extra here.

#pragma push_macro("_Static_assert")
#undef _Static_assert
#define _Static_assert(X, Y)

_Static_assert(_Alignof(int8_t) == 1, "non-wasi data layout");
_Static_assert(_Alignof(uint8_t) == 1, "non-wasi data layout");
_Static_assert(_Alignof(int16_t) == 2, "non-wasi data layout");
_Static_assert(_Alignof(uint16_t) == 2, "non-wasi data layout");
_Static_assert(_Alignof(int32_t) == 4, "non-wasi data layout");
_Static_assert(_Alignof(uint32_t) == 4, "non-wasi data layout");
_Static_assert(_Alignof(int64_t) == 8, "non-wasi data layout");
_Static_assert(_Alignof(uint64_t) == 8, "non-wasi data layout");
_Static_assert(_Alignof(void*) == 4, "non-wasi data layout");

#ifdef __cplusplus
extern "C" {
#endif

/**
 * Exit code generated by a process when exiting.
 */
typedef uint32_t __wasi_exitcode_t;

_Static_assert(sizeof(__wasi_exitcode_t) == 4, "witx calculated size");
_Static_assert(_Alignof(__wasi_exitcode_t) == 4, "witx calculated align");

_Noreturn void __wasi_proc_exit(
    /**
     * The exit code returned by the process.
     */
    __wasi_exitcode_t rval
) __attribute__((
    __import_module__("wasi_snapshot_preview1"),
    __import_name__("proc_exit")));


/** @} */

#ifdef __cplusplus
}
#endif

#pragma pop_macro("_Static_assert")

but what I pasted above doesn't work.

Are you able to reproduce the issue with emcc itself?

I haven't tried this yet (can give it a shot soon) but think about it. We're simply running clang-repl in the browser here (through building llvm against emscripten .... which gives us libclangInterpreter.a which we use)

So what defined _NOreturn for you. ?

This can be simply seen through the link if you would like to try it out (https://compiler-research.org/xeus-cpp/lab/index.html)

image

_NOreturn is defined here in features.h

#if __STDC_VERSION__ >= 201112L
#elif defined(__GNUC__)
#define _Noreturn __attribute__((__noreturn__))
#else
#define _Noreturn
#endif

And I see it being put to use in stdlib.h, stdio.h and other files using _NOreturn

@anutosh491
Copy link
Author

Are you able to reproduce the issue with emcc itself? Or is this limited to the browser environment you have built?

Technically there shouldn't be no difference. But yeah there's nothing defining _Noreturn for us here correct ?

This would fail for the native clang-repl case too isn't it

clang-repl> int x = 10;
clang-repl> _Noreturn void test() { while (1); }
input_line_3:1:1: error: expected expression
    1 | _Noreturn void test() { while (1); }
      | ^
error: Parsing failed.

@anutosh491
Copy link
Author

(BTW, is there some reason you are still using emscripten 3.1.73 rather than the latest version?)

Because the whole ecosystem for xeus-cpp-lite through Jupyterlite depends on emscripten-forge ( see docs here https://emscripten-forge.org/)

So Emscripten-forge contains conda recipes for the emscripten-wasm32 platform (but unlike pyodide isn't restricted to python packages and extends to c++/c/rust packages too)

Possibly check the llvm recipe here (https://github.com/emscripten-forge/recipes/blob/main/recipes/recipes_emscripten/llvm/recipe.yaml)

So yeah the all packages on main are currently being built against a slightly patched emsdk 3.1.73

@sbc100
Copy link
Collaborator

sbc100 commented May 28, 2025

Are you able to reproduce the issue with emcc itself? Or is this limited to the browser environment you have built?

Technically there shouldn't be no difference. But yeah there's nothing defining _Noreturn for us here correct ?

This would fail for the native clang-repl case too isn't it

clang-repl> int x = 10;
clang-repl> _Noreturn void test() { while (1); }
input_line_3:1:1: error: expected expression
    1 | _Noreturn void test() { while (1); }
      | ^
error: Parsing failed.

_Noreturn is builtin in C11 and above. See https://en.cppreference.com/w/c/language/_Noreturn.html

The default for clang and emcc must be C11 or above since that code works fine:

 $ cat test.c 
_Noreturn void test() { while (1); }
$ clang -c test.c
$ emcc -c test.c

So I guess you implementation of the compiler must have different defaults somehow?

@sbc100
Copy link
Collaborator

sbc100 commented May 28, 2025

Actually clang seems to include _Noreturn even under C89:

$ clang -c -std=c89 test.c

@anutosh491
Copy link
Author

anutosh491 commented May 28, 2025

So I guess you implementation of the compiler must have different defaults somehow?

Wait so my thinking here would be that clang-repl considers the inputs to be c++ based
(could you try the above with a .cpp file ?)

And this is how features.h is helping

i) if you see how stdlib.h that uses _Noreturn does it

#ifdef __cplusplus
extern "C" {
#endif

#include <features.h>
...
....
_Noreturn void abort (void);
int atexit (void (*) (void));
_Noreturn void exit (int);
_Noreturn void _Exit (int);
int at_quick_exit (void (*) (void));
_Noreturn void quick_exit (int);
.....
....
#ifdef __cplusplus
}
#endif

ii) and features.h has


#if __STDC_VERSION__ >= 201112L
#elif defined(__GNUC__)
#define _Noreturn __attribute__((__noreturn__))
#else
#define _Noreturn
#endif

@sbc100
Copy link
Collaborator

sbc100 commented May 28, 2025

So I guess this is some kind of bug in clang-repl where it doesn't recognize _Noreturn void test() { while (1); } as an experession (even though _Noreturn is a valid keyword in clang).

@anutosh491
Copy link
Author

anutosh491 commented May 28, 2025

My thinking here was .... similar to the above
wasi/api.h has this block


#ifdef __cplusplus
extern "C" {
#endif
.....
.....
_Noreturn void __wasi_proc_exit(
    /**
     * The exit code returned by the process.
     */
    __wasi_exitcode_t rval
) __attribute__((
    __import_module__("wasi_snapshot_preview1"),
    __import_name__("proc_exit")));
.....
....
#ifdef __cplusplus
}
#endif

But its including features.h and hence _Noreturn is not defined and hence I can't use it with xeus-cpp-lite

@anutosh491
Copy link
Author

anutosh491 commented May 28, 2025

So I guess this is some kind of bug in clang-repl where it doesn't recognize

Hmm, maybe but think about it. Let's say I paste the whole code from stdlib.h (removing the featurs.h)
I see it erroring out

image

So I think for sure any emscripten sysroot header using _Noreturn needs features.h

The same code passes perfectly when I have the header.

image

So I think wasi/api.h using _Noreturn but not defining it is problematic here.

@sbc100
Copy link
Collaborator

sbc100 commented May 28, 2025

But _Noreturn is always available as keyword in clang. There should be no need to define it ever.

This code always works in clang:

_Noreturn void test() { while (1); }

@anutosh491
Copy link
Author

There should be no need to define it ever.

But emscripten sysroot header are doing it (and as I showed above, they won't work without it) ☹️

Anyways thanks for the back and forth here. I need to get back to the drawing board and think if we should be able to do this directly as you say.

@sbc100
Copy link
Collaborator

sbc100 commented May 28, 2025

If you can show and example that fails in regular clang or emcc then clearly we would need to fix on our end.

If it only fails in clang-repl than that would suggest a bug in clang-repl instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants