Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request]: Discuss the move to static instead of shared build for wasm use cases #493

Open
anutosh491 opened this issue Jan 31, 2025 · 10 comments
Labels
enhancement New feature or request

Comments

@anutosh491
Copy link
Collaborator

anutosh491 commented Jan 31, 2025

Description of new feature

Hey All,

Currently we have a decent shared wasm build for cppinterop. By decent I mean the following

i) somewhat optimized, takes around 15 mins to build
ii) Not too shabby and clear of obvious warnings
iii) Xeus-cpp-lite build time is not too long takes around 1 min 10 seconds using libclangCppInterOp.so
iv) After introducing the following lines in xeus-cpp, Jupyterlite knows we xeus-cpp needs access to libclangCppInterOp.so at runtime and hence there hasn't been any issues running stuff with latest jupyterlite-core

But the general notion with emscripten is that static libs are generally preferred. Pyodide and emscripten-forge avoid shared libraries as much as possible and only use them if the static build doesn't work. Infact, Cmake + emscripten don't even provide a way to generate a shared build. Read emscripten-core/emscripten#15276

So we introduce a work around to get our shared builds. Hence as move to emsdk 3.1.73 (through #456) we should be able to generate a static build (which was not possible with 3.1.45 ... check llvm/llvm-project#114651)

But there is things to discuss before we do that

  1. Build time and how optimized can it be
  2. Use cases (if it helps us do more stuff than the shared build)
  3. Build time for xeus-cpp-lite
  4. Runtime delays in loading the interpreter and running code cells (as Jupyterlite is compatible with fetching archives we won't have to explicitly tell it to look for libclangInterpreter.a like we do currently for libclangCppInterOp.so)
@anutosh491 anutosh491 added the enhancement New feature or request label Jan 31, 2025
@anutosh491
Copy link
Collaborator Author

anutosh491 commented Jan 31, 2025

Now describing how you get a static build

  1. So the only library to be linked against is libclangInterpreter.a
  2. But that being said we also need symbols out of all the transitive libs required to build libclangInterpreter.a
  3. So we first start with an empty libclangCppInterOp.a and fetch all the transitive libs first as step 1
  4. Now we can extract object files from all transitive libs all together and then move them into our empty libclangCppInterOp.a (But this would be buggy)
  • Because llvm has lot of files that carry the same name
  • So if we let's say first extract object file, filename.cpp.o from transitive lib 1, libX.a and then we extract filename.cpp.o from transitive lib2, libY.a. The object file would just be overwritten and we lose the significance of the 1st filename.cpp.o
  • Hence we need to workaround this.
  • We loop over the transitive libs, extract the object files, move them into our libclangCppInterOp.a, delete the object files
  • This way we never lose out of symbols from any object file we are interested in and incrementally populate libclangCppInterOp.a
  1. So by the time we finish the configuration step (emcmake cmake ..) we have a libclangCppInterOp.a that has all the symbols out of libclangInterpreter.a and the transitive libs
  2. So now for the build step (emmake make ..) we just need to compile the source files from cppinterop and create a libCppInterOp.a (I have named it so ... but can be changed)
  3. Once the build step is done i.e we have both libclangCppInterOp.a and libCppInterOp.a , we merge them together to produce a single libclangCppInterOp.a that has everything we want.

I am pasting the full file (not the diff but the full file for CppInterOp/lib/Interpreter/CMakeLists.txt) below and one just needs to copy it in their own file and build cppinterop.

@anutosh491
Copy link
Collaborator Author

  1. DEBUG built with lot of logs
if (EMSCRIPTEN)

  set(link_libs
    clangInterpreter
  )

  add_library(CppInterOp
    STATIC
  
    CppInterOp.cpp
    CXCppInterOp.cpp
    DynamicLibraryManager.cpp
    DynamicLibraryManagerSymbol.cpp
    Paths.cpp
  )

  set(final_lib_name "libclangCppInterOp.a")
  set(final_lib_path "${CMAKE_BINARY_DIR}/lib/${final_lib_name}")

  # Assuming link_libs is already set to the list of libraries you want to process
  set(new_libs ${link_libs})
  set(libs ${new_libs})
  set(static_transitive_libs "")

  # Start the process of collecting static transitive libraries
  while(NOT "${new_libs}" STREQUAL "")
      set(newer_libs "")  # Reset for the next iteration
      foreach(lib ${new_libs})
          if(TARGET ${lib})
              get_target_property(transitive_libs ${lib} INTERFACE_LINK_LIBRARIES)
              if (NOT transitive_libs)
                  continue()
              endif()

              foreach(transitive_lib ${transitive_libs})
                  if(TARGET ${transitive_lib})
                    get_target_property(lib_type ${transitive_lib} TYPE)
                    if("${lib_type}" STREQUAL "STATIC_LIBRARY")
                        list(APPEND static_transitive_libs ${transitive_lib})
                    else()
                        # Filter out unwanted libraries (like libLLVM.so)
                        continue()
                    endif()
                    if(NOT ${transitive_lib} IN_LIST libs)
                        list(APPEND newer_libs ${transitive_lib})
                        list(APPEND libs ${transitive_lib})
                    endif()
                  else()
                    continue()
                  endif()
              endforeach(transitive_lib)

              # Update target properties with the list of only static libraries
              set_target_properties(${lib} PROPERTIES INTERFACE_LINK_LIBRARIES "${static_transitive_libs}")
              set(static_transitive_libs "")  # Reset for the next library
          endif()
      endforeach(lib)

      # Update the new_libs for the next iteration
      set(new_libs ${newer_libs})
  endwhile()

  message(STATUS "Final list of libraries (libs): ${libs}")
  message(STATUS "Static transitive libraries collected: ${static_transitive_libs}")

  # Step 1: Extract all object files from static libraries
  foreach(lib ${libs})
      get_target_property(lib_location ${lib} LOCATION)
      message(STATUS "location: ${lib_location}")

      # Extract all .o files from the current static library into the working directory
      execute_process(
        COMMAND ${CMAKE_AR} x ${lib_location}
        WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
        RESULT_VARIABLE extract_result
        ERROR_VARIABLE extract_error
      )

      # Check extraction success
      if (NOT "${extract_result}" STREQUAL "0")
          message(FATAL_ERROR "Failed to extract ${lib}. Error: ${extract_error}")
      endif()

    # Step 2: Collect all extracted object files
    file(GLOB extracted_objects "${CMAKE_BINARY_DIR}/*.o")

    # Check if we have any extracted object files
    if (extracted_objects)
        message(STATUS "Found the following extracted object files:")
        foreach(object_file ${extracted_objects})
            message(STATUS "  - ${object_file}")
        endforeach()

        # Step 3: Archive the extracted object files into libclangCppInterOp.a
        execute_process(
            COMMAND ${CMAKE_AR} qc ${final_lib_path} ${extracted_objects}
            WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
            RESULT_VARIABLE archive_result
            ERROR_VARIABLE archive_error
        )

        # Check archive success
        if (NOT "${archive_result}" STREQUAL "0")
            message(FATAL_ERROR "Failed to archive object files into libclangCppInterOp.a. Error: ${archive_error}")
        endif()
    else()
        message(WARNING "No object files extracted from ${lib}. Skipping archiving.")
    endif()

    # Step 4: Clear the extracted object files from the CMAKE_BINARY_DIR to avoid accumulation
    foreach(object_file ${extracted_objects})
        execute_process(
            COMMAND ${CMAKE_COMMAND} -E remove ${object_file}
            RESULT_VARIABLE remove_result
            ERROR_VARIABLE remove_error
        )

        # Check remove success
        if (NOT "${remove_result}" STREQUAL "0" AND NOT "${remove_result}" STREQUAL "1")
          message(FATAL_ERROR "Failed to remove object file: ${object_file}. Error: ${remove_error}")
        endif()
    endforeach()
  endforeach()

  if (NOT EXISTS ${final_lib_path})
    message(FATAL_ERROR "Final library ${final_lib_path} was not created successfully!")
  endif()
else()
  set(LLVM_LINK_COMPONENTS
    ${LLVM_TARGETS_TO_BUILD}
    BinaryFormat
    Core
    Object
    OrcJit
    Support
  )
  # FIXME: Investigate why this needs to be conditionally included.
  if ("LLVMFrontendDriver" IN_LIST LLVM_AVAILABLE_LIBS)
    list(APPEND LLVM_LINK_COMPONENTS  FrontendDriver)
  endif()
  if ("LLVMOrcDebugging" IN_LIST LLVM_AVAILABLE_LIBS)
    list(APPEND LLVM_LINK_COMPONENTS OrcDebugging)
  endif()

  set(DLM
    DynamicLibraryManager.cpp
    DynamicLibraryManagerSymbol.cpp
    Paths.cpp
  )
  if (USE_CLING)
    set(LLVM_OPTIONAL_SOURCES ${LLVM_OPTIONAL_SOURCES} ${DLM})
    set(DLM)
  endif(USE_CLING)
  if (USE_REPL)
    #Use DML optional sources
  endif(USE_REPL)

  if (USE_CLING)
    set(cling_clang_interp clingInterpreter)
  endif()
  if (USE_REPL)
    set(cling_clang_interp clangInterpreter)
  endif()

  set(link_libs
    ${cling_clang_interp}
    clangAST
    clangBasic
    clangFrontend
    clangLex
    clangSema
    )

  if(NOT WIN32)
    list(APPEND link_libs dl)
  endif()

  # Get rid of libLLVM-X.so which is appended to the list of static libraries.
  if (LLVM_LINK_LLVM_DYLIB)
    set(new_libs ${link_libs})
    set(libs ${new_libs})
    while(NOT "${new_libs}" STREQUAL "")
      foreach(lib ${new_libs})
        if(TARGET ${lib})
          get_target_property(transitive_libs ${lib} INTERFACE_LINK_LIBRARIES)
          if (NOT transitive_libs)
            continue()
          endif()
          foreach(transitive_lib ${transitive_libs})
            get_target_property(lib_type ${transitive_lib} TYPE)
            if("${lib_type}" STREQUAL "STATIC_LIBRARY")
              list(APPEND static_transitive_libs ${transitive_lib})
            else()
              # Filter our libLLVM.so and friends.
              continue()
            endif()
            if(NOT ${transitive_lib} IN_LIST libs)
              list(APPEND newer_libs ${transitive_lib})
              list(APPEND libs ${transitive_lib})
            endif()
          endforeach(transitive_lib)
          # Update the target properties with the list of only static libraries.
          set_target_properties(${lib} PROPERTIES INTERFACE_LINK_LIBRARIES "${static_transitive_libs}")
          set(static_transitive_libs "")
        endif()
      endforeach(lib)
      set(new_libs ${newer_libs})
      set(newer_libs "")
    endwhile()
    # We just got rid of the libLLVM.so and other components shipped as shared
    # libraries, we need to make up for the missing dependency.
    list(APPEND LLVM_LINK_COMPONENTS
      Coverage
      FrontendHLSL
      LTO
      )
    # We will need to append the missing dependencies to pull in the right
    # LLVM library dependencies.
    list(APPEND link_libs
      clangCodeGen
      clangStaticAnalyzerCore
      )
  endif(LLVM_LINK_LLVM_DYLIB)

  add_llvm_library(clangCppInterOp
    DISABLE_LLVM_LINK_LLVM_DYLIB
    CppInterOp.cpp
    CXCppInterOp.cpp
    ${DLM}
    LINK_LIBS
    ${link_libs}
  )
endif()

string(REPLACE ";" "\;" _VER CPPINTEROP_VERSION)
set_source_files_properties(CppInterOp.cpp PROPERTIES COMPILE_DEFINITIONS
  "LLVM_BINARY_DIR=\"${LLVM_BINARY_DIR}\";CPPINTEROP_VERSION=\"${_VAR}\""
)

if (EMSCRIPTEN)
  # Define path for the merged library to overwrite libclangCppInterOp.a
  set(final_lib_path "${CMAKE_BINARY_DIR}/lib/libclangCppInterOp.a")

  add_custom_command(
      TARGET CppInterOp
      POST_BUILD
      COMMAND ${CMAKE_AR} x $<TARGET_FILE:CppInterOp>
      COMMAND ${CMAKE_COMMAND} -E echo "Extracted object files from CppInterOp.a"
      COMMAND ${CMAKE_COMMAND} -E echo "Collecting extracted object files"
      COMMAND ${CMAKE_AR} qc ${final_lib_path} ${CMAKE_BINARY_DIR}/*.o
      WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
      COMMENT "Extracting object files from clangCppInterOp.a"
  )

  # Optional: Install the final merged library as libclangCppInterOp.a
  install(FILES ${final_lib_path} DESTINATION lib)
endif()
  1. Logs removed
if (EMSCRIPTEN)

  set(link_libs
    clangInterpreter
  )

  add_library(CppInterOp
    STATIC
  
    CppInterOp.cpp
    CXCppInterOp.cpp
    DynamicLibraryManager.cpp
    DynamicLibraryManagerSymbol.cpp
    Paths.cpp
  )

  set(final_lib_name "libclangCppInterOp.a")
  set(final_lib_path "${CMAKE_BINARY_DIR}/lib/${final_lib_name}")

  # Assuming link_libs is already set to the list of libraries you want to process
  set(new_libs ${link_libs})
  set(libs ${new_libs})
  set(static_transitive_libs "")

  # Start the process of collecting static transitive libraries
  while(NOT "${new_libs}" STREQUAL "")
      set(newer_libs "")  # Reset for the next iteration
      foreach(lib ${new_libs})
          if(TARGET ${lib})
              get_target_property(transitive_libs ${lib} INTERFACE_LINK_LIBRARIES)
              if (NOT transitive_libs)
                  continue()
              endif()

              foreach(transitive_lib ${transitive_libs})
                  if(TARGET ${transitive_lib})
                    get_target_property(lib_type ${transitive_lib} TYPE)
                    if("${lib_type}" STREQUAL "STATIC_LIBRARY")
                        list(APPEND static_transitive_libs ${transitive_lib})
                    else()
                        # Filter out unwanted libraries (like libLLVM.so)
                        continue()
                    endif()
                    if(NOT ${transitive_lib} IN_LIST libs)
                        list(APPEND newer_libs ${transitive_lib})
                        list(APPEND libs ${transitive_lib})
                    endif()
                  else()
                    continue()
                  endif()
              endforeach(transitive_lib)

              # Update target properties with the list of only static libraries
              set_target_properties(${lib} PROPERTIES INTERFACE_LINK_LIBRARIES "${static_transitive_libs}")
              set(static_transitive_libs "")  # Reset for the next library
          endif()
      endforeach(lib)

      # Update the new_libs for the next iteration
      set(new_libs ${newer_libs})
  endwhile()

  # Step 1: Extract all object files from static libraries
  foreach(lib ${libs})
      get_target_property(lib_location ${lib} LOCATION)

      # Extract all .o files from the current static library into the working directory
      execute_process(
        COMMAND ${CMAKE_AR} x ${lib_location}
        WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
        RESULT_VARIABLE extract_result
        ERROR_VARIABLE extract_error
      )

      # Check extraction success
      if (NOT "${extract_result}" STREQUAL "0")
          message(FATAL_ERROR "Failed to extract ${lib}. Error: ${extract_error}")
      endif()

    # Step 2: Collect all extracted object files
    file(GLOB extracted_objects "${CMAKE_BINARY_DIR}/*.o")

    # Check if we have any extracted object files
    if (extracted_objects)

        # Step 3: Archive the extracted object files into libclangCppInterOp.a
        execute_process(
            COMMAND ${CMAKE_AR} qc ${final_lib_path} ${extracted_objects}
            WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
            RESULT_VARIABLE archive_result
            ERROR_VARIABLE archive_error
        )

        # Check archive success
        if (NOT "${archive_result}" STREQUAL "0")
            message(FATAL_ERROR "Failed to archive object files into libclangCppInterOp.a. Error: ${archive_error}")
        endif()
    else()
        message(WARNING "No object files extracted from ${lib}. Skipping archiving.")
    endif()

    # Step 4: Clear the extracted object files from the CMAKE_BINARY_DIR to avoid accumulation
    foreach(object_file ${extracted_objects})
        execute_process(
            COMMAND ${CMAKE_COMMAND} -E remove ${object_file}
            RESULT_VARIABLE remove_result
            ERROR_VARIABLE remove_error
        )

        # Check remove success
        if (NOT "${remove_result}" STREQUAL "0")
          message(FATAL_ERROR "Failed to remove object file: ${object_file}. Error: ${remove_error}")
        endif()
    endforeach()
  endforeach()
else()
  set(LLVM_LINK_COMPONENTS
    ${LLVM_TARGETS_TO_BUILD}
    BinaryFormat
    Core
    Object
    OrcJit
    Support
  )
  # FIXME: Investigate why this needs to be conditionally included.
  if ("LLVMFrontendDriver" IN_LIST LLVM_AVAILABLE_LIBS)
    list(APPEND LLVM_LINK_COMPONENTS  FrontendDriver)
  endif()
  if ("LLVMOrcDebugging" IN_LIST LLVM_AVAILABLE_LIBS)
    list(APPEND LLVM_LINK_COMPONENTS OrcDebugging)
  endif()

  set(DLM
    DynamicLibraryManager.cpp
    DynamicLibraryManagerSymbol.cpp
    Paths.cpp
  )
  if (USE_CLING)
    set(LLVM_OPTIONAL_SOURCES ${LLVM_OPTIONAL_SOURCES} ${DLM})
    set(DLM)
  endif(USE_CLING)
  if (USE_REPL)
    #Use DML optional sources
  endif(USE_REPL)

  if (USE_CLING)
    set(cling_clang_interp clingInterpreter)
  endif()
  if (USE_REPL)
    set(cling_clang_interp clangInterpreter)
  endif()

  set(link_libs
    ${cling_clang_interp}
    clangAST
    clangBasic
    clangFrontend
    clangLex
    clangSema
    )

  if(NOT WIN32)
    list(APPEND link_libs dl)
  endif()

  # Get rid of libLLVM-X.so which is appended to the list of static libraries.
  if (LLVM_LINK_LLVM_DYLIB)
    set(new_libs ${link_libs})
    set(libs ${new_libs})
    while(NOT "${new_libs}" STREQUAL "")
      foreach(lib ${new_libs})
        if(TARGET ${lib})
          get_target_property(transitive_libs ${lib} INTERFACE_LINK_LIBRARIES)
          if (NOT transitive_libs)
            continue()
          endif()
          foreach(transitive_lib ${transitive_libs})
            get_target_property(lib_type ${transitive_lib} TYPE)
            if("${lib_type}" STREQUAL "STATIC_LIBRARY")
              list(APPEND static_transitive_libs ${transitive_lib})
            else()
              # Filter our libLLVM.so and friends.
              continue()
            endif()
            if(NOT ${transitive_lib} IN_LIST libs)
              list(APPEND newer_libs ${transitive_lib})
              list(APPEND libs ${transitive_lib})
            endif()
          endforeach(transitive_lib)
          # Update the target properties with the list of only static libraries.
          set_target_properties(${lib} PROPERTIES INTERFACE_LINK_LIBRARIES "${static_transitive_libs}")
          set(static_transitive_libs "")
        endif()
      endforeach(lib)
      set(new_libs ${newer_libs})
      set(newer_libs "")
    endwhile()
    # We just got rid of the libLLVM.so and other components shipped as shared
    # libraries, we need to make up for the missing dependency.
    list(APPEND LLVM_LINK_COMPONENTS
      Coverage
      FrontendHLSL
      LTO
      )
    # We will need to append the missing dependencies to pull in the right
    # LLVM library dependencies.
    list(APPEND link_libs
      clangCodeGen
      clangStaticAnalyzerCore
      )
  endif(LLVM_LINK_LLVM_DYLIB)

  add_llvm_library(clangCppInterOp
    DISABLE_LLVM_LINK_LLVM_DYLIB
    CppInterOp.cpp
    CXCppInterOp.cpp
    ${DLM}
    LINK_LIBS
    ${link_libs}
  )
endif()

string(REPLACE ";" "\;" _VER CPPINTEROP_VERSION)
set_source_files_properties(CppInterOp.cpp PROPERTIES COMPILE_DEFINITIONS
  "LLVM_BINARY_DIR=\"${LLVM_BINARY_DIR}\";CPPINTEROP_VERSION=\"${_VAR}\""
)

if (EMSCRIPTEN)
  # Define path for the merged library to overwrite libclangCppInterOp.a
  set(final_lib_path "${CMAKE_BINARY_DIR}/lib/libclangCppInterOp.a")

  add_custom_command(
      TARGET CppInterOp
      POST_BUILD
      COMMAND ${CMAKE_AR} x $<TARGET_FILE:CppInterOp>
      COMMAND ${CMAKE_AR} qc ${final_lib_path} ${CMAKE_BINARY_DIR}/*.o
      WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
      COMMENT "Extracting object files from libCppInterOp.a"
  )

  # Optional: Install the final merged library as libclangCppInterOp.a
  install(FILES ${final_lib_path} DESTINATION lib)
endif()

@mcbarton
Copy link
Collaborator

@anutosh491 Thanks for raising this issue. I'll look at it in more detail the weekend. My initial comments are we want the cmakelist.txt to be of the structure (to support both the ability to build a shared library and static library)

If emscripten
if build_shared_libs=on

else

endif
else

endif

You've renamed the target clangCppInterOp to CppInterOp for your static build. Please return back to the original target name, as when we get testing working they assume that target name, and allows consistency between wasm and non wasm builds. Next you set link_libs to clangInterpreter. Please set this to ${cling_clang_interp} (this was a mistake with the shared library build which I missed in the review). One final comment when I haven't tried using this static llibrary with xeus-cpp yet, but when I tried to link it to see if it would work with the automated tests work I got a large number of undefined symbols coming related to llvm (being it hit the error limit non appeared to be coming from CppInterOp). When I ran into this issue with the shared library build I fixed it with the '--export-all' linker flag. I will provide a copy of these symbols tomorrow.

@mcbarton
Copy link
Collaborator

As far as ideas I have for optimisations here are some my initial thoughts for improving the shared library build. We used SIDE_MODULE=1 . This assumes you want to export all default symbols, while for SIDE_MODULE=2 you choose what you do and do not think needs exporting assuming I understand the flag correctly (assuming I understand this comment correctly emscripten-core/emscripten#16762 (comment)). By default emcc builds with no optimisations https://emscripten.org/docs/optimizing/Optimizing-Code.html . It would come at the cost of a longer build but maybe we could try the -O2,-O3 or -Os flag for release builds.

@mcbarton
Copy link
Collaborator

mcbarton commented Feb 7, 2025

@vgvassilev pinging for comment on this issue.To me it doesn't make sense to completely remove the ability to build a wasm shared library for CppInterOp, given all the efforts to get something which works. Rather I think it makes sense to have the ability to build both. I don't mind the static build being the default, but doesn't seem to make sense (at least not to me) to completely remove the shared build.

@anutosh491
Copy link
Collaborator Author

I don't mind the static build being the default, but doesn't seem to make sense (at least not to me) to completely remove the shared build.

Well, I don't see anyone write that we need to completely get rid of anything anywhere.

The issue title reads "Discuss" the move to static instead of shared build 😅

I really like the shared build as it works perfectly for xeus-cpp-lite (going against the idea that emscripten doesn't really recommend shared builds)

The general idea with wasm is that mostly people prefer static linkages because shared libraries are fragile. At least they make more problems and are harder to handle at runtime .... but we probably don't make the move untill we have good enough concrete proof to do so.

My take here is

  1. We make the move only if it is convincing enough
  2. Even if we make the move we probably don't need to get rid of the shared build.
  3. We probably give an option to our users .... conveying through the readme or so that one can use whatever build they want (we also have to support that logic in xeus-cpp as this would play a role (https://github.com/compiler-research/xeus-cpp/blob/f05495180c191947e6c4675d5ce1881a2ab8425c/share/jupyter/kernels/xcpp20/wasm_kernel.json.in#L12-L14))
  4. That being said for the deployment we need to use our best build possible (which also would be clearly stated as RECOMMENDED in our README. Its not correct to make the leave the users hanging to make a choice and secondly for sure in theory there is a better build out of the two, we just haven't tested/used both well enough to know the answer ! )

Our focus for now shouldn't be on making the move or not .... getting rid or not .... rather finding concrete evidence/usecase on which is better.

@mcbarton
Copy link
Collaborator

mcbarton commented Feb 7, 2025

I don't mind the static build being the default, but doesn't seem to make sense (at least not to me) to completely remove the shared build.

Well, I don't see anyone write that we need to completely get rid of anything anywhere.

The issue title reads "Discuss" the move to static instead of shared build 😅

I really like the shared build as it works perfectly for xeus-cpp-lite (going against the idea that emscripten doesn't really recommend shared builds)

The general idea with wasm is that mostly people prefer static linkages because shared libraries are fragile. At least they make more problems and are harder to handle at runtime .... but we probably don't make the move untill we have good enough concrete proof to do so.

My take here is

1. We make the move only if it is convincing enough

2. Even if we make the move we probably don't need to get rid of the shared build.

3. We probably give an option to our users .... conveying through the readme or so that one can use whatever build they want (we also have to support that logic in xeus-cpp as this would play a role (https://github.com/compiler-research/xeus-cpp/blob/f05495180c191947e6c4675d5ce1881a2ab8425c/share/jupyter/kernels/xcpp20/wasm_kernel.json.in#L12-L14))

4. That being said for the deployment we need to use our best build possible (which also would be clearly stated as RECOMMENDED in our README. Its not correct to make the leave the users hanging to make a choice and secondly for sure in theory there is a better build out of the two, we just haven't tested/used both well enough to know the answer ! )

Our focus for now shouldn't be on making the move or not .... getting rid or not .... rather finding concrete evidence/usecase on which is better.

Maybe it was me misunderstanding the title. My understanding was from the words 'instead of' and your cmakelists.txt only showing a static build in it. I'm all for recommending to the user that they use a static build if it turns out to be better. It is possible and and relatively straight forward (at least as far as CppInterOps repo is concerned) to have a single deployment, where you can switch between the 2 builds, so we can test them side by side (only publically providing a link to the one deployment that we think is best). The dual version deployment would help those developing.

@anutosh491
Copy link
Collaborator Author

Yeah whatever works. My point here is that theoretically every package out there can support a static build and a shared build but I never see a package maintaining both of them. They double down on the better one.

So nothing wrong with having the two and keeping them. The problem rather is we haven't found the better one. And once we have that we need to focus on improving it !

@mcbarton
Copy link
Collaborator

mcbarton commented Feb 7, 2025

  1. DEBUG built with lot of logs
if (EMSCRIPTEN)

  set(link_libs
    clangInterpreter
  )

  add_library(CppInterOp
    STATIC
  
    CppInterOp.cpp
    CXCppInterOp.cpp
    DynamicLibraryManager.cpp
    DynamicLibraryManagerSymbol.cpp
    Paths.cpp
  )

  set(final_lib_name "libclangCppInterOp.a")
  set(final_lib_path "${CMAKE_BINARY_DIR}/lib/${final_lib_name}")

  # Assuming link_libs is already set to the list of libraries you want to process
  set(new_libs ${link_libs})
  set(libs ${new_libs})
  set(static_transitive_libs "")

  # Start the process of collecting static transitive libraries
  while(NOT "${new_libs}" STREQUAL "")
      set(newer_libs "")  # Reset for the next iteration
      foreach(lib ${new_libs})
          if(TARGET ${lib})
              get_target_property(transitive_libs ${lib} INTERFACE_LINK_LIBRARIES)
              if (NOT transitive_libs)
                  continue()
              endif()

              foreach(transitive_lib ${transitive_libs})
                  if(TARGET ${transitive_lib})
                    get_target_property(lib_type ${transitive_lib} TYPE)
                    if("${lib_type}" STREQUAL "STATIC_LIBRARY")
                        list(APPEND static_transitive_libs ${transitive_lib})
                    else()
                        # Filter out unwanted libraries (like libLLVM.so)
                        continue()
                    endif()
                    if(NOT ${transitive_lib} IN_LIST libs)
                        list(APPEND newer_libs ${transitive_lib})
                        list(APPEND libs ${transitive_lib})
                    endif()
                  else()
                    continue()
                  endif()
              endforeach(transitive_lib)

              # Update target properties with the list of only static libraries
              set_target_properties(${lib} PROPERTIES INTERFACE_LINK_LIBRARIES "${static_transitive_libs}")
              set(static_transitive_libs "")  # Reset for the next library
          endif()
      endforeach(lib)

      # Update the new_libs for the next iteration
      set(new_libs ${newer_libs})
  endwhile()

  message(STATUS "Final list of libraries (libs): ${libs}")
  message(STATUS "Static transitive libraries collected: ${static_transitive_libs}")

  # Step 1: Extract all object files from static libraries
  foreach(lib ${libs})
      get_target_property(lib_location ${lib} LOCATION)
      message(STATUS "location: ${lib_location}")

      # Extract all .o files from the current static library into the working directory
      execute_process(
        COMMAND ${CMAKE_AR} x ${lib_location}
        WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
        RESULT_VARIABLE extract_result
        ERROR_VARIABLE extract_error
      )

      # Check extraction success
      if (NOT "${extract_result}" STREQUAL "0")
          message(FATAL_ERROR "Failed to extract ${lib}. Error: ${extract_error}")
      endif()

    # Step 2: Collect all extracted object files
    file(GLOB extracted_objects "${CMAKE_BINARY_DIR}/*.o")

    # Check if we have any extracted object files
    if (extracted_objects)
        message(STATUS "Found the following extracted object files:")
        foreach(object_file ${extracted_objects})
            message(STATUS "  - ${object_file}")
        endforeach()

        # Step 3: Archive the extracted object files into libclangCppInterOp.a
        execute_process(
            COMMAND ${CMAKE_AR} qc ${final_lib_path} ${extracted_objects}
            WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
            RESULT_VARIABLE archive_result
            ERROR_VARIABLE archive_error
        )

        # Check archive success
        if (NOT "${archive_result}" STREQUAL "0")
            message(FATAL_ERROR "Failed to archive object files into libclangCppInterOp.a. Error: ${archive_error}")
        endif()
    else()
        message(WARNING "No object files extracted from ${lib}. Skipping archiving.")
    endif()

    # Step 4: Clear the extracted object files from the CMAKE_BINARY_DIR to avoid accumulation
    foreach(object_file ${extracted_objects})
        execute_process(
            COMMAND ${CMAKE_COMMAND} -E remove ${object_file}
            RESULT_VARIABLE remove_result
            ERROR_VARIABLE remove_error
        )

        # Check remove success
        if (NOT "${remove_result}" STREQUAL "0" AND NOT "${remove_result}" STREQUAL "1")
          message(FATAL_ERROR "Failed to remove object file: ${object_file}. Error: ${remove_error}")
        endif()
    endforeach()
  endforeach()

  if (NOT EXISTS ${final_lib_path})
    message(FATAL_ERROR "Final library ${final_lib_path} was not created successfully!")
  endif()
else()
  set(LLVM_LINK_COMPONENTS
    ${LLVM_TARGETS_TO_BUILD}
    BinaryFormat
    Core
    Object
    OrcJit
    Support
  )
  # FIXME: Investigate why this needs to be conditionally included.
  if ("LLVMFrontendDriver" IN_LIST LLVM_AVAILABLE_LIBS)
    list(APPEND LLVM_LINK_COMPONENTS  FrontendDriver)
  endif()
  if ("LLVMOrcDebugging" IN_LIST LLVM_AVAILABLE_LIBS)
    list(APPEND LLVM_LINK_COMPONENTS OrcDebugging)
  endif()

  set(DLM
    DynamicLibraryManager.cpp
    DynamicLibraryManagerSymbol.cpp
    Paths.cpp
  )
  if (USE_CLING)
    set(LLVM_OPTIONAL_SOURCES ${LLVM_OPTIONAL_SOURCES} ${DLM})
    set(DLM)
  endif(USE_CLING)
  if (USE_REPL)
    #Use DML optional sources
  endif(USE_REPL)

  if (USE_CLING)
    set(cling_clang_interp clingInterpreter)
  endif()
  if (USE_REPL)
    set(cling_clang_interp clangInterpreter)
  endif()

  set(link_libs
    ${cling_clang_interp}
    clangAST
    clangBasic
    clangFrontend
    clangLex
    clangSema
    )

  if(NOT WIN32)
    list(APPEND link_libs dl)
  endif()

  # Get rid of libLLVM-X.so which is appended to the list of static libraries.
  if (LLVM_LINK_LLVM_DYLIB)
    set(new_libs ${link_libs})
    set(libs ${new_libs})
    while(NOT "${new_libs}" STREQUAL "")
      foreach(lib ${new_libs})
        if(TARGET ${lib})
          get_target_property(transitive_libs ${lib} INTERFACE_LINK_LIBRARIES)
          if (NOT transitive_libs)
            continue()
          endif()
          foreach(transitive_lib ${transitive_libs})
            get_target_property(lib_type ${transitive_lib} TYPE)
            if("${lib_type}" STREQUAL "STATIC_LIBRARY")
              list(APPEND static_transitive_libs ${transitive_lib})
            else()
              # Filter our libLLVM.so and friends.
              continue()
            endif()
            if(NOT ${transitive_lib} IN_LIST libs)
              list(APPEND newer_libs ${transitive_lib})
              list(APPEND libs ${transitive_lib})
            endif()
          endforeach(transitive_lib)
          # Update the target properties with the list of only static libraries.
          set_target_properties(${lib} PROPERTIES INTERFACE_LINK_LIBRARIES "${static_transitive_libs}")
          set(static_transitive_libs "")
        endif()
      endforeach(lib)
      set(new_libs ${newer_libs})
      set(newer_libs "")
    endwhile()
    # We just got rid of the libLLVM.so and other components shipped as shared
    # libraries, we need to make up for the missing dependency.
    list(APPEND LLVM_LINK_COMPONENTS
      Coverage
      FrontendHLSL
      LTO
      )
    # We will need to append the missing dependencies to pull in the right
    # LLVM library dependencies.
    list(APPEND link_libs
      clangCodeGen
      clangStaticAnalyzerCore
      )
  endif(LLVM_LINK_LLVM_DYLIB)

  add_llvm_library(clangCppInterOp
    DISABLE_LLVM_LINK_LLVM_DYLIB
    CppInterOp.cpp
    CXCppInterOp.cpp
    ${DLM}
    LINK_LIBS
    ${link_libs}
  )
endif()

string(REPLACE ";" "\;" _VER CPPINTEROP_VERSION)
set_source_files_properties(CppInterOp.cpp PROPERTIES COMPILE_DEFINITIONS
  "LLVM_BINARY_DIR=\"${LLVM_BINARY_DIR}\";CPPINTEROP_VERSION=\"${_VAR}\""
)

if (EMSCRIPTEN)
  # Define path for the merged library to overwrite libclangCppInterOp.a
  set(final_lib_path "${CMAKE_BINARY_DIR}/lib/libclangCppInterOp.a")

  add_custom_command(
      TARGET CppInterOp
      POST_BUILD
      COMMAND ${CMAKE_AR} x $<TARGET_FILE:CppInterOp>
      COMMAND ${CMAKE_COMMAND} -E echo "Extracted object files from CppInterOp.a"
      COMMAND ${CMAKE_COMMAND} -E echo "Collecting extracted object files"
      COMMAND ${CMAKE_AR} qc ${final_lib_path} ${CMAKE_BINARY_DIR}/*.o
      WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
      COMMENT "Extracting object files from clangCppInterOp.a"
  )

  # Optional: Install the final merged library as libclangCppInterOp.a
  install(FILES ${final_lib_path} DESTINATION lib)
endif()
2. **Logs removed**
if (EMSCRIPTEN)

  set(link_libs
    clangInterpreter
  )

  add_library(CppInterOp
    STATIC
  
    CppInterOp.cpp
    CXCppInterOp.cpp
    DynamicLibraryManager.cpp
    DynamicLibraryManagerSymbol.cpp
    Paths.cpp
  )

  set(final_lib_name "libclangCppInterOp.a")
  set(final_lib_path "${CMAKE_BINARY_DIR}/lib/${final_lib_name}")

  # Assuming link_libs is already set to the list of libraries you want to process
  set(new_libs ${link_libs})
  set(libs ${new_libs})
  set(static_transitive_libs "")

  # Start the process of collecting static transitive libraries
  while(NOT "${new_libs}" STREQUAL "")
      set(newer_libs "")  # Reset for the next iteration
      foreach(lib ${new_libs})
          if(TARGET ${lib})
              get_target_property(transitive_libs ${lib} INTERFACE_LINK_LIBRARIES)
              if (NOT transitive_libs)
                  continue()
              endif()

              foreach(transitive_lib ${transitive_libs})
                  if(TARGET ${transitive_lib})
                    get_target_property(lib_type ${transitive_lib} TYPE)
                    if("${lib_type}" STREQUAL "STATIC_LIBRARY")
                        list(APPEND static_transitive_libs ${transitive_lib})
                    else()
                        # Filter out unwanted libraries (like libLLVM.so)
                        continue()
                    endif()
                    if(NOT ${transitive_lib} IN_LIST libs)
                        list(APPEND newer_libs ${transitive_lib})
                        list(APPEND libs ${transitive_lib})
                    endif()
                  else()
                    continue()
                  endif()
              endforeach(transitive_lib)

              # Update target properties with the list of only static libraries
              set_target_properties(${lib} PROPERTIES INTERFACE_LINK_LIBRARIES "${static_transitive_libs}")
              set(static_transitive_libs "")  # Reset for the next library
          endif()
      endforeach(lib)

      # Update the new_libs for the next iteration
      set(new_libs ${newer_libs})
  endwhile()

  # Step 1: Extract all object files from static libraries
  foreach(lib ${libs})
      get_target_property(lib_location ${lib} LOCATION)

      # Extract all .o files from the current static library into the working directory
      execute_process(
        COMMAND ${CMAKE_AR} x ${lib_location}
        WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
        RESULT_VARIABLE extract_result
        ERROR_VARIABLE extract_error
      )

      # Check extraction success
      if (NOT "${extract_result}" STREQUAL "0")
          message(FATAL_ERROR "Failed to extract ${lib}. Error: ${extract_error}")
      endif()

    # Step 2: Collect all extracted object files
    file(GLOB extracted_objects "${CMAKE_BINARY_DIR}/*.o")

    # Check if we have any extracted object files
    if (extracted_objects)

        # Step 3: Archive the extracted object files into libclangCppInterOp.a
        execute_process(
            COMMAND ${CMAKE_AR} qc ${final_lib_path} ${extracted_objects}
            WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
            RESULT_VARIABLE archive_result
            ERROR_VARIABLE archive_error
        )

        # Check archive success
        if (NOT "${archive_result}" STREQUAL "0")
            message(FATAL_ERROR "Failed to archive object files into libclangCppInterOp.a. Error: ${archive_error}")
        endif()
    else()
        message(WARNING "No object files extracted from ${lib}. Skipping archiving.")
    endif()

    # Step 4: Clear the extracted object files from the CMAKE_BINARY_DIR to avoid accumulation
    foreach(object_file ${extracted_objects})
        execute_process(
            COMMAND ${CMAKE_COMMAND} -E remove ${object_file}
            RESULT_VARIABLE remove_result
            ERROR_VARIABLE remove_error
        )

        # Check remove success
        if (NOT "${remove_result}" STREQUAL "0")
          message(FATAL_ERROR "Failed to remove object file: ${object_file}. Error: ${remove_error}")
        endif()
    endforeach()
  endforeach()
else()
  set(LLVM_LINK_COMPONENTS
    ${LLVM_TARGETS_TO_BUILD}
    BinaryFormat
    Core
    Object
    OrcJit
    Support
  )
  # FIXME: Investigate why this needs to be conditionally included.
  if ("LLVMFrontendDriver" IN_LIST LLVM_AVAILABLE_LIBS)
    list(APPEND LLVM_LINK_COMPONENTS  FrontendDriver)
  endif()
  if ("LLVMOrcDebugging" IN_LIST LLVM_AVAILABLE_LIBS)
    list(APPEND LLVM_LINK_COMPONENTS OrcDebugging)
  endif()

  set(DLM
    DynamicLibraryManager.cpp
    DynamicLibraryManagerSymbol.cpp
    Paths.cpp
  )
  if (USE_CLING)
    set(LLVM_OPTIONAL_SOURCES ${LLVM_OPTIONAL_SOURCES} ${DLM})
    set(DLM)
  endif(USE_CLING)
  if (USE_REPL)
    #Use DML optional sources
  endif(USE_REPL)

  if (USE_CLING)
    set(cling_clang_interp clingInterpreter)
  endif()
  if (USE_REPL)
    set(cling_clang_interp clangInterpreter)
  endif()

  set(link_libs
    ${cling_clang_interp}
    clangAST
    clangBasic
    clangFrontend
    clangLex
    clangSema
    )

  if(NOT WIN32)
    list(APPEND link_libs dl)
  endif()

  # Get rid of libLLVM-X.so which is appended to the list of static libraries.
  if (LLVM_LINK_LLVM_DYLIB)
    set(new_libs ${link_libs})
    set(libs ${new_libs})
    while(NOT "${new_libs}" STREQUAL "")
      foreach(lib ${new_libs})
        if(TARGET ${lib})
          get_target_property(transitive_libs ${lib} INTERFACE_LINK_LIBRARIES)
          if (NOT transitive_libs)
            continue()
          endif()
          foreach(transitive_lib ${transitive_libs})
            get_target_property(lib_type ${transitive_lib} TYPE)
            if("${lib_type}" STREQUAL "STATIC_LIBRARY")
              list(APPEND static_transitive_libs ${transitive_lib})
            else()
              # Filter our libLLVM.so and friends.
              continue()
            endif()
            if(NOT ${transitive_lib} IN_LIST libs)
              list(APPEND newer_libs ${transitive_lib})
              list(APPEND libs ${transitive_lib})
            endif()
          endforeach(transitive_lib)
          # Update the target properties with the list of only static libraries.
          set_target_properties(${lib} PROPERTIES INTERFACE_LINK_LIBRARIES "${static_transitive_libs}")
          set(static_transitive_libs "")
        endif()
      endforeach(lib)
      set(new_libs ${newer_libs})
      set(newer_libs "")
    endwhile()
    # We just got rid of the libLLVM.so and other components shipped as shared
    # libraries, we need to make up for the missing dependency.
    list(APPEND LLVM_LINK_COMPONENTS
      Coverage
      FrontendHLSL
      LTO
      )
    # We will need to append the missing dependencies to pull in the right
    # LLVM library dependencies.
    list(APPEND link_libs
      clangCodeGen
      clangStaticAnalyzerCore
      )
  endif(LLVM_LINK_LLVM_DYLIB)

  add_llvm_library(clangCppInterOp
    DISABLE_LLVM_LINK_LLVM_DYLIB
    CppInterOp.cpp
    CXCppInterOp.cpp
    ${DLM}
    LINK_LIBS
    ${link_libs}
  )
endif()

string(REPLACE ";" "\;" _VER CPPINTEROP_VERSION)
set_source_files_properties(CppInterOp.cpp PROPERTIES COMPILE_DEFINITIONS
  "LLVM_BINARY_DIR=\"${LLVM_BINARY_DIR}\";CPPINTEROP_VERSION=\"${_VAR}\""
)

if (EMSCRIPTEN)
  # Define path for the merged library to overwrite libclangCppInterOp.a
  set(final_lib_path "${CMAKE_BINARY_DIR}/lib/libclangCppInterOp.a")

  add_custom_command(
      TARGET CppInterOp
      POST_BUILD
      COMMAND ${CMAKE_AR} x $<TARGET_FILE:CppInterOp>
      COMMAND ${CMAKE_AR} qc ${final_lib_path} ${CMAKE_BINARY_DIR}/*.o
      WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
      COMMENT "Extracting object files from libCppInterOp.a"
  )

  # Optional: Install the final merged library as libclangCppInterOp.a
  install(FILES ${final_lib_path} DESTINATION lib)
endif()

As a first step i'll look into this in more detail, and get back to you (hopefully in the next week)

@vgvassilev
Copy link
Contributor

I agree we need to support static builds. Our current static builds include transitive dependencies I think. The downside is that the binary sizes of the static builds. Another current problem I think is that we can only resolve symbols from shared libraries from the Jit. That’d mean that we should put all possible packages into that static binary which will make it even harder to download.

Let’s use static libraries but I am not quite sure if we can match the functionality shared libraries have at that point in time…

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants