Skip to content
This repository has been archived by the owner on Aug 24, 2018. It is now read-only.

Switch to nbind, add asm.js support #2

Closed
wants to merge 4 commits into from

Conversation

arcanis
Copy link

@arcanis arcanis commented Dec 6, 2016

Related issue: atom/text-buffer#183

This PR change the build toolchain in order to use nbind instead of nan. By doing this, we can now compile an asm.js-compatible build, that can then be used as fallback when native compilation isn't possible. The main use case is using this package on regular web browsers, which in turn allow to run dependent packages such as text-buffer inside web browsers.

@arcanis
Copy link
Author

arcanis commented Dec 6, 2016

Travis fails to build because the minimum requirement for nbind is Clang 3.6 (it seems that Travis is currently running Clang 3.3). My own local version is Clang 3.9, so 3.6 seems like a reasonable expectation.

@nathansobo
Copy link

Wow extremely cool. If you want to tweak the Travis config go for it.

@arcanis
Copy link
Author

arcanis commented Dec 7, 2016

20 builds later, I've finally found the right settings 😁

I haven't been able to make Travis test the emscripten version - it would require to compile llvm from its sources, which is less than stellar on Travis and its max timeout. It should theoretically work just as well as the native build, anyway.

@arcanis
Copy link
Author

arcanis commented Dec 9, 2016

Ping @nathansobo ? If this PR is good enough to be merged, I can start working on converting marker-index as well

@nathansobo
Copy link

I plan to take a deeper look at this next week. Upon further thought, I need to learn more about nbind before I'll feel totally comfortable with us switching our binding system over to it. My main concern is whether it is as powerful full-featured as the V8 APIs. I want to be sure we're not sacrificing the control we need to get maximal performance out of our bindings by using a more abstracted solution. @maxbrunsfeld do you have thoughts on this?

@arcanis
Copy link
Author

arcanis commented Dec 9, 2016

That's fair, let me know. The lead nbind developer was also extremely helpful, usually answering in a few hours, so maybe they can help us if we have an issue with nbind later on.

@maxbrunsfeld
Copy link

@arcanis Can you do some benchmarking of this relative to master?

@arcanis
Copy link
Author

arcanis commented Dec 12, 2016

Hm, I think I did something wrong: my branch takes 12s to execute the tests with the native version, but only 4s with the emscripten version (vs 3s for the master version). Optimization flags seem to be correctly set, that's weird. I've added a npm run bench option if you want to try on your side.

@nathansobo
Copy link

nathansobo commented Dec 13, 2016

@as-cii has kindly accepted my request to dig into your benchmark and make sense of why nbind might be slower, and he will also be gathering some intel on Emscripten and nbind.

@arcanis I'm still pretty ignorant in this area, but I presume the alternative to using nbind would be to have a build switch in our gyp file that loaded either a native node bindings file or some bindings to JS written against a lower-level API that Emscripten exposes for talking to JS. Can you discuss the trade-offs involved in using a lower level approach as you see them?

@arcanis
Copy link
Author

arcanis commented Dec 13, 2016

Awesome, thanks. For the record, I've also opened an issue on nbind's repository about the slowdown: charto/nbind#45 There's a few leads that I need to investigate.

As far as I see it, porting a C++ codebase has a few disadvantages over a pure C library that a library such as nbind could help alleviate:

  • The JS code usually cannot directly interact with the C++ code itself. In order to properly interface the two of them, one has to write a C bridge, where each function takes an instance of the underlying class and call the right class method. Doing this manually require quite a lot of boilerplate.

  • Because we're interacting via a C bridge, C++ data structures such as std::vector cannot be used. Because of this, you have to convert your JS arrays to pointers, by copying them inside the emscripten memory, and then converting them back to std::vector instances. Shipping this conversion logic seems out-of-scope for a library such as buffer-offset-index, and should probably be implemented in its own package (hence why I went with nbind rather than directly raw emscripten).

    • Note that even if it could be possible to change this library's class methods to directly use raw pointers instead of C++ vectors, a cursory glance over the marker-index library source code gave me the impression that using C structures instead of C++ ones would require quite a bit of work (especially because of its use of std::unordered_map).
  • Actually, the statements above aren't entirely true: there is a library called embind, shipped with emscripten, which allow to do the heavy lifting and should automatically expose the right prototypes to the JS side (essentially doing the same thing than nbind). I haven't explored this path much, because it seemed like a better choice to use the same library for both native and asmjs exports - easier to integrate, easier to test.

  • A last thing that might be of importance: asmjs currently doesn't support memory growth (it did once, but support for it has been removed about a year ago, and should only return with webasm). Because of this, the emscripten-generated code uses either an optimized fixed-sized memory, or an unoptimized dynamic-sized memory (unoptimized meaning that the code won't actually run with the asmjs engine). Even if it's ok for regular application, that's quite unpractical when porting libraries, where memory usage mainly depends on the userland code.

All in all, nbind seems a bit experimental right now, but I find it quite promising. The issues I have seem to be entirely due to the implementation and, as such, fixeable. That being said, I'm not totally sold on using the same codebase on both native and browser versions. It can be done, but it will take a bit of time to get it right.

Let me know if it helps.

@nathansobo
Copy link

nathansobo commented Dec 13, 2016

After chatting with @as-cii @maxbrunsfeld some more this morning, we have some more thoughts to share and a proposal for moving forward.

First of all, running Atom in the browser is something we're definitely interested in over the long term, and we'd like to facilitate your efforts on this front. However, browser support isn't on our immediate roadmap, and so while we want to enable progress on this, we also want to be careful to minimize the potential for disruption of the development or user experience of the existing Electron use case.

We would be more comfortable merging something that is more additive in nature. We think we should keep the existing bindings in place for the Node use case and just use embind for the browser use case, because it seems like a more mature tool and just as good if we're not trying to do bindings for both platforms in a single shot as nbind enables. Maintaining two bindings allows us to be experimental with the browser side while remaining conservative with the code that has to work and be fast for millions of production users on the Electron side. If possible, we'd also like to find a solution that is simpler from a build perspective without the addition of autogyp as a dependency that we'll need to understand.

We're happy to run tests for the browser-based bindings as part of our build and make a best effort to maintain them, but we want to minimize impact on the proven Node-based solution beyond that.

Based on your comments, we also have the following long term concerns about the browser based APIs.

  • Lack of finalizers: Any C++ objects exposed to JS will need to be managed manually do to a lack of finalizer support in JS. For now, setting up some sort of dispose method seems sufficient, though we'll need to be careful not to expose objects to users unless we're clear about the requirement to dispose them. Even then, it's highly likely that package authors will get this wrong and leak memory. That said, it's pretty unclear what Atom in the browser will ultimately look like and how third party packages would even be handled in that scenario. For now, the only class we'd consider exposing to users is Patch and we don't have a compelling need for that yet. If it comes up, we'll need to think carefully about the implications for browser-based use down the road.

  • Memory growth: It seems to us that memory growth support is non-optional for these libraries. It's our understanding that WebAssembly plans to support it for 1.0, and maybe the non asm.js path will be required in the meantime. I think this could be an acceptable compromise but I'm not sure where you're planning on running this code.

@arcanis
Copy link
Author

arcanis commented Dec 13, 2016

we also want to be careful to minimize the potential for disruption of the development or user experience of the existing Electron use case.

Yep, I totally understand. I might try this week-end to make an alternate PR to switch to embind - I don't think it should be too hard.

I think this could be an acceptable compromise but I'm not sure where you're planning on running this code.

Quite frankly, my use case for this feature is quite limited - I only need it to implement a web playground for my terminal UI library. Nothing too fancy, and both memory usage and performances are far less important for me than just being able to actually run the code in the browser (hence why I suggested using the JS reference implementation in browser environments in the first place, until a better solution is found).

@arcanis
Copy link
Author

arcanis commented Dec 23, 2016

Just a small comment to keep you updated: I haven't yet found enough time to open a new PR with embind, and honestly I'm not sure I'll be able to find it in the next few weeks, between the holidays and my relocation. In the meantime, if someone wants to try working on it, feel free to ask if you have any question about emscripten!

@arcanis
Copy link
Author

arcanis commented Dec 23, 2016

By the way, I also found the following issue which might be of interest on the embind project :

emscripten-core/emscripten#4770

@arcanis
Copy link
Author

arcanis commented Jan 12, 2017

I haven't yet had the chance to write a new embind implementation of buffer-offset-index, but I've successfully ported https://github.com/facebook/yoga to javascript using nbind.

@arcanis
Copy link
Author

arcanis commented Jan 15, 2017

Closing this PR in favor of atom/superstring#7

@arcanis arcanis closed this Jan 15, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants