add blog posts about js backend debugging, stacks and weak refs #37
base: master
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,115 @@ | ||||||
--- | ||||||
slug: 2023-02-28-debugging-javascript-backend | ||||||
title: Debugging the JavaScript Backend | ||||||
date: February 28, 2023 | ||||||
authors: [ luite ] | ||||||
tags: [ghc, javascript, debugging ] | ||||||
--- | ||||||
|
||||||
## Introduction | ||||||
|
||||||
I recently gave a short presentation on the topic of debugging the code produced by the GHC JavaScript backend to the GHC team at IOG. This blog post is a summary of the content. | ||||||
|
||||||
## Debugging JavaScript | ||||||
|
||||||
Browsers come with powerful development tools for JavaScript. In particular, the Chrome developer tools are very useful for stepping through JavaScript code and inspecting data during the execution of a program. We can still use these tools on the code generated by the GHC JavaScript backend, but due to the structure of the code, it can sometimes be difficult to figure out exactly where something goes wrong. | ||||||
|
||||||
This blog post is an experience report that presents a couple of practical techniques for debugging various problems in the JavaScript code. | ||||||
|
||||||
## Tracing Operations | ||||||
|
||||||
Various components of the RTS have tracing options enabled by preprocessor definitions. For example, weak reference operations can be traced by compiling the `rts` package with the `-DGHCJS_TRACE_WEAK` cpp option. | ||||||
Review comment: example? Maybe show some of the CPP'd code where the debug option lives |
||||||
|
||||||
Currently, enabling the trace functionality requires rebuilding the `rts` package, whereas previously with GHCJS it was possible to enable the required tracing by just recompiling the final program. We will likely change this setup to include all tracing functionality in a debug RTS that is linked when using the `-debug` flag, with easily modifiable global settings to enable or disable specific tracing modules. | ||||||
Review comment: This paragraph should be the last one in the section because it is no longer talking about the technique; rather, it is talking about the usability of the technique and then concludes that making this technique more usable is on our roadmap. So the flow of the technique sections should be:
|
||||||
|
||||||
All the tracing uses the `h$log` function, which can easily be modified to redirect the output of the trace, for example by tracing only to an array (which can be watched by the JavaScript debugger) and keeping only the last `n` entries. | ||||||
Review comment: This is the key part. As a reader looking to debug my JS backend code, this is the part I'm most interested in. Thus you should add the examples that you allude to. That is, add an example that demonstrates |
||||||
|
||||||
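A minimal sketch of the redirection described above, assuming it is acceptable to replace the global `h$log` with our own function. The names `h$traceBuffer` and `h$traceBufferMax` are hypothetical and not part of the RTS; only `h$log` comes from the text.

```javascript
// Hypothetical names: h$traceBuffer and h$traceBufferMax are not part of the RTS.
var h$traceBuffer = [];        // watch this array in the JavaScript debugger
var h$traceBufferMax = 1000;   // keep only the last n = 1000 entries

// Replacement for h$log: store the message instead of printing it.
function h$log() {
  // Join all arguments into one line of text.
  h$traceBuffer.push(Array.prototype.join.call(arguments, " "));
  // Drop the oldest entry once the buffer is full.
  if (h$traceBuffer.length > h$traceBufferMax) {
    h$traceBuffer.shift();
  }
}
```

When the program stops at a breakpoint or an exception, the last entries of `h$traceBuffer` show the trace leading up to that point without flooding the console.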
## Dealing With Tail Calls | ||||||
Review comment: This section is an example of a use case for Technique 1. So label it as such:
Suggested change
|
||||||
|
||||||
All Haskell code is called from a main loop that looks as follows: | ||||||
|
||||||
```javascript | ||||||
while(!haveToYield(c)) { | ||||||
c = c(); | ||||||
c = c(); | ||||||
c = c(); | ||||||
... | ||||||
} | ||||||
``` | ||||||
|
||||||
The main loop keeps calling the function returned by the previous call, until the thread has to stop for some reason. This means that the JavaScript call stack isn't very useful for figuring out where something goes wrong in our code: it only contains function calls up to the main loop. If some `c` fails, we don't know much about which calls led up to the error condition! | ||||||
Review comment: I would start with this because it is the place the audience is at and a thing the audience will probably assume. So something like this:
|
||||||
|
||||||
However the main loop does give us a good opportunity to add some tracing: If we log each `c = c();` call (the function name and possibly the status of the Haskell stack and some relevant global variables) we can reconstruct more easily which conditions resulted in the error. The RTS provides the useful `h$logCall` and `h$logStack` helper functions for this. | ||||||
|
||||||
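The idea can be sketched with a toy main loop. Everything below is a hypothetical stand-in: `traceCall`, the `step*` functions and the toy `stack`/`sp` are made up for illustration, and the real helpers `h$logCall` and `h$logStack` are not used because their exact signatures are not shown here.

```javascript
// Toy model of the main loop with call tracing (all names are hypothetical).
var trace = [];
var stack = [];
var sp = -1;

function traceCall(c) {
  // Record the function name and the current stack depth.
  trace.push((c.name || "<anonymous>") + " sp=" + sp);
}

// Toy "Haskell" functions: each returns the next function to run,
// and null means the thread is finished.
function stepA() { stack[++sp] = stepC; return stepB; } // push a return point
function stepB() { return stack[sp--]; }                // return to it
function stepC() { return null; }

function mainLoop(c) {
  while (c !== null) {
    traceCall(c);   // log before every c = c() call
    c = c();
  }
}

mainLoop(stepA);
// trace is now ["stepA sp=-1", "stepB sp=0", "stepC sp=-1"]
```

Even though the JavaScript call stack only ever shows `mainLoop` and one callee, the trace records the full sequence of calls.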
Logging main loop calls generates a lot of output, even more so than tracing specific RTS features, so it's probably necessary to redirect and/or truncate the output of `h$log` here. | ||||||
Review comment: great! This means an example of doing that redirection in the first section is even better! |
||||||
|
||||||
It's often useful to make the `haveToYield` condition deterministic, by not taking wall clock time into account. This runs each thread until it blocks or finishes (`c === h$reschedule`). That makes runs reproducible, even if more than one Haskell thread is involved. | ||||||
Review comment: Perhaps show how to change |
||||||
|
||||||
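A hedged sketch of what a deterministic yield condition could look like. The actual `haveToYield` in the RTS is not shown in this post, so the `startTime` parameter, the `deterministic` switch and `timeSliceMs` are assumptions, and `h$reschedule` is defined here only as a stand-in for the RTS function of the same name.

```javascript
// Hypothetical sketch: the real yield condition in the RTS differs in detail.
var deterministic = true;   // assumed debug switch
var timeSliceMs = 20;       // assumed length of a time slice

// Stand-in for the RTS function that signals "thread blocked or finished".
function h$reschedule() { return h$reschedule; }

function haveToYield(c, startTime) {
  if (c === h$reschedule) return true;          // thread blocked or finished
  if (deterministic) return false;              // ignore wall clock time
  return Date.now() - startTime > timeSliceMs;  // normal time based preemption
}
```

With `deterministic` set, the only way out of the main loop is the thread itself blocking or finishing, so two runs interleave threads in the same order.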
## Data Corruption | ||||||
Review comment: use case |
||||||
|
||||||
Function call traces are useful if an error condition manifests itself relatively close to the initial problem. But what if our program crashes on some malformed data? Then we need to know what caused the data to be malformed in the first place! | ||||||
Review comment: This paragraph is missing a sentence to tie into the next section:
Suggested change
|
||||||
|
||||||
### Representation Checks | ||||||
Review comment:
Suggested change
|
||||||
|
||||||
One strategy is trying to catch the error earlier by introducing more checks to verify that the types of our data are what we expect. JavaScript is dynamically typed, so the browser will happily run our code, even if we use a `number` in a place where we'd normally use an `object`. | ||||||
|
||||||
When generating code however, we have a lot more knowledge. When we access data fields of a data constructor or closure, we know which type of data we expect. It's straightforward to modify the code generator to add a test after each field access. The file `verify.js` in the `rts` package has some helper functions for this. | ||||||
Review comment: Need to say what a representation check is: |
||||||
|
||||||
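As an illustration, a representation check could look roughly like the following. The helper `checkRep` is hypothetical and is not taken from `verify.js`; it only shows the shape of a test inserted after a field access.

```javascript
// Hypothetical helper, not the actual contents of verify.js.
function checkRep(value, expected, where) {
  var actual = value === null ? "null" : typeof value;
  if (actual !== expected) {
    throw new Error(where + ": expected " + expected + " but found " + actual);
  }
  return value;
}

// Generated code without a check would read:  var x = o.d1;
// Generated code with a check after each field access:
var o = { d1: 42, d2: { d1: "payload" } };
var x = checkRep(o.d1, "number", "f: o.d1");
var y = checkRep(o.d2, "object", "f: o.d2");
```

The check fails at the field access itself, which is usually much closer to the origin of the problem than the eventual crash.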
### Sequence Numbering | ||||||
Review comment:
Suggested change
|
||||||
|
||||||
Representation verification does not always help us find the origin of the problem. Sometimes we'd like to know where some heap object was allocated. We can do this by combining function call tracing with sequence numbers for allocation. After allocating a Haskell heap object we call a helper function that gives the object a unique sequence number. When we run into the error condition with the incorrect data, we inspect its sequence number and match it up with the function that produced it. | ||||||
|
||||||
|
||||||
Then in another run of the program we can step through the function that produced the wrong data using the JavaScript debugger. | ||||||
|
||||||
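A small sketch of allocation sequence numbers, under the assumption that the call tracing keeps track of the function currently running. All names (`tagAllocation`, `allocLog`, `currentFunction`, `seqNo`) are made up for this example.

```javascript
// Hypothetical helpers: none of these names exist in the RTS.
var allocSeqNo = 0;
var allocLog = [];               // allocLog[n] = function that allocated object n
var currentFunction = "<none>";  // maintained by the call tracing

function tagAllocation(o) {
  o.seqNo = allocSeqNo++;        // give the object a unique sequence number
  allocLog[o.seqNo] = currentFunction;
  return o;
}

// Toy generated code: each function records its name, then allocates.
function makeGood() { currentFunction = "makeGood"; return tagAllocation({ d1: 1 }); }
function makeBad()  { currentFunction = "makeBad";  return tagAllocation({ d1: "oops" }); }

// When we run into a malformed object, look up who allocated it.
function whoAllocated(o) { return allocLog[o.seqNo]; }

var a = makeGood();
var b = makeBad();
```

Here `whoAllocated(b)` points at `makeBad`, which is the function to step through in the next run.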
## Debugging Weak References | ||||||
Review comment: Use Case |
||||||
|
||||||
Sometimes we don't want to debug a problem where the data itself is wrong, but where the data is used at the wrong time. For the JavaScript backend this issue comes up with the storage manager that keeps track of weak references. | ||||||
|
||||||
The weak references garbage collector keeps track of every Haskell value that is still reachable from the Haskell runtime system. This means that if a Haskell value is considered to be unreachable by XXX XXX. | ||||||
Review comment: this section doesn't seem finished! |
||||||
|
||||||
## Bisection | ||||||
Review comment:
Suggested change
|
||||||
|
||||||
Sometimes we have a working reference implementation and an optimized implementation in which we want to fix some problem. If our optimization is a more efficient implementation of a specific primop, then we probably know where to look for the problem. But if our optimization is a rewrite pass of all code, things can get a lot more difficult. | ||||||
|
||||||
The JavaScript optimizer is such a rewrite pass. It takes JavaScript code and rewrites it to a more efficient and compact form. | ||||||
|
||||||
The pass looks conceptually like this: | ||||||
Review comment: great! |
||||||
|
||||||
```javascript | ||||||
// input | ||||||
f() { | ||||||
// original function body | ||||||
} | ||||||
|
||||||
// output | ||||||
f() { | ||||||
// optimized function body | ||||||
} | ||||||
``` | ||||||
|
||||||
After any change to the optimizer we have to recompile all libraries. If we don't know exactly where to make our changes, this can take a lot of time. | ||||||
|
||||||
It turned out to be very useful to narrow down the search by selectively enabling the optimizer on only part of the code: once we know which function is broken by the optimizer, we can easily run the optimizer on it separately to find out where it goes wrong. | ||||||
|
||||||
The trick to making this work effectively was keeping around both the optimized and original code for every function: | ||||||
|
||||||
```javascript | ||||||
// input | ||||||
f() { | ||||||
// original function body | ||||||
} | ||||||
|
||||||
// output | ||||||
f() { | ||||||
if(sequence_no_for_f < threshold) { | ||||||
// original function body | ||||||
} else { | ||||||
// optimized function body | ||||||
} | ||||||
} | ||||||
``` | ||||||
Each function gets its own sequence number, starting from zero. We adjust the threshold value that determines which functions run the optimized function body to quickly close in on where the optimized version gives different results. | ||||||
|
||||||
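Adjusting the threshold can be automated with a binary search. The sketch below assumes a hypothetical `runWithThreshold(t)` callback that runs the test program with the given threshold and returns `true` if the result is correct; it also assumes that running everything unoptimized passes and running everything optimized fails.

```javascript
// Hypothetical driver: runWithThreshold is supplied by the caller.
// Functions with a sequence number below the threshold run the original body.
function findBrokenFunction(functionCount, runWithThreshold) {
  var lo = 0;              // threshold 0: everything optimized, assumed to fail
  var hi = functionCount;  // threshold = count: nothing optimized, assumed to pass
  while (hi - lo > 1) {
    var mid = (lo + hi) >> 1;
    if (runWithThreshold(mid)) {
      hi = mid;  // functions >= mid are fine when optimized
    } else {
      lo = mid;  // some function >= mid is broken by the optimizer
    }
  }
  return lo;     // sequence number of the broken function
}

// Toy usage: pretend the optimizer breaks function number 37 out of 100.
var broken = 37;
var found = findBrokenFunction(100, function (threshold) {
  return broken < threshold;  // correct only if function 37 runs unoptimized
});
```

If the optimizer breaks several functions, this search converges on the one with the highest sequence number; fixing it and repeating the search finds the others.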
## Conclusion | ||||||
|
||||||
We have seen a few debugging strategies for code generated by the JavaScript backend. Most of them are a bit ad hoc and require modification of the compiler or the compiled code. Over time we will probably make more of them available through code generator flags and from a debugging version of the RTS. |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,102 @@ | ||||||
--- | ||||||
slug: 2023-02-28-weak-references-in-the-javascript-backend | ||||||
title: Weak References in the JavaScript backend | ||||||
date: February 28, 2023 | ||||||
authors: [ luite ] | ||||||
tags: [ghc, javascript, storagemanager ] | ||||||
--- | ||||||
|
||||||
## Introduction | ||||||
|
||||||
I recently gave a short presentation on the topic of weak references in the GHC JavaScript backend to the GHC team at IOG. This blog post is a summary of the content. | ||||||
|
||||||
## Haskell Weak References | ||||||
|
||||||
The "Stretching the Storage Manager" [ssm][1] paper describes weak references as implemented by GHC. These weak references are available through the `System.Mem.Weak` module. Each weak reference connects a key and a value. The value is kept alive by the weak reference as long as the key is alive. Optionally, weak references can have a finalizer of type `IO ()`, which is run after the key becomes unreachable. | ||||||
|
||||||
## JavaScript Weak References | ||||||
|
||||||
JavaScript has weak references of its own, specifically the `WeakMap`. But the functionality is quite different from Haskell's: a `WeakMap` is not iterable, its size is not visible and it has no finalizers. Therefore it's impossible to observe when a weak value has become unreachable. | ||||||
|
||||||
There have been proposals to add finalizers to `WeakMap` but so far they haven't been implemented because they introduce nondeterminism and expose reachability information which could impact security. | ||||||
|
||||||
Specific JavaScript environments like node.js do have weak references with the required functionality to implement Haskell `Weak#`. On these platforms we could substitute the general purpose `Weak#` implementation, and we could verify consistency between the general purpose implementation and a node.js specific one. | ||||||
|
||||||
## Checking Reachability | ||||||
|
||||||
Since we don't have a way to determine which `Weak#` keys have become unreachable, we have to do the opposite: check which values are still reachable. The general idea is as follows: every Haskell heap object gets a mark property `m`, which is changed by the reachability checker. | ||||||
Review comment: What is the set of values that |
||||||
|
||||||
After scanning the whole heap we can determine which `Weak#` keys are still reachable by checking if their mark has been updated. | ||||||
Review comment: when does a scan occur? |
||||||
|
||||||
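The general idea can be sketched as a simple marking pass. This is an illustration under assumptions, not the RTS implementation: here the mark is a counter that is incremented on every scan, and following the `d1` and `d2` properties is assumed to be enough to reach all children.

```javascript
// Sketch under assumptions: the actual mark values and traversal differ.
var currentMark = 0;

function markReachable(roots) {
  currentMark++;                 // a new scan uses a fresh mark value
  var work = roots.slice();
  while (work.length > 0) {
    var o = work.pop();
    // Skip non-objects and objects already visited in this scan.
    if (o === null || typeof o !== "object" || o.m === currentMark) continue;
    o.m = currentMark;           // reachable in this scan
    work.push(o.d1, o.d2);       // follow the data properties
  }
}

function isReachable(o) { return o.m === currentMark; }

var shared = { f: null, d1: 1, d2: null, m: 0 };
var root   = { f: null, d1: shared, d2: null, m: 0 };
var orphan = { f: null, d1: shared, d2: null, m: 0 };
markReachable([root]);
```

After the scan, `root` and `shared` carry the current mark while `orphan` still has a stale one, so a `Weak#` whose key is `orphan` can be considered dead.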
## Weak Implementation | ||||||
|
||||||
All Haskell heap objects have an identical object structure, with an entry function and some data properties. The entry function also contains metadata about the object, for example the constructor tag for data constructors and the arity for functions. | ||||||
|
||||||
```javascript | ||||||
// heap object (incomplete) | ||||||
{ f // function, entry point | ||||||
, d1 // any, first data property | ||||||
, d2 // any, second data property (or indirection to more data) | ||||||
} | ||||||
``` | ||||||
|
||||||
To be able to keep track of reachability, we add one property `m` to each object: | ||||||
|
||||||
```javascript | ||||||
// heap object (incomplete) | ||||||
{ f // function, entry point | ||||||
, d1 // any, first data property | ||||||
, d2 // any, second data property (or indirection to more data) | ||||||
, m // number, garbage collection mark | ||||||
} | ||||||
``` | ||||||
Review comment: aha! It's a number! But why is it a number? If it's 2, does that mean there are two things pointing to it? |
||||||
|
||||||
The mark gets updated by the code that checks for reachability of everything. This means that we could implement a `Weak#` as follows: | ||||||
|
||||||
```javascript | ||||||
// weak (not actual) | ||||||
h$Weak { | ||||||
key: heap object | ||||||
, value: heap object | ||||||
, finalizer: null or heap object | ||||||
} | ||||||
``` | ||||||
Review comment on lines +54 to +63: great! Much better examples in this one |
||||||
|
||||||
But this means that our `h$Weak` keeps both the key and the value alive. For the operations that `Weak#` needs to support, this isn't necessary, and in fact we'd like to avoid it so that the JavaScript storage manager can reclaim memory as quickly as possible. That's why we make another change to the heap objects, adding an optional indirection to the mark: | ||||||
Review comment:
Suggested change
|
||||||
|
||||||
```javascript | ||||||
// heap object (actual) | ||||||
{ f // function, entry point | ||||||
, d1 // any, first data property | ||||||
, d2 // any, second data property (or indirection to more data) | ||||||
, m // number/h$StableName, garbage collection mark | ||||||
} | ||||||
|
||||||
h$StableName { | ||||||
stableNameNo: number, unique identifier | ||||||
, m : number, garbage collection mark | ||||||
} | ||||||
``` | ||||||
|
||||||
Now we can replace a `number` mark by an `h$StableName` for the key, and then create the weak reference as follows: | ||||||
Review comment: You should provide a link to the documentation of StableName and Weak. Also you should mention that:
Hence StableNames are a perfect proxy to know if a heap object is reachable without keeping the actual object alive. This is exactly what we need for the key of Weak. |
||||||
|
||||||
```javascript | ||||||
// weak (actual) | ||||||
h$Weak { | ||||||
key: h$StableName, the stablename of the key heap object | ||||||
, value: heap object | ||||||
, finalizer: null or heap object | ||||||
} | ||||||
``` | ||||||
This way the `h$Weak` does not reference the key itself. It still knows when the key is unreachable, since the mark of the `h$StableName` of the key would not be updated anymore. | ||||||
Review comment: great explanation.
Review comment:
Suggested change
|
||||||
|
||||||
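The indirection can be illustrated with a small model. The object layouts follow the text, but the helper names (`makeStableName`, `markObject`, `makeWeak`, `keyIsAlive`) and the counter-based mark are assumptions made for this sketch.

```javascript
// Sketch with made-up helper names; only the object layouts follow the text.
var nextStableNameNo = 0;
var currentMark = 1;

function makeStableName(o) {
  // Replace a numeric mark by an h$StableName-like object, once per object.
  if (typeof o.m === "number") {
    o.m = { stableNameNo: nextStableNameNo++, m: o.m };
  }
  return o.m;
}

function markObject(o) {
  // The scan writes the mark through the indirection if there is one.
  if (typeof o.m === "number") o.m = currentMark;
  else o.m.m = currentMark;
}

function makeWeak(key, value, finalizer) {
  // The weak reference stores the stable name, not the key itself.
  return { key: makeStableName(key), value: value, finalizer: finalizer };
}

function keyIsAlive(weak) { return weak.key.m === currentMark; }

var key = { d1: "key", m: 0 };
var weak = makeWeak(key, { d1: "value", m: 0 }, null);

markObject(key);                         // scan 1: the key is reached
var aliveAfterScan1 = keyIsAlive(weak);  // true

currentMark++;                           // scan 2: the key is not reached
var aliveAfterScan2 = keyIsAlive(weak);  // false
```

The `weak` object never points at `key`, so once the rest of the program drops the key, the JavaScript engine is free to collect it.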
## Finalizers | ||||||
|
||||||
Every time the heap is scanned for dead weak references, the associated finalizers are collected. After the pass, if at least one finalizer needs to be run, the storage manager schedules a new thread. This thread runs all the finalizers of the pass. Exceptions are handled between finalizers, but a finalizer that takes a long time will delay execution of the others. | ||||||
|
||||||
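A rough model of this behaviour, with hypothetical names. Finalizers are plain JavaScript functions here, whereas in the RTS they are Haskell `IO ()` actions run in a newly scheduled Haskell thread.

```javascript
// Sketch with hypothetical names: collectFinalizers and runFinalizers
// are not RTS functions.
function collectFinalizers(weaks, isKeyAlive) {
  var finalizers = [];
  for (var i = 0; i < weaks.length; i++) {
    var w = weaks[i];
    if (!isKeyAlive(w) && w.finalizer !== null) {
      finalizers.push(w.finalizer);
      w.finalizer = null;   // each finalizer runs at most once
    }
  }
  return finalizers;
}

function runFinalizers(finalizers, onError) {
  // One "thread" runs all finalizers of the pass, one after another.
  for (var i = 0; i < finalizers.length; i++) {
    try {
      finalizers[i]();
    } catch (e) {
      onError(e);           // an exception does not stop the other finalizers
    }
  }
}
```

Because the finalizers run sequentially, a slow finalizer delays all the ones collected after it in the same pass, which matches the caveat above.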
## Conclusion | ||||||
|
||||||
We have seen the implementation of weak references in the JavaScript backend. Since we cannot use the JavaScript engine to determine which Haskell heap objects are reachable, we use a custom reachability check to implement the required functionality. We have chosen the implementation in such a way that the JavaScript engine retains as little memory as possible. | ||||||
|
||||||
|
||||||
[1]: Peyton Jones, Simon and Marlow, Simon and Elliott, Conal, Stretching the storage manager: weak pointers and stable names in Haskell, Proceedings of the 11th International Workshop on the Implementation of Functional Languages, 1999, https://www.microsoft.com/en-us/research/publication/stretching-the-storage-manager-weak-pointers-and-stable-names-in-haskell/ |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,147 @@ | ||||||
--- | ||||||
slug: 2023-02-28-stacks-in-the-js-backend | ||||||
title: Stacks in the JavaScript Backend | ||||||
date: February 28, 2023 | ||||||
authors: [ luite ] | ||||||
tags: [ghc, javascript, threads, rts ] | ||||||
--- | ||||||
|
||||||
## Introduction | ||||||
|
||||||
I recently gave a short presentation on the topic of stacks in the JavaScript backend to the GHC team at IOG. This blog post is a summary of the content. | ||||||
|
||||||
## Haskell Lightweight Stacks | ||||||
|
||||||
In the context of a program produced by the GHC JavaScript backend, two different types of stack exist: the JavaScript call stack and Haskell lightweight thread stacks. This blog post deals with the latter. | ||||||
Review comment:
Suggested change
In general it's an anti-pattern to say
Review comment: Good point. I've certainly been guilty of using former/latter too. |
||||||
|
||||||
Each Haskell thread has a thread state object, `t` of type `h$Thread`. This object stores the state of a lightweight thread, for example whether the thread is finished or whether asynchronous exceptions are ignored. It also contains `t.stack`, an array representing the stack and `t.sp`, a number pointing to the current top of the stack. | ||||||
Review comment: Imagine ordering some food from a cafe. It's clearer to the barista if you start at a high level and then refine the information: I'm ordering 3 things: 2 drinks and a pastry. The drinks are both lattes, same size, but one with oat milk, the other regular... and so on. The point is to first give a high level view. So how about:
Suggested change
|
||||||
|
||||||
`t.stack` grows dynamically as needed, and is occasionally shrunk to reclaim memory. | ||||||
Review comment: Merge this line in with the previous paragraph. It's an orphan and is continuing the topic of the previous paragraph. Thus it belongs in that paragraph. |
||||||
|
||||||
When a thread is created, the stack is initialized with some values: | ||||||
Review comment:
Suggested change
These aren't just any values, they are particular and key values to the correctness of the implementation. |
||||||
|
||||||
```javascript | ||||||
/** @constructor */ | ||||||
function h$Thread() { | ||||||
this.tid = ++h$threadIdN; | ||||||
this.status = THREAD_RUNNING; | ||||||
this.stack = [h$done | ||||||
, 0 | ||||||
, h$baseZCGHCziConcziSynczireportError | ||||||
, h$catch_e | ||||||
]; | ||||||
this.sp = 3; | ||||||
this.mask = 0; // async exceptions masked (0 unmasked, 1: uninterruptible, 2: interruptible) | ||||||
this.interruptible = false; // currently in an interruptible operation | ||||||
... | ||||||
} | ||||||
``` | ||||||
|
||||||
The initial stack contains two stack frames. The top three slots contain a `catch` frame with the `h$catch_e` header, the `h$baseZCGHCziConcziSynczireportError` exception handler and `0` for the mask state. The remaining slot at the bottom of the stack is for the `h$done` frame, which only has a header and no payload. | ||||||
Review comment: How could I tell from looking at the example that the stack contains two frames? I don't think I can, so maybe say that. Other than that this is good because you go from high level and then refine like I suggested above. |
||||||
|
||||||
## Scheduling a Thread | ||||||
|
||||||
Typical Haskell code does a lot of manipulation of values on the stack. It would be quite inefficient to do all of this through the thread state object of the current thread, `h$currentThread`. That's why the stack `h$stack` and the "stack pointer" `h$sp` to the top of the stack are global variables that are initialized when a thread is scheduled: | ||||||
|
||||||
```javascript | ||||||
// scheduling a thread t | ||||||
h$currentThread = t; | ||||||
h$stack = t.stack; | ||||||
h$sp = t.sp; | ||||||
``` | ||||||
Review comment on lines +46 to +50: explain the examples. Perhaps like this: |
||||||
|
||||||
When a thread is suspended, the values are saved back to the thread state object: | ||||||
Review comment: This is funny, but I was reading through the history of Chez Scheme: https://legacy.cs.indiana.edu/~dyb/pubs/hocs.pdf and this is exactly one of the ways they implemented continuations in an early version of Chez (see Section 3, paragraph 3 beginning with |
||||||
|
||||||
```javascript | ||||||
// suspending a thread t | ||||||
t.stack = h$stack; | ||||||
t.sp = h$sp; | ||||||
h$currentThread = null; | ||||||
``` | ||||||
Review comment on lines +55 to +59: explain this example |
||||||
|
||||||
## Stack Frames | ||||||
Review comment: aha! Perhaps link or reference this earlier at the place I made a comment about stack frames |
||||||
|
||||||
Each stack frame starts with a header, which is a JavaScript function. The header is followed by zero or more slots of payload, which can be arbitrary JavaScript values. | ||||||
Review comment: Perhaps mention that in JavaScript functions can have properties and that we use this feature to indicate the number of stack slots for the frame payload.
Review comment: Ah, I see that you mention this later. I would put this here to first explain the structure of the frame and then how we use it. |
||||||
|
||||||
The header serves as the "return point": when some code is done reducing a value to weak-head normal form, it returns this value to the next stack frame by storing it in `h$r1` (or more such variables for large values or unboxed tuples), popping its own stack frame, and calling the header of the next stack frame at `h$stack[h$sp]`. | ||||||
Review comment: explain that
Review comment: I would write something like this to introduce the topic first: GHC's calling convention for functions generated from STG is to always perform tail-calls, where the tail-call target is a continuation. In detail, what happens in this case is:
Here is an annotated code example of this process: ... |
||||||
|
||||||
An example is shown below. | ||||||
|
||||||
```javascript | ||||||
function h$stackFrame_e() { | ||||||
... | ||||||
h$r1 = somethingWHNF; | ||||||
h$sp -= 3; // pop current frame | ||||||
return h$stack[h$sp]; // return to next frame | ||||||
} | ||||||
``` | ||||||
Review comment on lines +71 to +76: explain the example in a paragraph that immediately follows the example |
||||||
|
||||||
The header also contains metadata, stored in properties of the function object. Of particular interest is the `size` property, which contains the size of the stack frame in slots. Certain operations, like throwing exceptions or restarting STM transactions need to know the size of each stack frame to be able to "unwind" the stack. | ||||||
|
||||||
Almost all stack frames have their size stored in the `size` property of the header. An exception is the `h$ap_gen` frame, which contains a function application of arbitrary size. This frame type does not have a fixed size, and the size is stored in the payload of the frame itself. Frames `f` with the size stored in the payload of the frame have `f.size < 0`. | ||||||
|
||||||
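To make the role of the `size` metadata concrete, here is a toy model that lists the frames on a stack. The frame headers `done_e`, `catch_e` and `varSize_e` are hypothetical. Following the `h$stackFrameSize` snippet in this post, the low byte of `size` is taken to be the number of payload slots, and a negative `size` means that the total frame size is stored in the slot just below the header.

```javascript
// Toy model with hypothetical frame headers.
function done_e() {}
done_e.size = 0;      // header only
function catch_e() {}
catch_e.size = 2;     // two payload slots
function varSize_e() {}
varSize_e.size = -1;  // size is stored in the payload

function frameSize(stack, sp) {
  var f = stack[sp];
  return f.size < 0 ? stack[sp - 1] : (f.size & 0xff) + 1;
}

function listFrames(stack, sp) {
  var frames = [];
  while (sp >= 0) {
    var n = frameSize(stack, sp);
    frames.push(stack[sp].name + ":" + n);
    sp -= n;          // move to the header of the next frame
  }
  return frames;
}

//            done    catch frame               variable size frame (4 slots)
var stack = [ done_e, 0, "handler", catch_e,    "arg1", "arg2", 4, varSize_e ];
var frames = listFrames(stack, stack.length - 1);
// frames is ["varSize_e:4", "catch_e:3", "done_e:1"]
```

Without the `size` metadata there would be no way to tell where one frame ends and the next begins, since payload slots can hold arbitrary values.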
## Exception Handling | ||||||
Review comment: This section was very good. The examples need to be explained more, but other than that I thought it was very nice and clear. Good job! |
||||||
|
||||||
During normal execution of a program, the code that manipulates the stack has knowledge of the specific stack frame it's working with: It knows which values are stored in each stack slot. However there are also operations that require dealing with all kinds of unknown stack frames. Exceptions and STM are the most important ones. | ||||||
|
||||||
Haskell allows exceptions to be thrown within threads and between threads as an alternate way of returning a value. The throw operation transfers control to the exception handler in the next `catch` frame on the stack. | ||||||
|
||||||
The `catch` frame has two words of payload: | ||||||
|
||||||
```javascript | ||||||
0 // mask status | ||||||
h$baseZCGHCziConcziSynczireportError // handler | ||||||
h$catch_e // header | ||||||
``` | ||||||
|
||||||
The code for the header is straightforward: it just pops the stack frame and returns to the next frame. This is what happens if no exception has occurred; the code just skips past the exception handler: | ||||||
|
||||||
```javascript | ||||||
function h$catch_e() { | ||||||
h$sp -= 3; | ||||||
return h$stack[h$sp]; | ||||||
}; | ||||||
``` | ||||||
|
||||||
An exception is thrown by the `h$throw` function, which unwinds the stack. Its implementation in simplified form looks like this: | ||||||
|
||||||
```javascript | ||||||
function h$throw(e, async) { | ||||||
... | ||||||
while(h$sp > 0) { | ||||||
f = h$stack[h$sp]; | ||||||
... | ||||||
if(f === h$catch_e) break; | ||||||
if(f === h$atomically_e) { ... } | ||||||
if(f === h$catchStm_e && !async) break; | ||||||
if(f === h$upd_frame) { /* handle black hole */ } | ||||||
h$sp -= h$stackFrameSize(f); | ||||||
} | ||||||
if(h$sp > 0) { | ||||||
var maskStatus = h$stack[h$sp - 2]; | ||||||
var handler = h$stack[h$sp - 1]; | ||||||
... | ||||||
} | ||||||
/* jump to handler */ | ||||||
} | ||||||
``` | ||||||
|
||||||
`h$throw` keeps removing stack frames from the stack until some frame of interest is found. Eventually it transfers control to an exception handler or it reports an error if no exception handling frame could be found. `h$throw` uses the `h$stackFrameSize` helper function to determine the size of each frame. | ||||||
|
||||||
```javascript | ||||||
function h$stackFrameSize(f) { | ||||||
if(f === h$ap_gen) { | ||||||
return (h$stack[h$sp - 1] >> 8) + 2; | ||||||
} else { | ||||||
var tag = f.size; | ||||||
if(tag < 0) { | ||||||
return h$stack[h$sp-1]; | ||||||
} else { | ||||||
return (tag & 0xff) + 1; | ||||||
} | ||||||
} | ||||||
} | ||||||
``` | ||||||
|
||||||
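The unwinding can be tried out on a toy stack. The sketch below is a simplified model and not the RTS code: the frame headers are hypothetical, every frame has a fixed `size` in payload slots, and STM frames, update frames and asynchronous exceptions are left out.

```javascript
// Toy model of unwinding to the nearest catch frame (hypothetical headers).
function done_e() {}
done_e.size = 0;
function catch_e() {}
catch_e.size = 2;
function work_e() {}
work_e.size = 1;

// Returns the handler and mask of the nearest catch frame together with the
// stack pointer after popping that frame, or null if there is no catch frame.
function unwindToCatch(stack, sp) {
  while (sp > 0) {
    var f = stack[sp];
    if (f === catch_e) {
      return { handler: stack[sp - 1], mask: stack[sp - 2], sp: sp - 3 };
    }
    sp -= f.size + 1;  // skip the header plus its payload
  }
  return null;
}

var stack = [ done_e, 0, "outerHandler", catch_e, "x", work_e, "y", work_e ];
var result = unwindToCatch(stack, stack.length - 1);
// result.handler is "outerHandler", result.mask is 0, result.sp is 0
```

The two `work_e` frames are discarded without running their code, and control would then be transferred to the handler with the stack pointer at the `done_e` frame.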
## Conclusion | ||||||
|
||||||
We have seen that stacks in the JavaScript backend are represented by JavaScript arrays. The contents of the stack consist of stack frames, each with a header and a payload. The header of each stack frame contains some metadata so that exception handling code can traverse the stack and transfer control to an exception handler. | ||||||
Review comment:
Suggested change
|
Review comment: In the introduction you state: "... presents a couple of practical techniques ..."; perhaps each technique should be labelled as such in the section header: