Commit f6f52a2

Update the internals documentation (jerryscript-project#2923)
JerryScript-DCO-1.0-Signed-off-by: Robert Fancsik [email protected]
1 parent de71fe4 commit f6f52a2

3 files changed: +18 -17 lines changed

docs/04.INTERNALS.md

+18-17
@@ -23,7 +23,7 @@ Expression parser is responsible for parsing JavaScript expressions. It is imple
2323

2424
JavaScript statements are parsed by this component. It uses the [Expression parser](#expression-parser) to parse the constituent expressions. The implementation of Statement parser is located in `./jerry-core/parser/js/js-parser-statm.c`.
2525

26-
Function `parser_parse_source` carries out the parsing and compiling of the input EcmaScript source code. When a function appears in the source `parser_parse_source` calls `parser_parse_function` which is responsible for processing the source code of functions recursively including argument parsing and context handling. After the parsing, function `parser_post_processing` dumps the created opcodes and returns an `ecma_compiled_code_t*` that points to the compiled bytecode sequence.
26+
Function `parser_parse_source` carries out the parsing and compiling of the input ECMAScript source code. When a function appears in the source `parser_parse_source` calls `parser_parse_function` which is responsible for processing the source code of functions recursively including argument parsing and context handling. After the parsing, function `parser_post_processing` dumps the created opcodes and returns an `ecma_compiled_code_t*` that points to the compiled bytecode sequence.
2727

2828
The interactions between the major components shown on the following figure.
2929

@@ -33,19 +33,19 @@ The interactions between the major components shown on the following figure.
3333

3434
This section describes the compact byte-code (CBC) representation. The key focus is reducing memory consumption of the byte-code representation without sacrificing considerable performance. Other byte-code representations often focus on performance only so inventing this representation is an original research.
3535

36-
CBC is a CISC like instruction set which assigns shorter instructions for frequent operations. Many instructions represent multiple atomic tasks which reduces the byte code size. This technique is basically a data compression method.
36+
CBC is a CISC like instruction set which assigns shorter instructions for frequent operations. Many instructions represent multiple atomic tasks which reduces the bytecode size. This technique is basically a data compression method.
3737

3838
## Compiled Code Format
3939

40-
The memory layout of the compiled byte code is the following.
40+
The memory layout of the compiled bytecode is the following.
4141

4242
![CBC layout](img/CBC_layout.png)
4343

4444
The header is a `cbc_compiled_code` structure with several fields. These fields contain the key properties of the compiled code.
4545

46-
The literals part is an array of ecma values. These values can contain any EcmaScript value types, e.g. strings, numbers, function and regexp templates. The number of literals is stored in the `literal_end` field of the header.
46+
The literals part is an array of ecma values. These values can contain any ECMAScript value types, e.g. strings, numbers, functions and regexp templates. The number of literals is stored in the `literal_end` field of the header.
4747

48-
CBC instruction list is a sequence of byte code instructions which represents the compiled code.
48+
CBC instruction list is a sequence of bytecode instructions which represents the compiled code.
4949

5050
## Byte-code Format
5151

@@ -55,15 +55,15 @@ The memory layout of a byte-code is the following:
5555

5656
Each byte-code starts with an opcode. The opcode is one byte long for frequent and two byte long for rare instructions. The first byte of the rare instructions is always zero (`CBC_EXT_OPCODE`), and the second byte represents the extended opcode. The name of common and rare instructions start with `CBC_` and `CBC_EXT_` prefix respectively.
5757

58-
The maximum number of opcodes is 511, since 255 common (zero value excluded) and 256 rare instructions can be defined. Currently around 230 frequent and 120 rare instructions are available.
58+
The maximum number of opcodes is 511, since 255 common (zero value excluded) and 256 rare instructions can be defined. Currently around 215 frequent and 70 rare instructions are available.
5959

6060
There are three types of bytecode arguments in CBC:
6161

6262
* __byte argument__: A value between 0 and 255, which often represents the argument count of call like opcodes (function call, new, eval, etc.).
6363

6464
* __literal argument__: An integer index which is greater or equal than zero and less than the `literal_end` field of the header. For further information see next section Literals (next).
6565

66-
* __relative branch__: An 1-3 byte long offset. The branch argument might also represent the end of an instruction range. For example the branch argument of `CBC_EXT_WITH_CREATE_CONTEXT` shows the end of a `with` statement. More precisely the position after the last instruction.
66+
* __relative branch__: An 1-3 byte long offset. The branch argument might also represent the end of an instruction range. For example the branch argument of `CBC_EXT_WITH_CREATE_CONTEXT` shows the end of a `with` statement. More precisely the position after the last instruction in the with clause.
6767

6868
Argument combinations are limited to the following seven forms:
6969
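
The opcode encoding described in this hunk (one byte for frequent instructions, two bytes for rare ones whose first byte is the zero-valued `CBC_EXT_OPCODE`) can be sketched in C as follows. This is only an illustration of the text, not the engine's decoder: `sketch_opcode_t` and `sketch_decode_opcode` are hypothetical names, and the only value taken from the documentation is that `CBC_EXT_OPCODE` is zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per the documentation, the first byte of a rare instruction is zero. */
#define CBC_EXT_OPCODE 0x00u

/* Hypothetical result type for this sketch; not a JerryScript structure. */
typedef struct
{
  int is_extended; /* 1 for a two-byte (rare) opcode, 0 for a one-byte one */
  uint8_t opcode;  /* the common opcode, or the extended opcode byte */
  size_t length;   /* number of opcode bytes consumed: 1 or 2 */
} sketch_opcode_t;

/* Hypothetical helper: decodes the opcode at the start of `code`. */
static sketch_opcode_t
sketch_decode_opcode (const uint8_t *code)
{
  assert (code != NULL);

  sketch_opcode_t result;

  if (code[0] == CBC_EXT_OPCODE)
  {
    /* Rare instruction: the second byte carries the extended opcode. */
    result.is_extended = 1;
    result.opcode = code[1];
    result.length = 2;
  }
  else
  {
    /* Frequent instruction: the single byte is the opcode itself. */
    result.is_extended = 0;
    result.opcode = code[0];
    result.length = 1;
  }

  return result;
}
```

The limit of 511 follows directly from this scheme: 255 non-zero values are left for the first byte, and all 256 values are usable for the byte that follows `CBC_EXT_OPCODE`.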

@@ -137,12 +137,12 @@ Byte-codes of this category serve for placing objects onto the stack. As there a
137137

138138
<span class="CSSTableGenerator" markdown="block">
139139

140-
| byte-code | description |
141-
| --------------------- | ---------------------------------------------------- |
142-
| CBC_PUSH_LITERAL | Pushes the value of the given literal argument. |
143-
| CBC_PUSH_TWO_LITERALS | Pushes the value of the given two literal arguments. |
144-
| CBC_PUSH_UNDEFINED | Pushes an undefined value. |
145-
| CBC_PUSH_TRUE | Pushes a logical true. |
140+
| byte-code | description |
141+
| --------------------- | ----------------------------------------------------- |
142+
| CBC_PUSH_LITERAL | Pushes the value of the given literal argument. |
143+
| CBC_PUSH_TWO_LITERALS | Pushes the values of the given two literal arguments. |
144+
| CBC_PUSH_UNDEFINED | Pushes an undefined value. |
145+
| CBC_PUSH_TRUE | Pushes a logical true. |
146146
| CBC_PUSH_PROP_LITERAL | Pushes a property whose base object is popped from the stack, and the property name is passed as a literal argument. |
147147

148148
</span>
@@ -196,7 +196,7 @@ Branch byte-codes are used to perform conditional and unconditional jumps in the
196196
| CBC_JUMP_BACKWARD | Jumps backward by the 1 byte long relative offset argument. |
197197
| CBC_JUMP_BACKWARD_2 | Jumps backward by the 2 byte long relative offset argument. |
198198
| CBC_JUMP_BACKWARD_3 | Jumps backward by the 3 byte long relative offset argument. |
199-
| CBC_BRANCH_IF_TRUE_FORWARD | Jumps if the value on the top of the stack is true by the 1 byte long relative offset argument. |
199+
| CBC_BRANCH_IF_TRUE_FORWARD | Jumps forward if the value on the top of the stack is true by the 1 byte long relative offset argument. |
200200

201201
</span>
202202

@@ -219,12 +219,14 @@ ECMA component of the engine is responsible for the following notions:
219219

220220
## Data Representation
221221

222-
The major structure for data representation is `ECMA_value`. The lower two bits of this structure encode value tag, which determines the type of the value:
222+
The major structure for data representation is `ECMA_value`. The lower three bits of this structure encode value tag, which determines the type of the value:
223223

224224
* simple
225225
* number
226226
* string
227227
* object
228+
* symbol
229+
* error
228230

229231
![ECMA value representation](img/ecma_value.png)
230232
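
A minimal C sketch of reading a tag from the lower three bits of a value, as this hunk describes. Everything named `sketch_*` or `SKETCH_*` is hypothetical: the 32-bit width, the numeric tag assignments, and the choice to keep the payload in the upper bits are assumptions made for illustration, not the engine's actual layout.

```c
#include <assert.h>
#include <stdint.h>

/* The documentation states that the lower three bits hold the tag. */
#define SKETCH_TAG_BITS 3u
#define SKETCH_TAG_MASK ((1u << SKETCH_TAG_BITS) - 1u)

/* Assumed 32-bit value type for this sketch. */
typedef uint32_t sketch_value_t;

/* Hypothetical tag numbering; the real assignments may differ. */
enum
{
  SKETCH_TAG_SIMPLE = 0,
  SKETCH_TAG_NUMBER = 1,
  SKETCH_TAG_STRING = 2,
  SKETCH_TAG_OBJECT = 3,
  SKETCH_TAG_SYMBOL = 4,
  SKETCH_TAG_ERROR = 5
};

/* Combines a payload and a tag into one value (assumed layout). */
static sketch_value_t
sketch_make_value (uint32_t payload, uint32_t tag)
{
  assert (tag <= SKETCH_TAG_MASK);
  assert (payload <= (UINT32_MAX >> SKETCH_TAG_BITS));
  return (payload << SKETCH_TAG_BITS) | tag;
}

/* Extracts the tag stored in the lower three bits. */
static uint32_t
sketch_get_tag (sketch_value_t value)
{
  return value & SKETCH_TAG_MASK;
}

/* Extracts the remaining upper bits. */
static uint32_t
sketch_get_payload (sketch_value_t value)
{
  return value >> SKETCH_TAG_BITS;
}
```

Six type tags are listed after this change, and two bits can distinguish only four, which is consistent with the tag width growing to three bits.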

@@ -275,7 +277,6 @@ The objects are represented as following structure:
275277

276278
* Reference counter - number of hard (non-property) references
277279
* Next object pointer for the garbage collector
278-
* GC's visited flag
279280
* type (function object, lexical environment, etc.)
280281

281282
### Properties of Objects
@@ -323,7 +324,7 @@ Collections are array-like data structures, which are optimized to save memory.
323324

324325
### Exception Handling
325326

326-
In order to implement a sense of exception handling, the return values of JerryScript functions are able to indicate their faulty or "exceptional" operation. The return values are actually ECMA values (see section [Data Representation](#data-representation)) in which the error bit is set if an erroneous operation is occurred.
327+
In order to implement a sense of exception handling, the return values of JerryScript functions are able to indicate their faulty or "exceptional" operation. The return values are ECMA values (see section [Data Representation](#data-representation)) and if an erroneous operation occurred the ECMA_VALUE_ERROR simple value is returned.
327328
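
The error-reporting convention in the updated paragraph (a function returns a dedicated error value instead of setting an error bit) can be sketched as below. This is a hedged illustration only: `SKETCH_VALUE_ERROR` stands in for `ECMA_VALUE_ERROR` with an arbitrary placeholder constant, and `sketch_divide`, `sketch_is_error` and `sketch_sum_of_quotients` are hypothetical helpers, not JerryScript functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 32-bit value type for this sketch. */
typedef uint32_t sketch_value_t;

/* Placeholder stand-in for ECMA_VALUE_ERROR; the real constant differs. */
#define SKETCH_VALUE_ERROR ((sketch_value_t) 0xFFFFFFFFu)

/* Returns 1 if the value is the dedicated error marker. */
static int
sketch_is_error (sketch_value_t value)
{
  return value == SKETCH_VALUE_ERROR;
}

/* Hypothetical operation that can fail: integer division.
 * The 16-bit inputs keep every normal result below the error marker. */
static sketch_value_t
sketch_divide (uint16_t dividend, uint16_t divisor)
{
  if (divisor == 0)
  {
    return SKETCH_VALUE_ERROR; /* signal the "exceptional" outcome */
  }

  return (sketch_value_t) (dividend / divisor);
}

/* A caller checks each return value and propagates the error unchanged. */
static sketch_value_t
sketch_sum_of_quotients (uint16_t a, uint16_t b, uint16_t c, uint16_t d)
{
  sketch_value_t first = sketch_divide (a, b);
  if (sketch_is_error (first))
  {
    return first;
  }

  sketch_value_t second = sketch_divide (c, d);
  if (sketch_is_error (second))
  {
    return second;
  }

  /* A normal result must never collide with the error marker. */
  assert (!sketch_is_error (first + second));
  return first + second;
}
```

The point of the pattern is that every caller tests the returned value before using it, so an error raised deep in a call chain travels back up without a separate out-parameter.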

328329
### Value Management and Ownership
329330

docs/img/ecma_object.png

2.51 KB

docs/img/ecma_value.png

-849 Bytes
Loading

0 commit comments

Comments
 (0)