Skip to content

Latest commit

 

History

History
195 lines (144 loc) · 6.65 KB

IMPLEMENTATION.md

File metadata and controls

195 lines (144 loc) · 6.65 KB

Theory

In general, a macro-based implementation of this sort of program structure requires generating lables (and branch destinations) with two levels of uniqueness. One level of uniqueness is so that multiple structured statements do not have their lables collide. The second level handles nesting.

The local label capability built into gas allows the collision problem to be ignored entirely. We can use non-unique lables for the else_label and endif_lable, and in fact use the same local label name for each elseif label, since gas has the ability to understand "jump forward to the next instance of local label 0." Similarly for looping, "jump backward to the previous instance of local label 1" can be repeated.

This leaves nesting, which is relative easily handled with a counter. These macros use a single nesting counter, which is incremented by two for each "begin" macro ("if" or "do") This gives us two usable local labels for each structure. For conditional statements, the two labels are used for the end of full statement (endif label), and for the next else label (else labelN.) For loops, we use one label for the start of the loop, and one for (past) the end.)

There are a couple of symbols and support macros used internally by this package. Theoretically, all symbols are prefixed with "ST", and all support macros are prefixed with "st", so as to reduce the likelyhood of collisions with other macro packages or user code.

Here are some examples of how things work:

Generated Code Examples

_if/_endif

   cmp r1, r2                    cmp r1, r2
   _if z			     jnz 0f
     call args_equal	       call args_equal
 call statprint		       call statprint
   _endif			    0: 1:

_if/_else/_endif

   cmp r1, r2		     cmp r1, r2
   _if z			     jnz 0f
     call args_equal	       call args_equal
   _else			     0: rjmp 1f
     call args_NOTequal	       call args_NOTequal
   _endif			     0: 1:
   call statprint         	       call statprint

_if/_elseif/_else/_endif

   cpi r16, 'a'		         cpi r16, 'a'
   _if e			         jne 0f
     call sub_for_a			   call sub_for_a
   _elseif <cpi r16, 'b'>, e           rjmp 1f
			      0: cpi r16, 'b'
				 jne 0f
     call sub_for_b		       	   call sub_for_b
   _elseif <cpi r16, 'c'>, e       	   rjmp 1f
			      0: cpi r16, 'c'
				 jne 0f
     call sub_for_c		       	   call sub_for_c
   _else			           rjmp 1f
     call badcmd		      0: call badcmd
   _endif			      0: 1:

_if/_else with nesting

   cpi r16, 'a'                  cpi r16, 'a'
   _if e			     jne 0f
     cpi r17, NEWLINE	       cpi r17, 13
 _if e			       jne 2f
   call dummy			 call dummy
     _endif			       2: 3:
     call afunc		       call afunc
   _else			       rjmp 1f
     call badcmd		     0: call badcmd
   _endif			     1:

_break cc Branch out of the loop, past the closing _while or _until.

_until cc

Branch back to the beginning of the loop, if the condition is NOT true.

   _do
     call movechar
 cpse r1, r2
   _until skp

_while cc

Branch back to the beginning of the loop, IF the connection is true. "_while always" creates an infinite loop.

   _do
     cpi r16, 16
 _break lt
 call moveit
   _while always

Testing

The implementation includes a test-cases directory containing at least avr assembly code that uses each of the macros. There's no magic, but the code produced can be inspected for correctness.

For looking at the final code, you should produce an executable so that the linker resolves all of the symbol references.

   avr-gcc -mmcu=atmega8 -Istruct_asm_dir myprogram.S -o myprogram.elf
   avr-objdump -S myprogram.elf

Porting

The Gnu Assembler supports a great number of different CPU architectures, and the core functionality of the structured programming macros should be usable without changes on most of them.

What DOES change between different cpus is the exact format of conditional branch instructions and what condition codes can be used. The macros are designed to have these parts of the code be easily modified to support other CPUs.

The core "struct_gas.S" file includes a conditional chain to include the appropriate cpu-specific defintions. This can be extended as needed. The symbols pre-defined by the specific version of gas should be documented, but you can also get hints by using the compiler binary to "dump" the predefined symbols using "xxx-gcc -dM -E - </dev/null"

#ifdef __AVR__
#include "struct_gas_avr.S"
#elif __MSP430__
#include "struct_gas_msp430.S"
#elif __x86_64__
#include "struct_gas_x86.S"
#elif __m68k__
#include "struct_gas_m68k.S"
#else
#error No recognized Architecture for struct_gas.S
#endif

The main content of the cpu-specific files is a set of macros that define "cannonical" forms of all the available conditional jumps, and their logical opposites. For example, if your CPU supports only "zero/not zero" (jz, jnz) and "carry set/carry clear" (jcs, jcc) conditions, you would define macros for "st_jmp_z", "st_jmp_nz", "st_jmp_not_z", "st_jmp_not_nz", "st_jmp_cs", "st_jmp_cc", "st_jmp_not_cs", "st_jmp_not_cc", plus "st_jmp_always." Note that "st_jmp_not_nz" would generate the same code as "st_jmp_z", and that the use of cannonical forms prevents the need for "double jumps" to implement the "not" conditionals ("jz .+2; jmp endif" to implement "if z")

This may sound like a lot of work, but it is very mechanical, and can usually be assisted by using the ".irp" assembler directive (use the existing struct_gas_xxx.S files as examples.)

References (I haven't read all of these, but they turned up while I was searching for an existing implementation.)

"Adding Structured Control-flow to any* Assembler" (*Except gas) (Nov 2010) http://dkeenan.com/AddingStructuredControlFlowToAnyAssembler.htm The immediate motivation for this work.

"Structured Assembler Language Programming Using HLASM" by Edward E. Jaffe. HLASM is apparently an IBM Mainframe thing. This is a pretty nice presentation on the whys and hows, for a much more extensive package than the gas implementation presented here.

"Macro Implementation of a Structured Assembly Language" (may 1982) http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=1702943&contentType=Journals+%26+Magazines I couldn't find a version of this that wasn't paywalled.

"Macsym.mac" The tops20 operating system support macros that include .ifskp and similar, circa 1981. http://pdp-10.trailing-edge.com/BB-4172H-BM/01/4-1-sources/macsym.mac.html

struct.mac for PDP10 from DECUS tapes http://pdp-10.trailing-edge.com/decuslib10-08/01/43,50500/struct.mac.html