Write high-level control flow => get correct optimized bit-twiddled parallel C code
- Embedded DSL
- Host language: Scala
 
-  Tagless Final Interpreter
- Evaluator, Tracer, Twiddle AST emitter
 
 - AST
 - Optimizers (twiddler/parallelizer) => adds annotations to AST
 - CodeGen module
 - Verifier module
 
- Primitives: integers, bit-vector
 - Constructs: conditionals, for loops, while loops, do loops
 - Arithmetics: +, -, *, /, **
 - Bit-wise operators: |, &, ~, >>, <<, ^
 - Logical and comparison operators: &&, ||, ==, <, >, <=, >=, !=
 - Optimized builtins: log2, log10, sqrt, ceil, has_zero, byte_{eq,gt,lt,gte,lte}, signof, abs, min, max, count_bits_set, rev_bits, swap_bits, %, is_power_2, next_power_2
 - Optimizations on constructs: vectorize/unroll loops, high-level op -> bit-operation, branch -> bit-operations without branches
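
To make the builtin list above concrete, here is a minimal C sketch of what a few of them could lower to. The formulas are the well-known ones from the Bit Twiddling Hacks page; whether Twiddle emits exactly this code is an assumption. The names `min_branchless` and `max_branchless` are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* is_power_2: true when exactly one bit is set (0 is not a power of two). */
static inline int is_power_2(uint32_t v) {
    return v != 0 && (v & (v - 1)) == 0;
}

/* next_power_2: round up to the nearest power of two (v itself if it
   already is one). Wraps to 0 for v == 0 and for v > 2^31. */
static inline uint32_t next_power_2(uint32_t v) {
    v--;
    v |= v >> 1;   /* smear the highest set bit into every lower bit */
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1;
}

/* count_bits_set: Kernighan's method, one iteration per set bit. */
static inline unsigned count_bits_set(uint32_t v) {
    unsigned c = 0;
    for (; v; c++) {
        v &= v - 1; /* clear the lowest set bit */
    }
    return c;
}

/* min/max without a branch: the mask is all ones when x < y, else zero. */
static inline int32_t min_branchless(int32_t x, int32_t y) {
    return y ^ ((x ^ y) & -(x < y));
}

static inline int32_t max_branchless(int32_t x, int32_t y) {
    return x ^ ((x ^ y) & -(x < y));
}
```

The min/max pair illustrates the "branch -> bit-operations without branches" optimization: the comparison result becomes a mask instead of a jump.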
 
Example:
x = 20
y = 10
if( hasZero(x) )
    swapBits(y, log10(x))
should produce
#define SWAP(a, b) (((a) ^= (b)), ((b) ^= (a)), ((a) ^= (b)))
#define haszero(v) (((v) - 0x01010101UL) & ~(v) & 0x80808080UL)
int x = 20;
int y = 10;
if( haszero(x) )
{
  unsigned int v = x;
  int r;
  r = (v >= 1000000000) ? 9 : (v >= 100000000) ? 8 : (v >= 10000000) ? 7 : 
      (v >= 1000000) ? 6 : (v >= 100000) ? 5 : (v >= 10000) ? 4 : 
      (v >= 1000) ? 3 : (v >= 100) ? 2 : (v >= 10) ? 1 : 0;
  SWAP(y, r);
}
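
The two builtins used in the example can also be read as plain functions, which makes their behaviour easier to check by hand. This is a sketch only: the function names `has_zero_byte` and `ilog10_u32` are hypothetical, and the sketch assumes 32-bit unsigned operands.

```c
#include <stdint.h>

/* Nonzero when at least one of the four bytes of v is 0x00.
   Same expression as the haszero macro, written as a function. */
static inline uint32_t has_zero_byte(uint32_t v) {
    return (v - 0x01010101u) & ~v & 0x80808080u;
}

/* Integer log10 via a chain of comparisons, largest power of ten first. */
static inline int ilog10_u32(uint32_t v) {
    return (v >= 1000000000u) ? 9 : (v >= 100000000u) ? 8 :
           (v >= 10000000u)   ? 7 : (v >= 1000000u)   ? 6 :
           (v >= 100000u)     ? 5 : (v >= 10000u)     ? 4 :
           (v >= 1000u)       ? 3 : (v >= 100u)       ? 2 :
           (v >= 10u)         ? 1 : 0;
}
```

For `x = 20` (0x00000014) the upper three bytes are zero, so the test is nonzero and the branch is taken; `ilog10_u32(20)` evaluates to 1.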
Using the OpenMP annotation backend:
x = "Print me in parallel"
for(i = 0; i < 10000000000; i++)
 prints(x)
should produce
char* x = "Print me in parallel";
#pragma omp parallel for
for(long long i = 0; i < 10000000000LL; i++)
 printf("%s", x);
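
A side effect of the annotation approach is that the generated loop stays valid C even when OpenMP is not enabled, because a compiler that does not recognise the pragma ignores it. The sketch below shows this with a loop whose result is easy to check. The function name `parallel_sum` and the `reduction` clause are illustrative assumptions, not something the backend is known to emit.

```c
#include <stdint.h>

/* Sums 0..n-1. With OpenMP enabled (e.g. -fopenmp) the iterations are
   divided among threads and the per-thread partial sums are combined by
   the reduction clause. Without OpenMP the pragma is ignored and the
   loop runs sequentially with the same result. */
uint64_t parallel_sum(int64_t n) {
    uint64_t total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int64_t i = 0; i < n; i++) {
        total += (uint64_t)i;
    }
    return total;
}
```

Calling `parallel_sum(1000)` returns 499500 in both modes, which gives a simple way to compare sequential and parallel builds of the same generated code.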
-  Choose host language
- Scala
 
 -  Should generated code be portable? Compiler intrinsics/ifdefs/etc. or not
- Yes
 
 -  Should aggressiveness level be adjustable?
- Yes. Perhaps different evaluator for aggressive optimizations
 
 -  Choose codegen backend. E.g. LMS? lms-verify?
- No more LMS. Documentation is virtually non-existent. C backend is not consistent with the C language. Maybe in the long term with a few pull requests to LMS.
 
 -  Provide option to verify code? E.g. lms-verify
- Not yet. Perhaps when supporting LMS style codegen
 
 -  Parallelize C code where bit hacks are not possible? E.g. OpenMP, vectorization, or pluggable parallelizers
- Yes
 
 
- Documentation for this use-case is hard to come by
 - The C and C++ backends are not well separated and require manual patchwork to be sound
 - Control over variable generation in the output code is not flexible (without significant complexity overhead)
 
- Implement tagless interpreter for core object language (see language overview)
 - Add code generation facilities to core language
 - Add optimization facilities
 - Build out core library
 - OpenMP codegen (for LMS?)
 - Add verifier and extend
 - ScalaTest support / test suite
 - Extend with LMS

Run LMS installation instructions
- sbt
 - run
 - Choose main entry point (either testsuite or scratch area)
 
or to run all tests immediately (and continuously):
- sbt ~"runMain twiddle.dsl.Main"
 
or run from the REPL:
- sbt ~"runMain twiddle.dsl.REPL"
 
- https://github.com/namin/lms-verify
 - https://www.slideshare.net/krikava/domain-specific-languages-and-scala
 - https://stanford-ppl.github.io/Delite/myfirstdsl.html
 - https://skillsmatter.com/skillscasts/3289-javascript-embedded-dsl-scala
 - https://github.com/TiarkRompf/virtualization-lms-core
 - https://github.com/julienrf/lms-tutorial
 - https://scala-lms.github.io/tutorials/04_atwork.html
 - https://github.com/namin/metaprogramming
 - https://github.com/namin/metaprogramming/blob/master/lectures/4a-dsls/syntax.scala
 - https://www.cl.cam.ac.uk/~na482/meta/slides-4a.pdf
 - https://graphics.stanford.edu/~seander/bithacks.html
 - https://www.youtube.com/watch?v=16A1yemmx-w
 - https://github.com/scala-lms/tutorials/blob/master/src/test/scala/lms/tutorial/dslapi.scala
 - https://github.com/TiarkRompf/virtualization-lms-core/blob/v1.0.0/test-src/epfl/test14-scratch/TestCGen.scala ! https://github.com/TiarkRompf/virtualization-lms-core/blob/v1.0.0/src/internal/CCodegen.scala
 - https://github.com/TiarkRompf/virtualization-lms-core/blob/361a806f674cd12d9d31655bd0f30664a451ad9f/src/internal/CCodegen.scala
 - https://github.com/TiarkRompf/virtualization-lms-core/blob/361a806f674cd12d9d31655bd0f30664a451ad9f/src/common/Packages.scala -> CCodeGenPkg
 - https://github.com/namin/lms-verify/blob/4e43e5669285c1ae98adf97848da497000b0d182/src/main/scala/lms/verify/Core.scala -> CCodeGenDSL
 - https://stanford-ppl.github.io/Delite/faq.html
 - https://github.com/scala-lms/tutorials/blob/master/src/test/scala/lms/tutorial/eval.scala