Converts a regex string into c code, which parses the given regular expression.
For exact syntax specifications, look at the comment in regex2c.c
.
Run make
to build the regex2c
executable. The executable outputs either c code, or debug info.
To change that, change the preprocessor-definition PRINT_DEBUG
in common.h
, a value of 0 prints the c code, larger values print more debug info.
To clean the project, run make clean
.
To test the project, write a regex pattern into the file test/pattern.regex
and run make test
in the root directory. This
build the executable test/pattern_matcher
, which accepts strings from stdin
, which match the given regex pattern.
The regex2c
executable expects a non-empty regex-string from stdin
and prints c code to stdout
. It uses the following steps to convert the expression:
- Parse the regex and convert to an AST
- Convert the AST to a NFA (structures such as repititions and optionals are converted by inserting epsilon-transitions)
- Convert the NFA to a DFA (making it deterministic)
- Minimize the DFA (using Moore's algorithm)
- Convert the DFA into c code, which can be compiled and linked with other code