zig-re

RegEx in Zig.

At the moment, parsing of basic RegEx's (literal characters, concatenation, |, and *) is implemented; \ is used as an escaping character; parentheses can be used as you would expect. The empty string is not a valid regular expression - to match only the empty string, use the regex []. All other current features are merely syntactic sugar (as the previous features are enough to recognize any Type-3 language); here's a list of all the sweet stuff:

Character ranges and groups: [a-d], [abcd], [ab-d], (a|b|c|d) are all supported (and equivalent). Empty character ranges (i.e. []) are the only case where ranges are not syntactic sugar, as there is no other way to regocnize only the empty string.
- Character groups can be inverted: [^a] matches any character but a.
Using . to mean "any character" is supported. To match a literal '.', use [.] or \..
The + operator can be used as a shorthand for a concatentation and a Kleene star: x+ is equivalent to xx*, i.e. + means "at least one character".
The ? operator can be used as a shorthand for a union with the empty string: x? is equivalent to x|[].

Translation of regular expressions to $\varepsilon$-NFA to NFA to DFA is implemented and seems to work in relatively complex scenarios. Canonicalization of the final DFA is not implemented yet, and $\varepsilon$-NFA to NFA conversion is not very efficient yet.

The RegEx ASTs, and the NFAs and DFAs include Graphviz DOT output as a visualization aid.

The project also includes a JIT compiler for x86-64 that can convert DFAs to machine code that is executable immediately. Profile guided optimization is also implemented: the JIT compiler can use information collected during profiling runs of the DFA (pre-compilation, i.e. interpreted runs), to emit more efficient machine code. Although the compile-time performance of using this feature is not great yet.

Example

Example of a regex combining almost all current features: x[yz]|.w*[a-c]*[f-i]

AST:

$\varepsilon$-NFA:

NFA (still has the epsilon transitions, but they can all be ignored):

DFA:

Machine code can be found here

`comptime` ?

Building the DFA at compile-time should be possible quite easily if/when Zig gets working compile-time allocators. It should also be possible now, but not without significant rewrites.

Name		Name	Last commit message	Last commit date
Latest commit History 105 Commits
.github/workflows		.github/workflows
assets		assets
thirdparty		thirdparty
.gitignore		.gitignore
.gitmodules		.gitmodules
LICENSE		LICENSE
README.md		README.md
build.zig		build.zig
re.zig		re.zig

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

zig-re

Example

`comptime` ?

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

zig-re

Example

comptime ?

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

`comptime` ?

Packages