-
Notifications
You must be signed in to change notification settings - Fork 6
/
Copy pathindex.html
238 lines (234 loc) · 16.9 KB
/
index.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
<!DOCTYPE html>
<head>
<meta charset="utf-8">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
<script src="js/viz.js"></script>
<script src="js/front.js"></script>
<script src="js/animation.js"></script>
<script src="js/back.js"></script>
<link rel="stylesheet" type="text/css" href="css/main.css">
<link rel="icon" type="image/png" href="css/icon.png"/>
<link href='https://fonts.googleapis.com/css?family=Lexend Deca' rel='stylesheet'>
<link href='https://fonts.googleapis.com/css?family=Mako' rel='stylesheet'>
<title>Regex Visualizer</title>
<meta name="description" content="Regex Visualizer">
</head>
<body>
<div id="header">
<div id="help">
<button class="helpbutton">Help</button>
<button class="githubbutton" title="GitHub repo" onclick="window.open('https://github.com/27t/regex-visualizer','_blank');"><img src="css/github.png"/></button>
</div>
<div id="previous_step">
<div id="stepl" class="step stepbutton">
<svg width="55px" height="80px" viewBox="0 0 55 80">
<path fill="none" stroke="#000000" stroke-width="4px" d="M 25 20 L 7 42 L 25 64"/>
<path class="buttonPath" fill="none" stroke="#0f6966" stroke-width="4px" d="M 25 20 L 7 42 L 25 64"/>
</svg>
<div class="steptext"></div>
</div>
</div>
<div id="inp">
<svg width="202px" height="42px" viewbox="0 0 202 42" id="validBorder">
<path class="greenPath" fill="none" stroke="#31eb37" stroke-width="2px" d="M 101 41 L 1 41 L 1 1 L 101 1"/>
<path class="greenPath" fill="none" stroke="#31eb37" stroke-width="2px" d="M 101 41 L 201 41 L 201 1 L 101 1"/>
</svg>
<svg width="202px" height="42px" viewbox="0 0 202 42" id="invalidBorder">
<path class="redPath" fill="none" stroke="#e61515" stroke-width="2px" d="M 101 1 L 1 1 L 1 41 L 101 41"/>
<path class="redPath" fill="none" stroke="#e61515" stroke-width="2px" d="M 101 1 L 201 1 L 201 41 L 101 41"/>
</svg>
<input id="regex" maxlength="256"></input>
<div class="beginbuttons"><div class="helpbutton helpbutton2">Help</div><div class="examplebutton">Example expression</div></div>
<button disabled id="convert">Convert</button>
</div>
<div id="next_step">
<div id="stepr" class="step stepbutton">
<div class="steptext"></div>
<svg width="40px" height="80px" viewBox="0 0 40 80">
<path fill="none" stroke="#222233" stroke-width="4px" d="M 15 20 L 33 42 L 15 64"/>
<path class="buttonPath" fill="none" stroke="#0f6966" stroke-width="4px" d="M 15 20 L 33 42 L 15 64"/>
</svg>
</div>
<div id="stepmatch" class="step">
<div>
Check if string matches regular expression:
</div>
<div>
<input id="matchinput"></input>
<button class="stepbutton" id="matchbutton">Check Match</button>
</div>
</div>
</div>
<div id="options">
<div id="options1">
<input checked id="options_check" type="checkbox"> Automatic animations
</div>
<div id="options2">
<button disabled id="options_button">Animation step</button>
</div>
<div id="options3">
<button id="skip_button">Skip animation</button>
</div>
</div>
</div>
<p id="error_message"></p>
<p id="step_message"></p>
<div id="line"></div>
<div id="FA"></div>
<div class="helpmenu bigbox">
<div class="helpheader">
<div class="helpexit">
<svg viewBox="0 0 50 50" width="15px" height="15px">
<line x1="5" y1="5" x2="45" y2="45" stroke="#000000" stroke-width="3px" stroke-linecap="round"></line>
<line x1="5" y1="45" x2="45" y2="5" stroke="#000000" stroke-width="3px" stroke-linecap="round"></line>
</svg>
</div>
<div class="helptitle">
Help menu
</div>
</div>
<div class="helpbody">
<div class="helptabs">
<button class="helpsectionbutton" section=".regexsec">Regular<br>expressions</button>
<button class="helpsectionbutton" section=".fsmsec">Finite state<br>machines</button>
<button class="helpsectionbutton" section=".fsm2sec">Types of<br>machines</button>
<button class="helpsectionbutton" section=".algsec">Algorithms</button>
<button class="helpsectionbutton" section=".featsec">Features</button>
</div>
<div class="helpcontent">
<div class="regexsec">
<h2>Regular expressions</h2>
<p>A regular expression is a sequence of characters that define a search pattern.
<br>
The regular expressions supported by this tool may contain:
<ul>
<li>Letters (<span class="regextext">a-z</span>, <span class="regextext">A-Z</span>)</li>
<li>The quantifiers <span class="regextext">*</span>, meaning 0 or more of the previous token, and <span class="regextext">+</span>, meaning 1 or more of the previous token</li>
<li>Alternation <span class="regextext">|</span>, meaning either the expression to the left or to the right of the <span class="regextext">|</span> has to match.</li>
<li>Parenthesis <span class="regextext">()</span>, to group things together</li>
<li>The number <span class="regextext">0</span>, meaning the empty string</li>
<li>Concatenation</li>
</ul>
</p>
<p>An example regular expression using these symbols is <span class="regextext">(ab|cd)+(e|0)</span>.
The strings that match this regular expression are exactly the strings that start with 1 or more instances of the strings <span class="regextext">ab</span> or <span class="regextext">cd</span>, and after that either the character <span class="regextext">e</span>, or nothing.
Some examples of strings that match this regular expression are <span class="regextext">ab</span>, <span class="regextext">cde</span>, <span class="regextext">abcde</span>, <span class="regextext">cdcdabcd</span>.</p>
<p>For more information about regular expressions, click <a target="_blank" href="https://www.regular-expressions.info/quickstart.html">here</a> (some different notation and way more possibilities, but the same concept)</p>
</div>
<div class="fsmsec">
<h2>Finite state machines</h2>
<p>
A finite state machine (also called a finite automaton) is a finite set of states <span class="regextext">N</span>, together with a transition function.
This transition function takes a state and a symbol as input, and returns a set of states, a subset of <span class="regextext">N</span>.
These returned states are the states that can be reached by starting from the given input state and reading the given input symbol.
The set of symbols an automaton contains is called the alphabet.
</p>
<p>
A finite state machine also has one starting state, the state we are in after reading no input symbols.
Finally, any of the states in <span class="regextext">N</span> can be accepting/final. If we end up in an accepting state after reading a string, we say that string is accepted by the machine.
This tool will create finite state machines, such that the accepted strings are those that exactly match the given regular expression.
</p>
Here states will be represented as circles, accepting states as double circles.
The transition function will be represented as directed edges between these states, meaning you can go from one state to another using the symbol in the edge's label.
The starting state will have an incoming edge from nowhere.
</p>
</div>
<div class="fsm2sec">
<h2>Types of finite state machines</h2>
<p>
We will define 4 different types of finite state machines, each with a more strict definition than the previous.
</p>
<ul>
<li><b>Nondeterministic finite automaton with λ-transitions</b> (<span class="regextext">NFA-λ</span>):<br>This is the most general, least strict type of state machine. Here, the transition function may return multiple states for any input state and symbol.
This means that in some cases we have a choice what state to go to next, given an input state and symbol (nondeterminism). In the visual representation used here, this means there are multiple outgoing edges from the input state with the same symbol. The transition function is also allowed to return no states, in which case reading the input symbol from the input state is not possible. In the visual representation used here, this means there are no outgoing edges from the input state with the input symbol.
The state machine may also contain λ-transitions. These transitions (edges) are labeled λ and allow you to freely move from one state to another without reading any input symbols.</li>
<li><b>Nondeterministic finite automaton</b> (<span class="regextext">NFA</span>):<br>This type is the same as the NFA-λ, except that λ-transitions are not allowed.</li>
<li><b>Deterministic finite automaton</b> (<span class="regextext">DFA</span>):<br>In deterministic finite automata, the transition function has to return exactly one state for each input state and symbol in the alphabet.
This means that for every string of characters in the alphabet, there is one and at most one path you can follow. In the visual representation used here, this means that every state has exactly one outgoing edge for each symbol in the alphabet.</li>
<li><b>Minimal deterministic finite automaton</b> (<span class="regextext">DFAm</span>):<br>We call a DFA minimal if there is no other DFA with less states that accepts the exact same strings as the original DFA.
It can be proven that for each DFA, there exists a unique DFAm, up to the naming of the states.</li>
</ul>
</div>
<div class="algsec">
<h2>Algorithms</h2>
<p>
To convert a regular expression to a <span class="regextext">DFAm</span>, we use 4 different algorithms, each resulting in the next type from the previous section.
Here follows a brief overview of all the algorithms, but for more in depth explanations, I used the book "Introduction to Languages and the Theory of Computation" by John C. Martin.
</p>
<p>
The first algorithm converts a regular expression to a <span class="regextext">NFA-λ</span>. This is done using a bottom-up construction.
We start with finite state machines for a single character, and 3 template state machines for the 3 main operations (alternation, concatenation and the quantifiers).
We then substitute the simple state machines into the template machine we need, until we have a finite state machine for the entire regular expression.
Finally, some basic simplification is done to make the state machine more readable.
</p>
<p>
The second algorithm converts a <span class="regextext">NFA-λ</span> to a <span class="regextext">NFA</span> by replacing the λ-transitions with normal transitions.
We start off by determining the λ-closure of each state: The set of states that can be reached from that state using only λ-transitions. Next, we make any state whose λ-closure contains an accepting state also accepting.
Then, for every state <span class="regextext">p</span> and for every state <span class="regextext">r</span> in the λ-closure of <span class="regextext">p</span>, we duplicate all the normal outgoing edges from <span class="regextext">r</span>, and make their starting point <span class="regextext">p</span>.
Finally, we remove all the λ-transitions, and if during this process any of the states have become unreachable from the starting state, we remove them.
</p>
<p>
The third algorithm converts a <span class="regextext">NFA</span> to a <span class="regextext">DFA</span> using an algorithm called "subset construction".
We start with a new starting state that corresponds to the old starting state. We then determine for each of the symbols in the alphabet, the set of states can be reached from the starting state using that symbol.
If there is not yet a new state corresponding to this old set of states, we create this new state and add an edge from the starting state to this new state with the given symbol.
If there is already a new state corresponding to this old set of states, we simply add an edge from the starting state to this state with the given symbol.
We continue this process for each of the newly created states, until all of the new states have an outgoing edge for every symbol in the alphabet.
A new state in this DFA is accepting, if any of the corresponding old states were accepting in the NFA.
</p>
<p>
The last algorithm converts a <span class="regextext">DFA</span> to a <span class="regextext">DFAm</span> by minimizing it.
This is done by determining which of the states of the DFA are equivalent, and merging them. Two states are called equivalent,
if the strings that cause you to end up in an accepting state starting from that state, are exactly the same for both states.
To determine which states are equivalent, we start off by marking each pair of states as equivalent, and step by step mark pairs as not equivalent if we find proof they are not.
In the first step, we mark every pair of states, where one of the states is accepting and the other one is non-accepting, as non-equivalent, since for example the empty string causes one of them to end up in an accepting state, but not the other one.
In each of the next steps, we take a pair of states that is marked equivalent, and for each symbol in the alphabet, we check if the new pair of states that we get from following the edge with that symbol from the starting pair, is marked as equivalent.
If this new pair is marked as not-equivalent, we have found a symbol that leads the original pair of states to two non-equivalent states. This implies that the original pair of states is also not equivalent, and so we mark it as such.
We do this until we cannot mark any more pairs as non-equivalent. Finally, we merge the states that have been found to be equivalent, and we are done.
</p>
</div>
<div class="featsec">
<h2>Features</h2>
<ul>
<li>Convert a regular expression into a <span class="regextext">DFAm</span>, using custom animations</li>
<li>This gives a very visual way to see what strings match the regular expression and what strings don't</li>
<li>Smooth automatic animations, or animations in steps, with information about the current step</li>
<li>Hover over any of the states, to see an infobox with some example strings that can end up in that state</li>
<li>Pan and zoom the state machine around by dragging the screen and using your mousewheel, to get a better look at large and complicated state machines</li>
<li>A semi-in-depth help menu you are looking at right now!</li>
</ul>
</div>
</div>
</div>
</div>
<div id="startpopup" class="bigbox">
<p class="pop1">Welcome!</p>
<p class="pop2">This tool is used to visualize which strings exactly match<br>a regular expression and which don't, using finite state machines</p>
<p class="pop3">You can start playing around now by pressing "Example expression" or by typing in your own regular expression</p>
<p class="pop4">Press the "help" button at any time for more information about the application</p>
<svg id="popupanim" width="600px" height="230px" viewBox="0 0 600 230">
<path id="pa1" stroke-width="2px" stroke="#000000" d="M10,35L245,35"/>
<polygon id="po1">
<animate id="an1" attributeName="points" attributeType="XML" begin="indefinite" dur="200ms" fill="freeze" from="245,27 245,27 245,43 245,43 245,27" to="245,27 270,35 270,35 245,43 245,27"/>
</polygon>
<path id="cp11" stroke-width="2px" stroke="#000000" fill="none" d="M270,35 A30 30 0 0 0 330 35"/>
<path id="cp12" stroke-width="2px" stroke="#000000" fill="none" d="M270,35 A30 30 0 1 1 330 35"/>
<text id="t1" font-size="30" text-anchor="middle" x="300" y="45">1</text>
<path id="pa2" stroke-width="2px" stroke="#000000" fill="none" d="M330,35 L550,35 A40 40 0 0 1 550 115 L355,115"></path>
<polygon id="po2">
<animate id="an2" attributeName="points" attributeType="XML" begin="indefinite" dur="200ms" fill="freeze" from="355,107 355,107 355,123 355,123 355,107" to="355,107 330,115 330,115 355,123 355,107"/>
</polygon>
<path id="cp21" stroke-width="2px" stroke="#000000" fill="none" d="M330,115 A30 30 0 0 0 270 115"/>
<path id="cp22" stroke-width="2px" stroke="#000000" fill="none" d="M330,115 A30 30 0 1 1 270 115"/>
<text id="t2" font-size="30" text-anchor="middle" x="300" y="125">2</text>
<path id="pa3" stroke-width="2px" stroke="#000000" fill="none" d="M270,115 L50,115 A40 40 0 0 0 50 195 L175,195"></path>
<polygon id="po3">
<animate id="an3" attributeName="points" attributeType="XML" begin="indefinite" dur="200ms" fill="freeze" from="175,203 175,203 175,187 175,187 175,203" to="175,203 200,195 200,195 175,187 175,203"/>
</polygon>
<path id="cp31" stroke-width="2px" stroke="#000000" fill="none" d="M200,195 A30 30 0 0 1 230 165 L370,165 A30 30 0 0 1 400 195"/>
<path id="cp32" stroke-width="2px" stroke="#000000" fill="none" d="M200,195 A30 30 0 0 0 230 225 L370,225 A30 30 0 0 0 400 195"/>
<text id="t3" font-size="30" text-anchor="middle" x="300" y="205">Start!</text>
<rect id="startrect" stroke-width="2px" fill="rgba(0,0,0,0)" x="200" y="165" width="200px" height="60px" rx="30"/>
</svg>
</div>
</body>
<script> </script>