Commit 7d86ade

Remove complicated examples from patterns
1 parent 6af175e commit 7d86ade

File tree

3 files changed

+212
-145
lines changed

core-concepts/modules/ROOT/pages/typeql/index.adoc

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -66,4 +66,10 @@ TypeQL functions bring the familiar abstraction from programming languages to da
6666
****
6767
Short definitions of terms used throughout the guide, with links to the section which introduces them.
6868
****
69-
--
69+
70+
// .xref:{page-version}@new_core_concepts::typeql/invalid-patterns.adoc[]
71+
// [.clickable]
72+
// ****
73+
// A deeper look at invalid patterns involving disjoint variable reuse
74+
// ****
75+
--
Lines changed: 198 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,198 @@
1+
= Advanced: Invalid Patterns
2+
3+
This section dives into the semantics of TypeQL,
4+
using the <<_disjunctive_normal_form>> to explain why <<_disjoint_variable_reuse>> makes a pattern invalid.
5+
6+
7+
== Disjoint variable reuse
8+
This section demonstrates some of the cases which can arise due to
9+
xref:new_core_concepts::typeql/query-variables-patterns.adoc#_disjoint_variable_usage_errors[disjoint variable reuse].
10+
11+
[unbound negations]
12+
=== Unbound negation inputs
13+
Notice we can write a pattern where the input variable to a negation is not bound in the parent.
14+
For example:
15+
[,typeql]
16+
----
17+
match
18+
$p1 isa person; $p2 isa person;
19+
{
20+
$emp1 isa employment, links (employer: $company, employee: $p1);
21+
} or {
22+
$edu1 isa education, links (institute: $institute, attendee: $p2);
23+
};
24+
25+
not { $emp2 isa employment, links (employer: $company, employee: $p2); };
26+
not { $edu2 isa education, links (institute: $institute, attendee: $p2); };
27+
----
28+
At first glance, this looks like a reasonable query:
29+
we query for persons `$p1` and `$p2` who neither worked for the same company,
30+
nor attended the same institute.
31+
However, you can see that the input variables for the negations (`$company` and `$institute`)
32+
are not local to the negation, but also not bound in the parent conjunction.
33+
34+
==== Disjunctive Normal Form
35+
The best way to think about these requirements is to convert the query to Disjunctive Normal Form
36+
by rewriting the pattern using "distributivity" and examining each branch:
37+
38+
`A; { B; } or { C; };` becomes `{A; B} or {A; C};`
39+
40+
In this case, we get the pattern:
41+
[,typeql]
42+
----
43+
match
44+
{
45+
$p1 isa person; $p2 isa person;
46+
$emp1 isa employment, links (employer: $company, employee: $p1);
47+
not { $emp2 isa employment, links (employer: $company, employee: $p2); };
48+
not { $edu2 isa education, links (institute: $institute, attendee: $p2); };
49+
} or {
50+
$p1 isa person; $p2 isa person;
51+
$edu1 isa education, links (institute: $institute, attendee: $p2);
52+
not { $emp2 isa employment, links (employer: $company, employee: $p2); };
53+
not { $edu2 isa education, links (institute: $institute, attendee: $p2); };
54+
};
55+
----
56+
Although this could now be a valid logic query,
57+
the first branch requires that `$p2` did not attend *any* institute,
58+
and the second branch requires that `$p2` was not employed by *any* employer.
59+
This is clearly not what we intended to write. Hence, we flag these as invalid TypeQL queries.
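
The distributivity rewrite `A; { B; } or { C; };` to `{A; B} or {A; C};` can be sketched in a few lines of Python. This is only an illustrative toy model, not how TypeDB processes queries: statements are plain strings, and the function name `to_dnf` and its list-of-lists input shape are assumptions made for this example.

```python
from itertools import product


def to_dnf(conjuncts, disjunctions):
    """Distribute shared conjuncts over every combination of branches.

    conjuncts: statements shared by all branches (the `A` part).
    disjunctions: a list of disjunctions; each disjunction is a list of
        branches, and each branch is a list of statements.
    Returns the DNF as a list of branches, each a flat list of statements.
    """
    branches = []
    # Pick one branch from every disjunction, in every possible combination.
    for choice in product(*disjunctions):
        branch = list(conjuncts)
        for part in choice:
            branch.extend(part)
        branches.append(branch)
    return branches


# `A; { B; } or { C; };` becomes `{A; B} or {A; C};`
print(to_dnf(["A"], [[["B"], ["C"]]]))
```

With two disjunctions of two branches each, the same function yields four DNF branches, one per combination.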
60+
61+
[NOTE]
62+
====
63+
There is, of course, a way to express the intended query:
64+
[,typeql]
65+
.The correct query
66+
----
67+
$p1 isa person; $p2 isa person;
68+
not {
69+
$emp1 isa employment, links (employer: $company, employee: $p1);
70+
$emp2 isa employment, links (employer: $company, employee: $p2);
71+
};
72+
not {
73+
$edu1 isa education, links (institute: $institute, attendee: $p2);
74+
$edu2 isa education, links (institute: $institute, attendee: $p2);
75+
};
76+
----
77+
====
78+
79+
=== Reusing branch local variables in disjunctions
80+
81+
Consider another case of questionable query composition:
82+
[,typeql]
83+
----
84+
match
85+
$p1 isa person; $p2 isa person;
86+
{
87+
$emp1 isa employment, links (employer: $company, employee: $p1);
88+
} or {
89+
$edu1 isa education, links (institute: $institute, attendee: $p2);
90+
};
91+
{
92+
$emp2 isa employment, links (employer: $company, employee: $p2);
93+
} or {
94+
$edu2 isa education, links (institute: $institute, attendee: $p2);
95+
};
96+
----
97+
Ideally, this would be a query to find two persons `$p1` and `$p2` who
98+
were either employed by the same company, or attended the same institute.
99+
100+
The DNF quickly reveals the mistake:
101+
[,typeql]
102+
----
103+
match
104+
{
105+
$p1 isa person; $p2 isa person;
106+
$emp1 isa employment, links (employer: $company, employee: $p1);
107+
$emp2 isa employment, links (employer: $company, employee: $p2);
108+
} or {
109+
$p1 isa person; $p2 isa person;
110+
$edu1 isa education, links (institute: $institute, attendee: $p2);
111+
$emp2 isa employment, links (employer: $company, employee: $p2);
112+
} or
113+
{
114+
$p1 isa person; $p2 isa person;
115+
$emp1 isa employment, links (employer: $company, employee: $p1);
116+
$edu2 isa education, links (institute: $institute, attendee: $p2);
117+
} or {
118+
$p1 isa person; $p2 isa person;
119+
$edu1 isa education, links (institute: $institute, attendee: $p2);
120+
$edu2 isa education, links (institute: $institute, attendee: $p2);
121+
};
122+
----
123+
124+
You can see the query we meant to write in two of those branches:
125+
[,typeql]
126+
----
127+
match
128+
$p1 isa person; $p2 isa person;
129+
{
130+
$emp1 isa employment, links (employer: $company, employee: $p1);
131+
$emp2 isa employment, links (employer: $company, employee: $p2);
132+
} or {
133+
$edu1 isa education, links (institute: $institute, attendee: $p2);
134+
$edu2 isa education, links (institute: $institute, attendee: $p2);
135+
};
136+
----
137+
138+
The problem lies in the other two branches.
139+
[,typeql]
140+
----
141+
match
142+
{
143+
$p1 isa person; $p2 isa person;
144+
$emp1 isa employment, links (employer: $company, employee: $p1);
145+
$edu2 isa education, links (institute: $institute, attendee: $p2);
146+
} or {
147+
$p1 isa person; $p2 isa person;
148+
$edu1 isa education, links (institute: $institute, attendee: $p2);
149+
$emp2 isa employment, links (employer: $company, employee: $p2);
150+
};
151+
----
152+
This will return any persons `$p1` & `$p2` when
153+
either (1) `$p1` is employed by *any* company and `$p2` attended *any* institute;
154+
or (2) `$p2` is employed by *any* company and `$p1` attended *any* institute.
155+
156+
Notice that `$company` is "internal" to both the first and the second disjunction (the same is true of `$institute`).
157+
TypeQL throws a "disjoint variable re-use" error for such cases.
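
To make the idea of a variable being "internal" to more than one disjunction concrete, here is a rough Python sketch that flags such variables. It is a simplified model under stated assumptions, not TypeQL's actual validation: statements are plain strings, variables are found with a regular expression, and `reused_across_disjunctions` is a hypothetical helper that treats any variable absent from the parent conjunction as internal.

```python
import re
from collections import defaultdict

# Assumed variable syntax for this sketch: `$` followed by an identifier.
VAR = re.compile(r"\$[A-Za-z_][A-Za-z0-9_-]*")


def variables(statements):
    """Collect every variable name mentioned in a list of statements."""
    found = set()
    for statement in statements:
        found.update(VAR.findall(statement))
    return found


def reused_across_disjunctions(parent, disjunctions):
    """Return variables not bound in the parent that occur in branches
    of more than one disjunction."""
    parent_vars = variables(parent)
    seen_in = defaultdict(set)
    for index, branches in enumerate(disjunctions):
        for branch in branches:
            for var in variables(branch) - parent_vars:
                seen_in[var].add(index)
    return sorted(v for v, where in seen_in.items() if len(where) > 1)


parent = ["$p1 isa person", "$p2 isa person"]
first = [
    ["$emp1 isa employment, links (employer: $company, employee: $p1)"],
    ["$edu1 isa education, links (institute: $institute, attendee: $p2)"],
]
second = [
    ["$emp2 isa employment, links (employer: $company, employee: $p2)"],
    ["$edu2 isa education, links (institute: $institute, attendee: $p2)"],
]
print(reused_across_disjunctions(parent, [first, second]))
# ['$company', '$institute']
```

Merging the two disjunctions into one, as in the intended query, leaves nothing to flag in this model because each variable then stays inside a single disjunction.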
158+
159+
=== Single origin for optionals
160+
An optional variable can be used in only one `try` block in a given match clause. Consider:
161+
[,typeql]
162+
----
163+
match
164+
try { $x isa person, has name "John"; };
165+
try { $x isa person, has name "James"; };
166+
----
167+
The reason this is banned is similar to the <<_unbound_negation_inputs>> case above:
168+
it is unclear whether `$x` is an input to either of the `try` blocks or an optional variable.
169+
170+
The same variable can be used in `try` blocks in the next stage.
171+
This is because the variable is bound - either to a concept, or to `None` - and is an 'input' to the block.
172+
The block simply fails if it tries to use a variable bound to `None` in a constraint.
173+
174+
[NOTE]
175+
====
176+
The semantics of try blocks dictate the equivalence of
177+
178+
`try { P };` and `{ P } or { not { P' }; };`
179+
180+
where `P'` is the pattern obtained by renaming all the 'optional' variables in `P` with fresh ones.
181+
Rewriting the above query with this equivalence (assuming `$x` to be 'optional' and renaming it, although this is ambiguous) gives:
182+
[,typeql]
183+
----
184+
{ $x "John"; } or not { $x_1 "John"; };
185+
{ $x "James"; } or not { $x_2 "James"; };
186+
----
187+
188+
Taking the DNF gives us 4 cases, and some very unintuitive behaviour:
189+
[,typeql]
190+
----
191+
{ $x "John"; $x "James"; } or # persons named both James and John.
192+
{ $x "John"; not { $x_2 "James"; }; } or # If nobody is named James, then persons named John
193+
{ not { $x_1 "John"; }; $x "James"; } or # If nobody is named John, then persons named James
194+
{ not { $x_1 "John"; }; not { $x_2 "James"; }; }; # If nobody is named either, then a single answer `None`
195+
# If there is one person named John, and a different person named James, then no answers
196+
----
197+
198+
====

core-concepts/modules/ROOT/pages/typeql/query-variables-patterns.adoc

Lines changed: 7 additions & 144 deletions
Original file line numberDiff line numberDiff line change
@@ -329,156 +329,19 @@ Notice that a variable that is "bound" in a conjunction is guaranteed to be boun
329329
in ALL execution paths - i.e., regardless of which branch is taken in each disjunction.
330330
An answer to a pattern contains only those variables bound in the root conjunction.
331331

332-
// TODO: This could be its own page
333-
=== Invalid patterns: unbound negation inputs
334-
Notice we can write a pattern where the input variable to a negation is not bound in the parent.
335-
For example:
332+
=== Disjoint variable usage errors
333+
Since variable names are scoped to a pipeline, it's possible to write a query in which
334+
a certain variable is internal to several different patterns. For example:
336335
[,typeql]
337336
----
338337
match
339-
$p1 isa person; $p2 isa person;
340-
{
341-
$emp1 isa employment, links (employer: $company, employee: $p1);
342-
} or {
343-
$edu1 isa education, links (institute: $institute, attendee: $p2);
344-
};
345-
346-
not { $emp2 isa employment, links (employer: $company, employee: $p2); };
347-
not { $edu2 isa employment, links (institute: $institute, attendee: $p2); };
348-
----
349-
At first glance, this looks like a reasonable query:
350-
we query for persons `$p1` and `$p2` who neither worked for the same company,
351-
nor attended the same institute.
352-
However, you can see that the input variables for the negations (`$company` and `$institute`)
353-
are not bound in the parent conjunction. Hence, this is an invalid query.
354-
355-
==== DNF
356-
The best way to think about these requirements is to convert the query to Disjunctive Normal Form
357-
by rewriting the pattern using "distributivity" and examining each branch:
358-
359-
`A; { B; } or { C; };` becomes `{A; B} or {A; C};`
360-
361-
In this case, we get the pattern:
362-
[,typeql]
363-
----
364-
match
365-
{
366-
$p1 isa person; $p2 isa person;
367-
$emp1 isa employment, links (employer: $company, employee: $p1);
368-
not { $emp2 isa employment, links (employer: $company, employee: $p2); };
369-
not { $edu2 isa employment, links (institute: $institute, attendee: $p2); };
370-
} or {
371-
$p1 isa person; $p2 isa person;
372-
$edu1 isa education, links (institute: $institute, attendee: $p2);
373-
not { $emp2 isa employment, links (employer: $company, employee: $p2); };
374-
not { $edu2 isa education, links (institute: $institute, attendee: $p2); };
375-
};
376-
----
377-
Although this could now be a valid logic query,
378-
the first branch requires that `$p2` did not attend *any* institute,
379-
and the second branch requires that `$p2` was not employed by *any* employer.
380-
This is clearly not what we meant. Hence, we flag these as invalid TypeQL queries.
381-
382-
[NOTE]
383-
====
384-
There is, of course, a way to express the intended query:
385-
[,typeql]
386-
.The correct query
387-
----
388-
$p1 isa person; $p2 isa person;
389-
not {
390-
$emp1 isa employment, links (employer: $company, employee: $p1);
391-
$emp2 isa employment, links (employer: $company, employee: $p2);
392-
};
393-
not {
394-
$edu1 isa education, links (institute: $institute, attendee: $p2);
395-
$edu2 isa education, links (institute: $institute, attendee: $p2);
396-
};
397-
----
398-
====
399-
400-
=== Invalid patterns: Disjoint variable re-use
401-
402-
Consider another case of questionable query composition:
403-
[,typeql]
338+
{ $x isa person, has name "John"; } or { $y isa person, has name "James"; };
339+
{ $x isa person, has age 25; } or { $y isa person, has age 50; };
404340
----
405-
match
406-
$p1 isa person; $p2 isa person;
407-
{
408-
$emp1 isa employment, links (employer: $company, employee: $p1);
409-
} or {
410-
$edu1 isa education, links (institute: $institute, attendee: $p2);
411-
};
412-
{
413-
$emp2 isa employment, links (employer: $company, employee: $p2);
414-
} or {
415-
$edu2 isa education, links (institute: $institute, attendee: $p2);
416-
};
417-
----
418-
Ideally, this would be a query to find two persons `$p1` and `$p2` who
419-
were either employed by the same company, or attended the same institute.
420-
421-
The DNF quickly reveals the mistake:
422-
[,typeql]
423-
----
424-
match
425-
{
426-
$p1 isa person; $p2 isa person;
427-
$emp1 isa employment, links (employer: $company, employee: $p1);
428-
$emp2 isa employment, links (employer: $company, employee: $p2);
429-
} or {
430-
$p1 isa person; $p2 isa person;
431-
$edu1 isa education, links (institute: $institute, attendee: $p2);
432-
$emp2 isa employment, links (employer: $company, employee: $p2);
433-
} or
434-
{
435-
$p1 isa person; $p2 isa person;
436-
$emp1 isa employment, links (employer: $company, employee: $p1);
437-
$edu2 isa education, links (institute: $institute, attendee: $p2);
438-
} or {
439-
$p1 isa person; $p2 isa person;
440-
$edu1 isa education, links (institute: $institute, attendee: $p2);
441-
$edu2 isa education, links (institute: $institute, attendee: $p2);
442-
};
443-
----
444-
445-
You can see the query we meant to write in two of those branches:
446-
[,typeql]
447-
----
448-
match
449-
$p1 isa person; $p2 isa person;
450-
{
451-
$emp1 isa employment, links (employer: $company, employee: $p1);
452-
$emp2 isa employment, links (employer: $company, employee: $p2);
453-
} or {
454-
$edu1 isa education, links (institute: $institute, attendee: $p2);
455-
$edu2 isa education, links (institute: $institute, attendee: $p2);
456-
};
457-
----
458-
459-
The problem lies in the other two branches.
460-
[,typeql]
461-
----
462-
match
463-
{
464-
$p1 isa person; $p2 isa person;
465-
$emp1 isa employment, links (employer: $company, employee: $p1);
466-
$edu2 isa education, links (institute: $institute, attendee: $p2);
467-
} or {
468-
$p1 isa person; $p2 isa person;
469-
$edu1 isa education, links (institute: $institute, attendee: $p2);
470-
$emp2 isa employment, links (employer: $company, employee: $p2);
471-
};
472-
----
473-
This will return any persons `$p1` & `$p2` when
474-
either (1) `$p1` is employed by *any* and `$p2` attended *any* institute;
475-
or (2) `$p2` is employed by *any* company and `$p1` attended *any* institute.
476-
477-
Notice `$company` is "internal" to both the first and second disjunctions (The same is the case for `$institute`).
478-
TypeQL throws a "disjoint variable re-use" error for such cases.
341+
Such patterns are considered invalid and will throw disjoint variable usage errors.
479342

480343
==== Valid variable scopes:
481-
From these, it can be seen that a variable is either:
344+
Rephrasing the xref:reference::typeql/patterns/index.adoc#_scopes[unique scope principle], a variable is either:
482345

483346
1. Internal to one branch of one disjunction.
484347
2. Internal to one negation.
