Description
Consider a query used in a scalar context, such as SELECT
clause, EXISTS
expression, IN
expression, or an argument of a scalar function.
For convenience, assume a query that returns one row, e.g., @funsql from(person).limit(1)
. It does return multiple columns: person_id
, gender_concept_id
, etc. How should this query be interpreted when used in a scalar context, such as:
@funsql select(from(person).limit(1))
This is a challenge because in a scalar context, a query must return exactly one column. Currently, the returned column is NULL
unless the column is explicitly specified with a select()
combinator. Thus, select(from(person).limit(1))
returns NULL
, but select(from(person).limit(1).select(person_id))
returns the value of person_id
.
This interpretation allows us to accept any query in a scalar context, which is particularly useful for EXISTS
. However, it may cause confusion when the query is used as an argument of IN
or a scalar function.
There is a better interpretation: A query used in a scalar context should return its first column.
This interpretation does not change the semantics of a query with an explicit select()
. For queries without select()
, it would pick the first column of a table, which is typically its primary key. This allows us to write, for example
@funsql begin
cohort() = begin
from(person)
filter(gender_concept_id == 8532)
end
relevant_visit() = begin
from(visit_occurrence)
filter(person_id in cohort()) # rather than `in cohort().select(person_id)`
end
end