Bug: WITH ORDINALITY AS t(col, ord) alias gains a duplicate ord column when an Unnest node is rendered more than once in a single Generator.sql() pass
Before you file an issue
- ✅ Read dialect specified:
parse_one(sql, read="postgres")
- ✅ Write dialect specified:
ast.sql(dialect="postgres")
- ✅ Reproduces on
main (tested on sqlglot 25.0.0, 26.0.0, 27.21.0, 28.0.0, 28.5.0, 28.10.0, 29.0.1, 30.6.0)
Summary
When an Unnest node has both an offset and a TableAlias with explicit columns, Generator.unnest_sql mutates alias.columns by appending the offset identifier on every render of that node. If a generator codepath renders the same Unnest expression instance more than once during a single Generator.sql() pass — for example Postgres' single-arg TRIM(...) codepath, which calls self.sql(expression, "this") twice — the alias accumulates duplicate offset entries. The output SQL declares an alias column list one entry longer than the underlying SELECT, which Postgres then rejects with InvalidColumnReference: table "<alias>" has 2 columns available but 3 columns specified.
Fully reproducible code snippet
from sqlglot import parse_one
sql = """
SELECT TRIM(ARRAY_TO_STRING(ARRAY(
SELECT tok
FROM (SELECT tok, rn
FROM UNNEST(string_to_array('a b', ' ')) WITH ORDINALITY AS t(tok, rn)) d
WHERE NOT EXISTS (
SELECT 1
FROM UNNEST(string_to_array('a b', ' ')) WITH ORDINALITY AS s(tok2, rn2)
WHERE s.tok2 = d.tok AND s.rn2 < d.rn
)
ORDER BY rn
), ' ')) AS comp
"""
print(parse_one(sql, read="postgres").sql(dialect="postgres"))
Expected output (semantics-preserving)
SELECT TRIM(ARRAY_TO_STRING(ARRAY(SELECT tok FROM (SELECT tok, rn FROM UNNEST(STRING_TO_ARRAY('a b', ' ')) WITH ORDINALITY AS t(tok, rn)) AS d WHERE NOT EXISTS(SELECT 1 FROM UNNEST(STRING_TO_ARRAY('a b', ' ')) WITH ORDINALITY AS s(tok2, rn2) WHERE s.tok2 = d.tok AND s.rn2 < d.rn) ORDER BY rn), ' ')) AS comp
Actual output
SELECT TRIM(ARRAY_TO_STRING(ARRAY(SELECT tok FROM (SELECT tok, rn FROM UNNEST(STRING_TO_ARRAY('a b', ' ')) WITH ORDINALITY AS t(tok, rn, rn)) AS d WHERE NOT EXISTS(SELECT 1 FROM UNNEST(STRING_TO_ARRAY('a b', ' ')) WITH ORDINALITY AS s(tok2, rn2, rn2) WHERE s.tok2 = d.tok AND s.rn2 < d.rn) ORDER BY rn), ' ')) AS comp
The alias column list (tok, rn) is emitted as (tok, rn, rn) and (tok2, rn2) as (tok2, rn2, rn2) — a duplicate offset column in each.
Root cause (mutation evidence)
Generator.unnest_sql appends the offset identifier into alias.columns on every call rather than rendering it transiently. Calling the generator on the same Unnest node N times produces N appended copies:
from sqlglot import exp, parse_one
from sqlglot.dialects.postgres import Postgres
tree = parse_one(
"SELECT * FROM UNNEST(string_to_array('a b', ' ')) WITH ORDINALITY AS t(tok, rn)",
read="postgres",
)
unnest = next(tree.find_all(exp.Unnest))
gen = Postgres().generator()
print([c.name for c in unnest.args["alias"].args.get("columns") or []])
# ['tok']
print(gen.sql(unnest))
# UNNEST(STRING_TO_ARRAY('a b', ' ')) WITH ORDINALITY AS t(tok, rn)
print([c.name for c in unnest.args["alias"].args.get("columns") or []])
# ['tok', 'rn'] ← mutated
print(gen.sql(unnest))
# UNNEST(STRING_TO_ARRAY('a b', ' ')) WITH ORDINALITY AS t(tok, rn, rn)
print([c.name for c in unnest.args["alias"].args.get("columns") or []])
# ['tok', 'rn', 'rn'] ← mutated again
In the bug scenario the whole tree is rendered exactly once via tree.sql(...), but the inner Unnest is visited twice due to Postgres' TRIM single-arg codepath (generator.py calls self.sql(expression, "this") twice for Trim), producing the (tok, rn, rn) output above.
Why it surfaces in nested contexts only
A top-level UNNEST(...) WITH ORDINALITY AS t(tok, rn) is rendered exactly once, so the mutation doesn't accumulate. The bug only surfaces when the Unnest is reached through a generator codepath that renders the same Expression instance more than once in a pass (e.g. inside a TRIM(...) argument).
Workaround
Pre-fold the offset column into alias.columns before rendering, then replace the offset argument on the Unnest with a non-Expression truthy sentinel so unnest_sql still emits WITH ORDINALITY but skips the mutation branch:
from sqlglot import exp, parse_one
def _fold_unnest_ordinality_offset(expression: exp.Expression) -> None:
for unnest in expression.find_all(exp.Unnest):
offset = unnest.args.get("offset")
alias = unnest.args.get("alias")
if not isinstance(offset, exp.Expression):
continue
if not isinstance(alias, exp.TableAlias):
continue
columns = alias.args.get("columns") or []
if not any(
isinstance(c, exp.Identifier) and c.name == offset.name for c in columns
):
alias.append("columns", offset.copy())
unnest.set("offset", True)
tree = parse_one(sql, read="postgres")
_fold_unnest_ordinality_offset(tree)
print(tree.sql(dialect="postgres"))
# correct output: AS t(tok, rn) ... AS s(tok2, rn2) ...
Suggested fix
Either (a) make unnest_sql non-mutating — e.g. emit WITH ORDINALITY AS {alias.name}({alias.columns}, {offset}) directly without modifying alias — or (b) make the codepaths that render the same Expression instance twice (Trim, etc.) operate on a copy.
Official Postgres documentation
Versions tested
sqlglot==25.0.0, 26.0.0, 27.21.0, 28.0.0, 28.5.0, 28.10.0, 29.0.1, 30.6.0 — bug present in all.
Environment
- Python 3.13.5
- macOS 25.4.0 (darwin arm64)
- target dialect:
postgres (server: PostgreSQL 17.7)
Bug:
WITH ORDINALITY AS t(col, ord)alias gains a duplicateordcolumn when anUnnestnode is rendered more than once in a singleGenerator.sql()passBefore you file an issue
parse_one(sql, read="postgres")ast.sql(dialect="postgres")main(tested on sqlglot 25.0.0, 26.0.0, 27.21.0, 28.0.0, 28.5.0, 28.10.0, 29.0.1, 30.6.0)Summary
When an
Unnestnode has both anoffsetand aTableAliaswith explicitcolumns,Generator.unnest_sqlmutatesalias.columnsby appending the offset identifier on every render of that node. If a generator codepath renders the same Unnest expression instance more than once during a singleGenerator.sql()pass — for example Postgres' single-argTRIM(...)codepath, which callsself.sql(expression, "this")twice — the alias accumulates duplicate offset entries. The output SQL declares an alias column list one entry longer than the underlying SELECT, which Postgres then rejects withInvalidColumnReference: table "<alias>" has 2 columns available but 3 columns specified.Fully reproducible code snippet
Expected output (semantics-preserving)
Actual output
The alias column list
(tok, rn)is emitted as(tok, rn, rn)and(tok2, rn2)as(tok2, rn2, rn2)— a duplicate offset column in each.Root cause (mutation evidence)
Generator.unnest_sqlappends the offset identifier intoalias.columnson every call rather than rendering it transiently. Calling the generator on the sameUnnestnode N times produces N appended copies:In the bug scenario the whole tree is rendered exactly once via
tree.sql(...), but the innerUnnestis visited twice due to Postgres'TRIMsingle-arg codepath (generator.pycallsself.sql(expression, "this")twice forTrim), producing the(tok, rn, rn)output above.Why it surfaces in nested contexts only
A top-level
UNNEST(...) WITH ORDINALITY AS t(tok, rn)is rendered exactly once, so the mutation doesn't accumulate. The bug only surfaces when theUnnestis reached through a generator codepath that renders the same Expression instance more than once in a pass (e.g. inside aTRIM(...)argument).Workaround
Pre-fold the offset column into
alias.columnsbefore rendering, then replace theoffsetargument on theUnnestwith a non-Expressiontruthy sentinel sounnest_sqlstill emitsWITH ORDINALITYbut skips the mutation branch:Suggested fix
Either (a) make
unnest_sqlnon-mutating — e.g. emitWITH ORDINALITY AS {alias.name}({alias.columns}, {offset})directly without modifyingalias— or (b) make the codepaths that render the same Expression instance twice (Trim, etc.) operate on a copy.Official Postgres documentation
SELECT/FROM— Table Functions andWITH ORDINALITY— definesfunction_name (...) [WITH ORDINALITY] [[AS] table_alias [(column_alias [, ...])]]. WithWITH ORDINALITY, exactly onebigintcolumn is appended; the column-alias list therefore contains the function's columns followed by exactly one ordinality column — never two.SRFs andWITH ORDINALITY— same contract.Versions tested
sqlglot==25.0.0,26.0.0,27.21.0,28.0.0,28.5.0,28.10.0,29.0.1,30.6.0— bug present in all.Environment
postgres(server: PostgreSQL 17.7)