Problem
When building structured outputs that contain recursive data structures (e.g., expression trees, nested comments, org charts), there is no way to reference a schema node from within itself. Currently, the only workaround is to manually inline the schema N levels deep, which:
- Bloats the schema: every inlined level repeats the full node definition, so the payload sent to the provider grows with each level (linearly for a single recursive field, exponentially when a node has several)
- Imposes an artificial depth limit: you must choose a max depth at build time, and anything deeper is silently degraded to leaf-only nodes
- Hurts model accuracy: providers like OpenAI handle $ref natively in structured outputs and perform better with compact, canonical schemas than with deeply inlined duplicates
- Makes code harder to maintain: recursive structures require helper methods with depth counters instead of a straightforward declarative definition
Real-world use case
A rule/formula builder where expressions form a tree:
{
  "operation": "&&",
  "children": [
    {
      "operation": ">",
      "children": [
        { "operation": null, "formula_value": { "type": "value", "form_element_id": 12 } },
        { "operation": null, "formula_value": { "type": "number", "constant": 100 } }
      ]
    },
    {
      "operation": null,
      "formula_value": { "type": "option_selected", "form_option_id": 45 }
    }
  ]
}
The children array contains items of the same type as the parent — a textbook recursive schema.
Expected behavior
Something like:
$expression = $schema->object([
    'operation' => $schema->string()->nullable()->required(),
    'children' => $schema->array()
        ->nullable()
        ->items($schema->ref('formula_expression'))
        ->required(),
    'formula_value' => $formulaValue->nullable()->required(),
])->name('formula_expression'); // registers in $defs
Which would produce the standard JSON Schema output:
{
  "$defs": {
    "formula_expression": {
      "type": "object",
      "properties": {
        "operation": { "type": ["string", "null"] },
        "children": {
          "type": ["array", "null"],
          "items": { "$ref": "#/$defs/formula_expression" }
        },
        "formula_value": { "$ref": "#/$defs/formula_value" }
      },
      "required": ["operation", "children", "formula_value"]
    }
  }
}
API proposal
Suggested minimal API surface (open to discussion):
| Method | Purpose |
| --- | --- |
| $schema->ref(string $name) | Emit a { "$ref": "#/$defs/{name}" } pointer |
| ->name(string $name) | Register the current node under $defs with the given name |
Alternatively, a single combined method could work:
$schema->define('formula_expression', function (JsonSchema $schema) {
    return $schema->object([
        'children' => $schema->array()->items($schema->ref('formula_expression')),
    ]);
});
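To make the proposal concrete, here is a minimal sketch of how a definition registry could back ref() and define(). It works on plain PHP arrays and is not the framework's JsonSchema builder: the SchemaRegistry class and its toArray() method are hypothetical names introduced only for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: a tiny registry on plain arrays, not the framework's JsonSchema class.
final class SchemaRegistry
{
    /** @var array<string, array<string, mixed>> Named definitions, emitted under $defs */
    private array $defs = [];

    /** Return a pointer to a named definition (it may be registered later). */
    public function ref(string $name): array
    {
        return ['$ref' => '#/$defs/' . $name];
    }

    /** Build a node, register it under $defs, and return a pointer to it. */
    public function define(string $name, callable $build): array
    {
        $this->defs[$name] = $build($this);

        return $this->ref($name);
    }

    /** Combine a root node with every registered definition. */
    public function toArray(array $root): array
    {
        return $root + ['$defs' => $this->defs];
    }
}

$registry = new SchemaRegistry();

// The callback can reference 'formula_expression' before it is registered,
// because ref() only emits a pointer and never resolves it.
$root = $registry->define('formula_expression', fn (SchemaRegistry $s): array => [
    'type' => 'object',
    'properties' => [
        'operation' => ['type' => ['string', 'null']],
        'children' => [
            'type' => ['array', 'null'],
            'items' => $s->ref('formula_expression'),
        ],
    ],
    'required' => ['operation', 'children'],
]);

echo json_encode($registry->toArray($root), JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

The key design point is that ref() is lazy: it never looks the name up, so self-references and forward references need no special handling at build time.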
Provider compatibility
| Provider | $ref / $defs support |
| --- | --- |
| OpenAI (structured outputs) | ✅ Fully supported and documented |
| Anthropic (tool use) | ✅ Supported in tool input_schema |
| Google Gemini | ⚠️ Limited; may need inlining as fallback |
For providers that don't support $ref, the framework could automatically inline the schema up to a configurable depth as a fallback — which is exactly what users have to do manually today.
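The automatic fallback could look roughly like the sketch below. It operates on plain arrays, and several things are assumptions: the function name inlineRefs, the meaning of the depth budget, and the choice to degrade an exhausted pointer to a bare { "type": "object" } placeholder are illustrative, not existing framework behavior.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical fallback: replace local "#/$defs/..." pointers with copies of
 * their targets. $depth is how many more times a pointer may be expanded
 * along a single path through the schema.
 */
function inlineRefs(array $node, array $defs, int $depth): array
{
    if (isset($node['$ref']) && is_string($node['$ref'])) {
        $name = substr($node['$ref'], strlen('#/$defs/'));

        if ($depth <= 0 || !isset($defs[$name])) {
            // Budget exhausted or unknown target: degrade to a generic
            // placeholder (an assumption; a real fallback might emit a leaf-only node).
            return ['type' => 'object'];
        }

        return inlineRefs($defs[$name], $defs, $depth - 1);
    }

    // Not a pointer: walk nested arrays (properties, items, ...) with the same budget.
    foreach ($node as $key => $value) {
        if (is_array($value)) {
            $node[$key] = inlineRefs($value, $defs, $depth);
        }
    }

    return $node;
}

$defs = [
    'formula_expression' => [
        'type' => 'object',
        'properties' => [
            'operation' => ['type' => ['string', 'null']],
            'children' => [
                'type' => ['array', 'null'],
                'items' => ['$ref' => '#/$defs/formula_expression'],
            ],
        ],
    ],
];

// Expand the recursive definition twice, then stop.
$inlined = inlineRefs(['$ref' => '#/$defs/formula_expression'], $defs, 2);

echo json_encode($inlined, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

With a budget of 2, the output contains two full copies of the expression node, and the innermost children items fall back to the placeholder.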
Current workaround
Recursive depth-limited inlining via helper methods:
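private function buildFormulaExpression(
    JsonSchema $schema,
    mixed $formulaValue,
    array $enumValues,
    int $depth
): mixed {
    if ($depth <= 0) {
        // Base case: leaf-only node, no children.
        // Returning here is what stops the recursion.
        return $schema->object([
            // ... (leaf fields only, no 'children')
        ]);
    }

    // Recursive case: inline one more level of children
    $child = $this->buildFormulaExpression(
        $schema, $formulaValue, $enumValues, $depth - 1
    );

    return $schema->object([
        'children' => $schema->array()->items($child)->nullable(),
        // ...
    ]);
}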
This works, but the schema grows with every inlined level, ending up far larger than the $ref equivalent, and nesting depth is capped artificially.
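To get a feel for the size difference, the sketch below compares the encoded size of a $ref-based schema with inlined versions at a few depths. It uses plain arrays and a simplified expression node with a single recursive field (children), so growth here is linear in depth. The helper name inlinedExpression and the chosen depths are illustrative assumptions.

```php
<?php
declare(strict_types=1);

/** Build a simplified expression node with `children` inlined $depth levels deep. */
function inlinedExpression(int $depth): array
{
    $node = [
        'type' => 'object',
        'properties' => [
            'operation' => ['type' => ['string', 'null']],
        ],
    ];

    if ($depth > 0) {
        // Each level embeds a full copy of the next level down.
        $node['properties']['children'] = [
            'type' => ['array', 'null'],
            'items' => inlinedExpression($depth - 1),
        ];
    }

    return $node;
}

// The same node expressed once, referencing itself through $defs.
$refSchema = [
    '$ref' => '#/$defs/formula_expression',
    '$defs' => [
        'formula_expression' => [
            'type' => 'object',
            'properties' => [
                'operation' => ['type' => ['string', 'null']],
                'children' => [
                    'type' => ['array', 'null'],
                    'items' => ['$ref' => '#/$defs/formula_expression'],
                ],
            ],
        ],
    ],
];

printf("ref:      %d bytes\n", strlen(json_encode($refSchema)));
foreach ([2, 5, 10] as $depth) {
    printf("depth %2d: %d bytes\n", $depth, strlen(json_encode(inlinedExpression($depth))));
}
```

The $ref version stays the same size no matter how deep the data nests, while a real node with more fields, or more than one recursive field, would make the inlined versions grow considerably faster than in this simplified case.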