
Implement first draft of milestone 1 #1

Open
linborland wants to merge 1 commit into mongodb-labs:main from linborland:main


@linborland


First draft of the agg experimental module for Milestone 1. The overall strategy is to copy/adapt the PHP aggregation YAML specs into Go-friendly shapes and use a small typed builder API to make aggregation stages more discoverable and safer.

This PR implements the first four Milestone 1 stages: Match, Sort, Set, and Project. It also introduces a first-pass typed expression model, a minimal query package for Match, and an initial set of operators to make those stages usable: arithmetic (Add, Subtract, Multiply, Divide), comparison (Eq, Ne, Gt, Gte, Lt, Lte), logical (And, Or, Not), and a few basic expression helpers such as Concat, IfNull, Literal, In, FilterArray, and ArrayToObject.

Copilot AI left a comment


Pull request overview

Introduces an experimental agg Go module that provides a typed aggregation pipeline builder (stages + expressions) and a minimal typed query builder intended for $match, alongside YAML operator/stage specs adapted for Go-oriented generation/testing.

Changes:

  • Adds core aggregation types (Stage, Pipeline) and first Milestone 1 stage builders: $match, $sort, $set, $project.
  • Introduces a first-pass typed expression model (Expr + typed sub-interfaces) and a starter set of expression operator constructors.
  • Adds initial typed query builder support (query.Filter, field conditions like $eq) plus YAML specs for stages, expressions, and query operators.

Reviewed changes

Copilot reviewed 29 out of 30 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| agg/stage.go | Stage/pipeline types and builders for $match, $sort, $set, $project. |
| agg/expr.go | Core typed expression interfaces, concrete value wrappers, field refs, and $literal. |
| agg/operator.go | Expression operator constructors (arithmetic/comparison/logical/string/array/misc). |
| agg/query/query.go | Minimal typed query filter builder intended for $match. |
| agg/go.mod | Declares the standalone agg Go module and driver dependency. |
| agg/go.sum | Dependency checksums for the agg module. |
| agg/spec/stage/match.yaml | YAML spec for $match stage including example pipelines. |
| agg/spec/stage/sort.yaml | YAML spec for $sort stage including textScore sort example. |
| agg/spec/stage/set.yaml | YAML spec for $set stage including common usage examples. |
| agg/spec/stage/project.yaml | YAML spec for $project stage including inclusion/exclusion/computed fields examples. |
| agg/spec/query/eq.yaml | YAML spec for query $eq operator used by $match. |
| agg/spec/expression/add.yaml | YAML spec for aggregation $add. |
| agg/spec/expression/subtract.yaml | YAML spec for aggregation $subtract. |
| agg/spec/expression/multiply.yaml | YAML spec for aggregation $multiply. |
| agg/spec/expression/divide.yaml | YAML spec for aggregation $divide. |
| agg/spec/expression/eq.yaml | YAML spec for aggregation $eq. |
| agg/spec/expression/ne.yaml | YAML spec for aggregation $ne. |
| agg/spec/expression/gt.yaml | YAML spec for aggregation $gt. |
| agg/spec/expression/gte.yaml | YAML spec for aggregation $gte. |
| agg/spec/expression/lt.yaml | YAML spec for aggregation $lt. |
| agg/spec/expression/lte.yaml | YAML spec for aggregation $lte. |
| agg/spec/expression/and.yaml | YAML spec for aggregation $and. |
| agg/spec/expression/or.yaml | YAML spec for aggregation $or. |
| agg/spec/expression/not.yaml | YAML spec for aggregation $not. |
| agg/spec/expression/concat.yaml | YAML spec for aggregation $concat. |
| agg/spec/expression/in.yaml | YAML spec for aggregation $in. |
| agg/spec/expression/ifNull.yaml | YAML spec for aggregation $ifNull. |
| agg/spec/expression/filter.yaml | YAML spec for aggregation $filter. |
| agg/spec/expression/arrayToObject.yaml | YAML spec for aggregation $arrayToObject. |
| agg/spec/expression/literal.yaml | YAML spec for aggregation $literal. |


Comment thread agg/stage.go
Comment on lines +3 to +22
import (
    "go.mongodb.org/mongo-driver/v2/bson"

    "github.com/mongodb-labs/mongo-go-driver-exp/agg/query"
)

// Stage is a single aggregation pipeline stage, e.g. { $match: ... }.
type Stage bson.D

// Pipeline is an ordered sequence of stages.
type Pipeline []Stage

// MarshalBSON implements bson.Marshaler for Pipeline.
func (p Pipeline) MarshalBSON() ([]byte, error) {
    stages := make([]bson.D, len(p))
    for i, s := range p {
        stages[i] = bson.D(s)
    }
    return bson.Marshal(stages)
}
Comment thread agg/expr.go
Comment on lines +56 to +66
func (e numericExprVal) MarshalBSONValue() (byte, []byte, error) {
    return marshalExprValue(e.v)
}

type boolExprVal struct{ v any }

func (e boolExprVal) expr() {}
func (e boolExprVal) boolExpr() {}
func (e boolExprVal) MarshalBSONValue() (byte, []byte, error) {
    return marshalExprValue(e.v)
}
Comment thread agg/expr.go
Comment on lines +3 to +5
import (
    "go.mongodb.org/mongo-driver/v2/bson"
)
Comment thread agg/expr.go
Comment on lines +72 to +82
func (e stringExprVal) MarshalBSONValue() (byte, []byte, error) {
    return marshalExprValue(e.v)
}

type arrayExprVal struct{ v any }

func (e arrayExprVal) expr() {}
func (e arrayExprVal) arrayExpr() {}
func (e arrayExprVal) MarshalBSONValue() (byte, []byte, error) {
    return marshalExprValue(e.v)
}
Comment thread agg/expr.go
Comment on lines +88 to +106
func (e dateExprVal) MarshalBSONValue() (byte, []byte, error) {
    return marshalExprValue(e.v)
}

type objectExprVal struct{ v any }

func (e objectExprVal) expr() {}
func (e objectExprVal) objectExpr() {}
func (e objectExprVal) MarshalBSONValue() (byte, []byte, error) {
    return marshalExprValue(e.v)
}

// genericExprVal backs Expr (base interface) return types.
type genericExprVal struct{ v any }

func (e genericExprVal) expr() {}
func (e genericExprVal) MarshalBSONValue() (byte, []byte, error) {
    return marshalExprValue(e.v)
}
Comment thread agg/expr.go
Comment on lines +149 to +153
// marshalExprValue is the shared marshal implementation for all expression types.
func marshalExprValue(v any) (byte, []byte, error) {
    typ, b, err := bson.MarshalValue(v)
    return byte(typ), b, err
}
Comment thread agg/operator.go
Comment on lines +112 to +114
func ArrayToObject(array ArrayExpr) ObjectExpr {
    return objectExprVal{v: bson.D{{Key: "$arrayToObject", Value: array}}}
}
Comment on lines +14 to +16
description: |
  Any valid expression.
- name: array
Comment on lines +8 to +10
description: |
  Return a value without parsing. Use for values that the aggregation pipeline may interpret as an expression. For example, apply a $literal expression to a string that starts with a dollar sign ($) to avoid parsing it as a field path.
arguments:

@matthewdale left a comment

Contributor

We should refactor the expr/operator functions to use generics instead of interfaces, which prevents the need to define wrapper functions like Field, NumericField, etc.

Contributor

I'm not sure committing the specs here is actually worth the additional review and maintenance overhead. Now that the specs are in a dedicated repo, we should make that a submodule and describe the necessary type mapping in an AGENTS.md or CLAUDE.md file.

Comment thread agg/stage.go
// result of expr.
func Compute(field string, expr Expr) ProjectionField {
    return ProjectionField{name: field, val: expr}
}

Contributor

I like this use of functions to build the most common patterns for projections.

Optional: Consider an alternate name for Compute, like Project.

Comment thread agg/stage.go
    name string
    // val is int32(1), int32(0), or an Expr. Constrained by constructors.
    val any
}

Contributor

This seems like a good candidate for an interface instead of a struct.

E.g.

type ProjectionField interface { projectionField() projectionField }

type projectionField struct {
    name string
    val  any
}

func (pf projectionField) projectionField() projectionField {
    return pf
}

func Include(field string) ProjectionField {
    return projectionField{name: field, val: int32(1)}
}
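As a rough illustration of how a stage builder might consume such a sealed interface, the sketch below adds `Exclude`, `Compute`, and a `Project` function on top of the suggestion. The `KV` type and the use of `any` in place of the PR's `Expr` are assumptions made to keep the example stdlib-only.

```go
package main

import "fmt"

// ProjectionField is sealed: its only method is unexported, so code outside
// this package cannot provide its own implementation.
type ProjectionField interface {
	projectionField() projectionField
}

type projectionField struct {
	name string
	val  any // int32(1), int32(0), or a computed expression
}

func (pf projectionField) projectionField() projectionField { return pf }

// Include keeps a field in the output documents.
func Include(field string) ProjectionField {
	return projectionField{name: field, val: int32(1)}
}

// Exclude drops a field from the output documents.
func Exclude(field string) ProjectionField {
	return projectionField{name: field, val: int32(0)}
}

// Compute sets a field to the result of an expression. The expr parameter
// is typed any here as a stand-in for the PR's Expr interface.
func Compute(field string, expr any) ProjectionField {
	return projectionField{name: field, val: expr}
}

// KV is a hypothetical ordered key/value pair standing in for bson.E.
type KV struct {
	Key   string
	Value any
}

// Project unwraps each field through the sealed method and preserves the
// order in which the caller listed the fields.
func Project(fields ...ProjectionField) []KV {
	spec := make([]KV, 0, len(fields))
	for _, f := range fields {
		pf := f.projectionField()
		spec = append(spec, KV{Key: pf.name, Value: pf.val})
	}
	return spec
}

func main() {
	for _, kv := range Project(Include("name"), Exclude("_id"), Compute("total", "$price")) {
		fmt.Printf("%s: %v\n", kv.Key, kv.Value)
	}
}
```

Because `Project` only ever reads fields through the unexported method, the invariant on `val` stays enforced by the constructors, which appears to be the point of the suggestion.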

Comment thread agg/stage.go
    for i, s := range p {
        stages[i] = bson.D(s)
    }
    return bson.Marshal(stages)

Contributor

The BSON representation of an agg pipeline is {"pipeline": <array of stages>}. Here's the updated bson.Marshal call.

Suggested change
- return bson.Marshal(stages)
+ return bson.Marshal(bson.D{{Key: "pipeline", Value: stages}})

@matthewdale left a comment

Contributor

I meant to "Request changes" with my previous review, sorry for the confusion!

3 participants