@asoorm asoorm commented Nov 28, 2025

Overview

This PR introduces a ConnectRPC handler that enables gRPC/Connect/gRPC-Web clients to interact with GraphQL APIs through protocol translation. The implementation provides a bridge between protobuf-based RPC protocols and GraphQL operations.

Key Features

  • Multi-Protocol Support: Handles gRPC, Connect, and gRPC-Web protocols via Vanguard transcoder
  • Service Discovery: Convention-based service discovery from directory structure
  • Operation Registry: Service-scoped GraphQL operation management
  • Error Handling: Relay-inspired error classification (CRITICAL vs NON-CRITICAL)
  • Header Forwarding: Transparent header propagation to GraphQL endpoint

Architecture

File Responsibilities

  • handler.go: Core RPC request handler, GraphQL execution, error classification
  • operation_registry.go: Thread-safe registry for service-scoped GraphQL operations
  • proto_loader.go: Protobuf file parsing and service definition extraction
  • server.go: HTTP/2 server with lifecycle management (start/stop/reload)
  • service_discovery.go: Convention-based service discovery from filesystem
  • vanguard_service.go: Vanguard integration for protocol transcoding
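
As a rough illustration of what "thread-safe" and "service-scoped" mean for the registry listed above, here is a minimal sketch. The `Registry` and `Operation` types and their method names are hypothetical and are not taken from operation_registry.go; the sketch only assumes a read/write mutex guarding a map keyed first by service and then by operation name.

```go
package main

import (
	"fmt"
	"sync"
)

// Operation is a hypothetical stand-in for the stored GraphQL operation.
type Operation struct {
	Name     string
	Type     string // "query" or "mutation"
	Document string
}

// Registry groups operations by service name behind a read/write lock.
type Registry struct {
	mu  sync.RWMutex
	ops map[string]map[string]Operation
}

func NewRegistry() *Registry {
	return &Registry{ops: make(map[string]map[string]Operation)}
}

// Register stores op under service, replacing any operation with the same name.
func (r *Registry) Register(service string, op Operation) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.ops[service] == nil {
		r.ops[service] = make(map[string]Operation)
	}
	r.ops[service][op.Name] = op
}

// Get looks an operation up inside one service only; other services are never consulted.
func (r *Registry) Get(service, name string) (Operation, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	op, ok := r.ops[service][name]
	return op, ok
}

// Count returns the number of operations registered for service.
func (r *Registry) Count(service string) int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return len(r.ops[service])
}

// Clear drops every operation for every service.
func (r *Registry) Clear() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.ops = make(map[string]map[string]Operation)
}

func main() {
	reg := NewRegistry()
	reg.Register("employee.v1.EmployeeService", Operation{Name: "GetEmployeeById", Type: "query"})

	_, inEmployee := reg.Get("employee.v1.EmployeeService", "GetEmployeeById")
	_, inProduct := reg.Get("product.v1.ProductService", "GetEmployeeById")
	fmt.Println(inEmployee, inProduct) // true false
}
```

Scoping lookups by service means two services can expose operations with the same name without colliding.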

Supporting Files

  • handler_error_test.go: Error handling and classification tests
  • handler_test.go: Core handler functionality tests
  • operation_registry_test.go: Registry thread-safety and service-scoping tests
  • proto_loader_test.go: Proto parsing tests
  • server_test.go: Server configuration tests
  • server_lifecycle_test.go: Server lifecycle and reload tests
  • service_discovery_test.go: Service discovery convention tests
  • vanguard_service_test.go: Protocol transcoding tests

Manual Testing

Prerequisites

Start the router with a config that enables ConnectRPC, for example:

listen_addr: "localhost:3002"
graphql_path: "/graphql"

# ConnectRPC configuration
connect_rpc:
  enabled: true
  server:
    listen_addr: "localhost:5026"
  services_provider_id: "fs-services"
  graphql_endpoint: "http://localhost:3002/graphql"

# Storage providers for services directory
storage_providers:
  file_system:
    - id: "fs-services"
      path: "./pkg/connectrpc/samples/services"

The services directory should contain, for each service, a proto file and the named GraphQL operations. See ./pkg/connectrpc/samples/services as an example:

tree ./pkg/connectrpc/samples/services 
./pkg/connectrpc/samples/services
└── employee.v1
    ├── MutationUpdateEmployeeMood.graphql
    ├── QueryGetEmployeeById.graphql
    ├── QueryGetEmployeeByPets.graphql
    ├── QueryGetEmployeeWithMood.graphql
    ├── QueryGetEmployees.graphql
    ├── QueryGetEmployeesByPetsInlineFragment.graphql
    ├── QueryGetEmployeesByPetsNamedFragment.graphql
    ├── service.proto
    └── service.proto.lock.json

Testing with curl (Connect Protocol)

Client Validation Error

curl -s -X POST http://localhost:5026/employee.v1.EmployeeService/GetEmployeeById \
  -H "Content-Type: application/json" \
  -H "Connect-Protocol-Version: 1" \
  -d '{"employee_id": "a"}' | jq
{
  "code": "invalid_argument",
  "message": "invalid request: field 'employee_id': Int32 cannot represent non-integer value: a"
}

OK

curl -s -X POST http://localhost:5026/employee.v1.EmployeeService/GetEmployeeById \
  -H "Content-Type: application/json" \
  -H "Connect-Protocol-Version: 1" \
  -d '{"employee_id": 1}' | jq  
{
  "employee": {
    "id": 1,
    "tag": "",
    "details": {
      "pets": null,
      "location": {
        "key": {
          "name": "Germany"
        }
      },
      "forename": "Jens",
      "surname": "Neuse"
    }
  }
}
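
The same unary Connect call can be made from Go with only the standard library. The sketch below mirrors the curl invocation above (JSON body, Content-Type and Connect-Protocol-Version headers); `newConnectRequest` is a hypothetical helper name, and the address assumes the router from the Prerequisites section is listening on localhost:5026.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// newConnectRequest builds a unary Connect-protocol request with a JSON body,
// using the same headers as the curl examples.
func newConnectRequest(baseURL, procedure string, payload any) (*http.Request, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, fmt.Errorf("encode payload: %w", err)
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/"+procedure, bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Connect-Protocol-Version", "1")
	return req, nil
}

func main() {
	req, err := newConnectRequest(
		"http://localhost:5026",
		"employee.v1.EmployeeService/GetEmployeeById",
		map[string]any{"employee_id": 1},
	)
	if err != nil {
		fmt.Println(err)
		return
	}

	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("request failed (is the router running?):", err)
		return
	}
	defer resp.Body.Close()

	data, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read response:", err)
		return
	}
	fmt.Println(resp.Status)
	fmt.Println(string(data))
}
```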

Testing with grpcurl (gRPC Protocol)

Note: grpcurl needs the proto file passed via -proto because the server does not support the gRPC reflection protocol.

grpcurl -plaintext \
  -proto ./pkg/connectrpc/samples/services/employee.v1/service.proto \
  -d '{"employee_id": 1}' \
  localhost:5026 \
  employee.v1.EmployeeService/GetEmployeeById | jq
{
  "employee": {
    "id": 1,
    "details": {
      "forename": "Jens",
      "surname": "Neuse",
      "location": {
        "key": {
          "name": "Germany"
        }
      }
    }
  }
}

Testing with buf CLI (Connect Protocol)

buf curl --http2-prior-knowledge --protocol connect \
  --schema ./pkg/connectrpc/samples/services/employee.v1/service.proto \
  --data '{"employee_id": 1}' \
  http://localhost:5026/employee.v1.EmployeeService/GetEmployeeById | jq
{
  "employee": {
    "id": 1,
    "details": {
      "forename": "Jens",
      "surname": "Neuse",
      "location": {
        "key": {
          "name": "Germany"
        }
      }
    }
  }
}

Testing with buf CLI (gRPC-Web Protocol)

buf curl --protocol grpcweb \
  --schema ./pkg/connectrpc/samples/services/employee.v1/service.proto \
  --data '{"employee_id": 1}' \
  http://localhost:5026/employee.v1.EmployeeService/GetEmployeeById

Service Discovery Convention

Services are discovered using this directory structure:

services/
├── employee.v1/
│   ├── employee.proto          # Service definition
│   └── operations/
│       ├── GetEmployee.graphql
│       └── UpdateEmployee.graphql
└── product.v1/
    ├── product.proto
    └── operations/
        └── GetProduct.graphql
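
To make the convention concrete, the following sketch walks a services directory, treats each direct subdirectory as one service, requires at most one .proto file per service, and collects every .graphql file beneath it. It is not the code in service_discovery.go: `DiscoveredService`, `discoverServices`, and the decision to skip directories without a proto file are assumptions of this example, and it does not parse the proto to extract the package or service name.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// DiscoveredService is a hypothetical result type for this sketch.
type DiscoveredService struct {
	Dir        string
	ProtoFile  string
	Operations []string
}

// discoverServices treats every direct subdirectory of root as one service.
func discoverServices(root string) ([]DiscoveredService, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		return nil, err
	}
	var services []DiscoveredService
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		svc := DiscoveredService{Dir: filepath.Join(root, e.Name())}
		err := filepath.WalkDir(svc.Dir, func(p string, d fs.DirEntry, walkErr error) error {
			if walkErr != nil {
				return walkErr
			}
			if d.IsDir() {
				return nil
			}
			switch strings.ToLower(filepath.Ext(p)) {
			case ".proto":
				if svc.ProtoFile != "" {
					return fmt.Errorf("service %q has more than one proto file", e.Name())
				}
				svc.ProtoFile = p
			case ".graphql":
				svc.Operations = append(svc.Operations, p)
			}
			return nil
		})
		if err != nil {
			return nil, err
		}
		if svc.ProtoFile == "" {
			continue // assumption: a directory without a proto file is not a service
		}
		services = append(services, svc)
	}
	return services, nil
}

// discoverSample recreates the layout shown above in a temp dir and runs discovery on it.
func discoverSample() ([]DiscoveredService, error) {
	root, err := os.MkdirTemp("", "services")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	files := []string{
		"employee.v1/employee.proto",
		"employee.v1/operations/GetEmployee.graphql",
		"employee.v1/operations/UpdateEmployee.graphql",
		"product.v1/product.proto",
		"product.v1/operations/GetProduct.graphql",
	}
	for _, f := range files {
		p := filepath.Join(root, filepath.FromSlash(f))
		if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(p, nil, 0o644); err != nil {
			return nil, err
		}
	}
	return discoverServices(root)
}

func main() {
	services, err := discoverSample()
	if err != nil {
		fmt.Println("discovery failed:", err)
		return
	}
	for _, s := range services {
		fmt.Printf("%s: proto=%s operations=%d\n",
			filepath.Base(s.Dir), filepath.Base(s.ProtoFile), len(s.Operations))
	}
}
```

Because the walk is recursive, the sketch accepts operations placed either next to the proto file or in an operations/ subdirectory.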

Related Issues

Closes #ENG-8277

Summary by CodeRabbit

  • New Features

    • Add ConnectRPC: GraphQL-backed RPC server with service discovery, lifecycle (start/reload/stop), and multi-protocol client support (Connect/gRPC/gRPC‑Web)
    • Auto-generated sample Employee service and client bindings
  • Changes

    • New ConnectRPC configuration block and sample/default configs
    • Header forwarding improved and ConnectRPC integrated into graceful shutdown
    • Operation names now require PascalCase; CLI flag --prefix-operation-type removed
  • Tests

    • Extensive unit, integration, and helper tests covering ConnectRPC, validation, discovery, and end-to-end flows

coderabbitai bot commented Nov 28, 2025

Warning

Rate limit exceeded

@asoorm has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 8 minutes and 6 seconds before requesting another review.

📥 Commits

Reviewing files that changed from the base of the PR and between 91096a7 and 46c3f55.

⛔ Files ignored due to path filters (1)
  • router-tests/testdata/connectrpc/client/employee.v1/service.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (27)
  • protographic/src/operation-to-proto.ts (2 hunks)
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1 hunks)
  • router-tests/testdata/connectrpc/README.md (1 hunks)
  • router-tests/testdata/connectrpc/buf.gen.yaml (1 hunks)
  • router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1 hunks)
  • router/core/router.go (5 hunks)
  • router/core/router_config.go (4 hunks)
  • router/pkg/config/config.go (2 hunks)
  • router/pkg/config/config.schema.json (1 hunks)
  • router/pkg/config/connectrpc_test.go (1 hunks)
  • router/pkg/connectrpc/constructor_validation_test.go (1 hunks)
  • router/pkg/connectrpc/error_handling_test.go (1 hunks)
  • router/pkg/connectrpc/handler.go (1 hunks)
  • router/pkg/connectrpc/handler_test.go (1 hunks)
  • router/pkg/connectrpc/operation_registry.go (1 hunks)
  • router/pkg/connectrpc/operation_registry_test.go (1 hunks)
  • router/pkg/connectrpc/proto_loader.go (1 hunks)
  • router/pkg/connectrpc/server.go (1 hunks)
  • router/pkg/connectrpc/server_test.go (1 hunks)
  • router/pkg/connectrpc/service_discovery.go (1 hunks)
  • router/pkg/connectrpc/service_discovery_test.go (1 hunks)
  • router/pkg/connectrpc/test_helpers.go (1 hunks)
  • router/pkg/connectrpc/validator.go (1 hunks)
  • router/pkg/connectrpc/vanguard_service.go (1 hunks)
  • router/pkg/connectrpc/vanguard_service_test.go (1 hunks)
  • router/pkg/httputil/headers.go (1 hunks)
  • router/pkg/mcpserver/server.go (2 hunks)

Walkthrough

Adds a ConnectRPC subsystem and integrates it into the router: configuration/schema, server, proto loader, operation registry, GraphQL→Connect handler, Vanguard service layer, service discovery, validation, header utilities, CLI/operation-name validation changes, many unit and integration tests, sample protos/GraphQL, and dependency updates.

Changes

Cohort / File(s) Summary
Config & Defaults
router/connect.config.yaml, router/pkg/config/config.go, router/pkg/config/config.schema.json, router/pkg/config/testdata/config_defaults.json, router/pkg/config/testdata/config_full.json, router/pkg/config/connectrpc_test.go
Add ConnectRPC configuration types/fields, JSON schema, YAML example, default/full testdata and config loading tests.
Router Core Integration
router/core/router.go, router/core/router_config.go, router/core/supervisor_instance.go
Integrate ConnectRPC bootstrap/start/stop into router lifecycle; add Config fields and public option WithConnectRPC(cfg config.ConnectRPCConfiguration).
ConnectRPC Server & Lifecycle
router/pkg/connectrpc/server.go, router/pkg/connectrpc/server_test.go, router-tests/connectrpc/connectrpc_server_lifecycle_test.go, router-tests/connectrpc/connectrpc_test_helpers.go
Implement Server (NewServer, Start, Stop, Reload), initialization, h2c HTTP serving, introspection APIs, lifecycle tests and test helpers.
Service Discovery
router/pkg/connectrpc/service_discovery.go, router/pkg/connectrpc/service_discovery_test.go
Discover services from filesystem (one proto per service), extract package/service, enforce uniqueness, collect GraphQL operation files; add tests.
Proto Loader
router/pkg/connectrpc/proto_loader.go, router/pkg/connectrpc/proto_loader_test.go, router/pkg/connectrpc/test_helpers.go
New ProtoLoader with local descriptor registry, ServiceDefinition/MethodDefinition types, directory loading, accessors and tests plus shared test helpers.
Operation Registry
router/pkg/connectrpc/operation_registry.go, router/pkg/connectrpc/operation_registry_test.go
Thread-safe in-memory registry for GraphQL operations per service with load/get/count/clear APIs and tests.
GraphQL→Connect Handler
router/pkg/connectrpc/handler.go, router/pkg/connectrpc/handler_test.go, router/pkg/connectrpc/error_handling_test.go, router/pkg/connectrpc/constructor_validation_test.go, router/pkg/connectrpc/test_helpers.go
RPCHandler that validates requests, converts proto-JSON → GraphQL variables (snake_case→camelCase), forwards headers, executes GraphQL calls, classifies errors and attaches metadata; extensive tests.
Vanguard Integration
router/pkg/connectrpc/vanguard_service.go, router/pkg/connectrpc/vanguard_service_test.go
Vanguard-based service wrapper registering services/methods, per-service HTTP handlers, Connect↔HTTP error mapping, and metadata inspection APIs with tests.
Validation
router/pkg/connectrpc/validator.go
MessageValidator to validate JSON payloads against protobuf input descriptors; ValidationError type and ValidateMessage method.
Header Utilities & MCP Update
router/pkg/httputil/headers.go, router/pkg/mcpserver/server.go
Centralize SkippedHeaders set and update MCP server header-forwarding to use it.
Samples & Testdata (proto + graphql)
router/pkg/connectrpc/samples/services/employee.v1/service.proto, router/pkg/connectrpc/samples/services/employee.v1/*.graphql, router/pkg/connectrpc/samples/services/employee.v1/service.proto.lock.json, router-tests/testdata/connectrpc/services/employee.v1/service.proto, router-tests/testdata/connectrpc/services/employee.v1/*.graphql, router-tests/testdata/connectrpc/services/employee.v1/service.proto.lock.json, router-tests/testdata/connectrpc/buf.yaml, router-tests/testdata/connectrpc/buf.gen.yaml, router-tests/testdata/connectrpc/README.md
Add employee.v1 proto, lock files and numerous GraphQL operations for samples and integration tests; add Buf config and README for test data.
Generated Client Bindings (tests)
router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go
Add generated Connect client/server bindings used by integration tests.
Integration Tests & Helpers
router-tests/connectrpc/*, router-tests/testdata/connectrpc/*
Add client protocol tests (Connect/gRPC/gRPC-Web), server lifecycle, error mapping, concurrency tests, integration test helpers and testdata.
ConnectRPC Package Tests & Helpers
router/pkg/connectrpc/*_test.go, router/pkg/connectrpc/test_helpers.go
Numerous unit tests and helpers across handler, registry, proto loader, server, validator, Vanguard service, and error handling.
CLI & Operation-to-Proto Changes
cli/src/commands/grpc-service/commands/generate.ts, protographic/src/operation-to-proto.ts, protographic/tests/operations/operation-validation.test.ts
Remove --prefix-operation-type flag and propagation; enforce PascalCase operation names and update validation/tests; stop auto-transforming operationName into methodName.
Dependencies
router/go.mod, router-tests/go.mod
Add/adjust dependencies: connectrpc.com/vanguard v0.3.0, github.com/jhump/protoreflect v1.17.0, promote golang.org/x/net to direct where applicable.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Areas requiring extra attention:

  • router/pkg/connectrpc/handler.go — proto→GraphQL conversion, recursive snake_case→camelCase logic, header forwarding, error metadata and classification.
  • router/pkg/connectrpc/server.go — initialize/reload concurrency, listener/handler swap, Start/Stop lifecycle and graceful shutdown semantics.
  • router/pkg/connectrpc/vanguard_service.go — HTTP handler routing, Connect↔HTTP error mapping and JSON error wiring.
  • router/pkg/connectrpc/service_discovery.go — filesystem traversal assumptions, single-proto-per-directory enforcement and duplicate full-name detection.
  • router/core/router.go & router/core/router_config.go — bootstrap wiring, option handling and shutdown sequencing to ensure no resource leaks.
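
For reviewers looking at the first item, a recursive snake_case→camelCase key conversion over decoded proto-JSON could look roughly like the sketch below. The function names are hypothetical and this is not the code in handler.go; it assumes ASCII field names and that only object keys, never values, are renamed.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// snakeToCamel converts "employee_id" to "employeeId". Field names are assumed to be ASCII.
func snakeToCamel(s string) string {
	parts := strings.Split(s, "_")
	var b strings.Builder
	b.WriteString(parts[0])
	for _, p := range parts[1:] {
		if p == "" {
			continue // tolerate doubled or trailing underscores
		}
		b.WriteString(strings.ToUpper(p[:1]) + p[1:])
	}
	return b.String()
}

// convertKeys renames every object key, recursing into nested objects and arrays.
// Scalar values are returned unchanged.
func convertKeys(v any) any {
	switch t := v.(type) {
	case map[string]any:
		out := make(map[string]any, len(t))
		for k, val := range t {
			out[snakeToCamel(k)] = convertKeys(val)
		}
		return out
	case []any:
		out := make([]any, len(t))
		for i, val := range t {
			out[i] = convertKeys(val)
		}
		return out
	default:
		return v
	}
}

// protoJSONToVariables decodes a proto-JSON request body into GraphQL variables.
func protoJSONToVariables(body []byte) (map[string]any, error) {
	var decoded map[string]any
	if err := json.Unmarshal(body, &decoded); err != nil {
		return nil, err
	}
	return convertKeys(decoded).(map[string]any), nil
}

func main() {
	vars, err := protoJSONToVariables([]byte(`{"employee_id": 1, "filter": {"has_pets": true}}`))
	if err != nil {
		fmt.Println(err)
		return
	}
	out, _ := json.Marshal(vars)
	fmt.Println(string(out)) // {"employeeId":1,"filter":{"hasPets":true}}
}
```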

Possibly related PRs

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 51.85% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely summarizes the main feature: adding a ConnectRPC handler that bridges GraphQL to gRPC protocols. It is specific, focused on the primary change, and directly related to the substantial codebase additions.

@asoorm asoorm changed the title from "feat(router): Add ConnectRPC handler for GraphQL-to-gRPC bridge" to "feat(router): add connectrpc handler for graphql-to-grpc bridge" on Nov 28, 2025
@asoorm asoorm marked this pull request as draft November 28, 2025 14:05

github-actions bot commented Nov 28, 2025

Router image scan passed

✅ No security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-6c2394f3c2f9e6b31d9431fed93d458b7e4be43a

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 9

🧹 Nitpick comments (23)
router/pkg/connectrpc/proto_loader.go (1)

322-326: Document that GetServices returns the internal map.

The comment at line 323 mentions treating the returned map as read-only, but this is not enforced. Consider returning a copy of the map or documenting this more prominently in the method's godoc.

Apply this diff to make the read-only contract clearer:

-// GetServices returns all loaded service definitions.
-// The returned map should be treated as read-only to prevent accidental mutation.
+// GetServices returns all loaded service definitions.
+// IMPORTANT: The returned map is the internal map and must NOT be modified.
+// Modifying the returned map will cause data races and undefined behavior.
 func (pl *ProtoLoader) GetServices() map[string]*ServiceDefinition {
 	return pl.services
 }
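
If returning a copy is preferred over documentation, the alternative mentioned above could look like this sketch. The `ProtoLoader` and `ServiceDefinition` types here are simplified stand-ins rather than the real definitions, the copy is shallow (the pointed-to definitions are still shared), and any locking the real loader may need is left out.

```go
package main

import "fmt"

// ServiceDefinition is a simplified stand-in for the real type.
type ServiceDefinition struct {
	FullName string
}

// ProtoLoader is a simplified stand-in holding only the services map.
type ProtoLoader struct {
	services map[string]*ServiceDefinition
}

// GetServices returns a shallow copy, so callers can add or delete keys
// without touching the loader's internal map.
func (pl *ProtoLoader) GetServices() map[string]*ServiceDefinition {
	out := make(map[string]*ServiceDefinition, len(pl.services))
	for name, def := range pl.services {
		out[name] = def
	}
	return out
}

func main() {
	pl := &ProtoLoader{services: map[string]*ServiceDefinition{
		"employee.v1.EmployeeService": {FullName: "employee.v1.EmployeeService"},
	}}

	got := pl.GetServices()
	delete(got, "employee.v1.EmployeeService") // only affects the copy

	fmt.Println(len(got), len(pl.GetServices())) // 0 1
}
```

The copy costs one allocation per call, which only matters if GetServices sits on a hot path.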
router/pkg/connectrpc/proto_loader_test.go (1)

20-107: Comprehensive test coverage for proto loading.

Tests properly verify service metadata, method properties, and expected method names. Consider adding t.Parallel() to subtests for faster execution, though this is optional.

router/pkg/connectrpc/operation_registry_test.go (1)

438-447: Consider handling or logging errors from LoadOperationsForService in concurrent test.

The concurrent clearer goroutines discard the error from LoadOperationsForService. While this is acceptable for stress-testing thread safety, consider at minimum logging unexpected errors to aid debugging if the test fails.

 			go func() {
 				defer wg.Done()
 				for j := 0; j < 50; j++ {
 					registry.Clear()
-					_ = registry.LoadOperationsForService(serviceName, []string{opFile})
+					if err := registry.LoadOperationsForService(serviceName, []string{opFile}); err != nil {
+						t.Logf("LoadOperationsForService error (expected during stress): %v", err)
+					}
 				}
 			}()
router/core/router.go (3)

966-972: Consider using consistent URL construction pattern.

Using path.Join(r.listenAddr, r.graphqlPath) produces a path like localhost:3002/graphql without a scheme. While the downstream NewServer normalizes this by adding http:// if missing, using url.JoinPath (as done at line 1362) would be more semantically correct and consistent.

 		// Determine the router GraphQL endpoint
 		var routerGraphQLEndpoint string
 		if r.connectRPC.GraphQLEndpoint != "" {
 			routerGraphQLEndpoint = r.connectRPC.GraphQLEndpoint
 		} else {
-			routerGraphQLEndpoint = path.Join(r.listenAddr, r.graphqlPath)
+			routerGraphQLEndpoint = fmt.Sprintf("http://%s%s", r.listenAddr, r.graphqlPath)
 		}

Note: The same pattern appears in MCP initialization at line 907.
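
The difference between the two approaches can be seen in a small standalone sketch. `graphqlEndpoint` is a hypothetical helper, and hardcoding the http scheme is an assumption of this example rather than something taken from the router code.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// graphqlEndpoint builds a full URL from a host:port listen address and a path.
// The "http" scheme is an assumption of this sketch.
func graphqlEndpoint(listenAddr, graphqlPath string) (string, error) {
	return url.JoinPath("http://"+listenAddr, graphqlPath)
}

func main() {
	// path.Join only cleans slash-separated paths; the result has no scheme.
	fmt.Println(path.Join("localhost:3002", "/graphql")) // localhost:3002/graphql

	endpoint, err := graphqlEndpoint("localhost:3002", "/graphql")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(endpoint) // http://localhost:3002/graphql
}
```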


954-964: Service discovery result is underutilized.

The discoveredServices slice is computed here but only used for logging the count. The actual service loading is performed again inside connectrpc.NewServer. Consider either passing the discovered services to NewServer to avoid redundant discovery, or removing this discovery call if it's only for early validation/logging.


1527-1538: Type assertion for Stop method is fragile and inconsistent with the mcpServer pattern.

Verification results:

  1. connectRPCServer is declared as interface{} (line 89 in router_config.go).
  2. It is assigned crpcServer, which has the concrete type *connectrpc.Server (line 1005).
  3. connectrpc.Server does have a Stop(context.Context) error method (line 201 in router/pkg/connectrpc/server.go).
  4. The type assertion at line 1532 works correctly but defeats compile-time type safety.
  5. mcpServer demonstrates the better pattern: concrete type *mcpserver.GraphQLSchemaServer with direct method calls (line 1521) - no type assertion needed.
  6. connectRPCServer is only assigned once, with a known concrete type.

The connectRPCServer field is declared as interface{} but always assigned the concrete type *connectrpc.Server, which has a Stop(context.Context) error method. The type assertion works but sacrifices compile-time safety. For consistency with mcpServer (which uses concrete type *mcpserver.GraphQLSchemaServer and calls Stop() directly at line 1521), declare connectRPCServer as *connectrpc.Server instead. This eliminates the type assertion and provides type safety at compile time.
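
A reduced sketch of the trade-off follows. `Server`, `looseRouter`, and `typedRouter` are hypothetical stand-ins for *connectrpc.Server and the router struct; only the shape of the field and the call site are meant to match the discussion above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Server is a stand-in for *connectrpc.Server; only Stop matters here.
type Server struct{}

func (s *Server) Stop(ctx context.Context) error { return nil }

// looseRouter mirrors the interface{} field: Stop is reachable only
// through a runtime type assertion.
type looseRouter struct {
	connectRPCServer interface{}
}

func (r *looseRouter) shutdown(ctx context.Context) error {
	if r.connectRPCServer == nil {
		return nil
	}
	stopper, ok := r.connectRPCServer.(interface{ Stop(context.Context) error })
	if !ok {
		return errors.New("connectRPCServer does not implement Stop")
	}
	return stopper.Stop(ctx)
}

// typedRouter uses the concrete pointer type, so a wrong assignment or a
// missing Stop method is rejected by the compiler.
type typedRouter struct {
	connectRPCServer *Server
}

func (r *typedRouter) shutdown(ctx context.Context) error {
	if r.connectRPCServer == nil {
		return nil
	}
	return r.connectRPCServer.Stop(ctx)
}

func main() {
	ctx := context.Background()

	// Compiles, but the mistake only shows up when shutdown runs.
	loose := &looseRouter{connectRPCServer: "not a server"}
	fmt.Println("loose:", loose.shutdown(ctx)) // loose: connectRPCServer does not implement Stop

	typed := &typedRouter{connectRPCServer: &Server{}}
	fmt.Println("typed:", typed.shutdown(ctx)) // typed: <nil>
}
```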

router/pkg/connectrpc/vanguard_service_test.go (5)

253-260: Direct map access bypasses OperationRegistry encapsulation.

Directly manipulating opRegistry.operations couples tests to internal implementation details. If the registry's internal structure changes, these tests will break. Consider adding a test helper method to the OperationRegistry or using the public API if available.

-		if opRegistry.operations[serviceName] == nil {
-			opRegistry.operations[serviceName] = make(map[string]*schemaloader.Operation)
-		}
-		opRegistry.operations[serviceName]["QueryGetEmployeeById"] = &schemaloader.Operation{
+		// Use a public method or test helper to register operations
+		opRegistry.RegisterOperation(serviceName, &schemaloader.Operation{
 			Name:            "QueryGetEmployeeById",
 			OperationType:   "query",
 			OperationString: "query QueryGetEmployeeById($id: Int!) { employee(id: $id) { id name } }",
-		}
+		})

273-277: Inefficient pattern to retrieve a single service definition.

This pattern iterates through a map just to get one element. Consider exposing a method like GetFirstService() or GetService(name) on ProtoLoader for cleaner test code.


316-356: Repeated service extraction pattern across multiple sub-tests.

The same 8-line pattern for extracting a service definition is repeated in lines 317-324, 338-345, 359-366, and 380-387. Extract this into a helper function to reduce duplication and improve maintainability.

func getFirstServiceDef(t *testing.T, protoLoader *ProtoLoader) *ServiceDefinition {
	t.Helper()
	services := protoLoader.GetServices()
	require.NotEmpty(t, services)
	for _, svc := range services {
		return svc
	}
	return nil
}

240-244: Ignoring error from w.Write.

While this is test code and unlikely to fail, ignoring the error return value is not idiomatic. Consider using a helper or explicitly discarding with _.

-			w.Write([]byte(`{"data":{"employee":{"id":1,"name":"Test Employee"}}}`))
+			_, _ = w.Write([]byte(`{"data":{"employee":{"id":1,"name":"Test Employee"}}}`))

474-504: Helper function also directly accesses internal map.

Same encapsulation concern as noted earlier. This helper sets up the operation registry by directly manipulating internal state.

router-tests/connectrpc_test.go (2)

23-27: Ignoring error from w.Write in mock server.

-			w.Write([]byte(`{"data":{"employees":[{"id":1}]}}`))
+			_, _ = w.Write([]byte(`{"data":{"employees":[{"id":1}]}}`))

70-73: Same issue: ignoring w.Write error in mock server.

-			w.Write([]byte(`{"data":{}}`))
+			_, _ = w.Write([]byte(`{"data":{}}`))
router/pkg/connectrpc/server_lifecycle_test.go (3)

46-50: Ignored error return from w.Write.

The error returned by w.Write is not checked. While this is test code, it's good practice to handle or explicitly ignore errors.

-		w.Write([]byte(`{"data":{}}`))
+		_, _ = w.Write([]byte(`{"data":{}}`))

108-111: Ignored error return from server.Stop.

For test hygiene, consider checking the error from server.Stop or explicitly ignoring it.

 		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
 		defer cancel()
-		server.Stop(ctx)
+		_ = server.Stop(ctx)
 	})

154-157: Ignored error return from server.Stop.

Same pattern as above - consider explicitly ignoring the error for consistency.

 		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
 		defer cancel()
-		server.Stop(ctx)
+		_ = server.Stop(ctx)
 	})
router/pkg/connectrpc/operation_registry.go (1)

218-230: Consider pre-allocating slice for GetAllOperations.

The slice could be pre-allocated for a minor performance improvement, similar to GetAllOperationsForService.

 func (r *OperationRegistry) GetAllOperations() []schemaloader.Operation {
 	r.mu.RLock()
 	defer r.mu.RUnlock()

-	var operations []schemaloader.Operation
+	// Count total operations for pre-allocation
+	total := 0
+	for _, serviceOps := range r.operations {
+		total += len(serviceOps)
+	}
+	operations := make([]schemaloader.Operation, 0, total)
 	for _, serviceOps := range r.operations {
 		for _, op := range serviceOps {
 			operations = append(operations, *op)
 		}
 	}

 	return operations
 }
router/pkg/connectrpc/server.go (3)

71-74: Protocol prefix check may have edge cases.

Using strings.Contains(config.GraphQLEndpoint, "://") could match URLs where :// appears in the path or query string (though unlikely). Consider using strings.HasPrefix with known protocols.

 	// Add protocol if missing
-	if !strings.Contains(config.GraphQLEndpoint, "://") {
+	if !strings.HasPrefix(config.GraphQLEndpoint, "http://") && !strings.HasPrefix(config.GraphQLEndpoint, "https://") {
 		config.GraphQLEndpoint = "http://" + config.GraphQLEndpoint
 	}

176-182: HTTP server timeouts are hardcoded.

The server timeouts are hardcoded to 30s/30s/60s while config.RequestTimeout exists but is only used for the HTTP client. Consider using the configured timeout or adding separate server timeout configuration.

 	s.httpServer = &http.Server{
 		Addr:         s.config.ListenAddr,
 		Handler:      h2cHandler,
-		ReadTimeout:  30 * time.Second,
-		WriteTimeout: 30 * time.Second,
-		IdleTimeout:  60 * time.Second,
+		ReadTimeout:  s.config.RequestTimeout,
+		WriteTimeout: s.config.RequestTimeout,
+		IdleTimeout:  s.config.RequestTimeout * 2,
 	}

344-349: Captured statusCode is unused.

The statusCode field is captured but never read. If this is for future observability/logging, consider adding a comment. Otherwise, it could be removed.

Either add usage (e.g., logging) or remove the unused field:

 type responseWriter struct {
 	http.ResponseWriter
-	statusCode int
+	statusCode int // Captured for future observability/metrics
 }
router/pkg/connectrpc/samples/services/employee.v1/employee.proto (1)

50-58: Unusual field number gaps may indicate removed fields.

Field numbers jump from 3 to 53-54-57-58, which is valid but unusual. This pattern often indicates fields were removed. If this is sample/test data, consider using sequential numbering for clarity.

router/pkg/connectrpc/testdata/employee_only/employee.proto (1)

50-58: Field number gaps present in test data.

Field numbers jump from 3 to 53-54-57-58. For test data, sequential numbering improves readability and maintenance.

router/pkg/connectrpc/handler.go (1)

436-444: hasData check could be more robust.

The check for partial data uses string comparisons that may miss edge cases:

  • Empty array [] might represent valid partial data
  • Whitespace variations like "{ }" would be considered as having data
  • No trimming of whitespace before comparison

Consider a more robust check:

-		hasData := len(graphqlResponse.Data) > 0 && string(graphqlResponse.Data) != "null" && string(graphqlResponse.Data) != "{}"
+		// Check if data is meaningful (not null, empty object, or empty array)
+		dataStr := strings.TrimSpace(string(graphqlResponse.Data))
+		hasData := len(dataStr) > 0 && 
+			dataStr != "null" && 
+			dataStr != "{}" && 
+			dataStr != "[]"
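
Note that trimming alone still treats "{ }" as data, because the whitespace sits inside the braces. One way to cover that case is to compact the JSON before comparing, as in the sketch below. `hasMeaningfulData` is a hypothetical helper and not part of handler.go; treating invalid JSON as "no data" is an assumption of this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// hasMeaningfulData reports whether a GraphQL "data" payload carries anything
// beyond null, an empty object, or an empty array. Insignificant whitespace is
// removed first, so "{ }" and " null " are handled as well.
func hasMeaningfulData(data json.RawMessage) bool {
	var compact bytes.Buffer
	if err := json.Compact(&compact, data); err != nil {
		return false // empty or invalid JSON is treated as no data
	}
	switch compact.String() {
	case "", "null", "{}", "[]":
		return false
	}
	return true
}

func main() {
	for _, in := range []string{`null`, `{ }`, ` [] `, `{"employee":null}`} {
		fmt.Printf("%-20q %v\n", in, hasMeaningfulData(json.RawMessage(in)))
	}
}
```

A payload such as {"employee":null} still counts as data here; whether that should be classified as partial data is a separate decision for the error classification logic.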
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5451293 and c36e3f2.

⛔ Files ignored due to path filters (1)
  • router/go.sum is excluded by !**/*.sum
📒 Files selected for processing (31)
  • router-tests/connectrpc_test.go (1 hunks)
  • router/connect.config.yaml (1 hunks)
  • router/core/router.go (4 hunks)
  • router/core/router_config.go (3 hunks)
  • router/core/supervisor_instance.go (1 hunks)
  • router/go.mod (3 hunks)
  • router/pkg/config/config.go (2 hunks)
  • router/pkg/config/config.schema.json (1 hunks)
  • router/pkg/config/connectrpc_test.go (1 hunks)
  • router/pkg/connectrpc/handler.go (1 hunks)
  • router/pkg/connectrpc/handler_error_test.go (1 hunks)
  • router/pkg/connectrpc/handler_test.go (1 hunks)
  • router/pkg/connectrpc/operation_registry.go (1 hunks)
  • router/pkg/connectrpc/operation_registry_test.go (1 hunks)
  • router/pkg/connectrpc/proto_loader.go (1 hunks)
  • router/pkg/connectrpc/proto_loader_test.go (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/MutationUpdateEmployeeTag.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeById.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployees.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetProducts.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/employee.proto (1 hunks)
  • router/pkg/connectrpc/server.go (1 hunks)
  • router/pkg/connectrpc/server_lifecycle_test.go (1 hunks)
  • router/pkg/connectrpc/server_test.go (1 hunks)
  • router/pkg/connectrpc/service_discovery.go (1 hunks)
  • router/pkg/connectrpc/service_discovery_test.go (1 hunks)
  • router/pkg/connectrpc/testdata/employee_only/employee.proto (1 hunks)
  • router/pkg/connectrpc/testdata/examples/product.v1/product_service.proto (1 hunks)
  • router/pkg/connectrpc/testdata/examples/user.v1/user_service.proto (1 hunks)
  • router/pkg/connectrpc/vanguard_service.go (1 hunks)
  • router/pkg/connectrpc/vanguard_service_test.go (1 hunks)
🧰 Additional context used
🧠 Learnings (11)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router/pkg/connectrpc/vanguard_service_test.go
  • router/pkg/connectrpc/handler_test.go
  • router/pkg/connectrpc/handler_error_test.go
  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
  • router-tests/connectrpc_test.go
  • router/pkg/connectrpc/handler_error_test.go
  • router/pkg/connectrpc/operation_registry_test.go
  • router/pkg/config/connectrpc_test.go
  • router/pkg/connectrpc/service_discovery_test.go
  • router/pkg/connectrpc/server_lifecycle_test.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the query body before any APQ operations occur.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.

Applied to files:

  • router/pkg/config/config.go
  • router/core/router.go
  • router/core/supervisor_instance.go
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.

Applied to files:

  • router/go.mod
  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.23+ minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/go.mod
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.25 minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/go.mod
📚 Learning: 2025-07-21T15:06:36.664Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/config/config.schema.json:1637-1644
Timestamp: 2025-07-21T15:06:36.664Z
Learning: In the Cosmo router project, when extending JSON schema validation for security-sensitive fields like JWKS secrets, backwards compatibility is maintained by implementing warnings in the Go code rather than hard validation constraints in the schema. This allows existing configurations to continue working while alerting users to potential security issues.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-07-21T14:46:34.879Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/authentication/jwks_token_decoder.go:80-106
Timestamp: 2025-07-21T14:46:34.879Z
Learning: In the Cosmo router project, required field validation for JWKS configuration (Secret, Algorithm, KeyId) is handled at the JSON schema level in config.schema.json rather than through runtime validation in the Go code at router/pkg/authentication/jwks_token_decoder.go.

Applied to files:

  • router/pkg/config/config.schema.json
🧬 Code graph analysis (14)
router/pkg/connectrpc/server_test.go (1)
router/pkg/connectrpc/server.go (2)
  • NewServer (49-89)
  • ServerConfig (21-33)
router/pkg/connectrpc/vanguard_service_test.go (4)
router/pkg/connectrpc/vanguard_service.go (2)
  • NewVanguardService (31-56)
  • VanguardServiceConfig (16-20)
router/pkg/connectrpc/proto_loader.go (3)
  • ProtoLoader (53-59)
  • NewProtoLoader (62-72)
  • ServiceDefinition (17-30)
router/pkg/connectrpc/handler.go (3)
  • RPCHandler (138-143)
  • NewRPCHandler (154-182)
  • HandlerConfig (146-151)
router/pkg/connectrpc/operation_registry.go (2)
  • NewOperationRegistry (32-41)
  • OperationRegistry (24-29)
router/pkg/connectrpc/handler_test.go (2)
router/pkg/connectrpc/operation_registry.go (2)
  • NewOperationRegistry (32-41)
  • OperationRegistry (24-29)
router/pkg/connectrpc/handler.go (2)
  • NewRPCHandler (154-182)
  • HandlerConfig (146-151)
router-tests/connectrpc_test.go (3)
router-tests/testenv/testenv.go (2)
  • Run (105-122)
  • Environment (1733-1769)
router/pkg/connectrpc/server.go (2)
  • NewServer (49-89)
  • ServerConfig (21-33)
router/pkg/connectrpc/handler.go (1)
  • GraphQLRequest (112-115)
router/pkg/connectrpc/operation_registry_test.go (1)
router/pkg/connectrpc/operation_registry.go (1)
  • NewOperationRegistry (32-41)
router/pkg/config/connectrpc_test.go (1)
router/pkg/config/config.go (3)
  • ConnectRPCConfiguration (995-1000)
  • LoadConfig (1129-1241)
  • Config (1022-1098)
router/pkg/connectrpc/service_discovery_test.go (1)
router/pkg/connectrpc/service_discovery.go (2)
  • DiscoverServices (46-158)
  • ServiceDiscoveryConfig (29-34)
router/pkg/connectrpc/operation_registry.go (1)
router/core/context.go (3)
  • OperationTypeQuery (506-506)
  • OperationTypeMutation (507-507)
  • OperationTypeSubscription (508-508)
router/pkg/connectrpc/server_lifecycle_test.go (2)
router/pkg/connectrpc/proto_loader.go (2)
  • ProtoLoader (53-59)
  • NewProtoLoader (62-72)
router/pkg/connectrpc/server.go (3)
  • Server (36-46)
  • NewServer (49-89)
  • ServerConfig (21-33)
router/core/router.go (3)
router/pkg/connectrpc/server.go (3)
  • Server (36-46)
  • ServerConfig (21-33)
  • NewServer (49-89)
router/pkg/connectrpc/service_discovery.go (2)
  • DiscoverServices (46-158)
  • ServiceDiscoveryConfig (29-34)
router/pkg/config/config.go (1)
  • ConnectRPCConfiguration (995-1000)
router/core/supervisor_instance.go (1)
router/core/router.go (1)
  • WithConnectRPC (2212-2216)
router/pkg/connectrpc/handler.go (1)
router/pkg/connectrpc/operation_registry.go (1)
  • OperationRegistry (24-29)
router/core/router_config.go (1)
router/pkg/config/config.go (1)
  • ConnectRPCConfiguration (995-1000)
router/pkg/connectrpc/vanguard_service.go (2)
router/pkg/connectrpc/handler.go (1)
  • RPCHandler (138-143)
router/pkg/connectrpc/proto_loader.go (3)
  • ProtoLoader (53-59)
  • ServiceDefinition (17-30)
  • MethodDefinition (33-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: build-router
  • GitHub Check: build_test
  • GitHub Check: integration_test (./events)
  • GitHub Check: build_push_image
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: image_scan
  • GitHub Check: image_scan (nonroot)
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: build_test
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (40)
router/pkg/connectrpc/samples/services/employee.v1/QueryGetProducts.graphql (1)

1-16: ✓ Sample query demonstrates proper inline fragment usage.

The GraphQL query is syntactically correct and effectively demonstrates querying a union/interface type (productTypes) with type-specific fields. The pattern of using inline fragments for different product types is appropriate and showcases how the operation registry handles multi-type queries.

router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeById.graphql (1)

1-20: LGTM!

The query is syntactically valid and properly follows the operation naming convention (Query<OperationName>) expected by the service registry. The nested field selection is correctly structured, and the parameterized query with $id: Int! is properly defined for the operation registry integration.

router/pkg/connectrpc/handler_test.go (1)

1-344: LGTM! Comprehensive test coverage.

The test file provides excellent coverage of the RPC handler functionality, including:

  • Constructor validation with various configurations
  • RPC request handling (success and error cases)
  • Header forwarding from context
  • HTTP transport error handling
  • Reload, operation counting, and validation

The use of mock implementations and test helpers follows Go testing best practices.

router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployees.graphql (1)

1-19: LGTM! Valid sample GraphQL query.

The GraphQL query is well-formed and appropriate for testing the ConnectRPC-to-GraphQL bridge functionality.

router/core/supervisor_instance.go (1)

271-271: LGTM! Consistent ConnectRPC option integration.

The addition of WithConnectRPC(config.ConnectRPC) follows the established pattern for router options and properly integrates the ConnectRPC configuration.

router/core/router_config.go (1)

89-89: LGTM! ConnectRPC configuration properly integrated.

The additions properly integrate ConnectRPC configuration into the router:

  • Field for the ConnectRPC server instance
  • Configuration field matching the config package structure
  • Usage tracking following the established pattern

Also applies to: 121-121, 310-310

router/connect.config.yaml (1)

1-20: LGTM! Well-structured sample configuration.

The configuration file provides a clear example of ConnectRPC setup with:

  • Proper schema hint for editor support
  • Standard router settings
  • ConnectRPC server configuration with local addresses
  • File system storage provider for service discovery

This aligns well with the ConnectRPC integration introduced in the PR.

router/pkg/connectrpc/server_test.go (1)

1-113: LGTM! Thorough server lifecycle testing.

The test suite comprehensively validates:

  • Server construction with various configurations
  • Default value application (listen address, timeout)
  • Protocol auto-completion for GraphQL endpoint
  • Error handling for missing required fields
  • Service information consistency across start/stop lifecycle

The tests follow Go best practices and provide good coverage of the server's public API.

router/go.mod (1)

60-60: All dependency versions are secure; golang.org/x/net v0.46.0 is current with all known vulnerabilities patched.

Verification results:

  • connectrpc.com/vanguard v0.3.0: No security advisories
  • github.com/jhump/protoreflect v1.17.0: No security advisories
  • golang.org/x/net v0.46.0: Multiple historical vulnerabilities exist, but all are in versions below v0.46.0. The current version includes all patches and is safe.

The three dependencies are compatible and secure as specified in the PR.

router/pkg/config/config.go (1)

995-1005: LGTM! Configuration follows established patterns.

The new ConnectRPCConfiguration and ConnectRPCServer structs mirror the existing MCP configuration pattern, with appropriate env variable bindings and sensible defaults (e.g., localhost:5026 avoids port conflicts with MCP on 5025).

router/pkg/connectrpc/proto_loader_test.go (1)

11-18: Well-designed test helper.

Good use of t.Helper() for proper stack trace attribution. The shared helper reduces duplication across test files.

router/pkg/connectrpc/samples/services/employee.v1/MutationUpdateEmployeeTag.graphql (1)

1-11: LGTM! Valid GraphQL mutation following naming conventions.

The mutation properly declares required variables (Int!, String!), and the operation name matches the filename, aligning with the ConnectRPC service discovery conventions.

router/pkg/connectrpc/operation_registry_test.go (3)

14-36: Good coverage of constructor behavior.

Tests properly verify both logger and nil logger paths, ensuring the defensive nil check in NewOperationRegistry works correctly.


385-452: Solid thread-safety testing approach.

The concurrent read and read-with-clear patterns properly exercise the RWMutex implementation. Good use of sync.WaitGroup for coordination.


454-499: Excellent isolation test for service-scoped operations.

This test correctly verifies that operations with the same name in different service namespaces remain isolated, which is critical for multi-service scenarios.

router/core/router.go (2)

926-929: Good validation of required configuration.

The early validation of ServicesProviderID provides a clear error message when the required configuration is missing.


2212-2216: LGTM! Option function follows established pattern.

The WithConnectRPC option follows the same pattern as other configuration options in this file.

router/pkg/connectrpc/handler_error_test.go (3)

17-48: Comprehensive HTTP status to Connect code mapping tests.

Good coverage of both 4xx and 5xx status codes with appropriate Connect error code mappings. The table-driven approach makes it easy to add new cases.
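
For readers unfamiliar with the shape of such a mapping, here is a minimal table-driven sketch. It follows the HTTP-to-error-code table from the Connect protocol specification and returns code names as plain strings; the function name `connectCodeForHTTPStatus` is hypothetical, and the handler in this PR may map some statuses differently.

```go
package main

import (
	"fmt"
	"net/http"
)

// connectCodeForHTTPStatus maps an HTTP status to a Connect error code name.
// Assumption: the table mirrors the Connect protocol specification, not
// necessarily the mapping implemented in handler.go.
func connectCodeForHTTPStatus(status int) string {
	switch status {
	case http.StatusBadRequest:
		return "internal"
	case http.StatusUnauthorized:
		return "unauthenticated"
	case http.StatusForbidden:
		return "permission_denied"
	case http.StatusNotFound:
		return "unimplemented"
	case http.StatusTooManyRequests,
		http.StatusBadGateway,
		http.StatusServiceUnavailable,
		http.StatusGatewayTimeout:
		return "unavailable"
	default:
		return "unknown"
	}
}

func main() {
	// Table-driven cases: adding a new status is a one-line change.
	cases := []struct {
		status int
		want   string
	}{
		{http.StatusUnauthorized, "unauthenticated"},
		{http.StatusNotFound, "unimplemented"},
		{http.StatusServiceUnavailable, "unavailable"},
		{http.StatusTeapot, "unknown"},
	}
	for _, c := range cases {
		got := connectCodeForHTTPStatus(c.status)
		fmt.Printf("%d -> %s (match: %v)\n", c.status, got, got == c.want)
	}
}
```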


134-255: Well-structured critical error tests with inline snapshots.

The tests properly validate error classification, Connect error codes, and GraphQL error metadata. Using JSONEq for semantic comparison is the right approach.


257-397: Good coverage of partial data (non-critical) error scenarios.

Tests correctly verify that partial data is captured in metadata alongside error information, which is important for the Relay-inspired error classification mentioned in the PR objectives.

router/pkg/config/connectrpc_test.go (3)

12-20: Good test for zero-value safety invariant.

Testing that the zero value of ConnectRPCConfiguration represents a disabled state is an important safety check that protects against accidental enablement.


22-118: Comprehensive YAML loading tests with table-driven approach.

Good coverage of different configuration scenarios including minimal config, full overrides, environment variable expansion, and disabled state. The use of t.Setenv ensures proper cleanup.


171-203: Multi-file config merge test validates important behavior.

This test ensures that ConnectRPC configuration can be split across multiple files with proper override semantics, which is useful for environment-specific configurations.

router-tests/connectrpc_test.go (3)

40-44: Good shutdown pattern with context timeout.

The deferred cleanup with a timeout context ensures the server is properly stopped even if tests fail, preventing resource leaks.
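
The pattern can be sketched with the standard library alone. This is an illustration under assumptions, not the test's actual code: it uses a plain `http.Server` on an ephemeral loopback port instead of the ConnectRPC server, and the helper names `startAndStop` and `demo` are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// startAndStop starts a server and then shuts it down with a bounded context,
// so a stuck shutdown cannot hang the caller indefinitely.
func startAndStop(timeout time.Duration) error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	srv := &http.Server{Handler: http.NotFoundHandler()}
	serveErr := make(chan error, 1)
	go func() { serveErr <- srv.Serve(ln) }()

	// The timeout bounds how long Shutdown waits for in-flight requests.
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return err
	}
	// Serve reports http.ErrServerClosed after a clean shutdown.
	if err := <-serveErr; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// demo runs the helper with a two-second budget.
func demo() error {
	return startAndStop(2 * time.Second)
}

func main() {
	fmt.Println("shutdown error:", demo())
}
```

In a test, the shutdown half would typically sit in a `defer` (or `t.Cleanup`) so it runs even when an assertion fails.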


98-115: Integration test validates router environment setup.

This test ensures the router testenv works correctly with GraphQL requests, providing a basic smoke test for the integration.


30-35: Relative path concern is unfounded—this is standard practice in the repository.

The path ../router/pkg/connectrpc/testdata is appropriate for this monorepo structure where router/ and router-tests/ are sibling directories. Go tests execute from their package directory by default, making the relative path predictable and reliable. This pattern is consistently used throughout other test files in router-tests/ (e.g., router_plugin_test.go with ../router/plugins, testexec.go with const routerDir = "../router"). No refactoring is needed.

Likely an incorrect or invalid review comment.

router/pkg/connectrpc/service_discovery_test.go (6)

13-56: Well-structured test for single service discovery.

Good test that validates the complete service discovery flow including package extraction, service name extraction, and file associations.


134-176: Important ADR compliance test for nested proto handling.

This test ensures that the discovery logic stops at the first proto file and doesn't descend into subdirectories, which is critical for the expected directory structure. The test name and comment clearly document this behavior.


218-251: Enforces single proto per directory constraint.

This constraint prevents ambiguity in service definitions. The test properly verifies the error message to ensure users get clear feedback.


253-283: Good test for duplicate service detection.

Validates that the discovery logic prevents the same package.service combination from appearing in multiple directories, which would cause routing conflicts.


423-460: Thorough edge case coverage for package extraction.

Tests cover normal cases, whitespace variations, nested packages, and missing declarations. This ensures robust parsing of proto files.


462-502: Good coverage for service name extraction.

Tests handle variations like presence/absence of braces, multiple services (first-wins behavior), and missing declarations.

router/pkg/connectrpc/testdata/examples/user.v1/user_service.proto (1)

1-130: Well-structured proto definition for testing.

The proto file follows the documented naming conventions (Query/Mutation prefixes) and provides comprehensive message definitions for testing the ConnectRPC integration. The service definition and message types are appropriately designed for the test scenarios.

router/pkg/connectrpc/server_lifecycle_test.go (1)

63-307: Comprehensive test coverage for server lifecycle.

The test suite covers key lifecycle scenarios including start/stop/reload sequences, error handling, component initialization, and graceful shutdown. The use of sync.Once for shared proto loading is a good pattern to avoid registration conflicts.

router/pkg/config/config.schema.json (1)

2110-2167: Well-structured ConnectRPC configuration schema.

The new connect_rpc configuration block follows the established patterns in the schema (similar to mcp). The structure includes appropriate properties for server configuration, service discovery, and GraphQL endpoint integration.

One minor observation: The services array items could benefit from required fields to enforce that at least one of proto_provider_id or operations_provider_id is specified when a service is defined.

Consider whether the services array items should have required properties. If a service entry is provided without any provider IDs, it may lead to runtime confusion.

router/pkg/connectrpc/operation_registry.go (2)

46-133: Robust operation loading with appropriate error handling.

The LoadOperationsForService method handles individual file errors gracefully by logging and continuing, which is appropriate for resilience. The duplicate detection behavior (last one wins) is documented in the warning log.

One consideration: The method returns nil error even when all files fail to load. This could mask scenarios where no operations were successfully loaded.

Consider whether callers need to distinguish between "0 files provided" vs "all files failed to parse" scenarios. Currently both result in success with 0 operations loaded.
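
One possible way to surface that distinction is sketched below. Everything here is hypothetical: `loadOperations`, `parseOperation`, and `ErrAllOperationsFailed` are illustrative names, the parser is a trivial stand-in, and the real `LoadOperationsForService` signature is not reproduced.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrAllOperationsFailed is a hypothetical sentinel returned when files were
// provided but none of them could be loaded.
var ErrAllOperationsFailed = errors.New("all operation files failed to load")

// parseOperation is a stand-in parser: it accepts any document that starts
// with a GraphQL operation keyword.
func parseOperation(doc string) error {
	trimmed := strings.TrimSpace(doc)
	for _, kw := range []string{"query ", "mutation ", "subscription "} {
		if strings.HasPrefix(trimmed, kw) {
			return nil
		}
	}
	return errors.New("no operation definition found")
}

// loadOperations skips files that fail to parse, but returns an error when
// at least one file was provided and nothing loaded.
func loadOperations(files map[string]string) (int, error) {
	loaded := 0
	for _, doc := range files {
		if err := parseOperation(doc); err != nil {
			continue // a real implementation would log the failure here
		}
		loaded++
	}
	if len(files) > 0 && loaded == 0 {
		return 0, ErrAllOperationsFailed
	}
	return loaded, nil
}

func main() {
	n, err := loadOperations(nil)
	fmt.Println("no files:", n, err)

	n, err = loadOperations(map[string]string{"Broken.graphql": "not graphql"})
	fmt.Println("all failed:", n, err)

	n, err = loadOperations(map[string]string{
		"QueryGetEmployees.graphql": "query GetEmployees { employees { id } }",
		"Broken.graphql":            "not graphql",
	})
	fmt.Println("partial:", n, err)
}
```

Whether an all-failed load should be an error or only a warning is a design decision for the PR author; the sketch only shows that the two cases can be told apart cheaply.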


24-29: Thread-safe design with appropriate locking strategy.

The use of sync.RWMutex with RLock for read operations and Lock for write operations is correct. The documentation accurately describes the thread-safety guarantees.
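
As an illustration of that locking strategy, here is a minimal service-scoped registry. It is a simplified sketch, not the PR's `OperationRegistry`: the `registry` type stores documents as strings, and all names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// registry keeps operations per service: service -> operation name -> document.
type registry struct {
	mu  sync.RWMutex
	ops map[string]map[string]string
}

func newRegistry() *registry {
	return &registry{ops: make(map[string]map[string]string)}
}

// put takes the write lock because it mutates the maps.
func (r *registry) put(service, name, doc string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.ops[service] == nil {
		r.ops[service] = make(map[string]string)
	}
	r.ops[service][name] = doc
}

// get takes only the read lock, so many readers can proceed in parallel.
func (r *registry) get(service, name string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	doc, ok := r.ops[service][name]
	return doc, ok
}

// clear replaces the whole map under the write lock.
func (r *registry) clear() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.ops = make(map[string]map[string]string)
}

func main() {
	r := newRegistry()
	// The same operation name in two services stays isolated.
	r.put("employee.v1.EmployeeService", "GetItem", "query GetItem { employee { id } }")
	r.put("product.v1.ProductService", "GetItem", "query GetItem { product { id } }")

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			r.get("employee.v1.EmployeeService", "GetItem")
		}()
	}
	wg.Wait()

	doc, _ := r.get("product.v1.ProductService", "GetItem")
	fmt.Println(doc)
}
```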

router/pkg/connectrpc/server.go (1)

91-198: Server startup flow is well-structured.

The Start() method follows a clear initialization pattern: discover services → initialize components → load protos/operations → create vanguard service → create transcoder → start HTTP server. Error handling is appropriate with wrapped errors for context.

router/pkg/connectrpc/testdata/examples/product.v1/product_service.proto (1)

1-213: Well-structured test proto definition.

This proto file demonstrates good practices:

  • Clear documentation comments for service and methods
  • Sequential field numbering for maintainability
  • Dedicated request/response messages for each RPC
  • Appropriate go_package option for testdata
  • Comprehensive coverage of query and mutation patterns
router/pkg/connectrpc/service_discovery.go (1)

225-258: A final test to confirm that the review comment's example is incorrect:

#!/bin/bash

# Final verification test with the exact scenario from the review comment

python3 << 'EOF'
def extractPackageFromProto(content):
    """Exact Go implementation simulation"""
    lines = content.split("\n")
    for line in lines:
        line = line.strip()
        if line.startswith("package "):  # This is the key check
            pkg = line[len("package "):]
            pkg = pkg.rstrip(";")
            pkg = pkg.strip()
            return pkg
    return ""

# EXACT scenario from review comment

example_from_review = """// Old package: package employee.v1;
package company.v2;"""

result = extractPackageFromProto(example_from_review)

print("=" * 60)
print("TESTING EXACT REVIEW COMMENT SCENARIO")
print("=" * 60)
print(f"Proto content:\n{repr(example_from_review)}\n")
print(f"Result extracted: '{result}'")
print(f"Review claims: would extract 'employee.v1' (WRONG)")
print(f"Actual result: '{result}' (CORRECT)")
print(f"\nReview comment assertion: INCORRECT")
print(f"Reason: Line '// Old package: package employee.v1;' starts with '//'")
print(f" so it does NOT match the check: line.startswith('package ')")
EOF
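
The same line-based check can be written directly in Go. This is a sketch reconstructed from the Python simulation above, not a copy of the code in service_discovery.go; the name `extractPackage` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// extractPackage returns the package name from the first line that, after
// trimming whitespace, starts with "package ". Commented-out declarations
// begin with "//" and therefore never match.
func extractPackage(content string) string {
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "package ") {
			pkg := strings.TrimPrefix(line, "package ")
			pkg = strings.TrimRight(pkg, ";")
			return strings.TrimSpace(pkg)
		}
	}
	return ""
}

func main() {
	proto := "// Old package: package employee.v1;\npackage company.v2;"
	fmt.Printf("%q\n", extractPackage(proto))
}
```

A prefix check like this is still not a full proto parser: it does not handle block comments or a declaration that shares a line with other tokens.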

@asoorm asoorm force-pushed the ahmet/eng-8277-connect-rpc-handler-final branch 4 times, most recently from 016c128 to 3059f5e Compare November 29, 2025 09:17
@codecov

codecov bot commented Nov 29, 2025

Codecov Report

❌ Patch coverage is 0.50725% with 1373 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@9be1a0e). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| router/pkg/connectrpc/handler.go | 0.00% | 275 Missing ⚠️ |
| router/pkg/connectrpc/server.go | 0.00% | 193 Missing ⚠️ |
| router/pkg/connectrpc/vanguard_service.go | 0.00% | 193 Missing ⚠️ |
| router/pkg/connectrpc/proto_loader.go | 0.00% | 160 Missing ⚠️ |
| router/pkg/connectrpc/validator.go | 0.00% | 160 Missing ⚠️ |
| router/pkg/connectrpc/operation_registry.go | 0.00% | 146 Missing ⚠️ |
| router/pkg/connectrpc/service_discovery.go | 0.00% | 131 Missing ⚠️ |
| router/core/router.go | 8.82% | 60 Missing and 2 partials ⚠️ |
| router/pkg/connectrpc/test_helpers.go | 0.00% | 51 Missing ⚠️ |
| router/core/supervisor_instance.go | 0.00% | 1 Missing ⚠️ |

... and 1 more

❌ Your patch check has failed because the patch coverage (0.50%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2379   +/-   ##
=======================================
  Coverage        ?   33.09%           
=======================================
  Files           ?      348           
  Lines           ?    35176           
  Branches        ?      239           
=======================================
  Hits            ?    11641           
  Misses          ?    22504           
  Partials        ?     1031           


@github-actions github-actions bot added the cli label Nov 29, 2025
@asoorm asoorm force-pushed the ahmet/eng-8277-connect-rpc-handler-final branch from 4d210e1 to 50c0cf9 Compare November 29, 2025 10:18
@asoorm asoorm marked this pull request as ready for review November 29, 2025 10:36
@asoorm asoorm requested a review from JivusAyrus as a code owner November 29, 2025 10:36
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

♻️ Duplicate comments (2)
router/pkg/connectrpc/vanguard_service.go (1)

242-286: Improvement from past review, but message sanitization still needed.

The error response is now properly formatted as Connect JSON, which is an improvement. However, the connectErr.Message() at line 258 may still contain internal details (service names, internal URLs, implementation specifics) from the HandleRPC error chain.

The past review comment's concern about error sanitization remains partially valid. Consider implementing a sanitization layer for error messages returned to clients:

func sanitizeErrorMessage(err *connect.Error) string {
    // For internal errors, return a generic message
    if err.Code() == connect.CodeInternal {
        return "internal server error"
    }
    // For other errors, the message is typically safe
    return err.Message()
}

Then use: "message": sanitizeErrorMessage(connectErr)

This ensures internal infrastructure details don't leak while preserving useful client-facing error messages for validation errors, not-found, etc.

router/pkg/connectrpc/server.go (1)

283-284: Reload does not wrap handler in h2c, breaking HTTP/2 support.

This issue was raised in a previous review. In Start() (line 177), the handler is wrapped with h2c.NewHandler(handler, &http2.Server{}), but Reload() sets the handler directly without the h2c wrapper, which will break gRPC/HTTP/2 compatibility after a reload.

 	// Update HTTP server handler
-	s.httpServer.Handler = s.createHandler()
+	handler := s.createHandler()
+	s.httpServer.Handler = h2c.NewHandler(handler, &http2.Server{})
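
An alternative that makes it harder to forget the wrapper on reload is to wrap once at startup and swap only the inner handler. The sketch below is a suggestion under assumptions, not what the PR does: `swappableHandler` is a hypothetical type, and `wrapOnce` stands in for `h2c.NewHandler` so that the example needs only the standard library.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// swappableHandler delegates to whichever handler was stored most recently.
// Store must be called at least once before the first request.
type swappableHandler struct {
	current atomic.Pointer[http.Handler]
}

func (s *swappableHandler) Store(h http.Handler) { s.current.Store(&h) }

func (s *swappableHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	(*s.current.Load()).ServeHTTP(w, r)
}

// wrapOnce is a stand-in for a wrapper such as h2c.NewHandler. It tags each
// response so the example can show that the wrapper survives a reload.
func wrapOnce(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("X-Wrapped", "true")
		next.ServeHTTP(w, r)
	})
}

// respond returns a handler that writes a fixed body.
func respond(body string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, body)
	})
}

// call sends one in-memory request and returns the body and wrapper tag.
func call(h http.Handler) (string, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Body.String(), rec.Header().Get("X-Wrapped")
}

func main() {
	inner := &swappableHandler{}
	inner.Store(respond("v1"))
	root := wrapOnce(inner) // wrapped once, at startup

	body, wrapped := call(root)
	fmt.Println(body, wrapped)

	inner.Store(respond("v2")) // a reload swaps only the inner handler
	body, wrapped = call(root)
	fmt.Println(body, wrapped)
}
```

With this shape, the server's `Handler` field is assigned once and never reassigned while the server is running.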
🧹 Nitpick comments (22)
protographic/tests/operations/operation-validation.test.ts (1)

201-361: PascalCase validation tests are solid; consider a couple of explicit edge cases

This suite exercises the new rule well across success and failure paths (queries, mutations, subscriptions, and error messaging). To make the contract even clearer, you might optionally add tests for names that:

  • Start uppercase but contain underscores (e.g. Get_user)
  • Start with an underscore (e.g. _GetUser), if those are intentionally disallowed

These are already rejected by the regex, but dedicated tests would document that behavior.
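
To make the accepted and rejected shapes concrete, here is a small Go stand-in for the rule. The actual check lives in the TypeScript source and its regex is not shown in this review, so the pattern and the `isPascalCase` name below are assumptions chosen to match the behavior described (uppercase first letter, no underscores, not all caps).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pascalShape requires an uppercase first letter followed only by letters
// and digits, which excludes underscores anywhere in the name.
var pascalShape = regexp.MustCompile(`^[A-Z][A-Za-z0-9]*$`)

// isPascalCase also requires at least one lowercase letter, so that
// all-caps names such as GETUSER are rejected.
func isPascalCase(name string) bool {
	return pascalShape.MatchString(name) &&
		strings.ContainsAny(name, "abcdefghijklmnopqrstuvwxyz")
}

func main() {
	for _, name := range []string{"GetUser", "getUser", "Get_user", "_GetUser", "GETUSER"} {
		fmt.Printf("%-9s %v\n", name, isPascalCase(name))
	}
}
```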

protographic/src/operation-to-proto.ts (1)

259-300: PascalCase validation is correct; tweak wording to match behavior (esp. with prefixOperationType)

The regex-based PascalCase check is straightforward and matches the new tests (rejects camelCase, snake_case, ALL‑CAPS, etc.), which is good.

Two minor polish points:

  1. Error message vs. prefixOperationType behavior

The error text says:

“This ensures the RPC method name exactly matches the GraphQL operation name.”

But when prefixOperationType is true, methodName becomes e.g. QueryGetUser, so the RPC method no longer “exactly matches” the GraphQL operation name. You can avoid that mismatch with a small wording change while keeping the tests intact. For example:

-          `This ensures the RPC method name exactly matches the GraphQL operation name.`,
+          `This helps ensure RPC method names derived from GraphQL operations remain predictable.`,

This still satisfies the existing regex expectations in the tests.

  2. Comment above method name construction

Similarly, this comment is slightly stronger than the actual behavior once the prefix is applied:

// 4. Create method name from operation name
// Use operation name as-is to ensure exact matching (no transformation)
let methodName = operationName;

Consider softening it so it’s accurate both with and without prefixOperationType:

-    // 4. Create method name from operation name
-    // Use operation name as-is to ensure exact matching (no transformation)
+    // 4. Create method name from operation name
+    // Use the operation name as the base (no additional casing transformation)

This keeps the intent clear (no more automatic camelCase / capitalization magic) while reflecting that an optional type prefix may still be applied.

router/pkg/config/config.go (1)

1012-1022: ConnectRPCConfiguration wiring looks correct; consider validation for required fields

The ConnectRPCConfiguration / ConnectRPCServer structs use sensible YAML keys and env var names and line up with the defaults JSON. To avoid hard-to-debug runtime failures, it would be good to ensure (via config.schema.json and/or Go-level post-processing) that connect_rpc.enabled: true is not allowed with an empty connect_rpc.graphql_endpoint (and any other fields you consider mandatory when enabled). If the schema already enforces this, no further change needed here.
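
A Go-level post-processing check along those lines could look like the sketch below. The struct is a trimmed, hypothetical stand-in for `ConnectRPCConfiguration` with only the two fields discussed here, and `validateConnectRPC` is an illustrative name rather than an existing function.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// connectRPCConfig is a reduced stand-in for the real configuration struct.
type connectRPCConfig struct {
	Enabled         bool
	GraphQLEndpoint string
}

// validateConnectRPC rejects an enabled configuration without an endpoint.
// A disabled configuration, including the zero value, always passes.
func validateConnectRPC(cfg connectRPCConfig) error {
	if !cfg.Enabled {
		return nil
	}
	if strings.TrimSpace(cfg.GraphQLEndpoint) == "" {
		return errors.New("connect_rpc.graphql_endpoint must be set when connect_rpc.enabled is true")
	}
	return nil
}

func main() {
	fmt.Println(validateConnectRPC(connectRPCConfig{}))
	fmt.Println(validateConnectRPC(connectRPCConfig{Enabled: true}))
	fmt.Println(validateConnectRPC(connectRPCConfig{
		Enabled:         true,
		GraphQLEndpoint: "http://localhost:3002/graphql",
	}))
}
```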

router/pkg/connectrpc/samples/services/employee.v1/service.proto (1)

1-191: Consider consolidating duplicate proto files.

This proto file is duplicated between router/pkg/connectrpc/samples/services/employee.v1/ and router-tests/testdata/connectrpc/services/employee.v1/. While the duplication might be intentional (samples for documentation, testdata for tests), it creates maintenance burden and risk of divergence.

Consider:

  1. Using symlinks if both locations need the same file
  2. Generating one from the other during build
  3. Documenting why duplication is necessary if it serves different purposes
router-tests/connectrpc/README.md (1)

47-56: Add a language to the directory tree fenced block to satisfy markdownlint (MD040).

The directory structure block lacks a language identifier. To keep markdownlint happy and be consistent with the earlier Go snippet, add a language like text:

-```
+```text
 router-tests/
 ├── connectrpc/
 │   ├── server.go
 │   └── README.md
 ├── connectrpc_test.go
 └── testdata/
     └── connectrpc/
         └── services/
router/pkg/connectrpc/proto_loader_test.go (1)

20-109: ProtoLoader tests cover key metadata paths; only a minor naming nit.

The subtests thoroughly exercise service discovery, method enumeration, and method metadata (names, full names, input/output types, streaming flags) against the employee.v1 sample; this is solid coverage.

One tiny optional polish: the length assertion message mentions employee_only directory while the path under test is "samples/services/employee.v1". Renaming the message for consistency would avoid future confusion:

- assert.Len(t, services, 1, "Should load exactly one service from employee_only directory")
+ assert.Len(t, services, 1, "Should load exactly one service from samples/services/employee.v1")

Otherwise, the test logic looks good.

router-tests/testdata/connectrpc/README.md (1)

7-27: Add language specification to fenced code block.

The directory structure code block should include a language specification to satisfy markdown linting rules and improve rendering.

-```
+```text
 router-tests/testdata/connectrpc/
router-tests/connectrpc/connectrpc_test.go (1)

44-53: Test name doesn't match behavior.

The subtest is named "reloads operations on schema change" but it only verifies that Reload() doesn't return an error. Consider either renaming it to something like "reload completes without error" or expanding the test to verify that operations are actually reloaded by comparing operation counts/contents before and after reload.

router/pkg/connectrpc/vanguard_service_test.go (1)

249-260: Direct access to internal map is fragile.

The test directly manipulates the internal operations map of the registry (opRegistry.operations[serviceName]). This couples the test to the implementation details and will break if the internal structure changes. Consider adding a test helper method or using the public API to register operations.

If OperationRegistry doesn't expose a public method for adding operations in tests, consider adding one:

// In operation_registry.go or test_helpers.go
func (r *OperationRegistry) RegisterOperation(serviceName, methodName string, op *schemaloader.Operation) {
    r.mu.Lock()
    defer r.mu.Unlock()
    if r.operations[serviceName] == nil {
        r.operations[serviceName] = make(map[string]*schemaloader.Operation)
    }
    r.operations[serviceName][methodName] = op
}
router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1)

53-78: Clarify expected behavior for concurrent starts.

The test verifies "at least one start should succeed" but doesn't specify what should happen to the other concurrent start attempts. Should they return an error (e.g., "server already started") or also succeed (idempotent)?

Consider adding assertions to verify the expected behavior:

 // At least one should succeed
 successCount := 0
+alreadyStartedCount := 0
 for _, err := range errors {
   if err == nil {
     successCount++
+  } else if strings.Contains(err.Error(), "already") {
+    alreadyStartedCount++
   }
 }
-assert.GreaterOrEqual(t, successCount, 1, "at least one start should succeed")
+assert.Equal(t, 1, successCount, "exactly one start should succeed")
+assert.Equal(t, len(errors)-1, alreadyStartedCount, "all other starts should fail with 'already started'")

If the server is designed to be idempotent (multiple starts succeed), document this expectation and adjust the assertion accordingly.
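
For the non-idempotent case, a minimal sketch of the expected contract is shown here. It assumes a hypothetical sentinel error `ErrAlreadyStarted` and a stand-in `lifecycle` type; neither name comes from the PR. Matching with `errors.Is` avoids depending on the wording of the error message.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrAlreadyStarted is a hypothetical sentinel error; the real server may
// report repeated starts differently, or not at all.
var ErrAlreadyStarted = errors.New("server already started")

// lifecycle is a stand-in for the server's start state.
type lifecycle struct {
	mu      sync.Mutex
	started bool
}

// Start succeeds once and returns ErrAlreadyStarted on every later call.
func (l *lifecycle) Start() error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.started {
		return ErrAlreadyStarted
	}
	l.started = true
	return nil
}

// startConcurrently calls Start from n goroutines and tallies the outcomes.
func startConcurrently(l *lifecycle, n int) (succeeded, rejected int) {
	errs := make([]error, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			errs[i] = l.Start() // each goroutine writes only its own slot
		}(i)
	}
	wg.Wait()
	for _, err := range errs {
		switch {
		case err == nil:
			succeeded++
		case errors.Is(err, ErrAlreadyStarted):
			rejected++
		}
	}
	return succeeded, rejected
}

func main() {
	succeeded, rejected := startConcurrently(&lifecycle{}, 3)
	fmt.Println(succeeded, rejected)
}
```

With this contract the test can assert exactly one success and `n-1` rejections regardless of how many goroutines it spawns.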

router/core/router.go (1)

1560-1571: Unnecessary type assertion for shutdown.

The connectRPCServer field is assigned from connectrpc.NewServer() which returns *connectrpc.Server. The type assertion to check for Stop(context.Context) error interface is redundant since the concrete type is known.

Consider simplifying by using the concrete type directly:

 	if r.connectRPCServer != nil {
 		wg.Add(1)
 		go func() {
 			defer wg.Done()
-			// Type assert to access Stop method
-			if server, ok := r.connectRPCServer.(interface{ Stop(context.Context) error }); ok {
-				if subErr := server.Stop(ctx); subErr != nil {
-					err.Append(fmt.Errorf("failed to shutdown connect_rpc server: %w", subErr))
-				}
+			if subErr := r.connectRPCServer.Stop(ctx); subErr != nil {
+				err.Append(fmt.Errorf("failed to shutdown connect_rpc server: %w", subErr))
 			}
 		}()
 	}

This requires changing the field type in router_config.go from interface{} or any to *connectrpc.Server if it isn't already.

router-tests/connectrpc/connectrpc_client_test.go (1)

239-295: Concurrency test could benefit from timeout context.

The concurrency test is good but uses unbounded context.Background() which could hang indefinitely if there's an issue.

Consider adding a timeout to prevent test hangs:

+	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+	defer cancel()
+
 	for i := 0; i < numRequests; i++ {
 		go func() {
 			req := connect.NewRequest(&employeev1.GetEmployeeByIdRequest{
 				EmployeeId: 1,
 			})
-			_, err := client.GetEmployeeById(context.Background(), req)
+			_, err := client.GetEmployeeById(ctx, req)
 			results <- err
 		}()
 	}

You'd also need to add "time" to the imports.

router/pkg/connectrpc/constructor_validation_test.go (1)

134-141: Consider using t.Parallel() for subtests.

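The table-driven test loop doesn't run subtests in parallel. Since these are validation tests without shared mutable state, they could run in parallel for faster execution. Note that from Go 1.22 onward each loop iteration gets its own tt, so the tt := tt copy below is only needed if the module must also build with an older toolchain.

 	for _, tt := range tests {
+		tt := tt // capture range variable (only required before Go 1.22)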
 		t.Run(tt.name, func(t *testing.T) {
+			t.Parallel()
 			result, err := tt.constructor()
 			assert.Error(t, err)
 			assert.Nil(t, result)
 			assert.Contains(t, err.Error(), tt.wantErr)
 		})
 	}
router/pkg/connectrpc/test_helpers.go (2)

78-84: Mock response missing required HTTP fields.

The mockRoundTripper.RoundTrip returns a response without setting Proto, ProtoMajor, ProtoMinor, and Request fields. While this works for basic tests, it may cause issues if code inspects these fields.

 func (m *mockRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
 	return &http.Response{
 		StatusCode: m.statusCode,
 		Body:       io.NopCloser(strings.NewReader(m.responseBody)),
 		Header:     make(http.Header),
+		Proto:      "HTTP/1.1",
+		ProtoMajor: 1,
+		ProtoMinor: 1,
+		Request:    req,
 	}, nil
 }
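
A self-contained sketch of such a stub follows. The `stubTransport` type and the `fetch` helper are hypothetical names used for illustration, not the helpers in test_helpers.go. `http.Client` does not populate `Response.Request` on behalf of a custom `RoundTripper`, so the stub sets it itself.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// stubTransport is a hypothetical stand-in for the mockRoundTripper above.
type stubTransport struct {
	statusCode int
	body       string
}

// RoundTrip returns a canned response with the protocol and request fields set.
func (s *stubTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	return &http.Response{
		Status:     fmt.Sprintf("%d %s", s.statusCode, http.StatusText(s.statusCode)),
		StatusCode: s.statusCode,
		Proto:      "HTTP/1.1",
		ProtoMajor: 1,
		ProtoMinor: 1,
		Header:     make(http.Header),
		Body:       io.NopCloser(strings.NewReader(s.body)),
		Request:    req,
	}, nil
}

// fetch sends a GET through the given transport and returns the response and its body.
func fetch(rt http.RoundTripper, url string) (*http.Response, string, error) {
	client := &http.Client{Transport: rt}
	resp, err := client.Get(url)
	if err != nil {
		return nil, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, "", err
	}
	return resp, string(body), nil
}

func main() {
	resp, body, err := fetch(&stubTransport{statusCode: 200, body: `{"data":{}}`}, "http://example.invalid/graphql")
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Proto, resp.Request.URL.Path, body)
}
```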

94-102: Direct access to unexported field operations breaks encapsulation.

The test helper directly accesses opRegistry.operations, which is an unexported field. This works because the test helper is in the same package, but it tightly couples tests to implementation details.

Consider adding a test-only method to OperationRegistry or using the public Register method if one exists:

-	// Manually add test operations to the registry using service-scoped approach
-	serviceName := "employee.v1.EmployeeService"
-	if opRegistry.operations[serviceName] == nil {
-		opRegistry.operations[serviceName] = make(map[string]*schemaloader.Operation)
-	}
-	opRegistry.operations[serviceName]["GetEmployeeById"] = &schemaloader.Operation{
+	// Use public API to register test operations
+	opRegistry.Register("employee.v1.EmployeeService", &schemaloader.Operation{
 		Name:            "GetEmployeeById",
 		OperationType:   "query",
 		OperationString: "query GetEmployeeById($id: Int!) { employee(id: $id) { id name } }",
-	}
+	})

If no public registration method exists, consider adding one like RegisterForTest with a clear name indicating it's for testing purposes.
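
As a rough sketch of what such a public API could look like, the example below uses stand-in `registry` and `operation` types rather than the real OperationRegistry and schemaloader.Operation. The `Register` and `Lookup` names are assumptions, and keying by the operation's own name is one possible design, not necessarily the one in the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// operation is a hypothetical stand-in for schemaloader.Operation.
type operation struct {
	Name            string
	OperationType   string
	OperationString string
}

// registry is a simplified stand-in for a service-scoped operation registry.
type registry struct {
	mu         sync.RWMutex
	operations map[string]map[string]*operation
}

func newRegistry() *registry {
	return &registry{operations: make(map[string]map[string]*operation)}
}

// Register stores op under serviceName, keyed by op.Name.
func (r *registry) Register(serviceName string, op *operation) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.operations[serviceName] == nil {
		r.operations[serviceName] = make(map[string]*operation)
	}
	r.operations[serviceName][op.Name] = op
}

// Lookup returns the operation registered for a service method, if any.
func (r *registry) Lookup(serviceName, methodName string) (*operation, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	op, ok := r.operations[serviceName][methodName]
	return op, ok
}

func main() {
	reg := newRegistry()
	reg.Register("employee.v1.EmployeeService", &operation{
		Name:            "GetEmployeeById",
		OperationType:   "query",
		OperationString: "query GetEmployeeById($id: Int!) { employee(id: $id) { id name } }",
	})
	op, ok := reg.Lookup("employee.v1.EmployeeService", "GetEmployeeById")
	fmt.Println(ok, op.OperationType)
}
```

Tests written against `Register` and `Lookup` keep working if the internal map layout changes later.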

router/pkg/connectrpc/handler_test.go (1)

84-94: Missing test case for ProtoLoader validation.

According to the NewRPCHandler validation logic in handler.go, ProtoLoader is a required dependency. Consider adding a test case to verify the error when ProtoLoader is nil:

t.Run("returns error when proto loader is missing", func(t *testing.T) {
    operationRegistry := NewOperationRegistry(logger)
    handler, err := NewRPCHandler(HandlerConfig{
        GraphQLEndpoint:   "http://localhost:4000/graphql",
        HTTPClient:        httpClient,
        Logger:            logger,
        OperationRegistry: operationRegistry,
    })

    assert.Error(t, err)
    assert.Nil(t, handler)
    assert.Contains(t, err.Error(), "proto loader is required")
})
router/pkg/connectrpc/validator.go (3)

63-78: Consider removing or repurposing these helper functions.

getKeys and getFieldNames are currently only used by the debug print statements. If the debug statements are removed, these functions become dead code. Either remove them or repurpose them for structured logging.


80-115: Consider validating for unknown fields in JSON.

The current implementation only validates that required proto fields exist in JSON but doesn't reject unknown JSON keys that aren't defined in the proto schema. Depending on your requirements, you may want to add validation for unexpected fields:

// After validating all proto fields, check for unknown keys in JSON
for key := range data {
    found := false
    for _, field := range fields {
        if field.GetName() == key || field.GetJSONName() == key {
            found = true
            break
        }
    }
    if !found {
        return &ValidationError{
            Field:   fieldPath + key,
            Message: "unknown field",
        }
    }
}

This would help catch typos and invalid field names early. However, if you prefer lenient validation (ignoring unknown fields), the current behavior is acceptable.
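
The same check can be sketched without the nested loop by building a set of known names once. `fieldSpec` below is a hypothetical stand-in for the proto field descriptor, so the sketch runs with the standard library only.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// fieldSpec is a hypothetical stand-in for a proto field descriptor.
type fieldSpec struct {
	Name     string // proto field name, e.g. "employee_id"
	JSONName string // JSON name, e.g. "employeeId"
}

// unknownKeys returns, in sorted order, the keys of data that match neither
// the proto name nor the JSON name of any field.
func unknownKeys(data map[string]any, fields []fieldSpec) []string {
	known := make(map[string]struct{}, 2*len(fields))
	for _, f := range fields {
		known[f.Name] = struct{}{}
		known[f.JSONName] = struct{}{}
	}
	var unknown []string
	for key := range data {
		if _, ok := known[key]; !ok {
			unknown = append(unknown, key)
		}
	}
	sort.Strings(unknown) // map iteration order is random; sort for stable errors
	return unknown
}

func main() {
	fields := []fieldSpec{{Name: "employee_id", JSONName: "employeeId"}}

	var data map[string]any
	if err := json.Unmarshal([]byte(`{"employeeId": 1, "employe_id": 2}`), &data); err != nil {
		panic(err)
	}
	fmt.Println(unknownKeys(data, fields))
}
```

Collecting all unknown keys in sorted order also makes the resulting error deterministic, which the early-return version over a map does not guarantee.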


294-305: Consider validating enum values against allowed members.

The current enum validation only checks that the value is a string or number, but doesn't verify the value is a valid enum member. This could allow invalid enum values through:

case descriptorpb.FieldDescriptorProto_TYPE_ENUM:
    enumDesc := field.GetEnumType()
    switch v := value.(type) {
    case string:
        // Validate enum name exists
        if enumDesc.FindValueByName(v) == nil {
            return &ValidationError{
                Field:   fieldPath,
                Message: fmt.Sprintf("invalid enum value: %s", v),
            }
        }
    case float64:
        // Validate enum number exists
        if enumDesc.FindValueByNumber(protoreflect.EnumNumber(int32(v))) == nil {
            return &ValidationError{
                Field:   fieldPath,
                Message: fmt.Sprintf("invalid enum value: %v", v),
            }
        }
    // ...
    }

This is optional as lenient validation may be acceptable for your use case.
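
One detail worth handling if numeric validation is added: `encoding/json` decodes JSON numbers into `float64`, so a value like 1.5 would be truncated to 1 by a plain `int32(v)` conversion and then accepted. The sketch below uses a hypothetical `enumTable` in place of the real enum descriptor and rejects fractional and out-of-range numbers before converting.

```go
package main

import (
	"fmt"
	"math"
)

// enumTable is a hypothetical stand-in for an enum descriptor.
type enumTable struct {
	byName   map[string]int32
	byNumber map[int32]string
}

func newEnumTable(values map[string]int32) enumTable {
	t := enumTable{byName: values, byNumber: make(map[int32]string, len(values))}
	for name, number := range values {
		t.byNumber[number] = name
	}
	return t
}

// validate accepts a known enum name or a whole number that maps to a member.
func (e enumTable) validate(value any) error {
	switch v := value.(type) {
	case string:
		if _, ok := e.byName[v]; !ok {
			return fmt.Errorf("invalid enum value: %s", v)
		}
	case float64:
		// Reject NaN, fractions, and values outside int32 before converting.
		if v != math.Trunc(v) || v < math.MinInt32 || v > math.MaxInt32 {
			return fmt.Errorf("invalid enum value: %v", v)
		}
		if _, ok := e.byNumber[int32(v)]; !ok {
			return fmt.Errorf("invalid enum value: %v", v)
		}
	default:
		return fmt.Errorf("enum value must be a string or number, got %T", value)
	}
	return nil
}

func main() {
	mood := newEnumTable(map[string]int32{"MOOD_UNSPECIFIED": 0, "HAPPY": 1, "SAD": 2})
	for _, v := range []any{"HAPPY", "GRUMPY", float64(2), 1.5, true} {
		fmt.Printf("%v -> %v\n", v, mood.validate(v))
	}
}
```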

router-tests/connectrpc/connectrpc_test_helpers.go (1)

230-237: Potential JSON injection in ErrorGraphQLHandler.

The message parameter is directly interpolated into JSON without escaping, which could produce invalid JSON if the message contains quotes or other special characters:

w.Write([]byte(fmt.Sprintf(`{"errors": [{"message": "%s"}]}`, message)))

For test helpers this is low risk, but consider using proper JSON encoding:

-func ErrorGraphQLHandler(message string) http.HandlerFunc {
-	return func(w http.ResponseWriter, r *http.Request) {
-		w.Header().Set("Content-Type", "application/json")
-		w.WriteHeader(http.StatusOK)
-		w.Write([]byte(fmt.Sprintf(`{"errors": [{"message": "%s"}]}`, message)))
-	}
-}
+func ErrorGraphQLHandler(message string) http.HandlerFunc {
+	return func(w http.ResponseWriter, r *http.Request) {
+		w.Header().Set("Content-Type", "application/json")
+		w.WriteHeader(http.StatusOK)
+		response := map[string]interface{}{
+			"errors": []map[string]string{{"message": message}},
+		}
+		json.NewEncoder(w).Encode(response)
+	}
+}
router/pkg/connectrpc/vanguard_service.go (1)

109-153: Consider reducing log verbosity for production.

Multiple Info-level log statements during service registration could be noisy in production environments with many services:

  • Lines 110-113: Service registration info
  • Lines 117-121: Each method logged individually
  • Lines 127-133: Service discovery info
  • Lines 148-152: Registration success

Consider using Debug level for the per-method logging (lines 117-121) and consolidating the Info-level logs:

-		vs.logger.Info("service method",
+		vs.logger.Debug("service method",
 			zap.String("service", serviceName),
 			zap.String("method", method.Name),
router/pkg/connectrpc/proto_loader.go (1)

323-327: Consider returning a copy to prevent accidental mutation.

The method returns the internal map directly. While the comment advises treating it as read-only, callers could accidentally mutate the map. For defensive coding, consider returning a shallow copy.

 func (pl *ProtoLoader) GetServices() map[string]*ServiceDefinition {
-	return pl.services
+	result := make(map[string]*ServiceDefinition, len(pl.services))
+	for k, v := range pl.services {
+		result[k] = v
+	}
+	return result
 }
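
On Go 1.21 and later the same shallow copy can be written with `maps.Clone`. The sketch below uses stand-in `loader` and `serviceDefinition` types, not the real ProtoLoader. Two caveats: `maps.Clone` returns nil for a nil map (the loop version returns an empty map), and the copy is shallow, so the pointed-to definitions are still shared.

```go
package main

import (
	"fmt"
	"maps"
)

// serviceDefinition is a hypothetical stand-in for ServiceDefinition.
type serviceDefinition struct {
	FullName string
}

// loader is a simplified stand-in for ProtoLoader.
type loader struct {
	services map[string]*serviceDefinition
}

// Services returns a shallow copy: callers may add or delete keys without
// affecting the loader, but the *serviceDefinition values are shared.
func (l *loader) Services() map[string]*serviceDefinition {
	return maps.Clone(l.services)
}

func main() {
	l := &loader{services: map[string]*serviceDefinition{
		"employee.v1.EmployeeService": {FullName: "employee.v1.EmployeeService"},
	}}

	snapshot := l.Services()
	delete(snapshot, "employee.v1.EmployeeService")

	fmt.Println(len(snapshot), len(l.services))
}
```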
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c36e3f2 and 0e075e3.

⛔ Files ignored due to path filters (3)
  • router-tests/go.sum is excluded by !**/*.sum
  • router-tests/testdata/connectrpc/client/employee.v1/service.pb.go is excluded by !**/*.pb.go
  • router/go.sum is excluded by !**/*.sum
📒 Files selected for processing (53)
  • cli/src/commands/grpc-service/commands/generate.ts (0 hunks)
  • protographic/src/operation-to-proto.ts (2 hunks)
  • protographic/tests/operations/operation-validation.test.ts (1 hunks)
  • router-tests/connectrpc/README.md (1 hunks)
  • router-tests/connectrpc/connectrpc_client_test.go (1 hunks)
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1 hunks)
  • router-tests/connectrpc/connectrpc_test.go (1 hunks)
  • router-tests/connectrpc/connectrpc_test_helpers.go (1 hunks)
  • router-tests/go.mod (3 hunks)
  • router-tests/testdata/connectrpc/README.md (1 hunks)
  • router-tests/testdata/connectrpc/buf.gen.yaml (1 hunks)
  • router-tests/testdata/connectrpc/buf.yaml (1 hunks)
  • router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/MutationUpdateEmployeeMood.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeeById.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeeByPets.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeeWithMood.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployees.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeesByPetsInlineFragment.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeesByPetsNamedFragment.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/service.proto (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/service.proto.lock.json (1 hunks)
  • router/core/router.go (4 hunks)
  • router/core/router_config.go (3 hunks)
  • router/core/supervisor_instance.go (1 hunks)
  • router/go.mod (3 hunks)
  • router/pkg/config/config.go (2 hunks)
  • router/pkg/config/config.schema.json (1 hunks)
  • router/pkg/config/testdata/config_defaults.json (1 hunks)
  • router/pkg/config/testdata/config_full.json (1 hunks)
  • router/pkg/connectrpc/constructor_validation_test.go (1 hunks)
  • router/pkg/connectrpc/error_handling_test.go (1 hunks)
  • router/pkg/connectrpc/handler.go (1 hunks)
  • router/pkg/connectrpc/handler_test.go (1 hunks)
  • router/pkg/connectrpc/proto_loader.go (1 hunks)
  • router/pkg/connectrpc/proto_loader_test.go (1 hunks)
  • router/pkg/connectrpc/samples/service.graphqls (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/MutationUpdateEmployeeMood.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeById.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeByPets.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeWithMood.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployees.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeesByPetsInlineFragment.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeesByPetsNamedFragment.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/service.proto (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/service.proto.lock.json (1 hunks)
  • router/pkg/connectrpc/server.go (1 hunks)
  • router/pkg/connectrpc/server_test.go (1 hunks)
  • router/pkg/connectrpc/service_discovery.go (1 hunks)
  • router/pkg/connectrpc/test_helpers.go (1 hunks)
  • router/pkg/connectrpc/validator.go (1 hunks)
  • router/pkg/connectrpc/vanguard_service.go (1 hunks)
  • router/pkg/connectrpc/vanguard_service_test.go (1 hunks)
💤 Files with no reviewable changes (1)
  • cli/src/commands/grpc-service/commands/generate.ts
✅ Files skipped from review due to trivial changes (2)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeWithMood.graphql
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeeById.graphql
🚧 Files skipped from review as they are similar to previous changes (4)
  • router/pkg/connectrpc/server_test.go
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeById.graphql
  • router/core/router_config.go
  • router/pkg/config/config.schema.json
🧰 Additional context used
🧠 Learnings (13)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.

Applied to files:

  • router-tests/connectrpc/connectrpc_test.go
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go
  • router-tests/connectrpc/connectrpc_client_test.go
  • router-tests/go.mod
  • router/pkg/connectrpc/handler_test.go
  • router/pkg/connectrpc/constructor_validation_test.go
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router-tests/connectrpc/connectrpc_test.go
  • router/pkg/connectrpc/test_helpers.go
  • router/pkg/connectrpc/handler_test.go
  • router/pkg/connectrpc/error_handling_test.go
  • router/pkg/connectrpc/vanguard_service_test.go
  • router/pkg/connectrpc/vanguard_service.go
  • router-tests/connectrpc/connectrpc_test_helpers.go
  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.

Applied to files:

  • router-tests/connectrpc/connectrpc_test.go
  • router-tests/testdata/connectrpc/buf.gen.yaml
  • router-tests/go.mod
  • router/core/router.go
  • router/go.mod
📚 Learning: 2025-07-21T14:46:34.879Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/authentication/jwks_token_decoder.go:80-106
Timestamp: 2025-07-21T14:46:34.879Z
Learning: In the Cosmo router project, required field validation for JWKS configuration (Secret, Algorithm, KeyId) is handled at the JSON schema level in config.schema.json rather than through runtime validation in the Go code at router/pkg/authentication/jwks_token_decoder.go.

Applied to files:

  • router/pkg/connectrpc/validator.go
📚 Learning: 2025-07-21T15:06:36.664Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/config/config.schema.json:1637-1644
Timestamp: 2025-07-21T15:06:36.664Z
Learning: In the Cosmo router project, when extending JSON schema validation for security-sensitive fields like JWKS secrets, backwards compatibility is maintained by implementing warnings in the Go code rather than hard validation constraints in the schema. This allows existing configurations to continue working while alerting users to potential security issues.

Applied to files:

  • router/pkg/connectrpc/validator.go
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.

Applied to files:

  • router/core/router.go
  • router/core/supervisor_instance.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
📚 Learning: 2025-07-30T09:29:04.257Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/core/operation_processor.go:0-0
Timestamp: 2025-07-30T09:29:04.257Z
Learning: GraphQL operation names don't allow characters with more than 1 code point, so string length operations and slicing work correctly for both byte and character counting in GraphQL operation name processing.

Applied to files:

  • protographic/src/operation-to-proto.ts
  • protographic/tests/operations/operation-validation.test.ts
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.23+ minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/go.mod
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.25 minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/go.mod
📚 Learning: 2025-09-19T15:08:03.085Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2223
File: router/cmd/main.go:32-33
Timestamp: 2025-09-19T15:08:03.085Z
Learning: pyroscope-go's Config.ServerAddress field accepts both full URLs (e.g., "http://localhost:4040") and simple host:port format (e.g., "localhost:4040"). The library handles URL normalization internally, so both formats work correctly.

Applied to files:

  • router/pkg/connectrpc/server.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the query body before any APQ operations occur.

Applied to files:

  • router/pkg/connectrpc/handler.go
🧬 Code graph analysis (15)
router-tests/connectrpc/connectrpc_test.go (2)
router-tests/testenv/testenv.go (2)
  • Run (105-122)
  • Environment (1763-1799)
router-tests/connectrpc/connectrpc_test_helpers.go (2)
  • NewTestConnectRPCServer (94-144)
  • ConnectRPCServerOptions (77-83)
router-tests/connectrpc/connectrpc_server_lifecycle_test.go (2)
router-tests/connectrpc/connectrpc_test_helpers.go (2)
  • NewTestConnectRPCServer (94-144)
  • ConnectRPCServerOptions (77-83)
router/pkg/connectrpc/server.go (1)
  • ServerConfig (21-33)
router-tests/connectrpc/connectrpc_client_test.go (2)
router-tests/connectrpc/connectrpc_test_helpers.go (5)
  • NewTestConnectRPCServer (94-144)
  • ConnectRPCServerOptions (77-83)
  • EmployeeGraphQLHandler (222-228)
  • ErrorGraphQLHandler (231-237)
  • HTTPErrorHandler (240-245)
router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)
  • NewEmployeeServiceClient (77-130)
router/pkg/connectrpc/validator.go (1)
router/pkg/connectrpc/proto_loader.go (1)
  • ProtoLoader (53-59)
router/pkg/config/config.go (3)
router/pkg/connectrpc/server.go (1)
  • Server (36-47)
router/core/graph_server.go (1)
  • Server (69-72)
router-tests/jwks/jwks.go (1)
  • Server (24-29)
router/core/router.go (3)
router/pkg/connectrpc/server.go (3)
  • Server (36-47)
  • ServerConfig (21-33)
  • NewServer (50-90)
router/pkg/connectrpc/service_discovery.go (2)
  • DiscoverServices (46-153)
  • ServiceDiscoveryConfig (29-34)
router/pkg/config/config.go (1)
  • ConnectRPCConfiguration (1012-1017)
router/pkg/connectrpc/test_helpers.go (3)
router/pkg/connectrpc/proto_loader.go (2)
  • ProtoLoader (53-59)
  • NewProtoLoader (62-72)
router/pkg/connectrpc/handler.go (2)
  • RPCHandler (139-145)
  • HandlerConfig (148-154)
router/pkg/connectrpc/operation_registry.go (2)
  • NewOperationRegistry (32-41)
  • OperationRegistry (24-29)
router/pkg/connectrpc/handler_test.go (4)
router/pkg/connectrpc/operation_registry.go (2)
  • NewOperationRegistry (32-41)
  • OperationRegistry (24-29)
router/pkg/connectrpc/proto_loader.go (2)
  • NewProtoLoader (62-72)
  • ProtoLoader (53-59)
router/pkg/connectrpc/handler.go (2)
  • NewRPCHandler (157-190)
  • HandlerConfig (148-154)
router/pkg/connectrpc/test_helpers.go (1)
  • MockHTTPClient (64-71)
router/pkg/connectrpc/constructor_validation_test.go (3)
router/pkg/connectrpc/handler.go (3)
  • NewRPCHandler (157-190)
  • HandlerConfig (148-154)
  • RPCHandler (139-145)
router/pkg/connectrpc/server.go (1)
  • ServerConfig (21-33)
router/pkg/connectrpc/vanguard_service.go (2)
  • NewVanguardService (68-93)
  • VanguardServiceConfig (53-57)
router/pkg/connectrpc/server.go (5)
router/pkg/connectrpc/proto_loader.go (2)
  • ProtoLoader (53-59)
  • NewProtoLoader (62-72)
router/pkg/connectrpc/operation_registry.go (2)
  • OperationRegistry (24-29)
  • NewOperationRegistry (32-41)
router/pkg/connectrpc/handler.go (3)
  • RPCHandler (139-145)
  • NewRPCHandler (157-190)
  • HandlerConfig (148-154)
router/pkg/connectrpc/vanguard_service.go (3)
  • VanguardService (60-65)
  • NewVanguardService (68-93)
  • VanguardServiceConfig (53-57)
router/pkg/connectrpc/service_discovery.go (2)
  • DiscoverServices (46-153)
  • ServiceDiscoveryConfig (29-34)
router/pkg/connectrpc/vanguard_service.go (2)
router/pkg/connectrpc/handler.go (2)
  • MetaKeyGraphQLErrors (101-101)
  • GraphQLError (125-130)
router/pkg/connectrpc/proto_loader.go (2)
  • ServiceDefinition (17-30)
  • MethodDefinition (33-50)
router/pkg/connectrpc/proto_loader_test.go (3)
router/pkg/connectrpc/proto_loader.go (2)
  • ProtoLoader (53-59)
  • NewProtoLoader (62-72)
router-tests/testenv/testenv.go (1)
  • Run (105-122)
demo/pkg/subgraphs/test1/subgraph/model/models_gen.go (1)
  • InputType (346-348)
router-tests/connectrpc/connectrpc_test_helpers.go (1)
router/pkg/connectrpc/server.go (1)
  • ServerConfig (21-33)
router/pkg/connectrpc/handler.go (3)
router/pkg/connectrpc/operation_registry.go (1)
  • OperationRegistry (24-29)
router/pkg/connectrpc/validator.go (3)
  • MessageValidator (13-15)
  • NewMessageValidator (18-22)
  • ValidationError (25-28)
router/pkg/connectrpc/proto_loader.go (1)
  • ProtoLoader (53-59)
router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)
router-tests/testdata/connectrpc/client/employee.v1/service.pb.go (22)
  • GetEmployeeByIdRequest (222-227)
  • GetEmployeeByIdRequest (240-240)
  • GetEmployeeByIdRequest (255-257)
  • GetEmployeeByIdResponse (266-271)
  • GetEmployeeByIdResponse (284-284)
  • GetEmployeeByIdResponse (299-301)
  • GetEmployeesRequest (479-483)
  • GetEmployeesRequest (496-496)
  • GetEmployeesRequest (511-513)
  • GetEmployeesResponse (515-520)
  • GetEmployeesResponse (533-533)
  • GetEmployeesResponse (548-550)
  • GetEmployeesWithMoodResponse (435-440)
  • GetEmployeesWithMoodResponse (453-453)
  • GetEmployeesWithMoodResponse (468-470)
  • UpdateEmployeeMoodRequest (125-131)
  • UpdateEmployeeMoodRequest (144-144)
  • UpdateEmployeeMoodRequest (159-161)
  • UpdateEmployeeMoodResponse (177-183)
  • UpdateEmployeeMoodResponse (196-196)
  • UpdateEmployeeMoodResponse (211-213)
  • File_employee_v1_service_proto (1853-1853)
🪛 markdownlint-cli2 (0.18.1)
router-tests/testdata/connectrpc/README.md

7-7: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

router-tests/connectrpc/README.md

47-47: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Comment on lines 96 to 159
{
	name: "VanguardService: nil handler",
	constructor: func() (interface{}, error) {
		protoLoader := NewProtoLoader(logger)
		_ = protoLoader.LoadFromDirectory("samples/services/employee.v1")
		return NewVanguardService(VanguardServiceConfig{
			Handler:     nil,
			ProtoLoader: protoLoader,
			Logger:      logger,
		})
	},
	wantErr: "handler cannot be nil",
},
{
	name: "VanguardService: nil proto loader",
	constructor: func() (interface{}, error) {
		return NewVanguardService(VanguardServiceConfig{
			Handler:     &RPCHandler{},
			ProtoLoader: nil,
			Logger:      logger,
		})
	},
	wantErr: "proto loader cannot be nil",
},
{
	name: "VanguardService: no proto services",
	constructor: func() (interface{}, error) {
		protoLoader := NewProtoLoader(logger)
		return NewVanguardService(VanguardServiceConfig{
			Handler:     &RPCHandler{},
			ProtoLoader: protoLoader,
			Logger:      logger,
		})
	},
	wantErr: "no proto services found",
},

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if samples directory exists in the repository
find . -type d -name "samples" | head -20

# Search for references to "samples/services" in the codebase
rg -n "samples/services" --type go

# Check the actual test file to see the full context
wc -l router/pkg/connectrpc/constructor_validation_test.go

# Look for the LoadFromDirectory function to understand how paths are handled
rg -n "LoadFromDirectory" --type go -B2 -A5

Repository: wundergraph/cosmo

Length of output: 6815


🏁 Script executed:

#!/bin/bash
# Check the samples directory structure
ls -la router/pkg/connectrpc/samples/services/ 2>/dev/null || echo "samples/services directory structure:"
find router/pkg/connectrpc/samples -type f | head -20

# Look at the full constructor_validation_test.go file including test setup
head -50 router/pkg/connectrpc/constructor_validation_test.go

# Check for any test helpers or setup
grep -n "func Test\|func setup\|os.Chdir\|t.Chdir" router/pkg/connectrpc/constructor_validation_test.go

# Check if there's a module or workspace configuration
ls -la router/pkg/connectrpc/ | grep -E "go.mod|go.sum"

Repository: wundergraph/cosmo

Length of output: 2424


🏁 Script executed:

#!/bin/bash
# Check for Makefile or test scripts
find . -maxdepth 3 -name "Makefile" -o -name "*.sh" | grep -E "test|make" | head -10

# Look for CI configuration that shows how tests are run
ls -la .github/workflows/ 2>/dev/null | head -20

# Check if there's a root-level Makefile
cat Makefile 2>/dev/null | grep -A5 "test:" || echo "No test target in Makefile"

# Check the go.mod at repo root to understand module structure
head -20 go.mod

# Verify the relative path from repo root perspective
ls -la samples/services/employee.v1 2>/dev/null || echo "Path doesn't exist from repo root"
ls -la router/pkg/connectrpc/samples/services/employee.v1 2>/dev/null || echo "Path from full package location"

Repository: wundergraph/cosmo

Length of output: 2480


🏁 Script executed:

#!/bin/bash
# Check for go.mod in router package
head -30 router/pkg/connectrpc/go.mod 2>/dev/null || echo "No go.mod in router/pkg/connectrpc"

# Check go.mod in router directory
head -20 router/go.mod 2>/dev/null | head -10

# Check the connect-go CI workflow to see how tests are run
cat .github/workflows/connect-go-ci.yaml

# Check if there's any working directory specification in tests
grep -r "working.directory\|working-directory\|chdir\|Chdir" router/pkg/connectrpc/*.go 2>/dev/null || echo "No working directory directives found"

Repository: wundergraph/cosmo

Length of output: 1030


🏁 Script executed:

#!/bin/bash
# Find CI workflows related to router tests
grep -l "router" .github/workflows/*.yaml | head -5

# Check if there's a specific workflow for router tests
cat .github/workflows/*router*.yaml 2>/dev/null | grep -A20 "connectrpc\|pkg/connectrpc" || echo "No direct reference to connectrpc tests"

# Look for where connectrpc tests are run
grep -r "connectrpc" .github/workflows/*.yaml

# Check the actual test invocation in makefiles
cat router-tests/Makefile 2>/dev/null || cat router/Makefile 2>/dev/null

# Check if tests specify working directory in go.mod or via build constraints
grep -n "working\|chdir" router/Makefile 2>/dev/null || echo "No working directory in Makefile"

Repository: wundergraph/cosmo

Length of output: 1595


🏁 Script executed:

#!/bin/bash
# Check router-tests directory structure
ls -la router-tests/ | head -20

# Check if samples are in router-tests
find router-tests -name "samples" -o -name "employee.v1" 2>/dev/null | head -10

# Check router-tests/Makefile more carefully - understand what test_target actually is
grep -n "test_target\|connectrpc" router-tests/Makefile

# Look at the actual CI that runs router tests
grep -A30 "build_test\|runs-on" .github/workflows/aws-lambda-router-ci.yaml | head -50

Repository: wundergraph/cosmo

Length of output: 2710


🏁 Script executed:

#!/bin/bash
# Compare the two connectrpc test locations
echo "=== Files in router/pkg/connectrpc/: ==="
ls -1 router/pkg/connectrpc/*.go | wc -l

echo "=== Files in router-tests/connectrpc/: ==="
ls -1 router-tests/connectrpc/*.go 2>/dev/null | wc -l || echo "No files (may be empty dir)"

# Check what's in router-tests/connectrpc
ls -la router-tests/connectrpc/

# Check the testdata structure
echo "=== router-tests testdata structure: ==="
find router-tests/testdata/connectrpc -type f | head -20

# Key question: where does constructor_validation_test.go exist?
find . -name "constructor_validation_test.go"

# Check if router tests import from router/pkg/connectrpc
grep -r "pkg/connectrpc\|connectrpc\." router-tests/connectrpc/ 2>/dev/null | head -5

Repository: wundergraph/cosmo

Length of output: 2290


🏁 Script executed:

#!/bin/bash
# Check if there's a Makefile in router directory
ls -la router/Makefile 2>/dev/null || echo "No Makefile in router/"

# Search for CI workflows that mention router module tests
grep -l "router\|connectrpc" .github/workflows/*.yaml | xargs grep -l "go test\|gotestsum" 2>/dev/null | head -5

# Check if there's any CI that explicitly runs router/pkg/connectrpc tests
grep -r "pkg/connectrpc\|router.*test" .github/workflows/ | grep -v ".swp" | head -10

# Look for how the router module itself is tested
grep -r "working.directory.*router[^-]" .github/workflows/ || echo "No explicit working directory for router tests"

# Check if there's a go.mod at router level and how it's tested
ls -la router/go.mod && echo "router/go.mod exists"

# Check the full CI workflow for router tests
cat .github/workflows/*router*.yaml 2>/dev/null | grep -A50 "jobs:" | head -60

Repository: wundergraph/cosmo

Length of output: 2544


🏁 Script executed:

#!/bin/bash
# Get the full router-ci.yaml workflow
cat .github/workflows/router-ci.yaml

# Also check router/Makefile to understand test targets
cat router/Makefile

Repository: wundergraph/cosmo

Length of output: 16302


🏁 Script executed:

#!/bin/bash
# Look at the exact test code around line 100 to see what happens with LoadFromDirectory
sed -n '95,110p' router/pkg/connectrpc/constructor_validation_test.go

# Check if the error from LoadFromDirectory is being checked
sed -n '98,106p' router/pkg/connectrpc/constructor_validation_test.go

# Check what the test is actually testing - is it testing the LoadFromDirectory call or the VanguardService constructor?
sed -n '96,131p' router/pkg/connectrpc/constructor_validation_test.go

Repository: wundergraph/cosmo

Length of output: 1858


🏁 Script executed:

#!/bin/bash
# Key question: does the path actually work from router directory?
# Simulate running from router directory
cd router && ls -la samples/services/employee.v1 2>&1 || echo "Path does not exist from router directory"

# Go back to root
cd ..

# Now check what the correct path would be from router directory
ls -la router/pkg/connectrpc/samples/services/employee.v1

# Check if there's any symlink or the samples are duplicated
ls -la router/samples 2>/dev/null || echo "No router/samples directory"

# The key insight: check where the test helper actually looks
grep -A10 "GetSharedProtoLoader\|setupTestProtoLoaderFromDir" router/pkg/connectrpc/test_helpers.go

Repository: wundergraph/cosmo

Length of output: 1499


🏁 Script executed:

#!/bin/bash
# Check what happens when LoadFromDirectory is called with wrong path
# Look at the LoadFromDirectory implementation to see error handling
grep -A20 "func (pl \*ProtoLoader) LoadFromDirectory" router/pkg/connectrpc/proto_loader.go

# Check if there are any test results or CI logs showing this test passing
git log --oneline -n 20 router/pkg/connectrpc/constructor_validation_test.go 2>/dev/null | head -5

# Finally, verify the exact state - check if maybe the path is different in the actual code vs the snippet
cat router/pkg/connectrpc/constructor_validation_test.go | grep -n "samples/services"

Repository: wundergraph/cosmo

Length of output: 969


Tests reference inaccessible filesystem paths when run from the router directory.

The VanguardService validation tests at lines 96–131 use relative paths like "samples/services/employee.v1" (line 100) that do not exist relative to the router module root. CI runs tests from the router directory (working-directory: ./router, per .github/workflows/router-ci.yaml), so the path should be "pkg/connectrpc/samples/services/employee.v1". The error from LoadFromDirectory is discarded (line 100: _ = protoLoader.LoadFromDirectory(...)), silently masking the failure and leaving the protoLoader without loaded services.

Similarly, lines 89, 162, and 172 reference "samples/services" which will also fail from the router working directory.

🤖 Prompt for AI Agents
router/pkg/connectrpc/constructor_validation_test.go lines 96-131: the test uses
a relative path "samples/services/employee.v1" that is invalid when CI runs from
the router module root and it ignores the error returned by
protoLoader.LoadFromDirectory; change the path to
"pkg/connectrpc/samples/services/employee.v1" (and similarly update other
occurrences at lines ~89, 162, 172), and propagate/check the error from
LoadFromDirectory instead of discarding it so the test fails fast when the proto
files can't be loaded.

Comment on lines +212 to +237
case descriptorpb.FieldDescriptorProto_TYPE_INT64,
descriptorpb.FieldDescriptorProto_TYPE_SINT64,
descriptorpb.FieldDescriptorProto_TYPE_SFIXED64:
// JSON numbers are float64, check if it's a valid integer
// Note: int64 can be represented as string in JSON to avoid precision loss
switch v := value.(type) {
case float64:
if v != float64(int64(v)) {
return &ValidationError{
Field: fieldPath,
Message: fmt.Sprintf("Int64 cannot represent non-integer value: %v", value),
}
}
case string:
// String representation is valid for int64
default:
return &ValidationError{
Field: fieldPath,
Message: fmt.Sprintf("Int64 cannot represent non-numeric value: %v", value),
}
}

⚠️ Potential issue | 🟡 Minor

Int64/Uint64 string validation accepts any string without parsing.

When int64 or uint64 values are provided as strings (per protobuf JSON mapping spec), the current validation only checks the type but doesn't verify the string contains a valid integer:

case string:
    // String representation is valid for int64

This would accept invalid values like "abc" or "12.34". Consider adding actual parsing validation:

 case string:
-    // String representation is valid for int64
+    // Validate string can be parsed as int64
+    if _, err := strconv.ParseInt(v, 10, 64); err != nil {
+        return &ValidationError{
+            Field:   fieldPath,
+            Message: fmt.Sprintf("Int64 cannot represent value: %v", value),
+        }
+    }

Apply similar validation for uint64 at lines 250-267.
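
A self-contained sketch of the stricter check, covering both the int64 and uint64 cases. The function names `validateInt64JSON` and `validateUint64JSON` are hypothetical and plain `error` values stand in for the PR's `ValidationError`; the range checks on `float64` are an extra assumption beyond what the diff above proposes.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// validateInt64JSON checks a decoded JSON value destined for an int64 field.
// JSON numbers arrive as float64; strings must parse as base-10 int64.
func validateInt64JSON(value any) error {
	switch v := value.(type) {
	case float64:
		// Reject fractions, NaN, infinities, and values outside the int64 range.
		if v != math.Trunc(v) || v < -(1<<63) || v >= 1<<63 {
			return fmt.Errorf("int64 cannot represent value: %v", v)
		}
	case string:
		if _, err := strconv.ParseInt(v, 10, 64); err != nil {
			return fmt.Errorf("int64 cannot represent value: %q", v)
		}
	default:
		return fmt.Errorf("int64 cannot represent non-numeric value: %v", v)
	}
	return nil
}

// validateUint64JSON applies the same idea to unsigned 64-bit fields.
func validateUint64JSON(value any) error {
	switch v := value.(type) {
	case float64:
		if v != math.Trunc(v) || v < 0 || v >= 1<<64 {
			return fmt.Errorf("uint64 cannot represent value: %v", v)
		}
	case string:
		if _, err := strconv.ParseUint(v, 10, 64); err != nil {
			return fmt.Errorf("uint64 cannot represent value: %q", v)
		}
	default:
		return fmt.Errorf("uint64 cannot represent non-numeric value: %v", v)
	}
	return nil
}
```

With this shape, `"abc"` and `"12.34"` are rejected by the string branch, while `"123"` and a whole-number `float64` pass.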

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
router/pkg/connectrpc/server.go (1)

288-289: Critical: Reload does not wrap handler in h2c, breaking HTTP/2 support.

In Start() at lines 176-177, the handler is wrapped with h2c.NewHandler(handler, &http2.Server{}) to enable HTTP/2 over cleartext. However, Reload() sets the handler directly without the h2c wrapper, which will break gRPC compatibility after a reload.

Apply this diff to fix:

 	// Update HTTP server handler
-	s.httpServer.Handler = s.createHandler()
+	handler := s.createHandler()
+	s.httpServer.Handler = h2c.NewHandler(handler, &http2.Server{})
🧹 Nitpick comments (1)
router/pkg/connectrpc/server.go (1)

330-336: Optional: Consider removing unused statusCode field.

The responseWriter captures the HTTP status code in WriteHeader() (line 369), but the captured statusCode field is never read or used elsewhere in the codebase. If this is intentional for future logging or metrics, consider adding a comment. Otherwise, you can simplify by removing the unused field.

If you decide to remove it, apply this diff:

 type responseWriter struct {
 	http.ResponseWriter
-	statusCode int
 }

 func (rw *responseWriter) WriteHeader(code int) {
-	rw.statusCode = code
 	rw.ResponseWriter.WriteHeader(code)
 }

And remove the initialization in createHandler:

-	rw := &responseWriter{ResponseWriter: w, statusCode: 200}
+	rw := &responseWriter{ResponseWriter: w}

Also applies to: 365-371

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e075e3 and 666d384.

📒 Files selected for processing (3)
  • router/pkg/connectrpc/constructor_validation_test.go (1 hunks)
  • router/pkg/connectrpc/server.go (1 hunks)
  • router/pkg/connectrpc/validator.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • router/pkg/connectrpc/constructor_validation_test.go
  • router/pkg/connectrpc/validator.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-19T15:08:03.085Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2223
File: router/cmd/main.go:32-33
Timestamp: 2025-09-19T15:08:03.085Z
Learning: pyroscope-go's Config.ServerAddress field accepts both full URLs (e.g., "http://localhost:4040") and simple host:port format (e.g., "localhost:4040"). The library handles URL normalization internally, so both formats work correctly.

Applied to files:

  • router/pkg/connectrpc/server.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: build-router
  • GitHub Check: build_push_image
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: integration_test (./events)
  • GitHub Check: image_scan (nonroot)
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: image_scan
  • GitHub Check: build_test
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (7)
router/pkg/connectrpc/server.go (7)

20-47: LGTM!

The configuration structure and server type definitions are well-organized and include all necessary components for lifecycle management, service discovery, proto loading, operation registry, and HTTP/2 support.


49-90: LGTM!

The constructor properly validates required configuration, sets sensible defaults, and creates a resilient HTTP client with retry capabilities. The protocol normalization for GraphQL endpoint is appropriate.


92-208: LGTM!

The Start method correctly orchestrates service discovery, component initialization, and server startup. The ProtoLoader is created before initializeComponents() (ensuring the RPCHandler gets the fresh loader), and the handler is properly wrapped with h2c for HTTP/2 support.


210-227: LGTM!

The Stop method implements proper graceful shutdown with a reasonable timeout and appropriate error handling.


295-315: LGTM!

The component initialization method correctly creates the operation registry and RPC handler, with clear documentation about the ProtoLoader dependency.


317-323: LGTM!

The deprecated LoadOperations method is appropriately maintained for backward compatibility with a clear warning message.


344-410: LGTM!

The accessor methods properly handle nil checks, and the responseWriter's streaming interface implementations (Flush, Push, Hijack) correctly delegate to the underlying ResponseWriter when supported, which is essential for gRPC streaming compatibility.

Member

@endigma endigma left a comment

partial review

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
router-tests/testdata/connectrpc/README.md (1)

24-33: Scope concern: Verify whether "Regenerating Client Code" section aligns with feedback.

The past review feedback requested cutting the README down to "just explaining how to write new tests." The current "Regenerating Client Code" section (lines 24+) addresses client code management rather than test writing. If the intent is to focus strictly on test authoring, consider moving or removing this section; otherwise, clarify the README's scope in the intro.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 996c6c0 and 09d48ef.

📒 Files selected for processing (1)
  • router-tests/testdata/connectrpc/README.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2141
File: router-tests/http_subscriptions_test.go:17-55
Timestamp: 2025-08-28T09:17:49.477Z
Learning: The Cosmo router uses a custom, intentionally rigid multipart implementation for GraphQL subscriptions. The multipart parsing in test files should remain strict and not be made more tolerant, as this rigidity is by design.
🪛 markdownlint-cli2 (0.18.1)
router-tests/testdata/connectrpc/README.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
router/pkg/connectrpc/handler_test.go (1)

84-94: Add missing test case for nil ProtoLoader.

The constructor validates that ProtoLoader is not nil (handler.go line 177-179), but there's no test case covering this validation path.

Add a test case:

+	t.Run("should return error when proto loader is missing", func(t *testing.T) {
+		operationRegistry := NewOperationRegistry(logger)
+		handler, err := NewRPCHandler(HandlerConfig{
+			GraphQLEndpoint:   "http://localhost:4000/graphql",
+			HTTPClient:        httpClient,
+			Logger:            logger,
+			OperationRegistry: operationRegistry,
+		})
+
+		assert.Error(t, err)
+		assert.Nil(t, handler)
+		assert.ErrorContains(t, err, "proto loader is required")
+	})
router/pkg/connectrpc/server.go (1)

109-134: Consider using LoadFromDirectories for consistency.

The code calls LoadFromDirectory separately for each service (line 118), which bypasses the package uniqueness validation in LoadFromDirectories. While service discovery already validates package.service uniqueness, using LoadFromDirectories would provide additional validation and be more efficient.

Consider collecting all service directories and loading them in one call:

+	// Collect all service directories
+	serviceDirs := make([]string, len(discoveredServices))
+	for i, service := range discoveredServices {
+		serviceDirs[i] = service.ServiceDir
+	}
+
+	// Load all proto files at once (validates package uniqueness)
+	if err := server.protoLoader.LoadFromDirectories(serviceDirs); err != nil {
+		return nil, fmt.Errorf("failed to load proto files: %w", err)
+	}
+
 	// Load proto files and operations for each discovered service
 	for _, service := range discoveredServices {
 		server.logger.Info("loading service",
 			zap.String("service", service.FullName),
 			zap.String("dir", service.ServiceDir),
 			zap.Int("proto_files", len(service.ProtoFiles)),
 			zap.Int("operation_files", len(service.OperationFiles)))
 
-		// Load proto files for this service
-		if err := server.protoLoader.LoadFromDirectory(service.ServiceDir); err != nil {
-			return nil, fmt.Errorf("failed to load proto files for service %s: %w", service.FullName, err)
-		}
-
 		// Load operations for this service
 		if len(service.OperationFiles) > 0 {
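
The kind of package uniqueness check referred to above can be sketched independently of the loader. This is an assumption-laden illustration: `discoveredService` and `checkUniquePackages` are hypothetical names, and the actual validation inside `LoadFromDirectories` may differ.

```go
package main

import "fmt"

// discoveredService is a hypothetical, reduced view of a discovered service:
// the proto package it declares and the directory it was found in.
type discoveredService struct {
	Package    string
	ServiceDir string
}

// checkUniquePackages returns an error if the same proto package is declared
// by services in two different directories.
func checkUniquePackages(services []discoveredService) error {
	seen := make(map[string]string, len(services))
	for _, svc := range services {
		if prev, ok := seen[svc.Package]; ok && prev != svc.ServiceDir {
			return fmt.Errorf("duplicate proto package %q in %q and %q",
				svc.Package, prev, svc.ServiceDir)
		}
		seen[svc.Package] = svc.ServiceDir
	}
	return nil
}
```

Running a check like this once over all directories, before any per-service work, reports a conflict before operations are loaded for any service.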
router/pkg/config/config.schema.json (1)

2198-2202: Use http-url format for consistency with other endpoint configurations.

Line 2201 specifies graphql_endpoint with format: "url", but other endpoint fields in the schema (e.g., cdn.url, mcp.router_url) use the more restrictive http-url format. For consistency and validation clarity, consider updating this to http-url.

         "graphql_endpoint": {
           "type": "string",
           "description": "Custom URL to use for the router GraphQL endpoint in ConnectRPC. Use this when your router is behind a proxy. This URL is used to forward RPC requests as GraphQL operations.",
-          "format": "url"
+          "format": "http-url"
         }
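
To illustrate why the stricter format matters, here is a rough Go sketch of what an HTTP-only URL check could look like. It assumes that `http-url` means "an absolute URL with an http or https scheme and a host"; the exact rules of the schema's `http-url` format are not shown in this review, and `isHTTPURL` is a hypothetical helper.

```go
package main

import "net/url"

// isHTTPURL reports whether raw is an absolute URL with an http or https
// scheme and a non-empty host. This approximates an "http-url" style check
// under the assumption stated above.
func isHTTPURL(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	return (u.Scheme == "http" || u.Scheme == "https") && u.Host != ""
}
```

Under this assumption a value such as `ftp://example.com` could pass a generic URL check while failing the HTTP-only one, which would be the practical difference for `graphql_endpoint`.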
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 09d48ef and 6836b37.

⛔ Files ignored due to path filters (1)
  • router-tests/testdata/connectrpc/client/employee.v1/service.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (12)
  • router-tests/testdata/connectrpc/buf.gen.yaml (1 hunks)
  • router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1 hunks)
  • router/core/router.go (5 hunks)
  • router/core/router_config.go (4 hunks)
  • router/pkg/config/config.go (2 hunks)
  • router/pkg/config/config.schema.json (1 hunks)
  • router/pkg/connectrpc/error_handling_test.go (1 hunks)
  • router/pkg/connectrpc/handler.go (1 hunks)
  • router/pkg/connectrpc/handler_test.go (1 hunks)
  • router/pkg/connectrpc/proto_loader.go (1 hunks)
  • router/pkg/connectrpc/server.go (1 hunks)
  • router/pkg/connectrpc/server_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
  • router/core/router_config.go
  • router/pkg/config/config.go
  • router/pkg/connectrpc/server_test.go
  • router-tests/testdata/connectrpc/buf.gen.yaml
  • router/core/router.go
  • router/pkg/connectrpc/error_handling_test.go
  • router/pkg/connectrpc/handler.go
🧰 Additional context used
🧠 Learnings (8)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2141
File: router-tests/http_subscriptions_test.go:17-55
Timestamp: 2025-08-28T09:17:49.477Z
Learning: The Cosmo router uses a custom, intentionally rigid multipart implementation for GraphQL subscriptions. The multipart parsing in test files should remain strict and not be made more tolerant, as this rigidity is by design.
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
📚 Learning: 2025-07-21T15:06:36.664Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/config/config.schema.json:1637-1644
Timestamp: 2025-07-21T15:06:36.664Z
Learning: In the Cosmo router project, when extending JSON schema validation for security-sensitive fields like JWKS secrets, backwards compatibility is maintained by implementing warnings in the Go code rather than hard validation constraints in the schema. This allows existing configurations to continue working while alerting users to potential security issues.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-07-21T14:46:34.879Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/authentication/jwks_token_decoder.go:80-106
Timestamp: 2025-07-21T14:46:34.879Z
Learning: In the Cosmo router project, required field validation for JWKS configuration (Secret, Algorithm, KeyId) is handled at the JSON schema level in config.schema.json rather than through runtime validation in the Go code at router/pkg/authentication/jwks_token_decoder.go.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-06-30T20:39:02.387Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 1929
File: router/internal/circuit/manager.go:16-25
Timestamp: 2025-06-30T20:39:02.387Z
Learning: In the Cosmo router project, parameter validation for circuit breaker configuration is handled at the JSON schema level rather than through runtime validation methods on structs. The config.schema.json file contains comprehensive validation constraints for circuit breaker parameters.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-09-19T15:08:03.085Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2223
File: router/cmd/main.go:32-33
Timestamp: 2025-09-19T15:08:03.085Z
Learning: pyroscope-go's Config.ServerAddress field accepts both full URLs (e.g., "http://localhost:4040") and simple host:port format (e.g., "localhost:4040"). The library handles URL normalization internally, so both formats work correctly.

Applied to files:

  • router/pkg/connectrpc/server.go
🧬 Code graph analysis (2)
router/pkg/connectrpc/handler_test.go (4)
router/pkg/connectrpc/operation_registry.go (2)
  • NewOperationRegistry (32-41)
  • OperationRegistry (24-29)
router/pkg/connectrpc/proto_loader.go (2)
  • NewProtoLoader (62-72)
  • ProtoLoader (53-59)
router/pkg/connectrpc/handler.go (2)
  • NewRPCHandler (161-194)
  • HandlerConfig (152-158)
router/pkg/connectrpc/test_helpers.go (1)
  • MockHTTPClient (64-71)
router/pkg/connectrpc/server.go (4)
router/pkg/connectrpc/proto_loader.go (2)
  • ProtoLoader (53-59)
  • NewProtoLoader (62-72)
router/pkg/connectrpc/handler.go (3)
  • RPCHandler (143-149)
  • NewRPCHandler (161-194)
  • HandlerConfig (152-158)
router/pkg/connectrpc/vanguard_service.go (3)
  • VanguardService (60-65)
  • NewVanguardService (68-93)
  • VanguardServiceConfig (53-57)
router/pkg/connectrpc/service_discovery.go (2)
  • DiscoverServices (46-153)
  • ServiceDiscoveryConfig (29-34)
🔇 Additional comments (27)
router/pkg/connectrpc/handler_test.go (7)

112-178: LGTM!

The tests properly cover both success and error paths for RPC handling, including service-scoped operation registry usage.


184-220: LGTM - Test structure improved.

The nested test structure follows the previous review feedback and clearly describes expected behavior ("forwarding headers from context should forward listed headers").


222-249: LGTM!

The test properly verifies HTTP transport error handling using a custom error round tripper.


251-280: LGTM!

The test correctly verifies operation counting across service-scoped registries.


282-311: LGTM!

The test properly validates both positive and negative cases for operation existence.


313-496: LGTM - Comprehensive coverage.

The test suite thoroughly covers all edge cases for protobuf JSON to GraphQL variable conversion, including nested structures, arrays, nulls, and empty values.


498-577: LGTM!

Both test functions provide comprehensive coverage with clean table-driven tests for snakeToCamel and thorough case coverage for convertKeysRecursive.

router/pkg/connectrpc/proto_loader.go (9)

16-72: LGTM!

The struct definitions are well-documented, and the constructor properly initializes the local files registry, avoiding global registry conflicts.


74-104: LGTM!

The directory loading logic is clear and handles errors appropriately.


138-175: LGTM - Package uniqueness validation fixed.

The validation logic has been corrected per previous review feedback. By tracking existing services before loading each file (lines 139-142) and only validating newly added services (lines 155-159), duplicate proto packages from different directories will now be properly detected.


185-206: LGTM!

Standard recursive file discovery implementation.


208-236: LGTM!

The proto file parsing logic is standard and handles errors appropriately.


238-278: LGTM - Race condition resolved.

The global registry race condition has been fixed by using a local pl.files registry instance (line 255). Each ProtoLoader instance now has its own registry, eliminating cross-instance conflicts mentioned in previous reviews.


280-322: LGTM!

The service definition extraction logic properly converts descriptors and extracts all method metadata.


324-334: LGTM!

The getter methods are straightforward. The read-only comment on GetServices is helpful guidance for callers.


336-356: LGTM!

The method lookup and files registry accessor are correctly implemented.

router/pkg/connectrpc/server.go (10)

20-47: LGTM!

The configuration and server structs are well-defined with appropriate fields for lifecycle management and component wiring.


168-213: LGTM!

The server start logic properly wraps the handler with h2c for HTTP/2 support and starts the server with appropriate timeouts.


215-232: LGTM!

Standard graceful shutdown implementation with appropriate timeout.


234-299: LGTM - Previous issues resolved.

Both critical issues from previous reviews have been fixed:

  1. The h2c wrapper is now applied during reload (line 295)
  2. ProtoLoader is created before initializeComponents (line 249), ensuring the handler gets fresh proto definitions

301-321: LGTM!

Component initialization is clear, and the comment about ProtoLoader requirement is helpful.


323-340: LGTM!

The handler creation logic properly wraps the transcoder to capture response status.


342-356: LGTM!

The getter methods properly handle nil checks and delegate to the vanguard service.


358-391: LGTM!

The response writer wrapper properly implements all required streaming interfaces and checks for underlying support before delegating.
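
As a rough illustration of the pattern being approved here, a status-capturing wrapper that checks for streaming support before delegating might look like the sketch below. The type name statusRecorder and the demo helper are hypothetical and are not taken from server.go.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusRecorder is a hypothetical wrapper that remembers the status code
// written by the wrapped handler (the transcoder, in the PR's case).
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

func (r *statusRecorder) Write(p []byte) (int, error) {
	if r.status == 0 {
		r.status = http.StatusOK // net/http sends an implicit 200 on first write
	}
	return r.ResponseWriter.Write(p)
}

// Flush delegates only when the underlying writer supports streaming.
func (r *statusRecorder) Flush() {
	if f, ok := r.ResponseWriter.(http.Flusher); ok {
		f.Flush()
	}
}

// Unwrap lets http.ResponseController reach the underlying writer.
func (r *statusRecorder) Unwrap() http.ResponseWriter { return r.ResponseWriter }

// demo runs one request through the wrapper and reports what was captured.
func demo() (status int, flushed bool) {
	under := httptest.NewRecorder()
	rec := &statusRecorder{ResponseWriter: under}
	h := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusNotFound)
		w.(http.Flusher).Flush()
	})
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/svc/Method", nil))
	return rec.status, under.Flushed
}

func main() {
	fmt.Println(demo())
}
```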


393-399: LGTM!

The operation count getter properly handles nil checks.


401-407: LGTM!

The address accessor properly handles the nil listener case.

router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)

1-307: Generated code - LGTM.

This file is auto-generated by protoc-gen-connect-go (line 1). The generated client and handler scaffolding follows standard Connect RPC patterns and should not be manually modified.

asoorm added 13 commits December 3, 2025 08:21
Remove test helper abstractions and generated proto code that caused
protobuf namespace conflicts. Tests now directly instantiate the
ConnectRPC server and focus on core functionality:

- Remove router-tests/connectrpc/server.go helper
- Simplify connectrpc_test.go to use direct server instantiation
- Add README.md documenting test data structure and purpose

This resolves protobuf global registry conflicts that occurred when
generated code auto-registered types that the server also tried to
register at runtime.
@asoorm asoorm force-pushed the ahmet/eng-8277-connect-rpc-handler-final branch from 6836b37 to c1b1b30 Compare December 3, 2025 08:22
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (7)
router-tests/testdata/connectrpc/README.md (1)

16-22: Add a language identifier to the fenced code block.

Markdown lint (MD040) expects a language on fenced blocks; use a generic language like text for the directory tree example.

-```
+```text
 services/
 └── myservice.v1/
     ├── service.proto                    # Proto service definition
     ├── QueryGetItem.graphql            # GraphQL query operation
     └── MutationCreateItem.graphql      # GraphQL mutation operation
-```
+```
router/pkg/config/config.schema.json (1)

2137-2204: Clarify schema semantics for global vs per-service providers in connect_rpc.

The connect_rpc block enforces services_provider_id when enabled: true, but the services items’ proto_provider_id / operations_provider_id remain unconstrained (optional, no minItems, no relationship with the global provider). This makes it easy to express configs the runtime may not actually support (e.g., mixed or partially-specified overrides).

Consider tightening the schema and/or comments here to describe the intended model more explicitly (global-only vs per-service overrides vs required both), and only expose per-service provider fields once they’re fully wired through the Go configuration if that’s not already the case.

router-tests/testdata/connectrpc/services/employee.v1/service.proto (1)

39-52: Fix minor grammar in mood mutation comment.

The comment reads This mutation update the mood of an employee. — consider changing to This mutation updates the mood of an employee. (or similar) so generated docs/readers don’t stumble on it.

router/pkg/connectrpc/handler_test.go (1)

112-178: Consider nested subtests for better organization.

The test cases in TestHandleRPC could benefit from nested subtests similar to TestExecuteGraphQL (lines 184-220), which uses clear hierarchical naming like "forwarding headers from context" → "should forward listed headers". This improves readability and makes test failures easier to diagnose.

router/pkg/connectrpc/validator.go (1)

217-237: Add parsing validation for int64 string representation.

When int64 values are provided as strings (per protobuf JSON mapping spec), the validation only checks the type but doesn't verify the string contains a valid integer. This would accept invalid values like "abc" or "12.34".

Consider adding parsing validation:

 case string:
-    // String representation is valid for int64
+    // Validate string can be parsed as int64
+    if _, err := strconv.ParseInt(v, 10, 64); err != nil {
+        return &ValidationError{
+            Field:   fieldPath,
+            Message: fmt.Sprintf("Int64 cannot represent value: %v", value),
+        }
+    }

Apply similar validation for uint64 at lines 255-272.
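
A standalone sketch of the suggested check, covering both the int64 and uint64 cases, could look like this. The helper names and the error wording are illustrative assumptions; the PR's validator returns its own ValidationError type rather than a plain error.

```go
package main

import (
	"fmt"
	"strconv"
)

// validateInt64String reports an error when s is not a base-10 integer that
// fits in int64. Hypothetical helper, not the validator's actual API.
func validateInt64String(fieldPath, s string) error {
	if _, err := strconv.ParseInt(s, 10, 64); err != nil {
		return fmt.Errorf("%s: Int64 cannot represent value: %q", fieldPath, s)
	}
	return nil
}

// validateUint64String is the unsigned counterpart; it rejects negative input.
func validateUint64String(fieldPath, s string) error {
	if _, err := strconv.ParseUint(s, 10, 64); err != nil {
		return fmt.Errorf("%s: Uint64 cannot represent value: %q", fieldPath, s)
	}
	return nil
}

func main() {
	for _, v := range []string{"42", "abc", "12.34", "9223372036854775808"} {
		fmt.Printf("%-22s int64 err: %v\n", v, validateInt64String("id", v))
	}
	fmt.Println(validateUint64String("count", "-1"))
}
```

Both strconv functions also reject out-of-range values, so the same call covers the overflow case as well as non-numeric input.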

router/pkg/connectrpc/vanguard_service.go (1)

218-231: Ensure Connect error messages exposed in JSON don’t leak internal details.

This new Connect-style JSON error handling is a big improvement over writing err.Error() directly, but writeConnectError still surfaces connectErr.Message() verbatim to clients for all error codes:

  • Any connect.NewError created inside RPCHandler.HandleRPC (or deeper) that embeds low-level details (service/method names, internal URLs, backend messages, etc.) will have those details echoed back in the "message" field.
  • For internal/server-side failures (e.g. CodeInternal, CodeUnknown, CodeDataLoss, possibly some CodeUnavailable/CodeDeadlineExceeded cases), this can still expose implementation details that should ideally live only in logs.

Consider tightening this in one of these ways:

  • Differentiate “public” vs “internal” codes.
    For clearly client-facing errors (e.g. InvalidArgument, Unauthenticated, PermissionDenied, NotFound, FailedPrecondition, OutOfRange, Aborted, ResourceExhausted), keep using connectErr.Message() as-is (assuming upstream uses user-oriented messages).
    For internal-ish codes (e.g. Internal, Unknown, DataLoss, and maybe some Unavailable/DeadlineExceeded), return a fixed generic message like "internal server error" while:

    • Logging the full connectErr (including cause and metadata) with vs.logger.Error.
    • Optionally preserving structured fields like graphql_errors in the JSON if they’re already curated/safe.
  • Align mapping & tests.
    Since this path uses connectCodeToHTTPStatus, it might be worth a small table-driven test that asserts your connectCodeToHTTPStatus and httpStatusToConnectCode stay in sync for all codes you intend to support and that “internal” codes are the ones that get generic messages.

This keeps the improved Connect error shape and GraphQL metadata while reducing the chance of accidental information leakage from lower layers.

Also applies to: 244-285
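
The public-versus-internal split suggested above can be sketched as follows. To stay self-contained, the example uses a local code type holding the Connect protocol's wire names instead of importing the connect package. Which codes count as internal, and the generic message text, are assumptions to adjust.

```go
package main

import "fmt"

// code stands in for connect.Code in this sketch; the values are the
// Connect protocol's wire names for those codes.
type code string

const (
	codeInvalidArgument code = "invalid_argument"
	codeNotFound        code = "not_found"
	codeInternal        code = "internal"
	codeUnknown         code = "unknown"
	codeDataLoss        code = "data_loss"
)

// internalCodes lists codes whose messages should not reach clients.
// The exact membership is an assumption for illustration.
var internalCodes = map[code]bool{
	codeInternal: true,
	codeUnknown:  true,
	codeDataLoss: true,
}

// publicMessage returns the message that is safe to place in the JSON error
// body. The caller is expected to log the original message separately.
func publicMessage(c code, msg string) string {
	if internalCodes[c] {
		return "internal server error"
	}
	return msg
}

func main() {
	fmt.Println(publicMessage(codeNotFound, "employee 42 not found"))
	fmt.Println(publicMessage(codeInternal, "dial tcp 10.0.0.7:4000: connection refused"))
}
```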

router/pkg/connectrpc/samples/services/employee.v1/service.proto (1)

50-51: Fix grammatical error in comment on update_mood.

The comment still reads This mutation update the mood of an employee.; the verb should agree with the subject.

-  // This mutation update the mood of an employee.
+  // This mutation updates the mood of an employee.
🧹 Nitpick comments (6)
protographic/tests/operations/operation-validation.test.ts (2)

232-245: Slight duplication in camelCase scenarios

You have two tests exercising camelCase (getUser / getUserById) with very similar setup; one checks rejection and the other focuses on the helpful message. Consider consolidating into a single test that asserts both that an error is thrown and that the message contains the examples, or change the second test to reuse a shared helper to avoid repetition.

Also applies to: 292-305


306-360: Add positive-path tests for PascalCase mutation/subscription names

You validate that non-PascalCase names for mutation and subscription are rejected, but there are no corresponding tests proving that valid PascalCase names (e.g., CreateUser, OnMessageAdded) are accepted. Adding those would close the loop on behavior across all operation types and guard against over‑restrictive changes to the validator.

protographic/src/operation-to-proto.ts (1)

304-304: Nit: Duplicate step number in comments.

Line 294 and line 304 both use "4." as the step number. Consider renumbering for consistency.

-    // 4. Create request message from variables
+    // 5. Create request message from variables
router/pkg/connectrpc/proto_loader_test.go (1)

11-107: Good coverage of ProtoLoader behavior; only tiny polish possible.

The tests nicely cover service discovery, method metadata, and non-streaming guarantees for query methods. If you want to tighten things further later, you could:

  • Factor "samples/services/employee.v1" into a small helper/const to avoid repetition.
  • Add a negative-case test for unknown service or method to lock down the error contract.

Both are optional; current tests are already solid.

router/pkg/config/connectrpc_test.go (1)

12-203: Comprehensive ConnectRPC config tests; behavior is well-anchored.

Zero-value, YAML + env, integration, and multi-file merge paths are all exercised cleanly, which should make future refactors to ConnectRPCConfiguration or LoadConfig safer. An optional future enhancement could be a test that bootstraps ConnectRPC purely from CONNECT_RPC_* env vars (no YAML), but the current suite already covers the critical semantics.

router/pkg/connectrpc/vanguard_service.go (1)

124-127: Clarify expected request path format between handler comment and extractMethodName.

The comments and implementation disagree on the path shape:

  • Lines 124–126: createServiceHandler says the handler will receive requests at paths like /Method (without the service prefix).
  • Lines 288–307: extractMethodName’s comment and logic expect [/]package.Service/Method and verify that parts[0] == serviceName.

If Vanguard actually passes paths including the fully-qualified service name (e.g. /employee.v1.EmployeeService/GetEmployeeById), then the comment on lines 124–126 is stale and should be updated. If instead the handler really sees /Method, extractMethodName will always return "", and every call will be treated as CodeNotFound.

Please confirm the actual r.URL.Path format as seen by this handler under Vanguard and either:

  • Update the comment to reflect the package.Service/Method format (likely case), or
  • Adjust extractMethodName to support the /Method format if that’s what Vanguard supplies.

Also applies to: 288-307
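
If the handler has to tolerate both path shapes while the actual format is being confirmed, one possible shape for the lookup is sketched below. The function signature and the fallback for a bare /Method path are assumptions, not the PR's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// extractMethodName returns the RPC method for either accepted path shape:
//
//	[/]package.Service/Method  (service must equal serviceName)
//	[/]Method                  (assumed fallback, if the prefix is stripped)
//
// It returns "" when the path matches neither shape.
func extractMethodName(path, serviceName string) string {
	parts := strings.Split(strings.TrimPrefix(path, "/"), "/")
	switch len(parts) {
	case 2:
		if parts[0] == serviceName && parts[1] != "" {
			return parts[1]
		}
	case 1:
		// A lone segment equal to the service name is not a method.
		if parts[0] != "" && parts[0] != serviceName {
			return parts[0]
		}
	}
	return ""
}

func main() {
	const svc = "employee.v1.EmployeeService"
	fmt.Println(extractMethodName("/"+svc+"/GetEmployeeById", svc))
	fmt.Println(extractMethodName("/GetEmployeeById", svc))
	fmt.Println(extractMethodName("/other.v1.Service/GetEmployeeById", svc) == "")
}
```

Accepting both shapes is only a stopgap; once the real r.URL.Path format is known, keeping a single strict shape makes the CodeNotFound path easier to reason about.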

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6836b37 and c1b1b30.

⛔ Files ignored due to path filters (3)
  • router-tests/go.sum is excluded by !**/*.sum
  • router-tests/testdata/connectrpc/client/employee.v1/service.pb.go is excluded by !**/*.pb.go
  • router/go.sum is excluded by !**/*.sum
📒 Files selected for processing (57)
  • cli/src/commands/grpc-service/commands/generate.ts (0 hunks)
  • protographic/src/operation-to-proto.ts (2 hunks)
  • protographic/tests/operations/operation-validation.test.ts (1 hunks)
  • router-tests/connectrpc/connectrpc_client_test.go (1 hunks)
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1 hunks)
  • router-tests/connectrpc/connectrpc_test.go (1 hunks)
  • router-tests/connectrpc/connectrpc_test_helpers.go (1 hunks)
  • router-tests/go.mod (3 hunks)
  • router-tests/testdata/connectrpc/README.md (1 hunks)
  • router-tests/testdata/connectrpc/buf.gen.yaml (1 hunks)
  • router-tests/testdata/connectrpc/buf.yaml (1 hunks)
  • router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/MutationUpdateEmployeeMood.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeeById.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeeByPets.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeeWithMood.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployees.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeesByPetsInlineFragment.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeesByPetsNamedFragment.graphql (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/service.proto (1 hunks)
  • router-tests/testdata/connectrpc/services/employee.v1/service.proto.lock.json (1 hunks)
  • router/connect.config.yaml (1 hunks)
  • router/core/router.go (5 hunks)
  • router/core/router_config.go (4 hunks)
  • router/core/supervisor_instance.go (1 hunks)
  • router/go.mod (3 hunks)
  • router/pkg/config/config.go (2 hunks)
  • router/pkg/config/config.schema.json (1 hunks)
  • router/pkg/config/connectrpc_test.go (1 hunks)
  • router/pkg/config/testdata/config_defaults.json (1 hunks)
  • router/pkg/config/testdata/config_full.json (1 hunks)
  • router/pkg/connectrpc/constructor_validation_test.go (1 hunks)
  • router/pkg/connectrpc/error_handling_test.go (1 hunks)
  • router/pkg/connectrpc/handler.go (1 hunks)
  • router/pkg/connectrpc/handler_test.go (1 hunks)
  • router/pkg/connectrpc/operation_registry.go (1 hunks)
  • router/pkg/connectrpc/operation_registry_test.go (1 hunks)
  • router/pkg/connectrpc/proto_loader.go (1 hunks)
  • router/pkg/connectrpc/proto_loader_test.go (1 hunks)
  • router/pkg/connectrpc/samples/service.graphqls (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/MutationUpdateEmployeeMood.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeById.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeByPets.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeWithMood.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployees.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeesByPetsInlineFragment.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeesByPetsNamedFragment.graphql (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/service.proto (1 hunks)
  • router/pkg/connectrpc/samples/services/employee.v1/service.proto.lock.json (1 hunks)
  • router/pkg/connectrpc/server.go (1 hunks)
  • router/pkg/connectrpc/server_test.go (1 hunks)
  • router/pkg/connectrpc/service_discovery.go (1 hunks)
  • router/pkg/connectrpc/service_discovery_test.go (1 hunks)
  • router/pkg/connectrpc/test_helpers.go (1 hunks)
  • router/pkg/connectrpc/validator.go (1 hunks)
  • router/pkg/connectrpc/vanguard_service.go (1 hunks)
  • router/pkg/connectrpc/vanguard_service_test.go (1 hunks)
💤 Files with no reviewable changes (1)
  • cli/src/commands/grpc-service/commands/generate.ts
🚧 Files skipped from review as they are similar to previous changes (26)
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeeById.graphql
  • router/pkg/connectrpc/samples/services/employee.v1/MutationUpdateEmployeeMood.graphql
  • router/pkg/config/config.go
  • router/core/router_config.go
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeWithMood.graphql
  • router/pkg/config/testdata/config_defaults.json
  • router-tests/testdata/connectrpc/services/employee.v1/MutationUpdateEmployeeMood.graphql
  • router/pkg/config/testdata/config_full.json
  • router/pkg/connectrpc/operation_registry_test.go
  • router-tests/connectrpc/connectrpc_test.go
  • router/connect.config.yaml
  • router/core/supervisor_instance.go
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeesByPetsInlineFragment.graphql
  • router/pkg/connectrpc/constructor_validation_test.go
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployees.graphql
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeByPets.graphql
  • router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeeById.graphql
  • router/pkg/connectrpc/samples/services/employee.v1/service.proto.lock.json
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeesByPetsNamedFragment.graphql
  • router/pkg/connectrpc/test_helpers.go
  • router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeeByPets.graphql
  • router/pkg/connectrpc/error_handling_test.go
  • router-tests/testdata/connectrpc/buf.gen.yaml
  • router/pkg/connectrpc/vanguard_service_test.go
  • router/go.mod
  • router/pkg/connectrpc/operation_registry.go
🧰 Additional context used
🧠 Learnings (18)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2141
File: router-tests/http_subscriptions_test.go:17-55
Timestamp: 2025-08-28T09:17:49.477Z
Learning: The Cosmo router uses a custom, intentionally rigid multipart implementation for GraphQL subscriptions. The multipart parsing in test files should remain strict and not be made more tolerant, as this rigidity is by design.
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.

Applied to files:

  • router/pkg/connectrpc/server_test.go
  • router/pkg/config/connectrpc_test.go
  • router-tests/connectrpc/connectrpc_client_test.go
  • router-tests/go.mod
  • router/pkg/connectrpc/handler_test.go
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.

Applied to files:

  • router/pkg/connectrpc/server_test.go
  • router-tests/go.mod
  • router/core/router.go
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
  • router-tests/connectrpc/connectrpc_test_helpers.go
  • router/pkg/connectrpc/vanguard_service.go
  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
📚 Learning: 2025-07-30T09:29:04.257Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/core/operation_processor.go:0-0
Timestamp: 2025-07-30T09:29:04.257Z
Learning: GraphQL operation names don't allow characters with more than 1 code point, so string length operations and slicing work correctly for both byte and character counting in GraphQL operation name processing.

Applied to files:

  • protographic/tests/operations/operation-validation.test.ts
  • protographic/src/operation-to-proto.ts
📚 Learning: 2025-09-19T15:08:03.085Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2223
File: router/cmd/main.go:32-33
Timestamp: 2025-09-19T15:08:03.085Z
Learning: pyroscope-go's Config.ServerAddress field accepts both full URLs (e.g., "http://localhost:4040") and simple host:port format (e.g., "localhost:4040"). The library handles URL normalization internally, so both formats work correctly.

Applied to files:

  • router/pkg/connectrpc/server.go
📚 Learning: 2025-07-21T14:46:34.879Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/authentication/jwks_token_decoder.go:80-106
Timestamp: 2025-07-21T14:46:34.879Z
Learning: In the Cosmo router project, required field validation for JWKS configuration (Secret, Algorithm, KeyId) is handled at the JSON schema level in config.schema.json rather than through runtime validation in the Go code at router/pkg/authentication/jwks_token_decoder.go.

Applied to files:

  • router/pkg/connectrpc/validator.go
  • router/pkg/config/config.schema.json
📚 Learning: 2025-07-21T15:06:36.664Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/config/config.schema.json:1637-1644
Timestamp: 2025-07-21T15:06:36.664Z
Learning: In the Cosmo router project, when extending JSON schema validation for security-sensitive fields like JWKS secrets, backwards compatibility is maintained by implementing warnings in the Go code rather than hard validation constraints in the schema. This allows existing configurations to continue working while alerting users to potential security issues.

Applied to files:

  • router/pkg/connectrpc/validator.go
  • router/pkg/config/config.schema.json
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.25 minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.23+ minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-07-30T09:29:46.660Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/pkg/config/config.schema.json:0-0
Timestamp: 2025-07-30T09:29:46.660Z
Learning: The "operation_name_trim_limit" configuration property in router/pkg/config/config.schema.json should be placed at the security level as a sibling to complexity_limits, not inside the complexity_limits object.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-06-30T20:39:02.387Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 1929
File: router/internal/circuit/manager.go:16-25
Timestamp: 2025-06-30T20:39:02.387Z
Learning: In the Cosmo router project, parameter validation for circuit breaker configuration is handled at the JSON schema level rather than through runtime validation methods on structs. The config.schema.json file contains comprehensive validation constraints for circuit breaker parameters.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: In the Cosmo router codebase, JSON schema validation prevents null values in TrafficShapingRules subgraph configurations, making nil checks unnecessary when dereferencing subgraph rule pointers in NewSubgraphTransportOptions.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: router/pkg/config/config.schema.json forbids null values for traffic_shaping.subgraphs: additionalProperties references $defs.traffic_shaping_subgraph_request_rule with type "object". Therefore, in core.NewSubgraphTransportOptions, dereferencing each subgraph rule pointer is safe under schema-validated configs, and a nil-check is unnecessary.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-08-07T12:05:06.775Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2079
File: proto/wg/cosmo/platform/v1/platform.proto:39-45
Timestamp: 2025-08-07T12:05:06.775Z
Learning: In the Cosmo project, the proto fields for schema, mappings, and lock in ProtoInput are intentionally kept as string types rather than bytes because the team works with text data and wants it to be UTF-8 encoded for readability and text processing purposes.

Applied to files:

  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the query body before any APQ operations occur.

Applied to files:

  • router/pkg/connectrpc/handler.go
🧬 Code graph analysis (9)
router/pkg/connectrpc/server_test.go (1)
router/pkg/connectrpc/server.go (2)
  • NewServer (50-166)
  • ServerConfig (21-33)
router/pkg/config/connectrpc_test.go (1)
router/pkg/config/config.go (3)
  • ConnectRPCConfiguration (1012-1017)
  • LoadConfig (1146-1258)
  • Config (1039-1115)
router-tests/connectrpc/connectrpc_client_test.go (2)
router-tests/connectrpc/connectrpc_test_helpers.go (5)
  • NewTestConnectRPCServer (94-144)
  • ConnectRPCServerOptions (77-83)
  • EmployeeGraphQLHandler (222-228)
  • ErrorGraphQLHandler (231-237)
  • HTTPErrorHandler (240-245)
router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)
  • NewEmployeeServiceClient (89-141)
router/pkg/connectrpc/handler_test.go (4)
router/pkg/connectrpc/operation_registry.go (2)
  • NewOperationRegistry (32-41)
  • OperationRegistry (24-29)
router/pkg/connectrpc/proto_loader.go (2)
  • NewProtoLoader (62-72)
  • ProtoLoader (53-59)
router/pkg/connectrpc/handler.go (2)
  • NewRPCHandler (161-194)
  • HandlerConfig (152-158)
router/pkg/connectrpc/test_helpers.go (1)
  • MockHTTPClient (64-71)
router-tests/connectrpc/connectrpc_server_lifecycle_test.go (2)
router-tests/connectrpc/connectrpc_test_helpers.go (2)
  • NewTestConnectRPCServer (94-144)
  • ConnectRPCServerOptions (77-83)
router/pkg/connectrpc/server.go (1)
  • ServerConfig (21-33)
router/pkg/connectrpc/proto_loader_test.go (1)
router/pkg/connectrpc/proto_loader.go (2)
  • ProtoLoader (53-59)
  • NewProtoLoader (62-72)
router-tests/connectrpc/connectrpc_test_helpers.go (1)
router/pkg/connectrpc/server.go (1)
  • ServerConfig (21-33)
router/pkg/connectrpc/vanguard_service.go (2)
router/pkg/connectrpc/handler.go (2)
  • MetaKeyGraphQLErrors (105-105)
  • GraphQLError (129-134)
router/pkg/connectrpc/proto_loader.go (3)
  • ProtoLoader (53-59)
  • ServiceDefinition (17-30)
  • MethodDefinition (33-50)
router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)
router-tests/testdata/connectrpc/client/employee.v1/service.pb.go (24)
  • FindEmployeesByPetsRequest (321-327)
  • FindEmployeesByPetsRequest (342-342)
  • FindEmployeesByPetsRequest (357-359)
  • FindEmployeesByPetsResponse (368-375)
  • FindEmployeesByPetsResponse (390-390)
  • FindEmployeesByPetsResponse (405-407)
  • GetEmployeeByIdRequest (227-233)
  • GetEmployeeByIdRequest (248-248)
  • GetEmployeeByIdRequest (263-265)
  • GetEmployeeByIdResponse (274-280)
  • GetEmployeeByIdResponse (295-295)
  • GetEmployeeByIdResponse (310-312)
  • GetEmployeesRequest (501-505)
  • GetEmployeesRequest (520-520)
  • GetEmployeesRequest (535-537)
  • GetEmployeesResponse (539-545)
  • GetEmployeesResponse (560-560)
  • GetEmployeesResponse (575-577)
  • UpdateEmployeeMoodRequest (124-131)
  • UpdateEmployeeMoodRequest (146-146)
  • UpdateEmployeeMoodRequest (161-163)
  • UpdateEmployeeMoodResponse (179-186)
  • UpdateEmployeeMoodResponse (201-201)
  • UpdateEmployeeMoodResponse (216-218)
🪛 markdownlint-cli2 (0.18.1)
router-tests/testdata/connectrpc/README.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: build-router
  • GitHub Check: image_scan (nonroot)
  • GitHub Check: image_scan
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: integration_test (./events)
  • GitHub Check: Analyze (go)
  • GitHub Check: build_push_image
  • GitHub Check: build_test
🔇 Additional comments (31)
router-tests/testdata/connectrpc/buf.yaml (1)

1-9: Configuration is well-formed and appropriate for testdata.

The Buf v2 configuration is syntactically valid and semantically correct. The module path, lint rules (STANDARD), and breaking change detection (FILE) are suitable for proto file validation in the ConnectRPC testdata context.

protographic/tests/operations/operation-validation.test.ts (1)

201-361: PascalCase validation tests look comprehensive and aligned with behavior

The new suite covers the key accepted/rejected patterns and the specific error messaging; structure and expectations all look consistent with a strict PascalCase rule for operation names.

router/pkg/connectrpc/samples/service.graphqls (1)

1-346: Schema looks structurally sound and well‑suited as a ConnectRPC sample surface

The SDL is internally consistent: all referenced types are defined, interface implementations satisfy their contracts (e.g., Pet/Animal, Hobby, Experience), and the Products union has valid object members. The mix of queries, mutations (including uploads), subscriptions, unions, interfaces, enums, and nested inputs should give good coverage for exercising the new ConnectRPC bridge and service discovery.

As long as this file is wired into the new samples/tests as intended elsewhere in the PR, I don’t see any blocking issues here.

protographic/src/operation-to-proto.ts (2)

267-279: LGTM! PascalCase validation with helpful error messaging.

The regex correctly enforces PascalCase, and the suggested name using upperFirst(camelCase(operationName)) addresses previous feedback. The error message is clear with practical examples.

Note: All-uppercase names like "ID" or "GETUSER" will be rejected—this is correct behavior for enforcing PascalCase conventions.
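
For readers following along in Go, the rule described above can be sketched as follows. The pattern is a hypothetical stand-in, not the regex used in operation-to-proto.ts; it is only assumed to accept names such as GetEmployees and to reject camelCase, snake_case, and all-uppercase names.

```go
package main

import (
	"fmt"
	"regexp"
)

// pascalCase is a hypothetical pattern (not taken from the PR): an uppercase
// letter followed by at least one lowercase letter or digit, then any number
// of further capitalised segments.
var pascalCase = regexp.MustCompile(`^[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]*)*$`)

// isPascalCase reports whether name satisfies the assumed PascalCase rule.
func isPascalCase(name string) bool {
	return pascalCase.MatchString(name)
}

func main() {
	names := []string{"GetEmployees", "getEmployees", "GETUSER", "ID", "get_employees"}
	for _, name := range names {
		fmt.Printf("%-14s %v\n", name, isPascalCase(name))
	}
}
```

Under this assumed pattern, "ID" and "GETUSER" fail because the first capital is not followed by a lowercase letter or digit.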


294-296: LGTM! Direct use of operationName aligns with PascalCase validation.

Since the operation name is already validated as PascalCase, using it directly for methodName ensures exact matching between GraphQL operation names and RPC method names.

router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeesByPetsNamedFragment.graphql (1)

1-20: Well-structured fragment-based query.

Fragment definitions and reuse in FindEmployeesByPetsNamedFragment look consistent and idiomatic; no changes needed.

router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployees.graphql (1)

1-10: LGTM for GetEmployees sample.

The operation shape is clear and minimal, suitable as a canonical example; nothing to adjust.

router-tests/testdata/connectrpc/services/employee.v1/QueryGetEmployeeWithMood.graphql (1)

1-12: Query shape for mood + nested pets looks appropriate.

The selection set exercises both scalar (currentMood) and nested list (details.pets) fields, which is useful for the ConnectRPC E2E tests.

router/pkg/connectrpc/samples/services/employee.v1/QueryGetEmployeesByPetsInlineFragment.graphql (1)

1-14: Inline-fragment variant looks correct.

The inline ... on Details fragment is well-formed and mirrors the named-fragment variant; good complementary sample.

router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)

1-307: Generated EmployeeService client/handler wiring looks consistent.

The connect-go stub (procedure constants, descriptors, client methods, handler routing, unimplemented handler) is internally consistent and matches the proto surface; no manual changes needed here.

router/pkg/connectrpc/server_test.go (1)

12-85: NewServer config/defaulting tests match server behavior.

The tests exercise the key NewServer invariants (required fields, protocol prefixing, listen addr and timeout defaults, nil logger handling) and align with the current implementation; they should give good protection against regressions in the constructor.

router/pkg/connectrpc/service_discovery_test.go (1)

13-352: Service discovery tests comprehensively exercise the intended conventions.

These tests nicely pin down the discovery rules (one proto per dir, no nested protos, unique package.service, recursive operation discovery, and the parsing helpers). They align with the documented ADR-style layout and should make future refactors to DiscoverServices much safer.

router-tests/connectrpc/connectrpc_client_test.go (1)

19-295: Client protocol/error/concurrency tests look solid and race-free.

The tests validate Connect/gRPC/gRPC‑Web behavior, GraphQL/HTTP error mapping to connect.Error codes, and basic concurrency with an atomic request counter. The patterns (h2c client, buffered channel fan‑out/fan‑in) are sound and shouldn’t introduce data races.

router-tests/go.mod (1)

6-6: Verify alignment of connectrpc.com/connect versions between connect-go and router-tests modules.

The router-tests module pins connectrpc.com/connect v1.16.2, but connect-go/go.mod uses v1.17.0. The vanguard and protoreflect versions are consistent across modules, and mcp-go is correctly at v0.36.0. Confirm whether the connect version difference (v1.16.2 vs v1.17.0) is intentional or requires alignment for multi-module compatibility.

router/pkg/connectrpc/handler_test.go (2)

25-110: LGTM: Constructor validation is comprehensive.

The test coverage for NewRPCHandler validation paths is thorough, including error cases, defaults, and protocol normalization.


313-577: LGTM: Variable conversion tests are comprehensive.

The test coverage for proto JSON to GraphQL variable conversion is excellent, covering nested structures, arrays, null handling, empty inputs, and edge cases for the snakeToCamel helper.

router/core/router.go (3)

960-1034: LGTM: ConnectRPC bootstrap follows established patterns.

The ConnectRPC bootstrap flow is well-structured and consistent with the MCP bootstrap pattern. Service discovery, provider resolution, endpoint determination, and error handling are all appropriately implemented.


1551-1557: LGTM: Proper shutdown integration.

The ConnectRPC server shutdown is correctly integrated into the graceful shutdown flow using wg.Go() (available in Go 1.25+), with appropriate error handling and wrapping.


2208-2212: LGTM: Standard option implementation.

The WithConnectRPC option follows the established pattern for router configuration options.

router/pkg/connectrpc/validator.go (1)

54-62: LGTM: Debug logging properly structured.

The validation logging now uses structured logging with appropriate debug level and field names, addressing previous feedback about debug print statements.

router/pkg/connectrpc/proto_loader.go (2)

238-278: LGTM: Local registry addresses previous race condition concerns.

The proto loader now uses a local registry (pl.files) instead of the global registry, eliminating the concurrent registration issues flagged in previous reviews. Each ProtoLoader instance maintains its own registry, making the check-then-register pattern safe.


138-175: LGTM: Package uniqueness validation correctly implemented.

The package uniqueness logic now properly tracks newly added services using an existingServices map, addressing previous feedback about broken validation. The implementation correctly enforces unique proto package names across all service directories.

router/pkg/connectrpc/service_discovery.go (1)

46-153: LGTM: Service discovery is well-designed.

The convention-based service discovery correctly handles both flat and nested directory structures, enforces the one-proto-per-directory rule, validates package.service uniqueness, and prevents duplicate discovery with filepath.SkipDir. Error handling and logging are comprehensive.

router-tests/connectrpc/connectrpc_test_helpers.go (1)

18-245: LGTM: Test helpers are well-designed.

The test infrastructure provides comprehensive helpers with proper resource management. The use of t.Cleanup() for automatic teardown, idempotent Close() via the cleanupDone flag, and dynamic port allocation via localhost:0 are all best practices for test reliability.

router/pkg/connectrpc/server.go (3)

49-166: LGTM: Server initialization is well-structured.

The constructor properly validates configuration, applies sensible defaults, and initializes components in the correct order. The ProtoLoader is created before initializeComponents(), addressing previous feedback about initialization ordering.


234-299: LGTM: Reload implementation addresses previous feedback.

The Reload() method now correctly:

  1. Creates a fresh ProtoLoader before calling initializeComponents() (lines 249-257), ensuring the handler receives fresh proto definitions
  2. Wraps the handler with h2c.NewHandler() (line 295), preserving HTTP/2 support after reload

Both issues flagged in previous reviews have been resolved.


358-391: LGTM: Response writer properly implements required interfaces.

The responseWriter wrapper correctly implements http.Flusher, http.Pusher, and http.Hijacker interfaces required for gRPC streaming, while safely capturing the response status code with appropriate fallbacks when the underlying writer doesn't support these interfaces.

router/pkg/connectrpc/handler.go (4)

17-54: LGTM: HTTP status to Connect code mapping is appropriate.

The status code mapping follows Connect RPC conventions and covers all common HTTP status codes with a safe default to CodeUnknown.


290-371: LGTM: Variable conversion properly implements recursive transformation.

The conversion logic now correctly handles nested structures using convertKeysRecursive(), and snakeToCamel() uses strings.Builder with pre-allocation for efficiency. Previous feedback about recursive conversion and performance has been addressed.
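
A rough sketch of what a recursive key conversion of this kind can look like. The function names match those mentioned above, but the bodies are assumptions: in particular, the handling of leading, trailing, and repeated underscores here is a guess and may not match handler.go.

```go
package main

import (
	"fmt"
	"strings"
)

// snakeToCamel converts an ASCII snake_case key to camelCase. The builder is
// pre-sized because the output is never longer than the input.
func snakeToCamel(s string) string {
	var b strings.Builder
	b.Grow(len(s))
	upperNext := false
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c == '_' {
			// Assumption: a leading underscore is dropped without capitalising.
			upperNext = b.Len() > 0
			continue
		}
		if upperNext && c >= 'a' && c <= 'z' {
			c -= 'a' - 'A'
		}
		upperNext = false
		b.WriteByte(c)
	}
	return b.String()
}

// convertKeysRecursive rewrites map keys at every nesting level, walking
// through both objects and arrays; scalars are returned unchanged.
func convertKeysRecursive(v any) any {
	switch t := v.(type) {
	case map[string]any:
		out := make(map[string]any, len(t))
		for k, val := range t {
			out[snakeToCamel(k)] = convertKeysRecursive(val)
		}
		return out
	case []any:
		out := make([]any, len(t))
		for i, val := range t {
			out[i] = convertKeysRecursive(val)
		}
		return out
	default:
		return v
	}
}

func main() {
	in := map[string]any{
		"employee_id": 1,
		"pet_list":    []any{map[string]any{"pet_name": "Rex"}},
	}
	fmt.Println(convertKeysRecursive(in))
}
```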


373-434: LGTM: Error classification follows GraphQL best practices.

The CRITICAL vs PARTIAL error classification correctly distinguishes between complete failures and partial successes, following Relay-inspired patterns. Using CodeUnknown for GraphQL errors (rather than CodeInternal) is appropriate, and the metadata embedding enables clients to access both error details and partial data.
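
The distinction can be sketched as below, assuming the rule is "errors with no data is CRITICAL, errors alongside data is PARTIAL". The classify function and the severity type are hypothetical names, and the sketch omits the Connect error construction and metadata embedding.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// graphQLResponse keeps data and errors raw; only their presence matters.
type graphQLResponse struct {
	Data   json.RawMessage   `json:"data"`
	Errors []json.RawMessage `json:"errors"`
}

type severity string

const (
	severityNone     severity = "NONE"
	severityPartial  severity = "PARTIAL"
	severityCritical severity = "CRITICAL"
)

// classify inspects a GraphQL response body and reports how severe its
// errors are, under the assumed rule stated in the lead-in.
func classify(body []byte) (severity, error) {
	var resp graphQLResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	if len(resp.Errors) == 0 {
		return severityNone, nil
	}
	data := bytes.TrimSpace(resp.Data)
	if len(data) == 0 || bytes.Equal(data, []byte("null")) {
		return severityCritical, nil // errors and nothing usable in data
	}
	return severityPartial, nil // errors, but some data came back
}

func main() {
	bodies := []string{
		`{"data":{"employee":{"id":1}}}`,
		`{"data":{"employee":null},"errors":[{"message":"mood unavailable"}]}`,
		`{"data":null,"errors":[{"message":"unauthorized"}]}`,
	}
	for _, b := range bodies {
		s, err := classify([]byte(b))
		fmt.Println(s, err)
	}
}
```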


436-542: LGTM: GraphQL execution is comprehensive and well-designed.

The execution flow properly handles header propagation (filtering sensitive headers), HTTP error mapping, GraphQL error classification, and edge cases (empty responses). The logic correctly distinguishes between CRITICAL (no data) and PARTIAL (data + errors) scenarios.

@asoorm asoorm force-pushed the ahmet/eng-8277-connect-rpc-handler-final branch from c1b1b30 to 92ba712 Compare December 3, 2025 08:47

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (3)
router-tests/testdata/connectrpc/README.md (1)

16-22: Add language identifier to fenced code block.

The fenced code block is missing a language specifier. Add text or plaintext to comply with Markdown linting rules.

-```
+```text
 services/
 └── myservice.v1/
     ├── service.proto                    # Proto service definition
     ├── QueryGetItem.graphql            # GraphQL query operation
     └── MutationCreateItem.graphql      # GraphQL mutation operation

router/pkg/connectrpc/constructor_validation_test.go (1)

89-89: Tests use relative paths that may not resolve correctly from the test working directory.

The paths `"samples/services"` and `"samples/services/employee.v1"` are relative to the test file location. When CI runs tests from the `router/` directory (per `router-ci.yaml`), these paths should be `"pkg/connectrpc/samples/services"` instead.

However, if tests are run from `router/pkg/connectrpc/`, they would work. Verify which working directory is used during CI test execution.




```shell
#!/bin/bash
# Check how tests are run in CI and verify the samples directory exists
echo "=== Checking test working directory in CI ==="
grep -A10 "go test\|gotestsum" .github/workflows/router-ci.yaml | head -20

echo -e "\n=== Verifying samples directory structure ==="
ls -la router/pkg/connectrpc/samples/services/ 2>/dev/null || echo "Path not found from repo root"

echo -e "\n=== Check if there's a go.mod in the connectrpc package ==="
cat router/pkg/connectrpc/go.mod 2>/dev/null || echo "No separate go.mod"
```

Also applies to: 100-100, 165-165, 175-175

router/pkg/config/config.schema.json (1)

2137-2204: Remove dead code schema fields for per-service provider overrides.

The services array (lines 2179–2197) and its nested proto_provider_id and operations_provider_id fields are defined in the schema but have no corresponding Go struct fields in ConnectRPCConfiguration. The implementation uses convention-based auto-discovery via DiscoverServices(), which scans the filesystem—it never reads or parses per-service provider configuration. Users who configure these fields will have them silently discarded during unmarshalling, creating a confusing schema that misrepresents what the router actually supports.

Either remove these unused fields from the schema, or add a comment documenting that they are reserved for future per-service provider support and clarify in the services array description that services are currently auto-discovered by convention.

🧹 Nitpick comments (13)
protographic/src/operation-to-proto.ts (1)

294-304: Fix duplicate step number in comments.

Lines 294 and 304 both label their step "4." in the comments. As the diff below shows, they should be renumbered "5." and "6." so the step sequence stays consistent.

-    // 4. Create method name from operation name
+    // 5. Create method name from operation name
     // Use operation name as-is to ensure exact matching (no transformation)
     let methodName = operationName;

     // Add operation type prefix if requested
     if (this.prefixOperationType) {
       const operationTypePrefix = upperFirst(node.operation.toLowerCase());
       methodName = `${operationTypePrefix}${methodName}` as any;
     }

-    // 4. Create request message from variables
+    // 6. Create request message from variables
router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1)

121-137: Consider adding a test with an already-expired context.

The graceful shutdown test verifies completion within a reasonable timeout, but doesn't test behavior when the context is already cancelled or its deadline has already passed. Consider adding a case:

t.Run("stop with already cancelled context", func(t *testing.T) {
    ts := NewTestConnectRPCServer(t, ConnectRPCServerOptions{})
    require.NoError(t, ts.Start())

    ctx, cancel := context.WithCancel(context.Background())
    cancel() // Cancel immediately

    err := ts.Server.Stop(ctx)
    assert.Error(t, err) // or NoError if shutdown proceeds anyway
})
router/pkg/connectrpc/error_handling_test.go (1)

305-376: Consider adding edge case tests for successful responses.

The success path tests cover basic scenarios. Consider adding cases for:

  • Empty data object: {"data": {}}
  • Null values in successful response: {"data": {"user": null}}
  • Response with extensions: {"data": {...}, "extensions": {...}}
{
    name: "Successful query with null field",
    graphqlResponse: `{"data": {"user": null}}`,
    expectedData: `{"user": null}`,
},
router/pkg/connectrpc/constructor_validation_test.go (2)

12-13: Comment could be more informative.

The comment states these tests are "consolidated" from other files, but those source files aren't mentioned. If they were removed, consider updating to clarify this is now the canonical location.


137-144: Subtests don't run in parallel.

The outer test uses t.Parallel(), but individual table-driven subtests don't. For consistency and faster execution, consider adding t.Parallel() to subtests.

 	for _, tt := range tests {
 		t.Run(tt.name, func(t *testing.T) {
+			t.Parallel()
 			result, err := tt.constructor()
 			assert.Error(t, err)
 			assert.Nil(t, result)
 			assert.Contains(t, err.Error(), tt.wantErr)
 		})
 	}
router/pkg/connectrpc/handler_test.go (1)

125-129: Direct manipulation of internal map bypasses encapsulation.

The test directly assigns to operationRegistry.operations, accessing an unexported field. While this works for same-package tests, consider adding a test helper method like AddTestOperation(service, name string, op *Operation) on OperationRegistry to make tests more maintainable if the internal structure changes.
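
One possible shape for such a helper is sketched below. The struct layout (a mutex plus a service-to-name map) and the Get method are assumptions made for the example; only the AddTestOperation signature comes from the suggestion above.

```go
package main

import (
	"fmt"
	"sync"
)

// Operation and OperationRegistry are simplified, hypothetical versions of
// the real types; the actual field layout may differ.
type Operation struct {
	Name  string
	Query string
}

type OperationRegistry struct {
	mu         sync.RWMutex
	operations map[string]map[string]*Operation // service -> operation name -> operation
}

// AddTestOperation registers op under service/name. Tests call this instead
// of writing to the unexported map, so a change to the internal structure
// only needs to be absorbed here.
func (r *OperationRegistry) AddTestOperation(service, name string, op *Operation) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.operations == nil {
		r.operations = make(map[string]map[string]*Operation)
	}
	if r.operations[service] == nil {
		r.operations[service] = make(map[string]*Operation)
	}
	r.operations[service][name] = op
}

// Get looks up an operation scoped to a single service.
func (r *OperationRegistry) Get(service, name string) (*Operation, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	op, ok := r.operations[service][name]
	return op, ok
}

func main() {
	reg := &OperationRegistry{}
	reg.AddTestOperation("employee.v1.EmployeeService", "GetEmployees",
		&Operation{Name: "GetEmployees", Query: "query GetEmployees { employees { id } }"})

	op, ok := reg.Get("employee.v1.EmployeeService", "GetEmployees")
	fmt.Println(ok, op.Name)
}
```

If the helper should not ship in the production API, it could live in an export_test.go file in the same package.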

router/core/router.go (3)

982-992: Discovered services are not passed to the server.

DiscoverServices is called and logged, but the result (discoveredServices) is only used to log the count. The ServerConfig doesn't receive the discovered services—it only gets ServicesDir and rediscovers internally.

This creates redundant work. Consider either:

  1. Passing discoveredServices to NewServer to avoid rediscovery
  2. Removing the DiscoverServices call here if the server handles discovery internally

994-1000: Duplicate endpoint resolution logic with MCP.

The GraphQL endpoint resolution pattern is duplicated between MCP (lines 935-942) and ConnectRPC (lines 994-1000). Consider extracting a helper function.

+func (r *Router) resolveGraphQLEndpoint(customURL string) string {
+	if customURL != "" {
+		return customURL
+	}
+	return path.Join(r.listenAddr, r.graphqlPath)
+}
+
 // In MCP bootstrap:
-var routerGraphQLEndpoint string
-if r.mcp.RouterURL != "" {
-    routerGraphQLEndpoint = r.mcp.RouterURL
-} else {
-    routerGraphQLEndpoint = path.Join(r.listenAddr, r.graphqlPath)
-}
+routerGraphQLEndpoint := r.resolveGraphQLEndpoint(r.mcp.RouterURL)

 // In ConnectRPC bootstrap:
-var routerGraphQLEndpoint string
-if r.connectRPC.GraphQLEndpoint != "" {
-    routerGraphQLEndpoint = r.connectRPC.GraphQLEndpoint
-} else {
-    routerGraphQLEndpoint = path.Join(r.listenAddr, r.graphqlPath)
-}
+routerGraphQLEndpoint := r.resolveGraphQLEndpoint(r.connectRPC.GraphQLEndpoint)

Also applies to: 935-942


1010-1013: Consider reducing log verbosity.

Multiple Info-level logs for the same operation (creating, created, starting, started) may clutter production logs. Consider using Debug level for intermediate steps and keeping only the final success message at Info level.

-		r.logger.Info("Creating ConnectRPC server",
+		r.logger.Debug("Creating ConnectRPC server",
 			zap.String("services_dir", servicesDir),
 			zap.String("graphql_endpoint", routerGraphQLEndpoint),
 			zap.String("listen_addr", r.connectRPC.Server.ListenAddr))
router/pkg/connectrpc/validator.go (1)

299-309: Enum validation could verify the value exists in the enum definition.

Currently, any string or number is accepted for enum fields. For stricter validation, consider checking if the value matches a defined enum value or number using field.GetEnumType().FindValueByName() or FindValueByNumber().

router/pkg/connectrpc/proto_loader.go (1)

324-328: GetServices exposes internal map without defensive copy.

The comment says the map "should be treated as read-only," but callers could still mutate it. Consider returning a defensive copy if mutation would cause issues, or document more explicitly that this is intentional for performance.

 // GetServices returns all loaded service definitions.
-// The returned map should be treated as read-only to prevent accidental mutation.
+// WARNING: Returns the internal map directly for performance. Callers MUST NOT
+// modify the returned map or its values. Use GetService() for safe single lookups.
 func (pl *ProtoLoader) GetServices() map[string]*ServiceDefinition {
 	return pl.services
 }

Alternatively, return a copy:

func (pl *ProtoLoader) GetServices() map[string]*ServiceDefinition {
	result := make(map[string]*ServiceDefinition, len(pl.services))
	for k, v := range pl.services {
		result[k] = v
	}
	return result
}
router/pkg/connectrpc/server.go (2)

184-190: Consider making server timeouts configurable.

The ReadTimeout, WriteTimeout, and IdleTimeout are hardcoded. While the defaults are reasonable, consider exposing these in ServerConfig for consistency with RequestTimeout, especially if users need to tune for long-running gRPC streams.


328-334: statusCode is captured but never used.

The responseWriter captures the status code at line 330, but it's never read after being set. Either utilize it for logging/metrics or simplify by removing the unused field.

If intended for future use, consider adding a comment. Otherwise, simplify:

-	wrappedTranscoder := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		// Create a response writer that captures the status code and implements required interfaces
-		rw := &responseWriter{ResponseWriter: w, statusCode: 200}
-
-		// The transcoder handles protocol translation and routing
-		s.transcoder.ServeHTTP(rw, r)
-	})
+	wrappedTranscoder := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		// Create a response writer that implements required interfaces for gRPC streaming
+		rw := &responseWriter{ResponseWriter: w}
+		s.transcoder.ServeHTTP(rw, r)
+	})
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c1b1b30 and 92ba712.

⛔ Files ignored due to path filters (1)
  • router-tests/testdata/connectrpc/client/employee.v1/service.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (17)
  • protographic/src/operation-to-proto.ts (2 hunks)
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1 hunks)
  • router-tests/testdata/connectrpc/README.md (1 hunks)
  • router-tests/testdata/connectrpc/buf.gen.yaml (1 hunks)
  • router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1 hunks)
  • router/core/router.go (5 hunks)
  • router/core/router_config.go (4 hunks)
  • router/pkg/config/config.go (2 hunks)
  • router/pkg/config/config.schema.json (1 hunks)
  • router/pkg/connectrpc/constructor_validation_test.go (1 hunks)
  • router/pkg/connectrpc/error_handling_test.go (1 hunks)
  • router/pkg/connectrpc/handler.go (1 hunks)
  • router/pkg/connectrpc/handler_test.go (1 hunks)
  • router/pkg/connectrpc/proto_loader.go (1 hunks)
  • router/pkg/connectrpc/server.go (1 hunks)
  • router/pkg/connectrpc/server_test.go (1 hunks)
  • router/pkg/connectrpc/validator.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • router/core/router_config.go
  • router-tests/testdata/connectrpc/buf.gen.yaml
  • router/pkg/connectrpc/server_test.go
  • router/pkg/config/config.go
  • router/pkg/connectrpc/handler.go
🧰 Additional context used
🧠 Learnings (16)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2141
File: router-tests/http_subscriptions_test.go:17-55
Timestamp: 2025-08-28T09:17:49.477Z
Learning: The Cosmo router uses a custom, intentionally rigid multipart implementation for GraphQL subscriptions. The multipart parsing in test files should remain strict and not be made more tolerant, as this rigidity is by design.
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
  • router/pkg/connectrpc/error_handling_test.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go
  • router/pkg/connectrpc/constructor_validation_test.go
  • router/pkg/connectrpc/error_handling_test.go
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.

Applied to files:

  • router/core/router.go
  • router/pkg/config/config.schema.json
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.25 minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.23+ minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-07-21T15:06:36.664Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/config/config.schema.json:1637-1644
Timestamp: 2025-07-21T15:06:36.664Z
Learning: In the Cosmo router project, when extending JSON schema validation for security-sensitive fields like JWKS secrets, backwards compatibility is maintained by implementing warnings in the Go code rather than hard validation constraints in the schema. This allows existing configurations to continue working while alerting users to potential security issues.

Applied to files:

  • router/pkg/config/config.schema.json
  • router/pkg/connectrpc/validator.go
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: router/pkg/config/config.schema.json forbids null values for traffic_shaping.subgraphs: additionalProperties references $defs.traffic_shaping_subgraph_request_rule with type "object". Therefore, in core.NewSubgraphTransportOptions, dereferencing each subgraph rule pointer is safe under schema-validated configs, and a nil-check is unnecessary.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-07-21T14:46:34.879Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/authentication/jwks_token_decoder.go:80-106
Timestamp: 2025-07-21T14:46:34.879Z
Learning: In the Cosmo router project, required field validation for JWKS configuration (Secret, Algorithm, KeyId) is handled at the JSON schema level in config.schema.json rather than through runtime validation in the Go code at router/pkg/authentication/jwks_token_decoder.go.

Applied to files:

  • router/pkg/config/config.schema.json
  • router/pkg/connectrpc/validator.go
📚 Learning: 2025-07-30T09:29:46.660Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/pkg/config/config.schema.json:0-0
Timestamp: 2025-07-30T09:29:46.660Z
Learning: The "operation_name_trim_limit" configuration property in router/pkg/config/config.schema.json should be placed at the security level as a sibling to complexity_limits, not inside the complexity_limits object.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-06-30T20:39:02.387Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 1929
File: router/internal/circuit/manager.go:16-25
Timestamp: 2025-06-30T20:39:02.387Z
Learning: In the Cosmo router project, parameter validation for circuit breaker configuration is handled at the JSON schema level rather than through runtime validation methods on structs. The config.schema.json file contains comprehensive validation constraints for circuit breaker parameters.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: In the Cosmo router codebase, JSON schema validation prevents null values in TrafficShapingRules subgraph configurations, making nil checks unnecessary when dereferencing subgraph rule pointers in NewSubgraphTransportOptions.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-07-30T09:29:04.257Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/core/operation_processor.go:0-0
Timestamp: 2025-07-30T09:29:04.257Z
Learning: GraphQL operation names don't allow characters with more than 1 code point, so string length operations and slicing work correctly for both byte and character counting in GraphQL operation name processing.

Applied to files:

  • protographic/src/operation-to-proto.ts
📚 Learning: 2025-09-19T15:08:03.085Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2223
File: router/cmd/main.go:32-33
Timestamp: 2025-09-19T15:08:03.085Z
Learning: pyroscope-go's Config.ServerAddress field accepts both full URLs (e.g., "http://localhost:4040") and simple host:port format (e.g., "localhost:4040"). The library handles URL normalization internally, so both formats work correctly.

Applied to files:

  • router/pkg/connectrpc/server.go
🧬 Code graph analysis (4)
router-tests/connectrpc/connectrpc_server_lifecycle_test.go (2)
router-tests/connectrpc/connectrpc_test_helpers.go (2)
  • NewTestConnectRPCServer (94-144)
  • ConnectRPCServerOptions (77-83)
router/pkg/connectrpc/server.go (3)
  • NewServer (50-166)
  • ServerConfig (21-33)
  • Server (36-47)
router/core/router.go (3)
router/pkg/connectrpc/server.go (2)
  • Server (36-47)
  • ServerConfig (21-33)
router/pkg/connectrpc/service_discovery.go (2)
  • DiscoverServices (46-153)
  • ServiceDiscoveryConfig (29-34)
router/pkg/config/config.go (1)
  • ConnectRPCConfiguration (1012-1017)
router/pkg/connectrpc/validator.go (1)
router/pkg/connectrpc/proto_loader.go (1)
  • ProtoLoader (53-59)
router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)
router-tests/testdata/connectrpc/client/employee.v1/service.pb.go (33)
  • FindEmployeesByPetsRequest (321-327)
  • FindEmployeesByPetsRequest (342-342)
  • FindEmployeesByPetsRequest (357-359)
  • FindEmployeesByPetsResponse (368-375)
  • FindEmployeesByPetsResponse (390-390)
  • FindEmployeesByPetsResponse (405-407)
  • FindEmployeesByPetsInlineFragmentRequest (586-592)
  • FindEmployeesByPetsInlineFragmentRequest (607-607)
  • FindEmployeesByPetsInlineFragmentRequest (622-624)
  • FindEmployeesByPetsNamedFragmentRequest (681-687)
  • FindEmployeesByPetsNamedFragmentRequest (702-702)
  • FindEmployeesByPetsNamedFragmentRequest (717-719)
  • GetEmployeeByIdRequest (227-233)
  • GetEmployeeByIdRequest (248-248)
  • GetEmployeeByIdRequest (263-265)
  • GetEmployeeByIdResponse (274-280)
  • GetEmployeeByIdResponse (295-295)
  • GetEmployeeByIdResponse (310-312)
  • GetEmployeesRequest (501-505)
  • GetEmployeesRequest (520-520)
  • GetEmployeesRequest (535-537)
  • GetEmployeesResponse (539-545)
  • GetEmployeesResponse (560-560)
  • GetEmployeesResponse (575-577)
  • GetEmployeesWithMoodRequest (416-420)
  • GetEmployeesWithMoodRequest (435-435)
  • GetEmployeesWithMoodRequest (450-452)
  • UpdateEmployeeMoodRequest (124-131)
  • UpdateEmployeeMoodRequest (146-146)
  • UpdateEmployeeMoodRequest (161-163)
  • UpdateEmployeeMoodResponse (179-186)
  • UpdateEmployeeMoodResponse (201-201)
  • UpdateEmployeeMoodResponse (216-218)
🪛 markdownlint-cli2 (0.18.1)
router-tests/testdata/connectrpc/README.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: build-router
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: image_scan (nonroot)
  • GitHub Check: image_scan
  • GitHub Check: build_test
  • GitHub Check: Analyze (go)
  • GitHub Check: build_push_image
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: integration_test (./events)
🔇 Additional comments (21)
protographic/src/operation-to-proto.ts (1)

267-279: PascalCase validation looks correct.

The regex correctly enforces PascalCase naming (starts with uppercase, contains at least one lowercase, alphanumeric only). The suggested name using upperFirst(camelCase(operationName)) provides helpful guidance when validation fails.

router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)

1-307: Generated code follows standard Connect patterns.

This is generated code by protoc-gen-connect-go with correct:

  • Version compatibility assertion (line 21)
  • Idempotency annotations (queries marked IdempotencyNoSideEffects, mutation has none)
  • Client/handler interface symmetry
  • Unimplemented handler stubs
router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1)

85-93: Error scenario test correctly fixed.

The test now correctly asserts the error at NewServer construction time rather than at Start(), which aligns with the implementation where DiscoverServices is called during construction.

router/pkg/connectrpc/error_handling_test.go (2)

28-62: Test coverage for HTTP-to-Connect code mapping is valuable.

While this tests a simple switch case, it serves as documentation and regression protection for the mapping behavior. A previous reviewer suggested removal, but keeping this provides clarity on expected behavior.


64-303: Comprehensive error handling test suite.

The data-driven approach effectively covers:

  • HTTP transport errors (401, 500, 503)
  • GraphQL CRITICAL errors (null data with errors)
  • GraphQL NON-CRITICAL errors (partial data with errors)
  • Metadata assertions (HTTP status, error classification, GraphQL errors, partial data)

The test structure is clean and maintainable.

router/pkg/connectrpc/handler_test.go (4)

16-23: LGTM!

Clean implementation of a test double for simulating transport errors.


184-219: Well-structured header forwarding tests.

Good use of nested subtests with descriptive names following the suggested pattern. The test properly verifies that Content-Length is handled by the HTTP client and Content-Type is set correctly.


468-474: Empty JSON input returns empty object without error.

The test expects convertProtoJSONToGraphQLVariables to return {} for empty input (""). Verify this is the intended behavior—an empty string is technically invalid JSON. If strict validation is desired, this should return an error.


553-577: Thorough edge case coverage for snakeToCamel.

Good coverage of edge cases including multiple underscores, leading/trailing underscores, and empty strings. The test table is well-organized.

router/core/router.go (2)

1536-1540: LGTM! Proper use of wg.Go for concurrent shutdown.

The shutdown flow correctly uses wg.Go (Go 1.25+) for concurrent cleanup of multiple servers and providers, which improves shutdown time while maintaining proper error aggregation.

Also applies to: 1544-1548, 1551-1557, 1559-1564


2208-2212: LGTM!

Clean implementation of the WithConnectRPC option following the existing pattern.

router/pkg/connectrpc/validator.go (2)

1-23: LGTM!

Clean type definitions and constructor for the message validator.


54-62: Debug logging now uses structured logging.

Good improvement—the previous fmt.Printf statements have been replaced with proper structured logging via zap.

router/pkg/connectrpc/proto_loader.go (4)

52-72: LGTM!

Clean struct definition and constructor. The use of a local protoregistry.Files instead of the global registry addresses the previously identified race condition concern.


138-174: Package uniqueness validation is now correctly implemented.

The logic properly tracks existing services before loading each file (line 139-142) and only validates newly added services (line 157-158). This addresses the previously identified issue where duplicate packages weren't detected after the first directory.


244-260: Local registry usage eliminates race condition.

The code now correctly uses the local pl.files registry instead of protoregistry.GlobalFiles, and fails fast on registration errors instead of silently ignoring them. This addresses the previously identified TOCTOU vulnerability.


336-350: LGTM!

GetMethod correctly returns a pointer to the method in the slice, avoiding unnecessary copies while providing clear error messages for missing services/methods.

router/pkg/connectrpc/server.go (4)

49-166: LGTM - well-structured initialization with good error handling.

The constructor validates configuration, sets sensible defaults, and properly sequences component initialization. Error handling is thorough with wrapped errors providing context.


301-321: LGTM!

Clean component initialization with proper dependency injection.


358-391: LGTM - proper implementation of optional HTTP interfaces.

The responseWriter correctly implements http.Flusher, http.Pusher, and http.Hijacker via type assertions, which is essential for gRPC streaming compatibility.


393-407: LGTM!

Proper nil checks and clean delegation for introspection methods.

Comment on lines +53 to +78
t.Run("concurrent start attempts succeed", func(t *testing.T) {
ts := NewTestConnectRPCServer(t, ConnectRPCServerOptions{})

var wg sync.WaitGroup
errors := make([]error, 3)

// Try to start server concurrently
for i := 0; i < 3; i++ {
wg.Add(1)
go func(idx int) {
defer wg.Done()
errors[idx] = ts.Server.Start()
}(i)
}

wg.Wait()

// At least one should succeed
successCount := 0
for _, err := range errors {
if err == nil {
successCount++
}
}
assert.GreaterOrEqual(t, successCount, 1, "at least one start should succeed")
})

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Find files that might contain Server.Start() implementation
fd "\.go$" | grep -E "(server|connectrpc)" | head -20

Repository: wundergraph/cosmo

Length of output: 841


🏁 Script executed:

# Search for Server.Start() implementation
ast-grep --pattern $'func ($_ *Server) Start() $_ {
  $$$
}'

Repository: wundergraph/cosmo

Length of output: 4273


🏁 Script executed:

cat -n router-tests/connectrpc/connectrpc_server_lifecycle_test.go | head -80

Repository: wundergraph/cosmo

Length of output: 2524


🏁 Script executed:

# Verify the test file location and structure
wc -l router-tests/connectrpc/connectrpc_server_lifecycle_test.go

Repository: wundergraph/cosmo

Length of output: 124


🏁 Script executed:

# Check if there's any handling for port conflicts or state tracking
rg -A 5 "failed to create listener" router/pkg/connectrpc/server.go

Repository: wundergraph/cosmo

Length of output: 188


🏁 Script executed:

# Check if there's any state variable tracking started status
rg "started|Started" router/pkg/connectrpc/server.go | head -20

Repository: wundergraph/cosmo

Length of output: 105


Verify concurrent start behavior with explicit error checks.

The test assertion is too permissive. The Server.Start() implementation creates a new listener on each call without concurrency protection, so exactly one should succeed while others fail with port-in-use errors. Update the assertion to verify this behavior:

successCount := 0
failureCount := 0
for _, err := range errors {
    if err == nil {
        successCount++
    } else if strings.Contains(err.Error(), "address already in use") {
        failureCount++
    }
}
assert.Equal(t, 1, successCount, "exactly one start should succeed")
assert.Equal(t, 2, failureCount, "concurrent starts should fail with port conflict")

Alternatively, count every non-nil error as a failure and assert only that at least one start succeeds and at least one fails, which is less brittle to error message variations:

assert.GreaterOrEqual(t, successCount, 1, "at least one start should succeed")
assert.GreaterOrEqual(t, failureCount, 1, "at least one start should fail")
🤖 Prompt for AI Agents
router-tests/connectrpc/connectrpc_server_lifecycle_test.go around lines 53-78:
the test is too permissive about concurrent Start() results and should assert
the expected concurrency behavior; change the assertions to verify exactly one
Start() succeeds and the other two fail due to port conflicts (e.g., count nil
errors as success and count errors whose Error() contains "address already in
use" as failures), or alternatively assert at least one success and at least one
failure to be less brittle; ensure you import strings if you use string matching
for the error message.

Comment on lines +250 to +252
if s.operationRegistry != nil {
s.operationRegistry.Clear()
}

⚠️ Potential issue | 🟠 Major

Remove redundant Clear() that corrupts state for in-flight requests.

The Clear() call modifies the live operationRegistry that's still referenced by the old handler chain (old transcoder → old vanguardService → old rpcHandler → old operationRegistry). Since initializeComponents() at line 304 creates a completely new registry, this Clear() is both redundant and harmful—it can cause in-flight requests to fail with empty operation lookups.

Apply this diff:

 	// Create a fresh proto loader and clear operation registry before initializing components
 	// This ensures initializeComponents() (and the RPCHandler it constructs) receives the fresh ProtoLoader
 	s.protoLoader = NewProtoLoader(s.logger)
-	if s.operationRegistry != nil {
-		s.operationRegistry.Clear()
-	}

 	// Reinitialize components with the fresh proto loader
 	if err := s.initializeComponents(); err != nil {
🤖 Prompt for AI Agents
In router/pkg/connectrpc/server.go around lines 250 to 252, the code calls
s.operationRegistry.Clear() which mutates the live registry referenced by the
old handler chain and can corrupt state for in-flight requests; remove the
Clear() call entirely so the existing registry remains intact and let
initializeComponents() later create a new registry without mutating the old one.
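
The reasoning above (build a new registry, never mutate the one that in-flight requests hold) can be illustrated with a small sketch. It is a simplification under stated assumptions: `registry` and `server` are hypothetical stand-ins, and an atomic pointer swap is used here in place of the PR's rebuilt handler chain.

```go
package main

import "sync/atomic"

// registry is a hypothetical stand-in for OperationRegistry. It is treated as
// immutable once published.
type registry struct {
	ops map[string]string // operation name -> GraphQL document
}

// lookup returns the document registered under name, if any.
func (r *registry) lookup(name string) (string, bool) {
	doc, ok := r.ops[name]
	return doc, ok
}

// server publishes the current registry through an atomic pointer.
type server struct {
	current atomic.Pointer[registry]
}

// reload builds a fresh registry from ops and publishes it. The previously
// published registry is left untouched, so requests that already loaded it
// keep seeing a consistent view.
func (s *server) reload(ops map[string]string) {
	fresh := &registry{ops: make(map[string]string, len(ops))}
	for name, doc := range ops {
		fresh.ops[name] = doc
	}
	s.current.Store(fresh)
}
```

A request that loaded the registry before a reload still resolves its operation afterwards, while new requests see only the new contents; the old registry is garbage-collected once nothing references it.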

Comment on lines +230 to +231
case string:
// String representation is valid for int64

⚠️ Potential issue | 🟡 Minor

Int64/Uint64 string values are accepted without parsing validation.

Per protobuf JSON mapping, int64/uint64 can be represented as strings to avoid precision loss. However, the current validation accepts any string without verifying it contains a valid integer. Values like "abc" or "12.34" would pass validation.

 	case descriptorpb.FieldDescriptorProto_TYPE_INT64,
 		descriptorpb.FieldDescriptorProto_TYPE_SINT64,
 		descriptorpb.FieldDescriptorProto_TYPE_SFIXED64:
 		switch v := value.(type) {
 		case float64:
 			if v != float64(int64(v)) {
 				return &ValidationError{
 					Field:   fieldPath,
 					Message: fmt.Sprintf("Int64 cannot represent non-integer value: %v", value),
 				}
 			}
 		case string:
-			// String representation is valid for int64
+			// Validate string can be parsed as int64
+			if _, err := strconv.ParseInt(v, 10, 64); err != nil {
+				return &ValidationError{
+					Field:   fieldPath,
+					Message: fmt.Sprintf("Int64 cannot represent value: %v", value),
+				}
+			}
 		default:

Apply the same fix for uint64 at lines 265-266 using strconv.ParseUint.

Note: You'll need to add "strconv" to the imports.

Also applies to: 265-266

🤖 Prompt for AI Agents
In router/pkg/connectrpc/validator.go around lines 230-231 (and similarly
265-266), the switch handling string values for int64/uint64 currently accepts
any string; update the string case to call strconv.ParseInt(v, 10, 64) for
int64 and strconv.ParseUint(v, 10, 64) for uint64 (v is the string bound by the type switch), and if parsing returns an
error return a validation error (same form as other type errors) instead of
accepting the raw string; also add "strconv" to the imports.
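
As a standalone illustration of the suggested check, the sketch below validates a decoded JSON value for int64 and uint64 fields. The function names are hypothetical and the error type is simplified to a plain `error`; the PR's validator returns `*ValidationError` with a field path instead.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// validateInt64JSON checks a value produced by encoding/json against the
// protobuf JSON mapping for int64: a whole number or a decimal string.
func validateInt64JSON(value any) error {
	switch v := value.(type) {
	case float64:
		if v != math.Trunc(v) {
			return fmt.Errorf("int64 cannot represent non-integer value: %v", v)
		}
	case string:
		// Reject strings such as "abc" or "12.34" that are not base-10 integers.
		if _, err := strconv.ParseInt(v, 10, 64); err != nil {
			return fmt.Errorf("int64 cannot represent value: %q", v)
		}
	default:
		return fmt.Errorf("int64 cannot represent value of type %T", value)
	}
	return nil
}

// validateUint64JSON is the unsigned counterpart and also rejects negatives.
func validateUint64JSON(value any) error {
	switch v := value.(type) {
	case float64:
		if v < 0 || v != math.Trunc(v) {
			return fmt.Errorf("uint64 cannot represent value: %v", v)
		}
	case string:
		if _, err := strconv.ParseUint(v, 10, 64); err != nil {
			return fmt.Errorf("uint64 cannot represent value: %q", v)
		}
	default:
		return fmt.Errorf("uint64 cannot represent value of type %T", value)
	}
	return nil
}
```

With this shape, `"123"` and `float64(3)` pass, while `"abc"`, `"12.34"`, and `3.5` are rejected; `"-1"` is rejected only by the unsigned variant.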

@asoorm asoorm force-pushed the ahmet/eng-8277-connect-rpc-handler-final branch from 92ba712 to 7895b2a Compare December 3, 2025 08:54
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (2)
router-tests/testdata/connectrpc/README.md (1)

16-22: Add language identifier to the fenced code block.

Update the opening fence from ``` to ```text to satisfy MD040 and keep markdownlint happy.

router/pkg/connectrpc/server.go (1)

234-274: Clearing operationRegistry during reload corrupts in‑flight requests

In Reload, you do:

s.protoLoader = NewProtoLoader(s.logger)
if s.operationRegistry != nil {
    s.operationRegistry.Clear()
}
if err := s.initializeComponents(); err != nil { ... }

Clear() runs on the existing operationRegistry that’s still referenced by the old handler chain currently serving requests. Any in‑flight RPCs that haven’t yet looked up their operation will suddenly see an empty registry and fail with “operation not found”, even though the old handler and proto definitions are otherwise still valid.

This matches the earlier review concern about Clear() mutating live state and should be removed; initializeComponents() already creates a fresh OperationRegistry for the new handler, so there’s no need to touch the old one at all.

Suggested change:

-    s.protoLoader = NewProtoLoader(s.logger)
-    if s.operationRegistry != nil {
-        s.operationRegistry.Clear()
-    }
+    s.protoLoader = NewProtoLoader(s.logger)

Let the old registry live out its lifetime to keep in‑flight requests consistent; the new handler chain will use the new registry created in initializeComponents().

🧹 Nitpick comments (9)
protographic/src/operation-to-proto.ts (1)

294-320: Method name mapping looks good; consider tightening comments and wording

Using operationName directly for methodName (with only the optional prefixOperationType prepended) is a nice improvement in predictability and aligns with the new PascalCase requirement.

Two minor nits you might want to clean up:

  1. The numbered comments are now slightly inconsistent:

    • // 4. Create method name from operation name
    • // 4. Create request message from variables
    • // 3.5. Process any input object types...
      Renumbering these (e.g., 4 / 5 / 6) would make the flow easier to follow.
  2. The PascalCase error message says this “ensures the RPC method name exactly matches the GraphQL operation name,” but with prefixOperationType enabled the actual method name becomes Query${operationName} / Mutation${operationName} / Subscription${operationName}. You might tweak the wording to something like “...before any optional operation‑type prefixing” to avoid confusion.

Example diff for the comments only:

-    // 4. Create method name from operation name
+    // 4. Create method name from operation name

-    // 4. Create request message from variables
+    // 5. Create request message from variables

-    // 3.5. Process any input object types referenced in variables
+    // 6. Process any input object types referenced in variables
router/pkg/config/config.go (1)

1012-1017: ConnectRPC config wiring looks correct; optional envPrefix polish only.

The new ConnectRPCConfiguration / ConnectRPCServer types and Config.ConnectRPC field line up with the schema surface (enabled, server, services_provider_id, graphql_endpoint) and existing patterns. If you want stricter consistency with other nested sections (e.g. plugins, watch_config), you could also hang an envPrefix:"CONNECT_RPC_" off Config.ConnectRPC and shorten the field‑level env tags, but the current explicit CONNECT_RPC_* names are perfectly fine as is.

Also applies to: 1019-1022, 1049-1053

router/core/router.go (1)

960-1034: Avoid redundant service discovery or clarify intent for early validation

bootstrap calls connectrpc.DiscoverServices and discards the resulting slice except for logging, while connectrpc.NewServer performs its own discovery again. That gives you early failure and a service count, but it doubles the filesystem walk/parsing cost and risks the two discovery paths diverging over time.

If you only need early validation, consider either:

  • Moving discovery into connectrpc.NewServer exclusively and logging the count there, or
  • Passing the already discovered services into NewServer (and skipping re‑discovery there).

Not blocking, but worth simplifying before this API is widely used.

router/pkg/connectrpc/proto_loader.go (1)

324-356: Expose a copy of services (or document single‑threaded usage) to avoid accidental mutation

GetServices returns the internal map[string]*ServiceDefinition directly. Callers can mutate it, which:

  • Breaks the invariant maintained by ProtoLoader, and
  • Can introduce data races if ProtoLoader is ever accessed concurrently (e.g. during reload).

You already warn “should be treated as read‑only”, but you can enforce that cheaply by returning a shallow copy:

func (pl *ProtoLoader) GetServices() map[string]*ServiceDefinition {
-    return pl.services
+    out := make(map[string]*ServiceDefinition, len(pl.services))
+    for k, v := range pl.services {
+        out[k] = v
+    }
+    return out
}

Not strictly required today, but it makes the API much safer if usage evolves.

router/pkg/connectrpc/handler.go (4)

196-245: Consider trimming or gating full request body logging

HandleRPC logs the entire requestJSON as a string at debug level:

h.logger.Debug("handling RPC request",
    ...,
    zap.String("request_json", string(requestJSON)))

For large payloads or sensitive fields, this can easily flood logs or leak data into debug logs in non‑dev environments.

Consider either:

  • Logging only derived metadata (sizes, selected fields) by default, or
  • Wrapping this behind a dedicated, clearly documented debug flag so operators know they’re enabling full‑body logging.

290-337: GraphQL variable conversion assumes an object root; that’s fine but worth documenting

convertProtoJSONToGraphQLVariables always unmarshals into map[string]interface{} and will fail if the proto JSON is not an object (e.g. a bare string or array). Given GraphQL variables must be an object, that’s reasonable, but it’s implicit.

You might want to make the precondition explicit in the comment or error message (e.g. “expected JSON object at top level for GraphQL variables”), so future callers don’t attempt to pass scalar/array roots.


507-515: Comment about “backward compatibility” no longer matches behavior on parse errors

In executeGraphQL:

if err := json.Unmarshal(responseBody, &graphqlResponse); err != nil {
    // If we can't parse it, return the raw response (backward compatibility)
    h.logger.Error("failed to parse GraphQL response", ...)
    return nil, connect.NewError(connect.CodeInternal, ErrInternalServer)
}

The comment still says “return the raw response (backward compatibility)”, but the code now returns a CodeInternal error with a generic message and only logs the raw body.

Either update the comment to reflect the current behavior or restore the old behavior, but they shouldn’t contradict each other.


488-505: HTTP→Connect error mapping is reasonable but may miss some common statuses

httpStatusToConnectCode covers the major 4xx/5xx codes, but others like 422 (often used for validation errors) and 502 (bad gateway) currently fall back to CodeUnknown.

This is not wrong, but if your GraphQL endpoint ever uses those statuses, mapping them to InvalidArgument and Unavailable respectively would make errors more actionable for clients.
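
A mapping that includes those two statuses might look like the sketch below. It is illustrative only: the `code` type and its string values are hypothetical stand-ins for the Connect codes, and the rows other than 422 and 502 are assumptions rather than the PR's actual table.

```go
package main

import "net/http"

// code is a hypothetical stand-in for the Connect error code type.
type code string

const (
	codeInvalidArgument   code = "invalid_argument"
	codeUnauthenticated   code = "unauthenticated"
	codePermissionDenied  code = "permission_denied"
	codeNotFound          code = "not_found"
	codeResourceExhausted code = "resource_exhausted"
	codeUnavailable       code = "unavailable"
	codeDeadlineExceeded  code = "deadline_exceeded"
	codeUnknown           code = "unknown"
)

// httpStatusToCode maps an upstream HTTP status to an RPC error code.
// 422 and 502 are handled explicitly instead of falling through to unknown.
func httpStatusToCode(status int) code {
	switch status {
	case http.StatusBadRequest, http.StatusUnprocessableEntity:
		return codeInvalidArgument
	case http.StatusUnauthorized:
		return codeUnauthenticated
	case http.StatusForbidden:
		return codePermissionDenied
	case http.StatusNotFound:
		return codeNotFound
	case http.StatusTooManyRequests:
		return codeResourceExhausted
	case http.StatusBadGateway, http.StatusServiceUnavailable:
		return codeUnavailable
	case http.StatusGatewayTimeout:
		return codeDeadlineExceeded
	default:
		return codeUnknown
	}
}
```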

router/pkg/connectrpc/server.go (1)

77-87: Configure CheckRetry in retryablehttp to avoid retrying non-idempotent GraphQL mutations

Using retryablehttp.NewClient() retries all failed POST requests by default. Since mutations are non-idempotent and can cause duplicate side effects, configure retryClient.CheckRetry to only retry queries, or disable retries and rely on upstream retry behavior. The operation.OperationType field is already available and logged at execution time—pass it to executeGraphQL() to make retry decisions.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 92ba712 and 7895b2a.

⛔ Files ignored due to path filters (1)
  • router-tests/testdata/connectrpc/client/employee.v1/service.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (17)
  • protographic/src/operation-to-proto.ts (2 hunks)
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1 hunks)
  • router-tests/testdata/connectrpc/README.md (1 hunks)
  • router-tests/testdata/connectrpc/buf.gen.yaml (1 hunks)
  • router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1 hunks)
  • router/core/router.go (5 hunks)
  • router/core/router_config.go (4 hunks)
  • router/pkg/config/config.go (2 hunks)
  • router/pkg/config/config.schema.json (1 hunks)
  • router/pkg/connectrpc/constructor_validation_test.go (1 hunks)
  • router/pkg/connectrpc/error_handling_test.go (1 hunks)
  • router/pkg/connectrpc/handler.go (1 hunks)
  • router/pkg/connectrpc/handler_test.go (1 hunks)
  • router/pkg/connectrpc/proto_loader.go (1 hunks)
  • router/pkg/connectrpc/server.go (1 hunks)
  • router/pkg/connectrpc/server_test.go (1 hunks)
  • router/pkg/connectrpc/validator.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
  • router/pkg/connectrpc/server_test.go
  • router-tests/testdata/connectrpc/buf.gen.yaml
  • router/pkg/connectrpc/error_handling_test.go
  • router/core/router_config.go
  • router/pkg/connectrpc/handler_test.go
  • router/pkg/connectrpc/validator.go
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go
🧰 Additional context used
🧠 Learnings (16)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2141
File: router-tests/http_subscriptions_test.go:17-55
Timestamp: 2025-08-28T09:17:49.477Z
Learning: The Cosmo router uses a custom, intentionally rigid multipart implementation for GraphQL subscriptions. The multipart parsing in test files should remain strict and not be made more tolerant, as this rigidity is by design.
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.25 minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.23+ minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-09-19T15:08:03.085Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2223
File: router/cmd/main.go:32-33
Timestamp: 2025-09-19T15:08:03.085Z
Learning: pyroscope-go's Config.ServerAddress field accepts both full URLs (e.g., "http://localhost:4040") and simple host:port format (e.g., "localhost:4040"). The library handles URL normalization internally, so both formats work correctly.

Applied to files:

  • router/pkg/connectrpc/server.go
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.

Applied to files:

  • router/pkg/connectrpc/constructor_validation_test.go
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-08-07T12:05:06.775Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2079
File: proto/wg/cosmo/platform/v1/platform.proto:39-45
Timestamp: 2025-08-07T12:05:06.775Z
Learning: In the Cosmo project, the proto fields for schema, mappings, and lock in ProtoInput are intentionally kept as string types rather than bytes because the team works with text data and wants it to be UTF-8 encoded for readability and text processing purposes.

Applied to files:

  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the query body before any APQ operations occur.

Applied to files:

  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-07-21T15:06:36.664Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/config/config.schema.json:1637-1644
Timestamp: 2025-07-21T15:06:36.664Z
Learning: In the Cosmo router project, when extending JSON schema validation for security-sensitive fields like JWKS secrets, backwards compatibility is maintained by implementing warnings in the Go code rather than hard validation constraints in the schema. This allows existing configurations to continue working while alerting users to potential security issues.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-07-21T14:46:34.879Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/authentication/jwks_token_decoder.go:80-106
Timestamp: 2025-07-21T14:46:34.879Z
Learning: In the Cosmo router project, required field validation for JWKS configuration (Secret, Algorithm, KeyId) is handled at the JSON schema level in config.schema.json rather than through runtime validation in the Go code at router/pkg/authentication/jwks_token_decoder.go.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-06-30T20:39:02.387Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 1929
File: router/internal/circuit/manager.go:16-25
Timestamp: 2025-06-30T20:39:02.387Z
Learning: In the Cosmo router project, parameter validation for circuit breaker configuration is handled at the JSON schema level rather than through runtime validation methods on structs. The config.schema.json file contains comprehensive validation constraints for circuit breaker parameters.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: In the Cosmo router codebase, JSON schema validation prevents null values in TrafficShapingRules subgraph configurations, making nil checks unnecessary when dereferencing subgraph rule pointers in NewSubgraphTransportOptions.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: router/pkg/config/config.schema.json forbids null values for traffic_shaping.subgraphs: additionalProperties references $defs.traffic_shaping_subgraph_request_rule with type "object". Therefore, in core.NewSubgraphTransportOptions, dereferencing each subgraph rule pointer is safe under schema-validated configs, and a nil-check is unnecessary.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-07-30T09:29:04.257Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/core/operation_processor.go:0-0
Timestamp: 2025-07-30T09:29:04.257Z
Learning: GraphQL operation names don't allow characters with more than 1 code point, so string length operations and slicing work correctly for both byte and character counting in GraphQL operation name processing.

Applied to files:

  • protographic/src/operation-to-proto.ts
🧬 Code graph analysis (3)
router/pkg/connectrpc/constructor_validation_test.go (3)
router/pkg/connectrpc/handler.go (3)
  • NewRPCHandler (161-194)
  • HandlerConfig (152-158)
  • RPCHandler (143-149)
router/pkg/connectrpc/server.go (2)
  • NewServer (50-166)
  • ServerConfig (21-33)
router/pkg/connectrpc/vanguard_service.go (2)
  • NewVanguardService (68-93)
  • VanguardServiceConfig (53-57)
router/pkg/connectrpc/handler.go (4)
router/core/errors.go (1)
  • Extensions (48-54)
router/pkg/connectrpc/operation_registry.go (1)
  • OperationRegistry (24-29)
router/pkg/connectrpc/validator.go (2)
  • MessageValidator (14-16)
  • NewMessageValidator (19-23)
router/pkg/connectrpc/proto_loader.go (1)
  • ProtoLoader (53-59)
router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)
router-tests/testdata/connectrpc/client/employee.v1/service.pb.go (43)
  • File_employee_v1_service_proto (1955-1955)
  • FindEmployeesByPetsRequest (321-327)
  • FindEmployeesByPetsRequest (342-342)
  • FindEmployeesByPetsRequest (357-359)
  • FindEmployeesByPetsResponse (368-375)
  • FindEmployeesByPetsResponse (390-390)
  • FindEmployeesByPetsResponse (405-407)
  • FindEmployeesByPetsInlineFragmentRequest (586-592)
  • FindEmployeesByPetsInlineFragmentRequest (607-607)
  • FindEmployeesByPetsInlineFragmentRequest (622-624)
  • FindEmployeesByPetsInlineFragmentResponse (633-640)
  • FindEmployeesByPetsInlineFragmentResponse (655-655)
  • FindEmployeesByPetsInlineFragmentResponse (670-672)
  • FindEmployeesByPetsNamedFragmentRequest (681-687)
  • FindEmployeesByPetsNamedFragmentRequest (702-702)
  • FindEmployeesByPetsNamedFragmentRequest (717-719)
  • FindEmployeesByPetsNamedFragmentResponse (728-735)
  • FindEmployeesByPetsNamedFragmentResponse (750-750)
  • FindEmployeesByPetsNamedFragmentResponse (765-767)
  • GetEmployeeByIdRequest (227-233)
  • GetEmployeeByIdRequest (248-248)
  • GetEmployeeByIdRequest (263-265)
  • GetEmployeeByIdResponse (274-280)
  • GetEmployeeByIdResponse (295-295)
  • GetEmployeeByIdResponse (310-312)
  • GetEmployeesRequest (501-505)
  • GetEmployeesRequest (520-520)
  • GetEmployeesRequest (535-537)
  • GetEmployeesResponse (539-545)
  • GetEmployeesResponse (560-560)
  • GetEmployeesResponse (575-577)
  • GetEmployeesWithMoodRequest (416-420)
  • GetEmployeesWithMoodRequest (435-435)
  • GetEmployeesWithMoodRequest (450-452)
  • GetEmployeesWithMoodResponse (454-460)
  • GetEmployeesWithMoodResponse (475-475)
  • GetEmployeesWithMoodResponse (490-492)
  • UpdateEmployeeMoodRequest (124-131)
  • UpdateEmployeeMoodRequest (146-146)
  • UpdateEmployeeMoodRequest (161-163)
  • UpdateEmployeeMoodResponse (179-186)
  • UpdateEmployeeMoodResponse (201-201)
  • UpdateEmployeeMoodResponse (216-218)
🪛 markdownlint-cli2 (0.18.1)
router-tests/testdata/connectrpc/README.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: build-router
  • GitHub Check: integration_test (./events)
  • GitHub Check: image_scan
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: image_scan (nonroot)
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: Analyze (go)
  • GitHub Check: build_push_image
  • GitHub Check: build_test
🔇 Additional comments (12)
protographic/src/operation-to-proto.ts (1)

267-281: PascalCase validation behavior and naming contract

The PascalCase check is clear and the upperFirst(camelCase(operationName)) suggestion is a good, deterministic hint. One thing to be aware of: this regex intentionally rejects several GraphQL‑legal names (e.g., ones with underscores like get_user, leading underscores like _Query, or all‑caps like GETUSER / ID). If the intent is to enforce a strictly PascalCase contract for this bridge, this is fine; otherwise you may want to either (a) relax the pattern or (b) explicitly call out these rejected patterns in the error text/docs so users aren’t surprised.

From a correctness perspective, the ASCII‑only character classes are safe here because GraphQL operation names are restricted such that multi‑code‑point characters aren’t allowed, so there’s no hidden Unicode edge case in this validation. Based on learnings, this aligns with the existing operation‑name constraints.

router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)

1-307: Generated Connect client/server scaffold looks good.

This is standard protoc-gen-connect-go output: imports are used, per‑RPC descriptors, client methods, unary handlers, and unimplemented stubs are wired correctly. No manual adjustments needed.

router/pkg/config/config.schema.json (1)

2137-2185: connect_rpc schema aligns with Go config and enforces the key invariant.

The connect_rpc section correctly mirrors ConnectRPCConfiguration/ConnectRPCServer, and the if/then rule requiring services_provider_id when enabled: true captures the essential relationship without over‑constraining other fields. Looks solid.

router/pkg/connectrpc/constructor_validation_test.go (1)

12-145: Constructor validation/default tests are well‑consolidated and robust.

The table‑driven TestConstructorValidation cleanly exercises all failure paths for NewRPCHandler, NewServer, and NewVanguardService, and TestConstructorDefaults covers the key defaulting behavior (endpoint normalization, listen address). Using samples/services[...] relative to the package dir is appropriate, and the explicit error check on LoadFromDirectory("samples/services/employee.v1") ensures path issues won’t be silently masked.

Also applies to: 147-182

router/core/router.go (3)

887-958: MCP storage provider resolution logic now looks correct

Using operationsDir == "" as the failure condition is simpler and less error‑prone than tracking an explicit “found” flag, and the error message is clear for misconfigured provider_id. No further changes needed here.


1513-1619: Shutdown integration for ConnectRPC server and telemetry looks solid

Using sync.WaitGroup.Go with the mutex‑protected concSafeErrorJoiner gives clean concurrent teardown of the Prometheus, MCP, ConnectRPC, tracing, metrics, Redis, and module cleanups. The new r.connectRPCServer != nil branch mirrors the existing MCP shutdown path and should behave correctly under the router grace period.


2208-2212: WithConnectRPC option wiring is consistent with other config options

The new WithConnectRPC nicely mirrors WithMCP and other With* helpers and ensures the config flows into bootstrap without additional special‑casing. No issues here.

router/pkg/connectrpc/server.go (2)

301-321: Initialization order between protoLoader, operationRegistry, and RPCHandler looks correct

initializeComponents creates a fresh OperationRegistry and then constructs an RPCHandler that closes over both operationRegistry and protoLoader. Given that NewServer and Reload both set protoLoader before calling this, the handler will always see the correct loader instance, and subsequent calls to LoadFromDirectory/LoadOperationsForService will populate the shared instances as intended.


323-340: Handler wrapping for h2c in Start/Reload preserves HTTP/2 compatibility

Using createHandler + h2c.NewHandler in both Start and Reload keeps the gRPC/HTTP2 surface consistent across reloads and aligns with previous review feedback. The small responseWriter wrapper is correctly integrated via wrappedTranscoder.

router/pkg/connectrpc/handler.go (3)

340-371: snakeToCamel implementation is efficient and handles edge cases well

The character‑wise loop with a strings.Builder is clear and avoids allocations, and it handles leading/trailing/duplicate underscores correctly for typical proto field names. Good fit for the recursive key conversion.
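
As an illustration of the approach described here, the following is a sketch of a byte-wise snake_case to camelCase conversion using strings.Builder. It is not the PR's snakeToCamel; the handling of leading, trailing, and duplicate underscores below is an assumption about what "correct" means for proto field names, and it assumes ASCII identifiers.

```go
package main

import (
	"fmt"
	"strings"
)

// snakeToCamel converts an ASCII snake_case identifier to camelCase.
// Underscores are dropped and the next lowercase letter is upper-cased,
// except that the first emitted character is always left unchanged.
func snakeToCamel(s string) string {
	var b strings.Builder
	b.Grow(len(s)) // one up-front allocation; the result is never longer than s
	upperNext := false
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c == '_' {
			// Leading underscores must not capitalize the first character.
			upperNext = b.Len() > 0
			continue
		}
		if upperNext && c >= 'a' && c <= 'z' {
			c -= 'a' - 'A'
		}
		upperNext = false
		b.WriteByte(c)
	}
	return b.String()
}

func main() {
	for _, name := range []string{"employee_id", "_leading", "trailing_", "a__b", "has_pets_2"} {
		fmt.Printf("%q -> %q\n", name, snakeToCamel(name))
	}
}
```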


517-542: GraphQL error classification logic aligns with the CRITICAL/PARTIAL model

The executeGraphQL logic to distinguish:

  • CRITICAL: errors with no data, and
  • PARTIAL: errors with non‑null, non‑empty data

and then emitting makeCriticalGraphQLError/makePartialGraphQLError with rich metadata (http-status, graphql-errors, graphql-partial-data) matches the stated Relay‑inspired CRITICAL vs NON‑CRITICAL semantics and should be very helpful for downstream clients.
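
The classification rule can be sketched roughly as follows. This is an illustrative reduction, not the PR's executeGraphQL: the graphQLResponse type, the severity constants, and classify are hypothetical names, and treating an empty data object the same as null is an assumption based on the "non-null, non-empty" wording above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// graphQLResponse holds only the parts of a GraphQL response needed to
// classify it. A JSON null or missing "data" decodes to a nil map.
type graphQLResponse struct {
	Data   map[string]json.RawMessage `json:"data"`
	Errors []json.RawMessage          `json:"errors"`
}

type severity string

const (
	severityNone     severity = "NONE"     // no errors at all
	severityPartial  severity = "PARTIAL"  // errors alongside usable data
	severityCritical severity = "CRITICAL" // errors and no usable data
)

// classify decodes a GraphQL response body and reports its error severity.
func classify(body []byte) (severity, error) {
	var resp graphQLResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", fmt.Errorf("decode GraphQL response: %w", err)
	}
	if len(resp.Errors) == 0 {
		return severityNone, nil
	}
	if len(resp.Data) > 0 {
		return severityPartial, nil
	}
	return severityCritical, nil
}

func main() {
	bodies := []string{
		`{"data":{"employee":{"id":1}}}`,
		`{"data":null,"errors":[{"message":"boom"}]}`,
		`{"data":{"employee":null},"errors":[{"message":"not found"}]}`,
	}
	for _, body := range bodies {
		sev, err := classify([]byte(body))
		fmt.Println(sev, err)
	}
}
```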


544-582: Introspection helpers on RPCHandler are simple and useful

GetOperationCount, GetOperations, VerifyOperationExists, and GetOperationInfo are thin wrappers over OperationRegistry that will be handy for diagnostics and tests. Returning interface{} from the getters is a bit loose, but acceptable given the internal usage.

Comment on lines 75 to 227
func (pl *ProtoLoader) LoadFromDirectory(dir string) error {
pl.logger.Info("loading proto files from directory", zap.String("dir", dir))

// Find all .proto files
protoFiles, err := pl.findProtoFiles(dir)
if err != nil {
return fmt.Errorf("failed to find proto files: %w", err)
}

if len(protoFiles) == 0 {
return fmt.Errorf("no proto files found in directory: %s", dir)
}

pl.logger.Info("found proto files", zap.Int("count", len(protoFiles)))

// Load each proto file
for _, protoFile := range protoFiles {
if err := pl.loadProtoFile(protoFile); err != nil {
pl.logger.Error("failed to load proto file",
zap.String("file", protoFile),
zap.Error(err))
return fmt.Errorf("failed to load proto file %s: %w", protoFile, err)
}
}

pl.logger.Info("successfully loaded proto files",
zap.Int("services", len(pl.services)))

return nil
}

// LoadFromDirectories loads all .proto files from multiple directories
// and validates that proto package names are unique across all directories.
// The proto package name acts as a namespace, so duplicate packages are not allowed.
func (pl *ProtoLoader) LoadFromDirectories(dirs []string) error {
if len(dirs) == 0 {
return fmt.Errorf("no directories provided")
}

pl.logger.Info("loading proto files from multiple directories",
zap.Int("directory_count", len(dirs)))

// Track packages we've seen to enforce uniqueness
seenPackages := make(map[string]string) // package name -> directory

for _, dir := range dirs {
pl.logger.Debug("loading proto files from directory", zap.String("dir", dir))

// Find all .proto files in this directory
protoFiles, err := pl.findProtoFiles(dir)
if err != nil {
return fmt.Errorf("failed to find proto files in %s: %w", dir, err)
}

if len(protoFiles) == 0 {
pl.logger.Warn("no proto files found in directory", zap.String("dir", dir))
continue
}

pl.logger.Debug("found proto files",
zap.String("dir", dir),
zap.Int("count", len(protoFiles)))

// Track service names before loading to identify new ones
existingServices := make(map[string]bool)
for serviceName := range pl.services {
existingServices[serviceName] = true
}

// Load each proto file and track packages
for _, protoFile := range protoFiles {
if err := pl.loadProtoFile(protoFile); err != nil {
pl.logger.Error("failed to load proto file",
zap.String("file", protoFile),
zap.String("dir", dir),
zap.Error(err))
return fmt.Errorf("failed to load proto file %s from %s: %w", protoFile, dir, err)
}

// Validate package uniqueness for newly added services
for serviceName, service := range pl.services {
// Only check services that were just added in this file
if existingServices[serviceName] {
continue
}

packageName := service.Package
if existingDir, exists := seenPackages[packageName]; exists && existingDir != dir {
return fmt.Errorf(
"duplicate proto package '%s' found in multiple directories: '%s' and '%s'. "+
"Proto package names must be unique across all services",
packageName, existingDir, dir)
}
seenPackages[packageName] = dir

pl.logger.Debug("registered proto package",
zap.String("package", packageName),
zap.String("dir", dir),
zap.String("service", service.FullName))
}
}
}

pl.logger.Info("successfully loaded proto files from all directories",
zap.Int("total_services", len(pl.services)),
zap.Int("unique_packages", len(seenPackages)))

return nil
}

// findProtoFiles recursively finds all .proto files in a directory
func (pl *ProtoLoader) findProtoFiles(dir string) ([]string, error) {
var protoFiles []string

err := filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}

if !info.IsDir() && strings.HasSuffix(path, ".proto") {
protoFiles = append(protoFiles, path)
}

return nil
})

if err != nil {
return nil, err
}

return protoFiles, nil
}

// loadProtoFile loads and parses a single proto file using protoparse
func (pl *ProtoLoader) loadProtoFile(path string) error {
pl.logger.Debug("loading proto file", zap.String("path", path))

// Get the directory containing the proto file for import resolution
dir := filepath.Dir(path)
filename := filepath.Base(path)

// Create a parser with the directory as import path
parser := protoparse.Parser{
ImportPaths: []string{dir},
IncludeSourceCodeInfo: true,
}

// Parse the proto file
fds, err := parser.ParseFiles(filename)
if err != nil {
return fmt.Errorf("failed to parse proto file: %w", err)
}

⚠️ Potential issue | 🟠 Major

Per‑file parsing with ImportPaths limited to the file directory can break common proto import layouts

loadProtoFile configures protoparse.Parser{ImportPaths: []string{dir}} and calls ParseFiles(filename) for each discovered file individually. Combined with findProtoFiles returning absolute paths, this means:

  • Imports that are relative to a service root (e.g. import "common/types.proto") won’t resolve if types.proto lives in a sibling directory rather than the immediate filepath.Dir(path).
  • Nested tree layouts (foo/bar.proto importing foo/common.proto or ../common.proto) are fragile, because the import root is the file’s directory instead of the overall ServicesDir.

That makes the loader surprisingly sensitive to directory structure and file naming, and will fail for many idiomatic .proto layouts.

Consider instead:

  • Using the service root (the dir passed to LoadFromDirectory) as an import path, and
  • Passing relative paths to ParseFiles (e.g. via rel, _ := filepath.Rel(rootDir, path)).

So LoadFromDirectory would build a slice of relative proto filenames and then invoke a single Parser{ImportPaths: []string{rootDir}}.ParseFiles(relFiles...), letting protoparse resolve imports correctly across the tree.

🤖 Prompt for AI Agents
In router/pkg/connectrpc/proto_loader.go around lines 75-227, the loader
currently calls loadProtoFile per-file with Parser.ImportPaths set to the file's
directory which breaks common proto import layouts; instead, for
LoadFromDirectory gather all proto files, compute their paths relative to the
provided root directory (using filepath.Rel), create a single protoparse.Parser
with ImportPaths set to that root directory, and call ParseFiles once with the
slice of relative filenames so imports resolve across the tree; either change
loadProtoFile to accept (rootDir, relPath) and use the root as ImportPaths or
remove per-file parsing and perform batch parsing in LoadFromDirectory (apply
the same pattern in LoadFromDirectories: use each service root as an ImportPath
and pass relative filenames to ParseFiles).

Comment on lines +293 to +299
// Update HTTP server handler with h2c wrapper for gRPC compatibility
handler := s.createHandler()
s.httpServer.Handler = h2c.NewHandler(handler, &http2.Server{})

s.logger.Info("ConnectRPC server reloaded successfully")
return nil
}

⚠️ Potential issue | 🟡 Minor

Guard Reload against being called before Start

Reload unconditionally does:

handler := s.createHandler()
s.httpServer.Handler = h2c.NewHandler(handler, &http2.Server{})

If someone calls Reload() on a freshly constructed Server (before Start()), s.httpServer will be nil and this will panic. In practice you probably only call Reload after Start, but the method doesn’t state or enforce that precondition.

Consider:

if s.httpServer == nil {
    return fmt.Errorf("server is not started")
}
handler := s.createHandler()
s.httpServer.Handler = h2c.NewHandler(handler, &http2.Server{})

so misuse fails fast instead of panicking.

🤖 Prompt for AI Agents
In router/pkg/connectrpc/server.go around lines 293 to 299, Reload currently
assumes s.httpServer is initialized and will panic if called before Start; add a
guard at the top of Reload that checks whether s.httpServer (or a boolean
started flag if present) is nil/false and return a clear error like "server not
started; call Start before Reload" instead of proceeding; keep existing behavior
when the server is started by creating the handler and setting
s.httpServer.Handler to the h2c wrapper and then logging success.

@asoorm asoorm force-pushed the ahmet/eng-8277-connect-rpc-handler-final branch from 7895b2a to ff07530 Compare December 3, 2025 09:54

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (9)
router-tests/testdata/connectrpc/README.md (1)

16-22: Add language identifier to fenced code block to satisfy markdownlint.

The fenced code block currently uses plain ``` without a language, which still triggers MD040. Set it to something like text to keep tooling happy.

-```
+```text
 services/
 └── myservice.v1/
     ├── service.proto                    # Proto service definition
     ├── QueryGetItem.graphql            # GraphQL query operation
     └── MutationCreateItem.graphql      # GraphQL mutation operation

router/pkg/connectrpc/constructor_validation_test.go (4)

87-92: Fix incorrect relative path for CI execution.

When CI runs tests from the router directory (per .github/workflows/router-ci.yaml), the path "samples/services" at line 89 does not exist. Change to "pkg/connectrpc/samples/services" so tests pass in CI.

Apply this diff:

 		name: "Server: empty graphql endpoint",
 		constructor: func() (interface{}, error) {
 			return NewServer(ServerConfig{
-				ServicesDir: "samples/services",
+				ServicesDir: "pkg/connectrpc/samples/services",
 			})
 		},
 		wantErr: "graphql endpoint cannot be empty",

96-111: Fix incorrect path and check LoadFromDirectory error.

The path "samples/services/employee.v1" at line 100 is incorrect when CI runs from the router directory. Additionally, the error from LoadFromDirectory is checked at lines 101-103, but the function still proceeds to construct VanguardService even when loading fails, which masks the path issue.

Apply this diff:

 	{
 		name: "VanguardService: nil handler",
 		constructor: func() (interface{}, error) {
 			protoLoader := NewProtoLoader(logger)
-			err := protoLoader.LoadFromDirectory("samples/services/employee.v1")
+			err := protoLoader.LoadFromDirectory("pkg/connectrpc/samples/services/employee.v1")
 			if err != nil {
 				return nil, err
 			}

163-171: Fix incorrect relative path for CI execution.

The path "samples/services" at line 165 is incorrect when tests run from the router directory.

Apply this diff:

 	t.Run("Server: adds protocol to endpoint", func(t *testing.T) {
 		server, err := NewServer(ServerConfig{
-			ServicesDir:     "samples/services",
+			ServicesDir:     "pkg/connectrpc/samples/services",
 			GraphQLEndpoint: "localhost:4000/graphql",
 		})

173-181: Fix incorrect relative path for CI execution.

The path "samples/services" at line 175 is incorrect when tests run from the router directory.

Apply this diff:

 	t.Run("Server: uses default listen address", func(t *testing.T) {
 		server, err := NewServer(ServerConfig{
-			ServicesDir:     "samples/services",
+			ServicesDir:     "pkg/connectrpc/samples/services",
 			GraphQLEndpoint: "http://localhost:4000/graphql",
 		})
router/pkg/connectrpc/validator.go (2)

217-237: Validate int64 string values instead of accepting any string.

Per protobuf JSON mapping, int64 can be represented as a string to avoid precision loss, but line 231 accepts any string without parsing. Invalid values like "abc" or "12.34" would pass validation. Parse the string to ensure it's a valid integer.

Apply this diff:

+import (
+	"strconv"
+	// ... other imports
+)

 	case descriptorpb.FieldDescriptorProto_TYPE_INT64,
 		descriptorpb.FieldDescriptorProto_TYPE_SINT64,
 		descriptorpb.FieldDescriptorProto_TYPE_SFIXED64:
 		// JSON numbers are float64, check if it's a valid integer
 		// Note: int64 can be represented as string in JSON to avoid precision loss
 		switch v := value.(type) {
 		case float64:
 			if v != float64(int64(v)) {
 				return &ValidationError{
 					Field:   fieldPath,
 					Message: fmt.Sprintf("Int64 cannot represent non-integer value: %v", value),
 				}
 			}
 		case string:
-			// String representation is valid for int64
+			// Validate string can be parsed as int64
+			if _, err := strconv.ParseInt(v, 10, 64); err != nil {
+				return &ValidationError{
+					Field:   fieldPath,
+					Message: fmt.Sprintf("Int64 cannot represent value: %v", value),
+				}
+			}
 		default:
 			return &ValidationError{
 				Field:   fieldPath,
 				Message: fmt.Sprintf("Int64 cannot represent non-numeric value: %v", value),
 			}
 		}

255-272: Validate uint64 string values instead of accepting any string.

Similar to int64, line 266 accepts any string for uint64 without parsing. Parse using strconv.ParseUint to ensure validity.

Apply this diff:

 	case descriptorpb.FieldDescriptorProto_TYPE_UINT64,
 		descriptorpb.FieldDescriptorProto_TYPE_FIXED64:
 		switch v := value.(type) {
 		case float64:
 			if v < 0 || v != float64(uint64(v)) {
 				return &ValidationError{
 					Field:   fieldPath,
 					Message: fmt.Sprintf("UInt64 cannot represent value: %v", value),
 				}
 			}
 		case string:
-			// String representation is valid for uint64
+			// Validate string can be parsed as uint64
+			if _, err := strconv.ParseUint(v, 10, 64); err != nil {
+				return &ValidationError{
+					Field:   fieldPath,
+					Message: fmt.Sprintf("UInt64 cannot represent value: %v", value),
+				}
+			}
 		default:
 			return &ValidationError{
 				Field:   fieldPath,
 				Message: fmt.Sprintf("UInt64 cannot represent non-numeric value: %v", value),
 			}
 		}
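
To make the effect of the two diffs above concrete, here is a small sketch of how strconv.ParseInt and strconv.ParseUint treat typical inputs. The helper names validInt64String and validUint64String are hypothetical and exist only for this illustration; the validator itself would return a ValidationError as shown in the diffs.

```go
package main

import (
	"fmt"
	"strconv"
)

// validInt64String reports whether s is a base-10 integer that fits in int64.
func validInt64String(s string) bool {
	_, err := strconv.ParseInt(s, 10, 64)
	return err == nil
}

// validUint64String reports whether s is a base-10 integer that fits in uint64.
func validUint64String(s string) bool {
	_, err := strconv.ParseUint(s, 10, 64)
	return err == nil
}

func main() {
	// "9223372036854775808" is math.MaxInt64 + 1: too large for int64,
	// but still within the uint64 range.
	inputs := []string{"42", "-7", "abc", "12.34", "9223372036854775808"}
	for _, s := range inputs {
		fmt.Printf("%q int64=%t uint64=%t\n", s, validInt64String(s), validUint64String(s))
	}
}
```

Both functions reject non-numeric and fractional strings and also report out-of-range values as errors, so the same check covers syntax and overflow.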
router/pkg/connectrpc/server.go (2)

247-257: Remove redundant Clear() that can corrupt in-flight requests.

At lines 250-252, calling s.operationRegistry.Clear() mutates the registry that's still referenced by the old handler chain (via old transcoder → old vanguardService → old rpcHandler). Since initializeComponents() at line 304 creates a completely new OperationRegistry, this Clear() is both redundant and harmful—it can cause in-flight requests to fail.

Apply this diff:

 	// Create a fresh proto loader and clear operation registry before initializing components
 	// This ensures initializeComponents() (and the RPCHandler it constructs) receives the fresh ProtoLoader
 	s.protoLoader = NewProtoLoader(s.logger)
-	if s.operationRegistry != nil {
-		s.operationRegistry.Clear()
-	}

 	// Reinitialize components with the fresh proto loader
 	if err := s.initializeComponents(); err != nil {
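
The reasoning behind dropping Clear() can be sketched with simplified, hypothetical types (registry, handler, and server below are not the PR's OperationRegistry, RPCHandler, or Server). A handler that captured the old registry keeps working after a reload that swaps in fresh components, whereas clearing the old registry in place removes operations that in-flight requests may still need. Synchronizing access to the server's own fields is out of scope for this sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a simplified stand-in for a service-scoped operation registry.
type registry struct {
	mu  sync.RWMutex
	ops map[string]string // operation name -> GraphQL document
}

func newRegistry(ops map[string]string) *registry {
	return &registry{ops: ops}
}

// lookup returns the document registered under name, if any.
func (r *registry) lookup(name string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	doc, ok := r.ops[name]
	return doc, ok
}

// clear empties the registry in place, affecting every holder of the pointer.
func (r *registry) clear() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.ops = map[string]string{}
}

// handler captures the registry it was built with, like the old handler chain.
type handler struct{ reg *registry }

type server struct {
	reg *registry
	h   *handler
}

// reloadBySwap builds fresh components and leaves the old registry untouched.
func (s *server) reloadBySwap(ops map[string]string) {
	s.reg = newRegistry(ops)
	s.h = &handler{reg: s.reg}
}

func main() {
	s := &server{}
	s.reloadBySwap(map[string]string{"GetEmployees": "query GetEmployees { employees { id } }"})
	inFlight := s.h // a request that started before the reload

	s.reloadBySwap(map[string]string{"GetEmployeeById": "query GetEmployeeById($id: Int!) { employee(id: $id) { id } }"})

	_, ok := inFlight.reg.lookup("GetEmployees")
	fmt.Println("old handler still resolves GetEmployees after swap:", ok)

	inFlight.reg.clear() // what the removed Clear() call would have done
	_, ok = inFlight.reg.lookup("GetEmployees")
	fmt.Println("old handler resolves GetEmployees after clear:", ok)
}
```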

234-299: Guard Reload against being called before Start.

Reload() unconditionally accesses s.httpServer.Handler at line 295. If Reload() is called before Start() (which initializes s.httpServer at line 184), this will panic. Add a guard at the beginning of Reload() to fail fast with a clear error.

Apply this diff:

 // Reload reloads the server configuration and operations
 func (s *Server) Reload() error {
+	if s.httpServer == nil {
+		return fmt.Errorf("server not started; call Start before Reload")
+	}
+
 	s.logger.Info("reloading ConnectRPC server")
🧹 Nitpick comments (2)
router/pkg/connectrpc/handler_test.go (2)

115-130: Consider avoiding direct mutation of OperationRegistry.operations in tests.

The tests seed operations via operationRegistry.operations = map[string]map[string]*schemaloader.Operation{...}. This works but ties tests to the internal representation and locking strategy of OperationRegistry, making refactors there more painful.

If feasible, consider adding a small helper (even test-only) or using an exported API on OperationRegistry to register operations for tests instead of bypassing its encapsulation.

Also applies to: 255-267, 287-294


184-219: Optional: Add a negative assertion for headers that should not be forwarded.

The "should forward listed headers" subtest asserts that Authorization and X-Custom are forwarded from context, but the comment that Content-Length “should be skipped” isn’t actually verified.

You could strengthen the test by asserting that the value from the context is not propagated (e.g., by using a sentinel value and checking the server never sees that exact value), which would make the header-filtering behavior less prone to regressions.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7895b2a and ff07530.

⛔ Files ignored due to path filters (1)
  • router-tests/testdata/connectrpc/client/employee.v1/service.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (18)
  • protographic/src/operation-to-proto.ts (2 hunks)
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1 hunks)
  • router-tests/testdata/connectrpc/README.md (1 hunks)
  • router-tests/testdata/connectrpc/buf.gen.yaml (1 hunks)
  • router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1 hunks)
  • router/core/router.go (5 hunks)
  • router/core/router_config.go (4 hunks)
  • router/pkg/config/config.go (2 hunks)
  • router/pkg/config/config.schema.json (1 hunks)
  • router/pkg/config/connectrpc_test.go (1 hunks)
  • router/pkg/connectrpc/constructor_validation_test.go (1 hunks)
  • router/pkg/connectrpc/error_handling_test.go (1 hunks)
  • router/pkg/connectrpc/handler.go (1 hunks)
  • router/pkg/connectrpc/handler_test.go (1 hunks)
  • router/pkg/connectrpc/proto_loader.go (1 hunks)
  • router/pkg/connectrpc/server.go (1 hunks)
  • router/pkg/connectrpc/server_test.go (1 hunks)
  • router/pkg/connectrpc/validator.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • protographic/src/operation-to-proto.ts
  • router/pkg/connectrpc/server_test.go
  • router/pkg/config/config.go
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go
  • router/pkg/config/connectrpc_test.go
  • router/pkg/connectrpc/handler.go
  • router-tests/testdata/connectrpc/buf.gen.yaml
  • router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go
🧰 Additional context used
🧠 Learnings (15)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2141
File: router-tests/http_subscriptions_test.go:17-55
Timestamp: 2025-08-28T09:17:49.477Z
Learning: The Cosmo router uses a custom, intentionally rigid multipart implementation for GraphQL subscriptions. The multipart parsing in test files should remain strict and not be made more tolerant, as this rigidity is by design.
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.

Applied to files:

  • router/core/router_config.go
  • router/core/router.go
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
  • router/pkg/connectrpc/error_handling_test.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.

Applied to files:

  • router/pkg/connectrpc/handler_test.go
  • router/pkg/connectrpc/constructor_validation_test.go
  • router/pkg/connectrpc/error_handling_test.go
📚 Learning: 2025-07-21T15:06:36.664Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/config/config.schema.json:1637-1644
Timestamp: 2025-07-21T15:06:36.664Z
Learning: In the Cosmo router project, when extending JSON schema validation for security-sensitive fields like JWKS secrets, backwards compatibility is maintained by implementing warnings in the Go code rather than hard validation constraints in the schema. This allows existing configurations to continue working while alerting users to potential security issues.

Applied to files:

  • router/pkg/config/config.schema.json
  • router/pkg/connectrpc/validator.go
📚 Learning: 2025-07-21T14:46:34.879Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/authentication/jwks_token_decoder.go:80-106
Timestamp: 2025-07-21T14:46:34.879Z
Learning: In the Cosmo router project, required field validation for JWKS configuration (Secret, Algorithm, KeyId) is handled at the JSON schema level in config.schema.json rather than through runtime validation in the Go code at router/pkg/authentication/jwks_token_decoder.go.

Applied to files:

  • router/pkg/config/config.schema.json
  • router/pkg/connectrpc/validator.go
📚 Learning: 2025-07-30T09:29:46.660Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/pkg/config/config.schema.json:0-0
Timestamp: 2025-07-30T09:29:46.660Z
Learning: The "operation_name_trim_limit" configuration property in router/pkg/config/config.schema.json should be placed at the security level as a sibling to complexity_limits, not inside the complexity_limits object.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-06-30T20:39:02.387Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 1929
File: router/internal/circuit/manager.go:16-25
Timestamp: 2025-06-30T20:39:02.387Z
Learning: In the Cosmo router project, parameter validation for circuit breaker configuration is handled at the JSON schema level rather than through runtime validation methods on structs. The config.schema.json file contains comprehensive validation constraints for circuit breaker parameters.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: In the Cosmo router codebase, JSON schema validation prevents null values in TrafficShapingRules subgraph configurations, making nil checks unnecessary when dereferencing subgraph rule pointers in NewSubgraphTransportOptions.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: router/pkg/config/config.schema.json forbids null values for traffic_shaping.subgraphs: additionalProperties references $defs.traffic_shaping_subgraph_request_rule with type "object". Therefore, in core.NewSubgraphTransportOptions, dereferencing each subgraph rule pointer is safe under schema-validated configs, and a nil-check is unnecessary.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.25 minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.23+ minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-09-19T15:08:03.085Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2223
File: router/cmd/main.go:32-33
Timestamp: 2025-09-19T15:08:03.085Z
Learning: pyroscope-go's Config.ServerAddress field accepts both full URLs (e.g., "http://localhost:4040") and simple host:port format (e.g., "localhost:4040"). The library handles URL normalization internally, so both formats work correctly.

Applied to files:

  • router/pkg/connectrpc/server.go
🧬 Code graph analysis (5)
router/core/router_config.go (3)
router/pkg/connectrpc/server.go (1)
  • Server (36-47)
router/core/graph_server.go (1)
  • Server (69-72)
router/pkg/config/config.go (1)
  • ConnectRPCConfiguration (1012-1017)
router/pkg/connectrpc/handler_test.go (4)
router/pkg/connectrpc/operation_registry.go (2)
  • NewOperationRegistry (32-41)
  • OperationRegistry (24-29)
router/pkg/connectrpc/proto_loader.go (2)
  • NewProtoLoader (62-72)
  • ProtoLoader (53-59)
router/pkg/connectrpc/handler.go (2)
  • NewRPCHandler (161-194)
  • HandlerConfig (152-158)
router/pkg/connectrpc/test_helpers.go (1)
  • MockHTTPClient (64-71)
router/pkg/connectrpc/validator.go (1)
router/pkg/connectrpc/proto_loader.go (1)
  • ProtoLoader (53-59)
router/pkg/connectrpc/constructor_validation_test.go (3)
router/pkg/connectrpc/handler.go (3)
  • NewRPCHandler (161-194)
  • HandlerConfig (152-158)
  • RPCHandler (143-149)
router/pkg/connectrpc/server.go (2)
  • NewServer (50-166)
  • ServerConfig (21-33)
router/pkg/connectrpc/vanguard_service.go (2)
  • NewVanguardService (68-93)
  • VanguardServiceConfig (53-57)
router/core/router.go (3)
router/pkg/connectrpc/server.go (3)
  • Server (36-47)
  • ServerConfig (21-33)
  • NewServer (50-166)
router/pkg/connectrpc/service_discovery.go (2)
  • DiscoverServices (46-153)
  • ServiceDiscoveryConfig (29-34)
router/pkg/config/config.go (1)
  • ConnectRPCConfiguration (1012-1017)
🪛 markdownlint-cli2 (0.18.1)
router-tests/testdata/connectrpc/README.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (5)
router/pkg/config/config.schema.json (1)

2137-2185: ConnectRPC schema block is consistent and uses schema-level validation appropriately.

The connect_rpc section cleanly mirrors ConnectRPCConfiguration/ConnectRPCServer (field names, types, defaults) and the if/then constraint correctly enforces services_provider_id only when enabled is true, in line with the project’s pattern of pushing required-field validation into the schema instead of runtime code. Based on learnings, this looks correct.

router/core/router_config.go (1)

14-28: ConnectRPC configuration and usage wiring look consistent.

The new connectRPCServer and connectRPC fields, import of pkg/connectrpc, and the usage["connect_rpc"] = c.connectRPC.Enabled flag all align with the new ConnectRPCConfiguration/server types and the existing usage reporting pattern; no additional schema or nil-guarding changes are needed here.

Also applies to: 111-111, 143-143, 333-334

router/pkg/connectrpc/error_handling_test.go (1)

28-62: Error-handling tests give solid coverage of HTTP and GraphQL failure modes.

The status→Connect-code mapping and the CRITICAL vs PARTIAL GraphQL error scenarios (including metadata, graphql-errors, and graphql-partial-data) are well covered and line up with the new handler behavior and classification scheme; no changes needed here.

Also applies to: 68-303, 305-376

router/core/router.go (2)

960-1034: LGTM: ConnectRPC bootstrap follows established patterns.

The ConnectRPC initialization correctly mirrors the MCP bootstrap flow:

  • Validates provider resolution
  • Discovers services via convention-based approach
  • Derives GraphQL endpoint with fallback
  • Creates and starts server with proper error handling
  • Logs at appropriate levels throughout

1536-1607: Good adoption of sync.WaitGroup.Go for concurrent shutdown.

The shutdown flow now uses wg.Go (available in Go 1.25+) to run cleanup tasks concurrently, as suggested in previous reviews. All shutdown operations properly aggregate errors via concSafeErrorJoiner and the final wg.Wait() at line 1616 ensures completion before returning.

@asoorm asoorm force-pushed the ahmet/eng-8277-connect-rpc-handler-final branch from ff07530 to aae9972 Compare December 3, 2025 12:46

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
protographic/src/operation-to-proto.ts (1)

304-320: Fix comment numbering sequence.

The step numbering in comments is inconsistent. Line 304 uses "4." (duplicate of line 294's step), and line 320 uses "3.5." which is out of sequence. Consider renumbering to maintain a clear sequential flow: 5, 6, 7, etc.

Apply this diff to fix the numbering:

-    // 4. Create request message from variables
+    // 5. Create request message from variables
     const requestMessageName = createRequestMessageName(methodName);
-    // 3.5. Process any input object types referenced in variables
+    // 6. Process any input object types referenced in variables
     if (node.variableDefinitions) {

Then update the subsequent comments accordingly:

  • Line 327: "6." → "7."
  • Line 350: "7." → "8."
♻️ Duplicate comments (9)
router-tests/testdata/connectrpc/README.md (1)

16-22: Fenced code block still missing language identifier.

Static analysis flags this code block as missing a language specifier. This was previously addressed but appears to remain unfixed.

Apply this diff:

-```
+```text
 services/
 └── myservice.v1/
     ├── service.proto                    # Proto service definition
     ├── QueryGetItem.graphql            # GraphQL query operation
     └── MutationCreateItem.graphql      # GraphQL mutation operation

router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1)

53-78: Concurrent start test assertion may be too permissive.

The test asserts "at least one start should succeed" but doesn't verify that the other concurrent starts behave correctly. Since `Server.Start()` creates a new listener on each call without concurrency protection, exactly one should succeed while others fail with port-in-use errors.



Consider adding more specific assertions:

```go
successCount := 0
failureCount := 0
for _, err := range errors {
    if err == nil {
        successCount++
    } else {
        failureCount++
    }
}
assert.Equal(t, 1, successCount, "exactly one start should succeed")
assert.Equal(t, 2, failureCount, "concurrent starts should fail")
```
router/pkg/connectrpc/constructor_validation_test.go (2)

89-93: Verify relative path works from CI working directory.

The path "samples/services" is relative to the test file location. CI runs tests from the router directory (working-directory: ./router), so this path should be "pkg/connectrpc/samples/services" to work correctly.

 {
     name: "Server: empty graphql endpoint",
     constructor: func() (interface{}, error) {
         return NewServer(ServerConfig{
-            ServicesDir: "samples/services",
+            ServicesDir: "pkg/connectrpc/samples/services",
         })
     },
     wantErr: "graphql endpoint cannot be empty",
 },

163-181: Same path issue in TestConstructorDefaults.

The "samples/services" paths at lines 165 and 175 will fail in CI for the same reason.

#!/bin/bash
# Verify the samples directory structure and test paths
echo "Checking samples directory from repo root:"
ls -la router/pkg/connectrpc/samples/services/ 2>/dev/null || echo "Path not found from repo root"

echo ""
echo "Checking samples directory from router directory (CI context):"
cd router && ls -la samples/services/ 2>/dev/null || echo "Path 'samples/services' not found from router/"

echo ""
echo "Checking correct path from router directory:"
cd .. && cd router && ls -la pkg/connectrpc/samples/services/ 2>/dev/null || echo "Path not found"
router/pkg/connectrpc/validator.go (1)

217-237: Int64/UInt64 string values are accepted without numeric parsing

The int64/uint64 branches treat any string as valid:

case string:
    // String representation is valid for int64

and similarly for uint64. This means "abc" or "12.3" will pass validation even though they are not valid integer literals per protobuf JSON mapping rules.

You should parse these strings and fail on parse errors (e.g. using strconv.ParseInt/ParseUint) so invalid numeric strings are rejected.
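A minimal sketch of the stricter check, assuming string-encoded values are meant to be plain base-10 integers. The helper names `validateInt64String` and `validateUint64String` are hypothetical and do not come from validator.go; only `strconv.ParseInt` and `strconv.ParseUint` are standard library calls. Note that `strconv.ParseInt` also accepts a leading `+`, which may or may not be desirable here.

```go
package main

import (
	"fmt"
	"strconv"
)

// validateInt64String reports an error unless s is a base-10 integer
// that fits in an int64. Hypothetical helper for illustration only.
func validateInt64String(s string) error {
	if _, err := strconv.ParseInt(s, 10, 64); err != nil {
		return fmt.Errorf("invalid int64 value %q: %w", s, err)
	}
	return nil
}

// validateUint64String reports an error unless s is a base-10 integer
// that fits in a uint64. Signs are rejected by strconv.ParseUint.
func validateUint64String(s string) error {
	if _, err := strconv.ParseUint(s, 10, 64); err != nil {
		return fmt.Errorf("invalid uint64 value %q: %w", s, err)
	}
	return nil
}

func main() {
	for _, s := range []string{"42", "-7", "abc", "12.3", "9223372036854775808"} {
		fmt.Printf("%q int64 ok=%v uint64 ok=%v\n",
			s, validateInt64String(s) == nil, validateUint64String(s) == nil)
	}
}
```

With this shape, the `case string:` branches would call the matching helper and return its error instead of accepting every string.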

For protobuf JSON mapping of `int64`/`uint64`, must string representations be valid base‑10 integers, and is it appropriate to validate them using `strconv.ParseInt`/`ParseUint` in Go?

Also applies to: 255-272

router/pkg/connectrpc/server.go (2)

247-257: Clearing operationRegistry during Reload corrupts in‑flight requests

Reload does:

s.protoLoader = NewProtoLoader(s.logger)
if s.operationRegistry != nil {
    s.operationRegistry.Clear()
}
if err := s.initializeComponents(); err != nil { ... }

initializeComponents then assigns a brand‑new OperationRegistry to s.operationRegistry. The Clear() call therefore mutates the old registry instance that is still referenced by the live handler/transcoder chain until the new handler is swapped in, potentially causing in‑flight requests to see an empty registry.

Since a fresh registry is created in initializeComponents, Clear() is redundant and harmful; it should be removed so existing requests continue to see a consistent view until the new handler is installed.
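The idea can be sketched as a snapshot swap: a reload publishes a new registry and leaves the old one untouched, so a request that already holds the old snapshot keeps a consistent view. The `registry` and `server` types below are hypothetical stand-ins, not the actual `OperationRegistry` or `Server` from this PR, and `atomic.Pointer` is one possible publication mechanism among several.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// registry is an immutable snapshot of operations keyed by name
// (hypothetical stand-in for the real operation registry).
type registry struct {
	ops map[string]string
}

// server keeps the active snapshot behind an atomic pointer.
type server struct {
	current atomic.Pointer[registry]
}

// reload builds a fresh snapshot and publishes it.
// The previous snapshot is never cleared or mutated.
func (s *server) reload(ops map[string]string) {
	next := &registry{ops: make(map[string]string, len(ops))}
	for k, v := range ops {
		next.ops[k] = v
	}
	s.current.Store(next)
}

// snapshot returns the registry a request should use for its whole lifetime.
func (s *server) snapshot() *registry {
	return s.current.Load()
}

func main() {
	s := &server{}
	s.reload(map[string]string{"GetItem": "query GetItem { item { id } }"})

	inFlight := s.snapshot() // a request that started before the reload
	s.reload(map[string]string{"GetUser": "query GetUser { user { id } }"})

	_, oldStillVisible := inFlight.ops["GetItem"]
	_, newVisible := s.snapshot().ops["GetUser"]
	fmt.Println(oldStillVisible, newVisible)
}
```

The old snapshot becomes garbage once the last in-flight request drops its reference, so no explicit `Clear()` is needed in this model.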


293-299: Guard Reload against being called before Start to avoid nil httpServer panic

Reload unconditionally does:

handler := s.createHandler()
s.httpServer.Handler = h2c.NewHandler(handler, &http2.Server{})

If Reload() is called on a Server instance before Start(), s.httpServer is nil and this will panic. It’s safer to fail fast:

if s.httpServer == nil {
    return fmt.Errorf("server is not started; call Start before Reload")
}
handler := s.createHandler()
s.httpServer.Handler = h2c.NewHandler(handler, &http2.Server{})

This preserves current behavior when the server is running while avoiding unexpected panics on misuse.

router/pkg/connectrpc/proto_loader.go (2)

74-105: Per‑file parsing with ImportPaths set to the file directory is fragile for common proto layouts

LoadFromDirectory discovers all .proto files and then calls loadProtoFile(path) per file. loadProtoFile builds:

dir := filepath.Dir(path)
filename := filepath.Base(path)
parser := protoparse.Parser{
    ImportPaths: []string{dir},
}
fds, err := parser.ParseFiles(filename)

This makes each file’s directory the only import root and passes just the basename to ParseFiles. In typical layouts where:

  • Imports are rooted at the service root (e.g. import "common/types.proto"), or
  • Files in subdirectories import siblings via paths relative to the service root,

those imports will often not resolve when parsed from subdirs with ImportPaths: []string{dir}.

A more robust pattern is:

  • Treat the service root (the dir passed to LoadFromDirectory) as the import root, and
  • Call ParseFiles once with paths relative to that root.

Concretely:

  • In LoadFromDirectory, compute rel, _ := filepath.Rel(rootDir, path) for each proto and collect a slice of relative filenames.
  • Create a single protoparse.Parser{ImportPaths: []string{rootDir}} and call ParseFiles(relFiles...).

LoadFromDirectories can follow the same pattern per service root. This avoids brittle, per-file import roots and aligns with common proto import practices.
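The path handling described above can be sketched with the standard library alone. `relProtoPaths` is a hypothetical helper, not code from proto_loader.go; it only shows how discovered files could be turned into names relative to the service root before a single `ParseFiles` call.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// relProtoPaths converts proto file paths under root into slash-separated
// paths relative to root, suitable for a parser whose only import path is
// root. Hypothetical helper for illustration.
func relProtoPaths(root string, files []string) ([]string, error) {
	rel := make([]string, 0, len(files))
	for _, f := range files {
		r, err := filepath.Rel(root, f)
		if err != nil {
			return nil, fmt.Errorf("resolve %q against %q: %w", f, root, err)
		}
		// Reject files that are not inside the import root.
		if r == ".." || strings.HasPrefix(r, ".."+string(filepath.Separator)) {
			return nil, fmt.Errorf("%q is outside import root %q", f, root)
		}
		rel = append(rel, filepath.ToSlash(r))
	}
	return rel, nil
}

func main() {
	root := filepath.FromSlash("/services/employee.v1")
	files := []string{
		filepath.Join(root, "service.proto"),
		filepath.Join(root, "common", "types.proto"),
	}
	rel, err := relProtoPaths(root, files)
	if err != nil {
		panic(err)
	}
	fmt.Println(rel)
	// The relative names would then be passed to one parser, along the lines of
	// protoparse.Parser{ImportPaths: []string{root}} and ParseFiles(rel...).
}
```

Because every file is named relative to the same root, an import such as `common/types.proto` resolves the same way regardless of which subdirectory the importing file lives in.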

For `github.com/jhump/protoreflect/desc/protoparse`, how should `ImportPaths` and `ParseFiles` filenames be configured to support typical multi-file proto trees where imports are relative to a common root directory?

Also applies to: 208-236


238-278: Duplicate proto “paths” cause entire services to be silently skipped

processFileDescriptor does:

protoFd := fd.UnwrapFile()
_, err := pl.files.FindFileByPath(string(protoFd.Path()))
if err == nil {
    pl.logger.Debug("file descriptor already registered...", ...)
    return nil
}
if err := pl.files.RegisterFile(protoFd); err != nil { ... }

Because loadProtoFile calls ParseFiles with filename := filepath.Base(path), protoFd.Path() is typically just that basename ("service.proto"). If two different service directories each have a top‑level service.proto, the second one will be treated as “already registered” and its services will never be extracted into pl.services, with only a debug log as a hint.

That’s a correctness issue:

  • Distinct services in different directories with the same filename but different packages will result in only the first being loaded.

Options:

  • Include the relative path from the service root in the parser filename so protoFd.Path() is unique per service dir (e.g. "user/service.proto", "billing/service.proto"), or
  • Drop the early return and instead fail fast with a clear error if a truly conflicting file is encountered.

At minimum, this should not silently skip the later service.
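The fail-fast option might look like the sketch below. `fileIndex` is a hypothetical, simplified stand-in for the loader's file registry, keyed by proto path and storing the declared package; it is not the `protoregistry.Files` API. The point is only that an identical re-registration is a no-op while a same-path, different-package registration surfaces as an error.

```go
package main

import "fmt"

// fileIndex maps a registered proto path to the proto package it declared.
// Hypothetical stand-in for the loader's file registry.
type fileIndex map[string]string

// register records path -> pkg. Registering the identical file again is a
// no-op; the same path with a different package returns an error instead of
// being skipped silently.
func (idx fileIndex) register(path, pkg string) error {
	if existing, ok := idx[path]; ok {
		if existing == pkg {
			return nil
		}
		return fmt.Errorf("proto path %q already registered for package %q, cannot register package %q",
			path, existing, pkg)
	}
	idx[path] = pkg
	return nil
}

func main() {
	idx := fileIndex{}
	fmt.Println(idx.register("service.proto", "user.v1"))
	fmt.Println(idx.register("service.proto", "billing.v1")) // conflict is reported
	fmt.Println(idx.register("billing/service.proto", "billing.v1"))
}
```

The last call also illustrates the first option: once the path includes the service directory, two `service.proto` files no longer collide.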

In `protoregistry.Files`, what does `FindFileByPath` use as the key, and how is `protoreflect.FileDescriptor.Path()` derived from the filename passed to `protoparse.Parser.ParseFiles`?
🧹 Nitpick comments (3)
router/pkg/config/connectrpc_test.go (1)

1-202: Well-structured test suite with good coverage.

This test file demonstrates excellent testing practices:

  • Clear test organization with descriptive names
  • Proper use of table-driven tests for multiple scenarios
  • Good test isolation using t.TempDir() and t.Setenv()
  • Comprehensive coverage of config loading, merging, and environment variable handling

The tests cover the essential scenarios for ConnectRPC configuration and follow Go testing best practices.

Optional enhancement: Consider adding tests for error scenarios to improve robustness:

  • Invalid YAML syntax
  • Schema validation failures (if applicable)
  • Testing direct environment variable setting via CONNECT_RPC_ENABLED (bypassing YAML)

These additions would provide more complete coverage but are not essential given the current comprehensive happy-path testing.

router/pkg/connectrpc/handler.go (2)

196-233: Validator nil check silently continues - reconsider behavior.

When h.validator == nil, the code logs a warning and returns nil, allowing the request to proceed without validation. This was flagged in a past review asking if this behavior is desirable outside of tests.

Consider making this an error condition rather than a silent pass-through:

 func (h *RPCHandler) validateRequest(serviceName, methodName string, requestJSON []byte) error {
 	if h.validator == nil {
-		h.logger.Warn("validator is nil, skipping validation")
-		return nil
+		return fmt.Errorf("validator is not initialized")
 	}

Alternatively, if this is intentional for certain use cases, document why validation can be skipped.


373-434: Consider using structured error types for GraphQL errors.

The makeCriticalGraphQLError and makePartialGraphQLError functions silently ignore JSON marshaling errors on lines 377 and 405. While unlikely to fail, this could mask issues.

-	errorsJSON, _ := json.Marshal(errors)
+	errorsJSON, err := json.Marshal(errors)
+	if err != nil {
+		h.logger.Error("failed to marshal GraphQL errors", zap.Error(err))
+		errorsJSON = []byte("[]")
+	}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ff07530 and aae9972.

⛔ Files ignored due to path filters (1)
  • router-tests/testdata/connectrpc/client/employee.v1/service.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (18)
  • protographic/src/operation-to-proto.ts (2 hunks)
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1 hunks)
  • router-tests/testdata/connectrpc/README.md (1 hunks)
  • router-tests/testdata/connectrpc/buf.gen.yaml (1 hunks)
  • router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1 hunks)
  • router/core/router.go (5 hunks)
  • router/core/router_config.go (4 hunks)
  • router/pkg/config/config.go (2 hunks)
  • router/pkg/config/config.schema.json (1 hunks)
  • router/pkg/config/connectrpc_test.go (1 hunks)
  • router/pkg/connectrpc/constructor_validation_test.go (1 hunks)
  • router/pkg/connectrpc/error_handling_test.go (1 hunks)
  • router/pkg/connectrpc/handler.go (1 hunks)
  • router/pkg/connectrpc/handler_test.go (1 hunks)
  • router/pkg/connectrpc/proto_loader.go (1 hunks)
  • router/pkg/connectrpc/server.go (1 hunks)
  • router/pkg/connectrpc/server_test.go (1 hunks)
  • router/pkg/connectrpc/validator.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • router-tests/testdata/connectrpc/buf.gen.yaml
  • router/pkg/config/config.go
  • router/pkg/connectrpc/server_test.go
  • router/core/router_config.go
  • router/pkg/config/config.schema.json
🧰 Additional context used
🧠 Learnings (15)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2141
File: router-tests/http_subscriptions_test.go:17-55
Timestamp: 2025-08-28T09:17:49.477Z
Learning: The Cosmo router uses a custom, intentionally rigid multipart implementation for GraphQL subscriptions. The multipart parsing in test files should remain strict and not be made more tolerant, as this rigidity is by design.
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.

Applied to files:

  • router/pkg/config/connectrpc_test.go
  • router/pkg/connectrpc/error_handling_test.go
  • router/pkg/connectrpc/constructor_validation_test.go
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go
  • router/pkg/connectrpc/handler_test.go
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router/pkg/connectrpc/error_handling_test.go
  • router/pkg/connectrpc/handler.go
  • router/pkg/connectrpc/handler_test.go
📚 Learning: 2025-08-28T09:18:10.121Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2141
File: router-tests/http_subscriptions_test.go:100-108
Timestamp: 2025-08-28T09:18:10.121Z
Learning: In router-tests/http_subscriptions_test.go heartbeat tests, the message ordering should remain strict with data messages followed by heartbeat messages, as the timing is deterministic and known by design in the Cosmo router implementation.

Applied to files:

  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.

Applied to files:

  • router/pkg/connectrpc/handler.go
  • router/core/router.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the query body before any APQ operations occur.

Applied to files:

  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.

Applied to files:

  • router/pkg/connectrpc/handler.go
  • router/pkg/connectrpc/handler_test.go
📚 Learning: 2025-08-07T12:05:06.775Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2079
File: proto/wg/cosmo/platform/v1/platform.proto:39-45
Timestamp: 2025-08-07T12:05:06.775Z
Learning: In the Cosmo project, the proto fields for schema, mappings, and lock in ProtoInput are intentionally kept as string types rather than bytes because the team works with text data and wants it to be UTF-8 encoded for readability and text processing purposes.

Applied to files:

  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.25 minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.23+ minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-07-30T09:29:04.257Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/core/operation_processor.go:0-0
Timestamp: 2025-07-30T09:29:04.257Z
Learning: GraphQL operation names don't allow characters with more than 1 code point, so string length operations and slicing work correctly for both byte and character counting in GraphQL operation name processing.

Applied to files:

  • protographic/src/operation-to-proto.ts
📚 Learning: 2025-09-19T15:08:03.085Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2223
File: router/cmd/main.go:32-33
Timestamp: 2025-09-19T15:08:03.085Z
Learning: pyroscope-go's Config.ServerAddress field accepts both full URLs (e.g., "http://localhost:4040") and simple host:port format (e.g., "localhost:4040"). The library handles URL normalization internally, so both formats work correctly.

Applied to files:

  • router/pkg/connectrpc/server.go
📚 Learning: 2025-07-21T14:46:34.879Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/authentication/jwks_token_decoder.go:80-106
Timestamp: 2025-07-21T14:46:34.879Z
Learning: In the Cosmo router project, required field validation for JWKS configuration (Secret, Algorithm, KeyId) is handled at the JSON schema level in config.schema.json rather than through runtime validation in the Go code at router/pkg/authentication/jwks_token_decoder.go.

Applied to files:

  • router/pkg/connectrpc/validator.go
📚 Learning: 2025-07-21T15:06:36.664Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/config/config.schema.json:1637-1644
Timestamp: 2025-07-21T15:06:36.664Z
Learning: In the Cosmo router project, when extending JSON schema validation for security-sensitive fields like JWKS secrets, backwards compatibility is maintained by implementing warnings in the Go code rather than hard validation constraints in the schema. This allows existing configurations to continue working while alerting users to potential security issues.

Applied to files:

  • router/pkg/connectrpc/validator.go
🧬 Code graph analysis (7)
router/pkg/config/connectrpc_test.go (1)
router/pkg/config/config.go (3)
  • ConnectRPCConfiguration (1012-1017)
  • LoadConfig (1146-1258)
  • Config (1039-1115)
router/pkg/connectrpc/handler.go (3)
router/pkg/connectrpc/operation_registry.go (1)
  • OperationRegistry (24-29)
router/pkg/connectrpc/validator.go (3)
  • MessageValidator (14-16)
  • NewMessageValidator (19-23)
  • ValidationError (26-29)
router/pkg/connectrpc/proto_loader.go (1)
  • ProtoLoader (53-59)
router/core/router.go (2)
router/pkg/connectrpc/server.go (2)
  • Server (36-47)
  • ServerConfig (21-33)
router/pkg/connectrpc/service_discovery.go (2)
  • DiscoverServices (46-153)
  • ServiceDiscoveryConfig (29-34)
router/pkg/connectrpc/server.go (4)
router/pkg/connectrpc/proto_loader.go (2)
  • ProtoLoader (53-59)
  • NewProtoLoader (62-72)
router/pkg/connectrpc/operation_registry.go (2)
  • OperationRegistry (24-29)
  • NewOperationRegistry (32-41)
router/pkg/connectrpc/handler.go (3)
  • RPCHandler (143-149)
  • NewRPCHandler (161-194)
  • HandlerConfig (152-158)
router/pkg/connectrpc/vanguard_service.go (3)
  • VanguardService (60-65)
  • NewVanguardService (68-93)
  • VanguardServiceConfig (53-57)
router/pkg/connectrpc/handler_test.go (4)
router/pkg/connectrpc/operation_registry.go (2)
  • NewOperationRegistry (32-41)
  • OperationRegistry (24-29)
router/pkg/connectrpc/proto_loader.go (2)
  • NewProtoLoader (62-72)
  • ProtoLoader (53-59)
router/pkg/connectrpc/handler.go (2)
  • NewRPCHandler (161-194)
  • HandlerConfig (152-158)
router/pkg/connectrpc/test_helpers.go (1)
  • MockHTTPClient (64-71)
router/pkg/connectrpc/validator.go (1)
router/pkg/connectrpc/proto_loader.go (1)
  • ProtoLoader (53-59)
router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)
router-tests/testdata/connectrpc/client/employee.v1/service.pb.go (39)
  • FindEmployeesByPetsRequest (321-327)
  • FindEmployeesByPetsRequest (342-342)
  • FindEmployeesByPetsRequest (357-359)
  • FindEmployeesByPetsResponse (368-375)
  • FindEmployeesByPetsResponse (390-390)
  • FindEmployeesByPetsResponse (405-407)
  • FindEmployeesByPetsInlineFragmentRequest (586-592)
  • FindEmployeesByPetsInlineFragmentRequest (607-607)
  • FindEmployeesByPetsInlineFragmentRequest (622-624)
  • FindEmployeesByPetsInlineFragmentResponse (633-640)
  • FindEmployeesByPetsInlineFragmentResponse (655-655)
  • FindEmployeesByPetsInlineFragmentResponse (670-672)
  • FindEmployeesByPetsNamedFragmentRequest (681-687)
  • FindEmployeesByPetsNamedFragmentRequest (702-702)
  • FindEmployeesByPetsNamedFragmentRequest (717-719)
  • FindEmployeesByPetsNamedFragmentResponse (728-735)
  • FindEmployeesByPetsNamedFragmentResponse (750-750)
  • FindEmployeesByPetsNamedFragmentResponse (765-767)
  • GetEmployeeByIdRequest (227-233)
  • GetEmployeeByIdRequest (248-248)
  • GetEmployeeByIdRequest (263-265)
  • GetEmployeeByIdResponse (274-280)
  • GetEmployeeByIdResponse (295-295)
  • GetEmployeeByIdResponse (310-312)
  • GetEmployeesRequest (501-505)
  • GetEmployeesRequest (520-520)
  • GetEmployeesRequest (535-537)
  • GetEmployeesWithMoodRequest (416-420)
  • GetEmployeesWithMoodRequest (435-435)
  • GetEmployeesWithMoodRequest (450-452)
  • GetEmployeesWithMoodResponse (454-460)
  • GetEmployeesWithMoodResponse (475-475)
  • GetEmployeesWithMoodResponse (490-492)
  • UpdateEmployeeMoodRequest (124-131)
  • UpdateEmployeeMoodRequest (146-146)
  • UpdateEmployeeMoodRequest (161-163)
  • UpdateEmployeeMoodResponse (179-186)
  • UpdateEmployeeMoodResponse (201-201)
  • UpdateEmployeeMoodResponse (216-218)
🪛 markdownlint-cli2 (0.18.1)
router-tests/testdata/connectrpc/README.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: build-router
  • GitHub Check: build_test
  • GitHub Check: image_scan
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: Analyze (go)
  • GitHub Check: build_push_image
  • GitHub Check: image_scan (nonroot)
  • GitHub Check: integration_test (./events)
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: build_test
🔇 Additional comments (23)
protographic/src/operation-to-proto.ts (2)

267-279: Excellent validation with helpful error messaging!

The PascalCase validation correctly enforces naming conventions with a well-crafted regex that ensures operation names start with uppercase and contain at least one lowercase letter. The error message provides clear guidance with examples and a suggested name using upperFirst(camelCase(operationName)), which addresses the previous review feedback about suggesting the correct name.
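
To make the rule above concrete, here is a minimal Go sketch of the same check (the PR's implementation is TypeScript and its exact regex is not shown here). The name `isPascalCase` and the choice to allow only ASCII letters and digits are assumptions made for illustration. Go's `regexp` package has no lookahead, so the "at least one lowercase letter" condition is tested separately.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode"
)

// pascalCase matches names that start with an uppercase ASCII letter and
// contain only ASCII letters and digits (an assumed character set).
var pascalCase = regexp.MustCompile(`^[A-Z][A-Za-z0-9]*$`)

// isPascalCase (hypothetical helper) reports whether name starts with an
// uppercase letter and contains at least one lowercase letter.
func isPascalCase(name string) bool {
	return pascalCase.MatchString(name) &&
		strings.IndexFunc(name, unicode.IsLower) >= 0
}

func main() {
	names := []string{"GetEmployeeById", "getEmployeeById", "GET_EMPLOYEE", "ID"}
	for _, name := range names {
		fmt.Printf("%-16s %v\n", name, isPascalCase(name))
	}
}
```

Only the first name passes in this sketch: the second starts lowercase, the third contains an underscore, and the fourth has no lowercase letter.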


294-296: LGTM!

Using the operation name directly without transformation is the correct approach now that PascalCase validation ensures the name is already in the proper format. This maintains exact matching between GraphQL operation names and RPC method names.

router/pkg/config/connectrpc_test.go (5)

12-20: LGTM! Good practice testing zero value semantics.

Testing that the zero value represents a safe, disabled state is an important invariant for configuration structs. The assertions are clear and appropriately verify the safety guarantees.


89-114: LGTM! Excellent test isolation practices.

The test properly uses t.Setenv() for environment variable isolation and t.TempDir() for filesystem isolation. The error handling with require.NoError() is appropriate, and the assertions are clear and comprehensive (aside from the BaseURL field noted in the previous comment).


117-137: LGTM! Clear test of envDefault behavior.

This focused test properly verifies that the envDefault struct tag is respected when no explicit value is provided in the config file. The assertion message clearly documents the expected behavior.


139-167: LGTM! Good integration test coverage.

This test properly verifies that the ConnectRPC configuration integrates correctly with the main Config structure, ensuring that both top-level fields and nested ConnectRPC fields load without interference. The bidirectional verification is important for catching integration issues.


169-202: LGTM! Comprehensive merge behavior verification.

This test properly verifies the multi-file config merge semantics, ensuring that base values are preserved when not overridden and that override values correctly replace base values. The test cases cover both scenarios, which is important for validating the YAML merge functionality.

router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)

1-307: Generated code - no review needed.

This file is auto-generated by protoc-gen-connect-go as indicated by the header comment. Generated code follows standard Connect RPC patterns and should not be manually modified.

router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1)

1-165: Test structure and coverage are comprehensive.

The lifecycle tests cover key scenarios: complete lifecycle flow, stop-without-start error handling, invalid configuration errors, state consistency through reload, graceful shutdown with context deadline, and service info retrieval. The parallel test execution is appropriate.

router/pkg/connectrpc/handler.go (3)

315-338: Recursive key conversion implementation looks correct.

The convertKeysRecursive function properly handles nested maps, arrays of maps, and preserves primitive values. This addresses the previous review feedback about only converting top-level keys.


340-371: snakeToCamel implementation is efficient.

Uses strings.Builder with pre-allocated capacity and byte-level operations as suggested in past reviews. The logic correctly handles edge cases like empty strings, multiple underscores, and trailing underscores.
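
For readers without handler.go open, the two helpers discussed above can be sketched roughly as follows. This is an illustration rather than the PR's code: the function names mirror the review, but the bodies and the edge-case choices (a leading underscore is dropped without capitalising the next byte, trailing underscores are dropped) are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// snakeToCamel converts "employee_id" to "employeeId". Underscores are
// dropped; the ASCII letter following one is upper-cased unless nothing has
// been written yet (assumed behaviour for leading underscores).
func snakeToCamel(s string) string {
	var b strings.Builder
	b.Grow(len(s))
	upperNext := false
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c == '_' {
			upperNext = b.Len() > 0
			continue
		}
		if upperNext && c >= 'a' && c <= 'z' {
			c -= 'a' - 'A'
		}
		upperNext = false
		b.WriteByte(c)
	}
	return b.String()
}

// convertKeysRecursive rewrites map keys at every depth, walks arrays, and
// returns primitives unchanged.
func convertKeysRecursive(v any) any {
	switch val := v.(type) {
	case map[string]any:
		out := make(map[string]any, len(val))
		for k, child := range val {
			out[snakeToCamel(k)] = convertKeysRecursive(child)
		}
		return out
	case []any:
		out := make([]any, len(val))
		for i, child := range val {
			out[i] = convertKeysRecursive(child)
		}
		return out
	default:
		return v
	}
}

func main() {
	var vars any
	input := `{"employee_id":1,"pet_list":[{"pet_name":"Rex"}]}`
	if err := json.Unmarshal([]byte(input), &vars); err != nil {
		panic(err)
	}
	out, err := json.Marshal(convertKeysRecursive(vars))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```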


488-505: HTTP error handling correctly maps status codes and includes metadata.

The non-2xx status handling properly maps HTTP status to Connect codes, sets CRITICAL classification, and includes the response body in metadata for debugging. Good error propagation pattern.
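
As an illustration of what such a status mapping can look like, the sketch below follows the HTTP-to-error-code table in the Connect protocol specification. The switch in handler.go is not shown in this thread and may map codes differently; `connectCodeForHTTPStatus` is a hypothetical name, and codes are returned as plain strings to keep the example free of third-party imports.

```go
package main

import (
	"fmt"
	"net/http"
)

// connectCodeForHTTPStatus (hypothetical helper) maps a non-2xx HTTP status
// to a Connect error code name, using the fallback table from the Connect
// protocol specification.
func connectCodeForHTTPStatus(status int) string {
	switch status {
	case http.StatusBadRequest:
		return "internal"
	case http.StatusUnauthorized:
		return "unauthenticated"
	case http.StatusForbidden:
		return "permission_denied"
	case http.StatusNotFound:
		return "unimplemented"
	case http.StatusTooManyRequests,
		http.StatusBadGateway,
		http.StatusServiceUnavailable,
		http.StatusGatewayTimeout:
		return "unavailable"
	default:
		return "unknown"
	}
}

func main() {
	for _, status := range []int{400, 401, 403, 404, 429, 503, 418} {
		fmt.Printf("%d -> %s\n", status, connectCodeForHTTPStatus(status))
	}
}
```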

router/pkg/connectrpc/error_handling_test.go (2)

28-62: HTTP status mapping tests provide reasonable coverage.

While this tests a simple switch statement, it documents the expected mappings and will catch regressions if the mapping logic changes. The coverage includes common 4xx/5xx codes and unknown status handling.


64-303: Comprehensive error handling test coverage.

The test suite covers HTTP transport errors, GraphQL CRITICAL errors (no data), and PARTIAL errors (with data). Each case validates Connect error code, message, metadata, and where applicable, GraphQL errors and partial data payloads. Well-structured table-driven tests.

router/pkg/connectrpc/handler_test.go (4)

25-110: Constructor tests provide good validation coverage.

Tests verify valid config creation, protocol normalization, and proper error messages for missing required fields. The nil logger default test confirms graceful handling.


184-219: Header forwarding test structure improved.

The nested test structure with descriptive names ("forwarding headers from context" / "should forward listed headers") follows the suggestion from past reviews for better test organization.


313-496: Thorough variable conversion test coverage.

Tests cover all important cases: top-level conversion, nested objects, arrays of objects, deeply nested structures, primitive arrays, empty objects/arrays, null values, empty input, and keys without underscores. This ensures the recursive conversion logic is well-validated.


553-577: snakeToCamel edge cases are well-tested.

Tests cover single words, multiple words, consecutive underscores, leading/trailing underscores, mixed case, empty strings, and single underscore. Good boundary condition coverage.

router/core/router.go (4)

960-1034: ConnectRPC bootstrap implementation looks well-structured.

The bootstrap flow correctly:

  1. Resolves the services provider from FileSystem storage
  2. Discovers services using convention-based approach
  3. Determines GraphQL endpoint (configured or derived)
  4. Creates and starts the ConnectRPC server
  5. Stores the server instance for lifecycle management

Good logging at each step for observability.


1551-1557: ConnectRPC server properly integrated into graceful shutdown.

The ConnectRPC server shutdown is correctly added to the shutdown flow using wg.Go for concurrent cleanup, consistent with other server components (Prometheus, MCP, tracer providers).


2208-2212: WithConnectRPC option follows established patterns.

The new option follows the same pattern as WithMCP and other configuration options, maintaining consistency in the public API.


994-1000: The endpoint normalization is already handled correctly. Both connectrpc.NewServer() (lines 73-75 in server.go) and mcpserver.NewGraphQLSchemaServer() (lines 195-197 in mcpserver/server.go) automatically add the http:// protocol prefix if missing. Tests confirm this behavior works as expected for endpoints like "localhost:4000/graphql". No changes needed.

Likely an incorrect or invalid review comment.
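
For context, the normalization referred to above amounts to something like the following sketch. The helper name `normalizeEndpoint` is hypothetical, and the exact condition used in server.go is not shown in this thread; this version assumes only `http://` and `https://` count as an existing scheme.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeEndpoint (hypothetical helper) prepends "http://" when the
// endpoint has no http or https scheme.
func normalizeEndpoint(endpoint string) string {
	if strings.HasPrefix(endpoint, "http://") || strings.HasPrefix(endpoint, "https://") {
		return endpoint
	}
	return "http://" + endpoint
}

func main() {
	fmt.Println(normalizeEndpoint("localhost:4000/graphql"))
	fmt.Println(normalizeEndpoint("https://example.com/graphql"))
}
```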

router/pkg/connectrpc/validator.go (1)

290-309: Bytes and enum validation only check type, not content

For TYPE_BYTES, any string is accepted while the error message says "must be base64 encoded string," and for TYPE_ENUM any string/number is accepted without checking that it corresponds to a defined enum value.

You may want to either:

  • Actually validate bytes by attempting a base64 decode, and
  • Validate enums by checking the value against field.GetEnumType() definitions.

Or relax the error messages to reflect that only type, not semantic value, is validated.

Comment on lines +47 to +60
name: "full config with overrides",
yaml: `connect_rpc:
enabled: true
server:
listen_addr: "0.0.0.0:8080"
base_url: "http://example.com"
services_provider_id: "fs-protos"
graphql_endpoint: "http://localhost:4000/graphql"
`,
wantEnabled: true,
wantListenAddr: "0.0.0.0:8080",
wantGraphQL: "http://localhost:4000/graphql",
wantProviderID: "fs-protos",
},

⚠️ Potential issue | 🟡 Minor

Missing assertion for base_url field.

The test case sets base_url: "http://example.com" in the YAML (line 52) but never asserts its value. This leaves the BaseURL field untested in this scenario.

Add an assertion to verify the BaseURL field:

 		wantEnabled:    true,
 		wantListenAddr: "0.0.0.0:8080",
 		wantGraphQL:    "http://localhost:4000/graphql",
 		wantProviderID: "fs-protos",
+		wantBaseURL:    "http://example.com",

And in the assertion block around line 110:

 		assert.Equal(t, tt.wantEnabled, cfg.Enabled)
 		assert.Equal(t, tt.wantListenAddr, cfg.Server.ListenAddr)
+		if tt.wantBaseURL != "" {
+			assert.Equal(t, tt.wantBaseURL, cfg.Server.BaseURL)
+		}
 		assert.Equal(t, tt.wantGraphQL, cfg.GraphQLEndpoint)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In router/pkg/config/connectrpc_test.go around lines 47 to 60, the test case
includes base_url: "http://example.com" but does not assert the parsed BaseURL
field; add an assertion to the test case expectations (e.g., wantBaseURL:
"http://example.com") and then update the assertion block near line 110 to check
that cfg.ConnectRPC.Server.BaseURL (or the equivalent struct field) equals the
expected wantBaseURL, failing the test if it does not match.

Comment on lines +517 to +529
// Check if we have GraphQL errors
if len(graphqlResponse.Errors) > 0 {
// Determine if this is CRITICAL or PARTIAL based on data presence
hasData := len(graphqlResponse.Data) > 0 && string(graphqlResponse.Data) != "null" && string(graphqlResponse.Data) != "{}"

if !hasData {
// CRITICAL: Errors with no data - complete failure
return nil, h.makeCriticalGraphQLError(graphqlResponse.Errors, resp.StatusCode)
}

// PARTIAL: Errors with partial data - partial success
return nil, h.makePartialGraphQLError(graphqlResponse.Errors, graphqlResponse.Data, resp.StatusCode)
}

⚠️ Potential issue | 🟡 Minor

Data presence check could be more robust.

The check string(graphqlResponse.Data) != "{}" treats an empty object as "no data", but an empty object is valid GraphQL data. This could incorrectly classify responses as CRITICAL when partial data exists.

Consider removing the != "{}" check:

-	hasData := len(graphqlResponse.Data) > 0 && string(graphqlResponse.Data) != "null" && string(graphqlResponse.Data) != "{}"
+	hasData := len(graphqlResponse.Data) > 0 && string(graphqlResponse.Data) != "null"

An empty object {} is still valid data that should result in a PARTIAL classification when errors are present.
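
The proposed rule can be exercised in isolation with a small sketch. The `graphQLResponse` type and the `classify` function are hypothetical stand-ins for the handler's types; the point is that `json.RawMessage` holds the literal `null` when the field is null and stays empty when the field is absent, so only those two cases count as "no data".

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// graphQLResponse is a simplified stand-in for the handler's response type.
type graphQLResponse struct {
	Data   json.RawMessage   `json:"data"`
	Errors []json.RawMessage `json:"errors"`
}

// classify returns "OK", "PARTIAL", or "CRITICAL" for a raw GraphQL response.
// An empty object {} counts as data, per the suggestion above.
func classify(body []byte) (string, error) {
	var resp graphQLResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	if len(resp.Errors) == 0 {
		return "OK", nil
	}
	data := bytes.TrimSpace(resp.Data)
	hasData := len(data) > 0 && !bytes.Equal(data, []byte("null"))
	if hasData {
		return "PARTIAL", nil
	}
	return "CRITICAL", nil
}

func main() {
	bodies := []string{
		`{"data":{},"errors":[{"message":"boom"}]}`,
		`{"data":null,"errors":[{"message":"boom"}]}`,
		`{"errors":[{"message":"boom"}]}`,
		`{"data":{"employee":{"id":1}}}`,
	}
	for _, body := range bodies {
		class, err := classify([]byte(body))
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-10s %s\n", class, body)
	}
}
```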

🤖 Prompt for AI Agents
In router/pkg/connectrpc/handler.go around lines 517 to 529, the current
data-presence check treats "{}" as no data which can misclassify valid
empty-object GraphQL responses as CRITICAL; remove the `!= "{}"` condition and
instead consider data present when graphqlResponse.Data is non-nil, non-empty
and not the literal "null" (i.e., keep checks for len(graphqlResponse.Data) > 0
and string(graphqlResponse.Data) != "null"), so responses with `{}` are treated
as PARTIAL when errors exist; update the inline comment to reflect that an empty
object is valid data.

Comment on lines 136 to 176
// Handle repeated fields (arrays)
if field.IsRepeated() {
arr, ok := value.([]interface{})
if !ok {
return &ValidationError{
Field: fieldPath,
Message: fmt.Sprintf("expected array, got %T", value),
}
}

// Validate each element in the array
for i, elem := range arr {
elemPath := fmt.Sprintf("%s[%d]", fieldPath, i)
if err := v.validateScalarOrMessageValue(field, elem, elemPath); err != nil {
return err
}
}
return nil
}

// Handle singular fields
return v.validateScalarOrMessageValue(field, value, fieldPath)
}

// validateScalarOrMessageValue validates either a scalar or message value
func (v *MessageValidator) validateScalarOrMessageValue(field *desc.FieldDescriptor, value interface{}, fieldPath string) error {
// Handle message types (nested messages)
if field.GetType() == descriptorpb.FieldDescriptorProto_TYPE_MESSAGE {
nestedData, ok := value.(map[string]interface{})
if !ok {
return &ValidationError{
Field: fieldPath,
Message: fmt.Sprintf("expected object, got %T", value),
}
}
return v.validateMessageFields(field.GetMessageType(), nestedData, fieldPath)
}

// Handle scalar types
return v.validateScalarValue(field, value, fieldPath)
}

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check what version of protoreflect is being used and inspect the FieldDescriptor interface
cd router/pkg/connectrpc
cat go.mod 2>/dev/null || cd ../../.. && cat go.mod | grep -i protoreflect

Repository: wundergraph/cosmo

Length of output: 99


🏁 Script executed:

# Search for any existing map field handling or IsMap() calls in the validator
rg "IsMap|map\[" router/pkg/connectrpc/validator.go -A 2 -B 2

Repository: wundergraph/cosmo

Length of output: 771


🏁 Script executed:

# Look for test files that might show expected behavior
fd "validator.*test" router/pkg/connectrpc/

Repository: wundergraph/cosmo

Length of output: 43


🏁 Script executed:

# Check the full validator.go file to understand the context and current implementation
wc -l router/pkg/connectrpc/validator.go

Repository: wundergraph/cosmo

Length of output: 99


🌐 Web query:

protobuf JSON mapping map fields representation RFC

💡 Result:

Short answer — Protobuf (proto3) JSON mapping represents map fields as JSON objects where map keys are JSON object property names (i.e., strings) and map values are the usual JSON encoding for the value type. Non-string key types are converted to strings when encoded; on parse, string keys are converted back to the map key type if possible (duplicates: last value wins). Empty maps become {}. See the proto JSON mapping and language docs for details. [1][2][3]

References

  • ProtoJSON / JSON mapping (encoding rules, map behavior). [1]
  • Proto3 language spec (map field definition). [2]
  • Language docs on maps (synthetic entry message, text-format behavior). [3]

🌐 Web query:

github.com/jhump/protoreflect FieldDescriptor IsMap method

💡 Result:

Yes. FieldDescriptor has an IsMap() method in jhump/protoreflect. It returns true for protobuf map fields (these are represented as a repeated message type that is a map-entry message with exactly two fields: key tag #1 and value tag #2). [1][2]

Sources:

  • Package desc FieldDescriptor documentation. [1]
  • Implementation / source references showing map-entry representation. [2]

🏁 Script executed:

# Check if there are any tests related to map field validation
fd "test" router/pkg/connectrpc/ -type f | head -20

Repository: wundergraph/cosmo

Length of output: 231


🏁 Script executed:

# Look at the beginning of validator.go to understand the imports and struct definition
head -50 router/pkg/connectrpc/validator.go

Repository: wundergraph/cosmo

Length of output: 1388


🏁 Script executed:

# Check if there's any test data or examples showing map field structures
rg "map\[" router/pkg/connectrpc/ -t go --max-count=20

Repository: wundergraph/cosmo

Length of output: 4733


Map fields are treated as repeated arrays instead of JSON objects

validateFieldValue branches on field.IsRepeated() and assumes repeated fields are JSON arrays. For protobuf map<...> fields, the JSON representation is an object (map[string]value), not []interface{}. Since map fields are represented as repeated map-entry messages, IsRepeated() will return true, causing the code to attempt casting the map object to []interface{} and fail validation with "expected array" errors.

Add a check for field.IsMap() before the repeated-array branch:

  • If field.IsMap(): Expect map[string]interface{} and validate each entry's value against the map value type descriptor.
  • Else if field.IsRepeated(): Keep current array validation.
🤖 Prompt for AI Agents
In router/pkg/connectrpc/validator.go around lines 136 to 176, add a map-field
branch before the existing field.IsRepeated() check: if field.IsMap() then
expect a JSON object (map[string]interface{}), iterate its key/value pairs and
validate each value against the field's map value descriptor (use the field's
map value type/message type to decide scalar vs message validation and call the
appropriate validator methods), otherwise fall through to the existing
repeated-array handling; keep the existing behavior for non-map repeated fields
and for singular fields.

@asoorm asoorm force-pushed the ahmet/eng-8277-connect-rpc-handler-final branch from aae9972 to 7794a2b Compare December 3, 2025 15:34
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
router/pkg/mcpserver/server.go (1)

20-25: Reuse of httputil.SkippedHeaders is good; fix GraphQL error propagation in executeGraphQLQuery.

Two points here:

  1. Replacing the local skip list with httputil.SkippedHeaders keeps MCP header forwarding in sync with the router’s central rules. That’s a good consolidation.

  2. In the GraphQL error branch, err is always nil (because you only enter the if when json.Unmarshal succeeds), and the constructed errorMessage is not used for the “no data” case:

if err := json.Unmarshal(body, &graphqlResponse); err == nil && len(graphqlResponse.Errors) > 0 {
    ...
    errorMessage := strings.Join(errorMessages, "; ")
    ...
    if len(graphqlResponse.Data) == 0 || string(graphqlResponse.Data) == "null" {
        return mcp.NewToolResultErrorFromErr("Response Error", err), nil // err is nil here
    }
    ...
}

This means callers only see a generic "Response Error" without the underlying GraphQL messages. Consider returning an error that includes errorMessage (and not reusing err here), e.g.:

-       if len(graphqlResponse.Data) == 0 || string(graphqlResponse.Data) == "null" {
-           return mcp.NewToolResultErrorFromErr("Response Error", err), nil
-       }
+       if len(graphqlResponse.Data) == 0 || string(graphqlResponse.Data) == "null" {
+           return mcp.NewToolResultErrorFromErr(errorMessage, fmt.Errorf("%s", errorMessage)), nil
+       }

or, if you don’t need an underlying error, switch to the non-error variant (if available) and just pass errorMessage as the text.

Also applies to: 689-743

♻️ Duplicate comments (8)
router-tests/testdata/connectrpc/README.md (1)

16-22: Add language identifier to fenced code block for markdownlint.

The code block currently opens with a bare fence (no language identifier); add a language (e.g. text) to satisfy MD040 and keep tooling quiet.

router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1)

53-78: Consider tightening assertions for concurrent Start() semantics.

The concurrent start test currently only asserts that at least one Start() call succeeds. Since Start() internally calls net.Listen() on the same port without synchronization, exactly one listener bind will succeed and the others will fail with a port-in-use error. To catch accidental changes in this behavior, assert either:

  • exactly one success and two failures, or
  • at least one success and at least one error.

This prevents silent regressions where concurrent calls might accidentally be allowed, ignored, or cause inconsistent state.

router/pkg/connectrpc/validator.go (2)

136-154: Map fields are not handled correctly.

Protobuf map fields (e.g., map<string, int32>) are represented as repeated map-entry messages internally, so field.IsRepeated() returns true. However, in JSON they're represented as objects ({"key": value}), not arrays. This code will attempt to cast them to []any and fail validation.

Before the repeated array check, add a map-specific branch:

+	// Handle map fields (must come before repeated check)
+	if field.IsMap() {
+		mapData, ok := value.(map[string]any)
+		if !ok {
+			return &ValidationError{
+				Field:   fieldPath,
+				Message: fmt.Sprintf("expected object for map field, got %T", value),
+			}
+		}
+		// Validate each map entry value against the map value type
+		for key, val := range mapData {
+			entryPath := fmt.Sprintf("%s[%s]", fieldPath, key)
+			if err := v.validateScalarOrMessageValue(field.GetMapValueType(), val, entryPath); err != nil {
+				return err
+			}
+		}
+		return nil
+	}
+
 	// Handle repeated fields (arrays)
 	if field.IsRepeated() {

Note: this relies on the field descriptor exposing the map value type (GetMapValueType() in the snippet above); confirm that accessor is available in the protoreflect version the router uses.
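
The ordering problem is easier to see in a self-contained sketch that does not depend on protoreflect. Here `fieldDesc` is a hypothetical, heavily simplified descriptor: map fields set both `Repeated` and `MapValue`, mirroring how protobuf models maps as repeated entry messages, and only two scalar kinds are supported.

```go
package main

import (
	"fmt"
	"math"
)

// fieldDesc is a hypothetical stand-in for a protobuf field descriptor.
type fieldDesc struct {
	Name     string
	Kind     string     // "string" or "int32" in this sketch
	Repeated bool       // true for arrays AND for map fields
	MapValue *fieldDesc // non-nil only for map fields
}

// validateField checks the map branch first, because a map field also
// reports Repeated and would otherwise be treated as a JSON array.
func validateField(f *fieldDesc, value any, path string) error {
	if f.MapValue != nil {
		obj, ok := value.(map[string]any)
		if !ok {
			return fmt.Errorf("%s: expected object for map field, got %T", path, value)
		}
		for key, val := range obj {
			if err := validateScalar(f.MapValue, val, fmt.Sprintf("%s[%q]", path, key)); err != nil {
				return err
			}
		}
		return nil
	}
	if f.Repeated {
		arr, ok := value.([]any)
		if !ok {
			return fmt.Errorf("%s: expected array, got %T", path, value)
		}
		for i, elem := range arr {
			if err := validateScalar(f, elem, fmt.Sprintf("%s[%d]", path, i)); err != nil {
				return err
			}
		}
		return nil
	}
	return validateScalar(f, value, path)
}

// validateScalar checks one decoded JSON value against the field kind.
func validateScalar(f *fieldDesc, value any, path string) error {
	switch f.Kind {
	case "string":
		if _, ok := value.(string); !ok {
			return fmt.Errorf("%s: expected string, got %T", path, value)
		}
	case "int32":
		n, ok := value.(float64) // encoding/json decodes numbers as float64
		if !ok || n != math.Trunc(n) {
			return fmt.Errorf("%s: expected integer, got %v", path, value)
		}
	default:
		return fmt.Errorf("%s: unsupported kind %q", path, f.Kind)
	}
	return nil
}

func main() {
	counts := &fieldDesc{Name: "counts", Repeated: true, MapValue: &fieldDesc{Kind: "int32"}}
	fmt.Println(validateField(counts, map[string]any{"cats": 2.0}, "counts"))
	fmt.Println(validateField(counts, []any{2.0}, "counts"))
}
```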


230-231: String int64 values are not validated.

Per the protobuf JSON mapping specification, int64 can be represented as strings to avoid JavaScript precision loss. However, the string case at lines 230-231 accepts any string without parsing:

case string:
    // String representation is valid for int64

This would incorrectly accept invalid values like "abc" or "12.34". Apply this fix:

 case string:
-	// String representation is valid for int64
+	// Validate string can be parsed as int64
+	if _, err := strconv.ParseInt(v, 10, 64); err != nil {
+		return &ValidationError{
+			Field:   fieldPath,
+			Message: fmt.Sprintf("Int64 cannot represent value: %v", value),
+		}
+	}

Apply the same fix for uint64 at lines 265-266 using strconv.ParseUint.

You'll need to add "strconv" to the imports.
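
A standalone version of the check, for reference. `validateInt64` is a hypothetical helper that takes a value as decoded by `encoding/json`; the error wording is illustrative and the real validator returns a `*ValidationError` instead.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// validateInt64 (hypothetical helper) accepts whole JSON numbers and strings
// that parse as base-10 int64, following the protobuf JSON mapping.
func validateInt64(value any) error {
	switch v := value.(type) {
	case float64:
		if v != math.Trunc(v) {
			return fmt.Errorf("int64 cannot represent non-integer value: %v", v)
		}
	case string:
		if _, err := strconv.ParseInt(v, 10, 64); err != nil {
			return fmt.Errorf("int64 cannot represent value: %q", v)
		}
	default:
		return fmt.Errorf("int64 cannot represent value of type %T", value)
	}
	return nil
}

func main() {
	for _, value := range []any{"123", "abc", "12.34", 42.0, 1.5, true} {
		fmt.Printf("%v (%T): %v\n", value, value, validateInt64(value))
	}
}
```

The uint64 case would follow the same shape with `strconv.ParseUint`.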

router/pkg/connectrpc/server.go (1)

250-252: Clear() call may corrupt in-flight requests and is redundant.

Lines 250-252 call Clear() on the live operation registry that's still referenced by the old handler chain. This can cause in-flight requests to fail if they're looking up operations during the reload.

Since initializeComponents() at line 304 creates a completely new OperationRegistry, the Clear() call is both:

  1. Harmful - corrupts state for in-flight requests
  2. Redundant - the old registry will be replaced anyway

Remove the Clear() call:

 s.protoLoader = NewProtoLoader(s.logger)
-if s.operationRegistry != nil {
-    s.operationRegistry.Clear()
-}

The old registry will be garbage collected naturally once all references to the old handler are released.

router/pkg/connectrpc/proto_loader.go (2)

208-236: Import path configuration limits protobuf import resolution.

The loader parses each proto file individually with ImportPaths: []string{dir} where dir is the file's directory (line 218). This breaks common protobuf import patterns:

  • Imports relative to a service root (e.g., import "common/types.proto") won't resolve if common/ is a sibling directory
  • Cross-directory imports within a service tree will fail

For example, if you have:

services/myservice/
  ├── service.proto (imports "common/types.proto")
  └── common/
      └── types.proto

The import will fail because ImportPaths is set to services/myservice/ instead of allowing resolution from the service root.

Consider using the service root directory as the import path and passing relative filenames to ParseFiles. This would require refactoring to parse files in batch rather than individually.


244-251: Duplicate basename detection silently skips services.

Lines 244-251 check if a file is already registered using protoFd.Path(). However, because ParseFiles is called with filepath.Base(path) (line 223), the path is just the basename (e.g., "service.proto").

If two different service directories both have a service.proto file (a common layout), the second one will be detected as "already registered" and skipped with only a debug log (lines 248-250). This silently drops the second service's definitions from the registry.

This is a correctness issue, not just deduplication. Options:

  1. Use relative paths from a root directory so each file has a unique path
  2. Remove the early return and allow the same basename from different directories
  3. Fail fast with a clear error if an actual duplicate is detected
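
The collision described above, and option 1 as a remedy, can be reproduced with a short sketch. Everything here is hypothetical (the `register` function, the key functions, the directory names); it only models "first file registered under a key wins, later ones are skipped".

```go
package main

import (
	"fmt"
	"path/filepath"
)

// keyFunc derives the registry key for a proto file.
type keyFunc func(root, path string) (string, error)

// byBasename models the current behaviour: only the file name is used.
func byBasename(_ string, path string) (string, error) {
	return filepath.Base(path), nil
}

// byRelPath models option 1: the path relative to a common root is the key.
func byRelPath(root, path string) (string, error) {
	rel, err := filepath.Rel(root, path)
	if err != nil {
		return "", err
	}
	return filepath.ToSlash(rel), nil
}

// register returns the files that were registered (by key) and the files
// that were skipped because their key was already taken.
func register(root string, files []string, key keyFunc) (map[string]string, []string, error) {
	registered := make(map[string]string)
	var skipped []string
	for _, f := range files {
		k, err := key(root, f)
		if err != nil {
			return nil, nil, err
		}
		if _, dup := registered[k]; dup {
			skipped = append(skipped, f)
			continue
		}
		registered[k] = f
	}
	return registered, skipped, nil
}

func main() {
	files := []string{"services/employee/service.proto", "services/product/service.proto"}

	_, skipped, err := register("services", files, byBasename)
	if err != nil {
		panic(err)
	}
	fmt.Println("basename keys, skipped:", skipped)

	_, skipped, err = register("services", files, byRelPath)
	if err != nil {
		panic(err)
	}
	fmt.Println("relative keys, skipped:", skipped)
}
```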
router/pkg/connectrpc/handler.go (1)

495-507: Treat {} as valid data when classifying CRITICAL vs PARTIAL GraphQL errors.

hasData := len(graphqlResponse.Data) > 0 && string(graphqlResponse.Data) != "null" && string(graphqlResponse.Data) != "{}" treats an empty object as "no data". An empty object is still valid GraphQL data, so responses like {"data": {}, "errors": [...]} will be classified as CRITICAL instead of PARTIAL.

You can drop the {} check:

-	hasData := len(graphqlResponse.Data) > 0 && string(graphqlResponse.Data) != "null" && string(graphqlResponse.Data) != "{}"
+	hasData := len(graphqlResponse.Data) > 0 && string(graphqlResponse.Data) != "null"

This keeps {} in the PARTIAL bucket when errors are present.

🧹 Nitpick comments (5)
router/pkg/config/config.schema.json (1)

2137-2185: ConnectRPC schema wiring matches config; consider tightening provider id.

The connect_rpc block and its if/then requirement on services_provider_id align with ConnectRPCConfiguration and the rest of the schema patterns. If you want to prevent accidental empty IDs, you could optionally add a minLength: 1 on services_provider_id, but the current shape is already consistent with other config sections.

router/pkg/connectrpc/test_helpers.go (1)

50-62: Consider adding status code control to MockGraphQLServer.

The mock server currently hardcodes http.StatusOK. For comprehensive testing of error scenarios (GraphQL errors with 4xx/5xx status codes), consider making the status code configurable:

 type MockGraphQLServer struct {
 	*httptest.Server
 	ResponseBody string
+	StatusCode   int
 }

 func NewMockGraphQLServer(responseBody string) *MockGraphQLServer {
 	server := &MockGraphQLServer{
 		ResponseBody: responseBody,
+		StatusCode:   http.StatusOK,
 	}

 	server.Server = httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
 		w.Header().Set("Content-Type", "application/json")
-		w.WriteHeader(http.StatusOK)
+		w.WriteHeader(server.StatusCode)
 		w.Write([]byte(server.ResponseBody))
 	}))
router/pkg/connectrpc/operation_registry.go (1)

57-76: Consider lock granularity for operation loading performance.

The write lock is held for the entire duration of loading all operation files (lines 57-76), including file I/O operations. For directories with many operation files, this could block readers unnecessarily.

Consider loading files outside the lock and acquiring the lock only for the final registry update:

// Parse all files first (without lock)
var parsedOps []*schemaloader.Operation
for _, filePath := range operationFiles {
    content, err := os.ReadFile(filePath)
    // ... parse operation ...
    parsedOps = append(parsedOps, operation)
}

// Then acquire lock only for registry update
r.mu.Lock()
defer r.mu.Unlock()
// ... update r.operations ...

However, given that operation loading typically happens at startup/reload and the current approach is simpler and safer, this is an optional optimization.

router/pkg/connectrpc/vanguard_service.go (2)

124-131: Comment about incoming request path format is misleading.

extractMethodName expects paths like /package.Service/Method (or without the leading slash), but the comment here says the handler receives /Method without the service prefix. Either the comment or the parsing logic is wrong; given the Connect procedure constants, the code path looks correct and the comment should be updated to match the actual /package.Service/Method format.


375-380: Avoid duplicate file descriptors when multiple services share a file.

GetFileDescriptors appends service.FileDescriptor for every service, so files with multiple services will appear multiple times. This is harmless but slightly wasteful; consider tracking descriptors in a map[protoreflect.FileDescriptor]struct{} (or by path/name) before building the slice, so each file descriptor is returned only once.
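
A minimal sketch of the suggested de-duplication, using hypothetical `fileDescriptor` and `service` types in place of the real ones. It keys the set on pointer identity, which assumes services from the same file share one descriptor value, and it preserves first-seen order.

```go
package main

import "fmt"

// fileDescriptor and service are simplified stand-ins for the real types.
type fileDescriptor struct{ Path string }

type service struct {
	Name string
	File *fileDescriptor
}

// uniqueFileDescriptors returns each file descriptor once, in the order it
// was first encountered.
func uniqueFileDescriptors(services []service) []*fileDescriptor {
	seen := make(map[*fileDescriptor]struct{}, len(services))
	out := make([]*fileDescriptor, 0, len(services))
	for _, s := range services {
		if s.File == nil {
			continue
		}
		if _, ok := seen[s.File]; ok {
			continue
		}
		seen[s.File] = struct{}{}
		out = append(out, s.File)
	}
	return out
}

func main() {
	employee := &fileDescriptor{Path: "employee/service.proto"}
	product := &fileDescriptor{Path: "product/service.proto"}
	services := []service{
		{Name: "EmployeeService", File: employee},
		{Name: "EmployeeAdminService", File: employee},
		{Name: "ProductService", File: product},
	}
	for _, fd := range uniqueFileDescriptors(services) {
		fmt.Println(fd.Path)
	}
}
```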

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aae9972 and 7794a2b.

⛔ Files ignored due to path filters (1)
  • router-tests/testdata/connectrpc/client/employee.v1/service.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (27)
  • protographic/src/operation-to-proto.ts (2 hunks)
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1 hunks)
  • router-tests/testdata/connectrpc/README.md (1 hunks)
  • router-tests/testdata/connectrpc/buf.gen.yaml (1 hunks)
  • router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1 hunks)
  • router/core/router.go (5 hunks)
  • router/core/router_config.go (4 hunks)
  • router/pkg/config/config.go (2 hunks)
  • router/pkg/config/config.schema.json (1 hunks)
  • router/pkg/config/connectrpc_test.go (1 hunks)
  • router/pkg/connectrpc/constructor_validation_test.go (1 hunks)
  • router/pkg/connectrpc/error_handling_test.go (1 hunks)
  • router/pkg/connectrpc/handler.go (1 hunks)
  • router/pkg/connectrpc/handler_test.go (1 hunks)
  • router/pkg/connectrpc/operation_registry.go (1 hunks)
  • router/pkg/connectrpc/operation_registry_test.go (1 hunks)
  • router/pkg/connectrpc/proto_loader.go (1 hunks)
  • router/pkg/connectrpc/server.go (1 hunks)
  • router/pkg/connectrpc/server_test.go (1 hunks)
  • router/pkg/connectrpc/service_discovery.go (1 hunks)
  • router/pkg/connectrpc/service_discovery_test.go (1 hunks)
  • router/pkg/connectrpc/test_helpers.go (1 hunks)
  • router/pkg/connectrpc/validator.go (1 hunks)
  • router/pkg/connectrpc/vanguard_service.go (1 hunks)
  • router/pkg/connectrpc/vanguard_service_test.go (1 hunks)
  • router/pkg/httputil/headers.go (1 hunks)
  • router/pkg/mcpserver/server.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • router/pkg/config/config.go
  • router/pkg/connectrpc/vanguard_service_test.go
  • router/pkg/connectrpc/constructor_validation_test.go
  • router/pkg/connectrpc/server_test.go
  • router/pkg/config/connectrpc_test.go
  • router/pkg/connectrpc/operation_registry_test.go
  • router/pkg/connectrpc/handler_test.go
  • router/pkg/connectrpc/service_discovery_test.go
🧰 Additional context used
🧠 Learnings (19)
📓 Common learnings
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2141
File: router-tests/http_subscriptions_test.go:17-55
Timestamp: 2025-08-28T09:17:49.477Z
Learning: The Cosmo router uses a custom, intentionally rigid multipart implementation for GraphQL subscriptions. The multipart parsing in test files should remain strict and not be made more tolerant, as this rigidity is by design.
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.

Applied to files:

  • router/pkg/mcpserver/server.go
  • router-tests/testdata/connectrpc/buf.gen.yaml
  • router/core/router_config.go
  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.23+ minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/pkg/mcpserver/server.go
  • router/core/router.go
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router/pkg/mcpserver/server.go
  • router/pkg/connectrpc/test_helpers.go
  • router/pkg/connectrpc/handler.go
  • router/pkg/connectrpc/error_handling_test.go
  • router/pkg/connectrpc/vanguard_service.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.25 minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/pkg/mcpserver/server.go
  • router/core/router.go
📚 Learning: 2025-07-21T15:06:36.664Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/config/config.schema.json:1637-1644
Timestamp: 2025-07-21T15:06:36.664Z
Learning: In the Cosmo router project, when extending JSON schema validation for security-sensitive fields like JWKS secrets, backwards compatibility is maintained by implementing warnings in the Go code rather than hard validation constraints in the schema. This allows existing configurations to continue working while alerting users to potential security issues.

Applied to files:

  • router/pkg/config/config.schema.json
  • router/pkg/connectrpc/validator.go
📚 Learning: 2025-07-30T09:29:46.660Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/pkg/config/config.schema.json:0-0
Timestamp: 2025-07-30T09:29:46.660Z
Learning: The "operation_name_trim_limit" configuration property in router/pkg/config/config.schema.json should be placed at the security level as a sibling to complexity_limits, not inside the complexity_limits object.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-07-21T14:46:34.879Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/authentication/jwks_token_decoder.go:80-106
Timestamp: 2025-07-21T14:46:34.879Z
Learning: In the Cosmo router project, required field validation for JWKS configuration (Secret, Algorithm, KeyId) is handled at the JSON schema level in config.schema.json rather than through runtime validation in the Go code at router/pkg/authentication/jwks_token_decoder.go.

Applied to files:

  • router/pkg/config/config.schema.json
  • router/pkg/connectrpc/validator.go
📚 Learning: 2025-06-30T20:39:02.387Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 1929
File: router/internal/circuit/manager.go:16-25
Timestamp: 2025-06-30T20:39:02.387Z
Learning: In the Cosmo router project, parameter validation for circuit breaker configuration is handled at the JSON schema level rather than through runtime validation methods on structs. The config.schema.json file contains comprehensive validation constraints for circuit breaker parameters.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: In the Cosmo router codebase, JSON schema validation prevents null values in TrafficShapingRules subgraph configurations, making nil checks unnecessary when dereferencing subgraph rule pointers in NewSubgraphTransportOptions.

Applied to files:

  • router/pkg/config/config.schema.json
  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: router/pkg/config/config.schema.json forbids null values for traffic_shaping.subgraphs: additionalProperties references $defs.traffic_shaping_subgraph_request_rule with type "object". Therefore, in core.NewSubgraphTransportOptions, dereferencing each subgraph rule pointer is safe under schema-validated configs, and a nil-check is unnecessary.

Applied to files:

  • router/pkg/config/config.schema.json
  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-07-30T09:29:04.257Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/core/operation_processor.go:0-0
Timestamp: 2025-07-30T09:29:04.257Z
Learning: GraphQL operation names don't allow characters with more than 1 code point, so string length operations and slicing work correctly for both byte and character counting in GraphQL operation name processing.

Applied to files:

  • protographic/src/operation-to-proto.ts
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the query body before any APQ operations occur.

Applied to files:

  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function at lines 571-578, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the computed query hash before any APQ operations occur. There's also a test case that verifies this behavior.

Applied to files:

  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-08-07T12:05:06.775Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2079
File: proto/wg/cosmo/platform/v1/platform.proto:39-45
Timestamp: 2025-08-07T12:05:06.775Z
Learning: In the Cosmo project, the proto fields for schema, mappings, and lock in ProtoInput are intentionally kept as string types rather than bytes because the team works with text data and wants it to be UTF-8 encoded for readability and text processing purposes.

Applied to files:

  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.

Applied to files:

  • router/pkg/connectrpc/error_handling_test.go
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go
📚 Learning: 2025-08-28T09:18:10.121Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2141
File: router-tests/http_subscriptions_test.go:100-108
Timestamp: 2025-08-28T09:18:10.121Z
Learning: In router-tests/http_subscriptions_test.go heartbeat tests, the message ordering should remain strict with data messages followed by heartbeat messages, as the timing is deterministic and known by design in the Cosmo router implementation.

Applied to files:

  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go
📚 Learning: 2025-09-19T15:08:03.085Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2223
File: router/cmd/main.go:32-33
Timestamp: 2025-09-19T15:08:03.085Z
Learning: pyroscope-go's Config.ServerAddress field accepts both full URLs (e.g., "http://localhost:4040") and simple host:port format (e.g., "localhost:4040"). The library handles URL normalization internally, so both formats work correctly.

Applied to files:

  • router/pkg/connectrpc/server.go
🧬 Code graph analysis (7)
router/pkg/mcpserver/server.go (1)
router/pkg/httputil/headers.go (1)
  • SkippedHeaders (6-27)
router/pkg/connectrpc/operation_registry.go (1)
router/core/context.go (3)
  • OperationTypeQuery (506-506)
  • OperationTypeMutation (507-507)
  • OperationTypeSubscription (508-508)
router/pkg/connectrpc/service_discovery.go (3)
router/pkg/mcpserver/util.go (1)
  • Logger (6-9)
demo/pkg/subgraphs/family/subgraph/employees.go (1)
  • String (8-10)
router/pkg/pubsub/datasource/error.go (1)
  • Error (3-6)
router/core/router_config.go (3)
router/pkg/connectrpc/server.go (1)
  • Server (36-47)
router/core/graph_server.go (1)
  • Server (69-72)
router/pkg/config/config.go (1)
  • ConnectRPCConfiguration (1012-1017)
router/pkg/connectrpc/error_handling_test.go (2)
router/pkg/connectrpc/handler.go (7)
  • MetaKeyHTTPStatus (81-81)
  • MetaKeyErrorClassification (82-82)
  • ErrorClassificationCritical (90-90)
  • ErrorClassificationPartial (91-91)
  • HandlerConfig (130-136)
  • MetaKeyGraphQLErrors (83-83)
  • MetaKeyGraphQLPartialData (84-84)
router/pkg/connectrpc/test_helpers.go (1)
  • MockHTTPClient (65-72)
router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)
router-tests/testdata/connectrpc/client/employee.v1/service.pb.go (42)
  • FindEmployeesByPetsRequest (321-327)
  • FindEmployeesByPetsRequest (342-342)
  • FindEmployeesByPetsRequest (357-359)
  • FindEmployeesByPetsResponse (368-375)
  • FindEmployeesByPetsResponse (390-390)
  • FindEmployeesByPetsResponse (405-407)
  • FindEmployeesByPetsInlineFragmentRequest (586-592)
  • FindEmployeesByPetsInlineFragmentRequest (607-607)
  • FindEmployeesByPetsInlineFragmentRequest (622-624)
  • FindEmployeesByPetsInlineFragmentResponse (633-640)
  • FindEmployeesByPetsInlineFragmentResponse (655-655)
  • FindEmployeesByPetsInlineFragmentResponse (670-672)
  • FindEmployeesByPetsNamedFragmentRequest (681-687)
  • FindEmployeesByPetsNamedFragmentRequest (702-702)
  • FindEmployeesByPetsNamedFragmentRequest (717-719)
  • FindEmployeesByPetsNamedFragmentResponse (728-735)
  • FindEmployeesByPetsNamedFragmentResponse (750-750)
  • FindEmployeesByPetsNamedFragmentResponse (765-767)
  • GetEmployeeByIdRequest (227-233)
  • GetEmployeeByIdRequest (248-248)
  • GetEmployeeByIdRequest (263-265)
  • GetEmployeeByIdResponse (274-280)
  • GetEmployeeByIdResponse (295-295)
  • GetEmployeeByIdResponse (310-312)
  • GetEmployeesRequest (501-505)
  • GetEmployeesRequest (520-520)
  • GetEmployeesRequest (535-537)
  • GetEmployeesResponse (539-545)
  • GetEmployeesResponse (560-560)
  • GetEmployeesResponse (575-577)
  • GetEmployeesWithMoodRequest (416-420)
  • GetEmployeesWithMoodRequest (435-435)
  • GetEmployeesWithMoodRequest (450-452)
  • GetEmployeesWithMoodResponse (454-460)
  • GetEmployeesWithMoodResponse (475-475)
  • GetEmployeesWithMoodResponse (490-492)
  • UpdateEmployeeMoodRequest (124-131)
  • UpdateEmployeeMoodRequest (146-146)
  • UpdateEmployeeMoodRequest (161-163)
  • UpdateEmployeeMoodResponse (179-186)
  • UpdateEmployeeMoodResponse (201-201)
  • UpdateEmployeeMoodResponse (216-218)
router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1)
router/pkg/connectrpc/server.go (3)
  • NewServer (50-166)
  • ServerConfig (21-33)
  • Server (36-47)
🪛 markdownlint-cli2 (0.18.1)
router-tests/testdata/connectrpc/README.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: build-router
  • GitHub Check: build_push_image
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: image_scan
  • GitHub Check: integration_test (./events)
  • GitHub Check: image_scan (nonroot)
  • GitHub Check: build_test
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (18)
protographic/src/operation-to-proto.ts (1)

267-279: PascalCase validation with suggested rename looks solid.

The new PascalCase check and use of upperFirst(camelCase(operationName)) for suggestions give clear, actionable errors while preserving a strict 1:1 mapping from GraphQL operation names to RPC method names (when no type prefix is used). No issues from my side.

router/pkg/connectrpc/error_handling_test.go (1)

16-267: Comprehensive error/success coverage for executeGraphQL.

This test suite does a good job exercising HTTP transport failures, full GraphQL failures, partial-success cases (with both graphql-errors and graphql-partial-data metadata), and successful responses. The setup via MockHTTPClient + NewRPCHandler is clean and keeps the tests focused on the translation logic.

router/core/router_config.go (1)

14-15: ConnectRPC wiring into core config and usage looks consistent.

Threading connectRPCServer and connectRPC through Config, and exposing usage["connect_rpc"] based on Enabled, matches existing patterns (e.g. MCP, metrics, plugins) and should integrate cleanly with existing config loading and telemetry.

Also applies to: 111-111, 143-143, 333-334

router-tests/testdata/connectrpc/buf.gen.yaml (1)

1-15: Buf v2 config is minimal and uses local plugins as desired.

Managed mode with a go_package_prefix and local protoc-gen-go / protoc-gen-connect-go plugins targeting client/ is exactly what you want for regenerating the test clients without relying on remote plugins or rate limits.

router/pkg/connectrpc/test_helpers.go (2)

25-41: LGTM!

The shared proto loader pattern is well-implemented with proper synchronization and test helper semantics. This prevents registration conflicts across tests while maintaining test isolation.


64-85: LGTM!

The mock HTTP client implementation is suitable for unit testing. The mockRoundTripper provides predictable responses for testing different status codes and response bodies.

router/pkg/connectrpc/operation_registry.go (2)

138-164: LGTM!

The operation extraction logic correctly parses GraphQL documents and maps AST operation types to string representations. The operation type strings align with the standard types used elsewhere in the router.


200-215: LGTM!

The getter methods implement proper thread-safety with read locks and defensive copying. Line 211 correctly dereferences the pointer to create a copy, preventing external modification of the registry's internal state.
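
The pattern being praised can be sketched as follows. The registry and operation types are reduced stand-ins (the real code uses schemaloader.Operation), and the method name get is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// operation is a simplified stand-in for schemaloader.Operation.
type operation struct {
	Name  string
	Query string
}

// registry is a hypothetical, reduced version of a service-scoped registry.
type registry struct {
	mu         sync.RWMutex
	operations map[string]map[string]*operation
}

// get returns a copy of the stored operation so callers cannot mutate
// the registry's internal state through the returned pointer.
func (r *registry) get(serviceName, opName string) *operation {
	r.mu.RLock()
	defer r.mu.RUnlock()
	// Indexing a missing (nil) inner map is safe and reports ok == false.
	op, ok := r.operations[serviceName][opName]
	if !ok || op == nil {
		return nil
	}
	clone := *op // shallow copy of the struct value
	return &clone
}

func main() {
	r := &registry{operations: map[string]map[string]*operation{
		"employee.v1": {"GetEmployees": {Name: "GetEmployees", Query: "query GetEmployees { employees { id } }"}},
	}}
	got := r.get("employee.v1", "GetEmployees")
	got.Name = "mutated by caller"
	fmt.Println(r.get("employee.v1", "GetEmployees").Name)
}
```

The copy is shallow, so it protects scalar fields only; any slice or map field on the real type would still be shared with the registry.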

router/core/router.go (3)

960-1034: LGTM!

The ConnectRPC bootstrap flow follows the established MCP server pattern with proper:

  • Configuration validation and logging
  • Storage provider resolution
  • Service discovery
  • GraphQL endpoint determination
  • Server initialization and startup
  • Error handling at each step

The implementation is consistent with the existing codebase patterns.


1551-1557: LGTM!

The ConnectRPC server shutdown is properly integrated into the concurrent shutdown flow using wg.Go(). This ensures graceful shutdown alongside other components while collecting any errors.


2208-2212: LGTM!

The WithConnectRPC option follows the established router option pattern and properly sets the configuration field.

router/pkg/connectrpc/validator.go (1)

54-62: LGTM!

The debug logging now uses structured zap logging instead of fmt.Printf statements, which is appropriate for production code. The nil check for the logger is a good defensive practice.

router/pkg/connectrpc/server.go (2)

50-88: LGTM!

The constructor properly validates required configuration, sets reasonable defaults, and creates a production-ready HTTP client with retry support. The error messages are clear and actionable.


293-295: LGTM!

The Reload method now properly wraps the new handler with h2c.NewHandler to maintain HTTP/2 over cleartext support for gRPC compatibility. This addresses the previous issue where the h2c wrapper was missing after reload.

router/pkg/connectrpc/service_discovery.go (2)

46-153: LGTM!

The service discovery implementation follows a clear convention-based approach with:

  • Proper validation of inputs and directory existence
  • Uniqueness enforcement for package.service combinations
  • Clear error messages for misconfiguration
  • Appropriate logging at each stage
  • Correct use of filepath.SkipDir to prevent duplicate service detection

The flow is well-structured and handles edge cases appropriately.
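
As an illustration of the filepath.SkipDir point, here is a minimal sketch using the equivalent io/fs API over an in-memory tree. The function name discoverServiceDirs, the sampleTree helper, and every path in it are assumptions made for the example; the PR's own discovery logic also validates packages and service names, which this sketch omits.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// discoverServiceDirs walks fsys and records each directory in which a
// .proto file is found. Returning fs.SkipDir from a file callback stops the
// walk of that file's directory, so entries that sort after the .proto file
// (including nested directories) are not visited.
func discoverServiceDirs(fsys fs.FS, root string) ([]string, error) {
	var dirs []string
	err := fs.WalkDir(fsys, root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || path.Ext(p) != ".proto" {
			return nil
		}
		dirs = append(dirs, path.Dir(p))
		return fs.SkipDir
	})
	return dirs, err
}

// sampleTree builds an in-memory layout; all names are example values.
func sampleTree() fs.FS {
	return fstest.MapFS{
		"services/employee.v1/operations/GetEmployees.graphql": {Data: []byte("query GetEmployees { employees { id } }")},
		"services/employee.v1/service.proto":                   {Data: []byte(`syntax = "proto3";`)},
		"services/employee.v1/vendor/dep.proto":                {Data: []byte(`syntax = "proto3";`)},
		"services/product.v1/service.proto":                    {Data: []byte(`syntax = "proto3";`)},
	}
}

func main() {
	dirs, err := discoverServiceDirs(sampleTree(), "services")
	fmt.Println(dirs, err)
}
```

Entries are visited in lexical order, so a subdirectory that sorts before the .proto file (operations/ here) is still walked, while vendor/ is skipped and its dep.proto is not reported as a second service.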


177-217: LGTM!

The service info extraction enforces clear conventions:

  • One proto file per service directory (prevents ambiguity)
  • Required package and service declarations
  • Clear error messages when conventions are violated

This makes the discovery predictable and easy to debug.

router/pkg/httputil/headers.go (1)

6-27: No changes needed. Cookie handling is already addressed through the CookieWhitelist middleware in router/internal/middleware/cookie_filter.go, which provides configurable whitelisting and safelisting of cookies. Authorization headers are intentionally forwarded (not in the skip list) and are tested in handler_test.go. The SkippedHeaders map correctly defines hop-by-hop and connection-specific headers that should not cross service boundaries, in accordance with HTTP specifications.

Likely an incorrect or invalid review comment.
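
For readers unfamiliar with the skip-list approach, a minimal sketch follows. The skipped map is an illustrative subset of hop-by-hop headers and may differ from the actual SkippedHeaders map; forwardHeaders and demo are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
)

// skipped is an illustrative subset of hop-by-hop headers; the real
// SkippedHeaders map in router/pkg/httputil/headers.go may differ.
var skipped = map[string]struct{}{
	"Connection":        {},
	"Keep-Alive":        {},
	"Transfer-Encoding": {},
	"Upgrade":           {},
}

// forwardHeaders copies every header from src to dst except the skipped ones.
func forwardHeaders(dst, src http.Header) {
	for name, values := range src {
		if _, skip := skipped[http.CanonicalHeaderKey(name)]; skip {
			continue
		}
		for _, v := range values {
			dst.Add(name, v)
		}
	}
}

// demo forwards a small example header set and returns the result.
func demo() http.Header {
	in := http.Header{}
	in.Set("Authorization", "Bearer token")
	in.Set("Connection", "keep-alive")
	out := http.Header{}
	forwardHeaders(out, in)
	return out
}

func main() {
	fmt.Println(demo())
}
```

In this sketch Authorization passes through because it is not in the skip list, which matches the forwarding behavior described above, while Connection is dropped.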

router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)

1-307: Generated Connect bindings look standard and need no manual changes.

This file is generator output from protoc-gen-connect-go and matches the usual client/handler patterns for Connect. It’s appropriate to leave it as‑is and rely on regenerating if the proto changes.

Comment on lines +96 to +98
if opRegistry.operations[serviceName] == nil {
opRegistry.operations[serviceName] = make(map[string]*schemaloader.Operation)
}

🛠️ Refactor suggestion | 🟠 Major

Direct access to internal operations map breaks encapsulation.

Lines 96-98 directly manipulate the unexported operations field of OperationRegistry:

if opRegistry.operations[serviceName] == nil {
    opRegistry.operations[serviceName] = make(map[string]*schemaloader.Operation)
}
opRegistry.operations[serviceName]["GetEmployeeById"] = &schemaloader.Operation{...}

This tightly couples tests to internal implementation details. Consider adding a test-friendly public method to OperationRegistry:

// In operation_registry.go:
func (r *OperationRegistry) RegisterOperationForTesting(serviceName, opName string, op *schemaloader.Operation) {
    r.mu.Lock()
    defer r.mu.Unlock()
    if r.operations[serviceName] == nil {
        r.operations[serviceName] = make(map[string]*schemaloader.Operation)
    }
    r.operations[serviceName][opName] = op
}
🤖 Prompt for AI Agents
In router/pkg/connectrpc/test_helpers.go around lines 96 to 98, the test
directly mutates the unexported opRegistry.operations map which breaks
encapsulation; add a new exported helper method on OperationRegistry (e.g.,
RegisterOperationForTesting(serviceName, opName string, op
*schemaloader.Operation)) in operation_registry.go that takes the registry lock,
initializes the service map if nil, and sets the operation entry, then update
this test to call that new method instead of touching opRegistry.operations
directly.

@asoorm asoorm force-pushed the ahmet/eng-8277-connect-rpc-handler-final branch 2 times, most recently from 8b33d1d to 91096a7 Compare December 3, 2025 15:48
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (12)
router/pkg/connectrpc/constructor_validation_test.go (3)

103-114: Tests use relative paths that may fail when CI runs from the router directory.

The paths "samples/services" at lines 103, 114 are relative to the test file location, but CI runs tests from the router/ directory. These paths should be "pkg/connectrpc/samples/services" to work correctly from the router module root.


126-129: Path issue persists but error handling is now correct.

The LoadFromDirectory call properly checks and propagates errors. However, the path "samples/services/employee.v1" still needs to be adjusted to "pkg/connectrpc/samples/services/employee.v1" for CI compatibility.


190-210: Default behavior tests are well-structured; paths need adjustment.

The tests correctly verify endpoint normalization (adding http:// scheme) and default listen address (0.0.0.0:5026). The "samples/services" paths at lines 192 and 203 need the same adjustment as noted above.

router/pkg/connectrpc/vanguard_service_test.go (2)

20-31: Relative path used for proto loading will fail in CI.

GetSharedProtoLoader(t, "samples/services/employee.v1") uses a path relative to the test file. When CI runs from the router/ directory, this path should be "pkg/connectrpc/samples/services/employee.v1".


254-261: Direct access to internal operations map breaks encapsulation.

Lines 254-261 directly manipulate the unexported operations field of OperationRegistry. Consider using a test-friendly public method like RegisterOperationForTesting to avoid coupling tests to internal implementation details.

router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1)

53-78: Concurrent start test could verify expected failure behavior.

The test asserts "at least one should succeed" but doesn't verify that the failures are due to port conflicts. Consider adding an assertion that verifies failures contain "address already in use" to confirm the expected concurrent behavior:

failureCount := 0
for _, err := range errors {
    if err != nil {
        failureCount++
        // Optionally verify it's a port conflict
        assert.Contains(t, err.Error(), "address already in use")
    }
}
assert.Equal(t, 2, failureCount, "concurrent starts should fail with port conflict")
router/pkg/connectrpc/test_helpers.go (1)

96-103: Direct access to internal operations map breaks encapsulation.

Lines 96-98 directly manipulate the unexported operations field. Add a test-friendly method to OperationRegistry:

// In operation_registry.go:
func (r *OperationRegistry) RegisterOperationForTesting(serviceName, opName string, op *schemaloader.Operation) {
    r.mu.Lock()
    defer r.mu.Unlock()
    if r.operations[serviceName] == nil {
        r.operations[serviceName] = make(map[string]*schemaloader.Operation)
    }
    r.operations[serviceName][opName] = op
}
router/pkg/connectrpc/server.go (2)

250-252: Remove redundant Clear() that can corrupt state for in-flight requests.

The initializeComponents() at line 304 creates a completely new OperationRegistry, making this Clear() call redundant. Worse, it modifies the live registry still referenced by the old handler chain (transcoder → vanguardService → rpcHandler → operationRegistry), which can cause in-flight requests to fail with empty operation lookups.

 	// Create a fresh proto loader and clear operation registry before initializing components
 	// This ensures initializeComponents() (and the RPCHandler it constructs) receives the fresh ProtoLoader
 	s.protoLoader = NewProtoLoader(s.logger)
-	if s.operationRegistry != nil {
-		s.operationRegistry.Clear()
-	}
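
The hazard can be illustrated with a small sketch in which a reload publishes a new registry instead of clearing the old one. This is not the PR's code: the registry and server types are reduced stand-ins, and the use of atomic.Pointer is an assumption chosen to keep the example short.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// registry is a hypothetical operation registry that is not mutated
// after it has been built.
type registry struct {
	ops map[string]string // operation name -> GraphQL document
}

// server is a reduced stand-in for the ConnectRPC server.
type server struct {
	current atomic.Pointer[registry]
}

// reload builds a fresh registry and publishes it in one atomic step.
// The previous registry is never cleared, so a request that already
// holds a pointer to it keeps seeing a complete set of operations.
func (s *server) reload(ops map[string]string) {
	s.current.Store(&registry{ops: ops})
}

// lookup resolves an operation against the registry snapshot it is given.
func lookup(r *registry, name string) (string, bool) {
	doc, ok := r.ops[name]
	return doc, ok
}

func main() {
	s := &server{}
	s.reload(map[string]string{"GetEmployees": "query GetEmployees { employees { id } }"})

	inFlight := s.current.Load() // snapshot held by a request that started before the reload
	s.reload(map[string]string{"GetEmployeeById": "query GetEmployeeById($id: Int!) { employee(id: $id) { id } }"})

	_, okOld := lookup(inFlight, "GetEmployees")
	_, okNew := lookup(s.current.Load(), "GetEmployeeById")
	fmt.Println(okOld, okNew)
}
```

Had reload cleared the old registry in place, the in-flight lookup would have found nothing, which is the failure mode the comment describes.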

293-299: Guard Reload against being called before Start.

If Reload() is called before Start(), s.httpServer will be nil and this will panic. Add a guard for defensive coding:

+	if s.httpServer == nil {
+		return fmt.Errorf("server is not started; call Start before Reload")
+	}
+
 	// Update HTTP server handler with h2c wrapper for gRPC compatibility
 	handler := s.createHandler()
 	s.httpServer.Handler = h2c.NewHandler(handler, &http2.Server{})
router/pkg/connectrpc/proto_loader.go (2)

208-236: Import path configuration may break common proto layouts.

The parser is configured with ImportPaths: []string{dir} where dir is filepath.Dir(path). This means imports relative to a service root (e.g., import "common/types.proto") won't resolve if types.proto lives in a sibling directory.

Consider using the service root directory as the import path and passing relative paths to ParseFiles:

-	// Get the directory containing the proto file for import resolution
-	dir := filepath.Dir(path)
-	filename := filepath.Base(path)
+	// Use the directory containing the proto file as import path
+	// Note: This may fail for complex import layouts where imports reference
+	// sibling directories. Consider using the service root as import path
+	// and relative filenames in such cases.
+	dir := filepath.Dir(path)
+	filename := filepath.Base(path)

244-251: Duplicate proto file basenames can silently skip services.

The file path check at line 245 uses protoFd.Path() which returns the filename passed to ParseFiles (the basename). If two service directories each have a service.proto file, the second one will be treated as "already registered" and its services won't be extracted.

Consider either:

  1. Using relative paths from the root directory so paths are unique
  2. Failing fast with an error instead of silently skipping
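
Both options can be combined, as in this minimal sketch. The function registerProtoFiles is hypothetical and only models the bookkeeping; it does not call the proto parser, and the directory names are example values.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// registerProtoFiles is a hypothetical sketch: it keys each file by its path
// relative to root (so two service.proto files in different directories stay
// distinct) and returns an error on a genuine duplicate instead of skipping it.
func registerProtoFiles(root string, files []string) (map[string]string, error) {
	registered := make(map[string]string, len(files))
	for _, f := range files {
		rel, err := filepath.Rel(root, f)
		if err != nil {
			return nil, fmt.Errorf("resolve %s relative to %s: %w", f, root, err)
		}
		key := filepath.ToSlash(rel)
		if prev, dup := registered[key]; dup {
			return nil, fmt.Errorf("proto file %q already registered from %s", key, prev)
		}
		registered[key] = f
	}
	return registered, nil
}

func main() {
	reg, err := registerProtoFiles("services", []string{
		"services/employee.v1/service.proto",
		"services/product.v1/service.proto",
	})
	fmt.Println(len(reg), err)
}
```

With basename keys both files above would collapse to service.proto; with root-relative keys they stay separate, and a real duplicate surfaces as an error.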
router/pkg/connectrpc/handler.go (1)

495-507: Empty object {} incorrectly treated as "no data".

Line 498 treats {} as having no data, but an empty object is valid GraphQL data. When a query returns {"data": {}, "errors": [...]}, this represents a partial success where some fields errored but the query structure was valid. The current logic incorrectly classifies this as CRITICAL.

Apply this diff to fix the classification:

-	hasData := len(graphqlResponse.Data) > 0 && string(graphqlResponse.Data) != "null" && string(graphqlResponse.Data) != "{}"
+	hasData := len(graphqlResponse.Data) > 0 && string(graphqlResponse.Data) != "null"

With this change, responses with {} and errors will correctly be classified as PARTIAL rather than CRITICAL.
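
The proposed rule can be checked in isolation with a small sketch. The graphQLResponse type is a reduced stand-in for the handler's response struct, and the returned strings are illustrative labels rather than the PR's classification constants.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// graphQLResponse is a reduced stand-in for the handler's response type.
type graphQLResponse struct {
	Data   json.RawMessage   `json:"data"`
	Errors []json.RawMessage `json:"errors"`
}

// classify mirrors the proposed rule: any non-null data alongside errors is
// PARTIAL, errors with null or missing data are CRITICAL. The returned
// strings are illustrative labels, not the PR's constants.
func classify(body []byte) (string, error) {
	var resp graphQLResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", fmt.Errorf("decode GraphQL response: %w", err)
	}
	data := bytes.TrimSpace(resp.Data)
	hasData := len(data) > 0 && !bytes.Equal(data, []byte("null"))
	switch {
	case len(resp.Errors) == 0:
		return "OK", nil
	case hasData:
		return "PARTIAL", nil
	default:
		return "CRITICAL", nil
	}
}

func main() {
	for _, body := range []string{
		`{"data":{},"errors":[{"message":"field failed"}]}`,
		`{"data":null,"errors":[{"message":"query failed"}]}`,
	} {
		class, err := classify([]byte(body))
		fmt.Println(class, err)
	}
}
```

Under this rule an empty object next to errors is treated like any other non-null data, so only null or absent data yields the critical classification.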

🧹 Nitpick comments (13)
protographic/src/operation-to-proto.ts (1)

294-296: LGTM! Direct methodName assignment simplifies logic.

Using the operation name directly is correct now that PascalCase validation ensures proper casing. This removes unnecessary transformation logic.

Minor: The comment numbering has duplicate step "4" (here and line 304). Consider renumbering for clarity:

-    // 4. Create method name from operation name
+    // 4. Create method name from operation name (validated PascalCase)
     // Use operation name as-is to ensure exact matching (no transformation)
     let methodName = operationName;

And update line 304 to step 5.

router/pkg/connectrpc/service_discovery_test.go (1)

178-216: Consider asserting on OperationFiles in the discovery result, not just findOperationFiles

The “discovers operations in subdirectories of service” test currently validates findOperationFiles(serviceDir) directly. To fully exercise the DiscoverServices contract, it would be more robust to also (or instead) assert that services[0].OperationFiles contains the expected 2 operations, so the test guards the wiring between discovery and operation collection rather than only the helper.

router/pkg/connectrpc/error_handling_test.go (1)

16-267: End-to-end error classifications and metadata coverage look solid

The table-driven TestErrorHandling cases do a good job of exercising HTTP transport errors, pure GraphQL errors, and partial-success scenarios through executeGraphQL, including checking Connect codes, messages, and the key metadata entries (http-status, error-classification, graphql-errors, graphql-partial-data). That’s an appropriate level of coverage for this logic instead of unit-testing tiny helpers. As a minor enhancement, you could also add one or two negative checks (e.g., assert that MetaKeyGraphQLPartialData is empty when only errors+null data are returned) to make regressions on when partial-data metadata is emitted more obvious, but the current tests are already quite strong.

router/pkg/connectrpc/test_helpers.go (1)

55-59: Ignoring Write error in mock server.

The return value from w.Write([]byte(server.ResponseBody)) is discarded. A write is unlikely to fail in tests, but consider either handling the error or discarding it explicitly:

-		w.Write([]byte(server.ResponseBody))
+		_, _ = w.Write([]byte(server.ResponseBody))

The explicit discard (_, _) makes it clear the error is intentionally ignored.

router/pkg/connectrpc/operation_registry.go (2)

57-76: Lock held during file I/O operations.

The mutex is acquired at line 57 and held during os.ReadFile operations (line 70). For large numbers of operation files, this could block concurrent readers. Consider loading files outside the lock and only acquiring it for map updates:

// Load files first (outside lock)
type loadedOp struct {
    filePath string
    content  []byte
}
var loaded []loadedOp
for _, filePath := range operationFiles {
    content, err := os.ReadFile(filePath)
    if err != nil {
        r.logger.Warn("failed to read operation file", ...)
        continue
    }
    loaded = append(loaded, loadedOp{filePath, content})
}

// Then acquire lock for map updates
r.mu.Lock()
defer r.mu.Unlock()
// ... process loaded files

This is a minor optimization since this is likely called at startup, but worth noting for scalability.


137-164: Only extracts first operation from document.

The function returns after finding the first OperationDefinition, silently ignoring any subsequent operations. If each file should contain exactly one operation, consider adding a warning when multiple operations are found:

operationCount := 0
for _, ref := range doc.RootNodes {
    if ref.Kind == ast.NodeKindOperationDefinition {
        operationCount++
    }
}
if operationCount > 1 {
    // Log warning or return error
}

router/core/router.go (1)

960-1034: ConnectRPC bootstrap logic is well-structured and follows the MCP pattern.

The implementation correctly:

  1. Validates the services provider exists before proceeding
  2. Discovers services using the convention-based approach
  3. Determines the GraphQL endpoint with appropriate fallback
  4. Creates and starts the server with proper error handling

However, there's a potential issue with the GraphQL endpoint construction:

-		routerGraphQLEndpoint = path.Join(r.listenAddr, r.graphqlPath)
+		routerGraphQLEndpoint = "http://" + r.listenAddr + r.graphqlPath

Using path.Join is incorrect for URL construction: it cleans the result like a slash-separated file path (collapsing "//" into "/") and never adds a scheme, so the joined value is not a usable URL. The MCP bootstrap (line 941) has the same pattern. The ConnectRPC server explicitly adds http:// if missing (lines 73-75 in server.go), but it's cleaner to construct the endpoint correctly here.
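
A small sketch of the difference, using only the standard library. The `graphQLEndpoint` helper is hypothetical and not part of the PR; the address and path values are the ones from the manual-testing config.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// graphQLEndpoint builds an absolute HTTP URL from a host:port and a path.
// Hypothetical helper, shown only to contrast with path.Join.
func graphQLEndpoint(listenAddr, graphqlPath string) string {
	u := url.URL{Scheme: "http", Host: listenAddr, Path: graphqlPath}
	return u.String()
}

func main() {
	// path.Join cleans its result, so no scheme is added...
	fmt.Println(path.Join("localhost:3002", "/graphql")) // localhost:3002/graphql
	// ...and an existing "//" after the scheme is collapsed to "/".
	fmt.Println(path.Join("http://localhost:3002", "/graphql")) // http:/localhost:3002/graphql
	// Building the URL from its parts keeps scheme, host, and path intact.
	fmt.Println(graphQLEndpoint("localhost:3002", "/graphql")) // http://localhost:3002/graphql
}
```

Plain string concatenation, as in the diff above, gives the same result for well-formed inputs; url.URL is just one way to make the intent explicit.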

router/pkg/connectrpc/server.go (1)

77-81: Consider making retry configuration configurable.

The retryablehttp.NewClient() uses default retry settings. For production use, you may want to expose configuration for:

  • Max retries
  • Retry wait time
  • Backoff strategy

This can be deferred to a future iteration since the defaults are reasonable.

router/pkg/connectrpc/proto_loader.go (1)

324-328: Consider returning a copy to prevent external mutation.

The GetServices() method returns the internal map directly. While the doc comment says to treat it as read-only, callers could accidentally mutate it. Consider returning a shallow copy:

 // GetServices returns all loaded service definitions.
 // The returned map should be treated as read-only to prevent accidental mutation.
 func (pl *ProtoLoader) GetServices() map[string]*ServiceDefinition {
-	return pl.services
+	result := make(map[string]*ServiceDefinition, len(pl.services))
+	for k, v := range pl.services {
+		result[k] = v
+	}
+	return result
 }
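
As an alternative to the manual loop, a shallow copy can also be made with maps.Clone from the standard library (Go 1.21+). The sketch below uses hypothetical lower-case stand-ins for ProtoLoader and ServiceDefinition; note that maps.Clone returns nil for a nil map, whereas the loop above always returns a non-nil map.

```go
package main

import (
	"fmt"
	"maps"
)

// serviceDefinition and protoLoader are hypothetical stand-ins for the PR's types.
type serviceDefinition struct{ FullName string }

type protoLoader struct {
	services map[string]*serviceDefinition
}

// getServices returns a shallow copy: callers can add or delete keys without
// touching the loader's map, but the pointed-to definitions are still shared.
func (pl *protoLoader) getServices() map[string]*serviceDefinition {
	return maps.Clone(pl.services)
}

func main() {
	pl := &protoLoader{services: map[string]*serviceDefinition{
		"employee.v1.EmployeeService": {FullName: "employee.v1.EmployeeService"},
	}}
	snapshot := pl.getServices()
	delete(snapshot, "employee.v1.EmployeeService")
	fmt.Println(len(snapshot), len(pl.services))
}
```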

router/pkg/connectrpc/service_discovery.go (1)

219-253: Proto parsing could be fooled by comments or strings.

The extractPackageFromProto and extractServiceNameFromProto functions use simple line-by-line text matching. This could incorrectly match:

  • Commented-out declarations: // package foo;
  • Strings containing declarations: option foo = "package bar;";

For robustness, consider:

  1. Stripping single-line comments (//...) before matching
  2. Using the proto parser for extraction (though this adds complexity)

Given this is convention-based discovery and proto files are expected to follow standard patterns, this is a minor concern.

 // extractPackageFromProto extracts the package name from proto file content
 func extractPackageFromProto(content string) string {
 	lines := strings.SplitSeq(content, "\n")
 	for line := range lines {
 		line = strings.TrimSpace(line)
+		// Skip comments
+		if strings.HasPrefix(line, "//") || strings.HasPrefix(line, "/*") {
+			continue
+		}
 		if after, ok := strings.CutPrefix(line, "package "); ok {
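
The prefix check in the diff above only skips lines that start with a comment marker, so the body of a multi-line block comment would still be scanned. A slightly fuller sketch strips comments first and then matches. Both function names are hypothetical, strings.Split is used instead of strings.SplitSeq only to keep the example portable, and string literals are deliberately not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// stripComments removes // line comments and /* */ block comments.
// It does not understand string literals, an accepted limitation of this sketch.
func stripComments(src string) string {
	var b strings.Builder
	for i := 0; i < len(src); {
		switch {
		case strings.HasPrefix(src[i:], "//"):
			// Skip to the end of the line, keeping the newline itself.
			end := strings.IndexByte(src[i:], '\n')
			if end < 0 {
				return b.String()
			}
			i += end
		case strings.HasPrefix(src[i:], "/*"):
			// Skip to the closing marker; an unterminated comment ends the input.
			end := strings.Index(src[i+2:], "*/")
			if end < 0 {
				return b.String()
			}
			i += 2 + end + 2
			b.WriteByte(' ') // a comment behaves like whitespace
		default:
			b.WriteByte(src[i])
			i++
		}
	}
	return b.String()
}

// extractPackage returns the proto package name, or "" if none is declared.
func extractPackage(content string) string {
	for _, line := range strings.Split(stripComments(content), "\n") {
		line = strings.TrimSpace(line)
		if after, ok := strings.CutPrefix(line, "package "); ok {
			return strings.TrimSpace(strings.TrimSuffix(strings.TrimSpace(after), ";"))
		}
	}
	return ""
}

func main() {
	src := "// package wrong;\n/* package also.wrong;\n*/\nsyntax = \"proto3\";\npackage employee.v1; // trailing\n"
	fmt.Println(extractPackage(src))
}
```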

router/pkg/connectrpc/vanguard_service.go (2)

319-326: Non-deterministic service name ordering.

GetServiceNames() iterates over a map, which has non-deterministic ordering in Go. If callers expect consistent ordering (e.g., for logging or testing), consider sorting the result:

 // GetServiceNames returns the names of all registered services
 func (vs *VanguardService) GetServiceNames() []string {
 	names := make([]string, 0, len(vs.services))
 	for serviceName := range vs.protoLoader.GetServices() {
 		names = append(names, serviceName)
 	}
+	sort.Strings(names)
 	return names
 }

374-381: Potential duplicate file descriptors in result.

GetFileDescriptors iterates over services and appends each service's file descriptor. If multiple services share the same file descriptor (defined in the same proto file), duplicates will be returned. Consider deduplicating:

 // GetFileDescriptors returns all file descriptors from the proto loader
 func (vs *VanguardService) GetFileDescriptors() []protoreflect.FileDescriptor {
-	descriptors := make([]protoreflect.FileDescriptor, 0)
+	seen := make(map[string]bool)
+	descriptors := make([]protoreflect.FileDescriptor, 0)
 	for _, service := range vs.protoLoader.GetServices() {
+		path := string(service.FileDescriptor.Path())
+		if seen[path] {
+			continue
+		}
+		seen[path] = true
 		descriptors = append(descriptors, service.FileDescriptor)
 	}
 	return descriptors
 }

router/pkg/connectrpc/handler.go (1)

94-118: Consider consolidating duplicate GraphQL types.

The GraphQLError and GraphQLResponse types are duplicated in router/pkg/mcpserver/server.go. Consider moving these to a shared package (e.g., router/pkg/graphql/types.go) to reduce duplication and ensure consistency across the codebase. Note that the version here is more complete with Path, Locations, and Extensions fields.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7794a2b and 91096a7.

⛔ Files ignored due to path filters (1)
  • router-tests/testdata/connectrpc/client/employee.v1/service.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (27)
  • protographic/src/operation-to-proto.ts (2 hunks)
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1 hunks)
  • router-tests/testdata/connectrpc/README.md (1 hunks)
  • router-tests/testdata/connectrpc/buf.gen.yaml (1 hunks)
  • router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1 hunks)
  • router/core/router.go (5 hunks)
  • router/core/router_config.go (4 hunks)
  • router/pkg/config/config.go (2 hunks)
  • router/pkg/config/config.schema.json (1 hunks)
  • router/pkg/config/connectrpc_test.go (1 hunks)
  • router/pkg/connectrpc/constructor_validation_test.go (1 hunks)
  • router/pkg/connectrpc/error_handling_test.go (1 hunks)
  • router/pkg/connectrpc/handler.go (1 hunks)
  • router/pkg/connectrpc/handler_test.go (1 hunks)
  • router/pkg/connectrpc/operation_registry.go (1 hunks)
  • router/pkg/connectrpc/operation_registry_test.go (1 hunks)
  • router/pkg/connectrpc/proto_loader.go (1 hunks)
  • router/pkg/connectrpc/server.go (1 hunks)
  • router/pkg/connectrpc/server_test.go (1 hunks)
  • router/pkg/connectrpc/service_discovery.go (1 hunks)
  • router/pkg/connectrpc/service_discovery_test.go (1 hunks)
  • router/pkg/connectrpc/test_helpers.go (1 hunks)
  • router/pkg/connectrpc/validator.go (1 hunks)
  • router/pkg/connectrpc/vanguard_service.go (1 hunks)
  • router/pkg/connectrpc/vanguard_service_test.go (1 hunks)
  • router/pkg/httputil/headers.go (1 hunks)
  • router/pkg/mcpserver/server.go (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • router/pkg/connectrpc/operation_registry_test.go
🚧 Files skipped from review as they are similar to previous changes (7)
  • router/core/router_config.go
  • router/pkg/connectrpc/handler_test.go
  • router/pkg/connectrpc/server_test.go
  • router/pkg/connectrpc/validator.go
  • router-tests/testdata/connectrpc/buf.gen.yaml
  • router/pkg/config/connectrpc_test.go
  • router/pkg/httputil/headers.go
🧰 Additional context used
🧠 Learnings (18)
📚 Learning: 2025-10-01T20:39:16.113Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2252
File: router-tests/telemetry/telemetry_test.go:9684-9693
Timestamp: 2025-10-01T20:39:16.113Z
Learning: Repo preference: In router-tests/telemetry/telemetry_test.go, keep strict > 0 assertions for request.operation.*Time (parsingTime, normalizationTime, validationTime, planningTime) in telemetry-related tests; do not relax to >= 0 unless CI flakiness is observed.

Applied to files:

  • router/pkg/connectrpc/constructor_validation_test.go
  • router/pkg/connectrpc/service_discovery_test.go
  • router/pkg/connectrpc/test_helpers.go
  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go
  • router/pkg/connectrpc/error_handling_test.go
📚 Learning: 2025-08-20T22:13:25.222Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2157
File: router-tests/go.mod:16-16
Timestamp: 2025-08-20T22:13:25.222Z
Learning: github.com/mark3labs/mcp-go v0.38.0 has regressions and should not be used in the wundergraph/cosmo project. v0.36.0 is the stable version that should be used across router-tests and other modules.

Applied to files:

  • router/pkg/connectrpc/constructor_validation_test.go
  • router/pkg/mcpserver/server.go
  • router/core/router.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.23+ minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/pkg/mcpserver/server.go
  • router/core/router.go
📚 Learning: 2025-11-19T15:13:57.821Z
Learnt from: dkorittki
Repo: wundergraph/cosmo PR: 2273
File: router/core/graphql_handler.go:0-0
Timestamp: 2025-11-19T15:13:57.821Z
Learning: In the Cosmo router (wundergraph/cosmo), error handling follows a two-phase pattern: (1) Prehandler phase handles request parsing, validation, and setup errors using `httpGraphqlError` and `writeOperationError` (in files like graphql_prehandler.go, operation_processor.go, parse_multipart.go, batch.go); (2) Execution phase handles resolver execution errors using `WriteError` in GraphQLHandler.ServeHTTP. Because all `httpGraphqlError` instances are caught in the prehandler before ServeHTTP is invoked, any error type checks for `httpGraphqlError` in the execution-phase WriteError method are unreachable code.

Applied to files:

  • router/pkg/mcpserver/server.go
  • router/pkg/connectrpc/test_helpers.go
  • router/pkg/connectrpc/vanguard_service.go
  • router/pkg/connectrpc/vanguard_service_test.go
  • router/pkg/connectrpc/handler.go
  • router/pkg/connectrpc/error_handling_test.go
📚 Learning: 2025-09-24T12:54:00.765Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2222
File: router-tests/websocket_test.go:2238-2302
Timestamp: 2025-09-24T12:54:00.765Z
Learning: The wundergraph/cosmo project uses Go 1.25 (Go 1.25 minimum), so fmt.Appendf and other newer Go standard library functions are available and can be used without compatibility concerns.

Applied to files:

  • router/pkg/mcpserver/server.go
  • router/core/router.go
📚 Learning: 2025-09-17T20:55:39.456Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2172
File: router/core/graph_server.go:0-0
Timestamp: 2025-09-17T20:55:39.456Z
Learning: The Initialize method in router/internal/retrytransport/manager.go has been updated to properly handle feature-flag-only subgraphs by collecting subgraphs from both routerConfig.GetSubgraphs() and routerConfig.FeatureFlagConfigs.ConfigByFeatureFlagName, ensuring all subgraphs receive retry configuration.

Applied to files:

  • router/core/router.go
📚 Learning: 2025-07-21T15:06:36.664Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/config/config.schema.json:1637-1644
Timestamp: 2025-07-21T15:06:36.664Z
Learning: In the Cosmo router project, when extending JSON schema validation for security-sensitive fields like JWKS secrets, backwards compatibility is maintained by implementing warnings in the Go code rather than hard validation constraints in the schema. This allows existing configurations to continue working while alerting users to potential security issues.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-07-21T14:46:34.879Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2067
File: router/pkg/authentication/jwks_token_decoder.go:80-106
Timestamp: 2025-07-21T14:46:34.879Z
Learning: In the Cosmo router project, required field validation for JWKS configuration (Secret, Algorithm, KeyId) is handled at the JSON schema level in config.schema.json rather than through runtime validation in the Go code at router/pkg/authentication/jwks_token_decoder.go.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-06-30T20:39:02.387Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 1929
File: router/internal/circuit/manager.go:16-25
Timestamp: 2025-06-30T20:39:02.387Z
Learning: In the Cosmo router project, parameter validation for circuit breaker configuration is handled at the JSON schema level rather than through runtime validation methods on structs. The config.schema.json file contains comprehensive validation constraints for circuit breaker parameters.

Applied to files:

  • router/pkg/config/config.schema.json
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: In the Cosmo router codebase, JSON schema validation prevents null values in TrafficShapingRules subgraph configurations, making nil checks unnecessary when dereferencing subgraph rule pointers in NewSubgraphTransportOptions.

Applied to files:

  • router/pkg/config/config.schema.json
  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-08-20T10:08:17.857Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2155
File: router/core/router.go:1857-1866
Timestamp: 2025-08-20T10:08:17.857Z
Learning: router/pkg/config/config.schema.json forbids null values for traffic_shaping.subgraphs: additionalProperties references $defs.traffic_shaping_subgraph_request_rule with type "object". Therefore, in core.NewSubgraphTransportOptions, dereferencing each subgraph rule pointer is safe under schema-validated configs, and a nil-check is unnecessary.

Applied to files:

  • router/pkg/config/config.schema.json
  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-07-30T09:29:46.660Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/pkg/config/config.schema.json:0-0
Timestamp: 2025-07-30T09:29:46.660Z
Learning: The "operation_name_trim_limit" configuration property in router/pkg/config/config.schema.json should be placed at the security level as a sibling to complexity_limits, not inside the complexity_limits object.

Applied to files:

  • router/pkg/connectrpc/test_helpers.go
📚 Learning: 2025-09-19T15:08:03.085Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2223
File: router/cmd/main.go:32-33
Timestamp: 2025-09-19T15:08:03.085Z
Learning: pyroscope-go's Config.ServerAddress field accepts both full URLs (e.g., "http://localhost:4040") and simple host:port format (e.g., "localhost:4040"). The library handles URL normalization internally, so both formats work correctly.

Applied to files:

  • router/pkg/connectrpc/server.go
📚 Learning: 2025-08-28T09:18:10.121Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2141
File: router-tests/http_subscriptions_test.go:100-108
Timestamp: 2025-08-28T09:18:10.121Z
Learning: In router-tests/http_subscriptions_test.go heartbeat tests, the message ordering should remain strict with data messages followed by heartbeat messages, as the timing is deterministic and known by design in the Cosmo router implementation.

Applied to files:

  • router-tests/connectrpc/connectrpc_server_lifecycle_test.go
📚 Learning: 2025-07-30T09:29:04.257Z
Learnt from: SkArchon
Repo: wundergraph/cosmo PR: 2090
File: router/core/operation_processor.go:0-0
Timestamp: 2025-07-30T09:29:04.257Z
Learning: GraphQL operation names don't allow characters with more than 1 code point, so string length operations and slicing work correctly for both byte and character counting in GraphQL operation name processing.

Applied to files:

  • protographic/src/operation-to-proto.ts
📚 Learning: 2025-09-02T12:52:27.677Z
Learnt from: endigma
Repo: wundergraph/cosmo PR: 2181
File: router/core/operation_processor.go:0-0
Timestamp: 2025-09-02T12:52:27.677Z
Learning: Hash validation for persisted queries with query bodies is performed in router/core/graphql_prehandler.go in the handleOperation function, not in the APQ processing logic in operation_processor.go. This validates that extensions.persistedQuery.sha256Hash matches the query body before any APQ operations occur.

Applied to files:

  • router/pkg/connectrpc/handler.go
📚 Learning: 2025-08-07T12:05:06.775Z
Learnt from: StarpTech
Repo: wundergraph/cosmo PR: 2079
File: proto/wg/cosmo/platform/v1/platform.proto:39-45
Timestamp: 2025-08-07T12:05:06.775Z
Learning: In the Cosmo project, the proto fields for schema, mappings, and lock in ProtoInput are intentionally kept as string types rather than bytes because the team works with text data and wants it to be UTF-8 encoded for readability and text processing purposes.

Applied to files:

  • router/pkg/connectrpc/handler.go
🧬 Code graph analysis (12)
router/pkg/config/config.go (1)
router/pkg/connectrpc/server.go (1)
  • Server (36-47)
router/pkg/mcpserver/server.go (1)
router/pkg/httputil/headers.go (1)
  • SkippedHeaders (6-27)
router/core/router.go (2)
router/pkg/connectrpc/server.go (2)
  • Server (36-47)
  • ServerConfig (21-33)
router/pkg/config/config.go (1)
  • ConnectRPCConfiguration (1012-1017)
router/pkg/connectrpc/service_discovery_test.go (1)
router/pkg/connectrpc/service_discovery.go (2)
  • DiscoverServices (46-153)
  • ServiceDiscoveryConfig (29-34)
router/pkg/connectrpc/operation_registry.go (1)
router/core/context.go (3)
  • OperationTypeQuery (506-506)
  • OperationTypeMutation (507-507)
  • OperationTypeSubscription (508-508)
router/pkg/connectrpc/test_helpers.go (4)
router/pkg/connectrpc/proto_loader.go (2)
  • ProtoLoader (53-59)
  • NewProtoLoader (62-72)
router/pkg/connectrpc/server.go (2)
  • Server (36-47)
  • NewServer (50-166)
router/pkg/connectrpc/handler.go (1)
  • RPCHandler (121-127)
router/pkg/connectrpc/operation_registry.go (2)
  • NewOperationRegistry (34-43)
  • OperationRegistry (26-31)
router/pkg/connectrpc/server.go (5)
router/pkg/connectrpc/proto_loader.go (2)
  • ProtoLoader (53-59)
  • NewProtoLoader (62-72)
router/pkg/connectrpc/operation_registry.go (2)
  • OperationRegistry (26-31)
  • NewOperationRegistry (34-43)
router/pkg/connectrpc/handler.go (3)
  • RPCHandler (121-127)
  • NewRPCHandler (139-172)
  • HandlerConfig (130-136)
router/pkg/connectrpc/vanguard_service.go (3)
  • VanguardService (60-65)
  • NewVanguardService (68-93)
  • VanguardServiceConfig (53-57)
router/pkg/connectrpc/service_discovery.go (2)
  • DiscoverServices (46-153)
  • ServiceDiscoveryConfig (29-34)
router-tests/connectrpc/connectrpc_server_lifecycle_test.go (1)
router/pkg/connectrpc/server.go (3)
  • NewServer (50-166)
  • ServerConfig (21-33)
  • Server (36-47)
router/pkg/connectrpc/vanguard_service.go (3)
router/pkg/connectrpc/handler.go (3)
  • RPCHandler (121-127)
  • MetaKeyGraphQLErrors (83-83)
  • GraphQLError (107-112)
router/pkg/connectrpc/proto_loader.go (3)
  • ProtoLoader (53-59)
  • ServiceDefinition (17-30)
  • MethodDefinition (33-50)
router/pkg/mcpserver/server.go (1)
  • GraphQLError (157-159)
router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)
router-tests/testdata/connectrpc/client/employee.v1/service.pb.go (42)
  • FindEmployeesByPetsRequest (321-327)
  • FindEmployeesByPetsRequest (342-342)
  • FindEmployeesByPetsRequest (357-359)
  • FindEmployeesByPetsResponse (368-375)
  • FindEmployeesByPetsResponse (390-390)
  • FindEmployeesByPetsResponse (405-407)
  • FindEmployeesByPetsInlineFragmentRequest (586-592)
  • FindEmployeesByPetsInlineFragmentRequest (607-607)
  • FindEmployeesByPetsInlineFragmentRequest (622-624)
  • FindEmployeesByPetsInlineFragmentResponse (633-640)
  • FindEmployeesByPetsInlineFragmentResponse (655-655)
  • FindEmployeesByPetsInlineFragmentResponse (670-672)
  • FindEmployeesByPetsNamedFragmentRequest (681-687)
  • FindEmployeesByPetsNamedFragmentRequest (702-702)
  • FindEmployeesByPetsNamedFragmentRequest (717-719)
  • FindEmployeesByPetsNamedFragmentResponse (728-735)
  • FindEmployeesByPetsNamedFragmentResponse (750-750)
  • FindEmployeesByPetsNamedFragmentResponse (765-767)
  • GetEmployeeByIdRequest (227-233)
  • GetEmployeeByIdRequest (248-248)
  • GetEmployeeByIdRequest (263-265)
  • GetEmployeeByIdResponse (274-280)
  • GetEmployeeByIdResponse (295-295)
  • GetEmployeeByIdResponse (310-312)
  • GetEmployeesRequest (501-505)
  • GetEmployeesRequest (520-520)
  • GetEmployeesRequest (535-537)
  • GetEmployeesResponse (539-545)
  • GetEmployeesResponse (560-560)
  • GetEmployeesResponse (575-577)
  • GetEmployeesWithMoodRequest (416-420)
  • GetEmployeesWithMoodRequest (435-435)
  • GetEmployeesWithMoodRequest (450-452)
  • GetEmployeesWithMoodResponse (454-460)
  • GetEmployeesWithMoodResponse (475-475)
  • GetEmployeesWithMoodResponse (490-492)
  • UpdateEmployeeMoodRequest (124-131)
  • UpdateEmployeeMoodRequest (146-146)
  • UpdateEmployeeMoodRequest (161-163)
  • UpdateEmployeeMoodResponse (179-186)
  • UpdateEmployeeMoodResponse (201-201)
  • UpdateEmployeeMoodResponse (216-218)
router/pkg/connectrpc/vanguard_service_test.go (4)
router/pkg/connectrpc/test_helpers.go (2)
  • GetSharedProtoLoader (25-41)
  • NewTestRPCHandler (88-115)
router/pkg/connectrpc/vanguard_service.go (2)
  • NewVanguardService (68-93)
  • VanguardServiceConfig (53-57)
router/pkg/connectrpc/proto_loader.go (3)
  • ProtoLoader (53-59)
  • NewProtoLoader (62-72)
  • ServiceDefinition (17-30)
router/pkg/connectrpc/handler.go (3)
  • RPCHandler (121-127)
  • NewRPCHandler (139-172)
  • HandlerConfig (130-136)
router/pkg/connectrpc/handler.go (3)
router/pkg/mcpserver/server.go (2)
  • GraphQLError (157-159)
  • GraphQLResponse (162-165)
router/pkg/connectrpc/operation_registry.go (1)
  • OperationRegistry (26-31)
router/pkg/httputil/headers.go (1)
  • SkippedHeaders (6-27)
🪛 markdownlint-cli2 (0.18.1)
router-tests/testdata/connectrpc/README.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: image_scan (nonroot)
  • GitHub Check: build_test
  • GitHub Check: build_push_image (nonroot)
  • GitHub Check: build_push_image
  • GitHub Check: integration_test (./events)
  • GitHub Check: integration_test (./. ./fuzzquery ./lifecycle ./modules)
  • GitHub Check: integration_test (./telemetry)
  • GitHub Check: Analyze (go)
🔇 Additional comments (29)
protographic/src/operation-to-proto.ts (2)

267-279: LGTM! Clear PascalCase validation with helpful error message.

The validation correctly enforces PascalCase convention for ConnectRPC method names. The regex properly validates the pattern (starts uppercase, contains lowercase), and the error message provides clear examples plus a suggested name.


281-292: LGTM! Validation reordering is logical.

Moving the root-level alias validation after PascalCase validation is a sensible ordering—validate name format first, then validate structure.

router/pkg/mcpserver/server.go (1)

689-705: Centralizing skipped headers via httputil.SkippedHeaders looks good

Using the shared httputil.SkippedHeaders map here keeps MCP → GraphQL header forwarding aligned with the rest of the router and removes the duplicate local list without changing behavior.

router/pkg/config/config.schema.json (1)

2137-2185: connect_rpc schema matches Go config and enforces the key requirement

The new connect_rpc block is consistent with ConnectRPCConfiguration/ConnectRPCServer (fields, naming, and types), and the if/then constraint cleanly enforces that services_provider_id must be set when enabled is true, which aligns with how other router knobs are validated in the schema rather than at runtime.

router/pkg/config/config.go (1)

1012-1053: ConnectRPC config types and wiring are consistent with the schema and existing patterns

ConnectRPCConfiguration/ConnectRPCServer mirror the new connect_rpc schema block, and the env tags (including envPrefix:"CONNECT_RPC_SERVER_") follow the same pattern as MCP and other nested configs. Adding ConnectRPC to Config under yaml:"connect_rpc,omitempty" keeps the public surface coherent with the YAML schema.

router/pkg/connectrpc/error_handling_test.go (1)

269-340: Successful GraphQL response tests are appropriately scoped

TestSuccessfulGraphQLResponses cleanly verifies that executeGraphQL returns raw data payloads unchanged for both simple and nested structures, without over-specifying internal details. This complements the error-path tests nicely.

router/pkg/connectrpc/vanguard_service_test.go (2)

402-457: Well-structured table-driven tests for method name extraction.

The TestVanguardService_ExtractMethodName function properly covers valid paths, missing methods, wrong services, and malformed paths with too many segments. The test cases are clear and comprehensive.


474-479: Simple and effective error reader mock.

The errorReader helper cleanly simulates read failures for testing error handling paths.

router-tests/connectrpc/connectrpc_server_lifecycle_test.go (3)

15-37: Complete lifecycle test is well-structured.

The test properly verifies the start → reload → stop sequence with appropriate assertions on service availability after each transition.


85-93: Error scenario test correctly validates construction-time failure.

This test properly expects NewServer to fail when given a non-existent services directory, and verifies the error message contains "failed to discover services".


144-164: Service info consistency test validates before and after Start.

Good coverage verifying that service count and names are available immediately after NewServer and remain consistent after Start().

router/pkg/connectrpc/test_helpers.go (1)

79-85: Mock round tripper implementation is clean and effective.

The mockRoundTripper correctly returns a response with NopCloser for the body, ensuring proper resource handling in tests.

router/pkg/connectrpc/operation_registry.go (3)

26-31: Well-documented thread-safe registry design.

The struct and its thread-safety documentation are clear. The service-scoped operation storage design is appropriate for the multi-service ConnectRPC use case.


196-215: Copy semantics correctly prevent external modification.

GetAllOperationsForService returns a copy of operations (dereferencing pointers at line 211), which prevents callers from modifying the registry's internal state. Good defensive design.


275-294: Clear methods properly reset state.

Both Clear() and ClearService() use write locks and properly clean up the internal state. Debug logging provides visibility into registry operations.

router/core/router.go (2)

1551-1557: LGTM! ConnectRPC server shutdown properly integrated.

The shutdown follows the same pattern as other components, using wg.Go for concurrent shutdown and proper error aggregation.


2208-2212: LGTM!

The WithConnectRPC option follows the established pattern for router configuration options.

router/pkg/connectrpc/server.go (2)

358-391: LGTM! Response writer properly implements streaming interfaces.

The responseWriter correctly implements http.Flusher, http.Pusher, and http.Hijacker interfaces required for gRPC streaming compatibility, delegating to the underlying writer when supported.
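
For readers unfamiliar with the pattern, a wrapper that delegates the optional interfaces might look roughly like this. `statusWriter` is a hypothetical name and the sketch is not the PR's responseWriter; it only illustrates forwarding to the wrapped writer when the capability exists.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
)

// statusWriter is a hypothetical wrapper that records the status code.
type statusWriter struct {
	http.ResponseWriter
	status int
}

// Compile-time checks that the optional interfaces are satisfied.
var (
	_ http.Flusher  = (*statusWriter)(nil)
	_ http.Hijacker = (*statusWriter)(nil)
	_ http.Pusher   = (*statusWriter)(nil)
)

func (w *statusWriter) WriteHeader(code int) {
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

// Flush forwards to the wrapped writer when it supports flushing.
func (w *statusWriter) Flush() {
	if f, ok := w.ResponseWriter.(http.Flusher); ok {
		f.Flush()
	}
}

// Hijack forwards to the wrapped writer or reports that it is unsupported.
func (w *statusWriter) Hijack() (net.Conn, *bufio.ReadWriter, error) {
	if h, ok := w.ResponseWriter.(http.Hijacker); ok {
		return h.Hijack()
	}
	return nil, nil, errors.New("underlying ResponseWriter does not support hijacking")
}

// Push forwards to the wrapped writer or returns http.ErrNotSupported.
func (w *statusWriter) Push(target string, opts *http.PushOptions) error {
	if p, ok := w.ResponseWriter.(http.Pusher); ok {
		return p.Push(target, opts)
	}
	return http.ErrNotSupported
}

// demo exercises the wrapper against httptest's recorder, which can flush
// but cannot be hijacked.
func demo() (status int, flushed bool, hijackErr error) {
	rec := httptest.NewRecorder()
	w := &statusWriter{ResponseWriter: rec, status: http.StatusOK}
	w.WriteHeader(http.StatusTeapot)
	w.Flush()
	_, _, hijackErr = w.Hijack()
	return w.status, rec.Flushed, hijackErr
}

func main() {
	fmt.Println(demo())
}
```

Without the explicit Flush method, wrapping the writer would hide http.Flusher from streaming handlers even though the underlying writer supports it.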


168-213: LGTM! Server lifecycle is well-implemented.

The Start() method properly:

  1. Validates the server is initialized
  2. Creates the HTTP/2 (h2c) handler for gRPC compatibility
  3. Creates a listener to get the actual bound address
  4. Starts the server in a goroutine with proper error handling
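
Steps 3 and 4 can be sketched with the standard library alone. This is an illustrative outline, not the PR's Start() method: `startServer` is a hypothetical helper, and the h2c wrapping from step 2 is omitted because it lives in golang.org/x/net.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// startServer binds the listener first so the real address is known (useful
// with port 0), then serves in a goroutine and reports fatal errors on errCh.
func startServer(addr string, h http.Handler) (*http.Server, string, <-chan error, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, "", nil, err
	}
	srv := &http.Server{Handler: h, ReadHeaderTimeout: 5 * time.Second}
	errCh := make(chan error, 1)
	go func() {
		// ErrServerClosed is the normal result of a graceful shutdown.
		if err := srv.Serve(ln); err != nil && !errors.Is(err, http.ErrServerClosed) {
			errCh <- err
		}
		close(errCh)
	}()
	return srv, ln.Addr().String(), errCh, nil
}

// demo starts a server on an ephemeral port and shuts it down again.
func demo() (string, error) {
	srv, bound, errCh, err := startServer("127.0.0.1:0", http.NotFoundHandler())
	if err != nil {
		return "", err
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return "", err
	}
	return bound, <-errCh
}

func main() {
	fmt.Println(demo())
}
```

Creating the listener before serving is what lets tests configure port 0 and still learn which port was actually assigned.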

router/pkg/connectrpc/proto_loader.go (1)

61-72: LGTM! ProtoLoader constructor is well-implemented.

The constructor properly handles nil logger by defaulting to a no-op logger, and initializes the local protoregistry.Files to avoid polluting the global registry.

router/pkg/connectrpc/service_discovery.go (3)

185-191: One-proto-per-directory restriction may be limiting.

The enforcement of exactly one proto file per service directory is strict. While this simplifies the discovery logic, it prevents common patterns like:

  • Separating request/response messages into a types.proto
  • Having shared definitions in the same directory

Consider documenting this restriction prominently or allowing multiple proto files with the same package.

Is the one-proto-per-directory restriction intentional for simplicity, or should it support multiple proto files per service (with validation that they share the same package)?


46-153: LGTM! Service discovery logic is well-structured.

The DiscoverServices function:

  1. Validates configuration and provides sensible defaults
  2. Walks the directory tree correctly
  3. Enforces package.service uniqueness across directories
  4. Uses filepath.SkipDir to avoid duplicate discovery
  5. Provides clear error messages for common issues

255-276: LGTM! Operation file discovery is straightforward.

The function correctly walks subdirectories to find .graphql and .gql files for each service.

router/pkg/connectrpc/vanguard_service.go (3)

158-240: LGTM! Service handler implementation is solid.

The createServiceHandler correctly:

  1. Extracts and validates the method name from the path
  2. Verifies the method exists in the service definition
  3. Handles both GET (query param) and POST (body) requests per Connect protocol
  4. Forwards headers via context
  5. Returns appropriate Connect errors for different failure modes

242-286: LGTM! Error handling properly sanitizes internal details.

The writeConnectError function:

  1. Logs full error details server-side with structured logging
  2. Returns only the Connect error code and message to clients
  3. Includes GraphQL errors from metadata when present for debugging
  4. Uses proper HTTP status code mapping

This addresses the previous security concern about exposing internal infrastructure details.


95-156: LGTM! Service registration with Vanguard is well-implemented.

The registerServices function:

  1. Creates a custom type resolver from the local Files registry (avoiding global registry pollution)
  2. Logs detailed service and method information
  3. Configures Vanguard to always transcode to Connect protocol with JSON codec
  4. Uses NewServiceWithSchema with the custom type resolver

router/pkg/connectrpc/handler.go (2)

268-291: LGTM - Recursive conversion implemented.

The recursive key conversion properly handles nested maps and arrays, addressing the previous concern about only converting top-level keys.


318-349: LGTM - Efficient snake_case to camelCase conversion.

The implementation uses strings.Builder with pre-allocated capacity and processes the string in a single pass, which is performant and correct.
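
A single-pass conversion in this style might be written as follows. This is a sketch rather than the PR's code: `snakeToCamel` is a hypothetical name, and dropping leading underscores without capitalizing the next letter is an assumption about edge-case behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// snakeToCamel converts snake_case to camelCase in one pass over the bytes.
// Only ASCII lowercase letters are capitalized; all other bytes pass through unchanged.
func snakeToCamel(s string) string {
	var b strings.Builder
	b.Grow(len(s)) // the result is never longer than the input

	upperNext := false
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c == '_' {
			// Assumption: leading underscores are dropped without capitalizing.
			upperNext = b.Len() > 0
			continue
		}
		if upperNext && c >= 'a' && c <= 'z' {
			c -= 'a' - 'A'
		}
		upperNext = false
		b.WriteByte(c)
	}
	return b.String()
}

func main() {
	for _, s := range []string{"first_name", "employee_id_list", "id"} {
		fmt.Println(snakeToCamel(s))
	}
	// prints:
	// firstName
	// employeeIdList
	// id
}
```

Working on bytes is safe for UTF-8 input here because every byte of a multi-byte sequence is 0x80 or above, so it can never be mistaken for `_` or an ASCII letter.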

router-tests/testdata/connectrpc/client/employee.v1/employeev1connect/service.connect.go (1)

1-307: Generated file - no manual review required.

This file is auto-generated by protoc-gen-connect-go (line 1). Generated files should not be manually reviewed or edited. Any issues should be addressed by updating the proto definitions or the code generator tool.

Comment on lines +16 to +22
```
services/
└── myservice.v1/
├── service.proto # Proto service definition
├── QueryGetItem.graphql # GraphQL query operation
└── MutationCreateItem.graphql # GraphQL mutation operation
```

⚠️ Potential issue | 🟡 Minor

Add language identifier to fenced code block.

The code block lacks a language specifier. Add a language identifier to satisfy Markdown linting rules.

Apply this diff to fix the issue:

-```
+```text
 services/
 └── myservice.v1/
     ├── service.proto                    # Proto service definition
     ├── QueryGetItem.graphql            # GraphQL query operation
     └── MutationCreateItem.graphql      # GraphQL mutation operation
-```
+```
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In router-tests/testdata/connectrpc/README.md around lines 16 to 22, the fenced
code block is missing a language identifier; change the opening fence from ```
to ```text so the block is marked as plain text (ensure the closing fence
remains ```), satisfying Markdown linting rules.

Comment on lines +471 to +480
// Create Connect error with metadata
connectErr := connect.NewError(code, fmt.Errorf("GraphQL request failed with HTTP %d", resp.StatusCode))
connectErr.Meta().Set(MetaKeyErrorClassification, ErrorClassificationCritical)
connectErr.Meta().Set(MetaKeyHTTPStatus, fmt.Sprintf("%d", resp.StatusCode))
connectErr.Meta().Set(MetaKeyHTTPResponseBody, string(responseBody))

h.logger.Error("HTTP error from GraphQL endpoint",
	zap.Int("status_code", resp.StatusCode),
	zap.String("connect_code", code.String()),
	zap.String("response_body", string(responseBody)))

⚠️ Potential issue | 🟠 Major

Security: Full response body exposure may leak internal details.

Line 475 exposes the complete upstream HTTP response body in error metadata returned to clients. Depending on how the GraphQL endpoint formats errors, this could leak stack traces, internal URLs, or implementation details. Consider:

  1. Redacting or sanitizing the response body before including it
  2. Omitting it entirely in production environments
  3. Gating this behind a debug configuration flag

Keep detailed response bodies in server logs only.

Based on learnings, this follows the pattern where security considerations should be evaluated in the context of error metadata exposure to external clients.
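
Option 3 above (gating behind a debug flag) could be sketched as below. Everything here is illustrative: the metadata key strings, the `clientErrorMeta` helper, and the `exposeBody` flag are hypothetical, and a plain map stands in for the Connect error metadata so the example stays free of external dependencies.

```go
package main

import (
	"fmt"
	"strconv"
)

// Hypothetical metadata keys; the values of the real constants are not shown in the review.
const (
	metaClassification = "error-classification"
	metaHTTPStatus     = "http-status"
	metaResponseBody   = "http-response-body"
)

// clientErrorMeta builds the metadata returned to clients.
// The upstream body is included only when exposeBody is explicitly enabled.
func clientErrorMeta(status int, body []byte, exposeBody bool) map[string]string {
	meta := map[string]string{
		metaClassification: "CRITICAL",
		metaHTTPStatus:     strconv.Itoa(status),
	}
	if exposeBody {
		meta[metaResponseBody] = string(body)
	}
	return meta
}

func main() {
	body := []byte(`{"errors":[{"message":"dial tcp 10.0.0.7:4000: connection refused"}]}`)

	_, leaked := clientErrorMeta(502, body, false)[metaResponseBody]
	fmt.Println("body exposed by default:", leaked) // prints: body exposed by default: false

	_, leaked = clientErrorMeta(502, body, true)[metaResponseBody]
	fmt.Println("body exposed in debug mode:", leaked) // prints: body exposed in debug mode: true
}
```

The server-side log call can keep the full body regardless of the flag, since only the client-facing metadata is being restricted.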

@asoorm asoorm force-pushed the ahmet/eng-8277-connect-rpc-handler-final branch from 91096a7 to 46c3f55 Compare December 3, 2025 15:56