Add deployment.GetFullyQualifiedHomeserverName(t, hsName) to support custom Deployment #780

Merged
MadLittleMods merged 12 commits into main from madlittlemods/deployment-fqdn-helper on May 19, 2025

MadLittleMods
Collaborator

@MadLittleMods MadLittleMods commented May 16, 2025

Split off from #778 per discussion.

Spawning from a real use-case with a custom Deployment/Deployer (Element-internal).

Introduce complement.Deployment.GetFullyQualifiedHomeserverName(t, hsName) so that per-deployment short homeserver aliases like hs1 can be mapped to something that's resolvable in your custom deployment's context. Example: hs1 -> hs1.shard1:8481.

This is useful for situations like the following where you have to specify the via servers in a federated join request during a test:

alice.MustJoinRoom(t, bobRoomID, []spec.ServerName{
	deployment.GetFullyQualifiedHomeserverName(t, "hs2"),
})

But why does this have to be part of the complement.Deployment interface instead of your own custom deployment?

  • Tests only have access to the generic complement.Deployment interface
  • We can't derive fully-qualified homeserver names from the existing complement.Deployment interface
  • While we could cheekily cast the generic complement.Deployment back to CustomDeployment in our own tests (and have the helper in the CustomDeployment instead), if we start using something exotic in our out-of-repo Complement tests, the suite of existing Complement tests in this repo will not be compatible.

(also see below)
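
To illustrate the points above, here is a minimal standalone sketch of why the helper lives on the interface. It is not Complement's actual code: `ServerName` is a local stand-in for `spec.ServerName`, the `Deployment` interface is cut down to the one method under discussion (without the `t` argument), and `builtinDeployment`, `shardDeployment` and `viaServers` are hypothetical names.

```go
package main

import "fmt"

// ServerName is a local stand-in for gomatrixserverlib's spec.ServerName
// (an assumption, so that the sketch only needs the standard library).
type ServerName string

// Deployment is a cut-down, hypothetical version of complement.Deployment.
// The real method also takes the test handle `t`, which is omitted here.
type Deployment interface {
	GetFullyQualifiedHomeserverName(hsName string) ServerName
}

// builtinDeployment mimics the built-in Docker deployment, where "hs1" is
// already a resolvable network alias, so the name is returned unchanged.
type builtinDeployment struct{}

func (builtinDeployment) GetFullyQualifiedHomeserverName(hsName string) ServerName {
	return ServerName(hsName)
}

// shardDeployment is a hypothetical custom deployment that qualifies the
// short name with the shard it belongs to.
type shardDeployment struct{ shard string }

func (d shardDeployment) GetFullyQualifiedHomeserverName(hsName string) ServerName {
	return ServerName(hsName + "." + d.shard)
}

// viaServers plays the role of a test: it only sees the interface, yet it
// gets names that are valid for whichever deployment backs it.
func viaServers(d Deployment, hsNames ...string) []ServerName {
	out := make([]ServerName, 0, len(hsNames))
	for _, hs := range hsNames {
		out = append(out, d.GetFullyQualifiedHomeserverName(hs))
	}
	return out
}

func main() {
	fmt.Println(viaServers(builtinDeployment{}, "hs2"))
	fmt.Println(viaServers(shardDeployment{shard: "shard1"}, "hs2"))
}
```

Because the test code never casts to a concrete type, the same test body should run against either deployment.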

Motivating custom Deployment use case

complement.Deployment is an interface that can be backed by anything. For reference, custom deployments were introduced in #750. The default Deployment naming scheme in Complement is hs1, hs2, etc. It's really nice and convenient to be able to simply refer to homeservers as hs1, etc within a deployment. And using consistent names like this makes tests compatible with each other regardless of which Deployment is being used.

The built-in Deployment in Complement has each homeserver in a Docker container which already has network aliases like hs1, hs2, etc so no translation is needed from friendly name to resolvable address. When one homeserver needs to federate with the other, it can simply make a request to https://hs1:8448/... per spec on resolving server names.

Right now, we hard-code hs1 across the tests when we specify "via" servers in join requests, but that only works if you follow the strict single-deployment naming scheme.

bob.MustJoinRoom(t, roomID, []string{"hs1"})

In the current setup of our custom Deployment, each Deployment is a "shard" application that can deploy multiple homeserver "tenants". We specifically want to test that homeservers between multiple shards can federate with each other as a sanity check (make sure our shards can deploy homeserver tenants correctly). If we keep using the consistent hs1, hs2 naming scheme for each Deployment we're going to have conflicts. This is where deployment.GetFullyQualifiedHomeserverName(t, hsName) comes in handy. We can call deployment1.GetFullyQualifiedHomeserverName(t, "hs1") -> hs1.shard1 and also deployment2.GetFullyQualifiedHomeserverName(t, "hs1") -> hs1.shard2 to get their unique resolvable addresses in the network.

Additionally, the helper removes the constraint of needing the network to strictly resolve hs1, hs2 hostnames to their respective homeservers. Whenever you need to refer to another homeserver, use deployment.GetFullyQualifiedHomeserverName(t, hsName) to take care of the nuances of the environment that the given Deployment creates.
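
As a rough sketch of the shard mapping described above (not the Element-internal implementation, which isn't shown in this PR), a custom deployment could resolve the same short name to different addresses per shard. Everything here is an assumption for illustration: `ServerName` stands in for `spec.ServerName`, `shardDeployment` and its `fedPorts` field are hypothetical, the `t` argument is dropped in favour of a panic, and port 8482 is made up to pair with the hs1.shard1:8481 example.

```go
package main

import "fmt"

// ServerName is a local stand-in for spec.ServerName (assumption).
type ServerName string

// shardDeployment is a hypothetical custom deployment: one "shard" that
// hosts several homeserver "tenants".
type shardDeployment struct {
	shard string
	// fedPorts maps a short homeserver name to its federation port.
	fedPorts map[string]int
}

// GetFullyQualifiedHomeserverName maps a short alias such as "hs1" to an
// address that is unique across shards. The real method takes `t` and would
// fail the test on an unknown name; this sketch panics instead.
func (d *shardDeployment) GetFullyQualifiedHomeserverName(hsName string) ServerName {
	port, ok := d.fedPorts[hsName]
	if !ok {
		panic("unknown homeserver " + hsName + " in " + d.shard)
	}
	return ServerName(fmt.Sprintf("%s.%s:%d", hsName, d.shard, port))
}

func main() {
	shard1 := &shardDeployment{shard: "shard1", fedPorts: map[string]int{"hs1": 8481}}
	shard2 := &shardDeployment{shard: "shard2", fedPorts: map[string]int{"hs1": 8482}}

	// Both shards have an "hs1", but the qualified names do not collide.
	fmt.Println(shard1.GetFullyQualifiedHomeserverName("hs1"))
	fmt.Println(shard2.GetFullyQualifiedHomeserverName("hs1"))
}
```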

Example of a cross-deployment test:

func TestMain(m *testing.M) {
	complement.TestMain(m, "custom_tests",
		complement.WithDeployment(internal.MakeCustomDeployment()),
	)
}

func TestCrossShardFederation(t *testing.T) {
	// Create two shards with their own homeserver tenants
	shardDeployment1 := complement.Deploy(t, 1)
	defer shardDeployment1.Destroy(t)
	shardDeployment2 := complement.Deploy(t, 1)
	defer shardDeployment2.Destroy(t)

	alice := shardDeployment1.Register(t, "hs1", helpers.RegistrationOpts{})
	bob := shardDeployment2.Register(t, "hs1", helpers.RegistrationOpts{})

	aliceRoomID := alice.MustCreateRoom(t, map[string]any{
		"preset": "public_chat",
	})
	bobRoomID := bob.MustCreateRoom(t, map[string]any{
		"preset": "public_chat",
	})

	t.Run("parallel", func(t *testing.T) {
		t.Run("shard1 -> shard2", func(t *testing.T) {
			// Since these tests use the same config, they can be run in parallel
			t.Parallel()

			alice.MustJoinRoom(t, bobRoomID, []spec.ServerName{
				shardDeployment2.GetFullyQualifiedHomeserverName(t, "hs1"),
			})

			bob.MustSyncUntil(t, client.SyncReq{}, client.SyncJoinedTo(alice.UserID, bobRoomID))
		})

		t.Run("shard2 -> shard1", func(t *testing.T) {
			// Since these tests use the same config, they can be run in parallel
			t.Parallel()

			bob.MustJoinRoom(t, aliceRoomID, []spec.ServerName{
				shardDeployment1.GetFullyQualifiedHomeserverName(t, "hs1"),
			})

			alice.MustSyncUntil(t, client.SyncReq{}, client.SyncJoinedTo(bob.UserID, aliceRoomID))
		})
	})
}

Per the discussion in #780 (comment), multiple deployments per test don't work with Complement's built-in Deployment implementation yet, and a Deployment is meant to encapsulate an entire deployment: all servers and the network links between them. Multi-Deployment was the motivating use case, but use it at your own discretion until further guidance is given.

Todo

  • Update tests to use GetFullyQualifiedHomeserverName(...)
    • Join room (MustJoinRoom, JoinRoom)
    • Knock room (mustKnockOnRoomSynced, knockOnRoomWithStatus)
    • srv.MustJoinRoom, srv.MustLeaveRoom, srv.MustSendTransaction
    • FederationClient -> fedClient.MakeJoin, fedClient.SendJoin, etc
    • fclient, fclient.NewFederationRequest
    • m.space.child via
    • m.space.parent via
    • m.room.join_rules restricted via
    • gomatrixserverlib.EDU Destination
  • Potentially update the built-in Deployment implementation to support multiple deployments at the same time, tracked by this discussion below

Pull Request Checklist

Example:
```go
alice.MustJoinRoom(t, bobRoomID, []spec.ServerName{
	shardDeployment2.GetFullyQualifiedHomeserverName(t, "hs1"),
})
```
@MadLittleMods MadLittleMods force-pushed the madlittlemods/deployment-fqdn-helper branch from 0e5756a to e5ff236 Compare May 16, 2025 15:59
@MadLittleMods MadLittleMods changed the title Add deployment.GetFullyQualifiedHomeserverName(hsName) Add deployment.GetFullyQualifiedHomeserverName(hsName) to support custom Deployment May 16, 2025
@MadLittleMods MadLittleMods changed the title Add deployment.GetFullyQualifiedHomeserverName(hsName) to support custom Deployment Add deployment.GetFullyQualifiedHomeserverName(t, hsName) to support custom Deployment May 16, 2025
Comment on lines 142 to +147
 // MustJoinRoom joins the room ID or alias given, else fails the test. Returns the room ID.
-func (c *CSAPI) MustJoinRoom(t ct.TestLike, roomIDOrAlias string, serverNames []string) string {
+//
+// Args:
+// - `serverNames`: The list of servers to attempt to join the room through.
+// These should be a resolvable addresses within the deployment network.
+func (c *CSAPI) MustJoinRoom(t ct.TestLike, roomIDOrAlias string, serverNames []spec.ServerName) string {
Collaborator Author

@MadLittleMods MadLittleMods May 16, 2025

Basically, wherever we expect people to use deployment.GetFullyQualifiedHomeserverName(t, hsName), I've updated these function signatures to accept spec.ServerName instead of just plain strings.

I also think this is more semantically correct for these places because the value needs to be a resolvable homeserver address in the federation.

Comment on lines +521 to +522
// TODO: It feels like `ServersInRoom` should be `[]spec.ServerName` instead of `[]string`
ServersInRoom: serversInRoomStrings,
Collaborator Author

This feels like a gomatrixserverlib problem.

(not going to fix in this PR)
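
For context, the `serversInRoomStrings` value in the diff above presumably comes from a conversion along these lines. This is only a sketch: the helper name `serverNamesToStrings` is hypothetical, and `ServerName` is a local stand-in for `spec.ServerName` so the example compiles on its own.

```go
package main

import "fmt"

// ServerName is a local stand-in for spec.ServerName (assumption).
type ServerName string

// serverNamesToStrings converts typed server names into plain strings for
// an API that still expects []string.
func serverNamesToStrings(names []ServerName) []string {
	out := make([]string, len(names))
	for i, name := range names {
		out[i] = string(name)
	}
	return out
}

func main() {
	servers := []ServerName{"hs1", "hs2.shard1:8481"}
	fmt.Println(serverNamesToStrings(servers))
}
```

Go does not convert between `[]ServerName` and `[]string` implicitly, which is why an explicit loop like this is needed until the downstream field is typed.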

@@ -63,7 +64,7 @@ func TestOutboundFederationSend(t *testing.T) {
 roomAlias := srv.MakeAliasMapping("flibble", serverRoom.RoomID)

 // the local homeserver joins the room
-alice.MustJoinRoom(t, roomAlias, []string{deployment.GetConfig().HostnameRunningComplement})
+alice.MustJoinRoom(t, roomAlias, []spec.ServerName{srv.ServerName()})
Collaborator Author

As far as I can tell, this was just a mistake from before. Unclear how it worked before since it's missing the randomly assigned port.

@@ -27,7 +28,9 @@ func TestPresence(t *testing.T) {

 // to share presence alice and bob must be in a shared room
 roomID := alice.MustCreateRoom(t, map[string]interface{}{"preset": "public_chat"})
-bob.MustJoinRoom(t, roomID, []string{"hs1"})
+bob.MustJoinRoom(t, roomID, []spec.ServerName{
+	deployment.GetFullyQualifiedHomeserverName(t, "hs1"),
Collaborator Author

@MadLittleMods MadLittleMods May 16, 2025

For the reviewer: I've tried to be thorough in updating everything on this list (from the PR description). This would be the main thing to think about. Are there other spots that we need to use GetFullyQualifiedHomeserverName(...) instead of the hard-coded hs1 values?

  • Update tests to use GetFullyQualifiedHomeserverName(...)
    • Join room (MustJoinRoom, JoinRoom)
    • Knock room (mustKnockOnRoomSynced, knockOnRoomWithStatus)
    • srv.MustJoinRoom, srv.MustLeaveRoom, srv.MustSendTransaction
    • FederationClient -> fedClient.MakeJoin, fedClient.SendJoin, etc
    • fclient, fclient.NewFederationRequest
    • m.space.child via
    • m.space.parent via
    • m.room.join_rules restricted via
    • gomatrixserverlib.EDU Destination

I've also reviewed the diff itself to ensure that I didn't accidentally swap hs1 for hs2 somewhere.

// the container.
return spec.ServerName(hsName)
}

Collaborator Author

Currently, I don't think the built-in Complement Deployment implementation supports multiple Deployments at the same time (hs1, hs2 would conflict between them). Since one of the goals of this PR is to unlock that functionality for custom Deployments, it could make some sense to also refactor and update that here as well.

I'd rather leave it as-is until we need it, or at least do this in a follow-up PR.

See the PR description for more context on multiple Deployment.

Member

The PR description doesn't give more context on why multiple Deployment is desirable:

But imagine a case where we have multiple Deployment and we want the homeservers to communicate with each other.

Why would you do this? The Deployment is meant to encapsulate an entire deployment, all servers and network links between them. I don't understand the rationale.

Collaborator Author

@MadLittleMods MadLittleMods May 19, 2025

@kegsay In the current setup of our custom Deployment, each Deployment is a "shard" application that can deploy multiple homeserver "tenants".

And we specifically want to test that homeservers between multiple shards can federate with each other as a sanity check (make sure our shards can deploy homeserver tenants correctly):

func TestCrossShardFederation(t *testing.T) {
	// Create two shards with their own homeserver tenants
	shardDeployment1 := complement.Deploy(t, 1)
	defer shardDeployment1.Destroy(t)
	shardDeployment2 := complement.Deploy(t, 1)
	defer shardDeployment2.Destroy(t)

	alice := shardDeployment1.Register(t, "hs1", helpers.RegistrationOpts{})
	bob := shardDeployment2.Register(t, "hs1", helpers.RegistrationOpts{})

	aliceRoomID := alice.MustCreateRoom(t, map[string]any{
		"preset": "public_chat",
	})
	bobRoomID := bob.MustCreateRoom(t, map[string]any{
		"preset": "public_chat",
	})

	t.Run("parallel", func(t *testing.T) {
		t.Run("shard1 -> shard2", func(t *testing.T) {
			// Since these tests use the same config, they can be run in parallel
			t.Parallel()

			alice.MustJoinRoom(t, bobRoomID, []spec.ServerName{
				shardDeployment2.GetFullyQualifiedHomeserverName(t, "hs1"),
			})

			bob.MustSyncUntil(t, client.SyncReq{}, client.SyncJoinedTo(alice.UserID, bobRoomID))
		})

		t.Run("shard2 -> shard1", func(t *testing.T) {
			// Since these tests use the same config, they can be run in parallel
			t.Parallel()

			bob.MustJoinRoom(t, aliceRoomID, []spec.ServerName{
				shardDeployment1.GetFullyQualifiedHomeserverName(t, "hs1"),
			})

			alice.MustSyncUntil(t, client.SyncReq{}, client.SyncJoinedTo(bob.UserID, aliceRoomID))
		})
	})
}

This does assume that the Deployments share a network and can communicate with each other (which they do).

Better way to go about this?

One alternative I can think of is to bake this information into the hsName (hs1.shard1) and parse it out, but then we run into compatibility issues with existing tests, as our custom deployment's Deploy would no longer create things named hs1.

Another is to statically assign each homeserver to a shard like hs1 -> shard1 and hs2 -> shard2, etc. But that's not very flexible for different numbers of homeservers per shard, and it builds magic-value knowledge into the tests.
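
To make the first alternative concrete, here is a small sketch of parsing the shard out of the hsName. The function name `splitHSName` and the `hs1.shard1` dotted layout as a parsing rule are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// splitHSName splits a shard-qualified name such as "hs1.shard1" into its
// short name and shard. ok is false when no shard is present, which is the
// case for the bare "hs1" that existing Complement tests pass around.
func splitHSName(hsName string) (short, shard string, ok bool) {
	return strings.Cut(hsName, ".")
}

func main() {
	fmt.Println(splitHSName("hs1.shard1"))
	fmt.Println(splitHSName("hs1"))
}
```

The second call is the compatibility problem in miniature: a bare `hs1` carries no shard, so tests written against the default naming scheme give the deployment nothing to parse.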

@MadLittleMods MadLittleMods marked this pull request as ready for review May 16, 2025 21:37
@MadLittleMods MadLittleMods requested review from kegsay and a team as code owners May 16, 2025 21:37
@MadLittleMods MadLittleMods removed request for a team May 16, 2025 21:37
Member

@kegsay kegsay left a comment

The addition of FQDN LGTM.

I'm not particularly convinced about the need for >1 Deployment at the same time; this isn't how Complement was designed to work. Given test authors have the power to control their Deployment impl, it makes sense to add FQDN support and remove the assumption that "hs1" is routable in their deployment.

@MadLittleMods MadLittleMods merged commit 28a0901 into main May 19, 2025
4 checks passed
@MadLittleMods MadLittleMods deleted the madlittlemods/deployment-fqdn-helper branch May 19, 2025 16:34
@MadLittleMods
Collaborator Author

Thanks for the review @kegsay! Curious to get your further opinions/options/guidance on how best to address our multi-Deployment use case.

jevolk pushed a commit to matrix-construct/complement that referenced this pull request Jul 22, 2025
…t custom `Deployment` (matrix-org#780)
