Contributor

@BenAtAmazon BenAtAmazon commented Oct 7, 2025

Proposed Changes

This PR adds support for multiple hostname paths in the AWS peer discovery plugin to enable zero-downtime rolling upgrades during hostname migration scenarios. The implementation allows RabbitMQ nodes to discover peers using multiple hostname paths, ensuring cluster formation succeeds even when nodes are configured with different hostname paths during rolling upgrades.

Backward Compatibility

Existing single hostname_path configuration continues to work (unchanged).
We fall back to single-path behavior when no numbered paths are configured.

Configuration Examples

Multiple paths for zero-downtime migration
cluster_formation.aws.hostname_path.1 = networkInterfaceSet,2,privateIpAddressesSet,1,privateDnsName
cluster_formation.aws.hostname_path.2 = privateDnsName
cluster_formation.aws.hostname_path.3 = networkInterfaceSet,1,privateIpAddressesSet,2,privateIpAddress

Note: This follows the existing pattern we have for classic_config:

cluster_formation.classic_config.nodes.1 = rabbit@<hostnameA>
cluster_formation.classic_config.nodes.2 = rabbit@<hostnameB>
cluster_formation.classic_config.nodes.3 = rabbit@<hostnameC>

Single path (backward compatible)
cluster_formation.aws.hostname_path = privateDnsName
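
To illustrate the fallback rule described above, here is a minimal Erlang sketch of how numbered paths could take precedence over the single legacy path. It is not the plugin's actual code: the module name, the function names, and the map shape with `numbered` and `single` keys are assumptions made for the example.

```erlang
-module(hostname_paths_sketch).
-export([parse_path/1, effective_paths/1]).

%% Turn "networkInterfaceSet,2,privateDnsName" into
%% ["networkInterfaceSet", 2, "privateDnsName"]. Purely numeric segments
%% become integers (1-based positions); everything else stays a string key.
-spec parse_path(string()) -> [string() | integer()].
parse_path(Raw) ->
    [to_segment(S) || S <- string:lexemes(Raw, ",")].

to_segment(S) ->
    case string:to_integer(S) of
        {Int, []} -> Int;   % the whole segment was a number
        _ -> S              % not a number, or only partly numeric
    end.

%% Hypothetical config shape (an assumption for this sketch):
%%   #{numbered => #{1 => "...", 2 => "..."}, single => "..."}
%% Numbered paths win when at least one is present, ordered by their number.
%% Otherwise the single legacy path is used on its own.
-spec effective_paths(map()) -> [[string() | integer()]].
effective_paths(#{numbered := Numbered}) when map_size(Numbered) > 0 ->
    [parse_path(Raw) || {_N, Raw} <- lists:keysort(1, maps:to_list(Numbered))];
effective_paths(#{single := Raw}) ->
    [parse_path(Raw)];
effective_paths(_) ->
    [].
```

With this sketch, `effective_paths(#{single => "privateDnsName"})` yields a one-element list of paths, which mirrors the unchanged single-path behavior.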

Types of Changes

What types of changes does your code introduce to this project?
Put an x in the boxes that apply

  • Bug fix (non-breaking change which fixes issue #NNNN)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause an observable behavior change in existing systems)
  • Documentation improvements (corrections, new content, etc)
  • Cosmetic change (whitespace, formatting, etc)
  • Build system and/or CI

Checklist

Put an x in the boxes that apply.
You can also fill these out after creating the PR.
This is simply a reminder of what we are going to look for before merging your code.

  • Mandatory: I (or my employer/client) have signed the CA (see https://github.com/rabbitmq/cla)
  • I have read the CONTRIBUTING.md document
  • I have added tests that prove my fix is effective or that my feature works
  • All tests pass locally with my changes
  • If relevant, I have added necessary documentation to https://github.com/rabbitmq/rabbitmq-website
  • If relevant, I have added this change to the first version(s) in release-notes that I expect to introduce it

Collaborator

@lukebakken lukebakken left a comment

Please be aware that the genie is extremely lazy with whitespace and will gladly add space characters on blank lines, at the end of lines, etc.

@BenAtAmazon BenAtAmazon force-pushed the aws/add-peer-discovery-multi-hostname-path branch from ca6b404 to 175a956 on October 10, 2025 03:06

end.

-spec get_value(string()|integer(), props()) -> props().
get_value(_, []) ->

Collaborator

@lukebakken lukebakken Oct 9, 2025

Suggested change
get_value(Key, []) when is_integer(Key) ->
    [];
get_value(Key, Props) when is_integer(Key) ->
    {"item", Props2} = lists:nth(Key, Props),
    Props2;
get_value(Key, Props) ->
    Value = proplists:get_value(Key, Props),
    sort_ec2_hostname_path_set_members(Key, Value).

Collaborator

@the-mikedavis the-mikedavis Oct 10, 2025

The second clause would be redundant, no? It should always be covered by the first clause.

(suggestion edited)

Contributor Author

@BenAtAmazon BenAtAmazon Oct 15, 2025

Note: I added this new "Malformed data" case in response to the boot error below, which occurs when an invalid hostname path is used (e.g. one pointing to a 3rd ENI that doesn't exist). Hopefully this maintains the intent of this suggestion.

get_value(Key, Props) when is_integer(Key), is_list(Props), length(Props) >= Key, Key > 0 ->
    case lists:nth(Key, Props) of
        {"item", Props2} -> Props2;
        _ -> []  % Malformed data
    end;

Error without above case:

BOOT FAILED
===========
Exception during startup:

error:function_clause

    rabbit_peer_discovery_util:node_name/1, line 210
        args: "1"
    rabbit_peer_discovery_aws:-get_autoscaling_group_node_list/2-lc$^0/1-0-/1, line 207
    rabbit_peer_discovery_aws:get_autoscaling_group_node_list/2, line 207
    rabbit_peer_discovery:discover_cluster_nodes/1, line 316
    rabbit_peer_discovery:sync_desired_cluster/3, line 196
    rabbit_db:init/0, line 64
    rabbit_boot_steps:-run_step/2-lc$^0/1-0-/2, line 53
    rabbit_boot_steps:run_step/2, line 60
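
For reference, a self-contained sketch of how the guarded clause above might sit alongside fallback clauses so that an out-of-range index yields an empty result. The module name, the `walk/2` helper, and the fallback clauses are assumptions for illustration, and the sketch leaves out `sort_ec2_hostname_path_set_members/2` from the real module.

```erlang
-module(get_value_sketch).
-export([get_value/2, walk/2]).

%% Integer keys select the Nth {"item", Props} tuple (1-based).
get_value(Key, Props) when is_integer(Key), is_list(Props),
                           Key > 0, length(Props) >= Key ->
    case lists:nth(Key, Props) of
        {"item", Props2} -> Props2;
        _ -> []                      % malformed entry
    end;
%% Integer key that is out of range, or applied to a non-list value.
get_value(Key, _Props) when is_integer(Key) ->
    [];
%% String keys are plain proplist lookups; a missing key yields [].
get_value(Key, Props) when is_list(Props) ->
    proplists:get_value(Key, Props, []);
get_value(_Key, _Props) ->
    [].

%% Hypothetical helper: follow a whole hostname path, one segment at a time.
%% lists:foldl/3 calls get_value(Segment, Acc), so Acc carries the props.
walk(Path, Props) ->
    lists:foldl(fun get_value/2, Props, Path).
```

Given `Props = [{"networkInterfaceSet", [{"item", [{"privateDnsName", "ip-10-0-0-1.ec2.internal"}]}]}]`, the path `["networkInterfaceSet", 1, "privateDnsName"]` resolves to the DNS name, while `["networkInterfaceSet", 3, "privateDnsName"]` resolves to `[]`. A caller would still need to skip such empty results before building a node name.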

@BenAtAmazon BenAtAmazon force-pushed the aws/add-peer-discovery-multi-hostname-path branch from 175a956 to b75758a on October 15, 2025 18:38

Adds support for multiple hostname paths in the AWS peer discovery plugin to enable zero-downtime rolling upgrades during hostname migration scenarios.

The implementation allows RabbitMQ nodes to discover peers using multiple hostname paths, ensuring cluster formation succeeds even when nodes are configured with different hostname paths during rolling upgrades.

Example usage:

cluster_formation.aws.hostname_path.1 = privateDnsName
cluster_formation.aws.hostname_path.2 = privateIpAddress

@BenAtAmazon BenAtAmazon force-pushed the aws/add-peer-discovery-multi-hostname-path branch from b75758a to 7cdb505 on October 15, 2025 19:53

end}]
}).

invalid_hostname_paths_graceful_handling(_Config) ->

Contributor Author

Also added these new test cases to signify intent and improve coverage.
