Skip to content

APP-7995: Don't allow two module configs to point to same ExePath #4897

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

michaellee1019
Copy link
Member

@michaellee1019 michaellee1019 commented Apr 8, 2025

Abe and I found a situation where if two modules point to the same ExePath, the module will be loaded twice and neither will work. This validation check will error on adding the 2nd module from the config when the ExePath exists.

For example this config is currently an issue:

    {
      "type": "registry",
      "name": "michaellee1019_gpio-flicker_from_reload",
      "module_id": "michaellee1019:gpio-flicker",
      "version": "latest-with-prerelease",
      "reload_path": "/root/.viam/packages-local/michaellee1019_gpio-flicker_from_reload-bin-module.tar.gz",
      "reload_enabled": true
    },
    {
      "type": "local",
      "name": "michaellee1019_gpio-flicker_from_reload-local",
      "executable_path": "/root/.viam/packages-local/michaellee1019_gpio-flicker_from_reload-bin-module.tar.gz"
    }

Tested on a machine in my personal org https://app.viam.com/machine/9dca9f1a-2517-49ee-806e-7c72ec022d5d/configure/builder

4/9/2025, 10:34:58 AM error rdk.resource_manager   impl/resource_manager.go:1143   error adding modules   error An existing module michaellee1019_gpio-flicker_from_reload-local already exists with the same executable path as module michaellee1019_gpio-flicker_from_reload

4/9/2025, 10:34:58 AM error rdk.modmanager.michaellee1019_gpio-flicker_from_reload   modmanager/manager.go:319   Error adding module   module michaellee1019_gpio-flicker_from_reload  error An existing module michaellee1019_gpio-flicker_from_reload-local already exists with the same executable path as module michaellee1019_gpio-flicker_from_reload

4/9/2025, 10:34:58 AM info rdk.modmanager.michaellee1019_gpio-flicker_from_reload   modmanager/manager.go:316   Now adding module   module michaellee1019_gpio-flicker_from_reload

4/9/2025, 10:34:48 AM info rdk   impl/local_robot.go:1302   Robot (re)configured

And the first module that is loaded works fine.

@michaellee1019 michaellee1019 added the appimage Build AppImage of PR label Apr 8, 2025
@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Apr 8, 2025
@michaellee1019 michaellee1019 changed the title Don't allow two module configs to point to same ExePath APP-7995: Don't allow two module configs to point to same ExePath Apr 8, 2025
@michaellee1019 michaellee1019 marked this pull request as ready for review April 8, 2025 21:01
@michaellee1019 michaellee1019 added appimage-ignore-tests Build AppImage of PR and ignore tests and removed appimage Build AppImage of PR labels Apr 8, 2025
Copy link
Member

@abe-winter abe-winter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good + your suite / live tests seem legit

left non-blocking design questions

@@ -343,6 +343,11 @@ func (mgr *Manager) add(ctx context.Context, conf config.Module, moduleLogger lo
return nil
}

exists, existingName := mgr.execPathAlreadyExists(&conf)
if exists {
return errors.Errorf("An existing module %s already exists with the same executable path as module %s", existingName, conf.Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this break the entire reconfigure? or does the reconfigure loop continue for other modules

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will only stop reconfigure for this module. The reconfigure loop is higher up in the stack, this method is called for each module in the case where there are multiple modules being reconfigured.

So this one module would error, and the existing module that is already configured with the same exepath will keep running.

@@ -442,6 +447,11 @@ func (mgr *Manager) Reconfigure(ctx context.Context, conf config.Module) ([]reso
return nil, errors.Errorf("cannot reconfigure module %s as it does not exist", conf.Name)
}

exists, existingName := mgr.execPathAlreadyExists(&conf)
if exists {
return nil, errors.Errorf("An existing module %s already exists with the same executable path as module %s", existingName, conf.Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is happening in reconfigure. thinking through the broader flow here:

  • I create module 1
  • I create module 2 with same exe path
  • module 2 fails to add()
  • now I reconfigure module 1
  • does it error, leaving zero working copies? or is module 2 not in mgr.modules, because it failed to add()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. the 2nd module will never be added to mgr.modules in add(). An error is returned during this validation before it is attempted to be started and added to mgr.modules.

So in the case you brought up, module 1 will be added, module 2 will not, when module 1 is reconfigured it will pass this validation because it has no awareness that module 2 is failing to be added.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also did a another round of manual testing and ensured this is the case. Didn't see this issue or anything else in the various scenarios I tested.

@michaellee1019
Copy link
Member Author

TY for the quick review!

@michaellee1019 michaellee1019 added viam-org Is part of the viamrobotics organization. safe to test This pull request is marked safe to test from a trusted zone and removed appimage-ignore-tests Build AppImage of PR and ignore tests viam-org Is part of the viamrobotics organization. safe to test This pull request is marked safe to test from a trusted zone labels Apr 9, 2025
@michaellee1019 michaellee1019 merged commit 04621bc into viamrobotics:main Apr 15, 2025
61 of 73 checks passed
@michaellee1019 michaellee1019 deleted the ml/prevent-dup-exepaths branch April 15, 2025 14:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants