Resolving type system imports through SPIs slows things down too much #430

Open
reckart opened this issue Dec 16, 2024 · 0 comments
reckart commented Dec 16, 2024

Describe the refactoring action
UIMA 3.6.0 introduced a new way of resolving imports in descriptors through SPIs via the TypeSystemProvider.findResourceUrl(String) method.

Internally, import resolution is currently handled by the Import_impl.findAbsoluteUrl(ResourceManager) method, which looks up all TypeSystemProvider implementations and then scans them until it finds one that provides the imported resource. Looking up the TypeSystemProvider implementations is particularly slow when using spifly in OSGi contexts. More generally, every SPI lookup instantiates the providers anew, so no information about the type systems is currently cached in the providers. As a result, resolving imports (and similar operations) pays the full lookup cost every time.
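To make the cost pattern concrete, here is a minimal sketch of a lookup that rediscovers and re-instantiates its providers on every call. It does not use the UIMA API: `Provider`, `discoverProviders()` and the `bundle://` strings are hypothetical stand-ins (the nested `Provider` interface only mimics the shape of TypeSystemProvider.findResourceUrl(String), with plain strings instead of URLs), and the discovery pass is simulated rather than going through a real SPI mechanism.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class UncachedImportLookup {
  /** Hypothetical stand-in for a type system provider; not the UIMA interface. */
  interface Provider {
    Optional<String> findResourceUrl(String name);
  }

  static int discoveryPasses = 0;
  static int providersInstantiated = 0;

  /** Simulates one SPI discovery pass: every call creates fresh provider instances. */
  static List<Provider> discoverProviders() {
    discoveryPasses++;
    List<Provider> providers = new ArrayList<>();
    for (Map<String, String> resources : List.of(
        Map.of("org/example/a/TypesA.xml", "bundle://a/TypesA.xml"),
        Map.of("org/example/b/TypesB.xml", "bundle://b/TypesB.xml"))) {
      providersInstantiated++;
      providers.add(name -> Optional.ofNullable(resources.get(name)));
    }
    return providers;
  }

  /** Rediscovers all providers, then scans them until one knows the resource. */
  static Optional<String> findAbsoluteUrl(String name) {
    for (Provider provider : discoverProviders()) {
      Optional<String> url = provider.findResourceUrl(name);
      if (url.isPresent()) {
        return url;
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    findAbsoluteUrl("org/example/b/TypesB.xml");
    findAbsoluteUrl("org/example/b/TypesB.xml");
    // Two identical requests still cost two discovery passes and four instantiations.
    System.out.println(discoveryPasses + " passes, " + providersInstantiated + " instantiations");
  }
}
```

In this toy model, the work grows with the number of imports times the number of providers, which is the amplification described above.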

So there are various options we might look at, e.g.:

Build time

  • The uima-bnd-plugin could possibly be extended to consider not only the imports found in the current module but also to follow these imports transitively while collecting the package names to generate as OSGi package imports. This could be done by just looking at the UIMA import elements (as is done now), or by resolving the imports and then looking at the package names of the imported types (which would work only if types and their XML descriptors live in the same packages, which they often do not). Also, if the scope of visibility of the imports is different at build time than at run time, the generated OSGi package imports might be wrong. Not sure if that could be a problem or if the scope at build time should always prescribe the scope at run time; after all, for Java classes it's like that and it works well (with OSGi).
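The first variant (following import elements transitively and collecting package names) can be sketched as a breadth-first traversal with a visited set. This is not the uima-bnd-plugin code: the in-memory map stands in for parsed descriptors, the descriptor names are made up, and taking everything before the last dot of a by-name import as the package is an assumption made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TransitiveImportPackages {
  /**
   * Walks the import graph starting at root and returns the packages of all
   * directly or transitively imported descriptors (hypothetical model).
   */
  static Set<String> collectPackages(String root, Map<String, List<String>> importsByDescriptor) {
    Set<String> packages = new TreeSet<>();
    Set<String> visited = new HashSet<>();
    Deque<String> queue = new ArrayDeque<>();
    queue.add(root);
    visited.add(root);
    while (!queue.isEmpty()) {
      String descriptor = queue.poll();
      for (String imported : importsByDescriptor.getOrDefault(descriptor, List.of())) {
        int lastDot = imported.lastIndexOf('.');
        if (lastDot > 0) {
          packages.add(imported.substring(0, lastDot));
        }
        // The visited set keeps cyclic imports from looping forever.
        if (visited.add(imported)) {
          queue.add(imported);
        }
      }
    }
    return packages;
  }

  public static void main(String[] args) {
    Map<String, List<String>> imports = Map.of(
        "org.example.app.AppTypes", List.of("org.example.core.CoreTypes"),
        "org.example.core.CoreTypes", List.of("org.example.base.BaseTypes"),
        "org.example.base.BaseTypes", List.of("org.example.core.CoreTypes")); // cycle
    System.out.println(collectPackages("org.example.app.AppTypes", imports));
    // prints [org.example.base, org.example.core]
  }
}
```

Making the transitive step optional, as one of the referenced commits does, would amount to skipping the enqueue step in this sketch.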

Run time

  • ResourceManager or RelativePathResolver implementations could look up TypeSystemProvider implementations once and then cache them. The scope of visibility of one of these is less likely to change over time, and if it does, it is much easier for the user to discard the instance and create a new one. Also, they already cache imported descriptors anyway, so this seems to be the proper place to cache the SPIs as well.
  • TypeSystemProvider implementations could statically cache their information internally. However, that information might become stale as bundles are loaded/unloaded and the scope of visibility changes. Resetting these caches would also be tricky. This does not seem like good life-cycle management.
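A minimal sketch of the first run-time option, assuming a resolver object that owns the cache: providers are discovered on first use and then reused, and resolved names (including misses) are remembered per instance. `CachingImportResolver`, its nested `Provider` interface and the discovery `Supplier` are hypothetical names for illustration and do not reflect the actual ResourceManager or RelativePathResolver API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

public class CachingImportResolver {
  /** Hypothetical stand-in for a type system provider; URLs are plain strings here. */
  interface Provider {
    Optional<String> findResourceUrl(String name);
  }

  private final Supplier<List<Provider>> discovery;
  private List<Provider> providers; // filled on first use, then reused
  private final Map<String, Optional<String>> resolved = new HashMap<>();

  CachingImportResolver(Supplier<List<Provider>> discovery) {
    this.discovery = discovery;
  }

  /** Resolves a name, running provider discovery at most once per resolver instance. */
  synchronized Optional<String> resolve(String name) {
    Optional<String> cached = resolved.get(name);
    if (cached != null) {
      return cached;
    }
    if (providers == null) {
      providers = discovery.get();
    }
    Optional<String> result = Optional.empty();
    for (Provider provider : providers) {
      result = provider.findResourceUrl(name);
      if (result.isPresent()) {
        break;
      }
    }
    resolved.put(name, result); // misses are cached too
    return result;
  }

  public static void main(String[] args) {
    int[] discoveryPasses = {0};
    Map<String, String> resources = Map.of("org/example/Types.xml", "bundle://example/Types.xml");
    CachingImportResolver resolver = new CachingImportResolver(() -> {
      discoveryPasses[0]++;
      List<Provider> found = new ArrayList<>();
      found.add(name -> Optional.ofNullable(resources.get(name)));
      return found;
    });
    resolver.resolve("org/example/Types.xml");
    resolver.resolve("org/example/Types.xml");
    resolver.resolve("org/example/Other.xml");
    System.out.println("discovery passes: " + discoveryPasses[0]); // prints "discovery passes: 1"
  }
}
```

Because the cache lives in the instance rather than in static state, invalidating it when bundles come and go means dropping the resolver and creating a new one, which is the life-cycle argument made in the first bullet.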

Analysis

  • via FSClassRegistry.loadJCasClassesFromSPI() -> TypeSystemProvider.listJCasClasses() -> TypeSystemProvider.listTypeSystemDescriptions() -> TypeSystemDescription.resolveImports().
    • TypeSystemProvider.listTypeSystemDescriptions() returns multiple descriptions that are each resolved separately. This could be changed so that the descriptions are returned unresolved and the resolving happens using a single ResourceManager at the level of TypeSystemProvider.listJCasClasses(), which could then better exploit a caching mechanism added to the ResourceManager.
  • via TypeSystemDescriptionFactory.createTypeSystemDescription() -> TypeSystemProvider.listTypeSystemDescriptions(). Here the problem is that if the type systems are not resolved before being returned, the TypeSystemDescriptionFactory would have to resolve them itself, and it might see different type systems than the bundle containing the individual TypeSystemProvider implementations. TypeSystemDescriptionFactory would always have to resort to SPIs, while the TypeSystemProvider implementations could profit, e.g., from locally declared OSGi package imports.
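The difference between resolving each description separately and resolving all of them through one shared context can be illustrated with a small counting experiment. Everything here is a hypothetical model: descriptions are just lists of import names, the `loader` function stands in for the expensive resolution step, and the `Map` plays the role of a cache that a single ResourceManager might hold.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class SharedResolutionSketch {
  /** Resolves the imports of one description, consulting and filling the given cache. */
  static void resolveImports(List<String> imports, Map<String, String> cache,
      Function<String, String> loader) {
    for (String name : imports) {
      cache.computeIfAbsent(name, loader);
    }
  }

  public static void main(String[] args) {
    List<List<String>> descriptions = List.of(
        List.of("common.Base", "a.TypesA"),
        List.of("common.Base", "b.TypesB"));
    int[] loads = {0};
    Function<String, String> loader = name -> {
      loads[0]++; // count every expensive resolution
      return "resolved:" + name;
    };

    // Each description resolved on its own: the common import is loaded twice.
    for (List<String> description : descriptions) {
      resolveImports(description, new HashMap<>(), loader);
    }
    int separateLoads = loads[0];

    // All descriptions resolved through one shared cache: the common import is loaded once.
    loads[0] = 0;
    Map<String, String> shared = new HashMap<>();
    for (List<String> description : descriptions) {
      resolveImports(description, shared, loader);
    }
    System.out.println("separate: " + separateLoads + ", shared: " + loads[0]);
    // prints "separate: 4, shared: 3"
  }
}
```

The saving depends on how much the descriptions overlap; the sketch says nothing about the visibility concern raised in the second bullet, which a shared resolver would still have to address.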

Expected benefit
Situations that require looking up type systems and resolving imports should be fast again.

@reckart reckart self-assigned this Dec 16, 2024
reckart added a commit that referenced this issue Dec 16, 2024
…own too much

- Clean up ClassLoaderUtils a bit
- Added missing since version to deprecation
- Consolidate redundant code inside RelativePathResolver_impl
reckart added a commit that referenced this issue Dec 16, 2024
…own too much

- Allow UimaBndPlugin to dive into transitive imports and process them as well
reckart added a commit that referenced this issue Dec 18, 2024
…own too much

- Make transitive import resolving optional
- Added more debug logging
reckart added a commit that referenced this issue Dec 20, 2024
…own too much

- uimaFIT will now use the UIMA Framework classloader to discover ValidationChecks, so the uimaj-core bundle needs to declare that SPI in its OSGi metadata
- Fix copy/paste issue in SerialFormat JavaDoc