Describe the refactoring action
UIMA 3.6.0 introduced a new way of resolving imports in descriptors through SPIs via the TypeSystemProvider.findResourceUrl(String) method.
Internally, import resolution is currently handled by the Import_impl.findAbsoluteUrl(ResourceManager) method, which looks up all TypeSystemProvider implementations and then scans them until it finds one that provides the imported resource. Looking up the TypeSystemProvider implementations is particularly slow when using spifly in OSGi contexts. More generally, every SPI lookup instantiates the providers anew, so the providers currently cannot retain any cached information about the type systems. As a result, resolving imports is prone to be slow, and the lack of caching amplifies that slowness.
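To make the cost concrete, here is a minimal sketch of the lookup-then-scan pattern described above. It is not the UIMA code: `DescriptorLocator`, `MapLocator` and `UncachedImportResolver` are hypothetical stand-ins, the `Supplier` stands in for the SPI lookup, and `URI` is used in place of a URL only to keep the example free of checked exceptions.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical stand-in for an SPI such as TypeSystemProvider (not the UIMA interface).
interface DescriptorLocator {
    Optional<URI> findResourceUrl(String name);
}

// Hypothetical provider that knows a fixed set of descriptors.
final class MapLocator implements DescriptorLocator {
    private final Map<String, URI> known;

    MapLocator(Map<String, URI> known) {
        this.known = known;
    }

    @Override
    public Optional<URI> findResourceUrl(String name) {
        return Optional.ofNullable(known.get(name));
    }
}

// Resolves an import by discovering all providers again on every call.
final class UncachedImportResolver {
    private final Supplier<List<DescriptorLocator>> discovery; // stands in for the SPI lookup
    int discoveryCount = 0; // how often the (expensive) discovery ran

    UncachedImportResolver(Supplier<List<DescriptorLocator>> discovery) {
        this.discovery = discovery;
    }

    Optional<URI> resolve(String name) {
        discoveryCount++;
        // Scan the freshly discovered providers until one knows the resource.
        for (DescriptorLocator locator : discovery.get()) {
            Optional<URI> hit = locator.findResourceUrl(name);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }
}
```

In this sketch every call to `resolve` pays for a full discovery, which is the behaviour the options below try to avoid.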
So there are various options we might look at, e.g.:
Build time
The uima-bnd-plugin could possibly be extended to not only consider the imports found in the current module but also follow these imports transitively while collecting the package names to generate as OSGi package imports. This could be done by just looking at the UIMA import elements (as is done now) - or - by resolving the imports and then looking at the package names of the imported types (which would work only if types and their XML descriptors live in the same packages - which they often do not). Also, if the scope of visibility of the imports is different at build time than at run time, the generated OSGi package imports might be wrong - not sure if that could be a problem or if the scope at build time should always prescribe the scope at run time... after all, for Java classes it's like that and it works well (with OSGi).
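A rough sketch of the first variant (following import elements transitively and collecting package names) could look like this. It does not use the uima-bnd-plugin API: the class `TransitiveImports`, the method `packagesFor`, and the map from descriptor name to imported names are all hypothetical, and it assumes imports are by dotted name so that the package is everything before the last dot.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

final class TransitiveImports {
    // importsByDescriptor is a hypothetical pre-parsed view: descriptor name -> names it imports.
    static SortedSet<String> packagesFor(String root, Map<String, List<String>> importsByDescriptor) {
        SortedSet<String> packages = new TreeSet<>();
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            if (!seen.add(current)) {
                continue; // already visited, which also protects against import cycles
            }
            for (String imported : importsByDescriptor.getOrDefault(current, List.of())) {
                int dot = imported.lastIndexOf('.');
                if (dot > 0) {
                    packages.add(imported.substring(0, dot)); // package part of the dotted name
                }
                todo.push(imported); // follow the import transitively
            }
        }
        return packages;
    }
}
```

The sketch only sees what is visible at build time, so the caveat about differing scopes at build time and run time applies to it unchanged.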
Run time
ResourceManager or RelativePathResolver implementations could look up TypeSystemProvider implementations once and then cache them. The scope of visibility of one of these is less likely to change over time - and if so, it is much easier for the user to discard them and create new ones. Also, they already cache imported descriptors anyway, so it seems to be the proper place to also cache the SPIs.
TypeSystemProvider implementations could internally cache their information in static fields. However, that information might become stale as bundles are loaded/unloaded and the scope of visibility changes. It would also be tricky to reset these caches, so this does not seem like good life-cycle management.
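The first run-time option (look up the providers once per resolver instance and keep them) might be sketched as follows. This is an illustration under assumptions, not the ResourceManager or RelativePathResolver API: `CachingImportResolver` is a hypothetical class, each provider is modelled as a plain `Function`, and the `Supplier` stands in for the SPI lookup.

```java
import java.net.URI;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical resolver that discovers its providers once and reuses them.
final class CachingImportResolver {
    private final Supplier<List<Function<String, Optional<URI>>>> discovery; // stands in for the SPI lookup
    private List<Function<String, Optional<URI>>> cached; // null until the first resolve call
    int discoveryCount = 0; // how often the (expensive) discovery ran

    CachingImportResolver(Supplier<List<Function<String, Optional<URI>>>> discovery) {
        this.discovery = discovery;
    }

    synchronized Optional<URI> resolve(String name) {
        if (cached == null) {
            cached = List.copyOf(discovery.get()); // discover lazily, exactly once per instance
            discoveryCount++;
        }
        for (Function<String, Optional<URI>> provider : cached) {
            Optional<URI> hit = provider.apply(name);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }
}
```

There is deliberately no reset method here: if the scope of visibility changes, the instance is discarded and a new one is created, which keeps the life cycle of the cache tied to the life cycle of its owner.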
Analysis
via FSClassRegistry.loadJCasClassesFromSPI() -> TypeSystemProvider.listJCasClasses() -> TypeSystemProvider.listTypeSystemDescriptions() -> TypeSystemDescription.resolveImports().
TypeSystemProvider.listTypeSystemDescriptions() returns multiple descriptions, each of which is resolved separately. This could be changed so that the descriptions are returned unresolved and the resolving happens at the level of TypeSystemProvider.listJCasClasses() using a single ResourceManager, which could then better exploit a caching mechanism added to the ResourceManager.
via TypeSystemDescriptionFactory.createTypeSystemDescription() -> TypeSystemProvider.listTypeSystemDescriptions(). The problem there is that not resolving type systems before returning them would mean that the TypeSystemDescriptionFactory would have to resolve them itself - and it might see different type systems than the bundle containing the individual TypeSystemProvider implementation. TypeSystemDescriptionFactory would always have to resort to SPIs, while the TypeSystemProvider implementations could profit e.g. from locally declared OSGi package imports.
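For the first call path, the benefit of resolving all descriptions through one shared resolver can be sketched like this. `SharedImportCache` is a hypothetical class that only mimics the idea of a single ResourceManager with a descriptor cache; the `loader` function and the string payloads are placeholders for actually loading and parsing a descriptor.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical shared cache used while resolving several descriptions in one pass.
final class SharedImportCache {
    private final Function<String, String> loader; // placeholder for loading a descriptor by name
    private final Map<String, String> loaded = new HashMap<>();
    int loadCount = 0; // how many descriptors were actually loaded

    SharedImportCache(Function<String, String> loader) {
        this.loader = loader;
    }

    String load(String name) {
        String hit = loaded.get(name);
        if (hit == null) {
            hit = loader.apply(name); // only reached the first time a name is requested
            loaded.put(name, hit);
            loadCount++;
        }
        return hit;
    }

    // Resolves the imports of several descriptions against the same cache.
    void resolveAll(List<List<String>> importsPerDescription) {
        for (List<String> imports : importsPerDescription) {
            for (String name : imports) {
                load(name);
            }
        }
    }
}
```

With two descriptions that both import the same base descriptor, the shared cache loads that base descriptor once, whereas resolving each description with its own fresh resolver would load it twice.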
Expected benefit
Situations that require looking up type systems and resolving imports should be fast again.
…own too much
- Clean up ClassLoaderUtils a bit
- Added missing since version to deprecation
- Consolidate redundant code inside RelativePathResolver_impl
…own too much
- uimaFIT will now use the UIMA Framework classloader to discover ValidationChecks, so the uimaj-core bundle needs to declare that SPI in its OSGi metadata
- Fix copy/paste issue in SerialFormat JavaDoc