Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A global config and setup method #64

Merged
merged 19 commits into from
Feb 20, 2025
Merged

Conversation

AlanCoding
Copy link
Member

Fixes #44

The argument for this is difficult and something I've struggled with. But it's a strong argument. We have 3 different actors:

  1. The service itself, it runs tasks
  2. The publisher, it submits tasks
  3. Any dispatcherctl type command, exists in AWX, does control-and-reply to get debugging information

Because these are all invoked separately, we should requires the setup() method to be called, and if not, throw an error. Otherwise, it's very difficult to assure that the service is using the same config as the publisher, for example. On higher levels, this is easy to enforce using the Django .ready() method. But this will set the tone for how tests are written on the psycopg layer.

Copy link
Collaborator

@Alex-Izquierdo Alex-Izquierdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this PR is a draft, I'm not sure if some of my questions are premature. I'n sorry in that case.

@AlanCoding AlanCoding force-pushed the class_config branch 2 times, most recently from c8020c0 to 7f5a44d Compare February 12, 2025 18:40
First pass at global config setup

Finish running and just starting on tests

Add a config test

Make half-way progress through demo script

Cut some more stuff out of the config

Fix failing unit test, handle queue can not be found

Review comment to consolidate factory handling

Factories refactor

Adopt new patterns up to some tests passing

Unfinished start on settings serialization
@AlanCoding AlanCoding changed the title [WIP] A global config and setup method A global config and setup method Feb 14, 2025
@AlanCoding AlanCoding marked this pull request as ready for review February 14, 2025 21:07
@AlanCoding AlanCoding requested a review from pb82 February 14, 2025 21:07
@AlanCoding AlanCoding mentioned this pull request Feb 15, 2025
@AlanCoding
Copy link
Member Author

Additional incoming change:

I've changed my tune on how __init__ arguments should be passed. We should be consistent, and this would result in needing to pass related objects as arguments when one object needs another. So the worker pool needs to be passed the process manager. Then the main object needs to be passed the worker pool. Then the factories will work out how to extract the kwargs from the config file. This suggests a new YAML structure like this:

service:
  pool_cls: dispatcher.pool.WorkerPool
  pool_kwargs:
    min_workers: 2
    max_workers: 12
  process_manager_cls: dispatcher.process.ForkServerManager
  process_manager_kwargs:
    preload_modules: my_app.hazmat

I was trying to make something work where we passed the process-manager kwargs into the main object... and this pattern doesn't scale well. And this all needs to be as "dumb", and straight-to-the-code as possible. Rephrasing, making the config file look messy is fine if it makes things simpler overall.

Copy link
Collaborator

@Alex-Izquierdo Alex-Izquierdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have the config a schema from where it can be easily validated?

@AlanCoding
Copy link
Member Author

Should have the config a schema from where it can be easily validated?

No and yes.

yes we can validate the schema. This is not jsonschema, but validating the content, which is a python dictionary by the point we would care to do this. Demo:

from dispatcher.config import setup, settings
setup(from_file='dispatcher.yml')
pkw = settings.service['pool_kwargs']

import inspect
from dispatcher.pool import WorkerPool
signature = inspect.signature(WorkerPool.__init__)
parameters = signature.parameters
valid_kwargs = [k for k, v in parameters.items() if v.kind != inspect.Parameter.VAR_POSITIONAL and v.kind != inspect.Parameter.VAR_KEYWORD]

>>> valid_kwargs
['self', 'worker_id', 'process']
>>> set(pkw.keys()) - set(valid_kwargs)
set()

This assures that all keys are valid. This doesn't do type validation, that would be a general problem that I would rather take to be solved, as opposed to solve myself.

I hope this answers the first part that no we don't want to write out a schema somewhere, because this general method defines the schema we want.

What we're lacking is an articulation of the value-add from doing fancy inspect stuff like this, as opposed to just letting the code go error. Possible answers:

  1. We could use this inspection to construct a schema dynamically. Put it where then?
  2. Validate the config schema in the publisher, not just in the service. But in that case, we could just initialize the objects and let the error happen that way, which would be simpler and more reliable.

@Alex-Izquierdo
Copy link
Collaborator

Should have the config a schema from where it can be easily validated?

No and yes.

yes we can validate the schema. This is not jsonschema, but validating the content, which is a python dictionary by the point we would care to do this. Demo:

from dispatcher.config import setup, settings
setup(from_file='dispatcher.yml')
pkw = settings.service['pool_kwargs']

import inspect
from dispatcher.pool import WorkerPool
signature = inspect.signature(WorkerPool.__init__)
parameters = signature.parameters
valid_kwargs = [k for k, v in parameters.items() if v.kind != inspect.Parameter.VAR_POSITIONAL and v.kind != inspect.Parameter.VAR_KEYWORD]

>>> valid_kwargs
['self', 'worker_id', 'process']
>>> set(pkw.keys()) - set(valid_kwargs)
set()

This assures that all keys are valid. This doesn't do type validation, that would be a general problem that I would rather take to be solved, as opposed to solve myself.

I hope this answers the first part that no we don't want to write out a schema somewhere, because this general method defines the schema we want.

What we're lacking is an articulation of the value-add from doing fancy inspect stuff like this, as opposed to just letting the code go error. Possible answers:

  1. We could use this inspection to construct a schema dynamically. Put it where then?
  2. Validate the config schema in the publisher, not just in the service. But in that case, we could just initialize the objects and let the error happen that way, which would be simpler and more reliable.

I agree that using JSON Schema might be overkill. However, my main concern is not how the validation is done but rather ensuring that validation happens effectively.

The primary argument for validation is the "fail faster, fail earlier" principle, which improves both error handling and user experience. If an error in the configuration can be detected early, we can provide a clear and specific error message explaining what is wrong and why it is not accepted. This prevents unnecessary execution and debugging effort.

I also disagree with the idea of simply initializing the objects and letting the error occur naturally. This approach may lead to errors that are harder to debug or, worse, result in silent failures or unexpected behaviors.

Your approach is certainly creative, but I see some potential issues. It would not account for optional parameters, and it relies on a strict 1:1 relationship between the configuration and function/object signatures. This limits flexibility and, in my opinion, introduces a partial coupling of concerns.

For these reasons, I advocate for an independent configuration that can be validated as early as possible.

@Alex-Izquierdo
Copy link
Collaborator

Alex-Izquierdo commented Feb 18, 2025

For these reasons, I advocate for an independent configuration that can be validated as early as possible.

I would like to clarify that I'm not going to block the PR on this matter. In one way or another, it is something that can be addressed perfectly in later iterations. I'm just sharing my PoV on how it could be. :)

@AlanCoding
Copy link
Member Author

Auto-gen schema could help for documentation and versioning. We could put this in the repo and check for diff in checks. If there's a diff, then we can bump a version.

This is to help with a real problem, which is versioning. My intent was to cut a version (PyPI, or just a tag) before merging this because it requires changes to the eda-server branch. Auto-generating a spec file (that is enforced) would document when and how something changed to expected schema.

But failing earlier doesn't make sense to me. Initializing the objects from the factories (from settings) is "free", and can be done in any context.

@AlanCoding
Copy link
Member Author

This is moving a bit fast, but I went ahead and pushed a commit that matches the description of my last comment.

I appreciate that this would be better as jsonschema. But that requires a general tool to convert type hints to jsonschema, there's no point in us maintaining that logic. Let's file an issue for it. Current format is kind of human readable, but jsonschema would be better.

@AlanCoding AlanCoding merged commit 33d4256 into ansible:main Feb 20, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a class or object to represent the global config
2 participants