-
Notifications
You must be signed in to change notification settings - Fork 840
.NET: Python: add ADR for threads #2420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
.NET: Python: add ADR for threads #2420
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces an Architecture Decision Record (ADR) proposing changes to the thread management system in the Python agent framework. The ADR addresses confusion around the current AgentThread class, which can exist in multiple states and hold different types of data, making it difficult for users to understand and use correctly.
Key Changes
- Proposes two options for restructuring thread management: maintaining the current single-class approach vs. separating into distinct
RemoteThreadandLocalThreadclasses - Documents the current implementation's limitations and the decision drivers focused on ease of use and clarity
- Presents trade-offs between consistency with .NET implementation and Python-specific design patterns
| - We would then add a flag on ChatClients, to indicate which type of thread they support, and it can be both, so two flags are likely needed, although local thread might always be possible. | ||
| - And finally, all Agents would get two methods, `get_service_thread()`/`get_remote_thread` and `get_local_thread()`, both of which might raise an error if the chat client does not support that type of thread. | ||
| - the `run` methods would take both types of threads, but would raise an error if the thread type is not supported by the chat client. And it would also check with the `store` parameter to make sure it is used correctly, or set it correctly. | ||
| - One open question is how to handle when there is a mismatch between the thread type and the `store` parameter, for example passing a `LocalAgentThread` with `store=True`, or a `ServiceAgentThread` with `store=False`. Options are to either raise an error, or to ignore the `store` parameter and always do the right thing based on the thread type. Or to transform the thread into the right type, but that seems more complex and might not always be possible. Although starting with a local thread (which would be a list of chat messages in a ChatMessageStore) and then setting store=True might make sense, the return would be a service thread then, but that adds complexity, this might be useful for workflows that combine different agent types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the fact that it's explicit, but this also makes it more complicated with store parameter, because at the moment you only need to change this parameter value True/False, while with a separate thread type you would also need to change that thread type, so basically configuring the same logic in two different places.
If we introduce a separate thread type for remote and local messages, maybe we can remove store parameter in this case. Instead of using store parameter, we will just check a thread type, and by default it will be a local one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, we do need to look at both conversation_id and store parameters in this case. I'll add
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only challenge is that chat options work at the chat client level, while this works at the agent level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added more clarity on this choice, and the option I think makes most sense.
| - Good, because it is clear which chat clients support which type of thread. | ||
| - Good, because we can make all the logic that deals with threads much clearer, as each class has a single responsibility. | ||
| - Good, because it might also enable a abstracted method to get a list of chat messages from a thread through the chat client. | ||
| - Bad, because it would more fundamentally diverge from dotnet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if this API brings more clarity, it can be introduced in .NET as well. Otherwise, I'm not sure if we want to diverge from .NET this much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When discussing with Weslie we also came to the conclusion that we might invert the order, and do this for python and follow with .net
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main problem in .net is that we don't know what kind of thread is supported by the underlying service until we get the first response from the chat client and changing this will be challenging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main problem in .net is that we don't know what kind of thread is supported by the underlying service until we get the first response from the chat client and changing this will be challenging.
@eavanvalkenburg Isn't it the same in Python?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is, but since currently all the clients are built by us and part of AF, it's a onetime breaking change to add the flag to the protocol and base_client and then we are good to go, and if someone implements their own, it's a simple addition. For dotnet, this is a bit more tricky
4510170 to
531d751
Compare
| - Bad, because it is unclear which chat clients can support which type of thread. | ||
| - Bad, because dotnet also has subclasses for each type of agent, so already somewhat diverging from dotnet. | ||
|
|
||
| ### 2. Separate classes for `ServiceThread`/`RemoteThread` and `LocalAgentThread`/`LocalThread`, each with their own behaviors and methods. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the explicit-ness of the separate classes. I think doing it this way will be better for developers, too, as they'd know which type they're working with and what's available in each object.
4200763 to
db563ac
Compare
| The first issue with threads is that depending on whether the `service_thread_id` is filled we treat it differently from a thread with `chat_message_store` filled, and when neither is filled it is considered uninitialized. Further, depending on the state, certain methods behave differently, such as `on_new_messages`, which is a no-op for threads with a `service_thread_id`, but calls `add_messages` on the `chat_message_store` for threads with that filled. When calling `agent.get_new_thread` it is also unclear what is referred to, a service side thread, a locally stored thread or neither. Finally, when passing in a thread of one type, but also setting the `store` and `conversation_id` parameters when calling `agent.run` might either raise a error, change something or does not work as expected. | ||
|
|
||
| ### Issue 2: ChatMessageStore usage | ||
| The second issue is the usage of `ChatMessageStore` as the way to store messages locally, while it works well, it does not capture a whole thread, nor can it be used as a way to store either messages or a service id, leading to the need for a separate class that can store `AgentThread` objects, meaning we either need two abstractions and implementations, one for `ChatMessageStore` and one for `AgentThreadStore`, or we need to complicate the `AgentThread` class further by adding an id and save/load methods. Or people will forego using the ChatMessageStore altogether and just use a AgentThreadStore and the built-in in-memory ChatMessageStore. Giving us two versions of doing similar things: "storing threads", where one works for all thread (AgentThreadStore) and one only for local threads (ChatMessageStore). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it does not capture a whole thread - by "whole thread" you mean it does not capture all the messages in the thread, or you mean messages and some additional thread attributes?
| This approach would mean: | ||
| - Creating two subclasses of AgentThread, one for service threads and one for local threads, both with different attributes and methods. Tentatively called `RemoteThread` and `LocalThread`. | ||
| - Removing `ChatMessageStore`, instead a `LocalThread` would have a list of ChatMessages as attribute. | ||
| - Moving `ContextProvider` back into Agent, replacing with a field `context_data`/`context_state` or a dict of `context_provider_name/id: context_data/state` on both thread types, which the agent would then use to get the context from the context providers when running in that thread. This makes the thread itself state-only, and the context provider can be stateless and live in the agent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if context provider should be part of the thread at all, it feels to me like an optional component that should be part of the Agent.
| - Moving `ContextProvider` back into Agent, replacing with a field `context_data`/`context_state` or a dict of `context_provider_name/id: context_data/state` on both thread types, which the agent would then use to get the context from the context providers when running in that thread. This makes the thread itself state-only, and the context provider can be stateless and live in the agent. | ||
| - The protocol/interface for ContextProviders would need a slight update, one to return a initial `context_data/state` object, the logic of which is maintained by the provider (whether it matches a app, user or session and what to record in there), and adding that `context_data/state` to the invoked and invoking methods. The `context_data/state` needs to be (de)serializable in a standard way, to make handling the thread easier. | ||
| - We would need to add a flag on `ChatClients`, to indicate if they support `remote/service` threads, and we assume that we always support `local` threads. | ||
| - All Agents would get two methods, `get_service_thread(thread_id: str | None = None, ...)`/`get_remote_thread(thread_id: str | None = None, ...)` and `get_local_thread(messages: list[ChatMessage] | None = None, ...)`, both of which might raise an error if the chat client does not support that type of thread, after creation the agent then calls the context_provider(s) to get a `context_data/state` assigned as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In previous point there is a statement and we assume that we always support local threads, but in this point we say that both remote and local thread get methods might raise an error if the chat client does not support that type of thread. But if we always support local threads (which makes sense to me), then method get_local_thread should never throw.
| - `LocalThread` | ||
| - `ClientSideThread` | ||
|
|
||
| `RemoteThread` and `LocalThread` seem the clearest and most concise options and the most pythonic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like RemoteThread and LocalThread, but if we are doing this refactoring, we should also make a decision if thread terminology is the final option, or we want to think about conversation and any other potential alternatives. I remember there was some feedback about thread naming.
Also, one potential alternative to RemoteThread is HostedThread. It's also short and works well with tool types like HostedMCPTool.
|
|
||
| `RemoteThread` and `LocalThread` seem the clearest and most concise options and the most pythonic. | ||
|
|
||
| So that gives the following: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another "bad" point:
Bad, because users will need to switch from one get_new_thread method to two different methods get_remote_thread and get_local_thread and there may be a situation when one of the methods will throw an exception. While get_new_thread always returns some instance.
Motivation and Context
ADR proposing changes to the thread setup in Python.
Description
Contribution Checklist