added dspy.Stateful module for automatic history management #8798
I can see how this is a quality-of-life feature, but it is also confusing. It is basically a chat app wrapper, whereas typically the application decides programmatically how much history to send. For many other patterns of DSPy programs, say rewriting a query before a search, it is not exactly helpful. Of course the user has to choose when to apply statefulness vs. statelessness, but I wonder how needed this is. That said, if it's extensible, so that you could add other modules or MCPs, then maybe it's something. Not sure.
I really like this as a quality of life feature! |
Thanks for the PR!
@okhat I am not entirely sure whether we should add this. My personal opinion is that users should manage the conversation history themselves, because it is customizable (e.g., how to truncate the history, how to compress the conversation, and so on), but maybe you have seen this request from the community?
as much as i like being able to brag about writing my own modules because there were no chat adapters back in the day, i think i can safely say that there is sufficient community interest around easier abstractions for stateful agents: https://x.com/raw_works/status/1965459911686631866 even if this PR isn't the right approach, why do we have to pass history back in manually as an input? https://deepwiki.com/search/how-does-dspy-handle-chat-hist_4be30063-0511-4977-bd04-f61a08b95af1 if you don't like the style of this PR, maybe there can just be a "stateful" boolean flag built in? from my perspective, this should be as easy for the user as setting cache=true/false. lots of good reasons you would or would not want cache. lots of good reasons you would or would not want chat history. (2025 is the year of agents, btw)
I echo rawwerks here. I'd encourage working even more on it to address some of the concerns mentioned, through innovation and technical solutions, but if ChainOfThought has its place, I don't see why this one would not. Unless DSPy decides to stay low level. That is valid, but if so, it should be stated explicitly, in many places, so that it helps the community know where to contribute such things in the dspyverse.
Oh, I love this discussion :) really appreciate it. The PR was more meant to gauge the interest in a feature like this. I'm sure that the implementation can be further enhanced. But here are my two cents, why I think something like this is a good idea and actually in line with DSPy's goal/philosophy.
That being said, I can absolutely understand if this implementation is too specific or non-scalable, or if the DSPy team wants to keep history management more low level. In the end, the core maintainers are the people who know the library, its direction and its overall philosophy best. To address some points raised by @joelgrus: @1: I imagine it being used by creating a new instance of the class for every new interaction, like you would initiate a new

Again, I really value this constructive discussion here :)
so here's a very toy example of multiplayer, multi-turn QA running as an API (which seems like a very reasonable / common use case): https://gist.github.com/joelgrus/eb144fda2d9b94429ba1ed1ca48e2861 I think that implementing conversation history as "stateful" modules precludes this kind of application, you'd have to somehow maintain / persist a copy of the module for every conversation (which seems bad!). And if you want to scale to multiple backend servers, then what? and the non-"database" parts of history are actually very little code. having written that out, there are two parts of the history that do feel unergonomic:
So to echo @joelgrus a bit more (sorry), in a chat system, the following could also happen:
A proper chat system is actually a lot more complicated if we want intelligence, and the better practice is usually to make APIs stateless. I do think that while

While it is painful and cumbersome to keep writing boilerplate (like passing in messages), I don't think application developers are usually that opposed to writing it, because usually you'd want more control on the application side. But I can see how there may be a class of things (as opposed to just modules) like this one that become "lightweight applications" (rather than just programs), especially for prototyping. I don't see how easy it is to transition these to production, though.
Someone was sharing their work on Discord: https://www.modaic.dev/
I don't know if this is going to be it, but I do think that while there's interest on the application side, it can probably live outside of core.
Sorry in advance: I might sound a little grumpy. My motivation here is to defend the less experienced developers and beginners out there (the very folks I've spent the past few months introducing to dspy).

Many of you are questioning whether this should be in the core without explaining why raising that question is even worth it. What are the real drawbacks of including it in the core, and why are those drawbacks more important than the benefits seen by those who want it? Please remember, not all Python users are app or software developers. There are many types of Python users, and it's likely that other groups need the very things you dismiss. Please explain how its presence in the core would actively harm the Python users you're most familiar with, and why that harm outweighs the benefits it might provide to other users you may be less familiar with.

Personally, I love dspy because it boldly removes boilerplate and distills AI programming to its essential parts. The history and current approaches to statefulness are very much at odds with that vision, and I dislike arguments like "it's not that hard" or "you get more control". If dspy were about control and ease, we would still be making raw HTTP requests. Also, there was no suggestion to remove dspy.History: you can still reach for that control and code with dspy.History if you need or want to.

It can be quite discouraging to work on improving the statefulness problem in modules and programs when the conversation isn't focused on how to solve it and the possible pitfalls, but instead on "it doesn't belong here." I keep telling everyone that dspy is open to pull requests and exists for everyone's use. But the conversation here, and the one around template adapters, makes me reevaluate that belief.
I take the time to write this because I am 100% convinced that those questioning dspy.Stateful have very valid experiences, knowledge and perspectives that would help us reach the 'ideal' design of that module. You have shared some of that, and this is great! But I think it would help dspy and its community more if we all focused on identifying the problems in the dspy.Stateful module, brainstormed potential solutions, and only afterwards decided to give up if it is too hard for us, instead of shutting it down right out of the gate because dspy.History gives us control and it's 'not that hard'.

I have had to decide three times now to make libraries in the dspyverse but not in the core: ovllm, functai and attachments. I can see that not everything goes in core, but addressing statefulness / multi-turn use seems very core to me, and as proof of that, dspy.History exists and is in core. I simply believe we still have work to do and we should encourage that work. It appears to me that the solution will be some sort of module modifier, and this PR is very much a step in that direction.
@MaximeRivest Very fair points, and sorry for making so many assumptions over the threads. Also thank you for holding the thread accountable. I'm going to take a stab at this. First, this obviously is still up to the maintainers, as opinions and taste matter. Personally, I see the value of the motivation itself, even as part of

Complexing

In general, it comes down to complexing. Just because something is easy to write (usually the motivation to get rid of boilerplate) doesn't mean it's not complex. To complex is to intertwine, and in a design sense, it's intertwining different things together. For example, Streamlit and Gradio are libraries that make things easy, but they are complex by nature, because they complex so many different things (context, state, etc.) into single loops, making them not extensible for more sophisticated work. DSPy at its core, as you mentioned, breaks AI programming down to the right level of primitives. Each primitive (Signature, Module, Adapter, ...) has a clear separation of concerns, somehow captures the right interaction point, and at its core is simple, because it does one thing for one job, which is what makes them composable. Modules like

Another reason why the allergic reactions showed up is that it seems to force interaction with LMs in a very specific way: a chat app. The examples given are also geared towards that. I have a feeling

Counter Points

Now, in the spirit of examining ourselves as well, I'm going to present some counterpoints. We could argue that in certain programs statefulness could be helpful. For example:
This particular program doesn't work well, but you can see how keeping context for long-form extraction may be helpful, so I do think there's a wider argument that statefulness isn't just for a chat app.

Hmmm

Ok, as I type out more, maybe this is a fine module, and I don't think having a separate class to represent an application makes sense anymore, because this implementation doesn't have to be restricted to chat apps. So then, maybe having more methods for managing history in different ways, such as

@vacmar01 what are your thoughts on that?
To restate the motivation why I proposed something like
@ianyu93 Extending the implementation with additional functionality like loading histories, saving histories, compacting (maybe with a DSPy module) or a max_history_length (maybe as a parameter) are all great ideas. Much of the discussion above is focused on my specific implementation (which is fair, considering this is a concrete PR after all). But I think we need to focus the discussion more on whether any form of abstraction around history management is something that DSPy wants to provide or not. In my opinion, an ergonomics improvement would be very much in line with the overall philosophy of the library, but I'm open to other opinions. And I also think that a module (in whatever form) is the way to go.

Edit: On further thought: Maybe renaming this to something like
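To make the ideas listed above a bit more concrete (a length cap plus saving and loading histories), here is a minimal, library-independent sketch. `BoundedHistory`, `to_json` and `from_json` are hypothetical names chosen for illustration; none of them are part of DSPy or of this PR, and only the `max_history_length` parameter name is taken from the comment above.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class BoundedHistory:
    """Hypothetical helper (not a DSPy class): keeps only the newest turns."""

    max_history_length: int = 3
    messages: list[dict[str, Any]] = field(default_factory=list)

    def append(self, turn: dict[str, Any]) -> None:
        self.messages.append(turn)
        # Drop the oldest turns once the cap is exceeded.
        overflow = len(self.messages) - self.max_history_length
        if overflow > 0:
            del self.messages[:overflow]

    def to_json(self) -> str:
        # Saving: serialize the cap together with the retained turns.
        return json.dumps(
            {"max_history_length": self.max_history_length, "messages": self.messages}
        )

    @classmethod
    def from_json(cls, payload: str) -> "BoundedHistory":
        # Loading: rebuild an equivalent history from the saved payload.
        data = json.loads(payload)
        return cls(
            max_history_length=data["max_history_length"],
            messages=data["messages"],
        )


history = BoundedHistory(max_history_length=2)
for i in range(4):
    history.append({"question": f"q{i}", "answer": f"a{i}"})

restored = BoundedHistory.from_json(history.to_json())
```

Truncating by turn count is only one possible policy; compaction (for example summarizing older turns with another module) would replace the `del` step rather than the overall shape.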
I tend to agree with @MaximeRivest and think about it from an ecosystem perspective. It is a DX + ecosystem-building discussion. As the framework matures, DX and ecosystem promotion need to be taken into account, while preferring philosophy over over-opinionated closeness. To build an ecosystem, one needs to take newcomers into account while introducing them to the philosophy driving the ecosystem. The way I interpret it,

As a heavy user of the framework (and first-time contributor to the discussions here), building pipelines and AI interfaces regularly using

to quote @joelgrus
I agree, there are. Then, a simple history should be implemented as in the PR and can be used in examples for newcomers to start with stateful AI software and allow the community to support easy adoption for use cases requiring it. BTW, @vacmar01 thanks for promoting the need |
Yeah, thanks for helping me wrap my head around it. @vacmar01 I actually think Stateful is better now than Chat 😂 because there are non-chat ways to utilize it. @ziv-bakhajian so if I understand correctly, you're thinking of 1. There could be a new primitive
@ianyu93 Yes. I have not finished wrapping my head around the ergonomics, but a state base class with a parameter-based invocation at the beginning and an update at the end (as signatures to be implemented) of a stateful module call might be a start. The parameter-based invocation means retrieving the inference state from the state object, based on parameters forwarded by the stateful module's forward method. To take @joelgrus's multi-conversation example,
Notice that I did not assume that the state is managed by the module, as one might want to forward the state between modules and might prefer to provide the state as a parameter to the forward method rather than manage it in the module. IMO, both should be supported. There are a lot of signature manipulations and ergonomics to be decided. This is an initial trajectory to develop.
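One possible reading of the "state base class" idea described above, written as a plain-Python sketch with no DSPy dependency. Every name here (`State`, `invoke`, `update`, `PerConversationState`, `stateful_call`, `echo_module`) is hypothetical and only illustrates the shape: retrieve state from parameters at the start of a call, update it at the end, and pass the state in rather than letting the module own it.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable


class State(ABC):
    """Hypothetical base class: look up state before a call, update it after."""

    @abstractmethod
    def invoke(self, **params: Any) -> list[dict[str, Any]]:
        """Return the history that applies to these parameters."""

    @abstractmethod
    def update(self, turn: dict[str, Any], **params: Any) -> None:
        """Record one finished turn under these parameters."""


class PerConversationState(State):
    """Keeps one list of turns per conversation_id, in memory."""

    def __init__(self) -> None:
        self._turns: dict[str, list[dict[str, Any]]] = {}

    def invoke(self, *, conversation_id: str, **_: Any) -> list[dict[str, Any]]:
        # Return a copy so callers cannot mutate the stored turns.
        return list(self._turns.get(conversation_id, []))

    def update(self, turn: dict[str, Any], *, conversation_id: str, **_: Any) -> None:
        self._turns.setdefault(conversation_id, []).append(turn)


def stateful_call(
    module: Callable[..., dict[str, Any]],
    state: State,
    state_params: dict[str, Any],
    **inputs: Any,
) -> dict[str, Any]:
    history = state.invoke(**state_params)        # invocation at the beginning
    outputs = module(history=history, **inputs)   # the module itself stays stateless
    state.update({**inputs, **outputs}, **state_params)  # update at the end
    return outputs


def echo_module(history: list[dict[str, Any]], question: str) -> dict[str, Any]:
    # Stand-in for a real module; it only reports how much history it received.
    return {"answer": f"{question} (seen {len(history)} earlier turns)"}


state = PerConversationState()
first = stateful_call(echo_module, state, {"conversation_id": "a"}, question="hi")
second = stateful_call(echo_module, state, {"conversation_id": "a"}, question="again")
other = stateful_call(echo_module, state, {"conversation_id": "b"}, question="hi")
```

Because the state object is an argument, the same `state` could be handed to several modules in a pipeline, which is the "forward the state between modules" case mentioned above.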
it doesn't force anyone to use dspy as a chat app if you default to dspy being stateless. despite the last three years of people saying "chat isn't the right ux for ai" - there are about a trillion dollars of capital that disagrees with this. not to mention that 99.9% of the general public equates "ai" with "chat". i don't understand why it's so painful to the dspy team to acknowledge that a lot of people want chat, and that managing a message history array simply isn't that hard (for you), but is just hard enough (for devs) to turn them away from dspy. i think you are smart enough to implement this in a way that doesn't accidentally blow up the rest of dspy it ultimately boils down to this: who is dspy for? |
(obvious caveat: I am not part of the dspy team, just an enthusiast, so all this is just like my opinion, man. but also I was for several years a core engineer working on the allennlp library, so I do have a lot of hard-won experience about the importance of being very thoughtful and deliberate about adding new features and abstractions to widely-used open-source libraries) It seems like there are a few different things going on here:
Here's a toy stateless version of Predict that automatically handles history, via an injected HistoryRepository:

    from collections import defaultdict
    from abc import ABC

    import dspy

    dspy.configure(lm=dspy.LM('gemini/gemini-2.5-flash-lite'))


    class HistoryRepository(ABC):
        """
        Abstract base class for history repositories.
        You could store it in memory, a database, or any other storage system.
        """

        def __getitem__(self, key: str) -> dspy.History:
            raise NotImplementedError()

        def __setitem__(self, key: str, value: dspy.History) -> None:
            raise NotImplementedError()


    class InMemoryHistoryRepository(HistoryRepository):
        """
        defaultdict by key, grows without bound!
        """

        def __init__(self) -> None:
            self.histories = defaultdict(lambda: dspy.History(messages=[]))

        def __getitem__(self, key: str) -> dspy.History:
            return self.histories[key]

        def __setitem__(self, key: str, value: dspy.History) -> None:
            self.histories[key] = value


    class PredictWithHistory(dspy.Predict):
        """
        A Predict module that automatically manages conversation history.
        Stateless and allows for injectable history repository.
        """

        def __init__(
            self,
            signature,
            history_repository: HistoryRepository,
            history_key_field: str = "user_id",
            callbacks=None,
            **config,
        ) -> None:
            super().__init__(signature, callbacks=callbacks, **config)
            self.history_repository = history_repository
            self.history_key_field = history_key_field
            self.signature = self.signature.prepend(
                name="history",
                field=dspy.InputField(),
                type_=dspy.History,
            )

        def forward(self, **kwargs):
            if self.history_key_field not in kwargs:
                raise ValueError(f"Missing history key field '{self.history_key_field}' in input arguments.")

            history_key = kwargs.pop(self.history_key_field)
            history = self.history_repository[history_key]
            kwargs["history"] = history

            res = super().forward(**kwargs)

            # Build history entry
            turn = {k: v for k, v in kwargs.items() if k != "history"}
            if isinstance(res, dspy.Prediction):
                turn.update(dict(res))
            elif isinstance(res, dict):
                turn.update(res)
            else:
                turn["output"] = res

            history.messages.append(turn)
            self.history_repository[history_key] = history
            return res

        def aforward(self, **kwargs):
            raise NotImplementedError("Asynchronous version is not implemented yet.")


    # Your code starts here
    repo = InMemoryHistoryRepository()
    qa = PredictWithHistory(
        "question -> answer",
        history_repository=repo
    )

    response_a_1 = qa(question="What's Python the programming language?", user_id="user_a")
    print(response_a_1)

    response_b_1 = qa(question="What's a Porsche?", user_id="user_b")
    print(response_b_1)

    response_a_2 = qa(question="Is it fast?", user_id="user_a")
    print(response_a_2)

    response_b_2 = qa(question="Is it fast?", user_id="user_b")
    print(response_b_2)

(it creates other problems once you start optimizing, as its signature now has "secret" fields that you have to account for, but maybe that's ok)
I’m still exploring, but it may be useful for others who are also experimenting to note that dspy already “tracks” traces and histories — though not to be confused with
Line 170 in 46bb389
Line 79 in 46bb389
My intuition is that the "best" solution should take these into account. It seems to me that the statefulness most of us desire is in fact an 'automatic' selection -> cloning -> ?modifying? -> growing of a previously existing trace. PS: it seems like those could also be mined to create tooling that would help in the production of synthetic training sets.
I'd like to throw a few opinions out here as someone who has been watching from the sidelines. My initial knee-jerk reaction was "Why would I want this?" and I was against modules that add statefulness because I believed it was the wrong direction in a lot of cases, but I took a step back and evaluated why I would not use it. Statelessness has been ingrained as a core principle of building APIs, and it is what I have adhered to while building AI solutions with clients. There are a few reasons for statelessness in API design:
These 2 issues are not identical but are closely related.

Now here's where I had a realization and a change of heart. Just because I might not use this feature in its current form does not mean that it should be rejected outright. It solves a real problem that others face, and I think it is worth talking about.

Persistence and history can be a challenging problem. As an example, LangChain and LangGraph have at least 2 methods of saving and loading chat histories. In LangChain, there are message stores that simply store previous messages in history, which is effectively what this PR is aiming to make a drop-in feature. In LangGraph, the persistence of threads is much more heavy-handed but can store all of your program's state at a given point in time, encompassing not just the messages but also internal graph state (in DSPy, this would be module state).

I would love to see what this feature could become given enough time and attention from the community. Perhaps we have a generic

An Idea for the Community

Does it make sense for the community surrounding DSPy to create something akin to
Thanks all for the discussion, this is very informative and engaging. Why we chose
I think the analogy between TensorFlow and Keras (or also Fast.ai and PyTorch) is a solid one and I see merit in keeping DSPy on the "TensorFlow" or "PyTorch" level. Considering the discussion here in the thread, I vote for a separate repo like |
Add dspy.Stateful Module for Automatic History Management

Problem

Managing conversation history with dspy.History currently requires significant boilerplate:

history: dspy.History = dspy.InputField() to signatures

Solution

dspy.Stateful is a zero-modification wrapper that automatically handles history management for any DSPy module.

Key Features:

signature.prepend() to add history fields

Usage

Implementation

The wrapper:

This eliminates boilerplate while maintaining full compatibility with existing DSPy modules and patterns.
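The PR's own usage and implementation snippets did not survive in this copy of the thread. As a rough illustration only, the general wrapper pattern the description refers to could look like the sketch below. It is not the PR's implementation: `StatefulSketch` and `fake_qa` are hypothetical names, and a plain callable stands in for a DSPy module so the example runs without a language model.

```python
from typing import Any, Callable


class StatefulSketch:
    """Hypothetical wrapper (not the PR's code): owns the history for one conversation."""

    def __init__(self, module: Callable[..., dict[str, Any]]) -> None:
        self.module = module
        self.messages: list[dict[str, Any]] = []

    def __call__(self, **inputs: Any) -> dict[str, Any]:
        # Pass a copy of the accumulated turns alongside the caller's inputs.
        outputs = self.module(history=list(self.messages), **inputs)
        # Record the turn so the next call sees it.
        self.messages.append({**inputs, **outputs})
        return outputs

    def reset(self) -> None:
        # Start a fresh conversation without rebuilding the wrapper.
        self.messages.clear()


def fake_qa(history: list[dict[str, Any]], question: str) -> dict[str, Any]:
    # Stand-in for a wrapped module; it numbers answers by history length.
    return {"answer": f"answer #{len(history) + 1} to {question!r}"}


chat = StatefulSketch(fake_qa)
r1 = chat(question="What is DSPy?")
r2 = chat(question="Who maintains it?")
turns_before_reset = len(chat.messages)
chat.reset()
```

Because the wrapper instance holds the history, each conversation needs its own instance, which is the trade-off debated earlier in the thread for multi-user, multi-server deployments.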