Skip to content

Conversation

HendrikStrobelt
Copy link
Contributor

.. based on full access to a copy of the context.
allows:

  • RejectionSampling (as we had already)
  • AgenticSampling (...)

Comment on lines +155 to +159
def copy_and_repair(self, repair_string: str) -> Instruction:
"""Creates a copy of the instruction and adds/overwrites the repair string."""
res = deepcopy(self)
res._repair_string = repair_string
return res
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you end up utilizing this copy_and_repair function anywhere? It looks like you opted for communicating the failed requirements as messages instead. If so, can you please remove the _repair changes to instructions and their templates?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably have to add an example of using this during a repair. Right now we only have "try again" without alteration and the agentic way..

@HendrikStrobelt HendrikStrobelt marked this pull request as ready for review August 28, 2025 15:42
Copy link
Contributor

@jakelorocco jakelorocco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm; left one comment about repair / sampling strats longterm and the tests don't technically fit our format, but they seem to work.

Comment on lines +369 to +380
context.insert_turn(ContextTurn(past_actions[-1], past_results[-1]))

last_failed_reqs: list[Requirement] = [s[0] for s in past_val[-1] if not s[1]]
last_failed_reqs_str = "* " + "\n* ".join(
[str(r.description) for r in last_failed_reqs]
)
# TODO: what to do with checks ??

next_action = Message(
role="user",
content=f"The following requirements have not been met: \n{last_failed_reqs_str}\n Please try again to fulfill the requirements.",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm uncertain about having repair strategies modify the context directly. I guess it has to be done this way to support different repair strategies. It seems like this might be overly limiting though.

For instance, what if a repair strategy wants to offer up two possible future actions or two possible versions of the context to run against?
Maybe that just becomes a new sampling strategy with a new repair function signature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The solution for most complex Strategies would be to inherit from SamplingStrategy and not from BaseSamplingStrategy.

@nrfulton nrfulton merged commit 5acc286 into main Aug 29, 2025
4 checks passed
@nrfulton nrfulton deleted the hen/sampling_strategy_new branch August 29, 2025 12:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants