new sampling strategies #65

HendrikStrobelt · 2025-08-15T11:34:13Z

New sampling strategies based on full access to a copy of the context.
This allows:

RejectionSampling (as we had already)
AgenticSampling (...)

jakelorocco · 2025-08-26T12:48:39Z

mellea/stdlib/instruction.py

+    def copy_and_repair(self, repair_string: str) -> Instruction:
+        """Creates a copy of the instruction and adds/overwrites the repair string."""
+        res = deepcopy(self)
+        res._repair_string = repair_string
+        return res


Did you end up utilizing this copy_and_repair function anywhere? It looks like you opted for communicating the failed requirements as messages instead. If so, can you please remove the _repair changes to instructions and their templates?

HendrikStrobelt · 2025-08-26

I probably have to add an example of using this during a repair. Right now we only have "try again" without alteration and the agentic way.
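A minimal sketch of what such an example might look like. The `Instruction` class here is a hypothetical stand-in that models only a description and the `_repair_string` field, and `repair_instruction` is an invented helper; only the body of `copy_and_repair` is taken from the diff above.

```python
from copy import deepcopy
from typing import List, Optional


class Instruction:
    """Hypothetical stand-in for the real Instruction; models only two fields."""

    def __init__(self, description: str) -> None:
        self.description = description
        self._repair_string: Optional[str] = None

    def copy_and_repair(self, repair_string: str) -> "Instruction":
        """Same body as the method in the diff above."""
        res = deepcopy(self)
        res._repair_string = repair_string
        return res


def repair_instruction(instruction: Instruction, failed: List[str]) -> Instruction:
    """Hypothetical helper: fold failed requirement descriptions into a repaired copy."""
    repair_string = "Previously failed requirements: " + "; ".join(failed)
    return instruction.copy_and_repair(repair_string)


original = Instruction("Summarize the report in one paragraph.")
repaired = repair_instruction(original, ["under 100 words", "no bullet points"])

print(original._repair_string)  # the original instruction is left untouched
print(repaired._repair_string)  # the copy carries the repair string
```

Because `copy_and_repair` works on a deep copy, each retry in this sketch gets its own instruction object and the original is never mutated.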

jakelorocco

lgtm; left one comment about repair / sampling strats longterm and the tests don't technically fit our format, but they seem to work.

jakelorocco · 2025-08-28T16:41:56Z

mellea/stdlib/sampling.py

+        context.insert_turn(ContextTurn(past_actions[-1], past_results[-1]))
+
+        last_failed_reqs: list[Requirement] = [s[0] for s in past_val[-1] if not s[1]]
+        last_failed_reqs_str = "* " + "\n* ".join(
+            [str(r.description) for r in last_failed_reqs]
+        )
+        # TODO: what to do with checks ??
+
+        next_action = Message(
+            role="user",
+            content=f"The following requirements have not been met: \n{last_failed_reqs_str}\n Please try again to fulfill the requirements.",
+        )
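For a concrete picture of the message this hunk builds, here is a self-contained sketch. `Requirement` is a hypothetical stand-in with only a `description` field, and the validation results are assumed to be `(requirement, passed)` pairs with a plain boolean, as the `s[0]` / `s[1]` indexing in the diff suggests; the real types may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Requirement:
    """Hypothetical stand-in: only the description field is modeled."""
    description: str


def failed_requirements_message(validation: List[Tuple[Requirement, bool]]) -> str:
    """Build the retry prompt from the requirements that did not pass."""
    failed = [req for req, passed in validation if not passed]
    bullets = "* " + "\n* ".join(str(r.description) for r in failed)
    return (
        "The following requirements have not been met: \n"
        f"{bullets}\n Please try again to fulfill the requirements."
    )


last_validation = [
    (Requirement("Use at most 50 words"), False),
    (Requirement("Mention the deadline"), True),
    (Requirement("Use a formal tone"), False),
]
message = failed_requirements_message(last_validation)
print(message)
```

Only the two failed requirements end up in the bullet list. In this sketch an empty failure list would still produce a lone `* ` bullet, so a caller would presumably build the message only when at least one requirement failed.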


I'm uncertain about having repair strategies modify the context directly. I guess it has to be done this way to support different repair strategies. It seems like this might be overly limiting though.

For instance, what if a repair strategy wants to offer up two possible future actions or two possible versions of the context to run against?
Maybe that just becomes a new sampling strategy with a new repair function signature.

HendrikStrobelt · 2025-08-29

The solution for the most complex strategies would be to inherit from SamplingStrategy and not from BaseSamplingStrategy.
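The trade-off discussed here can be sketched as a retry loop whose `repair` and `select_from_failure` steps are abstract. Everything in this block is hypothetical (class names, method signatures, the list-based context, the toy generate/validate functions) and is not the library's actual API; it only illustrates how a single-action repair hook shapes what a subclass can do.

```python
from abc import ABC, abstractmethod


class BaseStrategySketch(ABC):
    """Hypothetical retry loop; repair returns exactly one next (context, action)."""

    def __init__(self, loop_budget: int = 3) -> None:
        self.loop_budget = loop_budget

    @abstractmethod
    def repair(self, context, action, result, failed):
        """Return the (context, action) pair to use for the next attempt."""

    @abstractmethod
    def select_from_failure(self, results):
        """Return the index of the result to hand back when every attempt failed."""

    def sample(self, action, context, generate, validate):
        results = []
        for _ in range(self.loop_budget):
            result = generate(context, action)
            results.append(result)
            failed = validate(result)
            if not failed:
                return result, True
            # Work on a copy so the caller's context is never mutated.
            context, action = self.repair(list(context), action, result, failed)
        return results[self.select_from_failure(results)], False


class RetryUnchanged(BaseStrategySketch):
    """'Try again' without alteration."""

    def repair(self, context, action, result, failed):
        return context, action

    def select_from_failure(self, results):
        return 0


class RetryWithFeedback(BaseStrategySketch):
    """Record the failed turn and ask again with the unmet requirements."""

    def repair(self, context, action, result, failed):
        return context + [(action, result)], "Not met: " + "; ".join(failed)

    def select_from_failure(self, results):
        return len(results) - 1


def toy_generate(context, action):
    # Toy model: only produces a short answer once it has seen feedback.
    return "short answer" if "Not met" in action else "a very long rambling answer"


def toy_validate(result):
    # Returns the descriptions of failed requirements (empty list means success).
    return [] if len(result.split()) <= 2 else ["at most two words"]


plain = RetryUnchanged(loop_budget=2).sample("Answer briefly", [], toy_generate, toy_validate)
guided = RetryWithFeedback(loop_budget=2).sample("Answer briefly", [], toy_generate, toy_validate)
print(plain)
print(guided)
```

A strategy that wants to branch into two candidate contexts does not fit the single `(context, action)` return value of `repair` in this sketch, which is why it would implement its own sampling loop rather than reuse the base one.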

HendrikStrobelt added 3 commits August 14, 2025 16:58

preparing for new sampling: adding a repair field to Instruction that can only be set by `.copy_and_repair(...)`. The templates are updated to accommodate the new field.

e4b7cd4

new signature for RejectionSampling

f02e88b

adding RejectionSampling and AgenticSampling as subclasses of BaseSampling

c9de1f9

HendrikStrobelt requested review from nrfulton and jakelorocco August 15, 2025 11:34

jakelorocco reviewed Aug 26, 2025

HendrikStrobelt added 2 commits August 28, 2025 16:52

fixing requirements and adding tests

decab14

refactoring repair and select-from-failure as abstract methods.

570dd7f

HendrikStrobelt marked this pull request as ready for review August 28, 2025 15:42

HendrikStrobelt requested a review from jakelorocco August 28, 2025 15:42

Merge branch 'main' into hen/sampling_strategy_new

bcba0ee

jakelorocco approved these changes Aug 28, 2025

nrfulton merged commit 5acc286 into main Aug 29, 2025
4 checks passed

nrfulton deleted the hen/sampling_strategy_new branch August 29, 2025 12:17
