Refactored scikit-learn flavour of DifferenceInDifferences and allowed custom column names for post_treatment variable. #515

roesta07 · 2025-07-30T06:01:49Z

closes issues #390 and #514

causal impact calculation in scikit-learn flavour of DifferenceInDifferences
Allow the user to use whatever column name they want for the 'post_treatment' variable when constructing a DifferenceInDifferences object, via the new parameter post_treatment_variable_name. Its default value is 'post_treatment', so existing code does not break.

📚 Documentation preview 📚: https://causalpy--515.org.readthedocs.build/en/515/


codecov · 2025-07-30T06:36:36Z

Codecov Report

❌ Patch coverage is 88.88889% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.13%. Comparing base (09adfd7) to head (7fbb27a).

Files with missing lines	Patch %	Lines
causalpy/experiments/diff_in_diff.py	88.88%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #515      +/-   ##
==========================================
- Coverage   95.19%   95.13%   -0.06%     
==========================================
  Files          28       28              
  Lines        2457     2468      +11     
==========================================
+ Hits         2339     2348       +9     
- Misses        118      120       +2


drbenvincent

Looks like the remote checks are failing. Sometimes you need to run the pre-commit checks locally twice - the interrogate thing is a bit fiddly.
And it looks like we'll need to increase test coverage. The obvious additions would be tests that use the default name and tests that use a user-provided post-treatment variable name.

Overall, this is looking good. Thanks for the PR :)

Oh, remember to update from main regularly :)

drbenvincent · 2025-07-30T07:52:00Z

causalpy/experiments/diff_in_diff.py

-            )
+        # Check if post_treatment_variable_name is in formula
+        if self.post_treatment_variable_name not in self.formula:
+            if self.post_treatment_variable_name == "post_treatment":


I've got a minor preference to just give one generic exception message, rather than a custom one dependent on self.post_treatment_variable_name. That will also cut down on the number of tests required to achieve high test coverage.

Yeah absolutely!! More generic ones like "Missing required variable '{self.post_treatment_variable_name}' in formula" can be used
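A minimal sketch of the single generic message agreed on above, not the code that was merged. The `FormulaException` class here is a local stand-in defined only for this example, `check_formula` is a hypothetical helper name, and the substring test mirrors the `not in self.formula` check in the diff rather than parsing the formula.

```python
class FormulaException(Exception):
    """Local stand-in for the project's formula error (assumption)."""


def check_formula(
    formula: str, post_treatment_variable_name: str = "post_treatment"
) -> None:
    """Raise one generic error if the variable name is absent from the formula."""
    # Plain substring test, as in the diff; it does not parse the formula.
    if post_treatment_variable_name not in formula:
        raise FormulaException(
            f"Missing required variable '{post_treatment_variable_name}' in formula"
        )


# Default name: passes silently.
check_formula("y ~ 1 + group * post_treatment")
# Custom name: passes silently.
check_formula("y ~ 1 + group * after", post_treatment_variable_name="after")
```

Because the message no longer branches on the variable name, a single failing case is enough to cover the error path.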

drbenvincent · 2025-07-30T07:53:49Z

causalpy/experiments/diff_in_diff.py

+
+        # Check if post_treatment_variable_name is in data columns
+        if self.post_treatment_variable_name not in self.data.columns:
+            if self.post_treatment_variable_name == "post_treatment":


Same comment as above. Just give one more generic exception message, regardless of what self.post_treatment_variable_name is.
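The data-column check can take the same shape. This is a sketch under assumptions: `DataException` is a local stand-in class, `check_columns` is a hypothetical helper name, and the message wording is illustrative rather than taken from the PR.

```python
class DataException(Exception):
    """Local stand-in for the project's data error (assumption)."""


def check_columns(columns, post_treatment_variable_name="post_treatment"):
    """Raise one generic error if the variable is not among the data columns."""
    if post_treatment_variable_name not in columns:
        raise DataException(
            f"Missing required column '{post_treatment_variable_name}' in data"
        )


# Any container of column names works, for example the columns of a DataFrame.
check_columns(["y", "group", "post_treatment"])
check_columns(["y", "group", "after"], post_treatment_variable_name="after")
```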

drbenvincent · 2025-07-30T07:55:54Z

causalpy/experiments/diff_in_diff.py

+            # Store the coefficient into dictionary {intercept:value}
+            coef_map = dict(zip(self.labels, self.model.get_coeffs()))
+            # Create and find the interaction term based on the values user provided
+            interaction_term = (


Nice. We'll need more tests anyway to ensure test coverage, so when you do that, can you add cases for when people specify formulas like post_treatment:a and post_treatment*b? It should work because we'll always get a coefficient for post_treatment:a, but it is worth adding the test.

Yeah, I will add some tests for cases where a user provides a post-treatment variable name, and check for FormulaException and DataException.

But @drbenvincent, can you elaborate on this specific test? Are we also checking the coefficient value where two interaction terms are used?

I'd not thought of that. I guess it's easy to find an interaction term of the post-treatment variable and something else. But if there are two interaction terms, both including the post-treatment variable, then that might get messy. Can we think of any situations where that would be a good idea? If not, then maybe that could throw an exception and we just say we can't deal with a formula like that?
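One way the lookup and the proposed guard could fit together is sketched below. It assumes design-matrix labels join interaction factors with `:` (so a formula written with `*` also yields a `:` label once expanded) and that a categorical factor may carry a `[...]` suffix. The helper name `find_interaction_coefficient` and the use of `ValueError` are illustrative choices, not the PR's implementation.

```python
def find_interaction_coefficient(
    labels, coefficients, post_treatment_variable_name="post_treatment"
):
    """Return (label, value) for the single interaction term involving the variable."""
    # Map each design-matrix label to its fitted coefficient.
    coef_map = dict(zip(labels, coefficients))

    def involves_variable(label):
        # A factor matches if it is the variable itself or an encoded
        # level of it, such as "post_treatment[T.True]".
        return any(
            part == post_treatment_variable_name
            or part.startswith(post_treatment_variable_name + "[")
            for part in label.split(":")
        )

    # Keep only interaction labels (those containing ":") that involve the variable.
    matches = [label for label in coef_map if ":" in label and involves_variable(label)]
    if len(matches) != 1:
        raise ValueError(
            f"Expected exactly one interaction term involving "
            f"'{post_treatment_variable_name}', found {len(matches)}: {matches}"
        )
    return matches[0], coef_map[matches[0]]


labels = ["Intercept", "post_treatment", "group", "post_treatment:group"]
label, value = find_interaction_coefficient(labels, [1.0, 0.5, 0.2, 0.8])
```

Raising when zero or several terms match keeps the ambiguous two-interaction case explicit instead of silently picking one coefficient.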

drbenvincent · 2025-07-30T07:58:52Z

causalpy/experiments/diff_in_diff.py

@@ -128,6 +130,12 @@ def __init__(
            }
            self.model.fit(X=self.X, y=self.y, coords=COORDS)
        elif isinstance(self.model, RegressorMixin):
+            # For scikit-learn models, automatically set fit_intercept=False


Rojan Shrestha added 2 commits July 29, 2025 22:06

Added post_treatment_variable_name parameter and sklearn model summary for did

4ebe1a7

Refactor DiD validation: segregate FormulaException and DataException

7fbb27a

drbenvincent requested changes Jul 30, 2025
