Conversation

borfast
Collaborator

@borfast borfast commented Sep 29, 2025

This replaces the isContribution filter in the pipes that use activities, and adds the ability to filter by activity type "category" through two new parameters, includeCodeContributions and includeCollaborations.

By default, if neither parameter is passed to the pipe, it behaves as if includeCodeContributions = 1 had been passed.

The changes in services/libs/tinybird/pipes/top_member_org_copy.pipe were done so that the "Top contributors" and "Top organizations" widgets in the app front page use filtered data.

I'm not sure what to do for the "Top LF projects" widget, though. It uses the projects_list pipe from Tinybird, but that one does not touch activities, so we can't filter its data by activity type directly. The pipe gets its data from the insightsProjects_filtered pipe, which in turn gets its data from insights_projects_populated_ds data source, and that one is populated by a copy pipe, which does get activities data - but if we filter the data before it is copied, we affect every other pipe downstream from that one, which is no good.
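To make the dependency problem concrete, here is a minimal sketch of the lineage described above as a small graph. The names `insights_projects_copy` and `another_downstream_pipe` are hypothetical placeholders (the copy pipe and the other consumers are not named in this PR); the remaining names come from the description.

```python
from collections import deque

# Edges point from a resource to the resources that read from it.
# "insights_projects_copy" and "another_downstream_pipe" are hypothetical names.
READERS = {
    "activities": ["insights_projects_copy"],
    "insights_projects_copy": ["insights_projects_populated_ds"],
    "insights_projects_populated_ds": [
        "insightsProjects_filtered",
        "another_downstream_pipe",
    ],
    "insightsProjects_filtered": ["projects_list"],
}


def downstream(node):
    """Return every resource that directly or indirectly reads from `node`."""
    seen, queue = [], deque([node])
    while queue:
        current = queue.popleft()
        for reader in READERS.get(current, []):
            if reader not in seen:
                seen.append(reader)
                queue.append(reader)
    return seen


# Filtering inside the copy pipe changes the data for all of these,
# not just for the "Top LF projects" widget.
affected = downstream("insights_projects_copy")
print(affected)
```

Under these assumptions, a filter applied in the copy pipe reaches `projects_list` only by also changing what every other reader of the data source sees.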


Note

Replaces isContribution filters with activity type-based filtering via new activityTypes_filtered pipe and updates related pipes/widgets to support includeCodeContributions/includeCollaborations.

  • Pipes: activity filtering overhaul
    • Replace isContribution checks with AND (type, platform) IN (SELECT activityType, platform FROM activityTypes_filtered) in:
      • activities_filtered.pipe, activities_filtered_historical_cutoff.pipe, activities_filtered_retention.pipe
      • health_score_active_contributors.pipe, health_score_contributor_dependency.pipe, health_score_organization_dependency.pipe
      • segmentId_aggregates_mv.pipe
    • Add params inherited from activityTypes_filtered: includeCodeContributions, includeCollaborations (and docs tweaks to repos).
  • New pipe
    • activityTypes_filtered.pipe: parameterized filter over activityTypes with includeCodeContributions (default 1), includeCollaborations, includeOtherContributions; outputs (activityType, platform).
  • Widgets data sources
    • top_member_org_copy.pipe: constrain member/org activity counts to code contributions or collaborations using activityTypes (exclude other activity types).
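The tuple-based `IN` filter from the summary above can be tried out locally. The following is a rough sketch using SQLite rather than Tinybird: the table contents are made up, and `activityTypes_filtered` is stood in for by a plain view that mimics the default behavior (code contributions only).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE activityTypes (
        activityType TEXT, platform TEXT,
        isCodeContribution INTEGER, isCollaboration INTEGER
    );
    CREATE TABLE activities (id INTEGER, type TEXT, platform TEXT);

    -- Hypothetical sample data.
    INSERT INTO activityTypes VALUES
        ('pull_request-opened', 'github', 1, 0),
        ('issue-comment', 'github', 0, 1),
        ('star', 'github', 0, 0);
    INSERT INTO activities VALUES
        (1, 'pull_request-opened', 'github'),
        (2, 'issue-comment', 'github'),
        (3, 'star', 'github'),
        (4, 'pull_request-opened', 'gitlab');

    -- Stand-in for the activityTypes_filtered pipe with default parameters.
    CREATE VIEW activityTypes_filtered AS
        SELECT activityType, platform
        FROM activityTypes
        WHERE isCodeContribution = 1;
    """
)

rows = conn.execute(
    """
    SELECT a.id
    FROM activities a
    WHERE (a.type, a.platform) IN (
        SELECT activityType, platform FROM activityTypes_filtered
    )
    ORDER BY a.id
    """
).fetchall()
kept_ids = [row[0] for row in rows]
print(kept_ids)  # [1]
```

Activity 4 is dropped even though its type matches, because the pair (type, platform) has to match as a whole.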

Written by Cursor Bugbot for commit 1b769fa. This will update automatically on new commits.

cursor[bot]

This comment was marked as outdated.

@borfast borfast force-pushed the feat/IN-707-change-filtering-by-activity-type branch from b57d052 to 4ad3616 Compare September 29, 2025 15:11
@borfast borfast requested a review from gaspergrom September 30, 2025 16:08
@borfast borfast force-pushed the feat/IN-707-change-filtering-by-activity-type branch from df4c21f to e9d967f Compare September 30, 2025 17:32
@joanagmaia
Contributor

@gaspergrom can you check @borfast's comment from the PR?

I'm not sure what to do for the "Top LF projects" widget, though. It uses the projects_list pipe from Tinybird, but that one does not touch activities, so we can't filter its data by activity type directly. The pipe gets its data from the insightsProjects_filtered pipe, which in turn gets its data from insights_projects_populated_ds data source, and that one is populated by a copy pipe, which does get activities data - but if we filter the data before it is copied, we affect every other pipe downstream from that one, which is no good.

Raúl raised this on the backend weekly sync.

@epipav if you can also check this one out, I think it will help. This is a change that will affect almost all metrics we display on Insights.

Collaborator

@epipav epipav left a comment

Overall looks OK. I added comments in a few places where we changed the parameter name from onlyContributions to the includeX style. In the places where we consume the Tinybird API, the client should also be changed. Have we already made this change somewhere?

I also added a readability nit on the pipe.

Comment on lines 1 to 61
DESCRIPTION >
    - `activityTypes_filtered.pipe` allows filtering activityTypes from the respective data source.
    - By default, this only returns code contribution activities (`includeCodeContributions = 1`).
    - To return all activities, set `includeCodeContributions = 1`, `includeCollaborations = 1`, and `includeOtherContributions = 1`.
    - Parameters:
        - `includeCodeContributions`: Optional boolean to include code contribution activities. Defaults to 1. Set to 0 to exclude.
        - `includeCollaborations`: Optional boolean to include or exclude collaboration activities.
        - `includeOtherContributions`: Optional boolean to include other contribution activities (activities that are neither code contributions nor collaborations).
    - Response: `activityType`, `platform`.
    - This pipe is used by other downstream pipes as an auxiliary method of filtering data by activity types.

NODE activityTypes_selected
SQL >
    %
    SELECT activityType, platform
    FROM activityTypes
    WHERE
        (
            -- If no parameters are defined, default to including code contributions.
            {% if not defined(includeCodeContributions) and not defined(
                includeCollaborations
            ) and not defined(includeOtherContributions) %}isCodeContribution = 1
            {% else %}
                -- Start with a false literal to safely prepend OR clauses in the next statements,
                -- even if the previous guard didn't output anything.
                0
                {% if defined(includeCodeContributions) %}
                    OR (
                        {{
                            UInt8(
                                includeCodeContributions,
                                description="Include code contribution activities",
                                required=False,
                            )
                        }} = 1 AND isCodeContribution = 1
                    )
                {% end %}
                {% if defined(includeCollaborations) %}
                    OR (
                        {{
                            UInt8(
                                includeCollaborations,
                                description="Include non-code collaboration activities",
                                required=False,
                            )
                        }} = 1 AND isCollaboration = 1
                    )
                {% end %}
                {% if defined(includeOtherContributions) %}
                    OR (
                        {{
                            UInt8(
                                includeOtherContributions,
                                description="Include other contribution activities",
                                required=False,
                            )
                        }} = 1 AND isCodeContribution = 0 AND isCollaboration = 0
                    )
                {% end %}
            {% end %}
        )
Collaborator

Readability NIT: If we pass the parameters into the query using CTEs, it can be shorter and easier to read:

%
WITH
  {{ UInt8(includeCodeContributions, default=1) }} AS icc,
  {{ UInt8(includeCollaborations, default=0) }} AS icol,
  {{ UInt8(includeOtherContributions, default=0) }} AS ioc
SELECT activityType, platform
FROM activityTypes
WHERE
      (icc = 1  AND isCodeContribution = 1)
   OR (icol = 1 AND isCollaboration   = 1)
   OR (ioc = 1  AND isCodeContribution = 0 AND isCollaboration = 0)
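One thing that may be worth checking before swapping the template for the CTE form: the two versions appear to resolve missing parameters differently. The sketch below models both in plain Python, assuming that Tinybird's `default=` applies whenever a parameter is not passed. The activity type rows are hypothetical.

```python
FLAGS = ("includeCodeContributions", "includeCollaborations", "includeOtherContributions")
DEFAULTS = {
    "includeCodeContributions": 1,
    "includeCollaborations": 0,
    "includeOtherContributions": 0,
}

# Hypothetical rows: (activityType, platform, isCodeContribution, isCollaboration)
ACTIVITY_TYPES = [
    ("authored-commit", "git", 1, 0),
    ("issue-comment", "github", 0, 1),
    ("star", "github", 0, 0),
]


def template_flags(params):
    """Original template: code contributions only when no flag is passed;
    once any flag is passed, the missing ones count as 0."""
    if not any(name in params for name in FLAGS):
        return dict(DEFAULTS)
    return {name: params.get(name, 0) for name in FLAGS}


def cte_flags(params):
    """CTE suggestion: every flag falls back to its own default independently."""
    return {name: params.get(name, DEFAULTS[name]) for name in FLAGS}


def select(flags):
    """Apply the shared WHERE logic to the sample rows."""
    icc, icol, ioc = (flags[name] for name in FLAGS)
    return [
        activity_type
        for activity_type, _platform, is_code, is_collab in ACTIVITY_TYPES
        if (icc == 1 and is_code == 1)
        or (icol == 1 and is_collab == 1)
        or (ioc == 1 and is_code == 0 and is_collab == 0)
    ]


only_collab = {"includeCollaborations": 1}
print(select(template_flags(only_collab)))  # ['issue-comment']
print(select(cte_flags(only_collab)))       # ['authored-commit', 'issue-comment']
```

With no parameters at all, both versions return only code contributions. They differ when a caller passes `includeCollaborations = 1` alone, so callers that want collaborations only would need to pass `includeCodeContributions = 0` explicitly under the CTE form.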

Collaborator Author

Good call, I'll change it 👍

{% if not defined(onlyContributions) or (
defined(onlyContributions) and onlyContributions == 1
) %} AND a.isContribution {% end %}
AND (a.type, a.platform) IN (SELECT activityType, platform FROM activityTypes_filtered)
Collaborator

This'll break on the client side, because the client still uses onlyContributions?

not defined(onlyContributions)
or (defined(onlyContributions) and onlyContributions == 1)
) %} AND a.isContribution {% end %}
AND (a.type, a.platform) IN (SELECT activityType, platform FROM activityTypes_filtered)
Collaborator

This'll break on the client side, because the client still uses onlyContributions?

{% if not defined(onlyContributions) or (
defined(onlyContributions) and onlyContributions == 1
) %} AND a.isContribution {% end %}
AND (a.type, a.platform) IN (SELECT activityType, platform FROM activityTypes_filtered)
Collaborator

This'll break on the client side, because the client still uses onlyContributions?

@borfast
Collaborator Author

borfast commented Oct 2, 2025

In these places where we consume the tinybird API, client should also be changed. Have we already made this change somewhere?

Yes, there's a PR for Insights that covers this. As far as I know, there's no other client that uses onlyContributions.
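For illustration only, a request using the new parameters might be built along these lines. This is a sketch, not the Insights client: the host, the assumption that `activities_filtered` is published as an endpoint, and the helper name `build_pipe_url` are all hypothetical, and authentication is left out.

```python
from urllib.parse import urlencode

# Assumptions (not from this PR): the API host and that the pipe is
# exposed as an endpoint. Authentication is intentionally omitted.
TINYBIRD_HOST = "https://api.tinybird.co"
PIPE_NAME = "activities_filtered"


def build_pipe_url(params):
    """Build a pipe endpoint URL with the given query parameters."""
    return f"{TINYBIRD_HOST}/v0/pipes/{PIPE_NAME}.json?{urlencode(params)}"


# New-style parameters replace the old onlyContributions flag.
url = build_pipe_url({"includeCodeContributions": 1, "includeCollaborations": 1})
print(url)
```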

@borfast borfast requested a review from epipav October 2, 2025 18:16
@borfast borfast force-pushed the feat/IN-707-change-filtering-by-activity-type branch from d10a98b to 8430f7b Compare October 2, 2025 18:17
WHERE
(icc = 1 AND isCodeContribution = 1)
OR (icol = 1 AND isCollaboration = 1)
OR (ioc = 1 AND isCodeContribution = 0 AND isCollaboration = 0)

Bug: Filter Inclusion Error

The filter for "other contributions" doesn't verify that activities are actual contributions. This means the pipe can return non-contribution activities, which differs from the original behavior of only including isContribution = 1 activities.
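The concern can be reproduced with a small SQLite sketch. It assumes that `activityTypes` carries an `isContribution` column, which this PR does not confirm, and the rows are made up. The "guarded" variant is one possible fix, not necessarily the one that was merged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE activityTypes (
        activityType TEXT, platform TEXT,
        isContribution INTEGER,  -- assumed column, not confirmed by the PR
        isCodeContribution INTEGER, isCollaboration INTEGER
    );
    -- Hypothetical sample data.
    INSERT INTO activityTypes VALUES
        ('authored-commit', 'git', 1, 1, 0),
        ('issue-comment', 'github', 1, 0, 1),
        ('reaction', 'github', 1, 0, 0),
        ('star', 'github', 0, 0, 0);
    """
)

WHERE_AS_REVIEWED = """
       (:icc = 1 AND isCodeContribution = 1)
    OR (:icol = 1 AND isCollaboration = 1)
    OR (:ioc = 1 AND isCodeContribution = 0 AND isCollaboration = 0)
"""

# Same filter, but "other" also requires the row to be a contribution.
WHERE_GUARDED = """
       (:icc = 1 AND isCodeContribution = 1)
    OR (:icol = 1 AND isCollaboration = 1)
    OR (:ioc = 1 AND isContribution = 1
        AND isCodeContribution = 0 AND isCollaboration = 0)
"""


def selected(where, **flags):
    """Return the activity types matched by the given WHERE clause."""
    sql = f"SELECT activityType FROM activityTypes WHERE {where} ORDER BY activityType"
    return [row[0] for row in conn.execute(sql, flags)]


print(selected(WHERE_AS_REVIEWED, icc=0, icol=0, ioc=1))  # ['reaction', 'star']
print(selected(WHERE_GUARDED, icc=0, icol=0, ioc=1))      # ['reaction']
```

In this sample, `star` is not a contribution but still passes the reviewed filter, which is the behavior the comment describes.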


@borfast borfast merged commit a456cd7 into main Oct 7, 2025
16 of 17 checks passed
@borfast borfast deleted the feat/IN-707-change-filtering-by-activity-type branch October 7, 2025 08:17
4 participants