Conversation

@cubic-dev-local
Summary by cubic

Emit the full search query text to PostHog for both success and error events to enable better query history and analysis. Adds the query field to the analytics payloads alongside existing metrics.

---

Based on: airweave-ai/airweave#805

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 1 file

Prompt for AI agents (all 1 issues)

Understand the root cause of the following 1 issues and fix them.


<file name="backend/airweave/analytics/search_analytics.py">

<violation number="1" location="backend/airweave/analytics/search_analytics.py:35">
This change introduces a potential sensitive data leak. The raw search query is being added to the PostHog analytics event, but the project's analytics guidelines in `backend/airweave/analytics/README.md` explicitly forbid including sensitive data in event properties. Search queries may contain PII or other sensitive information.</violation>
</file>

React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.

Dictionary of analytics properties
"""
properties = {
"query": query, # Full query text for PostHog history

@cubic-dev-ai cubic-dev-ai bot Sep 19, 2025

This change introduces a potential sensitive data leak. The raw search query is being added to the PostHog analytics event, but the project's analytics guidelines in backend/airweave/analytics/README.md explicitly forbid including sensitive data in event properties. Search queries may contain PII or other sensitive information.
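
One possible fix, sketched under assumptions: drop the raw text and send a keyed hash instead, so identical queries can still be grouped in analytics without exposing their content. The function signature, the `query_hash` property name, and the `salt` parameter below are hypothetical; only `query_length` and `collection_slug` appear in the diff.

```python
import hashlib
import hmac


def build_search_properties(query: str, collection_slug: str, salt: bytes) -> dict:
    """Build analytics properties without the raw query text (hypothetical signature)."""
    # Keyed hash: equal queries map to the same digest, but the text itself is not sent.
    query_hash = hmac.new(salt, query.encode("utf-8"), hashlib.sha256).hexdigest()
    return {
        "query_hash": query_hash,  # hypothetical property replacing "query"
        "query_length": len(query),
        "collection_slug": collection_slug,
    }
```

The salt would need to stay server-side and stable for digests to be comparable across events; whether grouping by hash is enough for the intended query history is a product decision this sketch does not settle.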

Prompt for AI agents
Address the following comment on backend/airweave/analytics/search_analytics.py at line 35:

<comment>This change introduces a potential sensitive data leak. The raw search query is being added to the PostHog analytics event, but the project's analytics guidelines in `backend/airweave/analytics/README.md` explicitly forbid including sensitive data in event properties. Search queries may contain PII or other sensitive information.</comment>

<file context>
@@ -32,6 +32,7 @@ def build_search_properties(
         Dictionary of analytics properties
     """
     properties = {
+        "query": query,  # Full query text for PostHog history
         "query_length": len(query),
         "collection_slug": collection_slug,
</file context>

[internal] Confidence score: 10/10

[internal] Posted by: System Design Agent

Author

@cubic-dev-local cubic-dev-local bot left a comment

2 issues found across 1 file

Prompt for AI agents (all 2 issues)

Understand the root cause of the following 2 issues and fix them.


<file name="backend/airweave/analytics/search_analytics.py">

<violation number="1" location="backend/airweave/analytics/search_analytics.py:35">
This change sends the full search query text to PostHog, which violates the project's own analytics guidelines. The `analytics/README.md` states, 'Sensitive data should not be included in event properties.' Search queries can contain PII or other confidential information, making this a potential privacy and compliance issue.</violation>

<violation number="2" location="backend/airweave/analytics/search_analytics.py:35">
The new `query` property isn’t documented in the Search Event Properties section; update docs to reflect this property for consistency.

*DEV MODE: This violation would have been filtered out by screening filters. Failing filters: commentPurpose, documentationImpact.*

        DEV MODE: This violation would have been filtered out by GPT-5.
Reasoning:
• **GPT-5**: Only a docs mismatch; low severity with no functional impact. Not worth reporting compared to privacy issues.</violation>
</file>

Dictionary of analytics properties
"""
properties = {
"query": query, # Full query text for PostHog history
Author

@cubic-dev-local cubic-dev-local bot Sep 23, 2025

This change sends the full search query text to PostHog, which violates the project's own analytics guidelines. The analytics/README.md states, 'Sensitive data should not be included in event properties.' Search queries can contain PII or other confidential information, making this a potential privacy and compliance issue.
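
If some query text must be retained, an alternative is to redact obviously sensitive substrings before the event is built. This is a minimal sketch, assuming email addresses are the pattern of concern; `redact_query` and the placeholder string are hypothetical, and a single regex is not a complete PII filter.

```python
import re

# Deliberately simple pattern; real PII detection needs more than this.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_query(query: str) -> str:
    """Replace email-like substrings with a placeholder (hypothetical helper)."""
    return EMAIL_RE.sub("[redacted-email]", query)
```

Names, account numbers, and free-form confidential text would still pass through unchanged, so this reduces exposure rather than removing it.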

Prompt for AI agents
Address the following comment on backend/airweave/analytics/search_analytics.py at line 35:

<comment>This change sends the full search query text to PostHog, which violates the project's own analytics guidelines. The `analytics/README.md` states, 'Sensitive data should not be included in event properties.' Search queries can contain PII or other confidential information, making this a potential privacy and compliance issue.</comment>

<file context>
@@ -32,6 +32,7 @@ def build_search_properties(
         Dictionary of analytics properties
     """
     properties = {
+        "query": query,  # Full query text for PostHog history
         "query_length": len(query),
         "collection_slug": collection_slug,
</file context>

[internal] Confidence score: 10/10

[internal] Posted by: System Design Agent

Dictionary of analytics properties
"""
properties = {
"query": query, # Full query text for PostHog history
Author

@cubic-dev-local cubic-dev-local bot Sep 23, 2025

The new query property isn’t documented in the Search Event Properties section; update docs to reflect this property for consistency.

DEV MODE: This violation would have been filtered out by screening filters. Failing filters: commentPurpose, documentationImpact.

    DEV MODE: This violation would have been filtered out by GPT-5.

Reasoning:
GPT-5: Only a docs mismatch; low severity with no functional impact. Not worth reporting compared to privacy issues.

Prompt for AI agents
Address the following comment on backend/airweave/analytics/search_analytics.py at line 35:

<comment>The new `query` property isn’t documented in the Search Event Properties section; update docs to reflect this property for consistency.

*DEV MODE: This violation would have been filtered out by screening filters. Failing filters: commentPurpose, documentationImpact.*

        DEV MODE: This violation would have been filtered out by GPT-5.
Reasoning:
• **GPT-5**: Only a docs mismatch; low severity with no functional impact. Not worth reporting compared to privacy issues.</comment>

<file context>
@@ -32,6 +32,7 @@ def build_search_properties(
         Dictionary of analytics properties
     """
     properties = {
+        "query": query,  # Full query text for PostHog history
         "query_length": len(query),
         "collection_slug": collection_slug,
</file context>

[internal] Confidence score: 7/10

[internal] Posted by: General AI Review Agent
