
chore: simplify alert creation and evaluation #1388


Merged 9 commits on Jul 25, 2025

Conversation

@nikhilsinhaparseable (Contributor) commented Jul 24, 2025

removed stream name, condition config, and aggregate config from the alert request
added query as a parameter
renamed aggregateConfig to thresholdConfig

New payload:
```
{
  "severity": "high",
  "title": "error count",
  "alertType": "threshold",
  "query": "SELECT COUNT(`body`) AS `COUNT_body` FROM `test1` WHERE `severity_text` = 'ERROR'  LIMIT 500",
  "thresholdConfig": {
    "operator": ">",
    "value": 1000
  },
  "evalConfig": {
    "rollingWindow": {
      "evalStart": "5h",
      "evalEnd": "now",
      "evalFrequency": 1
    }
  },
  "targets": [
    "01K0XDNX76DXGHQ1T32XG63K27"
  ]
}
```
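
For illustration, here is a minimal Rust sketch of the request shape implied by this payload, using serde. The struct and field names simply mirror the JSON keys above; they are assumptions for the sketch, not the crate's actual definitions.

```rust
// Hypothetical mirror of the v2 alert payload; names are assumptions.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase")]
struct AlertRequest {
    severity: String,
    title: String,
    alert_type: String,
    query: String,
    threshold_config: ThresholdConfig,
    eval_config: EvalConfig,
    targets: Vec<String>,
}

#[derive(Debug, Deserialize)]
struct ThresholdConfig {
    operator: String,
    value: f64,
}

#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase")]
struct EvalConfig {
    rolling_window: RollingWindow,
}

#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase")]
struct RollingWindow {
    eval_start: String,
    eval_end: String,
    eval_frequency: u64,
}

fn main() {
    let payload = r#"{
        "severity": "high",
        "title": "error count",
        "alertType": "threshold",
        "query": "SELECT COUNT(`body`) AS `COUNT_body` FROM `test1`",
        "thresholdConfig": { "operator": ">", "value": 1000 },
        "evalConfig": { "rollingWindow": { "evalStart": "5h", "evalEnd": "now", "evalFrequency": 1 } },
        "targets": ["01K0XDNX76DXGHQ1T32XG63K27"]
    }"#;
    let req: AlertRequest = serde_json::from_str(payload).expect("payload should deserialize");
    println!("{} must be {} {}", req.title, req.threshold_config.operator, req.threshold_config.value);
}
```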

Summary by CodeRabbit

  • New Features

    • Alerts now use a simplified query-based configuration with threshold comparison, replacing aggregate-based alerts.
    • Automatic migration of existing alerts from the old format to the new query-based format.
    • Added querier selection and load balancing for distributed query execution, enhancing reliability and performance.
  • Improvements

    • Streamlined alert validation and user authorization for consistent and secure checks.
    • Simplified alert evaluation logic, focusing on single query execution and threshold comparison.
    • Consolidated alert creation payloads for easier configuration and maintenance.
    • Enhanced alert loading to transparently migrate and validate alerts with updated versioning.
  • Bug Fixes

    • Addressed issues in alert migration and query validation processes.
  • Chores

    • Added new dependencies and improved internal code structure for maintainability.
    • Made internal constants publicly accessible to support integration.

coderabbitai bot commented Jul 24, 2025

Walkthrough

This update refactors the alert system from an aggregate-based model to a query-and-threshold-based model, introduces migration from v1 to v2 alert configurations, and streamlines alert evaluation. It also adds robust querier selection and dispatch in the cluster handler, updates alert creation scripts, and adjusts validation and authorization logic.

Changes

File(s) Change Summary
Cargo.toml Added lazy_static crate as a dependency.
resources/ingest_demo_data.sh Simplified alert creation JSON: removed aggregates and replaced with thresholdConfig for all demo alerts.
src/alerts/alerts_utils.rs Refactored alert evaluation: removed aggregate logic, now executes alert query and compares result to threshold; added time range extraction, local/remote query execution, simplified result extraction and error handling.
src/alerts/mod.rs Migrated alert config from v1 (aggregate-based) to v2 (query/threshold-based); added migration logic; updated validation to check queries; adjusted alert loading, creation, and state management for new model.
src/handlers/http/alerts.rs Refactored alert query validation and user authorization to use new query field and validation method; removed base query extraction.
src/handlers/http/cluster/mod.rs Added querier selection and dispatch: maintains querier status, performs liveness checks, round-robin and LRU selection, marks queriers available/unavailable, and handles query forwarding with error handling.
src/handlers/http/query.rs Made TIME_ELAPSED_HEADER constant public.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant HTTPHandler
    participant Alerts
    participant Storage
    participant Querier

    User->>HTTPHandler: Create/Update Alert (v1 or v2)
    HTTPHandler->>Alerts: Validate and Load Alert
    Alerts->>Storage: Read alert JSON
    alt v1 alert
        Alerts->>Alerts: Migrate v1 to v2 (build query, threshold)
        Alerts->>Storage: Save migrated v2 alert
    end
    Alerts-->>HTTPHandler: AlertConfig (v2)
    HTTPHandler-->>User: Response

    User->>HTTPHandler: Trigger Alert Evaluation
    HTTPHandler->>Alerts: evaluate_alert(alert)
    Alerts->>Querier: Execute alert.query (local/remote)
    Querier-->>Alerts: Numeric result
    Alerts->>Alerts: Compare result to threshold
    Alerts->>Storage: Update alert state
    Alerts-->>HTTPHandler: Evaluation outcome
    HTTPHandler-->>User: Response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

In the meadow where queries hop and play,
Alerts once juggled aggregates all day.
Now with thresholds crisp and clear,
Migration magic brings v2 near.
Queriers round-robin, never in a jam,
This bunny cheers, “What a refactor—hot damn!” 🐇✨



@coderabbitai bot left a comment

Actionable comments posted: 4

🔭 Outside diff range comments (1)
src/alerts/alerts_utils.rs (1)

206-234: Consider returning Result instead of defaulting to 0.0.

The function silently returns 0.0 when it can't extract a value, which could mask errors in alert evaluation. Also, it only handles Float64 and Int64 arrays.

-fn get_final_value(records: Vec<RecordBatch>) -> f64 {
+fn get_final_value(records: Vec<RecordBatch>) -> Result<f64, AlertError> {
     trace!("records-\n{records:?}");

-    if let Some(f) = records
+    // Check if we have any records
+    let batch = records
         .first()
-        .and_then(|batch| {
-            trace!("batch.column(0)-\n{:?}", batch.column(0));
-            batch.column(0).as_any().downcast_ref::<Float64Array>()
-        })
-        .map(|array| {
-            trace!("array-\n{array:?}");
-            array.value(0)
-        })
-    {
-        f
-    } else {
-        records
-            .first()
-            .and_then(|batch| {
-                trace!("batch.column(0)-\n{:?}", batch.column(0));
-                batch.column(0).as_any().downcast_ref::<Int64Array>()
-            })
-            .map(|array| {
-                trace!("array-\n{array:?}");
-                array.value(0)
-            })
-            .unwrap_or_default() as f64
-    }
+        .ok_or_else(|| AlertError::CustomError("No records returned from query".to_string()))?;
+
+    if batch.num_rows() == 0 {
+        return Err(AlertError::CustomError("Query returned empty batch".to_string()));
+    }
+
+    let column = batch.column(0);
+    trace!("batch.column(0)-\n{:?}", column);
+
+    // Try different numeric array types
+    if let Some(array) = column.as_any().downcast_ref::<Float64Array>() {
+        trace!("array-\n{array:?}");
+        Ok(array.value(0))
+    } else if let Some(array) = column.as_any().downcast_ref::<Int64Array>() {
+        trace!("array-\n{array:?}");
+        Ok(array.value(0) as f64)
+    } else {
+        Err(AlertError::CustomError(format!(
+            "Unsupported array type for alert value: {:?}",
+            column.data_type()
+        )))
+    }
 }

Also update the calling code:

-    Ok(get_final_value(records))
+    get_final_value(records)
🧹 Nitpick comments (6)
src/handlers/http/cluster/mod.rs (1)

70-74: Consider memory management for the querier map.

The global QUERIER_MAP could grow unbounded if queriers are frequently added and removed from the cluster. Consider implementing a cleanup mechanism or size limit to prevent potential memory issues in long-running deployments.

Also applies to: 1116-1122

src/alerts/alerts_utils.rs (3)

68-77: Consider simplifying the pattern matching.

Since EvalConfig currently only has one variant, the pattern matching could be simplified.

-fn extract_time_range(eval_config: &super::EvalConfig) -> Result<TimeRange, AlertError> {
-    let (start_time, end_time) = match eval_config {
-        super::EvalConfig::RollingWindow(rolling_window) => (&rolling_window.eval_start, "now"),
-    };
-
-    TimeRange::parse_human_time(start_time, end_time)
-        .map_err(|err| AlertError::CustomError(err.to_string()))
-}
+fn extract_time_range(eval_config: &super::EvalConfig) -> Result<TimeRange, AlertError> {
+    match eval_config {
+        super::EvalConfig::RollingWindow(rolling_window) => {
+            TimeRange::parse_human_time(&rolling_window.eval_start, "now")
+                .map_err(|err| AlertError::CustomError(err.to_string()))
+        }
+    }
+}

93-127: Well-structured local query execution with proper error handling.

The function correctly handles stream creation, query planning, and execution. Good practice adding context to error messages.

Consider adding a comment explaining why Either::Right represents an error case:

     let records = match records {
         Either::Left(rbs) => rbs,
         Either::Right(_) => {
+            // Right variant indicates streaming response which is not expected for aggregate queries
             return Err(AlertError::CustomError(
                 "Query returned no results".to_string(),
             ));
         }

151-164: Consider improving the error message for better debugging.

The function handles all numeric JSON types correctly.

-fn convert_result_to_f64(result_value: serde_json::Value) -> Result<f64, AlertError> {
-    if let Some(value) = result_value.as_f64() {
-        Ok(value)
-    } else if let Some(value) = result_value.as_i64() {
-        Ok(value as f64)
-    } else if let Some(value) = result_value.as_u64() {
-        Ok(value as f64)
-    } else {
-        Err(AlertError::CustomError(
-            "Query result is not a number".to_string(),
-        ))
-    }
-}
+fn convert_result_to_f64(result_value: serde_json::Value) -> Result<f64, AlertError> {
+    if let Some(value) = result_value.as_f64() {
+        Ok(value)
+    } else if let Some(value) = result_value.as_i64() {
+        Ok(value as f64)
+    } else if let Some(value) = result_value.as_u64() {
+        Ok(value as f64)
+    } else {
+        Err(AlertError::CustomError(
+            format!("Query result is not a number, got: {:?}", result_value),
+        ))
+    }
+}
src/alerts/mod.rs (2)

676-699: Well-implemented SQL query construction.

Good handling of aggregate functions and proper column quoting.

Consider validating the stream name to prevent SQL injection:

 fn build_base_query(
     aggregate_function: &AggregateFunction,
     aggregate_config: &JsonValue,
     stream: &str,
 ) -> Result<String, AlertError> {
+    // Validate stream name contains only valid characters
+    if !stream.chars().all(|c| c.is_alphanumeric() || c == '_' || c == '-') {
+        return Err(AlertError::CustomError(format!("Invalid stream name: {}", stream)));
+    }
+    
     let column = aggregate_config["column"].as_str().unwrap_or("*");

925-961: Smart recursive analysis of query structure.

The implementation correctly identifies aggregate queries that return single values.

Consider adding support for queries with LIMIT 1 that might return a single value:

 fn is_logical_plan_aggregate(plan: &LogicalPlan) -> bool {
     match plan {
         // Direct aggregate: SELECT COUNT(*), AVG(col), etc.
         LogicalPlan::Aggregate(_) => true,

         // Projection over aggregate: SELECT COUNT(*) as total, SELECT AVG(col) as average
         LogicalPlan::Projection(Projection { input, expr, .. }) => {
             // Check if input contains an aggregate and we have exactly one expression
             let is_aggregate_input = Self::is_logical_plan_aggregate(input);
             let single_expr = expr.len() == 1;
             is_aggregate_input && single_expr
         }

+        // Limit over aggregate: SELECT COUNT(*) FROM table LIMIT 1
+        LogicalPlan::Limit(limit) => {
+            limit.fetch.map_or(false, |f| f == 1) && Self::is_logical_plan_aggregate(&limit.input)
+        }
+
         // Recursively check wrapped plans (Filter, Limit, Sort, etc.)
         _ => {
             // Use inputs() method to get all input plans
             plan.inputs()
                 .iter()
                 .any(|input| Self::is_logical_plan_aggregate(input))
         }
     }
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 30ee3d4 and 62f9459.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
  • Cargo.toml (1 hunks)
  • resources/ingest_demo_data.sh (3 hunks)
  • src/alerts/alerts_utils.rs (5 hunks)
  • src/alerts/mod.rs (15 hunks)
  • src/handlers/http/alerts.rs (4 hunks)
  • src/handlers/http/cluster/mod.rs (4 hunks)
  • src/handlers/http/query.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (4)
src/handlers/http/query.rs (2)

Learnt from: parmesant
PR: #1347
File: src/handlers/http/query.rs:0-0
Timestamp: 2025-06-18T12:44:31.983Z
Learning: The counts API in src/handlers/http/query.rs does not currently support group_by functionality in COUNT queries, so the hard-coded fields array ["start_time", "end_time", "count"] is appropriate for the current scope.

Learnt from: nikhilsinhaparseable
PR: #1340
File: src/query/mod.rs:64-66
Timestamp: 2025-06-18T06:39:04.775Z
Learning: In src/query/mod.rs, QUERY_SESSION_STATE and QUERY_SESSION serve different architectural purposes: QUERY_SESSION_STATE is used for stats calculation and allows dynamic registration of individual parquet files from the staging path (files created every minute), while QUERY_SESSION is used for object store queries with the global schema provider. Session contexts with schema providers don't support registering individual tables/parquets, so both session objects are necessary for their respective use cases.

src/handlers/http/alerts.rs (3)

Learnt from: nikhilsinhaparseable
PR: #1305
File: src/handlers/http/users/dashboards.rs:0-0
Timestamp: 2025-05-01T10:27:56.858Z
Learning: The add_tile() function in src/handlers/http/users/dashboards.rs should use get_dashboard_by_user(dashboard_id, &user_id) instead of get_dashboard(dashboard_id) to ensure proper authorization checks when modifying a dashboard.

Learnt from: nikhilsinhaparseable
PR: #1340
File: src/query/mod.rs:64-66
Timestamp: 2025-06-18T06:39:04.775Z
Learning: In src/query/mod.rs, QUERY_SESSION_STATE and QUERY_SESSION serve different architectural purposes: QUERY_SESSION_STATE is used for stats calculation and allows dynamic registration of individual parquet files from the staging path (files created every minute), while QUERY_SESSION is used for object store queries with the global schema provider. Session contexts with schema providers don't support registering individual tables/parquets, so both session objects are necessary for their respective use cases.

Learnt from: de-sh
PR: #1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern !PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?) ensures proper error handling in both modes.

src/handlers/http/cluster/mod.rs (4)

Learnt from: nikhilsinhaparseable
PR: #1340
File: src/query/mod.rs:64-66
Timestamp: 2025-06-18T06:39:04.775Z
Learning: In src/query/mod.rs, QUERY_SESSION_STATE and QUERY_SESSION serve different architectural purposes: QUERY_SESSION_STATE is used for stats calculation and allows dynamic registration of individual parquet files from the staging path (files created every minute), while QUERY_SESSION is used for object store queries with the global schema provider. Session contexts with schema providers don't support registering individual tables/parquets, so both session objects are necessary for their respective use cases.

Learnt from: nikhilsinhaparseable
PR: #1346
File: src/parseable/streams.rs:319-331
Timestamp: 2025-06-16T09:50:38.636Z
Learning: In Parseable's Ingest or Query mode, the node_id is always available because it's generated during server initialization itself, before the get_node_id_string() function in streams.rs would be called. This makes the .expect() calls on QUERIER_META.get() and INGESTOR_META.get() safe in this context.

Learnt from: de-sh
PR: #1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern !PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?) ensures proper error handling in both modes.

Learnt from: parmesant
PR: #1347
File: src/handlers/http/query.rs:0-0
Timestamp: 2025-06-18T12:44:31.983Z
Learning: The counts API in src/handlers/http/query.rs does not currently support group_by functionality in COUNT queries, so the hard-coded fields array ["start_time", "end_time", "count"] is appropriate for the current scope.

src/alerts/mod.rs (1)

Learnt from: de-sh
PR: #1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern !PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?) ensures proper error handling in both modes.

🧬 Code Graph Analysis (1)
src/handlers/http/alerts.rs (1)
src/utils/mod.rs (1)
  • user_auth_for_query (79-88)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Quest Smoke and Load Tests for Distributed deployments
  • GitHub Check: Quest Smoke and Load Tests for Standalone deployments
  • GitHub Check: Build Default x86_64-pc-windows-msvc
  • GitHub Check: coverage
  • GitHub Check: Build Default aarch64-apple-darwin
  • GitHub Check: Build Kafka x86_64-unknown-linux-gnu
  • GitHub Check: Build Default aarch64-unknown-linux-gnu
  • GitHub Check: Build Kafka aarch64-apple-darwin
  • GitHub Check: Build Default x86_64-unknown-linux-gnu
  • GitHub Check: Build Default x86_64-apple-darwin
🔇 Additional comments (17)
src/handlers/http/query.rs (1)

59-59: LGTM!

Making TIME_ELAPSED_HEADER public is appropriate as it's now used by the cluster handler module for query response timing extraction.

src/handlers/http/alerts.rs (2)

57-57: LGTM!

The streamlined validation approach using alert.validate(session_key) properly encapsulates the validation logic within the alert structure.


82-82: Consistent authorization approach.

The direct use of user_auth_for_query with alert.query across GET, DELETE, and PUT handlers provides a clean and consistent authorization pattern that aligns well with the query-based alert model.

Also applies to: 96-96, 129-129

src/handlers/http/cluster/mod.rs (1)

1123-1208: Well-designed querier selection with proper concurrency control.

The implementation excellently handles:

  • Concurrent liveness checks with semaphore-based rate limiting
  • Efficient map updates for live/dead queriers
  • Smart fallback from round-robin to LRU strategy

This design should scale well for typical cluster sizes.
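
As a rough, self-contained sketch of the semaphore pattern described here (tokio assumed; `check_liveness` is a hypothetical stand-in for the real health probe):

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

// Stand-in: a real implementation would hit the querier's health endpoint.
async fn check_liveness(domain: &str) -> bool {
    !domain.is_empty()
}

// Probe all queriers concurrently, but never more than `max_concurrent` at once.
async fn live_queriers(domains: Vec<String>, max_concurrent: usize) -> Vec<String> {
    let semaphore = Arc::new(Semaphore::new(max_concurrent));
    let mut handles = Vec::new();
    for domain in domains {
        let permit_sem = Arc::clone(&semaphore);
        handles.push(tokio::spawn(async move {
            // Holding the permit bounds the number of in-flight checks.
            let _permit = permit_sem.acquire().await.expect("semaphore closed");
            check_liveness(&domain).await.then_some(domain)
        }));
    }
    let mut live = Vec::new();
    for handle in handles {
        if let Ok(Some(domain)) = handle.await {
            live.push(domain);
        }
    }
    live
}

#[tokio::main]
async fn main() {
    let domains = vec!["q1".to_string(), "q2".to_string(), String::new()];
    println!("live queriers: {:?}", live_queriers(domains, 2).await);
}
```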

src/alerts/alerts_utils.rs (8)

21-39: LGTM! Clean imports aligned with the new query-based alert system.

The imports are well-organized and properly support the refactored alert evaluation logic, including the new cluster query dispatch functionality.


54-66: LGTM! Clean refactor to query-based evaluation.

The simplified evaluation flow clearly separates concerns: time range extraction, query execution, threshold evaluation, and state update. Good use of the ? operator for error propagation.


129-150: LGTM! Clean implementation of remote query execution.

The Query request is properly configured for alert evaluation (non-streaming, aggregate results). Good separation of concerns with the JSON-to-f64 conversion delegated to a helper function.


166-175: LGTM! Complete and correct threshold evaluation.

All comparison operators are properly implemented.
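
For reference, a self-contained sketch of what such an operator match might look like; the operator spellings are assumptions following the `thresholdConfig.operator` values in this PR's payloads:

```rust
// Hedged sketch of threshold comparison; operator strings are assumptions.
fn evaluate_condition(operator: &str, actual: f64, expected: f64) -> bool {
    match operator {
        ">" => actual > expected,
        "<" => actual < expected,
        ">=" => actual >= expected,
        "<=" => actual <= expected,
        "=" => (actual - expected).abs() < f64::EPSILON,
        "!=" => (actual - expected).abs() >= f64::EPSILON,
        _ => false, // unknown operator: treat as not breached
    }
}

fn main() {
    // 1500 errors against a "> 1000" threshold should trigger the alert.
    assert!(evaluate_condition(">", 1500.0, 1000.0));
    assert!(!evaluate_condition("<", 1500.0, 1000.0));
    println!("threshold checks passed");
}
```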


177-204: Excellent state management with informative alert messages.

The function correctly handles state transitions and generates detailed messages that include all relevant context (query, threshold, actual value, evaluation window). The optimization to avoid unnecessary state updates is a nice touch.


236-328: Well-implemented filter string generation with proper SQL escaping.

The function correctly handles all WHERE operators with appropriate escaping for LIKE patterns. Good validation for NULL operators and clear error messages.


330-365: LGTM! Clean value type handling.

The ValueType enum and its trait implementations provide good type safety for filter values.


78-92: Incomplete mode coverage in execute_alert_query

The execute_alert_query function currently handles only Mode::All, Mode::Query, and Mode::Prism, but the Mode enum (src/option.rs) also defines two additional variants:

  • Mode::Ingest
  • Mode::Index

Please confirm whether alerts should also be evaluated under these modes (for example, by routing them through execute_local_query or another handler) or if returning a CustomError is the intended behavior for them.

src/alerts/mod.rs (5)

19-64: LGTM! Appropriate imports and version update.

The imports properly support the new query-based system and migration logic. Boxing AlertConfig in AlertTask::Create is a good choice for handling async migration.


572-603: Excellent migration implementation from v1 to v2.

The migration logic is comprehensive and handles all aspects of the v1 alert structure. Good practice to save the migrated alert immediately.


870-923: Robust query validation with proper authorization checks.

Excellent implementation that validates query syntax, user permissions, and ensures the query is suitable for alerting (returns single aggregate value).


1079-1170: Excellent backward compatibility handling in alert loading.

The function robustly handles version detection and migration with good error recovery. The logic to detect v1 alerts by checking for the presence of stream field or absence of query field is thorough.


1044-1068: LGTM! Appropriate error handling for invalid queries.

The new InvalidAlertQuery error variant is properly integrated with the correct HTTP status code.

@coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/alerts/mod.rs (1)

96-104: Consider explicit error handling for unknown versions.

This matches a previous review comment. Defaulting unknown versions to V2 might hide corruption or format issues. Consider adding logging for unknown versions or returning a Result type.

🧹 Nitpick comments (1)
src/alerts/mod.rs (1)

1118-1208: Robust alert loading with automatic migration.

The loading logic properly handles the transition from v1 to v2 alerts:

  • Safe version detection using raw JSON parsing
  • Automatic migration of v1 alerts during load
  • Proper fallback for alerts without version fields
  • Good error handling with continuation on individual failures

The retry mechanism for failed AlertTask creation is a nice touch for resilience.

Consider implementing atomic migration to avoid potential inconsistencies if the process fails partway through:

// Before processing alerts, collect migrations into a batch
let mut migration_batch: Vec<AlertConfig> = Vec::new();
// ... populate migration_batch with all migrated alerts first ...
// then apply them atomically
for migration in migration_batch {
    store.put_alert(migration.id, &migration).await?;
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 62f9459 and ea5270e.

📒 Files selected for processing (2)
  • resources/ingest_demo_data.sh (3 hunks)
  • src/alerts/mod.rs (15 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • resources/ingest_demo_data.sh
🧰 Additional context used
🧠 Learnings (1)
src/alerts/mod.rs (1)

Learnt from: de-sh
PR: #1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern !PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?) ensures proper error handling in both modes.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: coverage
  • GitHub Check: Quest Smoke and Load Tests for Distributed deployments
  • GitHub Check: Build Default aarch64-apple-darwin
  • GitHub Check: Build Kafka x86_64-unknown-linux-gnu
  • GitHub Check: Quest Smoke and Load Tests for Standalone deployments
  • GitHub Check: Build Default x86_64-pc-windows-msvc
  • GitHub Check: Build Kafka aarch64-apple-darwin
  • GitHub Check: Build Default x86_64-apple-darwin
  • GitHub Check: Build Default aarch64-unknown-linux-gnu
  • GitHub Check: Build Default x86_64-unknown-linux-gnu
🔇 Additional comments (6)
src/alerts/mod.rs (6)

53-63: LGTM! Helper structures are well-designed.

The BasicAlertFields struct provides a clean way to extract common fields during migration, and updating the version constant to "v2" aligns with the migration objectives.


396-516: Well-structured simplification of configuration.

The new ThresholdConfig and LogicalOperator structures are cleaner and more focused than the previous aggregate-based approach. The implementations are standard and follow Rust conventions properly.


520-569: Structural changes align with v2 alert model.

The replacement of stream with query and aggregates with threshold_config correctly implements the shift from aggregate-based to query-based alerts. The conversion logic is sound.


909-999: Excellent validation improvements using DataFusion.

The new validation approach is much more robust:

  • Proper query syntax validation using DataFusion's parser
  • Authorization checks integrated with existing RBAC system
  • Intelligent aggregate query detection through logical plan analysis
  • Comprehensive error handling for various failure modes

The is_logical_plan_aggregate method correctly identifies aggregate queries by checking for Aggregate nodes and Projection over aggregates.


1082-1083: Appropriate error handling for query validation.

Adding the InvalidAlertQuery error variant provides clear error reporting for the new query-based validation logic.


573-891: Ensure proper validation and escaping of stream and column identifiers

While the value strings in WHERE clauses are correctly escaped, the stream and column inputs are interpolated directly into the SQL without any sanitization. This can allow injection if those fields contain unexpected characters. Please update the code to enforce safe identifier handling:

  • In build_query_from_v1 / build_base_query (src/alerts/mod.rs):
    • Validate that stream matches a strict pattern (e.g. ^[A-Za-z0-9_]+$) before quoting.
    • Similarly, validate column or fall back to * if it doesn’t meet your naming rules.
  • Escape any internal double quotes in identifiers, or better yet, leverage a SQL‐builder or identifier‐quoting utility instead of format!.
  • Add unit tests to cover edge cases where stream or column contains malicious payloads.
⛔ Skipped due to learnings
Learnt from: de-sh
PR: parseablehq/parseable#1236
File: src/prism/logstream/mod.rs:332-332
Timestamp: 2025-03-13T11:39:52.587Z
Learning: SQL injection concerns can be ignored in this codebase as all SQL queries are run against immutable data streams, limiting the potential impact of any injection.
Learnt from: de-sh
PR: parseablehq/parseable#1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.

@coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/handlers/http/cluster/mod.rs (1)

1123-1208: Robust querier selection with good concurrency control.

The implementation properly handles:

  • Concurrency limits for liveness checks to prevent resource exhaustion
  • Dynamic querier map updates based on liveness
  • Fallback selection strategies (round-robin → LRU)
  • Proper cleanup of dead queriers

Consider caching liveness results with a TTL for frequently called operations to reduce the overhead of checking every querier on each request.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ea5270e and 28a8818.

📒 Files selected for processing (3)
  • Cargo.toml (1 hunks)
  • resources/ingest_demo_data.sh (3 hunks)
  • src/handlers/http/cluster/mod.rs (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (1)
  • resources/ingest_demo_data.sh
🧰 Additional context used
🧠 Learnings (1)
src/handlers/http/cluster/mod.rs (5)

Learnt from: nikhilsinhaparseable
PR: #1340
File: src/query/mod.rs:64-66
Timestamp: 2025-06-18T06:39:04.775Z
Learning: In src/query/mod.rs, QUERY_SESSION_STATE and QUERY_SESSION serve different architectural purposes: QUERY_SESSION_STATE is used for stats calculation and allows dynamic registration of individual parquet files from the staging path (files created every minute), while QUERY_SESSION is used for object store queries with the global schema provider. Session contexts with schema providers don't support registering individual tables/parquets, so both session objects are necessary for their respective use cases.

Learnt from: nikhilsinhaparseable
PR: #1346
File: src/parseable/streams.rs:319-331
Timestamp: 2025-06-16T09:50:38.636Z
Learning: In Parseable's Ingest or Query mode, the node_id is always available because it's generated during server initialization itself, before the get_node_id_string() function in streams.rs would be called. This makes the .expect() calls on QUERIER_META.get() and INGESTOR_META.get() safe in this context.

Learnt from: de-sh
PR: #1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern !PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?) ensures proper error handling in both modes.

Learnt from: nikhilsinhaparseable
PR: #1346
File: src/handlers/http/health_check.rs:81-90
Timestamp: 2025-06-16T02:04:58.990Z
Learning: In the shutdown function in src/handlers/http/health_check.rs, the design approach is to log errors from sync operations rather than propagate them. This is intentional because the shutdown function is called on SIGTERM/SIGINT signals, and the goal is to perform best-effort cleanup (syncing pending files to object storage) while allowing the shutdown to proceed regardless of sync failures. Logging provides debugging information without blocking the shutdown process.

Learnt from: parmesant
PR: #1347
File: src/handlers/http/query.rs:0-0
Timestamp: 2025-06-18T12:44:31.983Z
Learning: The counts API in src/handlers/http/query.rs does not currently support group_by functionality in COUNT queries, so the hard-coded fields array ["start_time", "end_time", "count"] is appropriate for the current scope.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Quest Smoke and Load Tests for Standalone deployments
  • GitHub Check: Quest Smoke and Load Tests for Distributed deployments
  • GitHub Check: coverage
  • GitHub Check: Build Default x86_64-pc-windows-msvc
  • GitHub Check: Build Default aarch64-unknown-linux-gnu
  • GitHub Check: Build Kafka x86_64-unknown-linux-gnu
  • GitHub Check: Build Default x86_64-apple-darwin
  • GitHub Check: Build Default aarch64-apple-darwin
  • GitHub Check: Build Default x86_64-unknown-linux-gnu
  • GitHub Check: Build Kafka aarch64-apple-darwin
🔇 Additional comments (5)
src/handlers/http/cluster/mod.rs (5)

21-74: LGTM! Clean import additions and proper global state management.

The new imports are well-organized and appropriate for the querier selection functionality. The lazy_static usage with Arc<RwLock<>> provides proper thread-safe access to the global querier state.
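
A minimal sketch of that pattern, assuming std's RwLock and a simplified status struct (the real `QuerierStatus` also carries metadata and usage timestamps):

```rust
// Sketch only: a global querier registry behind lazy_static + Arc<RwLock<..>>.
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

use lazy_static::lazy_static;

#[derive(Default)]
struct QuerierStatus {
    available: bool,
}

lazy_static! {
    static ref QUERIER_MAP: Arc<RwLock<HashMap<String, QuerierStatus>>> =
        Arc::new(RwLock::new(HashMap::new()));
}

fn mark_available(domain: &str, available: bool) {
    // Write lock scoped to this function, so readers are not blocked for long.
    let mut map = QUERIER_MAP.write().expect("querier map lock poisoned");
    map.entry(domain.to_string()).or_default().available = available;
}

fn main() {
    mark_available("querier-01", true);
    let map = QUERIER_MAP.read().expect("querier map lock poisoned");
    println!("{} queriers tracked", map.len());
}
```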


1116-1121: Well-designed querier status tracking structure.

The struct appropriately captures the necessary state for querier management: metadata, availability, and usage tracking for LRU selection.


1210-1298: Well-implemented selection algorithms with proper edge case handling.

Both selection strategies are correctly implemented (see the standalone sketch after this list):

  • Round-robin properly cycles through available queriers with wrap-around logic
  • LRU correctly prioritizes never-used queriers and selects based on oldest usage time
  • Edge cases (empty maps, missing entries) are handled appropriately
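
A self-contained sketch of round-robin with LRU fallback over a querier map; `QuerierStatus` and the selection signatures here are stand-ins for the real code, not its actual definitions:

```rust
use std::collections::HashMap;
use std::time::Instant;

struct QuerierStatus {
    available: bool,
    last_used: Option<Instant>,
}

/// Round-robin: scan from the cursor with wrap-around and take the first
/// available querier, then advance the cursor past the selection.
fn select_round_robin(
    map: &HashMap<String, QuerierStatus>,
    order: &[String],
    cursor: &mut usize,
) -> Option<String> {
    for i in 0..order.len() {
        let idx = (*cursor + i) % order.len();
        if map.get(&order[idx]).is_some_and(|q| q.available) {
            *cursor = (idx + 1) % order.len();
            return Some(order[idx].clone());
        }
    }
    None
}

/// LRU fallback: never-used queriers win (None sorts before Some), otherwise
/// pick the one with the oldest last_used timestamp.
fn select_lru(map: &HashMap<String, QuerierStatus>) -> Option<String> {
    map.iter()
        .filter(|(_, q)| q.available)
        .min_by_key(|(_, q)| q.last_used)
        .map(|(id, _)| id.clone())
}

fn main() {
    let mut map = HashMap::new();
    map.insert("q1".into(), QuerierStatus { available: false, last_used: None });
    map.insert("q2".into(), QuerierStatus { available: true, last_used: None });
    let order = vec!["q1".to_string(), "q2".to_string()];
    let mut cursor = 0;
    assert_eq!(select_round_robin(&map, &order, &mut cursor), Some("q2".into()));
    assert_eq!(select_lru(&map), Some("q2".into()));
    println!("selection sketch ok");
}
```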

1300-1307: Simple and correct querier availability management.

The function properly marks queriers as available while preserving usage tracking for LRU selection.


1309-1372: Excellent error handling that addresses previous review concerns.

The function properly ensures queriers are marked available on all code paths:

  • Serialization errors (line 1325)
  • HTTP request errors (line 1340)
  • Successful requests (line 1346)

This implementation correctly resolves the potential querier exhaustion issue identified in previous reviews.

@coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/alerts/mod.rs (2)

621-627: Consider more robust enum parsing with explicit error handling.

The migration functions use string matching with default fallbacks for enums (severity, alert operators, alert state). While functional, this could mask data corruption issues during migration.

Consider adding warning logs for unknown enum values:

let severity = match severity_str.to_lowercase().as_str() {
    "critical" => Severity::Critical,
    "high" => Severity::High,
    "medium" => Severity::Medium,
    "low" => Severity::Low,
-   _ => Severity::Medium, // default
+   _ => {
+       warn!("Unknown severity '{}' in v1 alert migration, defaulting to Medium", severity_str);
+       Severity::Medium
+   }
};

Similar patterns should be applied to operator and state parsing.

Also applies to: 824-832, 885-890


1133-1141: Consider batching storage operations during migration.

The current migration approach saves each migrated alert individually to storage. For systems with many v1 alerts, this could result in numerous storage operations during startup.

Consider collecting migrated alerts and performing batch storage operations:

let mut migrated_alerts = Vec::new();

// Collect migrations first
for raw_bytes in raw_objects {
    // ... migration logic ...
    if needs_migration {
        let migrated = AlertConfig::migrate_from_v1(&json_value, store.as_ref()).await?;
        migrated_alerts.push(migrated);
    }
}

// Batch save migrated alerts
if !migrated_alerts.is_empty() {
    // Implement batch storage operation
    store.put_alerts_batch(&migrated_alerts).await?;
}

Also applies to: 1150-1156, 1170-1176

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 28a8818 and 7a84874.

📒 Files selected for processing (1)
  • src/alerts/mod.rs (15 hunks)
🧰 Additional context used
🧠 Learnings (1)
src/alerts/mod.rs (2)

Learnt from: nikhilsinhaparseable
PR: #1388
File: src/alerts/mod.rs:88-104
Timestamp: 2025-07-24T11:09:21.746Z
Learning: In the Parseable alert system (src/alerts/mod.rs), alert versions are server-generated and controlled via CURRENT_ALERTS_VERSION constant, not user input. The AlertVerison enum's From<&str> implementation correctly defaults unknown versions to V2 since the server only generates known versions (v1, v2). Unknown versions would only occur in exceptional cases like file corruption, making the current fallback approach appropriate.

Learnt from: de-sh
PR: #1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern !PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?) ensures proper error handling in both modes.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Quest Smoke and Load Tests for Standalone deployments
  • GitHub Check: Quest Smoke and Load Tests for Distributed deployments
  • GitHub Check: coverage
  • GitHub Check: Build Default x86_64-apple-darwin
  • GitHub Check: Build Default x86_64-pc-windows-msvc
  • GitHub Check: Build Default aarch64-apple-darwin
  • GitHub Check: Build Default aarch64-unknown-linux-gnu
  • GitHub Check: Build Default x86_64-unknown-linux-gnu
  • GitHub Check: Build Kafka aarch64-apple-darwin
  • GitHub Check: Build Kafka x86_64-unknown-linux-gnu
🔇 Additional comments (14)
src/alerts/mod.rs (14)

63-63: Version migration to v2 looks appropriate.

The update to CURRENT_ALERTS_VERSION = "v2" and defaulting AlertVerison to V2 aligns well with the alert system migration. Based on the retrieved learning, the server-controlled versioning approach with fallback to v2 is correctly implemented.

Also applies to: 92-93, 100-101


54-58: Helper struct is well-designed for migration.

The BasicAlertFields struct appropriately captures common fields needed during v1 to v2 migration, keeping the migration code clean and organized.


396-399: New data structures align well with simplified alert model.

The ThresholdConfig struct and LogicalOperator enum are well-defined and follow Rust conventions. The transition from aggregate-based to threshold-based configuration simplifies the alert system appropriately.

Also applies to: 502-516


524-524: Query-based alert model is consistently implemented.

The replacement of stream field with query and aggregates/aggregate_config with threshold_config is consistently applied across AlertRequest and AlertConfig structs.

Also applies to: 526-526, 542-542, 544-544, 561-561, 563-563


573-602: Migration function structure is well-organized.

The migrate_from_v1 function follows a clear pattern of extracting components, building the migrated structure, and persisting it. The error handling and async design are appropriate.


637-650: SQL query building logic handles aggregate functions correctly.

The query building logic properly handles different aggregate functions including the special case for COUNT DISTINCT and COUNT with specific columns. The SQL syntax generation follows standard conventions.

Also applies to: 676-710


773-809: SQL injection protection is properly implemented.

The WHERE clause formatting includes proper SQL escaping using value.replace('\'', "''") for all string values and uses double-quoted identifiers for column names, which provides good protection against SQL injection attacks.
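
As a tiny self-contained illustration of that escaping rule (the helper name is hypothetical; the review only cites the `replace` call itself):

```rust
// Standard SQL single-quote escaping: double any embedded quote.
fn escape_sql_string(value: &str) -> String {
    value.replace('\'', "''")
}

fn main() {
    let user_value = "O'Brien'; DROP TABLE t; --";
    let clause = format!("\"severity_text\" = '{}'", escape_sql_string(user_value));
    // The embedded quotes are doubled, so the value cannot break out of the literal.
    assert_eq!(clause, "\"severity_text\" = 'O''Brien''; DROP TABLE t; --'");
    println!("{clause}");
}
```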


940-961: Comprehensive query validation with proper security checks.

The validation logic is well-structured with appropriate checks for:

  • Non-empty queries
  • Stream name resolution and distributed stream creation
  • User authorization
  • Aggregate query verification

The integration with existing authorization and stream management functions maintains consistency with the broader system.


964-999: Sophisticated aggregate query detection using DataFusion.

The is_query_aggregate and is_logical_plan_aggregate functions provide robust validation by analyzing the logical plan structure. The recursive analysis correctly handles:

  • Direct aggregates (LogicalPlan::Aggregate)
  • Projections over aggregates with single expression validation
  • Wrapped plans through recursive input checking

This approach is much more reliable than string-based query analysis.


1144-1177: Robust migration detection and handling during alert loading.

The loading logic effectively handles the v1 to v2 migration with multiple detection strategies:

  • Explicit version checking
  • Null query field detection
  • Presence of legacy stream field

The fallback to migration for alerts without version fields ensures backward compatibility. Error handling with continue statements prevents single alert failures from blocking the entire loading process.
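
A hedged sketch of that detection logic on raw JSON (serde_json assumed; the field checks follow the bullets above rather than the actual source):

```rust
use serde_json::{json, Value};

// Treat an alert as v1 if it says so explicitly, or, lacking a version field,
// if it still has the legacy `stream` field or a missing/null `query` field.
fn needs_migration(alert: &Value) -> bool {
    match alert.get("version").and_then(Value::as_str) {
        Some("v1") => true,
        Some(_) => false,
        None => {
            alert.get("stream").is_some()
                || alert.get("query").map_or(true, Value::is_null)
        }
    }
}

fn main() {
    let v1 = json!({ "version": "v1", "stream": "test1" });
    let v2 = json!({ "version": "v2", "query": "SELECT COUNT(*) FROM test1" });
    let unversioned_legacy = json!({ "stream": "test1" });
    assert!(needs_migration(&v1));
    assert!(!needs_migration(&v2));
    assert!(needs_migration(&unversioned_legacy));
    println!("migration detection sketch ok");
}
```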


1180-1201: Alert task creation includes appropriate retry logic.

The retry mechanism for failed alert task creation provides resilience against transient failures. Boxing the alert configs aligns with the updated AlertTask::Create variant signature.


22-22: Import updates support new functionality appropriately.

The added imports for DataFusion logical plan analysis, JSON handling, and query functions align well with the new validation and migration capabilities.

Also applies to: 29-30, 44-44, 46-46, 49-49


84-84: Method updates consistently apply the new data model.

The updates to use Box<AlertConfig>, query field, and threshold_config field are consistently applied across all relevant methods, maintaining API coherence.

Also applies to: 899-899, 901-901, 1217-1217, 1304-1304


573-1207: Excellent refactoring that successfully modernizes the alert system.

This comprehensive migration from aggregate-based to query-based alerts demonstrates:

  • Robust migration strategy with backward compatibility
  • Enhanced security through proper SQL escaping and authorization checks
  • Improved validation using DataFusion's logical plan analysis
  • Consistent error handling throughout the migration and validation logic
  • Clean code structure with well-organized helper functions

The refactoring significantly simplifies the alert model while maintaining full functionality and providing a smooth upgrade path from v1 to v2 alerts.

coderabbitai bot previously approved these changes Jul 24, 2025

@coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/alerts/alerts_utils.rs (1)

207-222: Fix Clippy format string warnings.

The number formatting logic is excellent, but there are Clippy warnings that should be addressed.

Apply this diff to fix the format string warnings:

-        format!("{:.0}", value)
+        format!("{value:.0}")
-        format!("{:.4}", value)
+        format!("{value:.4}")
-        format!("{:.2}", value)
+        format!("{value:.2}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7a84874 and ef52f9c.

📒 Files selected for processing (1)
  • src/alerts/alerts_utils.rs (5 hunks)
🧰 Additional context used
🧠 Learnings (1)
src/alerts/alerts_utils.rs (1)

Learnt from: nikhilsinhaparseable
PR: #1388
File: src/alerts/mod.rs:88-104
Timestamp: 2025-07-24T11:09:21.746Z
Learning: In the Parseable alert system (src/alerts/mod.rs), alert versions are server-generated and controlled via CURRENT_ALERTS_VERSION constant, not user input. The AlertVerison enum's From<&str> implementation correctly defaults unknown versions to V2 since the server only generates known versions (v1, v2). Unknown versions would only occur in exceptional cases like file corruption, making the current fallback approach appropriate.

🪛 GitHub Actions: Lint, Test and Coverage Report
src/alerts/alerts_utils.rs

[error] 211-211: Clippy lint 'uninlined_format_args': variables can be used directly in the format! string. Change format!("{:.0}", value) to format!("{value:.0}").


[error] 214-214: Clippy lint 'uninlined_format_args': variables can be used directly in the format! string. Change format!("{:.4}", value) to format!("{value:.4}").


[error] 220-220: Clippy lint 'uninlined_format_args': variables can be used directly in the format! string. Change format!("{:.2}", value) to format!("{value:.2}").

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Quest Smoke and Load Tests for Standalone deployments
  • GitHub Check: Quest Smoke and Load Tests for Distributed deployments
  • GitHub Check: Build Default x86_64-apple-darwin
  • GitHub Check: Build Kafka aarch64-apple-darwin
  • GitHub Check: Build Default x86_64-pc-windows-msvc
  • GitHub Check: Build Default aarch64-apple-darwin
  • GitHub Check: Build Default x86_64-unknown-linux-gnu
  • GitHub Check: Build Default aarch64-unknown-linux-gnu
  • GitHub Check: Build Kafka x86_64-unknown-linux-gnu
🔇 Additional comments (11)
src/alerts/alerts_utils.rs (11)

21-41: LGTM! Import changes support the new query-based alert system.

The new imports appropriately support the refactored alert evaluation logic, including remote query execution via cluster handlers and local query processing with DataFusion.


54-66: LGTM! Excellent simplification of alert evaluation logic.

The refactored evaluate_alert function properly implements the new query-and-threshold-based model, replacing the complex aggregate-based evaluation with a much cleaner flow.


68-76: LGTM! Clean time range extraction implementation.

The function correctly extracts time ranges from the evaluation configuration and handles errors appropriately.


78-91: LGTM! Proper mode-based query execution dispatch.

The function correctly routes alert query execution based on the server mode, supporting both local and remote execution patterns.


93-127: LGTM! Solid local query execution implementation.

The function properly integrates with the existing query infrastructure and handles all execution steps with appropriate error handling.


129-149: LGTM! Proper remote query execution implementation.

The function correctly constructs query requests and integrates with the cluster query infrastructure for remote execution.


151-164: LGTM! Robust JSON to numeric conversion.

The function properly handles various JSON numeric types and provides clear error messages for invalid data.


177-205: LGTM! Alert state update properly adapted to new model.

The function correctly uses the new threshold configuration and provides comprehensive alert messages including the SQL query for better troubleshooting.


224-252: LGTM! Robust value extraction from query results.

The function properly handles different numeric array types and provides a safe fallback mechanism with good debugging support.


254-346: LGTM! Simplified filter string generation with proper SQL safety.

The function correctly implements the simplified AND-only logic while maintaining comprehensive operator support and SQL injection protection through proper escaping.


348-383: LGTM! Well-designed ValueType enum with proper trait implementations.

The enum correctly handles different value types with appropriate parsing, formatting, and DataFusion integration.

coderabbitai bot previously approved these changes Jul 24, 2025

@coderabbitai bot left a comment

Actionable comments posted: 1

🔭 Outside diff range comments (1)
src/alerts/alerts_utils.rs (1)

207-235: Add bounds checking for safer array access.

The function accesses array elements without verifying bounds, which could cause panics:

  1. Missing array length check: array.value(0) assumes the array has at least one element.
  2. Limited type support: Only handles Float64Array and Int64Array, but queries might return other numeric types.
 fn get_final_value(records: Vec<RecordBatch>) -> f64 {
     trace!("records-\n{records:?}");
 
+    if records.is_empty() {
+        return 0.0;
+    }
+
+    let batch = &records[0];
+    if batch.num_columns() == 0 || batch.num_rows() == 0 {
+        return 0.0;
+    }
+
     if let Some(f) = records
         .first()
         .and_then(|batch| {
             trace!("batch.column(0)-\n{:?}", batch.column(0));
             batch.column(0).as_any().downcast_ref::<Float64Array>()
         })
+        .filter(|array| array.len() > 0)
         .map(|array| {
             trace!("array-\n{array:?}");
             array.value(0)
         })
     {
         f
     } else {
         records
             .first()
             .and_then(|batch| {
                 trace!("batch.column(0)-\n{:?}", batch.column(0));
                 batch.column(0).as_any().downcast_ref::<Int64Array>()
             })
+            .filter(|array| array.len() > 0)
             .map(|array| {
                 trace!("array-\n{array:?}");
                 array.value(0)
             })
             .unwrap_or_default() as f64
     }
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ef52f9c and 164d7d6.

📒 Files selected for processing (1)
  • src/alerts/alerts_utils.rs (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Quest Smoke and Load Tests for Standalone deployments
  • GitHub Check: Quest Smoke and Load Tests for Distributed deployments
  • GitHub Check: coverage
  • GitHub Check: Build Default aarch64-unknown-linux-gnu
  • GitHub Check: Build Kafka aarch64-apple-darwin
  • GitHub Check: Build Default x86_64-pc-windows-msvc
  • GitHub Check: Build Default aarch64-apple-darwin
  • GitHub Check: Build Default x86_64-apple-darwin
  • GitHub Check: Build Kafka x86_64-unknown-linux-gnu
  • GitHub Check: Build Default x86_64-unknown-linux-gnu
🔇 Additional comments (9)
src/alerts/alerts_utils.rs (9)

21-41: LGTM! Import changes align with the new alert architecture.

The added imports support the new query-based alert evaluation system with proper separation between local and remote execution modes.


54-66: Excellent simplification of the alert evaluation logic.

The new 3-step process (extract time range → execute query → evaluate condition) is much cleaner than the previous aggregate-based approach and aligns perfectly with the PR objectives to simplify alert creation and evaluation.


68-77: LGTM! Time range extraction is correctly implemented.

The hardcoded "now" for end_time is appropriate for rolling window evaluations. The error handling properly converts time parsing errors to AlertError.
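
To make the rolling-window semantics concrete, here is a simplified stand-in for the internal `TimeRange::parse_human_time` (chrono assumed; the real parser accepts richer human-time expressions than the hour-only offsets handled here):

```rust
use chrono::{DateTime, Duration, Utc};

// Turn an offset like "5h" plus an implicit "now" end into a concrete range.
fn parse_rolling_window(eval_start: &str) -> Result<(DateTime<Utc>, DateTime<Utc>), String> {
    let end = Utc::now();
    let hours: i64 = eval_start
        .strip_suffix('h')
        .ok_or_else(|| format!("unsupported offset: {eval_start}"))?
        .parse()
        .map_err(|e| format!("bad number in {eval_start}: {e}"))?;
    Ok((end - Duration::hours(hours), end))
}

fn main() {
    let (start, end) = parse_rolling_window("5h").expect("valid offset");
    assert!(start < end);
    println!("evaluating from {start} to {end}");
}
```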


78-91: Excellent dispatch pattern for query execution modes.

The clean separation between local (All/Query) and remote (Prism) execution with proper error handling for unsupported modes demonstrates good architectural design.
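The dispatch reduces to a single match on the server mode; a minimal sketch with illustrative types and stubbed executors:

```rust
/// Illustrative stand-in for the server's deployment mode.
enum Mode {
    All,    // standalone: ingest + query in one process
    Query,  // dedicated querier node
    Ingest, // ingest-only node: cannot evaluate alerts
    Prism,  // gateway node: must delegate queries
}

fn run_alert_query(mode: &Mode, query: &str) -> Result<f64, String> {
    match mode {
        Mode::All | Mode::Query => execute_local_query(query),
        Mode::Prism => execute_remote_query(query),
        Mode::Ingest => Err("alert evaluation unsupported in this mode".into()),
    }
}

fn execute_local_query(_q: &str) -> Result<f64, String> {
    Ok(0.0) // stub: would run the query in-process
}

fn execute_remote_query(_q: &str) -> Result<f64, String> {
    Ok(0.0) // stub: would forward the query to a querier
}
```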


129-149: LGTM! Remote query execution properly structured.

The function correctly constructs the query request with appropriate parameters and handles errors properly. Good separation of concerns with the result conversion delegated to a separate function.
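A sketch of what the remote arm might look like with reqwest (the endpoint path, body shape, and converter are assumptions for illustration):

```rust
use serde_json::json;

/// Stub for the scalar conversion described in the next comment.
fn json_number_to_f64(v: &serde_json::Value) -> Result<f64, String> {
    v.as_f64().ok_or_else(|| format!("non-numeric result: {v}"))
}

/// Forward the alert query to a querier node and reduce the JSON response
/// to a single f64. Requires reqwest with the "json" feature.
async fn execute_remote_query(base_url: &str, query: &str) -> Result<f64, String> {
    let body = json!({ "query": query, "startTime": "5h", "endTime": "now" });
    let resp = reqwest::Client::new()
        .post(format!("{base_url}/api/v1/query"))
        .json(&body)
        .send()
        .await
        .map_err(|e| e.to_string())?;
    let value: serde_json::Value = resp.json().await.map_err(|e| e.to_string())?;
    json_number_to_f64(&value)
}
```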


151-164: LGTM! Comprehensive numeric type conversion.

The function properly handles all common numeric JSON types (f64, i64, u64) with appropriate error handling for non-numeric results.
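For reference, a minimal version of that coercion over serde_json's three number representations (the function name is illustrative):

```rust
use serde_json::Value;

/// Coerce a scalar JSON query result into f64.
fn json_number_to_f64(v: &Value) -> Result<f64, String> {
    match v {
        Value::Number(n) => n
            .as_f64() // covers f64, and i64/u64 values representable as f64
            .or_else(|| n.as_i64().map(|i| i as f64)) // defensive fallbacks
            .or_else(|| n.as_u64().map(|u| u as f64))
            .ok_or_else(|| format!("unrepresentable number: {n}")),
        other => Err(format!("expected numeric result, got {other}")),
    }
}
```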


166-175: LGTM! Condition evaluation logic is correct.

All comparison operators are properly implemented with appropriate floating-point comparisons.
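The whole evaluation step is a pure function over (operator, actual, threshold); a sketch with an illustrative operator enum:

```rust
/// Illustrative comparison operators for threshold alerts.
enum AlertOperator {
    GreaterThan,
    GreaterThanOrEqual,
    LessThan,
    LessThanOrEqual,
    Equal,
    NotEqual,
}

fn evaluate_condition(op: &AlertOperator, actual: f64, expected: f64) -> bool {
    match op {
        AlertOperator::GreaterThan => actual > expected,
        AlertOperator::GreaterThanOrEqual => actual >= expected,
        AlertOperator::LessThan => actual < expected,
        AlertOperator::LessThanOrEqual => actual <= expected,
        // Epsilon-based equality avoids false negatives from float rounding.
        AlertOperator::Equal => (actual - expected).abs() < f64::EPSILON,
        AlertOperator::NotEqual => (actual - expected).abs() >= f64::EPSILON,
    }
}
```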


177-205: LGTM! Alert state update properly reflects the new model.

The updated message format with threshold configuration and query information is more informative than the previous aggregate-based messaging. State transition logic is correctly preserved.


237-366: LGTM! Filter string generation correctly handles supported operations.

The function properly:

  • Restricts support to AND operator only (as intended)
  • Escapes special characters in LIKE operations to prevent SQL injection (see the sketch after this list)
  • Handles multiple value types with appropriate formatting
  • Provides clear error messages for unsupported operations

The ValueType enum implementation is comprehensive and well-structured.
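The LIKE escaping called out above might look like this minimal sketch (the function name and escape set are assumptions):

```rust
/// Escape a user-supplied value for embedding in a SQL LIKE pattern so that
/// backslashes, quotes, and the `%`/`_` wildcards are treated literally.
fn escape_like_pattern(value: &str) -> String {
    value
        .replace('\\', "\\\\") // escape the escape character first
        .replace('\'', "''")   // standard SQL single-quote doubling
        .replace('%', "\\%")
        .replace('_', "\\_")
}

// escape_like_pattern("100%_done") == "100\\%\\_done"
```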

coderabbitai bot previously approved these changes Jul 24, 2025
coderabbitai bot previously approved these changes Jul 25, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/alerts/mod.rs (1)

94-103: Version enum properly configured for v2 migration.

The default to V2 and updated From implementation correctly support the migration from v1 to v2 alerts. As previously discussed, server-controlled versioning makes the fallback approach appropriate.
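A self-contained sketch of that versioning shape (names illustrative):

```rust
/// Alert config version; unknown or missing values fall back to the current
/// version because the server, not the client, controls versioning.
#[derive(Debug, Default, PartialEq)]
enum AlertVersion {
    V1,
    #[default]
    V2,
}

impl From<&str> for AlertVersion {
    fn from(s: &str) -> Self {
        match s {
            "v1" => AlertVersion::V1,
            _ => AlertVersion::V2, // "v2" and anything unrecognized
        }
    }
}
```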

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2cfed25 and d3ce8b1.

📒 Files selected for processing (1)
  • src/alerts/mod.rs (15 hunks)
🔇 Additional comments (18)
src/alerts/mod.rs (18)

20-61: LGTM! Good import organization and helper struct design.

The new imports support the query-based alert system and migration functionality. The BasicAlertFields helper struct provides a clean way to extract common fields during v1 to v2 migration.


65-65: Alert version successfully updated to v2.

The version constant change aligns with the migration strategy to make v2 the current version for new alerts.


86-86: Good optimization with boxed AlertConfig.

Boxing the AlertConfig reduces stack allocation overhead for the enum variant, which is beneficial given the complex nested structure of alert configurations.


398-398: Clear and descriptive struct name.

Renaming to ThresholdConfig better describes the purpose compared to the previous aggregate-focused naming.


526-531: Alert request structure successfully simplified.

The new structure with query and threshold_config is much cleaner than the previous aggregate-based approach, making alert creation more intuitive.
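A minimal sketch of the simplified request shape; the field names mirror the camelCase payload, while the exact Rust types are assumptions:

```rust
use serde::Deserialize;

#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct ThresholdConfig {
    operator: String, // e.g. ">"
    value: f64,
}

#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct AlertRequest {
    severity: String,
    title: String,
    alert_type: String,
    query: String, // a full SQL query replaces stream + aggregate config
    threshold_config: ThresholdConfig,
    targets: Vec<String>,
}
```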


563-565: Alert configuration aligned with new query-based model.

The configuration fields properly reflect the transition to SQL query-based alerts with threshold evaluation.


574-606: Comprehensive migration orchestration.

The migration function properly orchestrates the conversion from v1 to v2 format, with good error handling and automatic persistence of migrated alerts.


727-777: Robust WHERE clause generation with proper type handling.

The WHERE clause building logic properly handles schema-based type conversion and SQL injection prevention through proper escaping.
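The core of such a builder is joining per-column conditions with quoting and escaping applied; a simplified sketch that treats every value as a string literal (the real code formats by schema type, per the next comment):

```rust
/// Illustrative migrated condition: column, SQL operator, raw value.
struct Condition {
    column: String,
    operator: String, // e.g. "=", "!=", ">"
    value: String,
}

/// Join conditions with AND, backtick-quoting identifiers and doubling
/// single quotes inside values to prevent quote injection.
fn build_where_clause(conditions: &[Condition]) -> String {
    conditions
        .iter()
        .map(|c| {
            let escaped = c.value.replace('\'', "''");
            format!("`{}` {} '{}'", c.column, c.operator, escaped)
        })
        .collect::<Vec<_>>()
        .join(" AND ")
}
```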


854-929: Comprehensive data type conversion with good error handling.

The type conversion logic properly handles various Arrow data types with appropriate SQL formatting. The error messages provide good context for debugging migration issues.
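A sketch of the per-type literal formatting, assuming Arrow's DataType and showing only a few representative cases:

```rust
use arrow_schema::DataType;

/// Format a raw string value as a SQL literal according to the column's
/// Arrow type; unsupported types surface a descriptive migration error.
fn to_sql_literal(dt: &DataType, raw: &str) -> Result<String, String> {
    match dt {
        DataType::Int64 | DataType::Float64 => Ok(raw.to_string()),
        DataType::Boolean => Ok(raw.to_ascii_lowercase()),
        DataType::Utf8 => Ok(format!("'{}'", raw.replace('\'', "''"))),
        other => Err(format!("unsupported type for migration: {other:?}")),
    }
}
```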


1039-1046: Alert modification updated for new structure.

The modification function properly updates the new query-based fields while maintaining validation of target IDs.


1080-1101: Comprehensive query validation logic.

The validation ensures:

  • Query is not empty
  • Stream names can be resolved
  • User has proper authorization
  • Query is an aggregate that returns a single value

This provides robust validation for the new query-based model.


1104-1139: Smart aggregate query detection using logical plan analysis.

Using DataFusion's logical plan analysis to determine if a query is aggregate is an excellent approach. The recursive checking handles complex query structures properly.
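The recursion can be this small, assuming DataFusion's LogicalPlan and its inputs() accessor (version-dependent details omitted):

```rust
use datafusion::logical_expr::LogicalPlan;

/// True if any node in the plan tree is an Aggregate, so a Projection or
/// Limit wrapped around an aggregate still qualifies.
fn is_logical_plan_aggregate(plan: &LogicalPlan) -> bool {
    matches!(plan, LogicalPlan::Aggregate(_))
        || plan
            .inputs()
            .iter()
            .any(|child| is_logical_plan_aggregate(child))
}
```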


1284-1317: Intelligent version detection and migration.

The logic properly detects v1 alerts through multiple indicators:

  • Explicit version field
  • Missing query field
  • Presence of stream field

This covers various scenarios where v1 alerts might exist.
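Version sniffing of this kind only needs the raw JSON; a sketch (helper name illustrative):

```rust
use serde_json::Value;

/// Heuristics for detecting a v1 alert before deserialization: an explicit
/// v1 version tag, a missing `query` field, or a legacy `stream` field.
fn is_v1_alert(json: &Value) -> bool {
    json.get("version").and_then(Value::as_str) == Some("v1")
        || json.get("query").is_none()
        || json.get("stream").is_some()
}
```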


1320-1343: Robust alert task creation with retry logic.

The boxed AlertConfig usage and retry mechanism ensure alerts are properly loaded even if there are temporary issues with the task channel.
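One way to shape such a retry, sketched with tokio's bounded mpsc channel (the policy and names are illustrative):

```rust
use std::time::Duration;
use tokio::sync::mpsc::{error::TrySendError, Sender};

/// Retry enqueueing a task while the channel is momentarily full, with a
/// linear backoff; a closed channel is a hard failure.
async fn send_with_retry<T>(tx: &Sender<T>, mut task: T, attempts: u32) -> Result<(), String> {
    for n in 0..attempts {
        match tx.try_send(task) {
            Ok(()) => return Ok(()),
            Err(TrySendError::Closed(_)) => return Err("task channel closed".into()),
            Err(TrySendError::Full(t)) => {
                task = t; // take the task back and retry after a short pause
                tokio::time::sleep(Duration::from_millis(50 * u64::from(n + 1))).await;
            }
        }
    }
    Err("task channel still full after retries".into())
}
```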


1222-1223: Appropriate error variant for query validation.

The InvalidAlertQuery error provides clear feedback for query validation failures in the new alert system.


504-518: Clean logical operator enum for SQL generation.

The enum properly supports AND/OR operations needed for WHERE clause construction during migration, with appropriate Display formatting.
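A self-contained sketch of that enum and its Display impl:

```rust
use std::fmt;

/// Logical connective used when stitching migrated conditions into SQL.
enum LogicalOperator {
    And,
    Or,
}

impl fmt::Display for LogicalOperator {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LogicalOperator::And => write!(f, "AND"),
            LogicalOperator::Or => write!(f, "OR"),
        }
    }
}
```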


1357-1357: Authorization properly updated for query-based alerts.

The user authorization now correctly uses the alert's query field, which is cleaner than the previous aggregate-based approach.


1444-1444: Consistent boxing in task creation.

The boxed AlertConfig usage is consistent with the AlertTask enum change and provides memory optimization.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/alerts/mod.rs (1)

574-1031: Well-implemented migration logic with comprehensive error handling.

The v1 to v2 migration logic is thorough and well-structured:

  • Proper separation of concerns with helper functions
  • Comprehensive error handling with informative messages
  • Safe SQL query building with proper column quoting and string escaping
  • Robust data type conversion based on schema information

Consider caching schema information during migration to avoid repeated fetches for the same stream, which could improve performance during startup with many alerts.
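The suggested cache could be as simple as a per-migration HashMap keyed by stream name; a sketch with placeholder schema types:

```rust
use std::collections::HashMap;

/// Placeholder for the stream schema and its (possibly remote) fetcher.
#[derive(Clone)]
struct Schema;

fn fetch_schema(_stream: &str) -> Result<Schema, String> {
    Ok(Schema) // stub: would hit storage/metadata once per stream
}

/// Fetch each stream's schema at most once while migrating a batch of alerts.
fn schema_for<'a>(
    cache: &'a mut HashMap<String, Schema>,
    stream: &str,
) -> Result<&'a Schema, String> {
    if !cache.contains_key(stream) {
        let schema = fetch_schema(stream)?;
        cache.insert(stream.to_string(), schema);
    }
    Ok(&cache[stream])
}
```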

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d3ce8b1 and a2e93f3.

📒 Files selected for processing (1)
  • src/alerts/mod.rs (15 hunks)
🔇 Additional comments (6)
src/alerts/mod.rs (6)

65-65: LGTM: Version updates are consistent.

The version constant update to "v2", default enum variant change to V2, and support for parsing "v2" strings are all correctly implemented and aligned with the migration strategy.

Also applies to: 94-95, 102-103


526-526: LGTM: Structural changes align with query-based model.

The transition from stream/aggregates to query/threshold_config fields is consistent across AlertRequest and AlertConfig structures, properly supporting the new query-based alert model.

Also applies to: 528-528, 544-544, 546-546, 563-563, 565-565


1049-1139: Comprehensive query validation with proper security checks.

The validation logic effectively covers all necessary aspects:

  • Query non-emptiness and stream resolution
  • Distributed execution compatibility
  • User authorization verification
  • Aggregate query validation using DataFusion's logical plan analysis

The recursive logical plan analysis in is_logical_plan_aggregate correctly identifies various aggregate query patterns.


1256-1347: Seamless migration during alert loading.

The loading logic effectively handles the transition from v1 to v2:

  • Smart version detection using JSON parsing before deserialization
  • Transparent migration of v1 alerts with proper error handling
  • Graceful fallback behavior for alerts without version fields
  • Comprehensive logging for debugging migration issues

The approach ensures backward compatibility while transitioning to the new format.


86-86: Good optimization: Boxing AlertConfig reduces memory overhead.

Boxing AlertConfig in AlertTask::Create is a sound optimization that:

  • Reduces stack allocation for the large struct
  • Minimizes copy overhead when passing through channels
  • Is consistently applied throughout the codebase

Also applies to: 1322-1322, 1331-1331, 1444-1444


1222-1223: Clean addition of specific error type for query validation.

The new InvalidAlertQuery error variant provides clear, specific error handling for query validation failures, improving the user experience and debugging capabilities.

Also applies to: 1245-1245

@nitisht nitisht merged commit 7017c28 into parseablehq:main Jul 25, 2025
13 checks passed