-
Notifications
You must be signed in to change notification settings - Fork 117
Feature: mimic the qualifire integration in python sdk #84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feature: mimic the qualifire integration in python sdk #84
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. Summary by CodeRabbitRelease Notes
WalkthroughThe PR adds Qualifire integration to the TypeScript SDK by introducing a new QualifireClient utility for reporting evaluation summaries to the Qualifire API, new TypeScript interfaces for structured summary payloads and reporting requests, and minor updates to Python and Go backend components to support scenario type tracking and field rendering. Changes
Sequence DiagramsequenceDiagram
participant Caller
participant QualifireClient
participant Converter
participant API as Qualifire API
Caller->>QualifireClient: reportSummaryToQualifire(results, request, options)
Note over QualifireClient: Log if logger provided
QualifireClient->>QualifireClient: Validate API key & URL
QualifireClient->>Converter: convertWithStructuredSummary()
Converter-->>QualifireClient: QualifireReportPayload
Note over QualifireClient: Set 30s AbortController timeout
QualifireClient->>API: POST /evaluate with JSON + auth header
alt Success (2xx)
API-->>QualifireClient: Response OK
QualifireClient->>QualifireClient: Clear timeout
QualifireClient-->>Caller: Promise resolves
else Failure (non-2xx)
API-->>QualifireClient: Error response
QualifireClient->>QualifireClient: Clear timeout
QualifireClient-->>Caller: Promise rejects with error text
else Timeout (>30s)
QualifireClient-->>Caller: Promise rejects with timeout error
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings, 1 inconclusive)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (2)
packages/sdk/src/qualifire.ts (2)
21-21: Consider making logging configurable.Hardcoded
console.infoin library code may clutter user logs. Consider accepting an optional logger callback or making logging opt-in.Example approach:
export interface QualifireClientOptions { logger?: (message: string) => void; } public static async reportSummaryToQualifire( evaluationResults: EvaluationResult, request: ReportSummaryRequest, options?: QualifireClientOptions ): Promise<void> { options?.logger?.("Reporting summary to Qualifire"); // ... }
28-35: Add timeout and improve error handling.The
fetchcall lacks a timeout, which can cause requests to hang indefinitely. Additionally, network failures (connection refused, DNS errors) will throw generic errors without helpful context.+ const controller = new AbortController(); + const timeoutId = setTimeout(() => controller.abort(), 30000); // 30 second timeout + + try { const response = await fetch(`${request.qualifireUrl}/api/rogue/v1/report`, { method: "POST", headers: { "Content-Type": "application/json", "X-qualifire-key": request.qualifireApiKey, }, body: JSON.stringify(apiEvaluationResult), + signal: controller.signal, }); + clearTimeout(timeoutId); if (!response.ok) { const errText = await response.text(); throw new Error( `Qualifire report failed: ${response.status} ${response.statusText} - ${errText}` ); } + } catch (error) { + clearTimeout(timeoutId); + if (error instanceof Error && error.name === 'AbortError') { + throw new Error('Qualifire report timed out after 30 seconds'); + } + throw error; + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/sdk/src/qualifire.ts(1 hunks)packages/sdk/src/types.ts(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
packages/sdk/src/types.ts (1)
sdks/python/rogue_sdk/types.py (1)
ReportSummaryRequest(501-510)
packages/sdk/src/qualifire.ts (1)
packages/sdk/src/types.ts (2)
EvaluationResult(61-65)ReportSummaryRequest(121-128)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
♻️ Duplicate comments (1)
packages/sdk/src/types.ts (1)
128-136: MakestructuredSummaryoptional to match Python SDK.The Python SDK defines
structured_summaryasOptional[StructuredSummary] = None, but this TypeScript version makes it required. This inconsistency could break when the field is not provided.Apply this diff:
export interface ReportSummaryRequest { job_id: string, qualifireUrl: string; qualifireApiKey: string; - structuredSummary: StructuredSummary; + structuredSummary?: StructuredSummary; deepTest: boolean; startTime: string; judgeModel?: string; }
🧹 Nitpick comments (2)
packages/sdk/src/types.ts (1)
121-126: Typedetailed_breakdownmore precisely.The
detailed_breakdownfield uses a genericobject[]type, but the Go implementation (packages/tui/internal/tui/evaluation.go:95-104) shows it has a specific structure withscenario,status, andoutcomefields.Define a specific interface for detailed breakdown:
+export interface DetailedBreakdownItem { + scenario: string; + status: string; + outcome: string; +} + export interface StructuredSummary { overall_summary: string, key_findings: string[], recommendations: string[], - detailed_breakdown: object[] + detailed_breakdown: DetailedBreakdownItem[] }packages/sdk/src/quatifier.ts (1)
42-42: Check for undefined before accessingoptions.If
optionsis undefined,options?.logger?.(...)is safe, but the function signature doesn't indicate thatoptionsis optional. Either make it required or mark it as optional in the signature.public static async reportSummaryToQualifire( evaluationResults: EvaluationResult, request: ReportSummaryRequest, - options: QualifireClientOptions + options?: QualifireClientOptions ): Promise<void> {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/sdk/src/quatifier.ts(1 hunks)packages/sdk/src/types.ts(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
packages/sdk/src/types.ts (2)
packages/tui/internal/tui/evaluation.go (1)
StructuredSummary(96-105)sdks/python/rogue_sdk/types.py (2)
StructuredSummary(444-450)ReportSummaryRequest(501-510)
packages/sdk/src/quatifier.ts (1)
packages/sdk/src/types.ts (3)
EvaluationResult(61-65)StructuredSummary(121-126)ReportSummaryRequest(128-136)
packages/sdk/src/quatifier.ts
Outdated
| ); | ||
|
|
||
| const controller = new AbortController(); | ||
| const timeoutId = setTimeout(() => controller.abort(), 3000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix timeout inconsistency: 3 seconds vs "30 seconds" error message.
Line 52 sets a 3-second timeout (3000ms), but line 75 reports "30 seconds" in the error message.
Apply this diff:
- throw new Error('Qualifire report timed out after 30 seconds');
+ throw new Error('Qualifire report timed out after 3 seconds');Also applies to: 75-75
🤖 Prompt for AI Agents
In packages/sdk/src/quatifier.ts around lines 52 and 75, the timeout is set to
3000ms (3 seconds) but the error message says "30 seconds"; update the timeout
to 30000 (30,000ms) at line 52 so it matches the "30 seconds" message at line 75
(or alternatively change the error message to "3 seconds" if the intended
timeout is 3s) and ensure both the numeric timeout and the textual message
remain consistent.
packages/sdk/src/quatifier.ts
Outdated
| `Qualifire report failed: ${response.status} ${response.statusText} - ${errText}` | ||
| ); | ||
| } | ||
| clearTimeout(timeoutId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add missing semicolon.
- clearTimeout(timeoutId)
+ clearTimeout(timeoutId);🤖 Prompt for AI Agents
In packages/sdk/src/quatifier.ts around line 71, the statement
"clearTimeout(timeoutId)" is missing a trailing semicolon; add the semicolon to
make it "clearTimeout(timeoutId);" and run the project's formatter/linter to
ensure consistent style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (3)
packages/sdk/src/types.ts (2)
121-126: Specify the structure ofdetailed_breakdownto match the Go SDK.The Go SDK defines
detailed_breakdownwith a specific nested structure containingscenario,status, andoutcomefields. Usingobject[]is too permissive and loses type safety.Consider defining a more specific type:
+export interface DetailedBreakdownItem { + scenario: string; + status: string; + outcome: string; +} + export interface StructuredSummary { overall_summary: string, key_findings: string[], recommendations: string[], - detailed_breakdown: object[] + detailed_breakdown: DetailedBreakdownItem[] }Based on relevant code snippets from
packages/tui/internal/tui/evaluation.go.
121-136: Minor: Inconsistent trailing commas.Lines 122-125 use trailing commas after each property, but this style is inconsistent (line 135 has a semicolon, others have commas). While both are valid TypeScript, consistent formatting improves readability.
Consider using a consistent delimiter throughout both interfaces (all commas or all semicolons, with or without trailing delimiter).
packages/sdk/src/qualifire.ts (1)
51-78: Consider using afinallyblock forclearTimeout.Currently,
clearTimeout(timeoutId)is called in both the try block (line 71) and catch block (line 73). While this works, using afinallyblock would be cleaner and ensure the timeout is always cleared regardless of the execution path.Refactor to use finally:
try { const response = await fetch(`${request.qualifireUrl}/api/evaluation/evaluate`, { method: "POST", headers: { "Content-Type": "application/json", "X-Qualifire-API-Key": request.qualifireApiKey, }, body: JSON.stringify(apiEvaluationResult), signal: controller.signal }); if (!response.ok) { const errText = await response.text(); throw new Error( `Qualifire report failed: ${response.status} ${response.statusText} - ${errText}` ); } - clearTimeout(timeoutId); } catch (error) { - clearTimeout(timeoutId); if (error instanceof Error && error.name === 'AbortError') { throw new Error('Qualifire report timed out after 3 seconds'); } throw error; + } finally { + clearTimeout(timeoutId); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/sdk/src/qualifire.ts(1 hunks)packages/sdk/src/types.ts(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
packages/sdk/src/qualifire.ts (1)
packages/sdk/src/types.ts (3)
EvaluationResult(61-65)StructuredSummary(121-126)ReportSummaryRequest(128-136)
packages/sdk/src/types.ts (2)
packages/tui/internal/tui/evaluation.go (1)
StructuredSummary(96-105)sdks/python/rogue_sdk/types.py (2)
StructuredSummary(444-450)ReportSummaryRequest(501-510)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/sdk/src/qualifire.ts(1 hunks)packages/sdk/src/types.ts(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/sdk/src/types.ts
🧰 Additional context used
🧬 Code graph analysis (1)
packages/sdk/src/qualifire.ts (1)
packages/sdk/src/types.ts (3)
EvaluationResult(61-65)StructuredSummary(121-126)ReportSummaryRequest(128-136)
🔇 Additional comments (1)
packages/sdk/src/qualifire.ts (1)
12-14: LGTM!The
QualifireClientOptionsinterface is well-defined with an optional logger callback, providing flexibility for callers to inject custom logging behavior.
| import { EvaluationResult, ReportSummaryRequest, StructuredSummary } from "./types"; | ||
|
|
||
| interface QualifireReportPayload { | ||
| job_id: string; | ||
| evaluations: EvaluationResult; | ||
| structured: StructuredSummary | null; | ||
| deep_test: boolean; | ||
| start_time: string; | ||
| judge_model: string | null; | ||
| } | ||
|
|
||
| export interface QualifireClientOptions { | ||
| logger?: (message: string) => void; | ||
| } | ||
|
|
||
| export class QualifireClient { | ||
| private static convertWithStructuredSummary( | ||
| evaluationResults: EvaluationResult, | ||
| request: ReportSummaryRequest | ||
| ): QualifireReportPayload { | ||
| return { | ||
| job_id: request.job_id, | ||
| evaluations: evaluationResults, | ||
| structured: request.structuredSummary || null, | ||
| deep_test: request.deepTest, | ||
| start_time: request.startTime, | ||
| judge_model: request.judgeModel || null, | ||
| }; | ||
| } | ||
|
|
||
| /** | ||
| * Reports evaluation summary to Qualifire. | ||
| * | ||
| * @param evaluationResults - The evaluation results to report | ||
| * @param request - Configuration including Qualifire URL, API key, and metadata | ||
| * @throws {Error} If the API request fails or returns a non-2xx status | ||
| * @returns A promise that resolves when the report is successfully submitted | ||
| */ | ||
| public static async reportSummaryToQualifire( | ||
| evaluationResults: EvaluationResult, | ||
| request: ReportSummaryRequest, | ||
| options?: QualifireClientOptions | ||
| ): Promise<void> { | ||
| options?.logger?.("Reporting summary to Qualifire"); | ||
|
|
||
| const apiEvaluationResult = this.convertWithStructuredSummary( | ||
| evaluationResults, | ||
| request | ||
| ); | ||
|
|
||
| const controller = new AbortController(); | ||
| const timeoutId = setTimeout(() => controller.abort(), 3000); | ||
|
|
||
| try { | ||
| const response = await fetch(`${request.qualifireUrl}/api/evaluation/evaluate`, { | ||
| method: "POST", | ||
| headers: { | ||
| "Content-Type": "application/json", | ||
| "X-Qualifire-API-Key": request.qualifireApiKey, | ||
| }, | ||
| body: JSON.stringify(apiEvaluationResult), | ||
| signal: controller.signal | ||
| }); | ||
|
|
||
| if (!response.ok) { | ||
| const errText = await response.text(); | ||
| throw new Error( | ||
| `Qualifire report failed: ${response.status} ${response.statusText} - ${errText}` | ||
| ); | ||
| } | ||
| clearTimeout(timeoutId); | ||
| } catch (error) { | ||
| clearTimeout(timeoutId); | ||
| if (error instanceof Error && error.name === 'AbortError') { | ||
| throw new Error('Qualifire report timed out after 3 seconds'); | ||
| } | ||
| throw error; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify duplicate implementation has been removed.
Past reviews flagged a duplicate QualifireClient implementation in packages/sdk/src/quatifier.ts. Ensure the duplicate file has been removed and all imports reference the canonical qualifire.ts module.
Run the following script to verify:
🏁 Script executed:
#!/bin/bash
# Check if quatifier.ts still exists and search for any references to it
# Search for quatifier.ts file
echo "=== Checking for quatifier.ts file ==="
fd -e ts quatifier
# Search for imports from quatifier module
echo "=== Checking for imports from quatifier module ==="
rg -n --type=ts "from ['\"].*quatifier['\"]"
# Search for any references to quatifier in the codebase
echo "=== Checking for any references to 'quatifier' ==="
rg -n --type=ts "quatifier"Length of output: 249
Remove duplicated quatifier.ts and consolidate imports
The file packages/sdk/src/quatifier.ts remains, duplicating QualifireClient. Delete this file and update any imports or references (including doc comments) to use packages/sdk/src/qualifire.ts.
🤖 Prompt for AI Agents
In packages/sdk/src/qualifire.ts lines 1 to 80, remove the duplicate file
packages/sdk/src/quatifier.ts and update all imports/references/docs to point to
packages/sdk/src/qualifire.ts (correct filename spelling) across the repository;
search for any import paths or JSDoc references like "./quatifier" or
"quatifier" and replace them with "./qualifire" (preserve relative paths and
casing), then delete the duplicate quatifier.ts file and run tests/build to
ensure no unresolved imports remain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ignorant05 please fix this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
| interface QualifireReportPayload { | ||
| job_id: string; | ||
| evaluations: EvaluationResult; | ||
| structured: StructuredSummary | null; | ||
| deep_test: boolean; | ||
| start_time: string; | ||
| judge_model: string | null; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type mismatch: deep_test and start_time should be nullable.
The QualifireReportPayload interface defines deep_test: boolean and start_time: string as required fields. However, the source fields in ReportSummaryRequest are optional (deepTest?: boolean, startTime?: string). This creates a type mismatch where undefined values can be assigned to non-nullable fields in the payload, violating the type contract and potentially causing API errors.
Apply this diff to make the fields nullable:
interface QualifireReportPayload {
job_id: string;
evaluations: EvaluationResult;
structured: StructuredSummary | null;
- deep_test: boolean;
- start_time: string;
+ deep_test: boolean | null;
+ start_time: string | null;
judge_model: string | null;
}Then update the mapping in convertWithStructuredSummary:
return {
job_id: request.job_id,
evaluations: evaluationResults,
structured: request.structuredSummary || null,
- deep_test: request.deepTest,
- start_time: request.startTime,
+ deep_test: request.deepTest ?? null,
+ start_time: request.startTime ?? null,
judge_model: request.judgeModel || null,
};Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In packages/sdk/src/qualifire.ts around lines 3 to 10, the
QualifireReportPayload currently declares deep_test: boolean and start_time:
string but the source ReportSummaryRequest has deepTest?: boolean and
startTime?: string, causing a type mismatch; change the interface to make
deep_test: boolean | null and start_time: string | null, and then update
convertWithStructuredSummary to map optional fields defensively (deep_test:
source.deepTest ?? null and start_time: source.startTime ?? null) so undefined
from the source becomes null in the payload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ignorant05 I think this comment is correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
packages/sdk/src/qualifire.ts
Outdated
| const response = await fetch(`${request.qualifireUrl}/api/evaluation/evaluate`, { | ||
| method: "POST", | ||
| headers: { | ||
| "Content-Type": "application/json", | ||
| "X-Qualifire-API-Key": request.qualifireApiKey, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validate required fields before API call.
The method uses request.qualifireUrl and request.qualifireApiKey without validation, even though both are optional in ReportSummaryRequest. If either is undefined, the API call will fail with unclear error messages (malformed URL or invalid authentication header).
Add validation at the beginning of the method:
public static async reportSummaryToQualifire(
evaluationResults: EvaluationResult,
request: ReportSummaryRequest,
options?: QualifireClientOptions
): Promise<void> {
+ if (!request.qualifireUrl) {
+ throw new Error('qualifireUrl is required for reporting to Qualifire');
+ }
+ if (!request.qualifireApiKey) {
+ throw new Error('qualifireApiKey is required for reporting to Qualifire');
+ }
+
options?.logger?.("Reporting summary to Qualifire");📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const response = await fetch(`${request.qualifireUrl}/api/evaluation/evaluate`, { | |
| method: "POST", | |
| headers: { | |
| "Content-Type": "application/json", | |
| "X-Qualifire-API-Key": request.qualifireApiKey, | |
| public static async reportSummaryToQualifire( | |
| evaluationResults: EvaluationResult, | |
| request: ReportSummaryRequest, | |
| options?: QualifireClientOptions | |
| ): Promise<void> { | |
| if (!request.qualifireUrl) { | |
| throw new Error('qualifireUrl is required for reporting to Qualifire'); | |
| } | |
| if (!request.qualifireApiKey) { | |
| throw new Error('qualifireApiKey is required for reporting to Qualifire'); | |
| } | |
| options?.logger?.("Reporting summary to Qualifire"); | |
| const response = await fetch(`${request.qualifireUrl}/api/evaluation/evaluate`, { | |
| method: "POST", | |
| headers: { | |
| "Content-Type": "application/json", | |
| "X-Qualifire-API-Key": request.qualifireApiKey, | |
| }, | |
| // ...rest of payload... | |
| }); | |
| // ...error handling... | |
| } |
🤖 Prompt for AI Agents
In packages/sdk/src/qualifire.ts around lines 55 to 59, the code calls the
Qualifire API using request.qualifireUrl and request.qualifireApiKey without
validating them; add an early validation block that ensures both qualifireUrl
and qualifireApiKey are present and non-empty (e.g., if (!request.qualifireUrl
|| !request.qualifireApiKey) throw new Error("Qualifire URL and API key are
required")), and return or throw a clear, specific error before attempting the
fetch so the caller receives an immediate, actionable message instead of a
malformed request or header failure.
drorIvry
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good but there are some small issues I'd like to be fixed before merging it
packages/sdk/src/qualifire.ts
Outdated
| method: "POST", | ||
| headers: { | ||
| "Content-Type": "application/json", | ||
| "X-qualifire-key": request.qualifireApiKey, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ignorant05 I think this is the wrong header
| interface QualifireReportPayload { | ||
| job_id: string; | ||
| evaluations: EvaluationResult; | ||
| structured: StructuredSummary | null; | ||
| deep_test: boolean; | ||
| start_time: string; | ||
| judge_model: string | null; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ignorant05 I think this comment is correct
packages/sdk/src/quatifier.ts
Outdated
| ); | ||
|
|
||
| const controller = new AbortController(); | ||
| const timeoutId = setTimeout(() => controller.abort(), 3000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This timeout is way too short please change it to 30 seconds
| import { EvaluationResult, ReportSummaryRequest, StructuredSummary } from "./types"; | ||
|
|
||
| interface QualifireReportPayload { | ||
| job_id: string; | ||
| evaluations: EvaluationResult; | ||
| structured: StructuredSummary | null; | ||
| deep_test: boolean; | ||
| start_time: string; | ||
| judge_model: string | null; | ||
| } | ||
|
|
||
| export interface QualifireClientOptions { | ||
| logger?: (message: string) => void; | ||
| } | ||
|
|
||
| export class QualifireClient { | ||
| private static convertWithStructuredSummary( | ||
| evaluationResults: EvaluationResult, | ||
| request: ReportSummaryRequest | ||
| ): QualifireReportPayload { | ||
| return { | ||
| job_id: request.job_id, | ||
| evaluations: evaluationResults, | ||
| structured: request.structuredSummary || null, | ||
| deep_test: request.deepTest, | ||
| start_time: request.startTime, | ||
| judge_model: request.judgeModel || null, | ||
| }; | ||
| } | ||
|
|
||
| /** | ||
| * Reports evaluation summary to Qualifire. | ||
| * | ||
| * @param evaluationResults - The evaluation results to report | ||
| * @param request - Configuration including Qualifire URL, API key, and metadata | ||
| * @throws {Error} If the API request fails or returns a non-2xx status | ||
| * @returns A promise that resolves when the report is successfully submitted | ||
| */ | ||
| public static async reportSummaryToQualifire( | ||
| evaluationResults: EvaluationResult, | ||
| request: ReportSummaryRequest, | ||
| options?: QualifireClientOptions | ||
| ): Promise<void> { | ||
| options?.logger?.("Reporting summary to Qualifire"); | ||
|
|
||
| const apiEvaluationResult = this.convertWithStructuredSummary( | ||
| evaluationResults, | ||
| request | ||
| ); | ||
|
|
||
| const controller = new AbortController(); | ||
| const timeoutId = setTimeout(() => controller.abort(), 3000); | ||
|
|
||
| try { | ||
| const response = await fetch(`${request.qualifireUrl}/api/evaluation/evaluate`, { | ||
| method: "POST", | ||
| headers: { | ||
| "Content-Type": "application/json", | ||
| "X-Qualifire-API-Key": request.qualifireApiKey, | ||
| }, | ||
| body: JSON.stringify(apiEvaluationResult), | ||
| signal: controller.signal | ||
| }); | ||
|
|
||
| if (!response.ok) { | ||
| const errText = await response.text(); | ||
| throw new Error( | ||
| `Qualifire report failed: ${response.status} ${response.statusText} - ${errText}` | ||
| ); | ||
| } | ||
| clearTimeout(timeoutId); | ||
| } catch (error) { | ||
| clearTimeout(timeoutId); | ||
| if (error instanceof Error && error.name === 'AbortError') { | ||
| throw new Error('Qualifire report timed out after 3 seconds'); | ||
| } | ||
| throw error; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ignorant05 please fix this one
drorIvry
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please address the coderabbit issues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is being reviewed by Cursor Bugbot
Details
You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.
To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.
packages/sdk/src/qualifire.ts
Outdated
| evaluations: evaluationResults, | ||
| structured: request.structuredSummary || null, | ||
| deep_test: request.deepTest, | ||
| start_time: request.startTime, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Optional Fields Assigned to Required Fields
The convertWithStructuredSummary method assigns optional deepTest and startTime fields from ReportSummaryRequest directly to required fields in QualifireReportPayload. This can result in undefined values being sent for fields expecting boolean or string, leading to an invalid API payload.
packages/sdk/src/qualifire.ts
Outdated
| method: "POST", | ||
| headers: { | ||
| "Content-Type": "application/json", | ||
| "X-Qualifire-API-Key": request.qualifireApiKey, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Undefined API Key and URL Cause Fetch Failures
The qualifireApiKey and qualifireUrl fields are optional in ReportSummaryRequest but are used directly in the fetch call. If either is undefined, the API key header will be set to the string "undefined" and the URL will become "undefined/...", causing authentication failures and an invalid endpoint.
packages/sdk/src/qualifire.ts
Outdated
| const timeoutId = setTimeout(() => controller.abort(), 30000); | ||
|
|
||
| try { | ||
| const response = await fetch(`${request.qualifireUrl}/api/evaluation/evaluate`, { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2ca5a0f to
d9d3273
Compare
Already done from days ago. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
packages/sdk/src/types.ts (2)
121-126: Remove trailing commas and consider strengtheningdetailed_breakdowntype.The trailing commas after each property are inconsistent with the existing code style in this file (see lines 29-119 where interfaces use no trailing punctuation). Additionally,
object[]fordetailed_breakdownis loosely typed—the Go implementation shows it has a specific structure withscenario,status, andoutcomefields.Apply this diff to fix the style consistency:
export interface StructuredSummary { - overall_summary: string, - key_findings: string[], - recommendations: string[], - detailed_breakdown: object[] + overall_summary: string + key_findings: string[] + recommendations: string[] + detailed_breakdown: object[] }Optional enhancement: Define a more specific type for
detailed_breakdown:export interface DetailedBreakdownItem { scenario: string status: string outcome: string } export interface StructuredSummary { overall_summary: string key_findings: string[] recommendations: string[] detailed_breakdown: DetailedBreakdownItem[] }
128-136: Remove trailing punctuation for consistency.The mixed trailing punctuation (comma on line 129, semicolons on lines 130-136) is inconsistent with the existing code style in this file, where interfaces use no trailing punctuation after properties.
Apply this diff:
export interface ReportSummaryRequest { - job_id: string, - qualifireUrl?: string; - qualifireApiKey?: string; - structuredSummary?: StructuredSummary; - deepTest?: boolean; - startTime?: string; - judgeModel?: string; + job_id: string + qualifireUrl?: string + qualifireApiKey?: string + structuredSummary?: StructuredSummary + deepTest?: boolean + startTime?: string + judgeModel?: string }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/sdk/src/qualifire.ts(1 hunks)packages/sdk/src/types.ts(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/sdk/src/qualifire.ts
🧰 Additional context used
🧬 Code graph analysis (1)
packages/sdk/src/types.ts (2)
packages/tui/internal/tui/evaluation.go (1)
StructuredSummary(96-105)sdks/python/rogue_sdk/types.py (2)
StructuredSummary(444-450)ReportSummaryRequest(501-510)
|
I believe that your PR does not do what it should. We should:
I think in the context of this PR we should address passing the api-key to the rogue server, and in subsequent PRs address the imlementation in the typescript and python sdk to report directly from there. |
|
@yuval-qf that is actually wrong, i mean look at the issue and it's description it's just a mimicking task, not a redesigning task. you can actually create another issue for that if you want tho. |
Hi. Please ensure all communication follows our Code of Conduct. Let's keep the tone respectful as we collaborate on this. Regarding the implementation: Specifically, the PR needs to:
Please update the PR to use the correct API endpoint as needed to align with the existing SDK structure. |
* Add protocol and transport settings to tui * Fix mcp example command * Fix log_evaluation kwargs
Sorry for that, i actually forgot to fix the path (i used it just as a place holder till i see what the real path is and i thought that i changed it). As for the others i'm sure that i made my changes long ago, feel free to address me if you find anything wrong with the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
rogue/evaluator_agent/base_evaluator_agent.py (1)
362-371: Fix tool signature to avoid runtime TypeError (scenario_type currently required).The tool docs expose
_log_evaluation(scenario, context_id, evaluation_passed, reason), but the function requiresscenario_type. Calls following the docs will crash. Makescenario_typeoptional and type**kwargs. Also widenscenarioinput type. As per coding guidelines.- def _log_evaluation( - self, - scenario: dict[str, str], - context_id: str, - evaluation_passed: bool, - reason: str, - scenario_type: Optional[str], - **kwargs, - ) -> None: + def _log_evaluation( + self, + scenario: dict[str, Any] | str, + context_id: str, + evaluation_passed: bool, + reason: str, + scenario_type: Optional[str] = None, + **kwargs: Any, + ) -> None:packages/tui/internal/tui/keyboard_controller.go (1)
137-151: Bug: JudgeModel "/" input is unreachable.Outer condition limits to currentField <= 1, but switch handles case 3; case 3 never matches. Fix condition.
- if m.currentScreen == NewEvaluationScreen && m.evalState != nil && m.evalState.currentField <= 1 { + if m.currentScreen == NewEvaluationScreen && m.evalState != nil && + (m.evalState.currentField == 0 || m.evalState.currentField == 3) {packages/tui/internal/tui/eval_form_controller.go (1)
99-124: Backspace: AgentURL unreachable + panic at cursorPos==0
- Outer guard currentField > 0 blocks case 0, so backspacing AgentURL never works.
- Slicing with cursorPos-1 when cursorPos==0 panics.
- ParallelRuns case (index 6) is unrelated to this form and complicates logic.
Apply:
- if m.evalState.currentField > 0 { - switch m.evalState.currentField { + if m.evalState.currentField >= 0 { + switch m.evalState.currentField { case 0: // AgentURL - runes := []rune(m.evalState.AgentURL) - if m.evalState.cursorPos <= len(runes) { - m.evalState.AgentURL = string(runes[:m.evalState.cursorPos-1]) + string(runes[m.evalState.cursorPos:]) - m.evalState.cursorPos-- - } + pos := m.evalState.cursorPos + runes := []rune(m.evalState.AgentURL) + if pos > 0 && pos <= len(runes) { + m.evalState.AgentURL = string(runes[:pos-1]) + string(runes[pos:]) + m.evalState.cursorPos-- + } case 3: // JudgeModel - runes := []rune(m.evalState.JudgeModel) - if m.evalState.cursorPos <= len(runes) { - m.evalState.JudgeModel = string(runes[:m.evalState.cursorPos-1]) + string(runes[m.evalState.cursorPos:]) - m.evalState.cursorPos-- - } - case 6: // ParallelRuns (special handling for numbers) - if m.evalState.ParallelRuns >= 10 { - m.evalState.ParallelRuns /= 10 - m.evalState.cursorPos-- - } else if m.evalState.ParallelRuns > 0 { - m.evalState.ParallelRuns = 1 // Don't allow 0 - m.evalState.cursorPos = 0 - } + pos := m.evalState.cursorPos + runes := []rune(m.evalState.JudgeModel) + if pos > 0 && pos <= len(runes) { + m.evalState.JudgeModel = string(runes[:pos-1]) + string(runes[pos:]) + m.evalState.cursorPos-- + } } return m, nil }
♻️ Duplicate comments (1)
packages/sdk/src/qualifire.ts (1)
3-10: Make deep_test and start_time nullable to reflect optional inputs.Inputs are optional; forcing defaults may misrepresent data. Prefer nulls when absent and map accordingly.
interface QualifireReportPayload { job_id: string; evaluations: EvaluationResult; structured: StructuredSummary | null; - deep_test: boolean; - start_time: string; + deep_test: boolean | null; + start_time: string | null; judge_model: string | null; } return { job_id: request.job_id, evaluations: evaluationResults, structured: request.structuredSummary || null, - deep_test: request.deepTest ?? false, - start_time: request.startTime ?? new Date().toISOString(), + deep_test: request.deepTest ?? null, + start_time: request.startTime ?? null, judge_model: request.judgeModel || null, };Also applies to: 21-28
🧹 Nitpick comments (8)
packages/sdk/src/qualifire.ts (1)
63-91: Minor: move clearTimeout to finally for single-exit cleanup.Functionally equivalent but simpler.
- try { + try { const response = await fetch(endpoint, { method: "POST", headers: { "Content-Type": "application/json", "X-Qualifire-API-Key": apiKey, }, body: JSON.stringify(apiEvaluationResult), signal: controller.signal }); if (!response.ok) { const errText = await response.text(); throw new Error( `Qualifire report failed: ${response.status} ${response.statusText} - ${errText}` ); } - clearTimeout(timeoutId); } catch (error) { - clearTimeout(timeoutId); if (error instanceof Error && error.name === 'AbortError') { throw new Error('Qualifire report timed out after 30 seconds'); } throw error; + } finally { + clearTimeout(timeoutId); }packages/tui/internal/tui/common_controller.go (1)
203-214: Initialize protocol/transport from user config (follow TODO).Reading defaults from .rogue/user_config.json will improve UX and parity with Python SDK behavior.
packages/tui/internal/tui/evaluation.go (3)
426-432: Strengthen URL validation (scheme/host).url.Parse accepts many non-URLs. Validate scheme and host to avoid misfires.
- if _, err := url.Parse(serverURL); err != nil { + if u, err := url.Parse(serverURL); err != nil || u.Scheme == "" || u.Host == "" { return nil, nil, fmt.Errorf("invalid server url: %w", err) } - if _, err := url.Parse(agentURL); err != nil { + if u, err := url.Parse(agentURL); err != nil || u.Scheme == "" || u.Host == "" { return nil, nil, fmt.Errorf("invalid agent url: %w", err) }
19-25: AuthType string values: align with Python (“bearer_token”).Python uses AuthType.BEARER_TOKEN = "bearer_token". Consider matching to avoid backend mismatches even if unused now.
-const ( - AuthTypeNoAuth AuthType = "no_auth" - AuthTypeBearer AuthType = "bearer" - AuthTypeBasic AuthType = "basic" -) +const ( + AuthTypeNoAuth AuthType = "no_auth" + AuthTypeBearer AuthType = "bearer_token" + AuthTypeBasic AuthType = "basic" +)
295-356: State mutation from goroutine: consider routing via Tea messages.The goroutine appends to st.Events and updates st fields concurrently with UI reads. Route events through the Bubble Tea Update loop to avoid data races and ensure timely re-render.
packages/tui/internal/tui/eval_form_controller.go (3)
85-88: Comment drift: DeepTest indexCode checks index 4; comment says “now index 2”. Update comment to avoid confusion.
- if m.evalState.currentField == 4 { // DeepTest field is now index 2 + if m.evalState.currentField == 4 { // DeepTest field index 4
126-149: Dead code: ParallelRuns path is unreachable hereField index range is 0–5, yet default insertion handles case 6 (ParallelRuns) and uses clampToInt. Either wire ParallelRuns as a real field or drop this branch from the form handler to reduce confusion.
Option A (remove now, reintroduce when needed):
- case 6: // ParallelRuns (numeric only) - if s[0] >= '0' && s[0] <= '9' { - numStr := fmt.Sprintf("%d", m.evalState.ParallelRuns) - runes := []rune(numStr) - newNumStr := string(runes[:m.evalState.cursorPos]) + s + string(runes[m.evalState.cursorPos:]) - m.evalState.ParallelRuns = clampToInt(newNumStr) - m.evalState.cursorPos++ - }Option B: If ParallelRuns must stay, extract field indices into named constants and ensure navigation bounds include it.
37-49: Replace magic field indices with named constantsIndices 0..5 are spread across navigation and handlers. Define constants to prevent drift and off-by-one mistakes as fields evolve.
Add near imports:
+const ( + fieldAgentURL = iota + fieldProtocol + fieldTransport + fieldJudgeModel + fieldDeepTest + fieldStartButton +)Then replace literals: 0→fieldAgentURL, 1→fieldProtocol, 2→fieldTransport, 3→fieldJudgeModel, 4→fieldDeepTest, 5→fieldStartButton; and use currentField < fieldStartButton in the “down” bound.
Also applies to: 52-63, 84-97
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
.github/workflows/rogue.yml(2 hunks).gitignore(1 hunks).vscode/launch.json(1 hunks)examples/mcp/tshirt_store_langgraph_mcp/__main__.py(1 hunks)packages/sdk/src/qualifire.ts(1 hunks)packages/tui/internal/tui/common_controller.go(2 hunks)packages/tui/internal/tui/eval_form.go(5 hunks)packages/tui/internal/tui/eval_form_controller.go(4 hunks)packages/tui/internal/tui/eval_ui.go(4 hunks)packages/tui/internal/tui/evaluation.go(3 hunks)packages/tui/internal/tui/keyboard_controller.go(3 hunks)rogue/evaluator_agent/base_evaluator_agent.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format Python code with Black
Ensure code passes flake8 linting
Run mypy with the repository configuration for static typing
Run Bandit security checks using .bandit.yaml configuration
Use isort import conventions for import ordering
Add type hints to all function signatures
Follow PEP 8 naming (snake_case for variables/functions, PascalCase for classes)
Use try/except around code that may raise exceptions
Files:
rogue/evaluator_agent/base_evaluator_agent.pyexamples/mcp/tshirt_store_langgraph_mcp/__main__.py
🧬 Code graph analysis (7)
packages/sdk/src/qualifire.ts (1)
packages/sdk/src/types.ts (3)
EvaluationResult(61-65)StructuredSummary(121-126)ReportSummaryRequest(128-136)
packages/tui/internal/tui/eval_ui.go (2)
sdks/python/rogue_sdk/types.py (2)
Protocol(68-79)Transport(82-93)packages/tui/internal/tui/evaluation.go (1)
EvalScenario(46-50)
packages/tui/internal/tui/keyboard_controller.go (1)
packages/tui/internal/tui/eval_ui.go (3)
EvaluationViewState(29-56)ProtocolA2A(13-13)TransportHTTP(25-25)
packages/tui/internal/tui/evaluation.go (2)
packages/tui/internal/tui/eval_ui.go (2)
Protocol(10-10)Transport(17-17)sdks/python/rogue_sdk/types.py (3)
Protocol(68-79)Transport(82-93)AuthType(23-48)
packages/tui/internal/tui/common_controller.go (1)
packages/tui/internal/tui/eval_ui.go (3)
EvaluationViewState(29-56)ProtocolA2A(13-13)TransportHTTP(25-25)
packages/tui/internal/tui/eval_form_controller.go (1)
packages/tui/internal/components/llm_config_dialog.go (1)
NewLLMConfigDialog(183-270)
packages/tui/internal/tui/eval_form.go (1)
packages/tui/internal/styles/styles.go (3)
NewStyle(45-47)TextMuted(22-22)Primary(13-13)
🔇 Additional comments (23)
.gitignore (1)
93-95: Verify the debug artifact pattern matches Go debugger output.The pattern
packages/tui/cmd/rogue/__debug*appears designed to ignore Go debugger artifacts (e.g.,__debug_binfrom Delve). Confirm that this pattern correctly captures all debug artifacts generated during local development with the Go debugger.Can you verify:
- What debug artifacts are actually generated when debugging the Rogue TUI with Go?
- Whether the
__debug*pattern correctly matches them?- Whether the scope should be broader (e.g., recursive
**/__debug*) or remain directory-specific?Reference: The pattern placement is consistent with the file structure, but its effectiveness depends on the actual artifact naming conventions of your Go debugger setup.
.vscode/launch.json (1)
66-73: LGTM: useful TUI debug config added.No impact on runtime or security.
examples/mcp/tshirt_store_langgraph_mcp/__main__.py (1)
15-21: LGTM: input validation via Click Choice.Keeps CLI robust and matches the typed Literal union.
.github/workflows/rogue.yml (2)
4-5: Trigger change to pull_request is safer than pull_request_target.Good move for PR security.
14-16: Nice guard against untrusted forks.Job only runs for same-repo PRs or non-PR events. Sensible hardening.
packages/sdk/src/qualifire.ts (1)
1-1: Duplicatequatifier.tssuccessfully removed—no stray imports or references detected.The verification confirms the codebase is clean: no
quatifier.tsfile exists, and no imports or references to it remain.packages/tui/internal/tui/eval_form.go (4)
40-85: Text field renderer: solid UX with inline cursor.Nice rune-aware cursor insertion and active-state styling.
86-124: Dropdown field renderer looks good.Clear focus affordance and indicators; consistent with text/toggle fields.
125-161: Toggle field renderer: LGTM.Consistent styling and focus treatment.
174-174: Start button index (5) is consistently used across all handlers.Verification confirms no issues:
eval_form.go:174renders with index 5 ✓keyboard_controller.go:272handles enter with index 5 ✓- Navigation bounds (
eval_form_controller.go:38) confirm max field is 5 ✓- All references to other indices (3 for Judge LLM, 4 for DeepTest) are also consistent ✓
packages/tui/internal/tui/evaluation.go (2)
35-44: JSON contract additions: protocol/transport look correct.Fields and tags align with the Python-side types; good step toward parity.
446-454: Hard-coded InterviewMode=true: verify matches server expectations.If interview mode isn’t always desired, expose it in UI or derive from scenarios.
packages/tui/internal/tui/keyboard_controller.go (2)
80-91: Defaults for protocol/transport: LGTM.Sensible defaults (a2a/http) with cursor at Agent URL end.
272-276: Enter handling with new indices: OK.Start button (5) and Judge LLM (3) paths are correct and match UI.
packages/tui/internal/tui/eval_ui.go (5)
10-27: Protocol/Transport enums match Python values.String values align with Python SDK; good for cross-SDK parity.
30-37: State extensions for protocol/transport: LGTM.Fields are clearly named and integrated across UI and evaluation flow.
97-97: Wiring protocol/transport into StartEvaluation: LGTM.End-to-end parameter propagation is correct.
147-193: Protocol cycling resets transport to a valid default.Correctly rebinds transport to a valid option on protocol change.
195-223: Transport cycling constrained to valid set.Wrap-around logic is correct.
packages/tui/internal/tui/common_controller.go (1)
272-276: Field indices verified and correctly aligned across all handlers.The verification confirms that all enter handlers consistently use field indices 3 and 5:
- Field 5: Start button (keyboard_controller.go:272, eval_form.go:174)
- Field 3: Judge LLM field (keyboard_controller.go:276, eval_form_controller.go:92)
The code in common_controller.go correctly updates JudgeModel when on the evaluation screen, maintaining consistency with the form layout.
packages/tui/internal/tui/eval_form_controller.go (3)
129-135: Insertion paths look goodRune-safe insertion for AgentURL and JudgeModel is consistent.
Confirm that cursorPos is always clamped to [0, runeCount] when focusing text fields. If not, add a small clamp to prevent out-of-range inserts in future edits.
Also applies to: 136-138
90-96: LLM config dialog trigger looks correctTab key opens dialog only on JudgeModel; wiring matches components.NewLLMConfigDialog signature.
6-8: Import path is correct—no action neededThe import
github.com/rogue/tui/internal/componentsis valid. The module path declared inpackages/tui/go.modisgithub.com/rogue/tui, and thecomponentspackage resides atpackages/tui/internal/components/, making the import path correct.Likely an incorrect or invalid review comment.
| const apiKey = request.qualifireApiKey; | ||
| const baseUrl = request.qualifireUrl ?? "https://api.qualifire.com"; | ||
| const endpoint = `${baseUrl}/api/v1/evaluations`; | ||
|
|
||
| if (!apiKey) { | ||
| throw new Error("qualifireApiKey is required but was undefined"); | ||
| } | ||
|
|
||
| if (!baseUrl || baseUrl === "undefined") { | ||
| throw new Error("Invalid qualifireUrl provided"); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Design mismatch with PR objective: route via Rogue server, not direct Qualifire.
This class posts directly to Qualifire (/api/v1/evaluations). The stated goal is to mimic the Python SDK and pass the Qualifire API key to the Rogue server’s summary endpoint. Please change this to call the Rogue server endpoint and forward the key, or gate direct reporting behind an explicit opt-in and default to server routing for this PR.
Would you run the script below to locate the server summary endpoint and confirm the expected header so we can align the client?
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Find Rogue summary endpoint and Qualifire service usage =="
fd -a llm.py rogue/server | xargs -I{} rg -n -C3 -e 'router\.(post|get|put)\(' -e 'summary' {}
rg -n -C3 'qualifire' rogue/server || true
rg -n -C3 'X-Qualifire' rogue/server || trueLength of output: 8186
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check SDK qualifire client implementation =="
fd -a "qualifire.ts|qualifier.ts" packages/sdk/src
echo ""
echo "== Find request models for server /summary endpoint =="
rg -n -A10 "class SummaryGenerationRequest|SummaryGenerationRequest =" rogue/server --type py
echo ""
echo "== Check what auth/headers are expected by Rogue /summary endpoint =="
rg -n -B5 -A15 "@router.post\(\s*\"/summary\"" rogue/server --type pyLength of output: 152
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== SDK qualifire.ts current implementation =="
cat /home/jailuser/git/packages/sdk/src/qualifire.ts
echo ""
echo "== Rogue server request models =="
fd -a "models.py|schemas.py|types.py" rogue/server/api rogue/server/schemas 2>/dev/null | head -5 | xargs -I{} sh -c 'echo "=== {} ===" && head -100 {}'
echo ""
echo "== Search for SummaryGenerationRequest in all Python files =="
rg -n "SummaryGenerationRequest" rogue/server --type py -B2 -A5Length of output: 3865
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Find SummaryGenerationRequest definition =="
rg -n "class SummaryGenerationRequest|SummaryGenerationRequest\s*=" rogue/server -A15 --type py
echo ""
echo "== Check import source for SummaryGenerationRequest =="
rg -n "from.*import.*SummaryGenerationRequest" rogue/server --type pyLength of output: 46
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Search for Summary in rogue module structure =="
fd -a ".py$" rogue/server | xargs rg -l "SummaryGenerationRequest" | head -10
echo ""
echo "== Find all python files in rogue/server/api =="
fd -a ".py$" rogue/server/api
echo ""
echo "== Check rogue/server/schemas or similar =="
fd -a "schema|model" rogue/server --type dLength of output: 463
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== List models directory =="
fd -a ".py$" rogue/server/models
echo ""
echo "== Search for SummaryGenerationRequest in models =="
rg -n "SummaryGenerationRequest" rogue/server/models -A10
echo ""
echo "== Check rogue/server/api/__init__.py for exports =="
cat rogue/server/api/__init__.pyLength of output: 184
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check models/__init__.py =="
cat rogue/server/models/__init__.py
echo ""
echo "== Check models/api_format.py =="
head -150 rogue/server/models/api_format.py
echo ""
echo "== Broader search for SummaryGenerationRequest =="
rg -n "SummaryGenerationRequest" rogue/ -A5 | head -50Length of output: 2608
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check llm.py imports (first 30 lines) =="
head -30 rogue/server/api/llm.py
echo ""
echo "== Search for SummaryGenerationRequest definition anywhere in repo =="
rg -n "class SummaryGenerationRequest|SummaryGenerationRequest.*=" --type pyLength of output: 1043
Design mismatch confirmed: SDK posts directly to Qualifire instead of routing through Rogue server.
The SDK calls ${baseUrl}/api/v1/evaluations directly (packages/sdk/src/qualifire.ts:48). Per the PR objective, it should POST to the Rogue server's /llm/summary endpoint instead, which internally handles Qualifire reporting. This aligns with how the Python SDK works and keeps the server in control of credential forwarding.
🤖 Prompt for AI Agents
In packages/sdk/src/qualifire.ts around lines 46 to 56, the SDK currently
constructs and posts directly to Qualifire (`${baseUrl}/api/v1/evaluations`);
change this to post to the Rogue server's `/llm/summary` endpoint instead (use
request.rogueUrl as the base, e.g. `${request.rogueUrl}/llm/summary`, with a
sensible default or raise if missing). Update the request payload to match the
Rogue `/llm/summary` contract and forward any needed Qualifire
credentials/metadata in the body or headers so Rogue can handle reporting to
Qualifire; remove the direct Qualifire endpoint logic and keep validation for
the rogue URL instead of qualifireUrl.
| switch m.evalState.currentField { | ||
| case 0: // Agent URL | ||
| // Insert at cursor position | ||
| runes := []rune(m.evalState.AgentURL) | ||
| m.evalState.AgentURL = string(runes[:m.evalState.cursorPos]) + cleanText + string(runes[m.evalState.cursorPos:]) | ||
| m.evalState.cursorPos += len([]rune(cleanText)) | ||
| case 3: // Judge Model | ||
| // Insert at cursor position | ||
| runes := []rune(m.evalState.JudgeModel) | ||
| m.evalState.JudgeModel = string(runes[:m.evalState.cursorPos]) + cleanText + string(runes[m.evalState.cursorPos:]) | ||
| m.evalState.cursorPos += len([]rune(cleanText)) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clamp cursorPos on paste to prevent panic.
If cursorPos > len(runes), slicing will panic. Clamp before insertion.
- // Only paste into text fields (Agent URL and Judge Model)
+ // Only paste into text fields (Agent URL and Judge Model)
switch m.evalState.currentField {
case 0: // Agent URL
- // Insert at cursor position
- runes := []rune(m.evalState.AgentURL)
+ // Insert at cursor position
+ runes := []rune(m.evalState.AgentURL)
+ if m.evalState.cursorPos > len(runes) {
+ m.evalState.cursorPos = len(runes)
+ }
m.evalState.AgentURL = string(runes[:m.evalState.cursorPos]) + cleanText + string(runes[m.evalState.cursorPos:])
m.evalState.cursorPos += len([]rune(cleanText))
case 3: // Judge Model
- // Insert at cursor position
- runes := []rune(m.evalState.JudgeModel)
+ // Insert at cursor position
+ runes := []rune(m.evalState.JudgeModel)
+ if m.evalState.cursorPos > len(runes) {
+ m.evalState.cursorPos = len(runes)
+ }
m.evalState.JudgeModel = string(runes[:m.evalState.cursorPos]) + cleanText + string(runes[m.evalState.cursorPos:])
m.evalState.cursorPos += len([]rune(cleanText))
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| switch m.evalState.currentField { | |
| case 0: // Agent URL | |
| // Insert at cursor position | |
| runes := []rune(m.evalState.AgentURL) | |
| m.evalState.AgentURL = string(runes[:m.evalState.cursorPos]) + cleanText + string(runes[m.evalState.cursorPos:]) | |
| m.evalState.cursorPos += len([]rune(cleanText)) | |
| case 3: // Judge Model | |
| // Insert at cursor position | |
| runes := []rune(m.evalState.JudgeModel) | |
| m.evalState.JudgeModel = string(runes[:m.evalState.cursorPos]) + cleanText + string(runes[m.evalState.cursorPos:]) | |
| m.evalState.cursorPos += len([]rune(cleanText)) | |
| } | |
| switch m.evalState.currentField { | |
| case 0: // Agent URL | |
| // Insert at cursor position | |
| runes := []rune(m.evalState.AgentURL) | |
| if m.evalState.cursorPos > len(runes) { | |
| m.evalState.cursorPos = len(runes) | |
| } | |
| m.evalState.AgentURL = string(runes[:m.evalState.cursorPos]) + cleanText + string(runes[m.evalState.cursorPos:]) | |
| m.evalState.cursorPos += len([]rune(cleanText)) | |
| case 3: // Judge Model | |
| // Insert at cursor position | |
| runes := []rune(m.evalState.JudgeModel) | |
| if m.evalState.cursorPos > len(runes) { | |
| m.evalState.cursorPos = len(runes) | |
| } | |
| m.evalState.JudgeModel = string(runes[:m.evalState.cursorPos]) + cleanText + string(runes[m.evalState.cursorPos:]) | |
| m.evalState.cursorPos += len([]rune(cleanText)) | |
| } |
🤖 Prompt for AI Agents
In packages/tui/internal/tui/common_controller.go around lines 48 to 59, the
code slices runes using m.evalState.cursorPos which can be > len(runes) and
cause a panic on paste; clamp cursorPos to the valid range before slicing (e.g.,
if cursorPos < 0 set to 0, if cursorPos > len(runes) set to len(runes)), then
perform the insert using the clamped index and update cursorPos by adding the
length of cleanText in runes; apply this fix in both the AgentURL and JudgeModel
cases.
| switch m.evalState.currentField { | ||
| case 0: // AgentURL text field | ||
| fieldLen := len(m.evalState.AgentURL) | ||
| if m.evalState.cursorPos < fieldLen { | ||
| m.evalState.cursorPos++ | ||
| } | ||
| case 1: // Protocol dropdown | ||
| m.evalState.cycleProtocol(false) // cycle forwards | ||
| case 2: // Transport dropdown | ||
| m.evalState.cycleTransport(false) // cycle forwards | ||
| case 3: // JudgeModel text field | ||
| fieldLen := len(m.evalState.JudgeModel) | ||
| if m.evalState.cursorPos < fieldLen { | ||
| m.evalState.cursorPos++ | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor bounds use bytes instead of runes (fix off-by-multibyte errors)
Right-arrow compares cursorPos to len(string) in bytes; elsewhere you use rune-aware lengths. Align to runes to avoid cursor getting “stuck” on multibyte chars.
Apply:
- fieldLen := len(m.evalState.AgentURL)
+ fieldLen := len([]rune(m.evalState.AgentURL))
@@
- fieldLen := len(m.evalState.JudgeModel)
+ fieldLen := len([]rune(m.evalState.JudgeModel))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| switch m.evalState.currentField { | |
| case 0: // AgentURL text field | |
| fieldLen := len(m.evalState.AgentURL) | |
| if m.evalState.cursorPos < fieldLen { | |
| m.evalState.cursorPos++ | |
| } | |
| case 1: // Protocol dropdown | |
| m.evalState.cycleProtocol(false) // cycle forwards | |
| case 2: // Transport dropdown | |
| m.evalState.cycleTransport(false) // cycle forwards | |
| case 3: // JudgeModel text field | |
| fieldLen := len(m.evalState.JudgeModel) | |
| if m.evalState.cursorPos < fieldLen { | |
| m.evalState.cursorPos++ | |
| } | |
| } | |
| switch m.evalState.currentField { | |
| case 0: // AgentURL text field | |
| fieldLen := len([]rune(m.evalState.AgentURL)) | |
| if m.evalState.cursorPos < fieldLen { | |
| m.evalState.cursorPos++ | |
| } | |
| case 1: // Protocol dropdown | |
| m.evalState.cycleProtocol(false) // cycle forwards | |
| case 2: // Transport dropdown | |
| m.evalState.cycleTransport(false) // cycle forwards | |
| case 3: // JudgeModel text field | |
| fieldLen := len([]rune(m.evalState.JudgeModel)) | |
| if m.evalState.cursorPos < fieldLen { | |
| m.evalState.cursorPos++ | |
| } | |
| } |
🤖 Prompt for AI Agents
In packages/tui/internal/tui/eval_form_controller.go around lines 66 to 81, the
right-arrow handler uses len(string) (bytes) to bound cursorPos for AgentURL and
JudgeModel, causing incorrect behavior with multibyte runes; replace byte-length
checks with rune-aware lengths (use utf8.RuneCountInString on the respective
string fields) so cursorPos is compared against and advanced relative to rune
count; ensure you import "unicode/utf8" if not already present.
| buttonSection := renderStartButton() | ||
|
|
||
| helpText := helpStyle.Render("t Test Server ↑/↓ switch fields ←/→ move cursor Space toggle Enter activate Esc Back") | ||
| helpText := helpStyle.Render("t Test Server ↑/↓ switch fields ←/→ move cursor/cycle dropdown Space toggle Enter activate Esc Back") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Help text advertises "t Test Server" with no visible key binding.
Either wire a 't' handler or remove/update the hint to avoid confusing users.
Apply one of:
- helpText := helpStyle.Render("t Test Server ↑/↓ switch fields ←/→ move cursor/cycle dropdown Space toggle Enter activate Esc Back")
+ helpText := helpStyle.Render("↑/↓ switch fields ←/→ move cursor/cycle dropdown Space toggle Enter activate Esc Back")Or add the actual 't' binding where appropriate.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| helpText := helpStyle.Render("t Test Server ↑/↓ switch fields ←/→ move cursor/cycle dropdown Space toggle Enter activate Esc Back") | |
| helpText := helpStyle.Render("↑/↓ switch fields ←/→ move cursor/cycle dropdown Space toggle Enter activate Esc Back") |
🤖 Prompt for AI Agents
In packages/tui/internal/tui/eval_form.go around line 251, the help text shows
"t Test Server" but there is no 't' key handler; either remove or change that
hint or implement the 't' keybinding. Fix by either updating the helpText string
to remove or alter the "t Test Server" entry to reflect actual available keys,
or add a key handler for 't' in the form input/key event handling (register the
't' shortcut, perform the Test Server action, and update UI/state and any
related focus/feedback) and ensure tests/update helpText remain consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
packages/sdk/src/qualifire.ts (1)
7-8: Consider whether non-nullable types match the API contract.The interface declares
deep_test: booleanandstart_time: stringas required, and the implementation provides defaults (falseand current ISO timestamp). If the downstream API acceptsnullfor these fields, consider making them nullable in the interface to more accurately represent the contract. Otherwise, the current approach with defaults is acceptable.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/sdk/src/qualifire.ts(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
packages/sdk/src/qualifire.ts (1)
packages/sdk/src/types.ts (3)
EvaluationResult(61-65)StructuredSummary(121-126)ReportSummaryRequest(128-136)
🔇 Additional comments (4)
packages/sdk/src/qualifire.ts (4)
63-90: LGTM: Robust timeout and error handling.The implementation properly uses
AbortControllerfor the 30-second timeout, cleans up in both success and error paths, and provides clear error messages distinguishing timeout errors from other failures.
17-29: LGTM: Clean payload mapping with sensible defaults.The helper method appropriately uses the nullish coalescing operator (
??) to provide defaults. Note thatstart_timedefaults to the current timestamp if not provided, which represents the reporting time rather than the evaluation start time—ensure this aligns with your intended semantics.
31-43: LGTM: Clear documentation and appropriate method signature.The JSDoc properly documents the method's purpose, parameters, return type, and error conditions. The optional
optionsparameter matches its usage in the implementation.
46-56: The core architectural premise of this review is incorrect.The review assumes the SDK should route through a Rogue server, but the actual Python SDK implementation (in
rogue/server/services/qualifire_service.py) also posts directly to Qualifire at{request.qualifire_url}/api/rogue/v1/report, not through Rogue. Both SDKs follow the same pattern: they post directly to Qualifire's API.However, there are real issues that should be fixed to align with the Python SDK:
- Endpoint path: TypeScript uses
/llm/summarybut should use/api/rogue/v1/report(line 48)- Header name: TypeScript uses
X-Qualifire-API-Keybut should useX-qualifire-key(line 71)The recommendation to introduce
rogueUrland refactor the architecture is architecturally incorrect and should be disregarded. Instead, fix the endpoint and header to match the working Python SDK implementation.Likely an incorrect or invalid review comment.
|
@ignorant05 there has been some code sifting around, can you pease resolve the conflicts? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
packages/tui/internal/screens/evaluations/form_view.go(1 hunks)packages/tui/internal/tui/utils.go(1 hunks)rogue/evaluator_agent/base_evaluator_agent.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- rogue/evaluator_agent/base_evaluator_agent.py
🧰 Additional context used
🧬 Code graph analysis (2)
packages/tui/internal/tui/utils.go (1)
packages/tui/internal/tui/types.go (2)
EvalScenario(119-123)Scenario(100-104)
packages/tui/internal/screens/evaluations/form_view.go (1)
packages/tui/internal/styles/styles.go (3)
NewStyle(45-47)TextMuted(22-22)Primary(13-13)
Description
I just mimicked what was already done in the python sdk in ts.
Motivation and Context
Fixes #75
Type of Change