Skip to content

Commit d9f1be0

Browse files
mitchdennyMitch DennyCopilotMitch Denny
authored andcommitted
Mirror deployment E2E summary in CLI E2E recording comment (microsoft#17154)
* Mirror deployment E2E summary in CLI E2E recording comment Update the CLI E2E recording PR comment to surface a passed/failed breakdown in the same style as the Deployment E2E Tests comment, e.g. "✅ **CLI E2E Tests passed** — 12 passed, 0 failed". Counts are derived from the TRX outcomes that are already parsed, adding explicit pass and unknown tallies alongside the existing fail count. Recording upload failures stay surfaced but no longer drive the headline status, since they are unrelated to test correctness. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Embed per-row outcome in CLI E2E recording link label Surface the test outcome on the recording link itself, not only in the table's Status column, so a URL pulled out of the table still tells a reader whether it represents a pass or a failure. Failed recordings get "❌ ▶️ View failure recording", passes get "✅ ▶️ View recording", and unknowns keep "❔". Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Allow workflow_dispatch to override target PR for testing Adds a pr_number input to the manual trigger so the workflow can be exercised against a chosen PR's artifacts without depending on the open head-SHA lookup, which is needed to test changes to this workflow before they reach main. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix TRX outcome parser The previous yq query used jq-style 'if type == "!!seq"' which the Go yq lexer rejects, so test_outcomes.json was always empty and every recording showed as Unknown. The step had continue-on-error: true, so the failure stayed silent until the new pass/fail summary surfaced it. Convert TRX to JSON with yq -p xml -o json, then do all reshaping in jq where it handles single-vs-array UnitTestResult shapes correctly. yq also emits attributes as '+@<attr>' (newer) or '@<attr>' (older), so the lookup tries both forms. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Match cast files against TRX theory tests by bare method name Theory test recordings (e.g. recordings produced from [Theory] methods) are saved with their [CallerMemberName] bare method name, while TRX reports them with parameter data appended: 'Namespace.Class.Method(toolchain: "pnpm")'. The previous parser only keyed by the last dotted segment, which preserved those parens and so never matched the .cast filename, leaving theory recordings reported as 'Unknown'. Strip the parameter suffix and also key the outcome map by the bare method name and the no-parameter FQN. 'Failed' still wins over other outcomes when the same key is seen multiple times (retries, multiple theory cases), so a single failing case still surfaces as ❌ in the comment. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Thread CallerMemberName through OtelLogs core helper Both [Fact]s delegate to a private helper that calls CreateDockerTestTerminal, so [CallerMemberName] resolved to the helper name. That produced a single .cast file named after the private method with no matching TRX entry, which the comment workflow reported as 'Unknown'. Pass the public test name through explicitly so each Fact gets its own recording and matches TRX. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Replace zero pass/fail counts with recording count in fallback summary When no TRX outcomes can be matched to any recording the headline used to render '0 passed, 0 failed', which reads as 'no tests ran' rather than 'outcome data unavailable'. Switch the fallback wording to '<N> recording(s), outcomes unavailable' so the headline reflects what actually happened. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Mitch Denny <midenn@orangecake.localdomain> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Mitch Denny <midenn@orangecake.local>
1 parent 6870770 commit d9f1be0

2 files changed

Lines changed: 111 additions & 36 deletions

File tree

.github/workflows/cli-e2e-recording-comment.yml

Lines changed: 108 additions & 34 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,10 @@ on:
1414
description: 'Workflow run ID to download artifacts from'
1515
required: true
1616
type: number
17+
pr_number:
18+
description: 'Optional PR number to comment on (skips the head-SHA lookup; useful for testing against merged PRs)'
19+
required: false
20+
type: number
1721

1822
jobs:
1923
add-recording-comment:
@@ -50,14 +54,21 @@ jobs:
5054
});
5155
headSha = run.data.head_sha;
5256
53-
// Find PR by head SHA
54-
const prs = await github.rest.pulls.list({
55-
owner: context.repo.owner,
56-
repo: context.repo.repo,
57-
state: 'open',
58-
head: `${context.repo.owner}:${run.data.head_branch}`
59-
});
60-
prNumber = prs.data.length > 0 ? prs.data[0].number : null;
57+
// Allow explicit PR override so we can dry-run against a merged PR's
58+
// artifacts without depending on the open-PR head-SHA lookup below.
59+
const overridePr = context.payload.inputs.pr_number;
60+
if (overridePr) {
61+
prNumber = Number(overridePr);
62+
} else {
63+
// Find PR by head SHA
64+
const prs = await github.rest.pulls.list({
65+
owner: context.repo.owner,
66+
repo: context.repo.repo,
67+
state: 'open',
68+
head: `${context.repo.owner}:${run.data.head_branch}`
69+
});
70+
prNumber = prs.data.length > 0 ? prs.data[0].number : null;
71+
}
6172
} else {
6273
// Triggered by workflow_run
6374
runId = context.payload.workflow_run.id;
@@ -164,28 +175,55 @@ jobs:
164175
continue-on-error: true
165176
shell: bash
166177
run: |
167-
# Parse TRX (XML) files to extract test method outcomes using yq (pre-installed on ubuntu-latest).
168-
# Produces a JSON map of testMethodName -> outcome for the bash comment step to consume.
169-
# When the same test appears in multiple TRX files (e.g. retries), "Failed" wins over other outcomes.
178+
# Parse TRX (XML) files to extract test method outcomes using yq + jq
179+
# (both pre-installed on ubuntu-latest). Produces a JSON map of
180+
# testMethodName -> outcome for the bash comment step to consume.
181+
# When the same test appears in multiple TRX files (e.g. retries),
182+
# "Failed" wins over other outcomes.
170183
if compgen -G trx_files/*.trx > /dev/null 2>&1; then
171184
echo "Parsing TRX files with yq..."
172-
# yq can read XML natively; extract testName+outcome from each UnitTestResult.
173-
# -s merges all files, producing a combined JSON array of results.
174-
# Prefer "Failed" over other outcomes when duplicates exist.
175-
yq -p xml -o json '.TestRun.Results.UnitTestResult | (if type == "!!seq" then .[] else . end) | {(."@testName" // ."+@testName"): (."@outcome" // ."+@outcome")}' trx_files/*.trx \
185+
# Convert each TRX to JSON with yq (Go yq's expression language is
186+
# limited, so do all reshaping in jq). The resulting documents look like:
187+
# { "TestRun": { "Results": { "UnitTestResult": <object|array> } } }
188+
# Attribute keys are exposed by yq as "+@<attr>" (newer yq) or
189+
# "@<attr>" (older yq), and a single result is emitted as an object
190+
# rather than an array, so jq must handle both shapes.
191+
yq -p xml -o json '.' trx_files/*.trx \
176192
| jq -s '
177-
reduce (.[] | to_entries[]) as {$key, $value} ({};
178-
# Extract simple method name (last segment after dot) for .cast file matching
179-
($key | split(".") | last) as $method |
180-
# Prefer "Failed" over any other outcome
181-
if .[$method] == "Failed" then . else .[$method] = $value end |
182-
if .[$key] == "Failed" then . else .[$key] = $value end
193+
def attr(o; k): o["+@" + k] // o["@" + k];
194+
# Best-effort: also key by the bare method name (with theory
195+
# parameter data stripped) so a .cast file named after the
196+
# CallerMemberName matches a TRX entry like
197+
# "Namespace.Class.Method(toolchain: "pnpm")".
198+
def bare_method(name):
199+
(name | split(".") | last) | sub("\\(.*$"; "");
200+
def fqn_no_params(name):
201+
name | sub("\\(.*$"; "");
202+
def merge(map; key; outcome):
203+
if map[key] == "Failed" then map else map + {(key): outcome} end;
204+
reduce (
205+
.[]
206+
| .TestRun.Results.UnitTestResult
207+
| (if type == "array" then .[] else . end)
208+
) as $r ({};
209+
attr($r; "testName") as $name |
210+
attr($r; "outcome") as $outcome |
211+
if $name == null or $outcome == null then .
212+
else
213+
merge(.; bare_method($name); $outcome)
214+
| merge(.; fqn_no_params($name); $outcome)
215+
| merge(.; $name; $outcome)
216+
end
183217
)
184218
' > test_outcomes.json
185219
186220
OUTCOME_COUNT=$(jq 'length' test_outcomes.json)
187221
echo "Parsed $OUTCOME_COUNT test outcome(s)"
188-
echo "has_outcomes=true" >> "$GITHUB_OUTPUT"
222+
if [ "$OUTCOME_COUNT" -gt 0 ]; then
223+
echo "has_outcomes=true" >> "$GITHUB_OUTPUT"
224+
else
225+
echo "has_outcomes=false" >> "$GITHUB_OUTPUT"
226+
fi
189227
else
190228
echo "No TRX files found"
191229
echo '{}' > test_outcomes.json
@@ -231,7 +269,9 @@ jobs:
231269
UPLOAD_COUNT=0
232270
FAIL_COUNT=0
233271
TOTAL_COUNT=0
272+
TEST_PASS_COUNT=0
234273
TEST_FAIL_COUNT=0
274+
TEST_UNKNOWN_COUNT=0
235275
236276
# Arrays to track failed test recordings separately
237277
FAILED_TESTS_BODY=""
@@ -252,14 +292,21 @@ jobs:
252292
# Look up test outcome from TRX data.
253293
# .cast files are named after the test method name (via [CallerMemberName] in CreateTestTerminal),
254294
# so the filename matches the method name key in the outcomes JSON.
295+
# Per-link label carries the outcome too, so a recording URL copied out of
296+
# the table still tells reviewers whether it represents a pass or a failure.
255297
TEST_OUTCOME=$(jq -r --arg name "$filename" '.[$name] // "Unknown"' test_outcomes.json)
256298
if [ "$TEST_OUTCOME" = "Passed" ]; then
257299
STATUS_EMOJI="✅"
300+
LINK_LABEL="✅ ▶️ View recording"
301+
TEST_PASS_COUNT=$((TEST_PASS_COUNT + 1))
258302
elif [ "$TEST_OUTCOME" = "Failed" ]; then
259303
STATUS_EMOJI="❌"
304+
LINK_LABEL="❌ ▶️ View failure recording"
260305
TEST_FAIL_COUNT=$((TEST_FAIL_COUNT + 1))
261306
else
262307
STATUS_EMOJI="❔"
308+
LINK_LABEL="❔ ▶️ View recording"
309+
TEST_UNKNOWN_COUNT=$((TEST_UNKNOWN_COUNT + 1))
263310
fi
264311
265312
# Upload to asciinema with retry logic for transient failures
@@ -279,14 +326,14 @@ jobs:
279326
280327
if [ -n "$ASCIINEMA_URL" ]; then
281328
TABLE_BODY="${TABLE_BODY}
282-
| ${STATUS_EMOJI} | ${safe_filename} | [▶️ View Recording](${ASCIINEMA_URL}) |"
329+
| ${STATUS_EMOJI} | ${safe_filename} | [${LINK_LABEL}](${ASCIINEMA_URL}) |"
283330
echo "Uploaded: $ASCIINEMA_URL"
284331
UPLOAD_COUNT=$((UPLOAD_COUNT + 1))
285332
286333
# Track failed tests for the prominent section
287334
if [ "$TEST_OUTCOME" = "Failed" ]; then
288335
FAILED_TESTS_BODY="${FAILED_TESTS_BODY}
289-
- ❌ **${safe_filename}** — [▶️ View Recording](${ASCIINEMA_URL})"
336+
- ❌ **${safe_filename}** — [${LINK_LABEL}](${ASCIINEMA_URL})"
290337
fi
291338
else
292339
TABLE_BODY="${TABLE_BODY}
@@ -302,18 +349,45 @@ jobs:
302349
fi
303350
done
304351
305-
echo "Uploaded $UPLOAD_COUNT recordings, $FAIL_COUNT upload failures, $TEST_FAIL_COUNT test failures"
306-
307-
# Build comment with summary outside collapsible and table inside
352+
echo "Uploaded $UPLOAD_COUNT recordings, $FAIL_COUNT upload failures, $TEST_PASS_COUNT passed, $TEST_FAIL_COUNT failed, $TEST_UNKNOWN_COUNT unknown"
353+
354+
# Build the summary line in the same style as the deployment E2E comment:
355+
# "<emoji> **CLI E2E Tests <status>** — X passed, Y failed[, Z unknown]"
356+
# Status reflects test outcomes; recording-upload failures are a secondary concern
357+
# surfaced in the table rather than the headline status.
358+
# Choose headline emoji + status word from the tallied outcomes.
359+
# We never let unknowns suppress a real failure, but we do flag
360+
# unknowns explicitly when the rest of the run was clean so
361+
# reviewers don't read a misleading 'passed'.
362+
FALLBACK_TEXT=""
308363
if [ "$TEST_FAIL_COUNT" -gt 0 ]; then
309364
SUMMARY_EMOJI="❌"
310-
SUMMARY_TEXT="${TEST_FAIL_COUNT} test(s) failed, ${UPLOAD_COUNT} recordings uploaded"
311-
elif [ "$FAIL_COUNT" -gt 0 ]; then
312-
SUMMARY_EMOJI="⚠️"
313-
SUMMARY_TEXT="${UPLOAD_COUNT}/${TOTAL_COUNT} recordings uploaded, ${FAIL_COUNT} upload(s) failed"
314-
else
365+
SUMMARY_STATUS="failed"
366+
elif [ "$TEST_PASS_COUNT" -gt 0 ] && [ "$TEST_UNKNOWN_COUNT" -eq 0 ]; then
367+
SUMMARY_EMOJI="✅"
368+
SUMMARY_STATUS="passed"
369+
elif [ "$TEST_PASS_COUNT" -eq 0 ] && [ "$TEST_FAIL_COUNT" -eq 0 ]; then
370+
# No TRX outcomes matched any recording — describe the run by
371+
# recording count rather than zero pass/fail counts, which would
372+
# read as 'no tests ran' instead of 'outcome data unavailable'.
315373
SUMMARY_EMOJI="🎬"
316-
SUMMARY_TEXT="${UPLOAD_COUNT} recordings uploaded"
374+
SUMMARY_STATUS="completed"
375+
FALLBACK_TEXT="${TOTAL_COUNT} recording(s), outcomes unavailable"
376+
else
377+
SUMMARY_EMOJI="❓"
378+
SUMMARY_STATUS="unknown"
379+
fi
380+
381+
if [ -n "$FALLBACK_TEXT" ]; then
382+
SUMMARY_TEXT="$FALLBACK_TEXT"
383+
else
384+
SUMMARY_TEXT="${TEST_PASS_COUNT} passed, ${TEST_FAIL_COUNT} failed"
385+
if [ "$TEST_UNKNOWN_COUNT" -gt 0 ]; then
386+
SUMMARY_TEXT="${SUMMARY_TEXT}, ${TEST_UNKNOWN_COUNT} unknown"
387+
fi
388+
fi
389+
if [ "$FAIL_COUNT" -gt 0 ]; then
390+
SUMMARY_TEXT="${SUMMARY_TEXT} (${UPLOAD_COUNT}/${TOTAL_COUNT} recordings uploaded, ${FAIL_COUNT} upload(s) failed)"
317391
fi
318392
319393
# Build the failed tests section (shown outside the collapsible)
@@ -326,7 +400,7 @@ jobs:
326400
fi
327401
328402
COMMENT_BODY="${COMMENT_MARKER}
329-
${SUMMARY_EMOJI} **CLI E2E Test Recordings** — ${SUMMARY_TEXT} (commit \`${SHORT_SHA}\`)
403+
${SUMMARY_EMOJI} **CLI E2E Tests ${SUMMARY_STATUS}** — ${SUMMARY_TEXT} (commit \`${SHORT_SHA}\`)
330404
${FAILED_SECTION}
331405
<details>
332406
<summary>View all recordings</summary>

tests/Aspire.Cli.EndToEnd.Tests/OtelLogsTests.cs

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
// Licensed to the .NET Foundation under one or more agreements.
22
// The .NET Foundation licenses this file to you under the MIT license.
33

4+
using System.Runtime.CompilerServices;
45
using Aspire.Cli.EndToEnd.Tests.Helpers;
56
using Aspire.Cli.Tests.Utils;
67
using Hex1b.Automation;
@@ -24,14 +25,14 @@ public Task OtelLogsReturnsStructuredLogsFromStarterApp()
2425
public Task OtelLogsReturnsStructuredLogsFromStarterAppIsolated()
2526
=> OtelLogsReturnsStructuredLogsFromStarterAppCore(isolated: true);
2627

27-
private async Task OtelLogsReturnsStructuredLogsFromStarterAppCore(bool isolated)
28+
private async Task OtelLogsReturnsStructuredLogsFromStarterAppCore(bool isolated, [CallerMemberName] string testName = "")
2829
{
2930
var repoRoot = CliE2ETestHelpers.GetRepoRoot();
3031
var strategy = CliInstallStrategy.Detect(output.WriteLine);
3132

3233
using var workspace = TemporaryWorkspace.Create(output);
3334

34-
using var terminal = CliE2ETestHelpers.CreateDockerTestTerminal(repoRoot, strategy, output, mountDockerSocket: true, workspace: workspace);
35+
using var terminal = CliE2ETestHelpers.CreateDockerTestTerminal(repoRoot, strategy, output, mountDockerSocket: true, workspace: workspace, testName: testName);
3536

3637
var pendingRun = terminal.RunAsync(TestContext.Current.CancellationToken);
3738

0 commit comments

Comments
 (0)