Add VS Code extension E2E test architecture#17772
Conversation
…board signals Implements the VS-Code-side asks from microsoft#17721 (CLI signals tracked separately): Foundation - Singleton telemetry reporter accessor + sendTelemetryErrorEvent helper - withCommandTelemetry wrapper records outcome (success/error/canceled) and duration_ms - Common-properties helpers publish apphost_languages + apphost_present once known - Test seams (__setReporterForTests / __resetCommonPropertiesForTests) for unit tests Command coverage (S3) - tryExecuteCommand routed through withCommandTelemetry - registerInstrumentedCommand wraps the bypass paths (runAppHost / debugAppHost editor and tree commands, code lens commands, walkthrough commands, restore, etc.) - Each event carries a source dimension (command_palette / editor / tree / codelens / walkthrough) for downstream cohort analysis Meaningful engagement (S1) - MeaningfulEngagementReporter fires engagement/active at most once per session on the first of: AppHost present in workspace, any command run, any debug session - Publishes apphost_languages summary via summarizeAppHostLanguages classifier Running AppHosts view (S2) - AppHostsViewTelemetry subscribes to tree-view visibility with a 1s debounce - Emits runningAppHostsView/shown with view_mode + running_apphosts / total_resources Debug lifecycle (S4) - debug/runSession/start on PUT /run_session in AspireDcpServer - debug/runSession/end from sessionTerminated (and the launch-failure catch block) with exit_code_bucket + duration_ms - debug/appHost/start emitted from AspireDebugSession on the launch DAP request - debug/appHost/end emitted from AspireDebugSession.dispose() with aggregated stats (total_child_sessions, distinct_resource_type_count + names, ended_with_error, duration_ms). DCP server tracks aggregates via takeDebugSessionAggregateStats. Dashboard passthrough (S5) - Replaces the hardcoded GET /telemetry/enabled stub with the full dashboard telemetry contract (start/startOperation/endOperation/startUserTask/endUserTask/ operation/userTask/fault/asset/property/recurringProperty/commandLineFlags) - Property flattening of AspireTelemetryProperty into TelemetryReporter's flat key/value model, documented encoding rules - Bearer-only auth middleware for routes the dashboard reaches (it does not send the DCP instance-id header) - Per-operation correlation Map with a 1h TTL for abandoned operations - /telemetry/enabled now honors the global VS Code telemetry setting README data-collection notice documents every category of event the extension now emits and confirms we do not report source, paths, env values, or resource names. Tests: - 13 new unit tests for telemetry + appHostLanguage helpers - Full extension test suite passes (635 passing) Refs microsoft#17721 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes from a parallel pass by Opus 4.7, Opus 4.8, and GPT 5.5 reviewers: - AspireDcpServer: extract emitRunSessionFailureEnd helper so the three early-return failure paths in PUT /run_session (unsupported launch config, missing debug session, debugger did not start) emit debug/runSession/end and update the AppHost aggregate. Previously a debug/runSession/start without a matching /end was emitted and anyNonZeroExit was not flipped, so debug/appHost/end under-reported failures. - AspireDcpServer: route nonzero debug/runSession/end through sendTelemetryErrorEvent (matches the catch-block path) so faults get the reporter's stricter scrubbing. - DashboardTelemetryPassthrough: dispatch flattenProperties on PropertyType.Metric (the C# dashboard sends metric values as invariant-culture strings) instead of typeof === 'number'; drop PropertyType.Pii properties defensively; stringify other numbers instead of promoting to measurements. - AspireDebugSession.dispose: snapshot start-event metadata, run disposables before emitting telemetry, and defer debug/appHost/end via setTimeout(500ms) so child sessionTerminated notifications have a chance to flow through the adapter tracker before the aggregate is taken. - README: scope the absolute privacy claim to events the extension originates, and call out the Pii-drop behavior on dashboard passthrough. - Add dashboardTelemetryPassthrough.test.ts (8 tests) covering Metric-string parsing, Pii drop, numeric non-promotion, and complex object stringification. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nsion
Wire-format / correctness:
- Dashboard sends JSON via HttpClient.PostAsJsonAsync which since .NET 9
uses JsonSerializerOptions.Web (camelCase property naming, numeric enum
serialization). The TS interfaces in DashboardTelemetryPassthrough were
PascalCase, so every dashboard request bound to undefined fields:
- flattenProperties dropped all dashboard properties (Value/PropertyType
vs value/propertyType).
- PostOperationRequest.Result was undefined, so failure routing to
sendTelemetryErrorEvent never fired.
- All other route handlers (operation, userTask, fault, asset, property,
recurringProperty, commandLineFlags) read undefined fields.
Rewrite the interface definitions to camelCase and add numeric-enum
label mappers (telemetryResultLabel, faultSeverityLabel, isFailureResult).
- Route Failure/UserFault results from PostOperation/PostUserTask/endOperation
through sendTelemetryErrorEvent so they participate in the reporter's
stricter scrubbing pass.
Privacy:
- Add scrubFreeformDiagnosticText to truncate forwarded dashboard exception
messages and fault descriptions to 1024 chars (defense-in-depth on top of
the reporter's pattern-based PII scrubbing).
- Clamp resource_type / distinct_resource_types telemetry to the supported
resource-type set; unsupported launch-configuration types are now reported
as 'unsupported' instead of leaking the raw CLI-supplied type string.
- Validate the launch.json command against the AspireCommandType allowlist
('run' | 'deploy' | 'publish' | 'do') before emitting it; unknown values
collapse to 'other'.
- Tighten README to be explicit that forwarded dashboard exception messages
may contain user-controlled fragments and document the new resource_type
clamping.
Performance:
- Fix _handleStart abandonment-timer closure leak: project eventName out
of payload before capturing it in the 1h setTimeout closure so the
arbitrarily large payload.settings.startEventProperties bag is not
retained for the entire TTL.
AppHost classification:
- Add classifyAppHostDirectory which synchronously enumerates marker files
(apphost.{ts,mts,cts,js,mjs,cjs}, AppHost.cs, *.csproj). Previously,
every directory-style AppHost was classified as 'unknown', which would
miscategorize the entire TypeScript AppHost cohort once it landed.
- AspireDebugSession now uses classifyAppHostDirectory when appHostPath
refers to a directory.
Security:
- Fix loose bearer-prefix check in requireHeaders/requireBearerOnly.
auth.split('Bearer ').length === 2 accepted other schemes that happened
to contain 'Bearer ' as a substring (e.g. 'X-Bearer <token>'). Replace
with startsWith('Bearer ') + slice and factor the two middlewares onto
a shared validateBearerToken helper.
Tests:
- Update dashboardTelemetryPassthrough.test.ts fixtures from PascalCase to
camelCase and add coverage for enum label mappers and scrub helper.
- Add classifyAppHostDirectory tests covering csharp / typescript /
unknown / mixed cases.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Aspire dashboard sends arbitrary event names and arbitrary property keys over HTTP. Previously the passthrough forwarded those names verbatim, which meant every new dashboard event or property silently added rows to the (EntityName, PropertyName) classification catalog and required someone to chase down the new entries to classify them. This change: - Introduces src/utils/telemetryRegistry.ts: a per-event schema mapping every event name to its allowed properties + measurements. The schema mirrors the catalog's pair-based model so every "row" the extension is allowed to produce is enumerable in one file. - Makes sendTelemetryEvent / sendTelemetryErrorEvent generic over the registered event names, so TypeScript rejects calls with unregistered event names or unregistered property/measurement keys at compile time. - Restricts setCommonTelemetryProperties to the registered common-property keys, since every common property duplicates into a row per event. - Replaces the dashboard passthrough's per-route emit-with-payload-name pattern with a fixed set of extension event names (dashboard/operation, dashboard/userTask, dashboard/fault, dashboard/asset, dashboard/scope/start, dashboard/scope/end, dashboard/property/set, dashboard/property/recurring, dashboard/commandLineFlags). The dashboard's original event name now lives in 'dashboard_event_name'. - Replaces flattenProperties with bundleDashboardData, which JSON-encodes the per-event property/metric bag into stable 'dashboard_properties' / 'dashboard_measurements' string fields, with size guardrails (100 entries / 8 KB per bundle, with a __truncated__ marker on overflow). - Drops the getTelemetryReporter export to remove a bypass path that let callers skip the typed wrapper layer. - Updates the README to describe the new bounded surface and points reviewers at the registry. Tests: 664 passing (was 658 — 6 new bundleDashboardData / formatCorrelations tests cover Pii drops, truncation by count, truncation by size, independent property/measurement budgets, and correlation formatting). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adversarial multi-model review of the VS Code dashboard telemetry
passthrough surfaced several ways user/workspace content or attacker
input could reach the telemetry channel despite the README's "no
resource names or workspace contents" guarantee.
Privacy hardening — drop free-form, dashboard-composed strings instead
of merely truncating them (truncation bounds volume, not sensitivity):
- bundleDashboardData now drops the Basic-tagged exception message and
stack trace keys (Aspire.Dashboard.Exception.Message/StackTrace) which
embed resource names and home-directory paths. Structured
Exception.Type/RuntimeVersion are retained.
- /telemetry/fault no longer forwards `description` (the dashboard builds
it as `${type}: ${ex.Message}`).
- /telemetry/endOperation no longer forwards `error_message` (the only
producer passes a caught `ex.Message`, see DashboardCommandExecutor).
- /telemetry/operation and /telemetry/userTask no longer forward
`result_summary` (latent free-form field at the network boundary).
- Removed the now-unused description/error_message/result_summary members
from the telemetry registry so they can't be reintroduced accidentally.
Additional boundary hardening:
- formatFlagPrefixes strips flag values (keeps only the name before the
first `=`, `:`, or whitespace) so `--token=secret` can't leak the value.
- formatCorrelations clamps each eventType/id (previously only count-capped).
- scrubFreeformDiagnosticText accepts `unknown` and coerces non-strings to
'' so a malformed JSON body can't throw a TypeError -> Express 500.
- Telemetry `mode` is clamped to Debug/NoDebug/Unknown/other instead of
forwarding the CLI-controlled value verbatim.
Updated the README data/telemetry section and added unit + route tests.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Final multi-model privacy/security review of the telemetry passthrough surfaced several JSON-boundary hardening gaps. Fixes: - Property routes now bundle the value under the real property name so the free-form drop-list and key clamping apply (they were bypassed by the synthetic 'value' wrapper key). - asset_event_version, result/severity labels, and the Pii discriminator now coerce numeric input instead of interpolating/forwarding arbitrary strings or arrays verbatim. - Truncation markers are reserved inside the cap so clamped values never exceed their documented maximum. - formatCorrelations now bounds total serialized size, not just per-element and count caps. - Dropped the console-logs resource-name key as defense-in-depth. - Removed a dead isExtensionTelemetryEnabled import. Added unit tests for each behavior; full extension suite passes (708). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- tryExecuteCommand: when the CLI is unavailable and the user is redirected to install it, throw a cancellation so withCommandTelemetry records the invocation as 'canceled' instead of a false 'success'. The existing catch suppresses the error toast since the redirect already informed the user. - requireBearerOnly (/telemetry/*): the 'missing' bearer failure now returns an Authorization-only message. Those routes intentionally do not require the DCP instance-id header, so the combined 'Authorization and ...DCP-Instance-ID headers are required' message was misleading. This branch is only reachable from requireBearerOnly (requireHeaders handles missing headers inline). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address final review findings in the VS Code extension telemetry implementation by tightening dashboard passthrough privacy, binding DCP to localhost, preserving AppHost failure telemetry, and classifying user cancellations correctly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Require dashboard telemetry requests to carry a scoped DCP instance id, propagate that id from AppHost configuration into the dashboard sender, and register/prefix the extension telemetry catalog for VS Code visibility. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17772Or
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17772" |
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nsion-e2e-tests # Conflicts: # extension/yarn.lock
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Treat a non-zero aspire stop during teardown as fatal only if the extension still observes a running or launching AppHost afterward. This handles the debug-session shutdown race while preserving leak detection. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Wait for ffmpeg to close before reporting recordings as saved, and keep the ExTester feed preflight from creating temporary E2E run directories. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Keep teardown cleanup isolated, report debug-launch rejection as a failed command, align runner timeouts with the workflow budget, and preserve diagnostics by terminating ExTester gracefully before SIGKILL. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move the empty-discovery fixture outside the workspace so recursive AppHost discovery cannot rediscover it on Windows. Pass the runner root into E2E tests so crash cleanup still owns the hidden directory when possible. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Final update before merge: I updated the PR description with the scenario matrix and latest validated commit ( The main hardening here is around making the VS Code E2E runner less flaky under CI: internal-feed-only ExTester restore, more resilient teardown/timeout handling, better diagnostics/recording behavior, and the Windows discovery fix where the hidden AppHost now moves outside the workspace so recursive discovery cannot find it again. CI is green now ( |
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
mitchdenny
left a comment
There was a problem hiding this comment.
Earlier feedback addressed: the ffmpeg recording-stop race is fixed with the SIGINT/SIGTERM/SIGKILL escalation and a real 'close' await (so "saved" is only logged on a graceful exit), and the verify-feed preflight no longer creates/destroys an unused temp tree. Bonus: Linux recording default flipped to always so we'll have video on first occurrence of any flake. The remaining duplicate .test-storage/**/logs/** glob in the upload step is cosmetic.
|
there is one more e2e scenario i am adding tests for, then will merge |
|
just kidding, there are TWO scenarios |
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Allow compatibility runs against older Aspire CLI builds to opt out of the current-CLI-only debug console regression while keeping the default repo validation path covered. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
actually 3 scenarios... |
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Await terminal dispatch for tree commands before reporting success and expose stopping state through the E2E state bridge. Tighten teardown isolation, watched-file writes, and diagnostics artifacts to make VS Code extension E2E failures easier to diagnose and less flaky. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Show the stopping state before terminal dispatch so the tree cannot miss fast stop transitions, clear the optimistic state when terminal dispatch fails, and add richer E2E teardown/failure diagnostics. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Avoid relying on Windows WebDriver tree-description polling after the stop-state bridge has already observed the transient state, and close all editors before deleting generated zero-to-running projects so Windows file handles are released. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Refresh workspace AppHost state when Aspire debug sessions terminate so stale stopped AppHosts do not keep E2E teardown waiting. Harden teardown cleanup by capturing the target AppHost PID, falling back to aspire ps when extension state was already refreshed, and force-terminating the target process tree only when aspire stop times out. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
Adds real VS Code UI end-to-end coverage for the Aspire extension, for #17727. The tests package the extension as a VSIX and drive a real VS Code window with a real Aspire CLI and generated AppHost workspace, rather than only asserting mocked command routing.
Major pieces in this PR:
Adds
extension/src/test-e2ewith sharded ExTester specs for command palette flows, settings-file creation/opening, AppHost discovery/configuration, AppHost tree lifecycle, tree actions, debug/dashboard startup, package manifest surface, negative-path edge cases, and a zero-to-running CLI flow.Adds
extension/scripts/run-e2e.jsto package/reuse the VSIX, isolateASPIRE_HOME, generate workspaces, validate CLI/VSIX inputs, install the E2E runner from the internaldotnet-public-npmfeed, and collect diagnostics.Wires
.github/workflows/extension-e2e-tests.ymlintotests.ymlwith Linux/Windows shards and path gating so the expensive E2E matrix runs only for extension, Aspire CLI, CLI archive/package workflow, or E2E workflow changes.Adds Linux/Xvfb failure recording plus diagnostics upload for logs, screenshots, workspaces, Aspire home, and terminal recordings.
Hardens CI diagnostics: retries transient Azure Artifacts/Yarn install failures, makes diagnostics upload non-blocking, reacquires refreshed tree sections during E2E waits, and includes richer timeout output for workbench text, editor titles, tree item descriptions, and browser reloads.
Hardens the E2E bridge with sequence-numbered command, terminal, debug, and status events so tests wait for the intended event instead of relying on fixed sleeps or capped-buffer positions. Terminal events now record whether commands were suppressed, sent through shell integration, or sent through sendText.
Makes AppHost stopping state observable through the extension state snapshot so E2E teardown and assertions can wait on reliable state-file data instead of only WebDriver tree polling. The tree now enters the optimistic
Stopping...state before terminal dispatch and clears it if dispatch fails, so fast stop transitions cannot skip the UX state.Hardens watched E2E fixture writes, teardown isolation, and diagnostics artifacts so stale workspace state, leaked AppHosts/debug sessions, missing copied source files, individual teardown cleanup failures, and Windows editor file locks are surfaced or cleaned up directly.
Hardens two CLI flakes found while iterating CI: profiling tests now isolate process-wide
ActivitySourcelisteners, and peer-probe output capture gives normally exited peers enough time to drain stderr under CI load.Adds a resource-command argument-prompt scenario covering text, choice, boolean, number, and secret inputs, including secret redaction in terminal command routing.
Adds current-CLI regression coverage for extension debug-console output, with an opt-out for older CLI compatibility runs via
ASPIRE_EXTENSION_E2E_SKIP_CURRENT_CLI_REGRESSIONS=true.Product bugs found and fixed while hardening the E2E coverage:
fsinstead of VS Code's workspace filesystem abstraction, which could break AppHost discovery for non-file workspace providers.Stopping...state for AppHosts after Stop was invoked, including the workspace-candidate path when running-state discovery lagged behind debug startup or the stop command routed quickly (VS Code extension: Show "Stopping..." state for AppHosts in the Aspire pane #17707).A note on dependency sources: the workflow uses only the internal
dotnet-public-npmAzure Artifacts feed. The reusable E2E workflow preflights the feed and retries transient install failures rather than falling back to an external registry.Scenario matrix:
Stopping...state, dashboard URL routing, endpoint/log/resource commands, duplicate display-name resources, child-resource endpoint lookupaspire new, package add, AppHost source opening, debug/dashboard startupUseful targeted commands:
Validation:
corepack yarn testASPIRE_EXTENSION_E2E_SHARD=apphost-tree ASPIRE_EXTENSION_E2E_SPEC='out/test-e2e/test-e2e/appHostTree.e2e.test.js' corepack yarn test:e2eASPIRE_EXTENSION_E2E_SHARD=tree-actions ASPIRE_EXTENSION_E2E_SPEC='out/test-e2e/test-e2e/treeActions.e2e.test.js' corepack yarn test:e2eASPIRE_EXTENSION_E2E_SHARD=debug-dashboard ASPIRE_EXTENSION_E2E_SPEC='out/test-e2e/test-e2e/debugDashboard.e2e.test.js' corepack yarn test:e2ecorepack yarn compile-testscorepack yarn compile-e2ecorepack yarn compile && corepack yarn lintgit diff --check, and changed-file scans for external npm registry referencesnpm testASPIRE_EXTENSION_E2E_SHARD=debug-dashboard ASPIRE_EXTENSION_E2E_SPEC='out/test-e2e/test-e2e/debugDashboard.e2e.test.js' npm run test:e2eASPIRE_EXTENSION_E2E_SHARD=zero-to-running ASPIRE_EXTENSION_E2E_SPEC='out/test-e2e/test-e2e/zeroToRunning.e2e.test.js' npm run test:e2edotnet test --project tests/Aspire.Cli.Tests/Aspire.Cli.Tests.csproj --no-launch-profile -- --filter-class "*.ProfilingTelemetryTests" --filter-not-trait "quarantined=true" --filter-not-trait "outerloop=true"dotnet test --project tests/Aspire.Cli.Tests/Aspire.Cli.Tests.csproj --no-launch-profile -- --filter-class "*.PeerInstallProbeTests" --filter-not-trait "quarantined=true" --filter-not-trait "outerloop=true"dotnet test --project tests/Aspire.Cli.Tests/Aspire.Cli.Tests.csproj --no-launch-profile -- --filter-method "*.ProfilingSpansStoreGeneratedSessionOnAmbientAncestorsForSiblings" --filter-method "*.ProbeAsync_PeerExitsNonZero_IncludesCapturedStderr" --filter-not-trait "quarantined=true" --filter-not-trait "outerloop=true"Fixes #17727
Fixes #17707
Checklist
<remarks />and<code />elements on your triple slash comments?Latest commit:
d983a7b0279f24add9657759056c7dfcfef9c984