# fix(meta-tags): add workaround for Preflight-Audit #1216
## Conversation
…sageGroupId to sendMessage in sqs.js
…comment for loading scrape result paths
…/spacecat-shared-scrape-client to 2.1.0
(conflicts: package-lock.json, package.json, src/metatags/handler.js, test/audits/metatags.test.js, test/common/async-job-runner.test.js)
…at-implement-scrapeClient-stepped-audits (conflicts: package-lock.json, package.json)
…at-implement-scrapeClient-stepped-audits
This PR will trigger a minor release when merged.
…at-implement-scrapeClient-stepped-audits
## [1.186.1](v1.186.0...v1.186.1) (2025-09-23)

### Bug Fixes

* **meta-tags:** add workaround for Preflight-Audit ([#1216](#1216)) ([8918323](8918323))
🎉 This PR is included in version 1.186.1 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
```js
// Transform URLs into scrape.json paths and combine them into a Set
const topPagePaths = topPages.map((page) => getScrapeJsonPath(page.url, siteId));
const includedUrlPaths = includedURLs.map((url) => getScrapeJsonPath(url, siteId));
```
Since this code is now removed, how are we incorporating the includedUrls in the meta-tags audit?
We use all URLs that were successfully scraped. In the submitForScraping step, the included URLs are sent together with the top pages to the scraper (see here). Only the URLs that were submitted by this audit are returned (no old scrapes or other pages that were not submitted).
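For illustration, a minimal, self-contained sketch of that merging step, using made-up data and a hypothetical `buildUrlsToScrape` helper (not the actual handler):

```js
// Sketch only: combine the configured included URLs with the top pages before the
// list is handed to the scraper, so only these URLs come back as scrape results.
function buildUrlsToScrape(topPages, includedURLs) {
  // De-duplicate so a URL that is both a top page and an included URL is scraped once
  const urls = new Set([...topPages.map((page) => page.url), ...includedURLs]);
  return [...urls].map((url) => ({ url }));
}

// Example with made-up data
const topPages = [{ url: 'https://example.com/' }, { url: 'https://example.com/products' }];
const includedURLs = ['https://example.com/landing', 'https://example.com/products'];
console.log(buildUrlsToScrape(topPages, includedURLs));
// -> [ { url: 'https://example.com/' },
//      { url: 'https://example.com/products' },
//      { url: 'https://example.com/landing' } ]
```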
```js
const includedURLs = await site?.getConfig()?.getIncludedURLs('meta-tags') || [];

// Transform URLs into scrape.json paths and combine them into a Set
const topPagePaths = topPages.map((page) => getScrapeJsonPath(page.url, siteId));
```
Since this code is now removed, will the scrapeResultPaths coming from the context ensure that it only contains the latest top-pages scrapes, and no older scrapes (from previous top-pages imports)?
scrapeResultPaths will only contain URLs that were originally submitted for scraping by this audit. The new scrapeClient workflow will ensure that.
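For illustration only, a sketch of consuming those results; the exact shape of scrapeResultPaths on the context (here assumed to be a Map of page URL to scrape.json object key) is an assumption based on this thread, not a confirmed API:

```js
// Sketch: iterate the scrape results provided on the audit context.
function listScrapesToProcess(context) {
  const { scrapeResultPaths, log } = context;
  const paths = [];
  for (const [url, objectKey] of scrapeResultPaths) {
    // Every entry corresponds to a URL this audit submitted in submitForScraping,
    // so stale scrapes from earlier top-pages imports never show up here.
    log.info(`Found scrape result for ${url} at ${objectKey}`);
    paths.push(objectKey);
  }
  return paths;
}

// Example with a stubbed context
const context = {
  scrapeResultPaths: new Map([
    ['https://example.com/', 'scrapes/site-id/scrape.json'],
  ]),
  log: console,
};
console.log(listScrapesToProcess(context));
```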
```js
const pageUrl = object.finalUrl ? new URL(object.finalUrl).pathname
  : new URL(url).pathname;
// handling for homepage
if (pageUrl === '') {
```
This check was added because the home page had no path (it was empty). Will the page URL computed above be '/' for the home page now?
`new URL(url).pathname` will resolve to `'/'` for the home page.
Note: as soon as all audits are updated to use the ScrapeClient, the preflight audit can be updated too and the workarounds can be removed.
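For reference, a quick check of that behaviour (runnable in Node or a browser console):

```js
// The WHATWG URL API always returns at least '/' as the pathname
console.log(new URL('https://example.com').pathname);       // '/'
console.log(new URL('https://example.com/about').pathname); // '/about'

// So with the finalUrl/url fallback from the snippet above, the home page yields '/'
// rather than '', and the pageUrl === '' workaround no longer triggers.
```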