LeChef318
Contributor

  • Adds a simple workaround to the preflight audit so that it works with the updated meta-tags audit
  • Re-adds the meta-tags audit changes that switch it to the ScrapeClient

Note: once all audits have been updated to use the ScrapeClient, preflight can be updated as well and the workarounds removed.

# Conflicts:
#	package-lock.json
#	package.json
#	src/metatags/handler.js
#	test/audits/metatags.test.js
#	test/common/async-job-runner.test.js
@LeChef318 LeChef318 self-assigned this Sep 3, 2025

github-actions bot commented Sep 3, 2025

This PR will trigger a minor release when merged.

@LeChef318 LeChef318 merged commit 8918323 into main Sep 23, 2025
11 checks passed
@LeChef318 LeChef318 deleted the feat-implement-scrapeClient-stepped-audits branch September 23, 2025 14:50
solaris007 pushed a commit that referenced this pull request Sep 23, 2025
## [1.186.1](v1.186.0...v1.186.1) (2025-09-23)

### Bug Fixes

* **meta-tags:** add workaround for Preflight-Audit ([#1216](#1216)) ([8918323](8918323))
@solaris007
Member

🎉 This PR is included in version 1.186.1 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀


// Transform URLs into scrape.json paths and combine them into a Set
const topPagePaths = topPages.map((page) => getScrapeJsonPath(page.url, siteId));
const includedUrlPaths = includedURLs.map((url) => getScrapeJsonPath(url, siteId));
Contributor

@dipratap dipratap Sep 23, 2025

Since this code is now removed, how are we incorporating the includedUrls in the meta-tags audit?

Contributor Author

We use all URLs that were successfully scraped. In the submitForScraping step, the included URLs are sent to the scraper together with the top pages (see here).
Only the URLs that were submitted by this audit are returned (no old scrapes and no other pages that were not submitted).
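
The submission described above could be sketched roughly as follows. This is a minimal illustration, not the audit's actual code: `buildUrlsToSubmit` is a hypothetical helper and the `{ url }` shape of each submitted entry is an assumption. Only `topPages`, `page.url`, and `includedURLs` are taken from the diff in this thread.

```javascript
// Hypothetical helper (not from the PR): merge top pages and included URLs
// into one de-duplicated list to hand to the scraper in the submit step.
function buildUrlsToSubmit(topPages, includedURLs) {
  const topPageUrls = topPages.map((page) => page.url);
  // A Set drops URLs present in both lists and keeps insertion order.
  const uniqueUrls = new Set([...topPageUrls, ...includedURLs]);
  // Assumed entry shape: one object per URL.
  return [...uniqueUrls].map((url) => ({ url }));
}

const urls = buildUrlsToSubmit(
  [{ url: 'https://example.com/' }, { url: 'https://example.com/blog' }],
  ['https://example.com/blog', 'https://example.com/pricing'],
);
console.log(urls.map((entry) => entry.url));
```

Because the audit later reads back only what it submitted, de-duplicating at submission time also keeps each page from being audited twice.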

const includedURLs = await site?.getConfig()?.getIncludedURLs('meta-tags') || [];

// Transform URLs into scrape.json paths and combine them into a Set
const topPagePaths = topPages.map((page) => getScrapeJsonPath(page.url, siteId));
Contributor

Since this code is now removed, will the scrapeResultPaths coming from context contain only the latest top-pages scrapes, and no older scrapes (from previous top-pages imports)?

Contributor Author

scrapeResultPaths will only contain URLs that were originally submitted for scraping by this audit. The new ScrapeClient workflow ensures that.

const pageUrl = object.finalUrl ? new URL(object.finalUrl).pathname
: new URL(url).pathname;
// handling for homepage
if (pageUrl === '') {
Contributor

This check was added because the home page had an empty path. Will the page URL computed above now be '/' for the home page?

Contributor Author

new URL(url).pathname will resolve to '/' for the home page
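
A quick way to check this behavior is the snippet below. It relies only on the standard WHATWG `URL` class available in Node.js and browsers; the example.com URLs are placeholders, not values from the audit.

```javascript
// For http(s) URLs, the WHATWG URL parser normalizes an empty path to '/'.
const withoutSlash = new URL('https://example.com').pathname;
const withSlash = new URL('https://example.com/').pathname;
const subPage = new URL('https://example.com/blog/post').pathname;

console.log(withoutSlash, withSlash, subPage); // prints: / / /blog/post
```

So for the home page the computed pathname is never the empty string, whether or not the input URL carries a trailing slash.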
