Collaborator

@jaygiang jaygiang commented Sep 26, 2025

Summary

This PR implements SEO improvements and fixes duplicate content issues by addressing canonical URLs, trailing slashes, and troubleshooting page conflicts.

Key Changes:

  • Fixed trailing slash inconsistency: Added trailingSlash: false to Docusaurus config to ensure consistent canonical URLs without trailing slashes across environments
  • Updated internal links to use absolute paths: Changed relative links to include /docs/ prefix (e.g., [When to use OAuth2](/docs/hydra/concepts/before-oauth2)) to ensure correct resolution when trailingSlash: false
  • Resolved duplicate content: Eliminated duplicate troubleshooting pages by removing manual /troubleshooting index and redirecting to the auto-generated /category/troubleshooting/ page

This addresses issues where search engines were finding duplicate content and inconsistent canonical URLs, improving overall SEO performance.

Related Issue or Design Document

SEO optimization - canonical tag updates #2317

Checklist

  • I have read the contributing guidelines and signed the CLA.
  • I have referenced an issue containing the design document if my change introduces a new feature.
  • I have read the security policy.
  • I confirm that this pull request does not address a security vulnerability.
    If this pull request addresses a security vulnerability,
    I confirm that I got approval (please contact [email protected]) from the maintainers to push the changes.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have added the necessary documentation within the code base (if appropriate).

Further comments

Technical Details:

Trailing Slash Fix:

  • The issue was that the dev server showed canonical URLs without trailing slashes while production builds included them
  • Adding trailingSlash: false ensures consistency across environments
  • Internal links were updated to use absolute paths with /docs/ prefix to ensure proper resolution

Troubleshooting Duplicate Content:

  • Previously had both /troubleshooting (manual index) and /category/troubleshooting/ (auto-generated) serving similar content
  • Solution: Remove the manual index and redirect /troubleshooting → /category/troubleshooting/ to consolidate to one canonical page
  • This eliminates the duplicate content penalty from search engines

Testing Instructions:

Before the fix:

  1. Run npm run build
  2. Look for /oauth2-oidc in the build directory
  3. Check the HTML file and observe that the canonical tag includes a trailing slash: <link rel="canonical" href="https://www.ory.sh/docs/oauth2-oidc/">

After the fix:

  1. Run npm run build
  2. Look for /oauth2-oidc in the build directory
  3. Check the HTML file and observe that the canonical tag no longer has a trailing slash: <link rel="canonical" href="https://www.ory.sh/docs/oauth2-oidc">

Update: Pivoted Approach

Previous Approach (Abandoned):
Initially attempted to set trailingSlash: false in Docusaurus config, but this broke hundreds of relative links throughout the documentation.

Current Approach (Implemented):
Implemented a fix using a reusable CanonicalUrl component that:

  • Targets only the specific files that generate trailing slashes (index files and directory-named files)
  • Preserves all existing relative links and navigation structure
  • Fixes canonical URLs without disrupting the established documentation patterns
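The thread does not show the component's source, only its usage (`<CanonicalUrl path="/account-experience" />`). As a rough sketch of the URL normalization such a component would need, the helper below joins a site origin, a base path, and a doc path into a canonical href with no trailing slash. The function name `canonicalHref` and its parameters are hypothetical and are not taken from the PR.

```typescript
// Hypothetical helper, not the PR's actual component code.
// Joins origin, base path, and doc path into one canonical href
// without duplicate slashes and without a trailing slash.
function canonicalHref(siteUrl: string, baseUrl: string, path: string): string {
  // Drop any trailing slashes from the origin.
  const origin = siteUrl.replace(/\/+$/, "");
  // Split base path and doc path on "/" and discard empty segments.
  const segments = (baseUrl + "/" + path)
    .split("/")
    .filter((segment) => segment.length > 0);
  return segments.length > 0 ? origin + "/" + segments.join("/") : origin;
}

// Assumed values: the production origin and the /docs/ base path seen in the thread.
const exampleHref = canonicalHref("https://www.ory.sh", "/docs/", "/account-experience/");
const exampleTag = '<link rel="canonical" href="' + exampleHref + '">';
console.log(exampleTag);
```

A component along these lines would presumably render the resulting tag into the page head; how the real component does that is not shown in this thread.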

@CLAassistant

CLAassistant commented Sep 26, 2025

CLA assistant check
All committers have signed the CLA.

@jaygiang jaygiang changed the title Chore/canonical forward slash chore/canonical forward slash Sep 26, 2025
@jaygiang jaygiang changed the title chore/canonical forward slash chore: canonical forward slash Sep 26, 2025
Member

@aeneasr aeneasr left a comment

Thank you for the PR! Unfortunately, all links in the docs must be relative file paths, otherwise many things break. Relative file paths have these advantages:

  • Links will keep working on the GitHub interface and many Markdown editors
  • You can customize the files' slugs without having to update all the links
  • Moving files around the folders can be tracked by your editor, and some editors may automatically update file links
  • A versioned doc will link to another doc of the exact same version
  • Relative URL links are very likely to break if you update the trailingSlash config

Please also read: https://docusaurus.io/docs/markdown-features/links

If you find broken links please resolve them with the relative path structure.

To achieve what you want, there should be other solutions, such as turning off auto-generation of the categories you don't want. This is all possible using Docusaurus and should be possible to read up on in the Docusaurus docs!

Also please append (to the end) all redirects into https://github.com/ory/docs/blob/master/vercel.json since the docusaurus redirects are only JS redirects.
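As an illustration of the "append to the end" request, here is a minimal sketch that models the `redirects` array of a vercel.json file and adds one entry after the existing ones. The `source`, `destination`, and `permanent` keys follow Vercel's redirect format; the concrete paths, the `/docs` prefix, and the helper name `appendRedirect` are assumptions for illustration and are not copied from the repository.

```typescript
// Shape of one redirect entry in vercel.json (only the commonly used keys).
interface VercelRedirect {
  source: string;
  destination: string;
  permanent?: boolean;
}

// Only the part of vercel.json that this sketch touches.
interface VercelConfig {
  redirects?: VercelRedirect[];
}

// Hypothetical helper: returns a copy of the config with the new
// redirect placed after all existing ones, leaving the input unchanged.
function appendRedirect(config: VercelConfig, redirect: VercelRedirect): VercelConfig {
  const existing = config.redirects === undefined ? [] : config.redirects;
  return { ...config, redirects: existing.concat([redirect]) };
}

// Assumed paths for the troubleshooting consolidation discussed in this PR.
const exampleConfig = appendRedirect(
  { redirects: [] },
  {
    source: "/docs/troubleshooting",
    destination: "/docs/category/troubleshooting",
    permanent: true,
  },
);
console.log(JSON.stringify(exampleConfig, null, 2));
```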

@jaygiang
Collaborator Author

Thanks @aeneasr for your review!

Changing to absolute paths wasn't the right approach. I appreciate you sharing the docs and context!

After digging into it more, I realized the issue is specifically with index files and directory-named files. It looks like Docusaurus automatically adds trailing slashes to those in production. These specific files match the canonical errors we are seeing in Ahrefs.

Pattern 1: Index Files

  • docs/identities/index.mdx → URL: /identities/ (with trailing slash)
  • docs/polis/index.mdx → URL: /polis/ (with trailing slash)
  • docs/keto/index.md → URL: /keto/ (with trailing slash)

Pattern 2: Directory-Named Files

  • docs/kratos/organizations/organizations.mdx → URL: /kratos/organizations/ (with trailing slash)
  • docs/self-hosted/oel/monitoring/monitoring.mdx → URL: /self-hosted/oel/monitoring/ (with trailing slash)
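The two patterns above can be checked mechanically. The sketch below classifies a doc file path as an index file, a directory-named file, or neither, which could help list the files that need a canonical override. The function name `classifyDocFile` and the return labels are hypothetical; the thread does not mention any such script.

```typescript
type TrailingSlashPattern = "index" | "directory-named" | null;

// Hypothetical helper: reports which of the two patterns from this
// thread a Markdown/MDX file path matches, or null if it matches neither.
function classifyDocFile(filePath: string): TrailingSlashPattern {
  const parts = filePath.split("/");
  const fileName = parts[parts.length - 1];
  const parentDir = parts.length >= 2 ? parts[parts.length - 2] : "";
  // Strip a .md or .mdx extension; other file types are ignored.
  const baseName = fileName.replace(/\.mdx?$/, "");
  if (baseName === fileName) {
    return null;
  }
  if (baseName === "index") {
    return "index"; // Pattern 1: index files
  }
  if (baseName === parentDir) {
    return "directory-named"; // Pattern 2: file named after its parent folder
  }
  return null;
}

console.log(classifyDocFile("docs/identities/index.mdx"));
console.log(classifyDocFile("docs/kratos/organizations/organizations.mdx"));
```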

Updated this PR and created a reusable CanonicalUrl component (just a head tag) that lets us override the canonical URL in the MDX file. Had to update a few relative links where files changed from .md to .mdx, but everything stays relative.

@jaygiang jaygiang requested a review from aeneasr September 30, 2025 06:37
Member

@vinckr vinckr left a comment

Thanks! I think this makes sense, but someone else should also take a look before merging.
If this approach is accepted, can you please add a short paragraph to the README that explains when and how to use this custom component? My concern is that down the line we either forget to use the component or don't remember in which cases we need to use it.

@aeneasr
Member

aeneasr commented Sep 30, 2025

I'm confused, the files changed in this PR don't appear to add trailing slashes on production in the first place? Calling

https://www.ory.sh/docs/account-experience

just gives 200 OK on https://www.ory.sh/docs/account-experience

curl -v https://www.ory.sh/docs/account-experience > /dev/null

< HTTP/2 200 

What I could find is

https://www.ory.sh/docs/troubleshooting

and

https://www.ory.sh/docs/category/troubleshooting/

where it probably doesn't make sense to keep the former "manual" page (https://www.ory.sh/docs/troubleshooting) if the content is the same

Read up on categories here:

https://docusaurus.io/docs/sidebar/items#sidebar-item-category

@jaygiang
Collaborator Author

jaygiang commented Sep 30, 2025

@aeneasr Oh yes, I can see the confusion. The issue isn't with the URLs themselves.

The problem is in the canonical link tags inside the HTML <head>:

1. Direct Browser Inspection

1. Go to https://www.ory.sh/docs/account-experience
2. Right-click → "View Page Source" (I believe this shows server-rendered HTML that SEO crawlers see)
3. Search for "canonical"
4. Result: `<link rel="canonical" href="https://www.ory.sh/docs/account-experience/">` (with trailing slash)

Note: DevTools shows the canonical WITHOUT a trailing slash because client-side JavaScript modifies it after page load (I think from Vercel's config of "trailingSlash: false"), but SEO crawlers don't see this change.

2. Command Line Check

# Get the server HTML that search engines see
curl -s https://www.ory.sh/docs/account-experience | grep -i canonical
# Result: <link rel="canonical" href="https://www.ory.sh/docs/account-experience/">
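The same check can be scripted against HTML that has already been fetched or read from the build directory. The sketch below pulls the canonical href out of an HTML string and reports whether its path ends in a slash. Both function names are hypothetical, and the regular expressions assume quoted attribute values like those in the curl output above; a real HTML parser would be more robust.

```typescript
// Hypothetical helper: returns the href of the first
// <link rel="canonical"> tag in the HTML, or null if none is found.
function findCanonicalHref(html: string): string | null {
  const tag = html.match(/<link\b[^>]*\brel=["']canonical["'][^>]*>/i);
  if (tag === null) {
    return null;
  }
  const href = tag[0].match(/\bhref=["']([^"']*)["']/i);
  return href === null ? null : href[1];
}

// Hypothetical helper: true when the URL's path part ends with "/"
// (a bare site root such as "https://example.com/" is not counted).
function hasTrailingSlash(href: string): boolean {
  const path = href.replace(/^[a-z]+:\/\/[^/]+/i, "");
  return path.length > 1 && path.charAt(path.length - 1) === "/";
}

// Sample markup modeled on the curl result quoted in this comment.
const sampleHtml =
  '<head><link rel="canonical" href="https://www.ory.sh/docs/account-experience/"></head>';
const sampleHref = findCanonicalHref(sampleHtml);
console.log(sampleHref, sampleHref !== null && hasTrailingSlash(sampleHref));
```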

The mismatch:

It shows up in the Ahrefs audit:
[Screenshot 2025-09-30 at 9 34 50 AM]

Thanks for providing more info about categories. And yes, I totally agree and have removed the manual troubleshooting index page so that it can be generated from the sidebar config. It looks like there was a trailing slash in slug: "category/troubleshooting/" in the sidebar.ts file. Removing it fixed the canonical issue for the troubleshooting category page.

@aeneasr
Member

aeneasr commented Oct 1, 2025

Oh I see, that makes much more sense to me now! why the heck is docusaurus doing this 🤔 Is it only on those pages you edited here or on all of the pages?

@jaygiang
Collaborator Author

jaygiang commented Oct 1, 2025

@aeneasr Yeah seems to be a weird problem with docusaurus - https://stackoverflow.com/questions/78508730/how-to-correct-a-docusaurus-generated-canonical-url-that-points-to-a-redirect

Yup, it's only happening on the pages I edited. It seems to only happen on index.mdx files and files that have the same name as their parent folder, for example docs/kratos/organizations/organizations.mdx.

@vinckr I have added a section that explains the CanonicalUrl component in the README.


<CanonicalUrl path="/account-experience" />

---
Member

@aeneasr aeneasr Oct 6, 2025

This <hr> does not belong here or in the other files where it was added


Comment on lines +7 to +9
import CanonicalUrl from "@site/src/components/CanonicalUrl"

<CanonicalUrl path="/account-experience" />
Member

@aeneasr aeneasr left a comment

Almost there

@jaygiang
Collaborator Author

jaygiang commented Oct 6, 2025

@aeneasr Thanks for catching those! I've made the appropriate updates.

Note on quote consistency: I noticed that after wrapping the CanonicalUrl imports in mdx-code-block fences, there's inconsistent quote usage across the files (some imports use single quotes, others use double quotes). It looks like Prettier doesn't format code inside mdx-code-block fences. Is the current state acceptable since these blocks are outside Prettier's scope?

@jaygiang jaygiang requested a review from aeneasr October 6, 2025 21:02
@aeneasr
Member

aeneasr commented Oct 7, 2025

Yes, it's unfortunately the case but OK!

Member

@aeneasr aeneasr left a comment

Awesome! Could you also add to the readme an instruction for this pattern so any future contributors know what to do? :)
