Releases: Pagefind/pagefind

v1.5.2

12 Apr 12:53
bf17396

  • v1.5.0 was meant to double indexing performance, which it does on macOS and Windows. On Linux, with the published musl build, it actually halved indexing performance. This release swaps in jemalloc on Linux musl builds to fix the musl allocator thrashing, so indexing on Linux is now the promised 2x faster than v1.4.0.
  • Further improved deterministic index filenames between indexes (PR #1104 — thanks @gissimo !).
  • Cleaned up a wasm-bindgen deprecation warning popping up in the browser console.

v1.5.0

06 Apr 08:44
0c6d2fe

Hey! This is a big one. Pagefind 1.5.0 has been fermenting for a while, and addresses a lot of long-standing issues and feature requests. This release brings an entirely new search UI built on web components, major improvements to search relevance and ranking, diacritics support, automatic CJK segmentation, Web Worker search, notably smaller indexes, and a much faster indexing binary. Enormous thanks to everyone who contributed features and fixes, as well as to everyone who tested the beta releases and provided feedback ❤️ - @bglw

If you only read this far, I should mention up front: The existing Default UI and Modular UI remain available and supported for now, so you can upgrade your sites to Pagefind v1.5.0 without migrating to the Component UI.

Pagefind Component UI

Pagefind ships a brand new UI system built entirely on web components. The Component UI gives you searchboxes, modals, result lists, and filter controls as composable <pagefind-*> elements that you can mix, match, and style with CSS variables.

The Component UI is available as vendored files in your /pagefind/ output directory, or as an npm package to install and import.

The best way to get a feel for the new components is on the 📘 Pagefind Component UI page of the docs, where interactive examples of various components are shown.

Extra goodies with the Component UI:

  • Greatly improved accessibility over the Default UI
  • Keyboard navigation through search results
  • Configurable keyboard shortcuts (thanks @miketheman !)
  • Full custom templates for rendering results and placeholders
  • Exported types for Component UI npm consumers (thanks @vanruesc !)
  • Support for multiple scoped Pagefind instances on one page
  • A range of CSS variables available for light-touch customization (thanks @miketheman for some of these!)
  • Improved RTL and locale-specific rendering

Search Relevance, and Searching Metadata

Pagefind now searches metadata by default! Importantly, this means it now searches the title metadata. Matches in titles are now taken into account, and search results are very hard to shake from prime positions if all (or much) of the title matches the search query.

You can configure the weight of any metadata field. See 📘 Configuring Metadata Weights to change the title boost or apply custom weights to your own metadata fields.

Beyond metadata searching, a bunch of weird and wonderful ranking bugs were resolved:

  • Metadata-only matches now return results. Previously, if a page matched the search query only in its metadata (e.g. the title) but not in the body content, it would be missed. These pages now correctly appear in results.
  • Word splitting and indexing were revisited to properly handle diacritics, stemming, and compound words together. This fixes a broad set of edge cases where compound word parts weren't indexed correctly.
  • Loading index chunks now correctly uses stemmed terms. This was a discrepancy in how chunks were identified, and could cause some hard to pin down issues where the wrong chunk would be loaded for a search term, leaving you with no (or fewer) results.
  • A couple of pathways left you with only the first matching chunk loaded, which would also give you fewer results. Words that straddle multiple chunks now behave better.
  • Fancy-pants unicode characters in words could really mess up the chunk loading, which has been fixed.

Diacritics Support

We finally properly support matching across diacritics. You can now find your cafés without remembering how to type é.

By default, exact diacritic matches are preferred. So if you're searching "cafe", pages with "cafe" will rank higher than pages with "café". Getting this relevance right by default was the final piece of the puzzle for shipping this, which is why it took a while to land. See 📘 Configuring Diacritic Similarity to adjust how this plays out on your site.

If you need strict matching, set exactDiacritics: true to disable normalization entirely — "cafe" will only match "cafe", and "café" will only match "café". 📘 Exact Diacritics
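
As a rough illustration of the two matching modes described above, the sketch below folds diacritics with Unicode NFD normalization and then compares words either loosely or strictly. This is not Pagefind's implementation: foldDiacritics and matches are hypothetical helpers, and the exact parameter only mimics the idea behind exactDiacritics.

```typescript
// Hypothetical helper: decompose characters (NFD), then drop code points in the
// combining diacritical marks block (U+0300 to U+036F), so "café" becomes "cafe".
function foldDiacritics(text: string): string {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Hypothetical helper: compare a query word to an indexed word.
// With exact = true, diacritics must match (the idea behind exactDiacritics).
function matches(query: string, word: string, exact = false): boolean {
  const q = query.toLowerCase().normalize("NFC");
  const w = word.toLowerCase().normalize("NFC");
  return exact ? q === w : foldDiacritics(q) === foldDiacritics(w);
}

console.log(matches("cafe", "café"));       // true
console.log(matches("cafe", "café", true)); // false
```

A real ranking step would additionally prefer exact matches over folded ones, as the default behavior above describes; that part is left out of this sketch.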

Multilingual Improvements

Thanks browsers! Pagefind now taps into Intl.Segmenter to chop search queries in CJK (Chinese, Japanese, Korean) non-whitespace-delimited languages. This was already done during indexing by Pagefind, but users searching still had to delimit their queries. Now searching "这是一段简单的测试文本" searches for the words "这", "是", "一段", "简单", "的", "测试", and "文本", which is also how that sentence was indexed.
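
For readers who haven't used Intl.Segmenter, here is a minimal sketch of word segmentation on a query string. segmentQuery is a hypothetical helper and the "zh" locale is an assumption for this example; the exact boundaries come from the runtime's own segmentation data, so they may not match the list above in every browser.

```typescript
// Split a query into word-like segments using the runtime's built-in segmenter.
// Hypothetical helper; the default locale is an assumption for this example.
function segmentQuery(query: string, locale = "zh"): string[] {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  return Array.from(segmenter.segment(query))
    .filter((part) => part.isWordLike) // drops punctuation and whitespace
    .map((part) => part.segment);
}

console.log(segmentQuery("这是一段简单的测试文本"));
```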

We also updated the underlying stemming library (thanks @uded !) which brings stemming support for Polish and Estonian (and Esperanto, if anyone is out there indexing some lang="eo" pages). The Snowball upgrade also improves stemming quality across many already-supported languages.

Indexing Performance

The indexing binary (the one you install through npx or your wrapper of choice) is now both smaller (so, faster to download) and faster to run, by quite a lot on both fronts. On some sites, indexing is more than twice as fast. Thanks to @zmre for much of this!

Search Performance

Pagefind's search now runs in a Web Worker automatically. This doesn't make the search faster, per se, but it dramatically improves perceived performance on large websites by keeping the main thread responsive. If Web Workers are unavailable, it falls back to the main thread automatically.

Plus: Some low-hanging fruit was picked off, and Pagefind's index chunks are now ~45% smaller thanks to delta-encoding page numbers and word locations.
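
To show why delta-encoding helps, here is a small sketch of the general technique on a sorted list of page numbers. It stores each value as the gap from the previous one, which tends to produce small numbers that pack into fewer bytes. deltaEncode and deltaDecode are illustrative names, and Pagefind's actual chunk format is not shown here.

```typescript
// Delta-encode a sorted list of integers: keep the first value, then store
// only the difference from the previous value.
function deltaEncode(sorted: number[]): number[] {
  return sorted.map((value, i) => (i === 0 ? value : value - sorted[i - 1]));
}

// Reverse the encoding by keeping a running sum.
function deltaDecode(deltas: number[]): number[] {
  let total = 0;
  return deltas.map((delta) => (total += delta));
}

const pages = [1024, 1027, 1031, 1040];
console.log(deltaEncode(pages)); // [1024, 3, 4, 9]
```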

New Search Options

  • metaCacheTag — Allows you to configure the cache-busting tag on the metadata file (which is fetched fresh on every page load). For offline/PWA scenarios where assets need to be served with service workers, this can now be overridden.
  • plain_excerpt — Search results and sub-results now include a plain_excerpt field containing the excerpt text without highlight mark tags, for those who want to handle highlighting themselves (or don't want it at all).
  • matchedMetaFields — Search results now include a matchedMetaFields field listing which metadata fields matched the search query.
  • includeCharacters is now available in the Node and Python wrapper APIs.
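
If you do want to handle highlighting yourself, something like the following could be applied to the plain_excerpt text. This is a sketch under assumptions: highlight and escapeHtml are hypothetical helpers, the excerpt is treated as unescaped text, and matching is a simple case-insensitive substring search rather than anything Pagefind does internally.

```typescript
// Escape text before inserting it into HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: wrap each occurrence of the given terms in a custom tag.
// Assumes plainExcerpt is unescaped text and terms are plain words.
function highlight(plainExcerpt: string, terms: string[], tag = "strong"): string {
  const escapedTerms = terms
    .filter((term) => term.length > 0)
    .map((term) => term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  if (escapedTerms.length === 0) return escapeHtml(plainExcerpt);
  const pattern = new RegExp(`(${escapedTerms.join("|")})`, "gi");
  // split() with a capture group leaves the matched text at odd indexes.
  return plainExcerpt
    .split(pattern)
    .map((part, i) =>
      i % 2 === 1 ? `<${tag}>${escapeHtml(part)}</${tag}>` : escapeHtml(part),
    )
    .join("");
}

console.log(highlight("Static search for static sites", ["static"]));
// <strong>Static</strong> search for <strong>static</strong> sites
```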

UI Translations

  • Added Greek (el) translations. (PR #1019 — thanks @Yoda-Soda !)
  • Improved Chinese Traditional (zh-TW) translations. (PR #990 — thanks @510208 !)
  • Improved German (de) translations. (PR #953 — thanks @randomguy-2650 !)
  • Added translations for new Component UI strings across all existing languages.

Other bits and bobs

  • Fixed relative image URLs (e.g. ./image.png) breaking when displayed in search results. (PR #1087)
  • Fixed Python x86_64 macOS wheel being incorrectly tagged as arm64. (PR #950 — thanks @lioman !)
  • Fixed Python wheel tags being written in compressed form. (PR #989 — thanks @ichard26 !)
  • Excluded the vendor directory from the main pagefind PyPI package. (PR #991)
  • Migrated Python wrapper build tooling from Poetry to uv. (PR #934 — thanks @SKalt !)
  • Fixed subresult URLs ignoring page meta URL overrides. (PR #1076)
  • Fixed subresult highlight mark color. (PR #1024)
  • Index chunk fetches are now throttled to avoid overwhelming the network on large sites. (PR #1071)
  • Added Windows ARM64 (aarch64-pc-windows-msvc) as a supported platform. (PR #1079)
  • For crate consumers: Moved actix-web and related serving dependencies behind a serve feature flag (PR #1023)

Looking Forward

The Component UI is the new recommended way to add search to your site, and future UI work will focus there. The Default UI and Modular UI are sticking around for now, but the Component UI is where new features will land first.

Thanks again to everyone who contributed to this release!

v1.5.0-beta.2

04 Apr 00:34
038965f

Pre-release

Changes since v1.5.0-beta.1:

  • New: Greek translations
  • Core refactor to speed up indexing
  • Improved type exports for the component UI
  • Better keybind configuration for searchbox and modal
  • Fixed closing the modal with Esc when the input is focused
  • Improved a11y and usability of keyboard navigation in the searchbox when results are loading
  • Moved some crate dependencies behind a serve feature flag for crate consumers
  • Upgraded the stemming library (Snowball), introducing Polish and Estonian stemming
  • Exposed more CSS variables for customizing component UI styling
  • Some misc. UI tidy-ups

v1.5.0 will follow in a few days.

v1.5.0-beta.1

11 Jan 10:20
4010397

Pre-release

Hey! This is a big one, so I thought we'd give a beta release a try. This release addresses a lot of long-standing issues and feature requests, alongside delivering an entirely new search UI.

Pagefind Component UI

This is the main reason for a beta release here. Writing a new ground-up UI system for Pagefind has been a big job, and I would love to get some more eyes on it before we send it out into the world on its stable release.

It's so large, in fact, that it has its own documentation site! If you're itching to see what it looks like, check out https://ui.pagefind.app/ for modals and searchboxes and custom web components and more!

The Component UI is available as vendored files in your /pagefind/ output directory, or it's available as an npm package to install and import.

One big request, for anybody reading this who has or wants to contribute translations, is to go look over the files changed in PR 1005. As part of this new UI, we have a new set of translation strings. Thankfully these could mostly be inferred from existing ones, but overall they need reviews from fluent speakers. ❤️

Search Relevance, and Searching Metadata

Pagefind now searches metadata! Yes, all metadata. By default. Importantly, this means it now searches the title metadata. Matches in titles are now taken into account, and search results are very hard to shake from prime positions if all (or much) of the title matches the search query.

This is also something you can configure! See 📘 Configuring Metadata Weights for how to change this title boost, or apply it to any and all metadata fields of your choosing.

Alongside that, a bunch of weird and wonderful ranking bugs were resolved. I'll write more in the final release notes, but PR 1003 goes into great detail on the various improvements that make searches better across the board. Plus PR 996 covers some of the chunk loading bugs that could also cause corner cases with searches.

Diacritics Support

We finally properly support matching across diacritics. You can now find your cafés without remembering how to type é!

This is, yet again, something you can configure. By default, exact diacritic matches are preferred. So if you're searching "cafe", pages with "cafe" will rank higher than pages with "café". Getting this relevance right by default was the final piece of the puzzle for shipping this, which is why it took a while to land. See 📘 Configuring Diacritic Similarity to adjust how this plays out on your site.

Multilingual Improvements

Thanks browsers! Pagefind now taps into Intl.Segmenter to chop search queries in CJK (Chinese, Japanese, Korean) non-whitespace-delimited languages. This was already done during indexing by Pagefind, but users searching still had to delimit their queries. Now searching "这是一段简单的测试文本" searches for the words "这", "是", "一段", "简单", "的", "测试", and "文本", which is also how that sentence was indexed.

Performance!

Pagefind's search now runs in a Web Worker automatically. This doesn't make the search faster, per se, but it dramatically improves perceived performance on large websites by keeping the main thread responsive.

If Web Workers are unavailable, it falls back to the main thread automatically.

More Performance!

Some low-hanging fruit was picked off, and Pagefind's index chunks are now ~45% smaller. The indexing binary (the one you install through npx or your wrapper of choice) is now both smaller (so, faster to download) and faster to run, at least on my macOS machine.

Phew

Most of these things are pretty solid, and the new Component UI is the most in flux. I don't expect it to change much, but be aware nonetheless that some things might change before it stabilises in 1.5.0.

I would love your feedback! Please jump into the GitHub discussion for this release and highlight any issues. This will simmer happily, depending on feedback, for a week or three, so there's room for changes.

If you haven't used a beta release before, simply:

# if you run via npx, sub out `pagefind` for `pagefind@beta`
npx pagefind@beta
# or, if you have it as a dependency:
npm i pagefind@beta
# or, if you use the python distribution:
pip install --pre pagefind
# or, via cargo:
cargo install pagefind --version 1.5.0-beta.1

Or download the correct binary from this GitHub Release page.

Hope to hear from you!

v1.4.0

01 Sep 06:22
8c3de15

Core Features & Improvements

  • Added the "Include Characters" option to allow indexing of specific special characters.
  • Reduced filesizes for the Pagefind WebAssembly modules.
  • Added FreeBSD as a supported platform (PR #813 — thanks @nguthiru !)
  • Fixed an issue where matches in compound words could be ranked with zero weight. (PR #806 — thanks @teamdandelion !)

Pagefind Playground

Modular UI Features & Improvements

  • Added option to hide images on result templates in the Modular UI (PR #874 — thanks @HannesOberreiter !)
  • Added a data attribute for result count on the filter pills. (PR #827 — thanks @cmahnke !)

Default UI Features & Improvements

  • Added title attribute to the default UI search input for improved accessibility (PR #798 — thanks @rdela !)

UI Translations

  • Added Thai (th) translations (PR #801 — thanks @Phon1209 !)
  • Added Thai segmenter support when indexing (PR #807 — thanks @anonymaew !)
  • Added Basque (eu) translations (PR #826 — thanks @erral !)
  • Added Norwegian Bokmål (nb) and Norwegian Nynorsk (nn) translations (PR #878 — thanks @altinnadmin !)
  • Added Burmese (my) translations (PR #768 — thanks @harrymkt !)

Everything Else

  • Added a development justfile, and improved CONTRIBUTING.md (hint hint)
  • The Pagefind JavaScript should support running in Node.js a bit better (PR #828 — thanks @justsml !)

Looking Forward

👋 from @bglw — I thought I'd add a new section to these release notes talking about what's next.

The biggest item on my list is to improve the relevance of the Pagefind search results. With the current setup, you can tweak enough settings to get decent results for a given site, but it needs to better meet the goal of working more-than-good-enough out of the box.

The second-biggest item is to fill out the Modular UI and transition the default Pagefind experience to use that. This has been pending for a long time, and will be a much better base for those who wish to customize their search more than the Default UI currently allows.

Releases also now trigger a GitHub Discussion to be created, so please drop any general thoughts, comments, or feedback there 🙂

v1.3.0

18 Dec 02:00
df0f721

Core Features & Improvements

  • Added --quiet and --silent flags when running the Pagefind CLI, which reduce the logging output to only warnings or only errors respectively.
  • Stabilized the Pagefind Rust library.
    • Thanks to @cdxker for leading this in #751 ❤️
    • This library interface has feature parity with the Node and Python indexing APIs, and is a great solution for integrating Pagefind indexing into any Rust-based tooling.

Default UI Features & Improvements

  • Added a data-pagefind-ui-meta attribute to the metadata tags on search results in the Default UI, allowing them to be targeted with CSS.
    • For example, a tag on a result containing Date: April 19, 2024 will now have data-pagefind-ui-meta="date".

Fixes & Tweaks

  • Fixed an issue where inline metadata would incorrectly render with html-escaped characters.
    • Specifically, tagging metadata inline with data-pagefind-meta="phrase:this &lt; that" would index the literal &lt; rather than a < character.
    • This bug didn't occur when using data-pagefind-meta to capture the content of an element.
  • Fixed an issue where matches in compound words could (sometimes) be ranked lower than intended.
    • Specifically, for example, matching just the Cannon in CloudCannon may have ranked the word incorrectly.
  • Fixed an issue where fragment hashes would change between every Pagefind build.
    • Now, if an HTML page has not changed between two Pagefind indexes, the fragment filename will not change.
    • This saves you from having to re-upload all fragment files after every Pagefind build.
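
One common way to get filenames that stay stable across builds is to derive them from the content itself. The sketch below illustrates that general idea only: fragmentName is a hypothetical naming scheme, and FNV-1a is used because it is short, not because Pagefind is known to use it.

```typescript
// 32-bit FNV-1a hash of a string's UTF-8 bytes. Chosen only for brevity;
// the notes above do not say which hash Pagefind uses.
function fnv1a(content: string): number {
  const bytes = new TextEncoder().encode(content);
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return hash;
}

// Hypothetical naming scheme: the name depends only on the page content,
// so an unchanged page keeps the same name from one build to the next.
function fragmentName(html: string): string {
  return `fragment_${fnv1a(html).toString(16).padStart(8, "0")}`;
}

const page = "<h1>Hello</h1>";
console.log(fragmentName(page) === fragmentName(page)); // true
```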

v1.2.0

06 Nov 02:15

Core Features & Improvements

UI Translations

*️⃣ : These languages are the first right-to-left languages in the translation set.
Please open any issues if improvements can be made to the Pagefind UI libraries when rendered for these RTL languages.

v1.1.1

03 Sep 01:52

Fixes & Tweaks

  • Fixes an issue where internal anchor and weight tokens would leak when captured in meta or filter attributes.
  • Improves segmentation for extended languages (PR #600 — thanks @hamano !).
  • Improves Pagefind's processing of "index.html" URLs (PR #604 — thanks @dscho !).
  • Fixes some instances of incorrect types in the Pagefind NodeJS API (PRs #642 & #655 — thanks @vanyauhalin & @SKalt !).

UI Translations

  • Added Swahili translations

Security

v1.1.0

02 Apr 19:37
8dc9eca

Core Features & Improvements

  • Improved Pagefind's core result ranking algorithm to align with BM25. For existing sites, this will change the ordering of search results, and should provide better relevance for search results by default.
  • Added the ability to configure Pagefind's ranking algorithm.
    • Certain categories of site (e.g. reference documentation) can benefit from tweaks to the way pages are ranked. To support this, a set of ranking parameters is now configurable. Enormous thanks to @dscho for kicking off this work in #534 ❤️
    • See 📘 Customize ranking to read up on the new ranking parameters.
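
For context on what aligning with BM25 means, here is a sketch of the textbook BM25 score for a single term on a single page. It is not Pagefind's code: bm25TermScore is an illustrative name, and k1 and b use commonly cited defaults rather than Pagefind's parameter names or values.

```typescript
// Textbook BM25 score for one term on one page.
// k1 controls term-frequency saturation and b controls length normalization;
// both defaults here are common textbook values, not Pagefind's.
function bm25TermScore(
  termFrequency: number, // occurrences of the term on this page
  pageLength: number,    // words on this page
  avgPageLength: number, // average words per page across the site
  totalPages: number,    // pages in the index
  pagesWithTerm: number, // pages containing the term
  k1 = 1.2,
  b = 0.75,
): number {
  // Rare terms get a larger inverse document frequency.
  const idf = Math.log(1 + (totalPages - pagesWithTerm + 0.5) / (pagesWithTerm + 0.5));
  // Pages longer than average are penalized, shorter ones boosted.
  const lengthNorm = 1 - b + b * (pageLength / avgPageLength);
  return (idf * termFrequency * (k1 + 1)) / (termFrequency + k1 * lengthNorm);
}

// Same term count, but the shorter page scores higher.
console.log(bm25TermScore(3, 200, 500, 1000, 50) > bm25TermScore(3, 2000, 500, 1000, 50)); // true
```

Tuning parameters like these is the kind of adjustment that can help sites such as reference documentation, where page lengths and term repetition differ a lot from typical prose.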

Default UI Features & Improvements

Fixes & Tweaks

  • Fixed a bug where the forceLanguage setting would not take priority when using the NodeJS Indexing API.
  • Fixed a bug where zero-width spaces in the source content could cause errors in search excerpts.

UI Translations

  • Added Ukrainian translations (PR #523 — thanks @vladdnepr !).
  • Added Romanian translations (PR #541 — thanks @mateesville93 !).
  • Added Czech translations (PR #543 — thanks @dallyh !).
  • Added Korean translations (PR #583 — thanks @seokho-son !).
  • Improved Japanese translations (PR #560 — thanks @hamano !).

v1.0.4

16 Nov 03:04

Features & Improvements

Fixes & Tweaks

  • Fixed a bug, resulting in a (very) large improvement to the NodeJS Indexing API performance (~100x).
  • Fixed HTML entities being rendered escaped in metadata, filters, and custom page titles.
  • Fixed a bug where debouncedSearch returns null if any options object is passed to it.
  • Fixed a bug where a fully-qualified URL set via the NodeJS indexing API would be broken when returned as a search result.
  • Fixed Pagefind's reporting of really fast indexing times (previously logged as slower than reality) — thanks to @danpls in #448.
  • Fixed extracting sub-results when headings contain non-ascii text (especially RTL languages).

UI Translations