What's the best way to understand why a specific result is shown? #584

nhoizey · 2024-03-27T22:44:13Z

I don't understand why the results in my site's search are generally right, but not when I search for sphinx:

Here's the search: https://nicolas-hoizey.photo/search/?q=sphinx

It finds this content, which is not relevant: https://nicolas-hoizey.photo/galleries/travels/europe/spain/andalusia/the-arch-from-once-upon-a-time-in-the-west-in-texas-hollywood/

But it doesn't find this one, which should be found: https://nicolas-hoizey.photo/galleries/animals/arthropods/insects/butterflies-and-moths/a-sphinx-moth-in-the-making/

Any advice on how to understand why the results are wrong?

Answered by bglw

bglw · 2024-03-27T22:48:18Z

Maintainer

Ah, yes this is an interesting one.

At the moment, Pagefind doesn't include metadata in the searchable index — they're two separate systems, essentially.

In this case, your title:

<h1 class="p-name" data-pagefind-meta="title">A sphinx moth in the making</h1>

Is outside your indexed body:

<div class="description e-content" data-pagefind-body=""> ...

This puts it in the metadata, but not in the index. As a result, nothing in the index contains sphinx, and Pagefind regresses your search term all the way back to s to try to find some result. (Possibly not the most helpful step, but Pagefind really likes giving some result over nothing).

Fix here is to chuck a data-pagefind-body on your h1 as well; having multiple data-pagefind-body attributes on the same page poses no issue.

This is a common pitfall, so the longer term fix for Pagefind is #532
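For anyone trying to work out why a given page shows up, one option is to query the Pagefind JS API directly and look at what comes back. The following is a minimal sketch, not an official debugging tool: `explainSearch` is a hypothetical helper name, and it assumes the `pagefind` module has been loaded from the site's bundle (the default `/pagefind/pagefind.js` path is an assumption about your setup).

```javascript
// Sketch: list what Pagefind returns for a term so odd matches can be inspected.
// `pagefind` is the module loaded from the site's bundle, for example:
//   const pagefind = await import("/pagefind/pagefind.js"); // path is an assumption
async function explainSearch(pagefind, term, limit = 5) {
  const search = await pagefind.search(term);
  const rows = [];
  for (const result of search.results.slice(0, limit)) {
    // Each result loads its page data on demand.
    const data = await result.data();
    rows.push({
      url: data.url,
      // Metadata such as the title lives here, separately from the indexed body.
      title: data.meta ? data.meta.title : undefined,
      // <mark> tags in the excerpt wrap the words that actually matched.
      excerpt: data.excerpt,
    });
  }
  return rows;
}
```

In a browser console on the search page, something like `console.table(await explainSearch(await import("/pagefind/pagefind.js"), "sphinx"))` should show which words were highlighted in each excerpt. If none of the highlighted words contain the full search term, that suggests the term was shortened as described above.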

2 replies

nhoizey Mar 27, 2024
Author

I understand and I just did what you recommend, and it's much better indeed! 🙏

I would prefer if the title didn't appear twice, once as title and the second in the beginning of the excerpt, but it's much better with accurate results.

rea1shane May 10, 2024

I would prefer if the title didn't appear twice, once as title and the second in the beginning of the excerpt

This would make the results neater. Is it possible to make that happen?

bobmonsour · 2025-12-13T21:33:06Z

After some wrangling, I have been able to remove the title from the excerpt. Note that in the following PagefindUI, I have chosen to use the highlightParam. As a result, the code here also does the work of removing the <mark> and </mark> tags from the excerpt, as they have to be removed to enable identifying the duplication. Hopefully, some will find this helpful. Note also that I am only processing sub_results here, not the main result.

document.addEventListener("DOMContentLoaded", () => {
  new PagefindUI({
    element: "#search",
    translations: {
      placeholder: "Enter search terms...",
      zero_results: "Could not find [SEARCH_TERM]",
    },
    excerptLength: 100,
    highlightParam: "highlight",
    resetStyles: true,
    pageSize: 5,
    showImages: false,
    showEmptyFilters: false,
    showSubResults: true,
    processResult: function (result) {
      if (result.sub_results && Array.isArray(result.sub_results)) {
        result.sub_results.forEach((subResult) => {
          // --- Remove Title from Excerpt ---
          const title = subResult.title;
          let excerpt = subResult.excerpt;

          // Remove all <mark> and </mark> tags from excerpt for comparison
          const cleanExcerpt = excerpt.replace(/<\/?mark>/g, "");

          // Check if cleaned excerpt starts with title
          if (cleanExcerpt.startsWith(title)) {
            // Find the position in the original excerpt where the title ends
            let charCount = 0;
            let position = 0;

            while (charCount < title.length && position < excerpt.length) {
              if (excerpt.substring(position).startsWith("<mark>")) {
                position += 6; // Skip "<mark>"
              } else if (excerpt.substring(position).startsWith("</mark>")) {
                position += 7; // Skip "</mark>"
              } else {
                charCount++;
                position++;
              }
            }

            // Remove the title portion and trim
            let newExcerpt = excerpt.substring(position).trim();

            // Drop a leading </mark> left behind when the title ended inside a
            // highlight; keep an opening <mark> so its closing tag stays paired
            newExcerpt = newExcerpt.replace(/^<\/mark>/, "");

            // Remove leading punctuation and whitespace
            newExcerpt = newExcerpt.replace(/^[.,;:!?\s]+/, "");

            if (newExcerpt.length > 0) {
              // Capitalize first letter (skip over any leading <mark> tag)
              const markMatch = newExcerpt.match(/^(<mark>)?(.)/);
              if (markMatch) {
                const prefix = markMatch[1] || "";
                const firstChar = markMatch[2].toUpperCase();
                newExcerpt =
                  prefix + firstChar + newExcerpt.slice(prefix.length + 1);
              }
              subResult.excerpt = newExcerpt;
            }
          }
        });
      }
      // Return the modified result object
      return result;
    },
  });
});
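The title-stripping step in the processResult callback above can also be pulled out into a standalone function, which makes it easier to try on sample excerpts. This is only a sketch of the same idea under a few assumptions: `stripTitleFromExcerpt` is a hypothetical name, highlights are assumed to use plain `<mark>` tags, and the capitalization step is left out to keep it short.

```javascript
// Hypothetical helper: drop a leading copy of `title` from a highlighted excerpt.
// Returns the excerpt unchanged when it does not start with the title.
function stripTitleFromExcerpt(title, excerpt) {
  const plain = excerpt.replace(/<\/?mark>/g, "");
  if (!title || !plain.startsWith(title)) return excerpt;

  // Walk the original excerpt, counting only visible characters, and track
  // whether we are inside a <mark> at the point where the title ends.
  let seen = 0;
  let pos = 0;
  let insideMark = false;
  while (seen < title.length && pos < excerpt.length) {
    if (excerpt.startsWith("<mark>", pos)) {
      insideMark = true;
      pos += 6;
    } else if (excerpt.startsWith("</mark>", pos)) {
      insideMark = false;
      pos += 7;
    } else {
      seen++;
      pos++;
    }
  }

  let rest = excerpt.slice(pos);
  if (insideMark) {
    // Keep tags balanced: drop the orphaned closing tag, or reopen the highlight.
    rest = rest.startsWith("</mark>") ? rest.slice(7) : "<mark>" + rest;
  }
  // Remove punctuation and whitespace that followed the title.
  rest = rest.replace(/^[.,;:!?\s]+/, "");
  // If nothing is left, fall back to the original excerpt.
  return rest.length > 0 ? rest : excerpt;
}

// Example with a made-up excerpt:
const sample = stripTitleFromExcerpt(
  "A sphinx moth in the making",
  "A <mark>sphinx</mark> moth in the making. The caterpillar of a <mark>sphinx</mark> moth."
);
console.log(sample);
```

Inside processResult, this could be applied to each entry of result.sub_results by assigning `subResult.excerpt = stripTitleFromExcerpt(subResult.title, subResult.excerpt)` before returning the result.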

0 replies
