normalize rank to be 0..1 from database searches by ilkka-ollakka · Pull Request #3555 · bookwyrm-social/bookwyrm

ilkka-ollakka · 2025-04-19T20:31:32Z

Description

This puts the local search ranks on a 0..1 scale, the same range we assume for connector confidence.

I'm not sure whether this is a desired change: it might rule out some results that were previously found, because the min_confidence given as a filtering criterion will most likely have more effect now.

The value 32 is documented in
https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-RANKING
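
For reference, a minimal sketch of the arithmetic behind normalization option 32 as the linked PostgreSQL documentation describes it: the rank is divided by itself plus 1. The helper name `normalize_rank` is hypothetical and only illustrates the formula; it is not code from this PR.

```python
def normalize_rank(rank: float) -> float:
    """Map a non-negative rank into [0, 1) as rank / (rank + 1).

    Mirrors the arithmetic documented for PostgreSQL normalization
    option 32. Hypothetical helper for illustration only.
    """
    if rank < 0:
        raise ValueError("rank must be non-negative")
    return rank / (rank + 1)


if __name__ == "__main__":
    # Print a few raw ranks next to their normalized values.
    for raw in (0.0, 0.5, 1.0, 10.0):
        print(f"{raw:>5} -> {normalize_rank(raw):.4f}")
```

Small raw ranks stay close to their original value (0.5 becomes about 0.33), while large ones approach but never reach 1.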

Related Issue #
Closes #

What type of Pull Request is this?

Bug Fix
Enhancement
Plumbing / Internals / Dependencies
Refactor

Does this PR change settings or dependencies, or break something?

This PR changes or adds default settings, configuration, or .env values
This PR changes or adds dependencies
This PR introduces other breaking changes

Details of breaking or configuration changes (if any of above checked)

Documentation

New or amended documentation will be required if this PR is merged
I have created a matching pull request in the Documentation repository
I intend to create a matching pull request in the Documentation repository after this PR is merged

Tests

My changes do not need new tests
All tests I have added are passing
I have written tests but need help to make them pass
I have not written tests and need help to write them

ilkka-ollakka · 2025-04-20T10:25:53Z

pytest seems to fail on the importer job checks. This is most likely a timing issue: some other item was in the queue at the same time, so the index was not the first one.

ilkka-ollakka · 2025-04-26T15:18:31Z

It seems that I'm unable to reproduce the issue locally :/

mouse-reeve · 2025-04-26T15:21:04Z

I don't think this issue is related to your code; it seems like an intermittent test failure.

ilkka-ollakka · 2025-04-26T17:59:19Z

I added a commit that makes the GitHub Action rerun failed tests if any are found, just to rule out timing issues in the future, and of course now things don't fail anymore ;)

ilkka-ollakka · 2025-04-26T18:06:46Z

I can split the workflow commit into a separate PR if anyone finds that useful and doesn't want to review the normalization changes yet.

mouse-reeve · 2025-04-26T18:18:41Z

A separate commit would be great, I can also re-run the workflows if that would be helpful

ilkka-ollakka · 2025-04-26T18:33:23Z

I split it out into #3559 and I'll remove the commit from this PR.

mouse-reeve · 2025-04-26T18:35:38Z

This seems like a sensible change -- my understanding from reading the docs is that it will keep the search ranking from prioritizing long titles/author strings over ones that are closer matches, is that right? I've been trying this locally but haven't figured out any good combinations of books and queries to see the differences. Do you have suggestions?

ilkka-ollakka · 2025-04-26T18:48:09Z

This mainly scales the match confidence values so that they are always under 1: a really high score (10000, for example) gets a confidence of 0.9999, and a score of 10 gets a confidence of 0.9090... . So it shouldn't change the search ranking/ordering; it just scales the values to a known range.

I haven't looked extensively for good examples yet, but I can check whether I can spot any.

I mainly did this so that min_confidence would have an effect on local searches too, since the queries already have the filtering in place.
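
A rough sketch of both points, assuming the rank / (rank + 1) scaling from the PostgreSQL docs: the ordering is preserved, and a min_confidence threshold now has something meaningful to compare against. The helpers `normalize_rank` and `filter_by_confidence`, the sample titles, the raw scores, and the 0.1 threshold are all invented for illustration and are not BookWyrm code.

```python
def normalize_rank(rank: float) -> float:
    """Scale a non-negative rank into [0, 1) as rank / (rank + 1)."""
    return rank / (rank + 1)


def filter_by_confidence(results, min_confidence):
    """Return (title, confidence) pairs with confidence >= min_confidence.

    `results` is a list of (title, raw_rank) pairs; input order is kept.
    """
    scored = [(title, normalize_rank(raw)) for title, raw in results]
    return [(title, conf) for title, conf in scored if conf >= min_confidence]


# Invented sample data, already sorted by raw rank in descending order.
raw_results = [("Book A", 10000.0), ("Book B", 10.0), ("Book C", 0.05)]

# The scaling is monotonic, so the descending order survives normalization.
normalized = [normalize_rank(raw) for _, raw in raw_results]
assert normalized == sorted(normalized, reverse=True)

# With a threshold of 0.1, the weakest match (0.05 -> about 0.048) is dropped.
kept = filter_by_confidence(raw_results, min_confidence=0.1)
for title, conf in kept:
    print(f"{title}: {conf:.4f}")
```

This also shows the caveat from the description: a result with a low raw rank that was previously returned can now fall below the threshold.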

mouse-reeve · 2025-04-26T18:54:50Z

I see! That makes more sense -- I fully support normalizing the rank values.

ilkka-ollakka added 2 commits April 26, 2025 21:33

SearchRank: normalize rank to be 0..1

28976c2

This puts the local search ranks on a 0..1 scale, the same range we assume for connector confidence. The value 32 is documented in https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-RANKING

update openreads test not to check absolute index value

aa25564

The index value is not strictly required to be 0; what matters more is that the titles are correct.

ilkka-ollakka force-pushed the tweak/normalize_search_rank branch from 7424a59 to aa25564 April 26, 2025 18:33

mouse-reeve merged commit 0627abe into bookwyrm-social:main Apr 26, 2025
10 checks passed

ilkka-ollakka deleted the tweak/normalize_search_rank branch April 26, 2025 18:59

hughrun added the "plumbing" label (PR for internal processes or background jobs) Aug 22, 2025
