Commit 3d00693

Pg textsearch0.3.0 (#4635)
* chore: update glossary internal links.
* chore: pg_textsearce v0.3.0.
* chore: update for 0.4.0 release note
* Update pg-textsearch.md
  Signed-off-by: Iain Cox <iain@tigerdata.com>
* chore: latest features
* Remove parallel build
  Signed-off-by: Iain Cox <iain@tigerdata.com>
* chore: missing file .
* chore: missing file .
* chore: missing file .
---------
Signed-off-by: Iain Cox <iain@tigerdata.com>
1 parent da86d16 commit 3d00693

File tree

4 files changed: +56 -14 lines changed

_partials/_since_0_4_0.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+<Tag variant="hollow">Since [pg_textsearch v0.4.0](https://github.com/timescale/pg_textsearch/releases/tag/v0.4.0)</Tag>
File renamed without changes.

use-timescale/extensions/pg-textsearch.md

Lines changed: 54 additions & 10 deletions
@@ -8,14 +8,18 @@ products: [cloud, self_hosted]

 import EA1125 from "versionContent/_partials/_early_access_11_25.mdx";
 import SINCE010 from "versionContent/_partials/_since_0_1_0.mdx";
+import SINCE040 from "versionContent/_partials/_since_0_4_0.mdx";
 import IntegrationPrereqs from "versionContent/_partials/_integration-prereqs.mdx";

 # Optimize full text search with BM25

-$PG full-text search at scale consistently hits a wall where performance degrades catastrophically.
+$PG full-text search at scale consistently hits a wall where performance degrades catastrophically.
 $COMPANY's [pg_textsearch][pg_textsearch-github-repo] brings modern [BM25][bm25-wiki]-based full-text search directly into $PG,
-with a memtable architecture for efficient indexing and ranking. `pg_textsearch` integrates seamlessly with SQL and
-provides better search quality and performance than the $PG built-in full-text search.
+with a memtable architecture for efficient indexing and ranking. `pg_textsearch` integrates seamlessly with SQL and
+provides better search quality and performance than the $PG built-in full-text search. With Block-Max WAND optimization,
+`pg_textsearch` delivers up to **4x faster top-k queries** compared to native BM25 implementations. Advanced compression
+using delta encoding and bitpacking reduces index sizes by **41%** while improving query performance by 10-20% for
+shorter queries.

 BM25 scores in `pg_textsearch` are returned as negative values, where lower (more negative) numbers indicate better
 matches. `pg_textsearch` implements the following:
@@ -73,7 +77,7 @@ You have installed `pg_textsearch` on $CLOUD_LONG.

 ## Create BM25 indexes on your data

-BM25 indexes provide modern relevance ranking that outperforms $PG's built-in ts_rank functions by using corpus
+BM25 indexes provide modern relevance ranking that outperforms $PG's built-in ts_rank functions by using corpus
 statistics and better algorithmic design.

 To create a BM25 index with pg_textsearch:
@@ -109,21 +113,31 @@ To create a BM25 index with pg_textsearch:
 WITH (text_config='english');
 ```

-BM25 supports single-column indexes only.
+BM25 supports single-column indexes only. For optimal performance, load your data first, then create the index.

 </Procedure>

 You have created a BM25 index for full-text search.

 ## Optimize search queries for performance

-Use efficient query patterns to leverage BM25 ranking and optimize search performance.
+Use efficient query patterns to leverage BM25 ranking and optimize search performance. The `<@>` operator provides
+BM25-based ranking scores as negative values, where lower (more negative) scores indicate better matches. In `ORDER BY`
+clauses, the index is automatically detected from the column. For `WHERE` clause filtering, use `to_bm25query()` with
+an explicit index name.

 <Procedure>

 1. **Perform ranked searches using the distance operator**

 ```sql
+-- Simplified syntax: index is automatically detected in ORDER BY
+SELECT name, description, description <@> 'ergonomic work' as score
+FROM products
+ORDER BY score
+LIMIT 3;
+
+-- Alternative explicit syntax (works in all contexts)
 SELECT name, description, description <@> to_bm25query('ergonomic work', 'products_search_idx') as score
 FROM products
 ORDER BY score
@@ -142,6 +156,8 @@ Use efficient query patterns to leverage BM25 ranking and optimize search perfor

 1. **Filter results by score threshold**

+For filtering with WHERE clauses, use explicit index specification with `to_bm25query()`:
+
 ```sql
 SELECT name, description <@> to_bm25query('wireless', 'products_search_idx') as score
 FROM products
@@ -163,7 +179,7 @@ Use efficient query patterns to leverage BM25 ranking and optimize search perfor
 FROM products
 WHERE price < 500
 AND description <@> to_bm25query('ergonomic', 'products_search_idx') < -0.5
-ORDER BY description <@> to_bm25query('ergonomic', 'products_search_idx')
+ORDER BY score
 LIMIT 5;
 ```

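A note on the score convention used in the query hunks above: the changed text says `<@>` returns BM25 scores as negative values, so a plain ascending `ORDER BY score` puts the best match first and a filter such as `< -0.5` keeps only the stronger matches. The following is a minimal Python sketch of that convention only. It assumes a textbook BM25 formula with `k1 = 1.2` and `b = 0.75`, whitespace tokenization, and a made-up three-row corpus; none of these are taken from pg_textsearch, whose tokenization and scoring internals are not shown in this commit.

```python
import math
from collections import Counter


def negated_bm25(query, docs, k1=1.2, b=0.75):
    """Return one negated BM25 score per document (more negative = better match).

    Textbook BM25 with assumed parameters; not pg_textsearch's actual code.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(-score)  # negate so that ascending order ranks best first
    return scores


# Hypothetical rows standing in for products.description.
docs = [
    "ergonomic office chair for long work days",
    "wireless mouse with ergonomic grip",
    "stainless steel water bottle",
]
scores = negated_bm25("ergonomic work", docs)

# Equivalent of ORDER BY score (ascending): best match comes first.
ranked = sorted(zip(scores, docs))

# Equivalent of WHERE score < -0.5: keep only the stronger matches.
strong = [doc for score, doc in ranked if score < -0.5]

for score, doc in ranked:
    print(f"{score:8.3f}  {doc}")
```

In this toy corpus the first row matches both query terms and sorts first, the second row matches one term and falls short of the `-0.5` cutoff, and the third row matches nothing and scores zero. The absolute values depend entirely on the assumed parameters and corpus, so a threshold like `-0.5` has to be tuned against real data.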
@@ -342,17 +358,30 @@ Customize `pg_textsearch` behavior for your specific use case and data character
 threshold, it automatically flushes to a segment at transaction commit.

 ```sql
--- Set memtable spill threshold (default 800000 posting entries, ~8MB segments)
-SET pg_textsearch.memtable_spill_threshold = 1000000;
+-- Set memtable spill threshold (default 32000000 posting entries, ~1M docs/segment)
+SET pg_textsearch.memtable_spill_threshold = 32000000;

 -- Set bulk load spill threshold (default 100000 terms per transaction)
 SET pg_textsearch.bulk_load_threshold = 150000;

 -- Set default query limit when no LIMIT clause is present (default 1000)
 SET pg_textsearch.default_limit = 5000;
+
+-- Enable Block-Max WAND optimization for faster top-k queries (enabled by default)
+SET pg_textsearch.enable_bmw = true;
+
+-- Log block skip statistics for debugging query performance (disabled by default)
+SET pg_textsearch.log_bmw_stats = false;
 ```
 <SINCE010 />

+```sql
+-- Enable segment compression using delta encoding and bitpacking (enabled by default)
+-- Reduces index size by ~41% with 10-20% query performance improvement for shorter queries
+SET pg_textsearch.compress_segments = on;
+```
+<SINCE040 />
+
 1. **Configure language-specific text processing**

 You can create multiple BM25 indexes on the same column with different language configurations:
@@ -387,11 +416,26 @@ Customize `pg_textsearch` behavior for your specific use case and data character
 WHERE indexrelid::regclass::text ~ 'bm25';
 ```

-- View detailed index information
+- View index summary with corpus statistics and memory usage
+```sql
+SELECT bm25_summarize_index('products_search_idx');
+```
+
+- View detailed index structure (output is truncated for display)
 ```sql
 SELECT bm25_dump_index('products_search_idx');
 ```

+- Export full index dump to a file for detailed analysis
+```sql
+SELECT bm25_dump_index('products_search_idx', '/tmp/index_dump.txt');
+```
+
+- Force memtable spill to disk (useful for testing or memory management)
+```sql
+SELECT bm25_spill_index('products_search_idx');
+```
+
 </Procedure>

 You have configured `pg_textsearch` for optimal performance. For production applications, consider implementing result
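A note on the segment compression added in this file: the new text attributes the smaller index to delta encoding and bitpacking but does not describe the on-disk layout. The sketch below shows the general technique on a single posting list, assuming sorted integer document IDs, a separately stored first ID, and one fixed bit width per block. The function names and the layout are illustrative assumptions, not pg_textsearch's actual segment format.

```python
def delta_encode(doc_ids):
    """Split a sorted doc ID list into its first ID and the gaps that follow."""
    gaps = [b - a for a, b in zip(doc_ids, doc_ids[1:])]
    return doc_ids[0], gaps


def delta_decode(first, gaps):
    """Rebuild the doc IDs by accumulating the gaps onto the first ID."""
    doc_ids = [first]
    for gap in gaps:
        doc_ids.append(doc_ids[-1] + gap)
    return doc_ids


def bitpack(values):
    """Pack non-negative ints at one fixed bit width, the smallest that fits them all."""
    width = max(1, max((v.bit_length() for v in values), default=0))
    packed = 0
    for i, value in enumerate(values):
        packed |= value << (i * width)
    n_bytes = (len(values) * width + 7) // 8
    return width, packed.to_bytes(n_bytes, "little")


def bitunpack(width, data, count):
    """Read back `count` values of `width` bits each."""
    packed = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(packed >> (i * width)) & mask for i in range(count)]


# Hypothetical posting list: IDs of the documents that contain one term.
doc_ids = [1000, 1003, 1007, 1008, 1020, 1024]

first, gaps = delta_encode(doc_ids)   # the gaps are small even though the IDs are large
width, data = bitpack(gaps)           # so each gap needs only a few bits
restored = delta_decode(first, bitunpack(width, data, len(gaps)))

print(gaps, width, len(data), restored == doc_ids)
```

Here the five gaps fit in 4 bits each, so they pack into 3 bytes instead of one machine word per ID. The savings reported in the diff (about 41%) come from the project's own measurements and cannot be derived from a toy list like this one.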

use-timescale/schema-management/about-constraints.md

Lines changed: 1 addition & 4 deletions
@@ -38,8 +38,6 @@ CREATE TABLE conditions (
 );
 ```

-<CreateHypertablePolicyNote />
-
 This example also references values in another `locations` table using a foreign
 key constraint.

@@ -50,7 +48,6 @@ Time columns used for partitioning must not allow `NULL` values. A

 </Highlight>

-For more information on how to manage constraints, see the
-[$PG docs][postgres-createconstraint].
+For more information on how to manage constraints, see the [$PG docs][postgres-createconstraint].

 [postgres-createconstraint]: https://www.postgresql.org/docs/current/ddl-constraints.html
