
[feat](query_v2) Add PrefixQuery, PhrasePrefixQuery and UnionPostings support #60701

Open
zzzxl1993 wants to merge 1 commit into apache:master from zzzxl1993:202602121310


@zzzxl1993
Contributor

@zzzxl1993 zzzxl1993 commented Feb 12, 2026

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Feb 12, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@zzzxl1993
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 29932 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1ba423e7617c9b1c9ed00d1a8046193921f80246, data reload: false

------ Round 1 ----------------------------------
q1	17613	4489	4289	4289
q2	2050	353	238	238
q3	10190	1272	712	712
q4	10196	771	309	309
q5	7553	2186	1887	1887
q6	193	178	146	146
q7	870	743	605	605
q8	9268	1414	1093	1093
q9	4690	4610	4591	4591
q10	6788	1943	1579	1579
q11	480	254	227	227
q12	329	378	228	228
q13	17803	4045	3206	3206
q14	236	254	220	220
q15	901	819	808	808
q16	675	680	618	618
q17	680	831	500	500
q18	6475	5842	5657	5657
q19	1284	983	585	585
q20	516	497	404	404
q21	2493	1787	1817	1787
q22	342	288	243	243
Total cold run time: 101625 ms
Total hot run time: 29932 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4400	4368	4365	4365
q2	268	347	244	244
q3	2129	2726	2292	2292
q4	1410	1772	1316	1316
q5	4392	4233	4236	4233
q6	208	177	134	134
q7	1861	1808	1669	1669
q8	2438	2657	2478	2478
q9	7646	7623	7385	7385
q10	2945	3001	2596	2596
q11	512	444	441	441
q12	741	724	602	602
q13	3896	4349	3574	3574
q14	352	377	298	298
q15	851	810	778	778
q16	662	719	696	696
q17	1182	1438	1528	1438
q18	8189	7821	7891	7821
q19	890	903	877	877
q20	2051	2179	2015	2015
q21	5177	4411	4307	4307
q22	525	466	434	434
Total cold run time: 52725 ms
Total hot run time: 49993 ms

@doris-robot

TPC-DS: Total hot run time: 190898 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1ba423e7617c9b1c9ed00d1a8046193921f80246, data reload: false

query5	4337	631	506	506
query6	334	220	198	198
query7	4235	469	261	261
query8	332	239	233	233
query9	8717	2762	2703	2703
query10	460	367	342	342
query11	16516	16528	16252	16252
query12	177	119	117	117
query13	1244	460	335	335
query14	6198	3169	2909	2909
query14_1	2760	2746	2731	2731
query15	197	191	171	171
query16	982	495	454	454
query17	1111	720	586	586
query18	2466	449	359	359
query19	213	207	180	180
query20	136	129	123	123
query21	221	146	133	133
query22	4801	5025	4960	4960
query23	19250	18570	18513	18513
query23_1	18314	18422	18432	18422
query24	7160	1596	1230	1230
query24_1	1240	1222	1234	1222
query25	559	468	423	423
query26	1130	274	152	152
query27	2741	461	293	293
query28	4525	1854	1846	1846
query29	792	554	463	463
query30	316	261	223	223
query31	825	777	658	658
query32	89	74	77	74
query33	532	359	301	301
query34	907	940	565	565
query35	644	676	609	609
query36	1126	1187	996	996
query37	137	102	89	89
query38	2958	2924	2809	2809
query39	948	897	908	897
query39_1	873	880	886	880
query40	223	141	124	124
query41	73	72	67	67
query42	105	102	100	100
query43	423	414	427	414
query44	1313	721	729	721
query45	196	193	185	185
query46	897	978	614	614
query47	2150	2171	2125	2125
query48	329	326	238	238
query49	621	455	366	366
query50	687	280	215	215
query51	4090	4129	4073	4073
query52	105	106	98	98
query53	298	335	275	275
query54	310	287	280	280
query55	94	86	80	80
query56	316	329	345	329
query57	1417	1390	1292	1292
query58	283	274	274	274
query59	2086	2127	1926	1926
query60	362	381	318	318
query61	143	143	141	141
query62	598	567	515	515
query63	296	263	266	263
query64	4672	1243	937	937
query65	4577	4468	4418	4418
query66	1380	449	333	333
query67	16262	16536	16325	16325
query68	2526	1077	698	698
query69	408	307	279	279
query70	1012	937	961	937
query71	321	318	292	292
query72	2941	2794	2487	2487
query73	529	552	322	322
query74	9742	9657	9545	9545
query75	2767	2755	2425	2425
query76	2288	1057	657	657
query77	366	367	312	312
query78	11359	11393	10837	10837
query79	2251	920	600	600
query80	1684	556	494	494
query81	561	275	258	258
query82	977	148	111	111
query83	319	256	233	233
query84	246	128	100	100
query85	921	477	419	419
query86	427	311	310	310
query87	3104	3072	2975	2975
query88	3522	2644	2625	2625
query89	382	354	335	335
query90	1829	178	171	171
query91	160	161	133	133
query92	73	80	73	73
query93	1102	866	476	476
query94	671	320	286	286
query95	598	330	380	330
query96	647	515	233	233
query97	2449	2511	2407	2407
query98	226	216	216	216
query99	947	938	862	862
Total cold run time: 263457 ms
Total hot run time: 190898 ms

@doris-robot

ClickBench: Total hot run time: 28.51 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1ba423e7617c9b1c9ed00d1a8046193921f80246, data reload: false

query1	0.06	0.05	0.05
query2	0.09	0.04	0.04
query3	0.25	0.08	0.09
query4	1.60	0.12	0.11
query5	0.28	0.26	0.24
query6	1.16	0.68	0.68
query7	0.03	0.03	0.02
query8	0.05	0.04	0.04
query9	0.56	0.51	0.50
query10	0.55	0.55	0.55
query11	0.14	0.09	0.10
query12	0.14	0.10	0.10
query13	0.62	0.62	0.62
query14	1.08	1.06	1.06
query15	0.89	0.88	0.87
query16	0.39	0.39	0.38
query17	1.14	1.16	1.19
query18	0.24	0.21	0.20
query19	2.04	1.99	1.98
query20	0.02	0.01	0.01
query21	15.40	0.28	0.15
query22	5.23	0.07	0.06
query23	16.04	0.29	0.11
query24	2.65	0.37	0.59
query25	0.09	0.07	0.08
query26	0.15	0.14	0.14
query27	0.06	0.06	0.06
query28	4.36	1.13	0.98
query29	12.56	3.95	3.14
query30	0.29	0.13	0.12
query31	2.81	0.64	0.40
query32	3.24	0.60	0.49
query33	3.17	3.24	3.23
query34	16.00	5.39	4.72
query35	4.82	4.87	4.82
query36	0.66	0.49	0.48
query37	0.11	0.07	0.07
query38	0.08	0.05	0.04
query39	0.05	0.04	0.04
query40	0.19	0.17	0.15
query41	0.09	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 99.46 s
Total hot run time: 28.51 s

@zzzxl1993
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 30923 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1757372a12e3b87c06c1710a6ef32e4f2c563014, data reload: false

------ Round 1 ----------------------------------
q1	17687	4518	4335	4335
q2	2054	361	235	235
q3	10141	1281	728	728
q4	10191	783	310	310
q5	7487	2217	1930	1930
q6	202	174	146	146
q7	920	751	600	600
q8	9279	1378	1148	1148
q9	5022	4706	4708	4706
q10	6848	1949	1553	1553
q11	459	268	246	246
q12	410	379	228	228
q13	17805	4082	3197	3197
q14	242	236	216	216
q15	872	802	814	802
q16	699	677	627	627
q17	695	827	539	539
q18	6562	6023	6360	6023
q19	1358	1097	708	708
q20	569	528	461	461
q21	2805	2040	1929	1929
q22	369	319	256	256
Total cold run time: 102676 ms
Total hot run time: 30923 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4614	4558	4589	4558
q2	263	329	262	262
q3	2299	2980	2493	2493
q4	1557	1856	1394	1394
q5	4765	4742	4699	4699
q6	231	173	132	132
q7	1911	1952	1844	1844
q8	2598	2371	2365	2365
q9	7486	7727	7429	7429
q10	2847	2873	2421	2421
q11	495	416	410	410
q12	685	695	570	570
q13	3575	4089	3198	3198
q14	281	283	276	276
q15	821	783	780	780
q16	631	673	656	656
q17	1079	1235	1288	1235
q18	7408	7227	7378	7227
q19	829	795	828	795
q20	1999	2073	1928	1928
q21	4416	4228	4076	4076
q22	533	464	426	426
Total cold run time: 51323 ms
Total hot run time: 49174 ms

@doris-robot

TPC-DS: Total hot run time: 188542 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1757372a12e3b87c06c1710a6ef32e4f2c563014, data reload: false

query5	4904	617	488	488
query6	336	222	198	198
query7	4219	459	263	263
query8	350	248	256	248
query9	8701	2690	2676	2676
query10	525	389	335	335
query11	17228	17031	16825	16825
query12	185	124	122	122
query13	1252	441	329	329
query14	6322	3197	2920	2920
query14_1	2778	2786	2800	2786
query15	199	191	173	173
query16	987	492	453	453
query17	1068	694	597	597
query18	2531	456	326	326
query19	213	198	178	178
query20	140	129	127	127
query21	223	140	120	120
query22	4875	5146	4815	4815
query23	17292	16918	16636	16636
query23_1	16833	16791	16750	16750
query24	7118	1619	1224	1224
query24_1	1220	1235	1242	1235
query25	532	437	389	389
query26	1231	270	152	152
query27	2761	470	295	295
query28	4545	1864	1874	1864
query29	794	542	447	447
query30	314	241	217	217
query31	898	736	621	621
query32	87	83	76	76
query33	522	339	302	302
query34	935	927	578	578
query35	658	682	614	614
query36	1073	1145	965	965
query37	155	102	93	93
query38	2913	2887	2901	2887
query39	886	901	836	836
query39_1	795	841	797	797
query40	227	142	125	125
query41	75	71	67	67
query42	113	111	109	109
query43	391	383	351	351
query44	1315	713	720	713
query45	203	194	185	185
query46	882	986	620	620
query47	2129	2141	2083	2083
query48	318	314	239	239
query49	627	447	373	373
query50	670	284	222	222
query51	4180	4144	4034	4034
query52	109	110	101	101
query53	306	344	286	286
query54	316	277	276	276
query55	90	85	80	80
query56	316	332	339	332
query57	1399	1349	1252	1252
query58	299	293	285	285
query59	2606	2634	2554	2554
query60	357	360	338	338
query61	176	176	178	176
query62	618	579	548	548
query63	310	276	283	276
query64	5004	1337	1060	1060
query65	4629	4551	4525	4525
query66	1445	479	384	384
query67	16412	16492	16283	16283
query68	2340	1128	704	704
query69	404	312	286	286
query70	1066	980	985	980
query71	335	311	294	294
query72	2905	2733	2290	2290
query73	515	538	319	319
query74	9616	9539	9307	9307
query75	2857	2758	2446	2446
query76	2315	1070	646	646
query77	361	381	303	303
query78	11033	11147	10469	10469
query79	1077	935	590	590
query80	1266	589	517	517
query81	552	275	265	265
query82	969	156	115	115
query83	340	261	242	242
query84	252	127	99	99
query85	884	480	415	415
query86	412	299	277	277
query87	3128	3095	2964	2964
query88	3537	2671	2642	2642
query89	430	367	356	356
query90	1941	171	172	171
query91	160	159	130	130
query92	79	77	71	71
query93	934	847	485	485
query94	634	321	280	280
query95	591	348	371	348
query96	627	507	223	223
query97	2496	2481	2418	2418
query98	219	216	214	214
query99	995	989	936	936
Total cold run time: 261670 ms
Total hot run time: 188542 ms

@doris-robot

ClickBench: Total hot run time: 28.34 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1757372a12e3b87c06c1710a6ef32e4f2c563014, data reload: false

query1	0.06	0.05	0.05
query2	0.10	0.05	0.05
query3	0.25	0.08	0.09
query4	1.61	0.12	0.11
query5	0.28	0.24	0.25
query6	1.17	0.68	0.69
query7	0.04	0.02	0.03
query8	0.05	0.05	0.04
query9	0.57	0.51	0.50
query10	0.54	0.56	0.55
query11	0.14	0.09	0.09
query12	0.13	0.10	0.10
query13	0.63	0.62	0.62
query14	1.07	1.07	1.06
query15	0.87	0.86	0.88
query16	0.40	0.39	0.41
query17	1.09	1.16	1.12
query18	0.22	0.21	0.21
query19	2.10	1.99	2.03
query20	0.02	0.02	0.01
query21	15.42	0.25	0.15
query22	4.96	0.06	0.05
query23	15.76	0.28	0.11
query24	2.38	0.24	0.54
query25	0.10	0.11	0.07
query26	0.14	0.13	0.14
query27	0.05	0.11	0.08
query28	4.60	1.11	0.97
query29	12.55	3.88	3.13
query30	0.28	0.13	0.11
query31	2.82	0.64	0.41
query32	3.25	0.59	0.49
query33	3.26	3.22	3.20
query34	16.31	5.39	4.76
query35	4.81	4.84	4.82
query36	0.65	0.51	0.48
query37	0.10	0.07	0.06
query38	0.07	0.04	0.04
query39	0.05	0.03	0.03
query40	0.21	0.17	0.15
query41	0.08	0.03	0.03
query42	0.04	0.03	0.02
query43	0.04	0.04	0.04
Total cold run time: 99.27 s
Total hot run time: 28.34 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 95.05% (269/283) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.73% (19516/37010)
Line Coverage 36.29% (181993/501528)
Region Coverage 32.62% (141052/432349)
Branch Coverage 33.66% (61165/181731)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 73.91% (17/23) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.77% (26011/36241)
Line Coverage 54.43% (272188/500031)
Region Coverage 51.67% (225548/436513)
Branch Coverage 53.22% (97026/182317)

@zzzxl1993
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 28990 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b3d60dfaa1b3a3c0513ef2f7e60197e03ac54e3c, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17636	4503	4302	4302
q2
q3	10666	766	516	516
q4	4680	350	250	250
q5	7553	1191	1024	1024
q6	183	181	151	151
q7	788	854	660	660
q8	9296	1492	1393	1393
q9	4835	4733	4737	4733
q10	6791	1871	1661	1661
q11	473	270	246	246
q12	708	563	463	463
q13	17778	4204	3400	3400
q14	235	248	223	223
q15	913	799	786	786
q16	747	721	682	682
q17	722	866	458	458
q18	6032	5401	5192	5192
q19	1256	1004	668	668
q20	525	503	388	388
q21	4878	2039	1515	1515
q22	421	327	279	279
Total cold run time: 97116 ms
Total hot run time: 28990 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4695	4565	4579	4565
q2
q3	1787	2255	1759	1759
q4	921	1214	762	762
q5	4118	4404	4385	4385
q6	189	178	142	142
q7	1784	1635	1547	1547
q8	2518	2913	2563	2563
q9	7579	7532	7360	7360
q10	2651	2849	2396	2396
q11	547	458	438	438
q12	509	597	462	462
q13	4061	4417	3614	3614
q14	286	297	285	285
q15	832	779	816	779
q16	701	760	694	694
q17	1160	1518	1395	1395
q18	7210	6890	6588	6588
q19	973	956	919	919
q20	2134	2164	2042	2042
q21	4205	3451	3419	3419
q22	522	449	394	394
Total cold run time: 49382 ms
Total hot run time: 46508 ms

@doris-robot

TPC-DS: Total hot run time: 185779 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b3d60dfaa1b3a3c0513ef2f7e60197e03ac54e3c, data reload: false

query5	4777	650	511	511
query6	352	242	207	207
query7	4235	478	274	274
query8	335	247	234	234
query9	8746	2718	2701	2701
query10	514	386	342	342
query11	17298	17117	16853	16853
query12	170	124	121	121
query13	1256	460	336	336
query14	6413	3307	2957	2957
query14_1	2788	2787	2829	2787
query15	209	191	176	176
query16	1004	481	465	465
query17	1080	739	618	618
query18	2607	453	350	350
query19	220	206	186	186
query20	143	133	128	128
query21	230	148	122	122
query22	5699	5624	5644	5624
query23	17736	17418	17140	17140
query23_1	17030	17049	16789	16789
query24	7130	1650	1252	1252
query24_1	1239	1264	1241	1241
query25	569	486	427	427
query26	1239	268	159	159
query27	2767	486	295	295
query28	4496	1866	1869	1866
query29	824	592	508	508
query30	328	254	212	212
query31	909	731	654	654
query32	88	75	75	75
query33	518	345	295	295
query34	933	932	572	572
query35	644	686	606	606
query36	1100	1106	1019	1019
query37	144	97	89	89
query38	2969	3089	2924	2924
query39	854	847	806	806
query39_1	848	801	808	801
query40	235	158	140	140
query41	69	65	64	64
query42	109	102	106	102
query43	410	395	364	364
query44	
query45	193	190	186	186
query46	931	998	612	612
query47	2150	2201	2112	2112
query48	305	328	229	229
query49	635	467	387	387
query50	700	287	214	214
query51	4122	4195	4159	4159
query52	105	111	100	100
query53	288	343	291	291
query54	312	270	278	270
query55	88	80	78	78
query56	300	309	309	309
query57	1379	1329	1273	1273
query58	282	279	271	271
query59	2494	2657	2501	2501
query60	338	346	313	313
query61	148	144	147	144
query62	633	593	558	558
query63	310	280	274	274
query64	4894	1292	981	981
query65	
query66	1437	465	349	349
query67	16511	16571	16509	16509
query68	
query69	407	308	279	279
query70	1002	1016	1017	1016
query71	340	309	304	304
query72	2764	2679	2391	2391
query73	542	575	323	323
query74	9679	9571	9427	9427
query75	2890	2753	2454	2454
query76	2318	1046	696	696
query77	401	399	308	308
query78	11759	11918	11147	11147
query79	1829	820	629	629
query80	1346	621	533	533
query81	571	282	263	263
query82	1040	152	112	112
query83	343	269	247	247
query84	249	122	106	106
query85	927	487	432	432
query86	415	320	295	295
query87	3122	3148	3041	3041
query88	3635	2693	2693	2693
query89	419	385	349	349
query90	2017	177	177	177
query91	172	160	134	134
query92	80	74	76	74
query93	1051	844	505	505
query94	632	330	307	307
query95	603	402	316	316
query96	666	545	235	235
query97	2472	2524	2413	2413
query98	237	225	217	217
query99	1035	985	925	925
Total cold run time: 256624 ms
Total hot run time: 185779 ms

@doris-robot

BE UT Coverage Report

Increment line coverage 96.92% (252/260) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.72% (19512/37009)
Line Coverage 36.26% (181871/501525)
Region Coverage 32.59% (140897/432346)
Branch Coverage 33.65% (61153/181728)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100% (0/0) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.85% (26038/36240)
Line Coverage 54.47% (272382/500028)
Region Coverage 51.76% (225921/436510)
Branch Coverage 53.35% (97262/182314)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100% (0/0) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.85% (26038/36240)
Line Coverage 54.47% (272378/500028)
Region Coverage 51.75% (225891/436510)
Branch Coverage 53.35% (97260/182314)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100% (0/0) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.53% (26646/36240)
Line Coverage 56.66% (283339/500028)
Region Coverage 54.09% (236087/436510)
Branch Coverage 55.93% (101971/182314)

}

auto bit_set = std::make_shared<BitSetScorer>(doc_bitset);
auto const_score = std::make_shared<ConstScoreScorer<BitSetScorerPtr>>(std::move(bit_set));
Member

Why use ConstScoreScorer here? Is scoring not supported?

Contributor Author

Prefix queries don't support scoring in the traditional sense. They expand to multiple terms (e.g., "pre" matches "prefix", "prepare", "present", etc.), and scoring each expanded term individually doesn't make semantic sense - the user is searching for a prefix pattern, not independent terms. That's why I use ConstScoreScorer to give all matching documents the same constant score. This is standard for prefix/fuzzy/wildcard queries.
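
To illustrate the reply above, here is a minimal sketch of prefix expansion with a constant score. It is not the code in this PR: `ToyIndex`, `ScoredDoc`, and `prefix_const_score` are hypothetical names, and a `std::set` stands in for the doc bitset that the real `BitSetScorer` wraps.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy index: term -> sorted doc IDs. Not the Doris data structures.
using ToyIndex = std::map<std::string, std::vector<uint32_t>>;

struct ScoredDoc {
    uint32_t doc_id;
    float score;
};

// Collect every document that contains at least one term starting with `prefix`
// and give each match the same constant score.
std::vector<ScoredDoc> prefix_const_score(const ToyIndex& index, const std::string& prefix,
                                          float constant_score = 1.0F) {
    std::set<uint32_t> matched; // plays the role of the doc bitset
    // Terms are sorted, so all terms sharing the prefix form one contiguous range.
    for (auto it = index.lower_bound(prefix); it != index.end(); ++it) {
        if (it->first.compare(0, prefix.size(), prefix) != 0) {
            break; // left the prefix range
        }
        matched.insert(it->second.begin(), it->second.end());
    }

    std::vector<ScoredDoc> result;
    result.reserve(matched.size());
    for (uint32_t doc : matched) {
        result.push_back({doc, constant_score}); // no per-term scoring
    }
    return result;
}
```

In this sketch a document that matches both "prefix" and "prepare" appears once and scores the same as a document that matches only "present", which is the behavior the reply describes.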

}
}

uint32_t advance() override {
Member

How about adding a priority-queue (min-heap) optimization for advance()?

Contributor Author

I considered using a priority queue. Based on our dataset testing, the current linear scan performs well for typical query patterns. The simplicity and cache-friendly nature of linear scan outweigh the theoretical O(log n) advantage of heaps in practice. If profiling shows this becomes a bottleneck in the future, we can optimize with a hybrid approach (e.g., use heap when n > threshold).
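
For readers following this thread, the linear-scan approach can be sketched roughly as below. This is an assumption-laden illustration, not the `UnionPostings` class from the PR: `UnionPostingsSketch`, `kTerminated`, and `collect_all` are hypothetical names, and each sub-list is assumed to hold strictly increasing doc IDs.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical sentinel meaning "no more documents".
constexpr uint32_t kTerminated = std::numeric_limits<uint32_t>::max();

class UnionPostingsSketch {
public:
    // Each inner vector is assumed to be sorted with strictly increasing doc IDs.
    explicit UnionPostingsSketch(std::vector<std::vector<uint32_t>> lists)
            : _lists(std::move(lists)), _cursors(_lists.size(), 0) {
        _doc = find_min();
    }

    uint32_t doc() const { return _doc; }

    // Linear scan: step every sub-list positioned on the current doc,
    // then rescan all sub-lists for the new minimum. Cost is O(k) per call.
    uint32_t advance() {
        if (_doc == kTerminated) {
            return kTerminated;
        }
        for (size_t i = 0; i < _lists.size(); ++i) {
            if (current(i) == _doc) {
                ++_cursors[i];
            }
        }
        _doc = find_min();
        return _doc;
    }

private:
    uint32_t current(size_t i) const {
        return _cursors[i] < _lists[i].size() ? _lists[i][_cursors[i]] : kTerminated;
    }

    uint32_t find_min() const {
        uint32_t best = kTerminated;
        for (size_t i = 0; i < _lists.size(); ++i) {
            best = std::min(best, current(i));
        }
        return best;
    }

    std::vector<std::vector<uint32_t>> _lists;
    std::vector<size_t> _cursors;
    uint32_t _doc = kTerminated;
};

// Drain the union into a vector of distinct doc IDs in ascending order.
std::vector<uint32_t> collect_all(UnionPostingsSketch& postings) {
    std::vector<uint32_t> out;
    for (uint32_t d = postings.doc(); d != kTerminated; d = postings.advance()) {
        out.push_back(d);
    }
    return out;
}
```

A min-heap variant would replace `find_min()` with a heap keyed on each sub-list's current doc, paying O(log k) per sub-list that moves instead of O(k) per call. For small k the scan touches a few contiguous cursors, which is the trade-off the reply refers to.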

}

// Only prefix term, no phrase terms — fall back to a plain prefix query.
PrefixQuery prefix_query(_context, std::move(_field), std::move(_prefix.value().second));
Member

Why std::move the inner members?

Contributor Author

The std::move is intentional here - both code paths in weight() (line 59 and line 80) consume these members and return immediately, so the object won't be used after this call. This avoids unnecessary copies of the string data. But I can change to copy if you think it's clearer.
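
One way to make this "consumed exactly once" contract visible in the type system is an rvalue-qualified member function. The sketch below is only an illustration of that idea under assumed names (`PhrasePrefixSketch`, `PrefixQuerySketch`, `into_prefix_query`); it does not reproduce the signature of `weight()` in this PR.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the query object built from the moved members.
struct PrefixQuerySketch {
    std::string field;
    std::string prefix;
};

class PhrasePrefixSketch {
public:
    PhrasePrefixSketch(std::string field, std::string prefix)
            : _field(std::move(field)), _prefix(std::move(prefix)) {}

    // The trailing && means this can only be called on an object the caller
    // is giving up, so moving out of the members cannot surprise later code.
    PrefixQuerySketch into_prefix_query() && {
        return PrefixQuerySketch{std::move(_field), std::move(_prefix)};
    }

private:
    std::string _field;
    std::string _prefix;
};
```

A caller would write `std::move(query).into_prefix_query()`, which documents at the call site that `query` should not be used afterwards. After the call the members are left in a valid but unspecified state, the same as with the plain `std::move` discussed above.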

Member

@airborne12 airborne12 left a comment

LGTM

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) Feb 15, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.


Labels

approved (Indicates a PR has been approved by one committer.), reviewed


5 participants