Backport bugfixes for 8.0.7 #3133

Open
roshkhatri wants to merge 15 commits into valkey-io:8.0 from roshkhatri:8.0

Conversation


@roshkhatri roshkhatri commented Jan 29, 2026

- Includes bugfixes for 8.0.7
- Updates version.h
- Adds release notes for the new version


codecov bot commented Jan 30, 2026

Codecov Report

❌ Patch coverage is 79.31034% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.79%. Comparing base (1cac48f) to head (132b590).
⚠️ Report is 1 commit behind head on 8.0.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/object.c | 0.00% | 7 Missing ⚠️ |
| src/module.c | 0.00% | 3 Missing ⚠️ |
| src/rax.c | 91.30% | 2 Missing ⚠️ |
Additional details and impacted files
```
@@            Coverage Diff             @@
##              8.0    #3133      +/-   ##
==========================================
- Coverage   70.85%   70.79%   -0.06%     
==========================================
  Files         114      114              
  Lines       63096    61836    -1260     
==========================================
- Hits        44706    43778     -928     
+ Misses      18390    18058     -332     
```
| Files with missing lines | Coverage Δ |
|---|---|
| src/cli_common.c | 61.03% <100.00%> (-0.26%) ⬇️ |
| src/networking.c | 88.45% <100.00%> (-0.13%) ⬇️ |
| src/resp_parser.c | 98.47% <100.00%> (ø) |
| src/server.c | 89.04% <100.00%> (-0.09%) ⬇️ |
| src/t_list.c | 92.90% <100.00%> (+0.06%) ⬆️ |
| src/rax.c | 82.83% <91.30%> (+0.38%) ⬆️ |
| src/module.c | 9.61% <0.00%> (-0.03%) ⬇️ |
| src/object.c | 78.42% <0.00%> (+0.50%) ⬆️ |

... and 87 files with indirect coverage changes


knggk and others added 15 commits February 4, 2026 21:06
Introduce a `size_t` field into the rax struct to track allocation size.
Update the allocation size on rax inserts and deletes.
Return the allocation size when `raxAllocSize` is called.

This size tracking is now used in MEMORY USAGE and MEMORY STATS in place
of the previous sampling-based method.

The module API allows creating sorted dictionaries, which are backed by a
rax. Users now also get precise memory allocation reporting for them (through
`ValkeyModule_MallocSizeDict`).

Fixes valkey-io#677.
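
For context, here is a minimal sketch of the idea, assuming a counter in the rax header that every allocation helper adjusts. `raxAllocSize` is the entry point named above; the `alloc_size` field name and the `raxAlloc`/`raxDealloc` helpers are illustrative placeholders, not the actual Valkey code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified rax header: head/numele/numnodes mirror the real struct,
 * alloc_size is the new running total of bytes allocated for this rax. */
typedef struct rax {
    struct raxNode *head;
    uint64_t numele;
    uint64_t numnodes;
    size_t alloc_size;
} rax;

/* Grow the counter on insert/realloc paths. */
static void *raxAlloc(rax *r, size_t size) {
    void *p = malloc(size);
    if (p) r->alloc_size += size;
    return p;
}

/* Shrink it on delete paths. */
static void raxDealloc(rax *r, void *p, size_t size) {
    free(p);
    r->alloc_size -= size;
}

/* O(1) answer used by MEMORY USAGE / ValkeyModule_MallocSizeDict
 * instead of sampling. */
size_t raxAllocSize(rax *r) {
    return r->alloc_size;
}
```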

For the release notes:

* MEMORY USAGE and MEMORY STATS are now exact for streams, rather than
based on sampling.

---------

Signed-off-by: Guillaume Koenig <knggk@amazon.com>
Signed-off-by: Guillaume Koenig <106696198+knggk@users.noreply.github.com>
Co-authored-by: Joey <yzhaon@amazon.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Avoid tmpfs as fadvise(FADV_DONTNEED) has no effect on memory-backed
filesystems.

Fixes valkey-io#897
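
For context, a minimal sketch of the mechanism the test relies on, assuming the usual posix_fadvise() pattern; `reclaim_file_cache` is a hypothetical wrapper name, not Valkey's. POSIX_FADV_DONTNEED asks the kernel to drop cached pages for a file, which only makes sense on disk-backed filesystems; on tmpfs the file data lives in memory anyway, so the hint is a no-op.

```c
#define _POSIX_C_SOURCE 200112L
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical helper: ask the kernel to drop the page cache for the
 * whole file (offset 0, len 0 means "until end of file"). */
static int reclaim_file_cache(int fd) {
    int err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    if (err != 0) fprintf(stderr, "posix_fadvise: error %d\n", err);
    return err;
}
```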

---------

Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Signed-off-by: ranshid <88133677+ranshid@users.noreply.github.com>
Co-authored-by: ranshid <88133677+ranshid@users.noreply.github.com>
Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Resolves valkey-io#2267

Timed-out tests now get logged at the end of the test run.

```
!!! WARNING The following tests failed:

*** [TIMEOUT]: WAIT should not acknowledge 2 additional copies of the data in tests/unit/wait.tcl
Cleanup: may take some time... OK
```

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
)

This PR fixes the FreeBSD daily job that has been failing consistently
for the last few days with the error "pkg: No packages available to install
matching 'lang/tclx' have been found in the repositories".

The package name is corrected from `lang/tclx` to `lang/tclX`. The
lowercase version worked previously but appears to have stopped working
after an update of FreeBSD's pkg tool to 2.4.x.

Example of failed job:

https://github.com/valkey-io/valkey/actions/runs/19282092345/job/55135193499

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
GitHub has deprecated older macOS runners, and macos-13 is no longer supported.

1. The latest version of cross-platform-actions/action allows
running on ubuntu-latest (a Linux runner) and does not strictly require macOS.
2. Previously, cross-platform-actions/action@v0.22.0 used runs-on:
macos-13. I checked the latest version of cross-platform-actions, and
the official examples now use runs-on: ubuntu, so I think we can switch from macOS to Ubuntu.

---------

Signed-off-by: Vitah Lin <vitahlin@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
…-io#2983)

There is a crash in freeReplicationBacklog:
```
Discarding previously cached primary state.
ASSERTION FAILED
'listLength(server.replicas) == 0' is not true
freeReplicationBacklog
```

The reason is that during dual channel operation, the RDB channel is protected.
In the chained replica case, `disconnectReplicas` is called to disconnect all
replica clients, but since the RDB channel is protected, `freeClient` does not
actually free the replica client. Later, we encounter an assertion failure in
`freeReplicationBacklog`.
```
void replicationAttachToNewPrimary(void) {
    /* Replica starts to apply data from new primary, we must discard the cached
     * primary structure. */
    serverAssert(server.primary == NULL);
    replicationDiscardCachedPrimary();

    /* Cancel any in progress imports (we will now use the primary's) */
    clusterCleanSlotImportsOnFullSync();

    disconnectReplicas();     /* Force our replicas to resync with us as well. */
    freeReplicationBacklog(); /* Don't allow our chained replicas to PSYNC. */
}
```

Dual channel replication was introduced in valkey-io#60.
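
A toy model of the failure mode, under the assumption that freeing a protected client is deferred rather than immediate; the names and data structures are simplified for illustration and are not the actual Valkey code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for a replica client; the RDB channel is "protected". */
typedef struct {
    bool is_protected;
    bool queued_for_async_free;
} client;

/* Deferred free: a protected client stays on the replica list for now. */
static void free_client(client *c, int *replica_count) {
    if (c->is_protected) {
        c->queued_for_async_free = true;
        return;
    }
    (*replica_count)--;
}

int main(void) {
    int replicas = 1; /* the chained replica's RDB-channel client */
    client rdb_channel = {.is_protected = true, .queued_for_async_free = false};

    /* disconnectReplicas() analogue: tries to free every replica client. */
    free_client(&rdb_channel, &replicas);
    printf("replicas still registered: %d\n", replicas);

    /* freeReplicationBacklog() analogue: this assertion fires, mirroring
     * the reported 'listLength(server.replicas) == 0' failure. */
    assert(replicas == 0);
    return 0;
}
```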

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
…erhead (valkey-io#3005)

The metric `used_memory_dataset` turned into an insanely large number close to 2^64 (actually
an overflowed negative value), as reported in valkey-io#2994.

When the server starts, the global variable `server.initial_memory_usage` records a memory
baseline in InitServerLast. This `server.initial_memory_usage` already includes the initial database
memory, since the databases are created in initServer.

In getMemoryOverheadData, `mem_total` is first assigned the baseline, which includes the
initial database memory. Then the extra memory usage of every database is added to mem_total, so the
initial database memory is counted TWICE.

This eventually caused an incorrectly large `used_memory_overhead`. For a database with only a couple of keys,
`used_memory_overhead` easily ends up larger than `used_memory`, producing an overflowed `used_memory_dataset`.

In getMemoryOverheadData(), kvstores without any allocated hashtable are excluded from the calculation:
```c
if (db == NULL || !kvstoreNumAllocatedHashtables(db->keys)) continue;
```

However, even when the kvstore has no allocated hashtable, some memory is still allocated by kvstoreCreate(),
including `hashtable_size_index`, which can be larger than 128 KiB.

Conversely, this caused an incorrectly small `used_memory_overhead` for an empty database. When we insert only
ONE key into the database, the database is suddenly taken into account, and `used_memory_overhead` increases
(and `used_memory_dataset` decreases) by more than 128 KiB due to that single key insertion.
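
A compilable illustration of the double-counting arithmetic, using made-up numbers rather than real server output; the point is only that adding per-database overhead on top of a baseline that already contains the startup databases pushes `used_memory_overhead` past `used_memory`.

```c
#include <stdio.h>

int main(void) {
    /* Illustrative figures only. */
    long long initial_db_mem = 16LL * 128 * 1024;            /* DBs created in initServer */
    long long baseline = 2LL * 1024 * 1024 + initial_db_mem; /* server.initial_memory_usage */

    long long used_memory = baseline + 512 * 1024;           /* plus a couple of keys */

    /* Bug: mem_total starts from the baseline (which already includes the
     * databases) and then adds the database overhead again. */
    long long used_memory_overhead = baseline + initial_db_mem;

    long long used_memory_dataset = used_memory - used_memory_overhead;

    /* Negative here; reported through an unsigned field it shows up as a
     * value close to 2^64. */
    printf("used_memory_dataset = %lld\n", used_memory_dataset);
    return 0;
}
```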

Signed-off-by: Ace Breakpoint <chemistudio@gmail.com>
Signed-off-by: bpint <chemistudio@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Increased the wait time to a total of 10 seconds while checking the log
for the `Done loading RDB` message.

Fixes valkey-io#2694

CI run (100 times):
https://github.com/roshkhatri/valkey/actions/runs/18576201712/job/52961907806

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
…oved (valkey-io#2787)

There's an issue with the LTRIM command. When LTRIM does not actually
modify the key (for example, `LTRIM key 0 -1`), the `server.dirty`
counter is not updated because both the ltrim and rtrim values are 0. As a
result, the command is not propagated. However, `signalModifiedKey` is
still called regardless of whether `server.dirty` changes. This behavior
is unexpected and can cause a mismatch between the source and the target
during propagation, since the LTRIM command is not sent.
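
A minimal sketch of the intended behavior, assuming the fix gates the modified-key signal and the dirty increment on the trim actually removing elements; the helper names below are illustrative, not the t_list.c diff.

```c
#include <stdio.h>

static long long dirty = 0; /* stand-in for server.dirty */

static void signal_modified_key(const char *key) {
    printf("signalModifiedKey(%s)\n", key);
}

/* Apply LTRIM side effects only when something was actually removed, so an
 * effect-free LTRIM is neither signalled nor propagated. */
static void ltrim_effects(const char *key, long removed_left, long removed_right) {
    if (removed_left == 0 && removed_right == 0) return;
    signal_modified_key(key);
    dirty += removed_left + removed_right;
}

int main(void) {
    ltrim_effects("mylist", 0, 0); /* LTRIM mylist 0 -1: no signal, no dirty */
    ltrim_effects("mylist", 2, 1); /* real trim: signal + dirty += 3 */
    printf("dirty = %lld\n", dirty);
    return 0;
}
```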

Signed-off-by: Harry Lin <harrylhl@amazon.com>
Co-authored-by: Harry Lin <harrylhl@amazon.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
…delay between failover of each shard (valkey-io#2793)

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
…o#2874)

fedorarawhide CI reports these warnings:
```
networking.c: In function 'afterErrorReply':
networking.c:821:30: error: initialization discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
  821 |             char *spaceloc = memchr(s, ' ', len < 32 ? len : 32);
```
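
A sketch of the likely shape of the fix, keeping the const qualifier that newer glibc/GCC propagate through memchr(); the wrapper function here is hypothetical and only shows the declaration change.

```c
#include <string.h>

/* Hypothetical helper mirroring the warning site in afterErrorReply(). */
static size_t error_code_len(const char *s, size_t len) {
    /* const char * instead of char *, so no qualifier is discarded. */
    const char *spaceloc = memchr(s, ' ', len < 32 ? len : 32);
    return spaceloc ? (size_t)(spaceloc - s) : len;
}
```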

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
After valkey-io#3103, the time-sensitive `test-ubuntu-reclaim-cache` job started to fail
because startup now always includes 30ms of HW clock calibration, which is
why we get this output:

```
Run echo "test SAVE doesn't increase cache"
test SAVE doesn't increase cache
2460491776
Could not connect to Valkey at 127.0.0.1:8080: Connection refused
```

Added waits for the server to start; when run locally, this helps.

---------

Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
…-io#3151)

Closes valkey-io#3146

The following two test cases are flaky

- `evict clients only until below limit` - uses exact math expecting
exactly half the clients evicted
- `evict clients in right order (large to small)` - uses exact math
expecting specific clients evicted in order

It's fine to skip them under TLS because the core logic being tested
(client eviction) doesn't change between TLS and non-TLS.

The `decrease maxmemory-clients causes client eviction` test case could
potentially be flaky as well (it has not shown flakiness on CI yet), but
since it has a more tolerant assertion, `connected_clients > 0 &&
connected_clients < $client_count`, I think it's okay not to bother
skipping it.

Other test cases are not flaky because they use large thresholds or
check binary outcomes (yes/no eviction), not exact counts.

Signed-off-by: Zhijun <dszhijun@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
…te loop on macOS 15.4 (valkey-io#1940)

This PR fixes an issue in the CI test for client-output-buffer-limit,
which was causing an infinite loop when running on macOS 15.4.

### Problem

This test starts two clients, R and R1:
```
R1 subscribe foo
R publish foo bar
```

When R executes `PUBLISH foo bar`, the server first stores the message
`bar` in R1's buf. Only when the space in buf is insufficient does it
call `_addReplyProtoToList`.
Inside this function, `closeClientOnOutputBufferLimitReached` is invoked
to check whether client R1's output buffer has reached its
configured limit.
On macOS 15.4, because the server writes to the client at a high speed,
R1's buf never gets full. As a result,
`closeClientOnOutputBufferLimitReached` is never triggered in the test,
causing the test to never exit and fall into an infinite loop.

### Fixed

I changed `r publish foo bar` to `r publish foo [string repeat bar 50]`
to ensure the buffer is filled, which correctly reproduces the scenario
where omem increases.

Signed-off-by: vitah <vitahlin@gmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
@sarthakaggarwal97
Contributor

@enjoy-binbin @madolson should #1813 be backported to 8.0? It might have fixed #969.

@sarthakaggarwal97 sarthakaggarwal97 self-requested a review February 4, 2026 23:51