https://github.com/llvm/llvm-project/blob/114f3b530bae5a195234e3bd1c1328b38b39a000/libc/src/string/memory_utils/aarch64/inline_memset.h#L58 Similar to `memcpy`, `memset` also has the initial-branches pattern. It would be interesting to check whether predicated SVE can speed up that initial part.
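
To make the idea concrete, here is a minimal sketch of what a predicated head could look like. It is not the LLVM libc implementation: `memset_head_sketch` is a hypothetical name, and the 16-byte scalar fallback only exists so the snippet builds on targets without SVE. The assumption is that a single `whilelt` predicate plus one predicated store can cover every size from 0 up to one vector length without branching on the size. Whether that actually beats the branchy version is exactly what would need to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

// Hypothetical helper (not part of LLVM libc): writes `value` to at most one
// vector's worth of bytes at `dst` and returns how many bytes were written.
// A real memset would continue with its bulk loop from dst + returned count.
inline std::size_t memset_head_sketch(std::uint8_t *dst, std::uint8_t value,
                                      std::size_t count) {
  assert(dst != nullptr || count == 0);
#if defined(__ARM_FEATURE_SVE)
  // Lanes [0, min(count, VL)) are active; the rest are inactive and are
  // neither written nor able to fault.
  const svbool_t pg = svwhilelt_b8_u64(0, count);
  svst1_u8(pg, dst, svdup_n_u8(value));
  const std::size_t vl = svcntb(); // vector length in bytes
  return count < vl ? count : vl;
#else
  // Portable stand-in so the sketch compiles without SVE. 16 bytes is the
  // minimum SVE vector length; this path is an assumption for illustration.
  const std::size_t limit = 16;
  const std::size_t n = count < limit ? count : limit;
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = value;
  return n;
#endif
}
```

With SVE enabled (for example `-march=armv8-a+sve`), sizes 0 through the vector length all go through the same two instructions, so the comparison of interest is this path against the existing chain of size checks for small `count`.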