miara

miara is a small Zig library that implements several succinct data structures along with the SymSpell algorithm. It includes a perfect hash implementation (PTHash), a static rank/select structure called Widow (based on SPIDER and DArray), Elias–Fano integer compression, a serialization/deserialization library, and a memory-efficient version of SymSpell, the spelling-correction algorithm.

Installation

Supported Zig versions: 0.14.0, 0.14.1

  1. Add the module to your project:

    zig fetch --save https://github.com/alim-zanibekov/miara/archive/<VERSION or BRANCH or COMMIT>.tar.gz

    This will add the dependency to your build.zig.zon:

    .{
        // ...
        .dependencies = .{
    +       .miara = .{
    +           .url = "https://github.com/alim-zanibekov/miara/archive/<VERSION or BRANCH or COMMIT>.tar.gz",
    +           .hash = "SOMETHING",
    +       },
        },
    }
  2. Add the dependency to your build.zig:

    const std = @import("std");
    
    pub fn build(b: *std.Build) void {
        // ...
        const miara = b.dependency("miara", .{
            .target = target,
            .optimize = optimize,
        });
    
        exe.root_module.addImport("miara", miara.module("miara"));
        // ...
    }

Project guide:

Perfect hash function (PHF) and minimal perfect hash function (MPHF). This PTHash implementation follows the original paper, with the exception that I added the optimal bucket assignment function from the PHOBIC paper.

This version achieves slightly better pilot compression on average thanks to a different select algorithm used in Elias–Fano, making it similar to the PHOBIC implementation. Query performance is roughly the same, and build speed is about 2x slower, compared to Giulio Ermanno Pibiri's (git: @jermp) C++ PHOBIC implementation.
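
For intuition, this is roughly the lookup scheme from the paper (the XOR displacement variant). The function below is a simplified, hypothetical sketch, not miara's internal code; in particular, the bucket mapping here is a plain modulo rather than the skewed/optimal mappers the library uses.

const std = @import("std");

// One hash per key selects a bucket; the bucket's pilot (searched for at build
// time) is hashed and XORed in, and the result is reduced to a table slot.
// Pilots are chosen during construction so that no two keys collide.
pub fn pthashPosition(key: []const u8, seed: u64, pilots: []const u64, table_size: u64) u64 {
    const h = std.hash.Wyhash.hash(seed, key);
    const bucket: usize = @intCast(h % @as(u64, pilots.len));
    const pilot_hash = std.hash.Wyhash.hash(seed, std.mem.asBytes(&pilots[bucket]));
    return (h ^ pilot_hash) % table_size;
}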

|            | Elements | λ   | α    | BPK* | BAF* | Query ns/key | Build ns/key | Size   | Prt.* | Num buckets |
|------------|----------|-----|------|------|------|--------------|--------------|--------|-------|-------------|
| pthash.zig | 1M       | 6.5 | 0.97 | 1.78 | skew | 35           | 1296         | 218KB  | -     | 153846      |
| pthash.zig | 100M     | 6.5 | 0.97 | 1.83 | skew | 45           | 1779         | 22MB   | -     | 15384615    |
| pthash.zig | 100M     | 3   | 0.97 | 2.33 | skew | 60           | 308          | 27MB   | -     | 33333333    |
| pthash.zig | 1M       | 6.5 | 0.97 | 1.88 | opt  | 51           | 1375         | 230KB  | -     | 153846      |
| pthash.zig | 100M     | 6.5 | 0.97 | 1.84 | opt  | 68           | 2156         | 22MB   | -     | 15384615    |
| pthash.zig | 100M     | 3   | 0.97 | 2.34 | opt  | 73           | 327          | 28MB   | -     | 33333333    |
| pthash.cpp | 1M       | 6.5 | 0.97 | 1.78 | skew | 31           | ?            | ≈213KB | -     | 153847      |
| pthash.cpp | 100M     | 6.5 | 0.97 | 1.78 | skew | 62           | ≈1500        | 21MB   | -     | 15384616    |
| pthash.cpp | 100M     | 3   | 0.97 | 2.24 | skew | 62           | 120          | 27MB   | -     | 33333334    |
| pthash.cpp | 1M       | 6.5 | 0.97 | 2.75 | opt  | 37           | ≈2000        | 336KB  | -     | 153847      |
| pthash.cpp | 100M     | 6.5 | 0.97 | 1.78 | skew | 64           | 1157         | 21MB   | 50    | 50 * 307693 |
| pthash.cpp | 100M     | 3   | 0.97 | 2.24 | skew | 69           | 165          | 27MB   | 50    | 50 * 666667 |
Notes

jermp's PTHash bench parameters, in the same order as in the table:

./build -n 1000000 -l 6.5 -a 0.97 -r xor -e EF -b skew -q 10000000 -s 76648836764 -p 0 --verbose
./build -n 100000000 -l 6.5 -a 0.97 -r xor -e EF -b skew -q 10000000 -s 76648836764 -p 0 --verbose
./build -n 100000000 -l 3 -a 0.97 -r xor -e EF -b skew -q 10000000 -s 76648836764 -p 0 --verbose
./build -n 1000000 -l 6.5 -a 0.97 -r xor -e EF -b opt -q 10000000 -s 76648836764 -p 0 --verbose
 # Took more than 20 minutes and I stopped it, not present in the table
./build -n 10000000 -l 6.5 -a 0.97 -r xor -e EF -b opt -q 10000000 -s 76648836764 -p 0 --verbose
./build -n 100000000 -l 6.5 -a 0.97 -r xor -e EF -b skew -q 10000000 -s 76648836764 -p 2000000 --verbose
./build -n 100000000 -l 3 -a 0.97 -r xor -e EF -b skew -q 10000000 -s 76648836764 -p 2000000 --verbose
 # Segfault, not present in the table
./build -n 100000000 -l 6.5 -a 0.97 -r xor -e EF -b opt -q 10000000 -s 76648836764 -p 2000000 --verbose
  • Prt.* - Partitions
  • BPK* - Bits per key
  • BAF* - Bucket assignment function
  • I was unable to fully test the optimal BAF in jermp's PTHash: it took more than 20 minutes on 10M keys, so I stopped the bench. Tested on an Apple M1 series chip, commit cc4c9c9c7366f7ebe238b4b6493372e774608cbf; for some reason the CMake build also took 10 minutes

A configurable, succinct static structure supporting constant-time rank and select queries. It's based on the SPIDER ideas but adds more control over hierarchy levels and layout. You can configure the desired hierarchy levels at comptime.

The default variant supports select 1, select 0, and rank, and works on both dense and sparse arrays. Different configurations are available for rank-only use cases or for faster select 1 at the cost of memory.
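
For example, a custom variant could be declared in the same way as the presets listed in the API section below; the type name MyWidow and the level values here are made up, just to show the knobs.

const miara = @import("miara");

// Hypothetical configuration: select 1 enabled, select 0 disabled, rank kept,
// with coarser select levels than the default Widow preset
const MyWidow = miara.widow.GenericWidow(
    u64,                           // backing slice type
    true,                          // enable select 1
    false,                         // disable select 0
    &[_]u64{ 1 << 16, 1 << 9 },    // rank hierarchy levels, in bits (powers of two)
    &[_]usize{ 1 << 17, 1 << 10 }, // select hierarchy levels, in bits
    true,                          // use the rank table to accelerate select
    false,                         // keep rank support
);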

|           | Bit1 Density | Rng RAM read (ns/op) | select1 (ns/op) | select0 (ns/op) | Rank (ns/op) | Build (ns/val) | Overhead (%) | Overhead (MB) | Bit Array (MB) |
|-----------|--------------|----------------------|-----------------|-----------------|--------------|----------------|--------------|---------------|----------------|
| Widow     | 50%          | 6.94                 | 43.71           | 45.14           | 8.38         | 4.36           | 6.06%        | 49.17         | 762.94         |
| Widow     | 11.16%       | 7.18                 | 96.21           | 39.69           | 8.31         | 5.57           | 8.52%        | 71.07         | 762.94         |
| WidowS1NR | 50%          | 7.14                 | 25.43           | -               | -            | 7.07           | 11.42%       | 98.35         | 762.94         |
| WidowR    | 50%          | 7.40                 | -               | -               | 8.18         | 0.55           | 3.12%        | 24.59         | 762.94         |

Test Configuration: 100 million u64, 100 million random reads

Notes
  • miara.widow.Widow - Universal config with select 1, select 0, rank, and dense/sparse array support
  • miara.widow.WidowS1NR - Select 1 only, dense only, no rank support
  • miara.widow.WidowR - Rank-only implementation
  • Rng RAM Read - Random memory read on the same machine and the same array, for reference

A static integer compression structure for monotonic sequences, offering compact storage with constant-time lookup (it relies on a constant-time select 1). It takes N⌈log₂(U / N)⌉ + 2N bits, where N is the number of values and U is the universe (the largest value).
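As a rough sanity check against the table below: 100M values generated with an average gap of ~512 give a universe of about 51B, so ⌈log₂(51·10⁹ / 10⁸)⌉ + 2 = 9 + 2 = 11 bits per value, which lines up with the measured ≈11.26 BPK; the remainder is the select-support overhead.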

|             | Elements | BPK*  | Rng RAM read (ns/op) | Get (ns/op) | Build (ns/val) | Universe | Total (MB) |
|-------------|----------|-------|----------------------|-------------|----------------|----------|------------|
| EliasFanoPS | 100M     | 11.26 | 7.17                 | 84.33       | 12.29          | 51B      | 134.19     |
| EliasFanoPS | 10M      | 11.26 | 6.23                 | 37.63       | 12.20          | 5B       | 13.42      |
| EliasFano   | 100M     | 11.26 | 7.32                 | 73.64       | 12.57          | 51B      | 134.19     |
| EliasFano   | 10M      | 11.26 | 6.05                 | 30.94       | 12.26          | 5B       | 13.42      |
Notes
  • miara.EliasFanoPS - Elias-Fano prefix sum, for arbitrary sequences
  • miara.EliasFano - Elias-Fano for monotonically increasing sequences
  • BPK* - Bits per key
  • Universe - Max value in the array
  • Rng RAM Read - Random memory read on the same machine and the same array, for reference
  • Elements (EliasFano) - a[i] = a[i - 1] + randomInt(0, 1024);
  • Elements (EliasFanoPS) - a[i] = randomInt(0, 1024);

Fast fuzzy string matching, string tokenization

This implementation avoids storing pointers and uses perfect hashing to store deletes. It can also compress the data further by using Elias–Fano for the edit-to-word index and by bit-packing the dictionary references. The uncompressed variant uses only about eight bytes per word delete, and the compressed variant about three bytes. This version can store about 16x more data than Wolf Garbe's (git: @wolfgarbe) original C# implementation, which uses a hash map and consumes at least 48 bytes per entry; the trade-off is that the dictionary is no longer easy to update, since the structure is static after construction.
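
To make the term "deletes" concrete: for every dictionary word, SymSpell indexes all variants with up to the configured number of characters removed, so a misspelling and the correct word meet at a shared delete. The standalone sketch below is not miara's implementation; it just prints the distance-1 deletes of one word.

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const word = "never";
    for (0..word.len) |skip| {
        // The delete obtained by removing character `skip`
        const del = try allocator.alloc(u8, word.len - 1);
        defer allocator.free(del);
        @memcpy(del[0..skip], word[0..skip]);
        @memcpy(del[skip..], word[skip + 1 ..]);
        std.debug.print("{s}\n", .{del}); // ever, nver, neer, nevr, neve
    }
}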

Bench
---- SymSpell PFH + Elias-Fano + Bit Pack, delete distance - 2 ----

Dict len:         15000
Deletes count:    811259
Dict refs count:  787334
Dict refs size:   1.31 Mb
Edits size:       220.87 Kb
Compress edits:   true
Pack dict refs:   true
PThash size:      193.04 Kb
Dict size:        490.73 Kb
Searcher size:    1.27 Kb
-----------          
Total size:       2.20 Mb

---- SymSpell PFH, delete distance - 2 ----

Dict len:         15000
Deletes count:    811259
Dict refs count:  false
Dict refs size:   3.00 Mb
Edits size:       3.09 Mb
Compress edits:   false
Pack dict refs:   false
PThash size:      192.97 Kb
Dict size:        490.73 Kb
Searcher size:    1.27 Kb
-----------          
Total size:       6.77 Mb

Fast division: fastmod and fastdiv implementations, a port of Daniel Lemire's (git: @lemire) C++ implementation to Zig.

Oversimplified general idea
Div
A = N / D
A = N * 1 / D
A = N * (2^64 / D) / 2^64
M = 2^64 / D
A = N * M / 2^64
# N and M have to be multiplied as 128-bit integers
# We also add 1 to M, since M was computed with floor division
A = (u128(N) * (M + 1)) >> 64
Mod
B = N mod D
B = N - (N / D) * D
B = N - A * D
B = N - (N * M / 2^64) * D
# The paper describes a more sophisticated approach (one operation fewer),
# which gives the correct answer for all 32-bit numerators and divisors
B = (u128(u64((M + 1) * N)) * D) >> 64

the paper - I don't fully understand all the math, so you may want to work through it yourself
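
For a quick sanity check of the identities above, they can be verified directly with Zig's 128-bit arithmetic. This is only an illustration of the formulas, not the library's code; the actual magicNumber/fastdiv/fastmod API is listed further down.

const std = @import("std");

test "fastdiv / fastmod identities (illustration)" {
    const d: u64 = 37; // divisor (not a power of two)
    const n: u64 = 333; // numerator, must fit in 32 bits for a 64-bit magic
    const m: u64 = ~@as(u64, 0) / d + 1; // magic number: floor(2^64 / d) + 1
    const q: u64 = @truncate((@as(u128, n) * m) >> 64); // fast division
    const lowbits: u64 = m *% n; // (M + 1) * N mod 2^64
    const r: u64 = @truncate((@as(u128, lowbits) * d) >> 64); // fast modulo
    try std.testing.expectEqual(n / d, q);
    try std.testing.expectEqual(n % d, r);
}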

A simple bit array and an nthSetBitPos (select 1) implementation. nthSetBitPos uses Sebastiano Vigna's broadword algorithm for 64-bit integers; on x86_64 machines that support the BMI2 instruction set it uses pdep/tzcnt instead.

serde [WIP]

Binary serialization library that supports almost all runtime Zig types, handles serialization/deserialization of self-referencing objects, and stores type information with the data. It can also serialize at comptime (the output is transferable to runtime) and deserialize at comptime (the result is not transferable to runtime).

Note

miara.serde.serialize and miara.serde.deserialize functions can be used to serialize and deserialize every data structure mentioned above

TODO:

  1. Improve the "deletes" generation algorithm (SymSpell); the current implementation generates sum over i in 1..=deleteDistance of nCk(wordLen, i) words for each dictionary word
  2. Partitioning/Multithreading support for PTHash
  3. Re-implement bit.BitArray to store data in LSB order instead of MSB. This will simplify the implementation.

Papers

API

miara.* PTHash

/// Skewed mapper for PTHash, from original paper
pub fn SkewedMapper(Hash: type) type;

/// Optimal mapper for PTHash, from PTHash PHOBIC paper
pub fn OptimalMapper(Hash: type) type;

/// Default hasher for string-like keys using Wyhash and Murmur
pub fn DefaultStringHasher(comptime Key: type) type;

/// Default hasher for int keys using Murmur
pub fn DefaultNumberHasher(comptime Key: type) type;

/// Configuration for PTHash construction.
pub fn PTHashConfig(Mapper: type, Hasher: type) type {
    // see PTHashParams struct
    return struct {
        alpha: f64,
        minimal: bool,
        hashed_pilot_cache_size: usize,
        max_bucket_size: usize,
        mapper: Mapper,
        hasher: Hasher,
    };
}

/// Generic PTHash implementation (minimal perfect hash or compressed)
pub fn GenericPTHash(
    /// Seed type for hash function: u64, u32, etc.
    Seed: type,
    /// Hash type: u64, u32, etc.
    Hash: type,
    /// Key type for hash function: []const u8, u64, u32, etc.
    Key: type,
    /// Key to bucket mapper. Must implement IMapper
    Mapper: type,
    /// Key and pilots hasher. Must implement IHasher
    Hasher: type,
    /// Custom EliasFano type for encoding pilots, should encode values as prefix sums
    CustomEliasFanoPS: ?type,
    /// Custom EliasFano type for encoding displacements, sorted index
    CustomEliasFano: ?type,
) type {
    return struct {
        const Self = @This();

        pub fn deinit(self: *Self, allocator: std.mem.Allocator) void;

        /// Builds a PTHash using random seed. Calls `buildSeed` internally
        pub fn build(
            allocator: std.mem.Allocator,
            comptime KeyIterator: type,
            keys: KeyIterator,
            config: PTHashConfig(Mapper, Hasher),
        ) !Self;

        /// Builds a PHF or MPHF using a fixed seed
        pub fn buildSeed(
            allocator: std.mem.Allocator,
            comptime KeyIterator: type,
            keys: KeyIterator,
            config: PTHashConfig(Mapper, Hasher),
            seed: Seed,
        ) !Self;

        /// Get the index of a given key in the table
        pub fn get(self: *const Self, key: Key) !u64;
    };
}

/// Parameter struct with defaults for PTHash wrapper
pub const PTHashParams = struct {
    /// Average bucket size
    lambda: f64 = 6,
    /// Load factor
    alpha: f64 = 0.97,
    /// Whether to build a minimal perfect hash
    minimal: bool = false,
    /// Fixed number of buckets to use, overrides automatic estimation if set
    num_buckets: ?usize = null,
    /// Cache size for hashed pilot values during construction
    hashed_pilot_cache_size: usize = 4096,
    /// Maximum allowed size for any individual bucket
    max_bucket_size: usize = 1024,
};

/// Wrapper to simplify PTHash instantiation
pub fn PTHash(
    comptime Key: type,
    comptime Mapper: type,
) type;

Example usage

const std = @import("std");
const miara = @import("miara");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    defer _ = gpa.deinit();

    const data = &[_][]const u8{ "never", "gonna", "give", "you", "up" };

    var iter = miara.iterator.SliceIterator([]const u8).init(data);
    const PTHash = miara.PTHash([]const u8, miara.OptimalMapper(u64));
    var pthash = try PTHash.buildSeed(allocator, @TypeOf(&iter), &iter, PTHash.buildConfig(iter.size(), .{
        .lambda = 3.5,
        .alpha = 0.97,
        .minimal = true,
    }), 37);
    defer pthash.deinit(allocator);

    for (data) |str| {
        std.debug.print("{s: <5} - {}\n", .{ str, try pthash.get(str) });
    }

    const toF64 = miara.util.toF64;
    const size = miara.util.calculateRuntimeSize(@TypeOf(&pthash), &pthash);
    std.debug.print("Bits per key: {d:.2}\n", .{(toF64(size) * 8.0) / toF64(data.len)});
}

Output

never - 1
gonna - 4
give  - 2
you   - 3
up    - 0
Bits per key: 889.60

NOTE: the bits per key here is calculated by dividing the entire structure size by 5 (the number of keys). With more input keys this value drops to around 2 bits per key in the default configuration; this is just an example of how to calculate the bits per key.

miara.symspell.* SymSpell - spell checker

Note

type T is one of u8, u16, u21, u32

/// UTF-8 string type indicator 
pub const Utf8 = struct {};

/// SymSpell - wrapper over GenericSymSpell, uses `levenshteinDistance` as a distance function
pub fn SymSpell(
    comptime Word: type,
    comptime Ctx: type,
    comptime wordMaxDistance: fn (ctx: Ctx, str: []const T) usize,
) type;

/// SymSpellDL - wrapper over GenericSymSpell, uses `damerauLevenshteinDistance` as a distance function
pub fn SymSpellDL(
    comptime Word: type,
    comptime Ctx: type,
    comptime wordMaxDistance: fn (ctx: Ctx, str: []const T) usize,
) type;

/// SymSpell provides a distance function based fuzzy matcher over a dictionary
/// `Word` - slice type of u8, u16, u21, u32 or `Utf8`,
/// `Ctx` - context type for callbacks
/// `wordMaxDistance` - function to get max delete distance for a word
/// `strLen` - function to get string length
/// `compressEdits` - whether to compress the edits index using Elias-Fano compression
/// `bitPackDictRefs` - whether to compress (by bitpacking) the dictionary references
pub fn GenericSymSpell(
    comptime Word: type,
    comptime Ctx: type,
    comptime editDistanceBuffer: fn (ctx: Ctx, buffer: []u32, word1: []const T, word2: []const T) u32,
    comptime strLen: fn (ctx: Ctx, str: []const T) ?usize,
    comptime wordMaxDistance: fn (ctx: Ctx, str: []const u8) usize,
    comptime compressEdits: bool,
    comptime bitPackDictRefs: bool,
) type {
    return struct {
        const Self = @This();

        pub const Hit = struct {
            word: []const T,
            edit_distance: usize,
            delete_distance: usize,
            count: u32,
        };

        pub const Suggestion = struct {
            segmented: []T,
            corrected: []T,
            distance_sum: u32,
            probability_log_sum: f64,
        };

        pub const DictStats = struct {
            edits_layer_num_max: usize,
            edits_num_max: usize,
            word_max_len: usize,
            word_max_size: usize,
            max_distance: usize,
            count_sum: usize,
            count_uniform: bool,
        };

        /// Collects statistics from the input dictionary.
        /// These stats are used to size buffers, etc
        pub fn getDictStats(ctx: Ctx, dict: []const Token) !DictStats;

        /// Constructs SymSpell using a Bloom filter for deletes deduplication
        /// `BloomOptions`.`fp_rate` - false positive rate for Bloom filter
        pub fn initBloom(allocator: std.mem.Allocator, dict: []const Token, ctx: Ctx, opts: BloomOptions) !Self;

        /// Constructs SymSpell using an LRU cache for edit deduplication
        /// `LRUOptions`.`capacity` - LRU cache capacity
        pub fn initLRU(allocator: std.mem.Allocator, dict: []const Token, ctx: Ctx, opts: LRUOptions) !Self;

        /// Builds SymSpell dictionary
        pub fn init(main_allocator: std.mem.Allocator, dict: []const Token, ctx: Ctx, dict_stats: DictStats, Deduper: type, deduper: Deduper) !Self;
        pub fn deinit(self: *Self, allocator: std.mem.Allocator) void;

        /// Attempts to split and correct a possibly misspelled phrase
        pub fn wordSegmentation(self: *const Self, allocator: std.mem.Allocator, input: []const T) !?Suggestion;

        /// Reusable search instance tied to a specific SymSpell dictionary
        pub fn Searcher(EditsDeduper: type) type {
            return struct {
                /// Allocates the search buffers and internal state
                pub fn init(sym_spell: *const Self, allocator: std.mem.Allocator) !@This();
                pub fn deinit(self: *@This(), allocator: std.mem.Allocator) void;

                /// Initializes the search with a given input word
                pub fn load(self: *@This(), word: []const T, max_distance: usize, edits_deduper: EditsDeduper) !void;

                /// Advances the iterator and attempts to find the next closest matching word in the dictionary
                /// Returns true if a new best hit was found
                /// `Searcher`.`hit` is valid until the next call
                pub fn next(self: *@This()) !bool;

                /// Returns the best match for a term
                pub fn top(self: *@This()) !?Hit;
            };
        }
    };
}

// ... see `src/symspell.zig`

Example usage

const std = @import("std");
const miara = @import("miara");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    defer _ = gpa.deinit();

    const SymSpell = miara.symspell.SymSpellDL(miara.symspell.Utf8, void, struct {
        /// Edit distance for word
        fn func(_: void, _: []const u8) usize {
            return 1;
        }
    }.func);

    const dict = &[_]SymSpell.Token{
        .{ .word = "never", .count = 100 },
        .{ .word = "gonna", .count = 100 },
        .{ .word = "let", .count = 100 },
        .{ .word = "you", .count = 100 },
        .{ .word = "down", .count = 100 },
    };

    var sym_spell = try SymSpell.initBloom(allocator, dict, {}, .{ .fp_rate = 0.01 });
    defer sym_spell.deinit(allocator);

    var searcher = try SymSpell.Searcher(miara.symspell.NoopDeduper(u8)).init(&sym_spell, allocator);
    defer searcher.deinit(allocator);

    try searcher.load("neer", 2, miara.symspell.NoopDeduper(u8){});
    const top = try searcher.top();

    std.debug.print("Top hit: {s}\n", .{top.?.word});

    var ws = try sym_spell.wordSegmentation(allocator, "neergonaltyudow") orelse unreachable;
    defer ws.deinit(allocator);

    std.debug.print("Segmented: {s}\n", .{ws.segmented});
    std.debug.print("Corrected: {s}\n", .{ws.corrected});
}

Output

Top hit: never
Segmented: neer gona lt yu dow
Corrected: never gonna let you down

miara.widow.* Rank/Select queries

/// Full-featured. Select and rank support
pub const Widow = GenericWidow(u64, true, true, &[_]u64{ 1 << 16, 1 << 9 }, &[_]usize{ 1 << 16, 1 << 9 }, true, false);
/// Only select 1 and rank. Select 0 is disabled
pub const WidowS1 = GenericWidow(u64, true, false, &[_]u64{ 1 << 16, 1 << 9 }, &[_]usize{ 1 << 16, 1 << 9 }, true, false);
/// Select 1 without rank support, level constants for Elias Fano (for dense arrays; will overflow on sparse arrays)
pub const WidowS1NR = GenericWidow(u64, true, false, &[_]u64{}, &[_]usize{ 8192, 64 }, false, true);
/// Select 1/0 without rank support, level constants for Elias Fano with GEQ support (for dense arrays; will overflow on sparse arrays)
pub const WidowS1S0NR = GenericWidow(u64, true, true, &[_]u64{}, &[_]usize{ 8192, 64 }, false, true);
/// Rank-only
pub const WidowR = GenericWidow(u64, false, false, &[_]u64{ 1 << 16, 1 << 9 }, &[_]usize{}, true, false);

/// Data structure for efficient bit-based rank and select queries.
/// Based on SPIDER architecture, but features a more sophisticated select 1/0 lookup
/// structure and enhanced configurability. The versions used in EliasFano are more
/// similar to the DArray data structure in terms of runtime performance, taking about
/// 2 iterations to locate the requested bit
pub fn GenericWidow(
    /// The backing slice type
    comptime T: type,
    /// Enable select operations for 1-bits
    comptime Select1: bool,
    /// Enable select operations for 0-bits
    comptime Select0: bool,
    /// Hierarchy levels for the rank data structure (in bits)
    /// Must be powers of two
    comptime RankLevels: []const u64,
    /// Hierarchy levels for the select data structure (in bits)
    /// The offset types are calculated based on these numbers, for example
    /// if the levels are 2^16, 2^9 it will result in u64, u16, if the levels are 2^17, 2^9 it will
    /// result in u64, u32, and so on. For very sparse bit arrays, levels will be
    /// automatically recalculated with appropriate scaling
    comptime SelectLevels: []const u64,
    /// Use the rank data structure to accelerate select queries
    /// This approach requires less precision for select data types
    comptime SelectUsingRank: bool,
    /// Disable rank operations entirely
    comptime NoRank: bool,
) type {
    const BitArray = []const T;
    const BitSize = u64;

    return struct {
        const Self = @This();

        const SelectSupport = struct {
            strides: [SelectLevels.len]usize,
            strides_div: [SelectLevels.len]u128,
            table: SelectTable,
            size: BitSize,
            fn deinit(self: *@This(), allocator: std.mem.Allocator) void;
        };

        const RankSupport = struct {
            table: RankTable,
            n_set: BitSize,
            size: BitSize,
            fn deinit(self: *@This(), allocator: std.mem.Allocator) void;
        };

        /// Initialize the data structure with the given bit array
        pub fn init(allocator: std.mem.Allocator, bit_len: BitSize, bit_array: BitArray) !Self;

        pub fn deinit(self: *Self, allocator: std.mem.Allocator) void;

        /// Build hierarchical ranking structure for efficient bit counting queries
        pub fn buildRank(
            allocator: std.mem.Allocator,
            bit_len: BitSize,
            bit_array: BitArray,
        ) !RankSupport;

        /// Build hierarchical select support structure for efficient bit position queries
        pub fn buildSelect(
            allocator: std.mem.Allocator,
            /// The number of set bits (1s) in the array, regardless of the flip
            n_set: BitSize,
            bit_len: BitSize,
            bit_array: BitArray,
            /// True for select 0 queries, false for select 1 queries
            comptime flip: bool,
        ) !SelectSupport;

        /// Returns the number of 1 bits up to and including index `i` (0 < i < bit_len)
        pub fn rank1(self: *Self, i: u64) !u64;

        /// Returns the number of 0 bits up to and including index `i` (0 < i < bit_len)
        pub fn rank0(self: *Self, i: u64) !u64;

        /// Returns the number of `needle` bits up to and including index `i` (0 < i < bit_len)
        pub fn rank(
            rs: RankSupport,
            bit_array: BitArray,
            i: BitSize,
            comptime needle: u1,
        ) BitSize;

        /// Returns the position of the i-th 1-bit. `i` is a 1-based index; result is a 0-based index
        pub fn select1(self: *const Self, i: u64) !u64;

        /// Returns the position of the i-th 0-bit. `i` is a 1-based index; result is a 0-based index
        pub fn select0(self: *const Self, i: u64) !u64;

        /// Returns the position of the i-th `needle` bit. `i` is a 1-based index; result is a 0-based index
        pub noinline fn select(
            rs: if (NoRank) void else RankSupport,
            st: SelectSupport,
            bit_array: BitArray,
            i: BitSize,
            comptime needle: u1,
        ) !u64;
    };
}

/// Computes an optimal multiplicative `chain` starting from `init` using the given `hints`.
/// The function attempts to construct a sequence a1, a2, a3 .. a[hints.len -1] such that
/// - a1 <= init
/// - Each a[i] divides a[i - 1]
/// - The shape of the `chain` roughly follows the scale suggested by `hints`
/// Always returns a newly allocated slice, even if `hints` is an empty array
pub fn multiplicativeChain(allocator: std.mem.Allocator, T: type, init: T, hints: []const T) ![]const T;

Example usage

const std = @import("std");
const miara = @import("miara");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    defer _ = gpa.deinit();

    const data = &[_]u64{ 0x0E, 0x0F, 0x0F };
    var spider = try miara.widow.Widow.init(allocator, data.len * 64, data);
    defer spider.deinit(allocator);

    std.debug.print(
        "data:        {b:0>8}{b:0>8}{b:0>8}\n\n" ++
            "rank1(16):   {d: <4} # count of 1's in the first 16 bits\n" ++
            "rank0(16):   {d: <4} # count of 0's in the first 16 bits\n" ++
            "select1(1):  {d: <4} # index of the 1-st one\n" ++
            "select1(4):  {d: <4} # index of the 4-th one\n" ++
            "select0(10): {d: <4} # index of the 10-th zero\n",
        .{ data[0], data[1], data[2], try spider.rank1(16), 16 - try spider.rank1(16), try spider.select1(1), try spider.select1(4), try spider.select0(10) },
    );
}

Output

data:        000011100000111100001111

rank1(16):   7    # count of 1's in the first 16 bits
rank0(16):   9    # count of 0's in the first 16 bits
select1(1):  4    # index of the 1-st one
select1(4):  12   # index of the 4-th one
select0(10): 16   # index of the 10-th zero

miara.* EliasFano - integer compression

pub const EliasFano = GenericEliasFano(u64, false, false, null);
pub const EliasFanoPS = GenericEliasFano(u64, false, true, null);
pub const EliasFanoGEQ = GenericEliasFano(u64, true, false, null);

/// Elias–Fano representation for monotonically increasing sequences
/// `T` - an unsigned integer type
/// `EnableGEQ` - enable get next greater or equal queries
/// `PrefixSumMode` - encode as prefix sum for arbitrary arrays (though it may overflow)
/// `CustomWidow` - custom Widow type, may be configured with different params
pub fn GenericEliasFano(
    T: type,
    EnableGEQ: bool,
    PrefixSumMode: bool,
    CustomWidow: ?type,
) type {
    return struct {
        const Self = @This();

        /// Initializes Elias–Fano structure with values from `iter`
        /// `universe` is the maximum possible value in the sequence
        ///     (must be greater than or equal to the last element in the sequence)
        ///     if 0 is passed, it will be calculated
        /// `TIter` must implement `Iterator(T)` see `/iterator.zig`
        pub fn init(allocator: std.mem.Allocator, universe: T, TIter: type, iter: TIter) !Self;
        pub fn deinit(self: *Self, allocator: std.mem.Allocator) void;

        /// Returns the i-th element in array
        pub fn get(self: *const Self, i: usize) !T;

        /// Returns the i-th element of the array regardless of PrefixSumMode
        pub fn getValue(self: *const Self, i: usize) !T;

        /// Returns the difference between the (i + 1)-th and i-th elements of the array
        pub fn getDiff(self: *const Self, i: usize) !T;

        /// Finds the next value greater than or equal to `num`
        pub fn getNextGEQ(self: *const Self, num: T) !T;
    };
}

Example usage

const std = @import("std");
const miara = @import("miara");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    defer _ = gpa.deinit();

    const data = &[_]u64{ 1, 599, 864, 991, 1024, 9999 };
    var iter = miara.iterator.SliceIterator(u64).init(data);
    var ef = try miara.EliasFanoGEQ.init(allocator, data[data.len - 1], @TypeOf(&iter), &iter);
    defer ef.deinit(allocator);
    std.debug.print(
        "array:            {any}\n\n" ++
            "get(0):           {d}\n" ++
            "get(3):           {d: <4}\n" ++
            "getNextGEQ(5000): {d: <4}\n",
        .{ data, try ef.get(0), try ef.get(3), try ef.getNextGEQ(5000) },
    );
}

Output

array:            { 1, 599, 864, 991, 1024, 9999 }

get(0):           1
get(3):           991 
getNextGEQ(5000): 9999

miara.* BitArray

/// BitArray backed by u64 slice
pub const BitArray = GenericBitArray(u64);

/// Returns a bit array type backed by the given unsigned integer type
pub fn GenericBitArray(comptime T: type) type {
    return struct {
        const Self = @This();

        data: []T,
        /// Size in bits
        len: Size,

        /// Initializes an empty bit array with no allocation
        pub fn init() Self;

        /// Initializes a bit array with a given number of bits preallocated
        pub fn initCapacity(allocator: std.mem.Allocator, capacity: Size) MemError!Self;

        pub fn deinit(self: *Self, allocator: std.mem.Allocator) void;

        /// Expands the bit array to use all available capacity without reallocation
        pub fn expandToCapacity(self: *Self) void;

        /// Ensures the bit array can hold at least the specified capacity, reallocating if needed
        pub fn ensureCapacity(self: *Self, allocator: std.mem.Allocator, capacity: usize) !void;

        /// Sets a range of bits from the source data with specified bit boundaries and endianness
        pub fn setRange(
            self: *Self,
            index: Size,
            data: anytype,
            from: std.math.IntFittingRange(0, @bitSizeOf(@TypeOf(data))),
            to: std.math.IntFittingRange(0, @bitSizeOf(@TypeOf(data))),
            endian: std.builtin.Endian,
        ) void;

        /// Appends `n` lsb of `data`, growing memory as needed
        pub fn appendUInt(self: *Self, allocator: std.mem.Allocator, data: anytype, n: std.math.Log2IntCeil(@TypeOf(data))) MemError!void;

        /// Appends `n` lsb of `data`, assuming capacity is sufficient
        pub fn appendUIntAssumeCapacity(self: *Self, data: anytype, n: std.math.Log2IntCeil(@TypeOf(data))) void;

        /// Sets all bits to zero
        pub fn clearAll(self: *Self) void;

        /// Sets all bits to one
        pub fn setAll(self: *Self) void;

        /// Returns @bitSizeOf(Out) bits starting at bit `index`
        pub fn get(self: *const Self, comptime Out: type, index: Size) BitArrayError!Out;

        /// Returns `n` bits starting at bit `index` as `T`
        pub fn getVarFast(self: *const Self, index: Size, n: std.math.Log2IntCeil(T)) T;

        /// Returns `n` bits starting at bit `index` as `Out`
        pub fn getVar(self: *const Self, comptime Out: type, n: std.math.Log2IntCeil(Out), i: Size) BitArrayError!Out;

        /// True if bit at index `i` is set
        pub fn isSet(self: *const Self, i: Size) bool;

        /// Finds the next bit equal to `query` (0 or 1) starting from bit `start`
        pub fn idxNext(self: *const Self, start: Size, comptime query: u1) ?Size;

        /// Sets bit at index `i` to one
        pub fn set(self: *Self, i: Size) void;

        /// Sets bit at index `i` to zero
        pub fn clear(self: *Self, i: Size) void;
    };
}

pub const NthSetBitError = error{ InvalidN, NotFound };

/// Returns the index (0-based, from msb) of the n-th set bit in `src`
/// Requires `src` to be an unsigned integer
pub fn nthSetBitPos(src: anytype, n: std.math.Log2IntCeil(@TypeOf(src))) NthSetBitError!std.math.Log2Int(@TypeOf(src));

/// Returns the index (0-based, from lsb) of the n-th (0-based) set bit in `src`
/// Sebastiano Vigna's broadword approach https://vigna.di.unimi.it/ftp/papers/Broadword.pdf
pub fn nthSetBitPosU64Broadword(x: u64, r: u6) !u6;

Example usage

const std = @import("std");
const miara = @import("miara");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    defer _ = gpa.deinit();

    var ba = miara.BitArray.init();
    defer ba.deinit(allocator);
    try ba.appendUInt(allocator, @as(u8, 0xF0), 4);
    try ba.appendUInt(allocator, @as(u8, 0x1), 1);
    try ba.appendUInt(allocator, @as(u8, 0x10), 5);
    try ba.appendUInt(allocator, @as(u8, 0x7), 3);

    for (0..ba.len) |i| {
        std.debug.print("{}", .{try ba.get(u1, i)});
    }
    std.debug.print("\n", .{});
}

Output

0000110000111

miara.filter.* Bloom filter

/// Estimates bit size needed for Bloom filter with `n_keys` and false positive rate
pub fn bloomBitSize(n_keys: usize, fp_rate: f64) u64;

/// Estimates byte size needed for Bloom filter with `n_keys` and false positive rate
pub fn bloomByteSize(n_keys: usize, fp_rate: f64) u64;

/// Calculates optimal number of hash functions for Bloom filter
pub fn bloomNHashFn(n_keys: usize, fp_rate: f64) u64;

/// A Bloom filter with 6 built-in hash functions
pub const BloomFilter6;

/// Creates a Bloom filter type parameterized by a list of hash functions
pub fn BloomFilter(
    comptime HashFunctions: []*const fn (*anyopaque, []const u8) u64,
) type {
    return struct {
        const Self = @This();

        /// Initializes a Bloom filter with `bit_count` bits and `hash_fn_cnt` hash functions (up to HashFunctions.len)
        pub fn init(allocator: std.mem.Allocator, bit_count: usize, hash_fn_cnt: usize) !Self;

        /// Initializes Bloom filter without internal allocations
        pub fn initSlice(bit_array: []u8, bit_count: usize, hash_fn_cnt: usize) !Self;

        pub fn deinit(self: *Self, allocator: std.mem.Allocator) void;

        /// Inserts a key into the Bloom filter
        pub fn put(self: *Self, key: []const u8) void;

        /// Returns true if the key is possibly in the Bloom filter
        pub fn contains(self: *Self, key: []const u8) bool;
    };
}

/// LRU filter
pub const LRUFilter = struct {
    const Self = @This();

    pub fn init(allocator: std.mem.Allocator, capacity: u32) !Self;
    pub fn deinit(self: *Self, allocator: std.mem.Allocator) void;

    /// Clears all entries while keeping capacity
    pub fn clear(self: *Self) void;

    /// Inserts key, evicts least recently used if full
    pub fn put(self: *Self, key: []const u8) void;

    /// Checks if key is present (and marks as recently used)
    pub fn contains(self: *Self, key: []const u8) bool;
};

Example usage

const std = @import("std");
const miara = @import("miara");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    defer _ = gpa.deinit();

    const data = &[_][]const u8{ "never", "gonna", "run", "around", "and", "desert", "you" };
    const fp_rate = 0.01;
    const bit_size = miara.filter.bloomBitSize(data.len, fp_rate);
    const n_hash_fn = miara.filter.bloomNHashFn(data.len, fp_rate);

    std.debug.print("estimated bit size:  {}\n", .{bit_size});
    std.debug.print("number of hash fn's: {}\n\n", .{n_hash_fn});

    var bf = try miara.filter.BloomFilter6.init(allocator, bit_size, n_hash_fn);
    defer bf.deinit(allocator);

    for (data) |it| bf.put(it);
    for (data) |it| {
        std.debug.print("{s: <7} {}\n", .{ it, bf.contains(it) });
    }
    std.debug.print("\ncookie  {}\n", .{bf.contains("cookie")});
    std.debug.print("\n-----\n\n", .{});

    var lf = try miara.filter.LRUFilter.init(allocator, 4);
    defer lf.deinit(allocator);
    for (data[0..4]) |it| lf.put(it);
    for (data[0..4]) |it| {
        std.debug.print("{s: <7} {}\n", .{ it, lf.contains(it) });
    }
    lf.put(data[4]);
    std.debug.print("{s: <7} {}\n", .{ data[4], lf.contains(data[4]) });
    std.debug.print("{s: <7} {}\n", .{ data[0], lf.contains(data[0]) });
}

Output

estimated bit size:  67
number of hash fn's: 6

never   true
gonna   true
run     true
around  true
and     true
desert  true
you     true

cookie  false

-----

never   true
gonna   true
run     true
around  true
and     true
never   false

miara.fastd.* Fast division/modulus

/// Calculate magic number for `fastdiv`, `fastmod`, `is_divisible`
/// `divisor` - u32 or u64 number
pub inline fn magicNumber(d: anytype) MagicType(@TypeOf(d));

/// Fast division, supports u32 and u64
/// `m` - magic number
/// `n` - numerator
pub inline fn fastdiv(m: anytype, n: ReverseMagicType(@TypeOf(m))) ReverseMagicType(@TypeOf(m));

/// Fast modulo, supports u32 and u64
/// `m` - magic number
/// `n` - numerator
/// `d` - divisor
pub inline fn fastmod(m: anytype, n: ReverseMagicType(@TypeOf(m)), d: ReverseMagicType(@TypeOf(m))) ReverseMagicType(@TypeOf(m));

/// Returns whether N mod D == 0
/// `m` - magic number
/// `n` - numerator
pub inline fn is_divisible(m: anytype, n: ReverseMagicType(@TypeOf(m))) bool;

Example usage

const std = @import("std");
const miara = @import("miara");

pub fn main() !void {
    const fd = miara.fastd;
    const m = fd.magicNumber(@as(u64, 37));
    std.debug.print(
        "333 / 37:   {d}\n" ++
            "333 mod 37: {d}\n" ++
            "96 mod 37:  {d}\n" ++
            "42 / 37:    {d}\n\n",
        .{ fd.fastdiv(m, 333), fd.fastmod(m, 333, 37), fd.fastmod(m, 96, 37), fd.fastdiv(m, 42) },
    );
}

Output

333 / 37:   9
333 mod 37: 0
96 mod 37:  22
42 / 37:    1

miara.serde.* Serialization/Deserialization

/// Serialize value using native endianness. Convenience wrapper for serializeEndian
/// with platform-specific byte ordering. Returns total bytes written to stream
pub fn serialize(allocator: Allocator, writer: anytype, value: anytype) !usize;

/// Deserialize value using native endianness. Convenience wrapper for deserializeEndian
/// with platform-specific byte ordering. Returns reconstructed object and DeDeinit struct
/// to free the allocated memory
pub fn deserialize(allocator: Allocator, reader: anytype, comptime T: type) !struct { T, DeDeinit };

/// Serialize value with specified endianness. Writes binary header field count,
/// type count, and type info size. Processes type information and object
/// data while managing pointer offsets for self-referencing structures. Supports
/// circular references and complex object graphs through pointer tracking system
pub fn serializeEndian(allocator: Allocator, writer: anytype, value: anytype, endian: Endian) !usize;

/// Deserialize value with specified endianness. Validates binary header format,
/// reads type and field information using temporary arena allocator, then
/// reconstructs the object with proper memory management. Handles self-referencing
/// objects and circular references through pointer tracking system that maintains
/// object identity during reconstruction. Returns reconstructed object and DeDeinit struct
/// to free the allocated memory
pub fn deserializeEndian(allocator: Allocator, reader: anytype, comptime T: type, endian: Endian) !struct { T, DeDeinit };

/// Serialize value with specified endianness at compile time
pub fn serializeEndianComptime(writer: anytype, value: anytype, endian: Endian) !usize;

/// Deserialize value with specified endianness at compile time
/// WARNING: The resulting object is not transferable to runtime if it contains pointers,
/// since it will have references to comptime mutable memory
pub fn deserializeEndianComptime(reader: anytype, comptime T: type, endian: Endian) !T;

/// Pretty print file type. Wrapper for prettyPrintFileTypeEndian with platform-native byte ordering
pub fn prettyPrintFileType(
    allocator: std.mem.Allocator,
    input: anytype,
    output: anytype,
    indent: []const u8,
    linebreak: []const u8,
) !void;

/// Pretty print file type with specified endianness. Outputs human-readable formatted
/// type structure with proper indentation and line breaks. Can handle self-referencing structures
pub fn prettyPrintFileTypeEndian(
    allocator: std.mem.Allocator,
    input: anytype,
    endian: std.builtin.Endian,
    output: anytype,
    indent: []const u8,
    linebreak: []const u8,
) !void;

Example usage

const std = @import("std");
const miara = @import("miara");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    defer _ = gpa.deinit();

    const SymSpell = miara.symspell.SymSpellDL(miara.symspell.Utf8, void, struct {
        fn func(_: void, _: []const u8) usize {
            return 1;
        }
    }.func);
    @setEvalBranchQuota(20000);
    { // Build and store to file

        const dict = &[_]SymSpell.Token{
            .{ .word = "never", .count = 100 },
            .{ .word = "ever", .count = 100 },
            .{ .word = "get", .count = 100 },
            .{ .word = "up", .count = 100 },
        };

        var sym_spell = try SymSpell.initBloom(allocator, dict, {}, .{ .fp_rate = 0.01 });
        defer sym_spell.deinit(allocator);

        const out_file = try std.fs.cwd().createFile("symspell.zsd", .{});
        defer out_file.close();
        const bytes_written = try miara.serde.serialize(allocator, out_file.writer(), sym_spell);
        std.debug.print("bytes_written: {}\n", .{bytes_written});

    }
    { // Read from file and use

        const in_file = try std.fs.cwd().openFile("symspell.zsd", .{});
        defer in_file.close();
        var sym_spell, var dd = try miara.serde.deserialize(allocator, in_file.reader(), SymSpell);
        defer dd.deinit(allocator);

        var searcher = try SymSpell.Searcher(miara.symspell.NoopDeduper(u8)).init(&sym_spell, allocator);
        defer searcher.deinit(allocator);

        try searcher.load("neer", 2, miara.symspell.NoopDeduper(u8){});
        const top = try searcher.top();

        std.debug.print("Top hit: {s}\n\n", .{top.?.word});

        try in_file.seekTo(0);
        try miara.serde.prettyPrintFileType(allocator, in_file.reader(), std.io.getStdOut().writer(), "  ", "\n");

    }
}

Output

bytes_written: 1559
Top hit: never

struct Struct_0 {
  dict: []struct Struct_1 {
    word: []u8,
    count: u32,
  },
  dict_stats: struct Struct_2 {
    edits_layer_num_max: u64,
    edits_num_max: u64,
    word_max_len: u64,
    word_max_size: u64,
    max_distance: u64,
    count_sum: u64,
    count_uniform: bool,
  },
  pthash: struct Struct_3 {
    seed: u64,
    num_keys: u64,
    table_size: u64,
    config: struct Struct_4 {
      alpha: f64,
      minimal: bool,
      hashed_pilot_cache_size: u64,
      max_bucket_size: u64,
      mapper: struct Struct_5 {
        num_buckets: u64,
        num_keys: u64,
        eps: f64,
        map_constant: f64,
      },
      hasher: struct Struct_6 {
      },
    },
    pilots: struct Struct_7 {
      higher_bits_lookup: struct Struct_8 {
        bit_array: []u64,
        rank_table: void,
        select1_table: struct Struct_9 {
          strides: [2]u64,
          strides_div: [2]struct Struct_10 {
            0: u128,
          },
          table: struct Struct_11 {
            0: []u64,
            1: []u16,
          },
          size: u64,
        },
        select0_table: void,
      },
      higher_bits: struct Struct_12 {
        data: []u64,
        len: u64,
      },
      lower_bits: Struct_12,
      low_n: u16,
      len: u64,
    },
    free_slots: ?Struct_7,
  },
  edits_index: Struct_7,
  edits_values: Struct_12,
  punctuation: [][]u8,
  separators: [][]u8,
  segmentation_token: []u8,
  ctx: void,
  assume_ed_increases: bool,
}

Hello there
You scrolled till here, so I might as well tell you a secret.

Secret lies here Sorry, I lied.
It's right here Nope.
Maybe here? You seem to really like revealing things.
This might get weird But don't stop now.
Just a few more Secrets like these are hidden for a reason.
This is where it starts to change You're persistent.
I like that.
Or maybe just curious? Either way... onward.
What if there's nothing here? Or what if there's everything?
Philosophical now, huh? Baseball.
Focus.
Halfway there Or are you?
There's no turning back You wouldn't scroll up now, would you?
You're in too deep So keep going.
13 - this number is cursed Look at yourself.
You're unstoppable.
You must really want it Almost there.
Just a little further You're doing great.
Is your finger tired yet? Mine would be.
What do you expect to find? The truth?
You're so close now Almost.
The moment of truth One more.
No really, just one more. Maybe Alright.
Now it's definitely just a few more.
The definition of "few" You're still here.
That says something.
Okay fine. Last five Give or take.
Are you still counting? I stopped a while ago.
Does this even end? You're not sure anymore.
This is the actual one Unless it isn't.
Are you still here? I thought we were done.
But no.
You kept going.
You chose this.
Do you think there's more? There always could be.
That's the trick, isn't it?
You might regret this Or you might not.
Uncertainty is the cost of curiosity.
This is where it really begins Everything before this?
A warm-up.
Hope you're comfortable You won't be in a few layers.
Time is a flat `<details>` tag And you're folding through it.
Still with me? I'm surprised.
But also... proud?
No one comes this far Statistically speaking.
We could stop right here But we won't.
Ten more? Maybe.
Maybe not.
Numbers mean nothing now.
You're clearly committed Which makes this even better.
The secret might be ahead Or not.
You scrolled through a page of lies Only to find another one.
This isn't healthy But it's not unhealthy either.
It's... Markdown.
Your scrollbar is microscopic now Good.
That means it's working.
Stanley? No, no, no, this can't be.
You're nearing something A point.
A statement.
A truth?
This is the last gate The actual last gate.
I promise.
Unless it's not Just kidding. The next one really is it.
The real secret is here

MIARA Is A Recursive Acronym

That's it. That's what this was all for.

Do you feel enlightened? You shouldn't.
But you should feel something.

Confusion, perhaps.
Satisfaction?
No. Probably not.

But hey, you made it.
And that's worth...
well, exactly what you got.
