Commit ccecf88

doc: update README
1 parent 0097141 commit ccecf88

1 file changed: +16 -21 lines changed

README.md

Lines changed: 16 additions & 21 deletions
@@ -25,43 +25,38 @@ ZenANN will be implemented in C++ for high performance and exposes an intuitive
 ### Index Hierarchy
 There will be an abstract base index, which provides a unified interface for different index classes.
 1. **Base Index Class**
-- `indexBase`: Defines the common API for all indexing methods (eg. `add()`, `search()`, `train()`, `reorder_layout()`)
-
-2. **Derived Index Classes**
-- `indexHNSW`: A graph-based structure for accurate and efficient ANN
-- `indexIVF`: A cluster-based structure for large dataset
-3. **Hybrid Index Classes**
-- `indexIVF_HNSW` / `indexHNSW_IVF` : For fast-coveraging larger datasets
-4. (Optional) **Quantization Index Classes**
-- `indexPQ`: Combined with product quantization for memory-limited scenarios
+- `indexBase`: Defines the common API for all indexing methods (eg. `add()`, `search()`, `train()`)
+2. **KD-tree Index Class**
+- `KDTreeIndex`: To serve as a baseline for approximate search algorithms, KD-tree is used to perform exact search.
+3. **IVF Index Class**
+- `IVFIndex`: A cluster-based structure for large dataset
+4. **HNSW Index Class**
+- `HNSWIndex`: A graph-based structure for accurate and efficient ANN
 
 Note: Actual implementation detail of HNSW may be built on Faiss's interface according to development progress
 
 ### Processing Flow
 1. Initialize an index (e.g., `indexBase`, `indexHNSW`)
-2. Build an index
-2-1. Add the given vector data using `add()` to a specific index instance.
-2-2. Train index with `train()` if needed
-2-3. Optimize the index data layout with `reorder_layout()` to improve cache locality.
+2. Build an index with `add()`
+- Add the given vector data to a specific index instance.
+- Train index with `train()` if needed(for IVF-based Index)
+- Optimize the index data layout with reorder_layout in Faiss submodule to improve cache locality.
 4. Perform a query on the specified index instance using `search()`.
-5. Evaluate accuracy using the `get_statistics()` API.
+5. Return result set with top-k id & estimated distance for each query.
 
 ## API Description
 There is a simple python examples for understanding the API design
 ```
 import zenann
 
-# Initialize an HNSW index
-index = zenann.HNSWIndex(dimension=128, ef_construction=200, M=16)
+# Initialize an index for ANN search
+index = zenann.HNSWIndex(dim=128, M=16, efConstruction=200)
 
-# Add vectors to the index and conduct reordering
+# Add vectors to the index and conduct training / reordering
 index.add(data_vectors)
-index.train()
-index.reorder_layout()
 
 # Perform a search
 results = index.search(query_vector, k=5, efSearch=100)
-recall = get_statistics(results, ground_truth)
 ```
 
 ## Engineering Infrastructure
@@ -71,10 +66,10 @@ recall = get_statistics(results, ground_truth)
 - Git
 - Github
 ### Testing Framework
-- C++: Google Test
 - Python: pytest
 ### Documentation
 - Markdown
+- Mermaid
 ### Continuous Integration
 - Github Actions

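The interface this change settles on (`add()`, `search()`, `train()`, with `search()` returning top-k ids and distances) can be illustrated in plain Python. This is only a minimal sketch under stated assumptions, not ZenANN's implementation: the class names `IndexBase` and `ExactIndex` and the helper `recall_at_k` are hypothetical, and a brute-force scan stands in for the exact-search baseline role that the README assigns to the KD-tree. The recall helper shows one way accuracy could still be measured now that `get_statistics()` is removed from the example.

```python
from abc import ABC, abstractmethod
import math


class IndexBase(ABC):
    """Hypothetical stand-in for the common index API; not ZenANN code."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = []

    def add(self, vectors):
        # Store each vector after checking it has the expected dimension.
        for v in vectors:
            if len(v) != self.dim:
                raise ValueError(f"expected dimension {self.dim}, got {len(v)}")
            self.vectors.append(list(v))

    def train(self):
        """No-op by default; a cluster-based index would override this."""

    @abstractmethod
    def search(self, query, k):
        """Return a list of (id, distance) pairs, nearest first."""


class ExactIndex(IndexBase):
    """Brute-force exact search (assumption: used here in place of a KD-tree)."""

    def search(self, query, k):
        # Score every stored vector by Euclidean distance, then keep the k nearest.
        scored = sorted((math.dist(query, v), i) for i, v in enumerate(self.vectors))
        return [(i, d) for d, i in scored[:k]]


def recall_at_k(found_ids, true_ids):
    """Fraction of ground-truth neighbours that the search returned."""
    if not true_ids:
        return 0.0
    return len(set(found_ids) & set(true_ids)) / len(true_ids)


index = ExactIndex(dim=2)
index.add([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]])
index.train()
results = index.search([0.9, 0.1], k=2)
ids = [i for i, _ in results]
```

An approximate index could be compared against this baseline by passing its returned ids and the exact ids to `recall_at_k`.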