Open
Description
Hello. Based on these notes you left for RcppHNSW [1][2], I suggest switching to a single thread for the hnsw_build step, and only that step [3].
Here are my test log messages. The number of graph edges fluctuates if and only if the index-building function uses multithreading. Both the edge count and the final UMAP result become reproducible if I fix n_threads = 1 when calling hnsw_build.
22:09:34 Commencing optimization for 500 epochs, with 17984770 positive edges using 20 threads
umap2() messages
22:08:09 Using HNSW for nearest neighbor search
22:08:09 UMAP embedding parameters a = 0.9922 b = 1.112
22:08:09 Setting random seed 1
22:08:09 Read 107723 rows and found 489 numeric columns
22:08:09 Building HNSW index with metric 'l2' ef = 200 M = 16 using 1 threads
22:09:12 Finished building index
22:09:12 Searching HNSW index with ef = 100 and 20 threads
22:09:17 Finished searching
22:09:17 Commencing smooth kNN distance calibration using 20 threads with target n_neighbors = 100
22:09:21 Initializing from normalized Laplacian + noise (using RSpectra)
22:09:29 Range-scaling initial input columns to 0-10
22:09:34 Commencing optimization for 500 epochs, with 17984770 positive edges using 20 threads
22:09:34 Using rng type: pcg
Using method 'umap'
Optimizing with Adam alpha = 1 beta1 = 0.5 beta2 = 0.9 eps = 1e-07
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
22:09:39 Optimization finished
[1] jlmelville/rcpphnsw@6c54753
[3] Line 18 in 3b7c889: n_threads = n_threads,
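For anyone who wants to check this outside of umap2(), here is a minimal sketch of the idea using RcppHNSW directly. It assumes RcppHNSW is installed; the matrix X, the helper names build_index and search_index, the value of k, and the search thread count are hypothetical stand-ins rather than anything taken from uwot's code. The index is built with n_threads = 1 and only the search runs in parallel, since searching does not modify the index.

```r
# Sketch only: assumes the RcppHNSW package is installed.
library(RcppHNSW)

set.seed(1)
# Hypothetical stand-in data: 2000 rows, 20 numeric columns.
X <- matrix(rnorm(2000 * 20), nrow = 2000)

# Build on a single thread so items are inserted in a fixed order.
# M and ef mirror the values shown in the log above.
build_index <- function(data) {
  hnsw_build(data, distance = "l2", M = 16, ef = 200, n_threads = 1)
}

# Searching only reads the index, so it can stay multi-threaded.
search_index <- function(index, data, k = 15) {
  hnsw_search(data, index, k = k, ef = 100, n_threads = 2)
}

# Two independent build + search runs on the same data.
nn_a <- search_index(build_index(X), X)
nn_b <- search_index(build_index(X), X)

# With single-threaded builds, the neighbor indices are expected to agree.
identical(nn_a$idx, nn_b$idx)
```

Changing n_threads inside build_index to a value greater than 1 and rerunning the comparison should show whether the neighbor graph starts to vary between runs on a given machine.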