Skip to content

Social graph#98

Open
mmalmi wants to merge 11 commits intodamus-io:masterfrom
mmalmi:master
Open

Social graph#98
mmalmi wants to merge 11 commits intodamus-io:masterfrom
mmalmi:master

Conversation

@mmalmi
Copy link

@mmalmi mmalmi commented Nov 7, 2025

See SOCIAL_GRAPH.md for description

@alltheseas
Copy link

@mmalmi when you ran this, what was the total storage size for nostr's social graph (from your perspective, or otherwise)?

@alltheseas
Copy link

@jb55 agent meta-review:

Findings

Severity Issue File Description
Critical Thread Safety Bug in UID Allocation src/ndb_uid.c:116-139 ndb_uid_map is shared across ingester threads. next_id++ races and can double-assign UIDs when multiple threads process contact/mute lists concurrently. Fix: Route all social-graph writes through the single writer thread, or use atomic operations with proper rollback on transaction failure.
Critical Social Graph Writes in Ingester Threads src/nostrdb.c:3325-3342 Kind 3/10000 events are handled synchronously in the ingester threadpool with LMDB write transactions. This defeats nostrdb's single-writer design and serializes on LMDB's global write lock, potentially stalling all event ingestion. Fix: Queue social graph updates to the writer thread like other write operations.
Critical Incomplete Incremental Distance Updates src/ndb_socialgraph.c:94-210 Incremental distance updates only touch the immediate target on follow/unfollow. Downstream nodes never get recomputed (e.g., root→A→B→C, A unfollows B: B becomes 1000 but C stays at 3). Only recalculate_distances() produces correct state. Fix: Cascade updates, trigger full BFS, or document limitation.
High Mute Count Logic Bug src/ndb_socialgraph.c:1017-1051 Muter count incremented for all entries, not just new mutes. Unlike follow logic (lines 484-526) which checks is_new, muter counts inflate on every update. Fix: Add is_new check mirroring follow logic.
High Write Failures Silently Ignored src/nostrdb.c:3327-3342 Return value from handle_contact_list() /handle_mute_list() not checked. On errors, partial graph state is committed. Fix: Check return value; abort transaction on failure.
High libsodium/NIP-44 Removal Makefile, src/nostrdb.h, multiple deleted files Removes libsodium (~134k lines) and NIP-44 encryption support. nip44_decrypt() API gone. Major unrelated API regression that will break consumers. Recommendation: Split into separate PR with justification and migration guidance.
Low O(n²) Bubble Sort bucketed_u32_list.c:82-90 Called on every add. Acceptable for typical follow lists (100-500) but could use qsort for larger lists.
Low Magic Number for Unknown Distance ndb_socialgraph.c:705 Returns 1000 for unknown users. Intentional and tested, but could be a named constant.
Low Double Semicolon Typo nostrdb.c:7584 secp256k1_context_create(SECP256K1_CONTEXT_NONE);;
Low Test Code Duplication test_bucketed_uid_list.c Copies implementation instead of
including header; risks divergence.
Low No Public API for Users-by-Distance - Index users_by_follow_distance_db exists but isn't exposed via public function.
Low u8 Bucket Linear Search bucketed_u32_list.c:41-47 Uses linear scan while u16/u32 use binary search. Bucket is small (<256 entries), negligible impact.

Positive Aspects

  • Good space optimization - Bucketed UID lists save ~78% vs naive arrays
  • Scalable reverse index - Composite keys (followed_uid, follower_uid) allow O(log n) inserts for popular users
  • Cached counts - O(1) follower/muter count queries
  • Root persistence - Survives restarts correctly
  • Reasonable test coverage - Tests cover basic operations, updates, persistence, root changes
  • Clear documentation - SOCIAL_GRAPH.md explains architecture well

Recommended Actions

Before merge (Critical/High):

  1. Route social graph writes through writer thread (fixes thread safety + architectural issues)
  2. Fix mute count is_new check
  3. Check return values and abort transaction on failure
  4. Fix incremental distance updates or document as known limitation
  5. Split libsodium removal into separate PR

Can defer (Low):

  • O(n²) sort optimization
  • Named constant for distance sentinel
  • Typos and code duplication
  • Expose users-by-distance API

@alltheseas
Copy link

@coderabbitai review please

@coderabbitai
Copy link

coderabbitai bot commented Jan 16, 2026

@alltheseas: I'll review the changes in this pull request.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link

coderabbitai bot commented Jan 16, 2026

Warning

Rate limit exceeded

@alltheseas has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 8 minutes and 22 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 13c08d3 and 73195fa.

📒 Files selected for processing (14)
  • .gitignore
  • Makefile
  • SOCIAL_GRAPH.md
  • src/bucketed_u32_list.c
  • src/bucketed_u32_list.h
  • src/ndb_socialgraph.c
  • src/ndb_socialgraph.h
  • src/ndb_uid.c
  • src/ndb_uid.h
  • src/nostrdb.c
  • src/nostrdb.h
  • test.c
  • test_bucketed_uid_list.c
  • test_socialgraph.c

Comment @coderabbitai help to get the list of available commands and usage tips.

@mmalmi
Copy link
Author

mmalmi commented Jan 30, 2026

@mmalmi when you ran this, what was the total storage size for nostr's social graph (from your perspective, or otherwise)?

Depends how much you crawl, but I think I tested with 100k users or so

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants