Add storage format spec and NDJSON migration tool #21

Open
Intina47 wants to merge 1 commit into main from mamba/update-note-storage-format-and-migration-plan

Motivation

  • Preserve the existing append-only plain-text notebook while enabling a metadata-first format for sync and richer tooling.
  • Define a stable NDJSON schema that supports tags, repo metadata, links, and RFC3339 timestamps without breaking legacy notes.
  • Provide a simple, reversible migration path from journal.txt to an optional journal.ndjson for metadata-heavy workflows.

Description

  • Add docs/storage-format.md describing the current journal.txt format, a proposed journal.ndjson schema (fields: id, text, created_at, updated_at, tags, links, repo, source), compatibility guarantees, and a migration plan.
  • Add scripts/migrate-journal-to-ndjson.go, a small Go utility that reads journal.txt, parses timestamps ([YYYY-MM-DD HH:MM]), emits RFC3339 created_at, preserves original text, assigns a stable hex id, and writes one JSON object per line with source provenance (source file and line number).
  • The migration tool supports -in, -out, and -force flags and skips blank lines; IDs are generated with cryptographic randomness and fall back to a timestamped string on error.
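
The per-line conversion described above can be sketched roughly as follows. This is not the code from scripts/migrate-journal-to-ndjson.go: it reuses the regular expression quoted in the review comment, and everything else is an assumption made for illustration, including the `Note` and `Source` struct names, the `file` and `line` keys inside `source`, the 8-byte ID length, the `ts-` fallback prefix, treating the zone-less legacy timestamp as UTC, and passing lines without a timestamp through unchanged.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"os"
	"regexp"
	"strings"
	"time"
)

// Source records where a note came from. The key names are assumptions.
type Source struct {
	File string `json:"file"`
	Line int    `json:"line"`
}

// Note covers a subset of the schema fields named in the PR description.
type Note struct {
	ID        string `json:"id"`
	Text      string `json:"text"`
	CreatedAt string `json:"created_at,omitempty"`
	Source    Source `json:"source"`
}

// linePattern is the expression quoted in the review comment.
var linePattern = regexp.MustCompile(`^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2})\]\s*(.*)$`)

// parseLine splits "[YYYY-MM-DD HH:MM] text" into an RFC3339 timestamp and
// the note text. The legacy format has no zone, so UTC is assumed here.
func parseLine(line string) (createdAt, text string, ok bool) {
	m := linePattern.FindStringSubmatch(line)
	if m == nil {
		return "", line, false
	}
	t, err := time.ParseInLocation("2006-01-02 15:04", m[1], time.UTC)
	if err != nil {
		// Matches the shape but is not a real date, e.g. month 13.
		return "", line, false
	}
	return t.Format(time.RFC3339), m[2], true
}

// newID returns 8 random bytes as hex, falling back to a timestamped
// string if the system random source fails.
func newID() string {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		return fmt.Sprintf("ts-%d", time.Now().UnixNano())
	}
	return hex.EncodeToString(b)
}

func main() {
	lines := []string{"[2024-01-15 09:30] fix login bug", "", "note without a timestamp"}
	enc := json.NewEncoder(os.Stdout) // Encode writes one JSON object per line
	for i, line := range lines {
		if strings.TrimSpace(line) == "" {
			continue // blank lines are skipped
		}
		createdAt, text, _ := parseLine(line)
		note := Note{ID: newID(), Text: text, CreatedAt: createdAt,
			Source: Source{File: "journal.txt", Line: i + 1}}
		if err := enc.Encode(note); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
}
```

Because the IDs in this sketch are random, running it twice over the same input produces different `id` values; a migration that must be repeatable would need to persist or derive them some other way.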

Testing

  • No automated test suite was run against the new files; this change is documentation and a standalone migration tool.
  • The migration script was formatted with gofmt -w scripts/migrate-journal-to-ndjson.go to ensure idiomatic formatting.
  • Basic static validation was performed by reading the created files (docs/storage-format.md and scripts/migrate-journal-to-ndjson.go) to confirm the intended schema and flags are present.

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cf860ae4ea

Comment on lines +71 to +73
pattern := regexp.MustCompile(`^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2})\]\s*(.*)$`)
scanner := bufio.NewScanner(file)
lineNum := 0

P2: Avoid bufio.Scanner token limit for long notes

This uses bufio.NewScanner with the default 64K token limit, so any journal.txt line longer than ~64KB will cause scanner.Err() to return bufio.ErrTooLong and the migration will stop with a partial journal.ndjson. That can happen if users paste large logs or long-form notes into a single line. Consider switching to bufio.Reader with ReadString('\n') or calling scanner.Buffer with a larger limit so long notes migrate cleanly.
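
A minimal sketch of the second option, raising the limit with `scanner.Buffer`, might look like the following. The 16 MiB cap, the `readLines` helper, and the `demoInput` fixture are hypothetical names and values chosen for illustration; they are not taken from the migration script.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// maxLineBytes is a hypothetical upper bound on a single journal line.
const maxLineBytes = 16 * 1024 * 1024

// readLines scans r line by line, allowing tokens up to maxBytes long.
// The scanner starts with a 64 KiB buffer and grows it only when needed.
func readLines(r io.Reader, maxBytes int) ([]string, error) {
	scanner := bufio.NewScanner(r)
	scanner.Buffer(make([]byte, 0, 64*1024), maxBytes)
	var lines []string
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	// Err reports bufio.ErrTooLong if a line exceeded the limit.
	return lines, scanner.Err()
}

// demoInput builds a two-line journal whose first note body is n bytes long.
func demoInput(n int) io.Reader {
	return strings.NewReader("[2024-01-15 09:30] " + strings.Repeat("x", n) + "\nshort\n")
}

func main() {
	// With the limit left at 64 KiB, a 100 KiB note stops the scan early.
	_, err := readLines(demoInput(100*1024), 64*1024)
	fmt.Println("64 KiB limit error:", err)

	// With the raised limit, both lines are read.
	lines, err := readLines(demoInput(100*1024), maxLineBytes)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("lines read with raised limit:", len(lines))
}
```

A fixed cap still fails on a line longer than the cap, so the `bufio.Reader` route with `ReadString('\n')` remains the choice if no upper bound on note length can be assumed.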