🐣 add tools for reflowing the transcript into one paragraph per sentence / speaker by anuejn · Pull Request #510 · bugbakery/transcribee

anuejn · 2026-01-07T23:50:50Z

No description provided.

frontend/src/editor/automerge_websocket_editor.ts

rroohhh · 2026-01-09T11:52:04Z

frontend/src/editor/text_tools.tsx

@@ -0,0 +1,182 @@
+import { TbHammer } from 'react-icons/tb';


Can we have tests for the transformations in this file? :)
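One way to make the transformations testable is to move the splitting logic out of the React component into a pure function. The following is only a sketch: the `Token` and `Paragraph` shapes are simplified, and `splitIntoSentences` and `endsSentence` are hypothetical names that do not appear in the PR.

```typescript
// Hypothetical, simplified shapes; the real document types carry more fields.
type Token = { text: string };
type Paragraph = { speaker: string | null; children: Token[] };

// Assumption: a sentence ends at a token whose trimmed text ends in '.', '!' or '?'.
const endsSentence = (token: Token): boolean => /[.!?]$/.test(token.text.trim());

// Pure function: returns new paragraphs, one per sentence, and leaves the input untouched.
function splitIntoSentences(paragraph: Paragraph): Paragraph[] {
  const result: Paragraph[] = [];
  let current: Token[] = [];
  for (const token of paragraph.children) {
    current.push({ ...token });
    if (endsSentence(token)) {
      result.push({ speaker: paragraph.speaker, children: current });
      current = [];
    }
  }
  // Keep trailing tokens that do not end in sentence punctuation.
  if (current.length > 0) {
    result.push({ speaker: paragraph.speaker, children: current });
  }
  return result;
}
```

A test can then feed in a small hand-written paragraph and compare the returned paragraphs against the expected split, without needing an editor or a websocket connection.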

rroohhh · 2026-01-09T11:53:26Z

frontend/src/editor/text_tools.tsx

+  );
+}
+
+export function TextTools({ editor }: { editor: EditorWithWebsocket }) {


Should we add a warning to these tools when they are applied to a document in a language that does not use a Latin-style script?

rroohhh · 2026-01-09T11:55:49Z

frontend/src/editor/text_tools.tsx

+                  .filter((token) => token.text.includes(','))
+                  .map((token) => token.pause);
+                silences.sort();
+                const thresholdIndex = Math.floor(paragraph.children.length / 100); // aim for paragraphs of max ~50 tokens


This says ~50 tokens but divides by 100, which seems contradictory. Or am I missing something?

Also, the magic paragraph length could probably be a constant that is used both here and for the <= 100 further up.
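A sketch of what the shared constant could look like. This is one interpretation of the quoted hunk, not the PR's actual logic: `MAX_PARAGRAPH_TOKENS` and `pauseThreshold` are assumed names, and the rule "one break per 100 tokens, at the longest pauses" is a guess at the intent. The sketch sorts with an explicit numeric comparator, because `Array.prototype.sort` without a comparator compares elements as strings.

```typescript
// Assumed name; the quoted code uses the literal 100 in two places.
const MAX_PARAGRAPH_TOKENS = 100;

// Returns the pause length at or above which a paragraph break would be inserted,
// or null if the paragraph is already short enough or has no candidate pauses.
function pauseThreshold(tokenCount: number, candidatePauses: number[]): number | null {
  if (tokenCount <= MAX_PARAGRAPH_TOKENS || candidatePauses.length === 0) return null;
  // Sort numerically, longest pause first, without mutating the caller's array.
  const sorted = [...candidatePauses].sort((a, b) => b - a);
  // Number of breaks wanted: one per MAX_PARAGRAPH_TOKENS tokens (at least 1 here).
  const wantedBreaks = Math.floor(tokenCount / MAX_PARAGRAPH_TOKENS);
  const index = Math.min(wantedBreaks, sorted.length) - 1;
  return sorted[index];
}
```

With a single constant, the comment and the divisor can no longer drift apart, and whichever target length is intended only has to be changed in one place.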

rroohhh · 2026-01-09T11:57:25Z

frontend/src/editor/text_tools.tsx

+              }
+            };
+            doc.children.forEach((paragraph) => {
+              let minPauseBetweenSentences = initial; // this gets reduced with every additional token


Why does it get reduced with every additional token?

rroohhh · 2026-01-09T12:00:23Z

frontend/src/editor/text_tools.tsx

+                children: [] as { text: string }[],
+              };
+              paragraph.children.forEach((token, i) => {
+                currentParagraph.children.push(JSON.parse(JSON.stringify(token)));


Why the JSON dance?
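For context, the JSON round trip in the quoted line produces a deep, plain-object copy of the token. One possible reason for it (an assumption, not confirmed in this thread) is that the tokens are live document objects inside the change callback and need to be detached before being inserted at a new position. A small sketch of the same copy as a named helper; `cloneToken` and the `Token` shape are hypothetical:

```typescript
// Hypothetical, simplified token shape.
type Token = { text: string; pause?: number };

// Deep copy via a JSON round trip. This only preserves JSON-compatible data
// (strings, numbers, booleans, null, arrays, plain objects).
function cloneToken(token: Token): Token {
  return JSON.parse(JSON.stringify(token)) as Token;
}

// Usage: the copy can be modified without affecting the original.
const original: Token = { text: "hello", pause: 0.5 };
const copy = cloneToken(original);
copy.text = "changed";
```

Wrapping the round trip in a helper with a comment would at least document why the copy is needed at that point.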

rroohhh · 2026-01-09T12:01:22Z

frontend/src/editor/text_tools.tsx

+                addNewChild(currentParagraph);
+              }
+            });
+            doc.children = newChildren;


I think doing it this way totally fucks up collaborative editing...
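The concern is presumably that assigning a fresh array to doc.children replaces the whole paragraph list, so edits made concurrently by other users against the old paragraphs have nothing to merge into. A possible alternative is to split paragraphs in place with splice, so that only the affected positions change. The sketch below is written against plain arrays; `splitParagraphInPlace` and the types are hypothetical, and whether the same calls behave well inside the editor's change callback would need to be checked.

```typescript
// Hypothetical, simplified shapes.
type Token = { text: string };
type Paragraph = { speaker: string | null; children: Token[] };
type Doc = { children: Paragraph[] };

// Moves the tokens from tokenIndex onward into a new paragraph inserted
// directly after the original one. Mutates doc in place.
function splitParagraphInPlace(doc: Doc, paragraphIndex: number, tokenIndex: number): void {
  const paragraph = doc.children[paragraphIndex];
  // Nothing to split if the cut falls at either end of the paragraph.
  if (tokenIndex <= 0 || tokenIndex >= paragraph.children.length) return;
  // Copy the tail as plain objects before removing it from the original paragraph.
  const tail = paragraph.children.slice(tokenIndex).map((token) => ({ ...token }));
  paragraph.children.splice(tokenIndex, paragraph.children.length - tokenIndex);
  doc.children.splice(paragraphIndex + 1, 0, { speaker: paragraph.speaker, children: tail });
}
```

The first paragraph keeps its identity and only loses its tail, and paragraphs that are not split are never touched.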

rroohhh · 2026-01-09T12:05:47Z

frontend/src/pages/document.tsx

          )}
        </TopBarPart>
        <TopBarPart>
+          {editor && <TextTools editor={editor} />}


This should be gated on data?.can_write, no?

🐣 add tools for reflowing the transcript

c7a9403

into one paragraph per sentence / speaker

anuejn force-pushed the anujen/text_reflow_tools branch from 134df4c to c7a9403 Compare January 7, 2026 23:51

anuejn requested review from pajowu, phlmn and rroohhh January 7, 2026 23:51

✨ add smart reflow

6fad5ae

anuejn force-pushed the anujen/text_reflow_tools branch from 79e9045 to 6fad5ae Compare January 8, 2026 01:21

rroohhh requested changes Jan 9, 2026

anuejn wants to merge 2 commits into main from anujen/text_reflow_tools
