🐣 add tools for reflowing the transcript into one paragraph per sentence / speaker #510
Conversation
Force-pushed from 134df4c to c7a9403
Force-pushed from 79e9045 to 6fad5ae
@@ -0,0 +1,182 @@
import { TbHammer } from 'react-icons/tb';
Can we have tests for the transformations in this file? :)
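For example, something along these lines (vitest syntax assumed; the export name and fixture shape are made up, so real tests would need whatever this file actually exports and realistic length/pause thresholds):

```ts
import { describe, expect, it } from 'vitest';
import { reflowIntoSentenceParagraphs } from './TextTools'; // hypothetical export name

describe('reflowIntoSentenceParagraphs', () => {
  it('splits a paragraph after a sentence-ending token with a long pause', () => {
    // Toy fixture; the real thresholds in this file may require a larger document.
    const doc = {
      children: [
        {
          children: [
            { text: 'Hello world. ', pause: 0.8 },
            { text: 'Second sentence.', pause: 0.1 },
          ],
        },
      ],
    };

    reflowIntoSentenceParagraphs(doc); // the diff suggests the document is mutated in place

    expect(doc.children).toHaveLength(2);
    expect(doc.children[0].children.map((t) => t.text).join('')).toBe('Hello world. ');
  });
});
```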
);
}

export function TextTools({ editor }: { editor: EditorWithWebsocket }) {
Should we add a warning to these if they are applied to a document written in a non-Latin-script language?
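As a rough sketch of such a check (my own heuristic, not code from this PR): sample some of the document text and warn when most of its letters are not Latin script.

```ts
// Heuristic sketch only; getPlainText and showWarning stand in for whatever
// helpers the app actually has.
function looksNonLatin(sample: string): boolean {
  const letters = sample.match(/\p{L}/gu) ?? [];
  if (letters.length === 0) return false;
  const latin = sample.match(/\p{Script=Latin}/gu) ?? [];
  return latin.length / letters.length < 0.5;
}

// e.g. before running a text tool:
// if (looksNonLatin(getPlainText(editor))) showWarning('These tools may not work well for non-Latin scripts.');
```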
.filter((token) => token.text.includes(','))
.map((token) => token.pause);
silences.sort();
const thresholdIndex = Math.floor(paragraph.children.length / 100); // aim for paragraphs of max ~50 tokens
This says ~50 tokens but divides by 100, which seems contradictory, or am I missing something?
Also, the magic paragraph length could probably be a constant used both here and for the <= 100 check further up.
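Roughly what I have in mind (the constant name is my suggestion, not from the PR):

```ts
const MAX_PARAGRAPH_TOKENS = 100;

// further up, replacing the literal in the length check:
if (paragraph.children.length <= MAX_PARAGRAPH_TOKENS) {
  return; // assuming the existing <= 100 check is an early exit for short paragraphs
}

// and here, instead of the magic 100:
const thresholdIndex = Math.floor(paragraph.children.length / MAX_PARAGRAPH_TOKENS);
```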
}
};
doc.children.forEach((paragraph) => {
let minPauseBetweenSentences = initial; // this gets reduced with every additional token
Why does it get reduced with every additional token?
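If I read the intent right (purely a guess; the names and decay factor below are invented), the idea seems to be that the longer a paragraph runs without a split, the smaller the pause needed to trigger one, so overly long paragraphs still get broken up eventually:

```ts
// Illustrative sketch, not code from the PR.
function sketchSplitIndices(tokens: { text: string; pause: number }[], initial: number): number[] {
  const splits: number[] = [];
  let minPauseBetweenSentences = initial;
  tokens.forEach((token, i) => {
    const endsSentence = /[.!?]\s*$/.test(token.text);
    if (endsSentence && token.pause >= minPauseBetweenSentences) {
      splits.push(i);
      minPauseBetweenSentences = initial; // reset after a split
    } else {
      minPauseBetweenSentences *= 0.99; // hypothetical per-token decay
    }
  });
  return splits;
}
```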
children: [] as { text: string }[],
};
paragraph.children.forEach((token, i) => {
currentParagraph.children.push(JSON.parse(JSON.stringify(token)));
addNewChild(currentParagraph);
}
});
doc.children = newChildren;
I think doing it this way totally fucks up collaborative editing...
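To make that concrete, assuming the editor is Slate-based and synced over the websocket (inferred from `EditorWithWebsocket`, so treat this as a sketch): expressing the reflow as editor transforms keeps it a series of small operations the collaboration layer can merge, instead of replacing `doc.children` wholesale.

```ts
import { Editor, Path, Transforms } from 'slate';

// Sketch: split a paragraph node before the child at childIndex using a Slate
// operation instead of mutating doc.children directly.
function splitParagraphBefore(editor: Editor, paragraphPath: Path, childIndex: number) {
  const at = Editor.start(editor, [...paragraphPath, childIndex]);
  Transforms.splitNodes(editor, { at, always: true });
}

// Batching many splits inside Editor.withoutNormalizing(editor, () => { ... })
// avoids normalizing intermediate states mid-reflow.
```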
)}
</TopBarPart>
<TopBarPart>
{editor && <TextTools editor={editor} />}
This should be gated on data?.can_write, no?
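i.e. roughly (assuming `data` here is the query result that carries the permission flags):

```tsx
<TopBarPart>
  {editor && data?.can_write && <TextTools editor={editor} />}
</TopBarPart>
```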