
Fix the `_approx_token_len` method logic to avoid double-counting Chinese characters #327

Open
zzhRooT1998 wants to merge 1 commit into datawhalechina:main from zzhRooT1998:fix/approx_token_len


@zzhRooT1998

Refactor token length estimation for mixed Chinese and English text. Improve handling of CJK characters and non-CJK tokens.
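The diff itself is not shown in this conversation, so the following is only a minimal sketch of how such an estimate might avoid counting Chinese characters twice. The function name `approx_token_len_sketch`, the CJK range (CJK Unified Ideographs only), and the rule of one token per CJK character plus one token per whitespace-delimited non-CJK run are all assumptions for illustration, not the PR's actual code.

```python
import re

# Assumption: only the CJK Unified Ideographs block is treated as CJK here.
_CJK_RE = re.compile(r"[\u4e00-\u9fff]")


def approx_token_len_sketch(text: str) -> int:
    """Rough token estimate for mixed Chinese/English text (hypothetical helper)."""
    # Count each CJK character as one token.
    cjk_count = len(_CJK_RE.findall(text))
    # Blank out the CJK characters so they are not counted a second time
    # when the remaining text is split into whitespace-delimited chunks.
    remainder = _CJK_RE.sub(" ", text)
    non_cjk_count = len(remainder.split())
    return cjk_count + non_cjk_count


if __name__ == "__main__":
    print(approx_token_len_sketch("hello 世界 world"))
```

The key design choice is replacing CJK characters with spaces before splitting: a run such as `中文abc中文` then contributes four CJK tokens plus one non-CJK token, rather than being counted both per character and again as a single whitespace-delimited chunk.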



2 participants