It would be great if future versions could support the CTranslate2 inference framework :) #2

@YLQY

Description

Hello, I am very glad to finally see the UnitySentis demo; it was a pleasant surprise. I am a speech recognition algorithm engineer, and a problem we constantly run into is model inference speed, i.e., real-time performance. I would like to use today's popular large models, such as ChatGPT and whisper-large-v2, in UnitySentis. We have tested the ONNX inference speed of Whisper in a Linux environment, and unfortunately it is still too slow to be usable.

So we have two options. The first is to reduce the model size, e.g. by using the whisper-small or whisper-base models, but that costs some accuracy. The second: we were very happy to find an inference library called CTranslate2 (https://github.com/OpenNMT/CTranslate2). We only need to export the model to the format CTranslate2 requires, and with its acceleration the real-time performance of whisper-large-v2 improves significantly: the RTF drops to 0.03 (on an A30 machine), and CPU inference speed also improves noticeably. Moreover, CTranslate2 supports most of the current mainstream Transformer models, so it would be great if UnitySentis could reference or integrate CTranslate2 in a future version. :)
I hope UnitySentis keeps getting better and better :)

Here are some links you may want to use
https://github.com/OpenNMT/CTranslate2
https://opennmt.net/CTranslate2
https://github.com/guillaumekln/faster-whisper/tree/master
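For reference, the export-then-infer workflow described above looks roughly like this with faster-whisper (a sketch; the converter flags, model names, and file paths are illustrative and should be checked against the CTranslate2 and faster-whisper documentation):

```python
# Step 1: convert the Hugging Face Whisper checkpoint to CTranslate2 format
# using the converter CLI that ships with CTranslate2 (run once, offline):
#   ct2-transformers-converter --model openai/whisper-large-v2 \
#       --output_dir whisper-large-v2-ct2 --quantization float16

from faster_whisper import WhisperModel  # pip install faster-whisper

# Step 2: load the converted model. Use device="cuda" with float16 on GPU,
# or device="cpu" with compute_type="int8" for quantized CPU inference.
model = WhisperModel("whisper-large-v2-ct2", device="cuda", compute_type="float16")

# Step 3: transcribe. Segments are yielded lazily as decoding proceeds,
# so long audio can be streamed out without waiting for the whole file.
segments, info = model.transcribe("audio.wav", beam_size=5)
print("Detected language:", info.language)
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```

The quantization choice (float16 vs. int8) is a large part of where the speedup comes from, alongside CTranslate2's fused CPU/GPU kernels.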
