Request for Guidance on Adding Hindi Support to CSM #116

@goravaa

Hi SesameAILabs team,

I’m interested in extending the CSM model to support Hindi for text-to-speech tasks. My goal is to adapt CSM to generate conversational speech in Hindi. Here are some of the aspects I’ve considered so far:

  • Data:
    I plan to collect new Hindi speech and text data, or use existing datasets, to fine-tune the model effectively.

  • Tokenizer:
    The current implementation uses a tokenizer (from Meta Llama) that is optimized for English. I might need to adapt or re-train a tokenizer (e.g., using SentencePiece) on Hindi text to better handle Hindi script and vocabulary.

  • Training Strategy:
    I’m considering a transfer learning approach where the model is fine-tuned on Hindi-specific data. This may require adjustments to both the text encoder and the audio decoder to capture the linguistic and acoustic nuances of Hindi.
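To make the tokenizer concern above concrete, here is a minimal sketch using only the Python standard library. It does not load the actual CSM or Llama tokenizer. It only measures how Hindi text looks at the byte level, on the assumption that a byte-level BPE vocabulary trained mostly on English has few merged Devanagari units and so falls back to something close to raw UTF-8 bytes. The function name `script_stats` and the sample strings are illustrative, not part of CSM.

```python
import unicodedata


def script_stats(text: str) -> dict:
    """Return rough size statistics for the non-whitespace characters of text."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        raise ValueError("text contains no non-whitespace characters")
    utf8_bytes = sum(len(ch.encode("utf-8")) for ch in chars)
    # Dependent vowel signs and the virama are combining marks (Mn or Mc).
    combining = sum(1 for ch in chars if unicodedata.category(ch) in ("Mn", "Mc"))
    return {
        "code_points": len(chars),
        "utf8_bytes": utf8_bytes,
        "bytes_per_code_point": utf8_bytes / len(chars),
        "combining_marks": combining,
    }


# Illustrative samples: a short Hindi phrase and a short English phrase.
hindi = script_stats("नमस्ते दुनिया")
english = script_stats("hello world")

print("hindi  :", hindi)
print("english:", english)
```

Each Devanagari code point takes 3 bytes in UTF-8, versus 1 byte for ASCII letters, so in the worst case a byte-fallback tokenizer spends roughly three times as many tokens per character on Hindi. Running the real tokenizer over a Hindi corpus and comparing tokens per word against English would be the proper check before deciding whether to extend or retrain the vocabulary.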

Could you please provide guidance on the following:

  • Are there any existing plans or recommendations for adding multilingual support, or Hindi support specifically, to CSM?

  • What best practices would you suggest for fine-tuning a model like CSM for a language with a different script and phonetic structure?

  • Are there any particular modifications or additional components (such as phoneme-based representations) that might improve Hindi speech generation?
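As an illustration of what a phoneme-based representation might look like for the last question, here is a toy sketch of a rule-based Devanagari-to-phoneme mapping. Everything in it is an assumption made for illustration: the tables cover only a handful of letters, the romanized phoneme symbols are ad hoc, and the function `to_phonemes` is hypothetical and not part of CSM. It handles the inherent vowel, vowel signs (matras), and the virama, but it does not implement schwa deletion, nasalization, or conjunct-specific rules, which a real Hindi front end would need.

```python
# Toy tables covering only a few letters; a real system needs the full inventory.
CONSONANTS = {
    "क": "k", "ग": "g", "त": "t", "द": "d", "न": "n", "म": "m",
    "य": "y", "र": "r", "ल": "l", "स": "s", "ह": "h",
}
INDEPENDENT_VOWELS = {
    "अ": "a", "आ": "aa", "इ": "i", "ई": "ii", "उ": "u", "ऊ": "uu", "ए": "e", "ओ": "o",
}
MATRAS = {
    "ा": "aa", "ि": "i", "ी": "ii", "ु": "u", "ू": "uu", "े": "e", "ो": "o",
}
VIRAMA = "\u094d"  # suppresses the inherent vowel of the preceding consonant


def to_phonemes(word: str) -> list:
    """Map a Devanagari word to a list of ad hoc phoneme symbols."""
    phonemes = []
    i = 0
    while i < len(word):
        ch = word[i]
        if ch in CONSONANTS:
            phonemes.append(CONSONANTS[ch])
            nxt = word[i + 1] if i + 1 < len(word) else ""
            if nxt in MATRAS:
                # An explicit vowel sign replaces the inherent vowel.
                phonemes.append(MATRAS[nxt])
                i += 1
            elif nxt == VIRAMA:
                # Virama: no vowel follows this consonant.
                i += 1
            else:
                # No sign follows, so the consonant keeps its inherent "a".
                phonemes.append("a")
        elif ch in INDEPENDENT_VOWELS:
            phonemes.append(INDEPENDENT_VOWELS[ch])
        else:
            raise ValueError(f"character not covered by this toy table: {ch!r}")
        i += 1
    return phonemes


print(to_phonemes("नमस्ते"))
print(to_phonemes("दुनिया"))
```

Because schwa deletion is not modeled, a word such as नमक comes out with a trailing "a" that spoken Hindi drops. Whether explicit phonemes help at all, compared with letting the model learn pronunciation directly from Devanagari text, is exactly the open question here.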

I’m open to collaboration and would greatly appreciate any feedback or pointers from the community.

Thank you for your work on CSM, and I look forward to your suggestions.
