Conversation
calpt left a comment
Thanks a lot for working on this! Already looks pretty good overall, I left some comments regarding the open issues that are hopefully helpful.
Once that's done, please also add this new model type to the docs as described in the contributing guide, thanks!
```python
head_types = [
    "classification",
    "multilabel_classification",
    "question_answering",
    "seq2seq_lm",
]
```
This defines the range of supported heads. Since I believe we'd only want to support sequence generation, you can remove everything except for `"seq2seq_lm"` here.
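As a minimal sketch of the suggested change (assuming, as the comment above says, that only sequence generation should be supported for this translation model), the list would be reduced to a single entry:

```python
# Sketch of the reduced head type registration: since M2M100/NLLB is a
# translation model, only the sequence-to-sequence LM head is kept.
head_types = [
    "seq2seq_lm",
]
```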
```python
ParallelAdapterInferenceTestMixin,
ParallelTrainingMixin,
```
In case we don't want to support Parallel composition (which is totally fine), please remove these two mixins to disable the tests. Otherwise, Parallel composition can be supported by adding the model type here:
adapters/src/adapters/composition.py, line 121 (commit f0ca962)
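Purely as an illustration of the idea behind that line (this is not the library's actual code; the variable name and existing entries are assumptions), Parallel support can be thought of as a per-composition-block whitelist of model type strings, so enabling it for a new model means adding its type string:

```python
# Hypothetical illustration: composition support as a whitelist of model
# type strings. Enabling Parallel for the new model means appending its
# model type to the list that gates the Parallel block.
parallel_supported_models = ["bert", "bart", "t5"]  # assumed existing entries
parallel_supported_models.append("m2m_100")  # register the new model type
```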
```python
from .test_adapter_heads import PredictionHeadModelTestMixin
# ...
class M2M100AdapterTestBase(AdapterTestBase):
```
Since this model doesn't support text classification tasks (and we test text classification training by default in the test runs), we'd have to override the `add_head()` and `dataset()` methods here. That would look roughly like this:
adapters/tests/test_bert_generation.py, lines 29 to 70 (commit f0ca962)
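A self-contained sketch of the override pattern described above (the real `AdapterTestBase` and head API are not reproduced here; the stub base class, the `add_seq2seq_lm_head` call, and the toy dataset are assumptions modeled on the referenced test file):

```python
class AdapterTestBase:
    """Stand-in for the library's real test base class."""
    pass


class M2M100AdapterTestBase(AdapterTestBase):
    def add_head(self, model, name, **kwargs):
        # Attach a generation head instead of the default classification
        # head, since M2M100 only supports seq2seq language modeling.
        model.add_seq2seq_lm_head(name)

    def dataset(self, tokenizer=None):
        # Return a tiny translation-style dataset; the real tests load an
        # actual dataset and tokenize source/target pairs here.
        return [{"input_ids": [0, 1, 2], "labels": [3, 4, 5]}]
```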
Thank you so much for the helpful tips! It seems I have one major issue left, which is connected to the issue raised here: Do you by any chance have any advice on how to address this? Thank you!
Hey, I tried to reproduce these issues, but got the following error message instead: Could it be that the code state pushed here is not the latest one?
I implemented AdapterHub support for the Facebook NLLB model and its underlying M2M100 architecture. I have carried out all the relevant tests, auto-formatting, and quality checks.
The code passes 124 tests, skipping 7 and failing 11. The 11 failing tests are all connected to Parallel composition blocks and flex heads, neither of which I implemented. As this is a machine translation model, it does not need classification heads on top of it, but I couldn't find out how to disable the irrelevant `head_types` in the `ADAPTER_MODEL_MAPPING` dictionary so that these tests are skipped. Any advice on this is greatly appreciated!
Key addition:
`M2M100AdapterModel` class with the relevant `WithAdapters` and `AdaptersMixin` classes implemented.