[head] Relation extraction + NER multitask head #371
Description
Is your feature request related to a problem? Please describe.
Joint models that perform relation extraction and NER together typically perform better and simplify extraction problems. Supporting them would let us better cover information extraction use cases.
Describe the solution you'd like
The solution is to create a joint task head performing NER -> Relation Extraction (Classification). This can be done by combining our current TokenClassification and RelationClassification heads.
I include a working implementation draft (https://gist.github.com/dvsrepo/a33bcd1c4e7074fbf15aefdccca5b46f) with several caveats:
- We need to extend our current vocabulary handling so that heads can have custom label namespaces (the namespace is currently fixed in vocabulary.LABEL_NAMESPACE). When you start combining heads with different label domains (e.g., labels for a classifier and tags for a token classifier), they will overwrite each other, leading to indexing issues. Ideally, the label namespace could be set in the head (although I would not recommend asking the user for it in the init or configuration).
- The loss could be calculated with different coefficients, e.g. loss_classifier + 0.5*loss_ner. This is a hyperparameter that could be optimized with HPO, so it should go in the head config.
- We need to think about the TaskOutput and the metrics report (see the implementation for a rough idea).
- This is the first implementation of a multitask head, so we should set the basis for other multitask models (e.g., classification + LM loss term).
- The backbone forward pass is done twice (or N times if we had N heads).
- There are some issues with the default_mapping functionality when we have several optional params (entities and labels in our case); see the data creation in the gist.
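
To make the loss-coefficient and backbone caveats more concrete, here is a minimal, framework-free sketch. It is not the gist implementation and does not use our real classes: `JointHead`, `JointHeadConfig`, `CountingBackbone` and the two toy loss functions are hypothetical names invented for illustration, and the "encoding" is a plain list of floats standing in for a tensor. The idea is only that the backbone is encoded once per batch, the result is shared by both sub-heads, and the per-task weights live in the head config so HPO can tune them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Stand-in for the backbone's encoded output (a tensor in a real model).
Encoding = List[float]
# In this sketch a sub-head is a function from the shared encoding to a loss.
SubHead = Callable[[Encoding], float]


@dataclass
class JointHeadConfig:
    """Hypothetical head config; the weights are ordinary hyperparameters."""

    classifier_weight: float = 1.0
    ner_weight: float = 0.5  # mirrors loss_classifier + 0.5*loss_ner


class CountingBackbone:
    """Toy backbone that counts how many forward passes it has run."""

    def __init__(self) -> None:
        self.calls = 0

    def forward(self, tokens: Sequence[str]) -> Encoding:
        self.calls += 1
        # Fake features: one number per token.
        return [float(len(token)) for token in tokens]


class JointHead:
    """Runs the backbone once and feeds the same encoding to both sub-heads."""

    def __init__(
        self,
        backbone: CountingBackbone,
        ner_head: SubHead,
        classifier_head: SubHead,
        config: JointHeadConfig,
    ) -> None:
        self.backbone = backbone
        self.ner_head = ner_head
        self.classifier_head = classifier_head
        self.config = config

    def forward(self, tokens: Sequence[str]) -> Dict[str, float]:
        encoding = self.backbone.forward(tokens)  # single shared pass
        loss_ner = self.ner_head(encoding)
        loss_classifier = self.classifier_head(encoding)
        loss = (
            self.config.classifier_weight * loss_classifier
            + self.config.ner_weight * loss_ner
        )
        # Keep the per-task losses so they can be reported separately.
        return {
            "loss": loss,
            "loss_ner": loss_ner,
            "loss_classifier": loss_classifier,
        }


def toy_ner_loss(encoding: Encoding) -> float:
    """Placeholder loss: mean of the fake features."""
    return sum(encoding) / len(encoding)


def toy_classifier_loss(encoding: Encoding) -> float:
    """Placeholder loss: max of the fake features."""
    return max(encoding)


if __name__ == "__main__":
    backbone = CountingBackbone()
    head = JointHead(backbone, toy_ner_loss, toy_classifier_loss, JointHeadConfig())
    output = head.forward(["Acme", "hired", "Ann"])
    print(output, "backbone passes:", backbone.calls)
```

Returning the individual losses next to the combined one is one possible starting point for the TaskOutput and metrics discussion, since each task can then be logged on its own.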