Open
Enhancement
Description
Some time ago, John developed a script to scrape build job data from ClearML (research and production runs), which I've since refined and expanded. To my knowledge, we currently use this data only for some high-level reporting (e.g., how many projects are using Serval, how many new projects this month, etc.), but I think there's a lot more we could glean from it. Here are some ideas:
- Using the ClearML data, we can identify which production projects are long-time, consistent users of our tools. Knowing this, we could:
  - Update/expand our list of standard NMT testing projects to include some of these projects, so we can see how updates to the pipeline would affect our most consistent users, not only in regard to scores like BLEU but also in regard to newer features like marker placement or quotation denormalization.
  - Attempt to identify patterns in these projects: Are they from certain language families? Certain regions? Certain partner organizations? If we run a test against a random 250, do they tend to get higher scores than the non-long-time projects?
  - Reach out to the project owners and develop some kind of inner circle where we could explore feature ideas and hear concerns.
- Using the production data, we could also do the reverse: What projects tried it once a long time ago and haven't tried it since? We could do similar things for these projects as mentioned in the bullet points above as well as:
  - Reach out to these projects and ask whether they've encountered difficulties with their draft or encourage them to retry (if it's been long enough).
- Since we can scrape the complete config from all research ClearML runs through the API, there's an opportunity to analyze the effect of different configuration options retrospectively. This could include:
  - Language or script codes (which could be mapped to families, regions, etc.)
  - Hyperparameters
  - Or just establish baselines across many runs or long-term trends (are our drafts getting better?)
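
To make the retrospective analysis a bit more concrete, here is a minimal sketch of grouping scores by a single config option. It assumes the scraped runs have already been flattened into plain dictionaries; the field names (`src_script`, `learning_rate`, `bleu`) and the values are made up for illustration and are not the real ClearML schema or real results.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical flattened records, one per research run. Field names and
# values are placeholders, not the real ClearML schema or real scores.
runs = [
    {"src_script": "Latn", "learning_rate": 1e-4, "bleu": 28.0},
    {"src_script": "Latn", "learning_rate": 2e-4, "bleu": 32.0},
    {"src_script": "Deva", "learning_rate": 1e-4, "bleu": 21.0},
    {"src_script": "Deva", "learning_rate": 2e-4, "bleu": 19.0},
]


def mean_score_by(runs, option, score="bleu"):
    """Average `score` over runs, grouped by the value of one config option."""
    groups = defaultdict(list)
    for run in runs:
        # Skip runs that lack either the option or the score.
        if option in run and score in run:
            groups[run[option]].append(run[score])
    return {value: mean(scores) for value, scores in groups.items()}


by_script = mean_score_by(runs, "src_script")
by_lr = mean_score_by(runs, "learning_rate")
```

A plain group-and-average like this ignores confounders (for example, one script may simply have more training data), so it would only be a first pass before anything more careful.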
And I'm sure there are more opportunities than these!
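
As a possible starting point for the first two ideas, here is a rough sketch of labeling production projects as long-time users or one-time users from their build dates. The record layout, the project IDs, and the thresholds (`min_builds`, `min_span_days`, `lapsed_days`) are all assumptions chosen for illustration; what counts as "consistent" or "long enough ago" would need to be decided by the team.

```python
from collections import defaultdict
from datetime import date

# Hypothetical production build records: (project_id, build_date).
builds = [
    ("proj_a", date(2023, 1, 10)),
    ("proj_a", date(2023, 6, 5)),
    ("proj_a", date(2024, 2, 20)),
    ("proj_a", date(2024, 9, 1)),
    ("proj_b", date(2023, 3, 15)),
    ("proj_c", date(2024, 8, 30)),
]


def classify_projects(builds, today, min_builds=3, min_span_days=365, lapsed_days=365):
    """Label each project as 'long_time', 'tried_once', or 'other'.

    All thresholds are illustrative assumptions, not agreed definitions.
    """
    by_project = defaultdict(list)
    for project, day in builds:
        by_project[project].append(day)

    labels = {}
    for project, days in by_project.items():
        first, last = min(days), max(days)
        if len(days) >= min_builds and (last - first).days >= min_span_days:
            # Several builds spread over at least `min_span_days`.
            labels[project] = "long_time"
        elif len(days) == 1 and (today - last).days >= lapsed_days:
            # A single build that happened a long time ago.
            labels[project] = "tried_once"
        else:
            labels[project] = "other"
    return labels


labels = classify_projects(builds, today=date(2024, 10, 1))
```

This only looks at build count and overall span; a stricter notion of consistency could also check the largest gap between consecutive builds.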
Metadata
Assignees
Labels
No labels
Type
Projects
Status
📋 Backlog