
Retrospective research opportunities and forward-thinking testing/engagement using ClearML data #915

@Enkidu93

Description

Some time ago, John developed a script to scrape build job data from ClearML (research and production runs), which I've since refined and expanded. Currently, we only use this data (to my knowledge) for some high-level reporting (e.g., how many projects are using Serval, how many new projects there are each month, etc.), but I think there's a lot more we could glean from it. Here are some ideas:

  • Using the ClearML data, we can identify which production projects are long-time, consistent users of our tools. Knowing this, we could:
    • Update/expand our list of standard NMT testing projects to include some of these projects so we can see how updates to the pipeline would affect our most consistent users - not only in terms of scores like BLEU but also in terms of newer features like marker placement or quotation denormalization.
    • Attempt to identify patterns in these projects: Are they from certain language families? Certain regions? Certain partner organizations? If we run a test against a random sample of 250 projects, do the long-time projects tend to score higher than the rest?
    • Reach out to the project owners and develop some kind of inner circle where we could explore feature ideas and hear concerns.
  • Using the production data, we could also do the reverse: Which projects tried our tools once a long time ago and haven't come back since? We could take the same steps for these projects as in the bullet points above, as well as:
    • Reach out to these projects to ask whether they've encountered difficulties with their drafts, or to encourage them to retry (if it's been long enough).
  • Since we can scrape the complete config from all research ClearML runs through the API, there's an opportunity to analyze the effect of different configuration options retrospectively (see the sketch after this list). This could include:
    • Language or script codes (which could be mapped to families, regions, etc.)
    • Hyperparameters
    • Or simply establishing baselines across many runs and tracking long-term trends (are our drafts getting better?)
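
As a rough illustration of the kind of scrape involved, here is a minimal sketch using the ClearML Python SDK. It is not the actual script; the project name and status filter are placeholders, and the fields pulled per run are just examples of what's available.

```python
# Minimal sketch (not the actual scrape script) using the ClearML Python SDK.
# "Research" and the status filter are placeholders for the real project setup.
from clearml import Task

# Fetch completed tasks for a (hypothetical) research project.
tasks = Task.get_tasks(
    project_name="Research",                # placeholder project name
    task_filter={"status": ["completed"]},
)

rows = []
for task in tasks:
    rows.append(
        {
            "task_id": task.id,
            "name": task.name,
            # Flattened config/hyperparameters for the run
            "params": task.get_parameters_as_dict(),
            # Last reported scalars, e.g. evaluation scores
            "metrics": task.get_last_scalar_metrics(),
        }
    )

# rows can then be dumped to CSV / a DataFrame for retrospective analysis:
# per-language baselines, hyperparameter effects, long-term score trends, etc.
```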

And I'm sure there are more opportunities than these!
