Clarification of the inclusion of Data in the boundary #13
jawache
started this conversation in
Software Components
Replies: 0 comments
@navveenb I don't believe it was just you; a few other people agreed with your point about including "data" in the software boundary.
I wanted to clarify what your intention was for including that. Correct me if I'm wrong.
I believe your intention in including "data" in the software boundary is to incentivize cleaning up and optimizing data before a training cycle, since better-quality data means less carbon/energy used in training. Is that correct?
The reason I want to clarify is that, if the above is correct, I would perhaps rephrase it in terms of wanting to "incentivize" the use of better-quality data in training.
Specifically including "data" in the software boundary means we would have to measure the carbon cost of collecting and cleaning data and then allocate that carbon to a functional unit, which is a complex challenge. I suspect that is not the true intention; the true intention is to use better data so that training costs are lower.
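
To make the allocation question concrete, here is a rough sketch of what "including data in the boundary" would imply arithmetically. The function name, the parameters, and all the numbers are hypothetical and purely illustrative; none of them come from the specification or from any real measurement.

```python
def carbon_per_functional_unit(training_g, data_prep_g, functional_units,
                               include_data_in_boundary):
    """Return grams of CO2e allocated to each functional unit.

    All inputs are hypothetical: training_g is the carbon attributed to
    training, data_prep_g is the carbon attributed to collecting and
    cleaning data, and functional_units is whatever unit we allocate to.
    """
    if functional_units <= 0:
        raise ValueError("functional_units must be positive")
    total_g = training_g
    if include_data_in_boundary:
        # Only counted when "data" is inside the software boundary,
        # which is the part that would need measuring and allocating.
        total_g += data_prep_g
    return total_g / functional_units


# Made-up figures, only to show the effect of the boundary choice.
without_data = carbon_per_functional_unit(1000.0, 200.0, 100, False)
with_data = carbon_per_functional_unit(1000.0, 200.0, 100, True)
```

With these made-up figures, the per-unit value is 10.0 without data in the boundary and 12.0 with it. The hard part is not this arithmetic but obtaining a defensible `data_prep_g` in the first place.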
Let me know if my understanding is correct.