fix: support Pydantic MISSING sentinel in ops.Relation.save #2306
dimaqq wants to merge 1 commit into canonical:main
Conversation
james-garner-canonical left a comment
The core code change looks good to me, as does the to-do list.
  # Encode each value, and then pass it over to Juju.
- data = {field: encoder(values[attr]) for attr, field in sorted(fields.items())}
+ data = {field: encoder(values[attr]) if attr in values else '' for attr, field in sorted(fields.items())}
I feel that this is the wrong behaviour: if a field is missing, it should be missing in the dump as well, leaving the existing value unchanged.
Pydantic says:
The MISSING sentinel is a singleton indicating a field value was not provided during validation.
This singleton can be used as a default value, as an alternative to None when it has an explicit meaning. During serialization, any field with MISSING as a value is excluded from the output.
I don't get "delete this" from that definition. To me "excluded from the output" means that it is like you didn't have it in the class, and our behaviour in Ops (reflecting that of Juju) is that if it's not there, it means don't touch it. If you want to delete something, you need to do that explicitly.
As discussed in the daily today, if there's a use-case for entirely replacing the data, I would be fine with adding that as an argument for save. Maybe even a helper that does del self.data[app_or_unit][key] for all keys that have a specific value, or something like that (again, assuming there's a clear use-case).
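For illustration only, here is a minimal sketch of what such a helper could look like; the name delete_matching and the bucket parameter are hypothetical and not part of ops:

import typing

def delete_matching(bucket: typing.MutableMapping[str, str], sentinel: str) -> None:
    # ``bucket`` stands in for relation.data[app_or_unit]; drop every key whose
    # current value equals ``sentinel``. Snapshot the matching keys first so we
    # don't mutate the mapping while iterating over it.
    for key in [k for k, v in bucket.items() if v == sentinel]:
        del bucket[key]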
I think the correct fix here is something like this:
values = obj.model_dump(mode='json', by_alias=True, exclude_defaults=False)  # type: ignore
for name in values:
    # Pydantic takes care of the alias.
    field = obj.__class__.model_fields[name]
    fields[field.alias or name] = field.alias or name  # type: ignore
Or something similar: basically, only include in the dump the keys that model_dump provides, rather than all fields.
I somehow missed that nuance when looking at this, good point, Tony.
I guess we could also avoid deletion like this:
- data = {field: encoder(values[attr]) if attr in values else '' for attr, field in sorted(fields.items())}
+ data = {field: encoder(values[attr]) for attr, field in sorted(fields.items()) if attr in values}
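As a standalone illustration (plain dicts and json.dumps stand in for the real fields, values, and encoder; this is not the ops code itself), the two variants differ only in how absent attributes are handled:

import json

fields = {'bar': 'bar', 'foo': 'foo'}  # attr -> databag field name
values = {'foo': 1}  # 'bar' was MISSING, so model_dump dropped it
encoder = json.dumps

# Variant in this PR: write '' for absent attrs (which the thread treats as deleting the key).
with_blank = {field: encoder(values[attr]) if attr in values else ''
              for attr, field in sorted(fields.items())}
assert with_blank == {'bar': '', 'foo': '1'}

# Variant suggested above: skip absent attrs, leaving any existing databag key untouched.
skip_absent = {field: encoder(values[attr])
               for attr, field in sorted(fields.items())
               if attr in values}
assert skip_absent == {'foo': '1'}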
Yes, that's the suggestion I made wherever this was first brought up. I wondered if it was more efficient to do the loop the other way around, but probably in reality it doesn't make any difference either way. The @james-garner-canonical version is probably clearer to read.
Yep, that's the question: should we whack fields that are both declared and unset?
This PR is the most narrow approach: since this path was broken (run-time exception, #2299), implementing the missing value functionality this way is not a breaking change.
The issue I have with Tony's alternative is that it leaves MISSING as a no-op on the write path.
Which means the charmers would have to remove the values for declared fields out of band: there's no way to do that using relation.save(...), they'd have to do for field in some_fields: del relation[Unit/App][field] I think?
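Spelled out as a sketch (assuming this runs inside a charm event handler where event.relation and self.unit are available, and that some_fields holds the declared field names):

# Sketch only: out-of-band removal of declared-but-unset fields.
some_fields = ['foo', 'bar']  # hypothetical field names
for field in some_fields:
    if field in event.relation.data[self.unit]:
        del event.relation.data[self.unit][field]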
This PR is the most narrow approach: since this path was broken (run-time exception, #2299), implementing the missing value functionality this way is not a breaking change.
I agree it's not a breaking change. But my argument is that it is narrower, less surprising, more consistent, and more correct to fix it the other way.
Which means the charmers would have to remove the values for declared fields out of band: there's no way to do that using relation.save(...), they'd have to do for field in some_fields: del relation[Unit/App][field] I think?
Yes, if someone wants to remove a field, then they need to explicitly do so. I did say that it seems reasonable to provide a helper for this or an argument on the save method, if there is a compelling use case.
I'm still not convinced about the removal. If you downgrade, you've lost data, so behaviour changes (quite likely breaking the relation), and that doesn't seem like the happy path. If you stay at the same schema then there's some wasted space, but that doesn't seem so bad, and an "explicitly delete once all units have updated" path could assist with that. If you're upgrading, you mustn't reuse or change the field anyway, so the leftover data isn't in the way at all. We deliberately designed load so that fields that aren't in the schema are ignored.
DRAFT: shows that the approach works
TO DO:
Fixes #2299