Status: RFC
Applies to: client, server
For a summarized list of proposed changes, see the Changes Checklist section.
This RFC defines how smithy-rs will enable customers to use the serde library with generated clients & servers. This is a common request
for myriad reasons, but as we have written about before this is a challenging design area. This RFC proposes a new approach: Rather than implement Serialize directly, add a method to types that returns a type that implements Serialize. This solves a number of issues:
- It is minimally impactful: It doesn't lock us into one Serialize implementation. It contains only one public trait, SerializeConfigured. This trait will initially be defined on a per-crate basis to avoid the orphan rule. It also doesn't have any impact on shared runtime crates (since no types actually need to implement Serialize).
- It allows customers to configure serde to their use case. For example, for testing/replay you probably don't want to redact sensitive fields, but for logging or other forms of data storage you may want to redact those fields.
- The entire implementation is isolated to a single module, making it trivial to feature-gate out.
- serde: A specific Rust library that is commonly used for serialization.
- Serializer: The serde design decouples the serialization format (e.g. JSON) from the serialization structure of a particular piece of data. This allows the same Rust code to be serialized to CBOR, JSON, etc. The serialization protocol, e.g. serde_json, is referred to as the Serializer.
- Decorator: The interface by which code separated from the core code generator can customize codegen behavior.
Currently, there is no practical way for customers to link Smithy-generated types with Serialize.
Customers will bring the SerdeDecorator into scope by including it on their classpath when generating clients and servers.
Customers may add a serde trait to members of their model:
use smithy.rust#serde;
@serde
structure SomeStructure {
    field: String
}

The serde trait can be added to all shapes, including operations and services. When it is applied to a service, all shapes in the service closure will support serialization.
Note: this RFC only describes Serialize. A follow-up RFC and implementation will handle Deserialize.
Generated crates that include at least one serde tagged shape will include a serde feature. This will feature gate the module containing the serialization logic. This will provide implementations of SerializeConfigured which provides two methods:
my_thing.serialize_ref(&settings); // Returns `impl Serialize + 'a`
my_thing.serialize_owned(settings); // Returns `impl Serialize`

Once a customer has an object that implements Serialize, they can then use it with any Serializer supported by serde.
use generated_crate::serde::SerializeConfigured;
let my_shape = Shape::builder().field(5).build();
let settings = SerializationSettings::redact_sensitive_fields();
let as_json = serde_json::to_string(&my_shape.serialize_ref(&settings));

The generated code includes two methods: serialize_redacted and serialize_unredacted.
Note: Nothing in these implementations relies on implementation details. Customers can implement these methods (or variants of them) themselves.
These have the correct signatures to be used with serialize_with:
use generated_crate::serde::serialize_redacted;
#[derive(Serialize)]
struct MyStruct {
    #[serde(serialize_with = "serialize_redacted")]
    inner: SayHelloInput,
}

This will be supported in the future. Currently Deserialize behavior is not covered by this RFC. Customers can take the same serialization settings they used.
This is possible by using the base APIs. If customers want to delegate the actual serialization to another thread or piece of code, they can use .serialize_owned(..) along with erased-serde to accomplish this.
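The ownership point can be illustrated without serde at all. The sketch below is a std-only analogue, not the generated code: `std::fmt::Display` stands in for `Serialize`, a boxed `dyn Display + Send` stands in for an erased-serde trait object, and `Settings`, `Shape`, `Owned`, and `render_on_worker` are hypothetical names invented for illustration.

```rust
use std::fmt;
use std::thread;

// Hypothetical stand-ins; these are not the generated smithy-rs types.
struct Settings {
    redact: bool,
}

struct Shape {
    secret: String,
}

// Owns both the value and the settings, in the spirit of what
// `serialize_owned` returns, so it has no borrowed lifetime.
struct Owned {
    value: Shape,
    settings: Settings,
}

impl fmt::Display for Owned {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.settings.redact {
            f.write_str("<redacted>")
        } else {
            f.write_str(&self.value.secret)
        }
    }
}

// Moves an owned, type-erased value to a worker thread that renders it.
// A wrapper that borrowed from the caller could not be moved like this,
// because `thread::spawn` requires `'static` data.
fn render_on_worker(value: Box<dyn fmt::Display + Send>) -> String {
    thread::spawn(move || value.to_string())
        .join()
        .expect("worker thread panicked")
}
```

With real serde types the same shape of code would presumably box the result of `.serialize_owned(..)` as an erased-serde trait object instead of `dyn Display`.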
In order to provide configurable serialization, this defines the crate-local public trait SerializeConfigured:
/// Trait that allows configuring serialization
/// **This trait should not be implemented directly!** Instead, `impl Serialize for ConfigurableSerdeRef<T>`
pub trait SerializeConfigured {
    /// Return a `Serialize` implementation for this object that owns the object.
    ///
    /// Use this if you need to create `Arc<dyn Serialize>` or similar.
    fn serialize_owned(self, settings: SerializationSettings) -> impl Serialize;

    /// Return a `Serialize` implementation for this object that borrows from the given object
    fn serialize_ref<'a>(&'a self, settings: &'a SerializationSettings) -> impl Serialize + 'a;
}

We also need to define SerializationSettings. The only setting currently exposed is redact_sensitive_fields:
#[non_exhaustive]
#[derive(Copy, Clone, Debug, Default)]
pub struct SerializationSettings {
    /// Replace all sensitive fields with `<redacted>` during serialization
    pub redact_sensitive_fields: bool,
}

We MAY add additional configuration options in the future, but will keep the default behavior matching current behavior. Future options include:
- Serialize null when a field is unset (the current default is to skip serializing that field)
- Serialize blobs via a list of numbers instead of via base64 encoding
- Change the default format for datetimes (currently HttpDate)
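Because the struct is marked `#[non_exhaustive]`, code outside the defining crate cannot build it with struct-literal syntax, so named constructors are the natural entry point. The usage example earlier calls `SerializationSettings::redact_sensitive_fields()`. A minimal sketch of how such constructors might look follows; `leak_sensitive_fields` is a hypothetical name for the non-redacting counterpart and is not defined by this RFC.

```rust
/// Mirrors the struct defined above.
#[non_exhaustive]
#[derive(Copy, Clone, Debug, Default)]
pub struct SerializationSettings {
    /// Replace all sensitive fields with `<redacted>` during serialization
    pub redact_sensitive_fields: bool,
}

impl SerializationSettings {
    /// Settings that replace sensitive fields with `<redacted>`.
    pub const fn redact_sensitive_fields() -> Self {
        Self { redact_sensitive_fields: true }
    }

    /// Hypothetical counterpart that serializes sensitive fields as-is.
    pub const fn leak_sensitive_fields() -> Self {
        Self { redact_sensitive_fields: false }
    }
}
```

Note that with the derived `Default` shown here, `SerializationSettings::default()` leaves `redact_sensitive_fields` set to `false`.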
No generated type implements SerializeConfigured directly. Instead, the crate defines two crate-private structs:
pub(crate) struct ConfigurableSerde<T> {
    pub(crate) value: T,
    pub(crate) settings: SerializationSettings,
}
pub(crate) struct ConfigurableSerdeRef<'a, T> {
    pub(crate) value: &'a T,
    pub(crate) settings: &'a SerializationSettings,
}

Why two structs?
We need to support two use cases: one where the customer wants to maintain ownership of their data, and another where the customer wants to create a Box<dyn Serialize> or other fat pointer. The Serialize impl for ConfigurableSerde<T> is a blanket impl that delegates to ConfigurableSerdeRef<T>.
The SerializeConfigured trait has a blanket impl for every T whose ConfigurableSerdeRef<T> implements Serialize:
/// Blanket implementation for all `T` such that `ConfigurableSerdeRef<'a, T>` implements `Serialize`.
impl<T> SerializeConfigured for T
where
    for<'a> ConfigurableSerdeRef<'a, T>: Serialize,
{
    fn serialize_owned(self, settings: SerializationSettings) -> impl Serialize {
        ConfigurableSerde {
            value: self,
            settings,
        }
    }

    fn serialize_ref<'a>(
        &'a self,
        settings: &'a SerializationSettings,
    ) -> impl Serialize + 'a {
        ConfigurableSerdeRef {
            value: self,
            settings,
        }
    }
}

The job of the code generator is then to implement Serialize for ConfigurableSerdeRef<T> for all the specific T that we wish to serialize.
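To show how the pieces fit together, here is a std-only analogue of the same pattern, offered as a sketch rather than the generated code. `std::fmt::Display` stands in for `Serialize`, the methods return concrete wrapper types instead of `impl Trait`, and every name (`Settings`, `Configured`, `ConfiguredRef`, `DisplayConfigured`, `Login`) is hypothetical.

```rust
use std::fmt;

pub struct Settings {
    pub redact: bool,
}

// Borrowed wrapper: the only type that per-shape impls are written for.
pub struct ConfiguredRef<'a, T> {
    value: &'a T,
    settings: &'a Settings,
}

// Owned wrapper: exists so callers can obtain a value with no borrowed lifetime.
pub struct Configured<T> {
    value: T,
    settings: Settings,
}

// The owned wrapper delegates to the borrowed one, so the per-shape logic
// lives in exactly one place.
impl<T> fmt::Display for Configured<T>
where
    for<'a> ConfiguredRef<'a, T>: fmt::Display,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let borrowed = ConfiguredRef { value: &self.value, settings: &self.settings };
        fmt::Display::fmt(&borrowed, f)
    }
}

pub trait DisplayConfigured: Sized {
    fn display_owned(self, settings: Settings) -> Configured<Self>;
    fn display_ref<'a>(&'a self, settings: &'a Settings) -> ConfiguredRef<'a, Self>;
}

// Blanket impl: any T gets both methods once its borrowed wrapper is displayable.
impl<T> DisplayConfigured for T
where
    for<'a> ConfiguredRef<'a, T>: fmt::Display,
{
    fn display_owned(self, settings: Settings) -> Configured<Self> {
        Configured { value: self, settings }
    }

    fn display_ref<'a>(&'a self, settings: &'a Settings) -> ConfiguredRef<'a, Self> {
        ConfiguredRef { value: self, settings }
    }
}

// What a code generator would emit per shape: a single impl on the borrowed wrapper.
pub struct Login {
    pub user: String,
    pub password: String,
}

impl<'a> fmt::Display for ConfiguredRef<'a, Login> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let password = if self.settings.redact {
            "<redacted>"
        } else {
            self.value.password.as_str()
        };
        write!(f, "user={} password={}", self.value.user, password)
    }
}
```

The design choice this highlights is that the generator only ever writes one impl per shape (on the borrowed wrapper); the owned wrapper and the public trait come for free from the two blanket impls.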
Handling @sensitive is done by wrapping members in Sensitive<'a, T>(&'a T) during serialization. The Serialize implementation consults the settings to determine if redaction is required.
if let Some(member_1) = &inner.foo {
    s.serialize_field(
        "foo",
        &Sensitive(&member_1.serialize_ref(&self.settings)).serialize_ref(&self.settings),
    )?;
}

Note that the exact mechanism for supporting sensitive shapes is crate-private and can be changed in the future.
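The redaction decision itself is small. The following std-only sketch shows one way a wrapper could consult the settings at render time; it uses `std::fmt::Display` as a stand-in for `Serialize`, and `Settings`, `WithSettings`, and `render_password` are hypothetical names, so it should be read as an illustration of the idea rather than the crate-private mechanism.

```rust
use std::fmt;

struct Settings {
    redact: bool,
}

// Marks a borrowed member as targeting a @sensitive shape.
struct Sensitive<'a, T>(&'a T);

// Pairs a borrowed value with the settings in effect (hypothetical helper).
struct WithSettings<'a, T> {
    value: &'a T,
    settings: &'a Settings,
}

// Only the `Sensitive` specialization looks at the redaction flag.
impl<'a, 'b, T: fmt::Display> fmt::Display for WithSettings<'a, Sensitive<'b, T>> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.settings.redact {
            f.write_str("<redacted>")
        } else {
            fmt::Display::fmt(self.value.0, f)
        }
    }
}

// Wraps a member at the point of use, as the generated field code does.
fn render_password(password: &str, settings: &Settings) -> String {
    let sensitive = Sensitive(&password);
    let wrapped = WithSettings { value: &sensitive, settings };
    wrapped.to_string()
}
```

Because the wrapper is applied per member at serialization time, the underlying data type stays unchanged and the same value can be rendered both ways with different settings.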
For Maps and Lists, we need to be able to handle the case where two different Vec<String> may be serialized differently. For example, one may target a Sensitive string and the other may target a non-sensitive string.
To handle this case, we generate a wrapper struct for collections:
struct SomeStructWrapper<'a>(&'a Vec<SomeStruct>);

We then implement Serialize for this wrapper, which allows us to control behavior on a collection-by-collection basis without running into conflicts.
Note: This is a potential area where future optimizations could reduce the amount of generated code if we were able to detect that collection serialization implementations were identical and deduplicate them.
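As a rough illustration of why per-collection wrappers avoid conflicts, the sketch below gives two wrappers over the same `Vec<String>` type different rendering rules. It is a std-only analogue using `std::fmt::Display` in place of `Serialize`; `PlainNames`, `SecretTokens`, and `Settings` are hypothetical names.

```rust
use std::fmt;

struct Settings {
    redact: bool,
}

// Both wrappers borrow a Vec<String>, but each carries its own rules.
// A single impl on Vec<String> itself could not express both behaviors.
struct PlainNames<'a>(&'a Vec<String>);
struct SecretTokens<'a>(&'a Vec<String>, &'a Settings);

impl fmt::Display for PlainNames<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Members are not sensitive, so every element is written as-is.
        f.debug_list().entries(self.0.iter()).finish()
    }
}

impl fmt::Display for SecretTokens<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Members target a sensitive string, so each element honors the flag.
        let redact = self.1.redact;
        f.debug_list()
            .entries(
                self.0
                    .iter()
                    .map(|token| if redact { "<redacted>" } else { token.as_str() }),
            )
            .finish()
    }
}
```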
For custom types that do not implement Serialize, we generate crate-private implementations, only when actually needed:
impl<'a> Serialize for ConfigurableSerdeRef<'a, DateTime> {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        serializer.serialize_str(&self.value.to_string())
    }
}

- Define SerializeConfigured
- Define ConfigurableSerde/SerdeRef
- Generate implementations for all types in the service closure
- Handle sensitive shapes
- Implement Deserialize