Skip to content

Conversation

@Random-Scientist
Copy link

@Random-Scientist Random-Scientist commented Nov 22, 2024

Removes the lifetime parameter on the SpanContents trait. (addresses #410). This is a breaking change.

Up to authors/maintainers whether to merge this or not, here are the overall pros and cons of this approach:

Pros:

  • makes SpanContents much more flexible, as it is no longer forced to borrow its data from the corresponding SourceCode.
  • forces any wrapper types to actually wrap the inner SpanContents, as opposed to taking its data reference, calling the getters once and constructing a new MietteSpanContents<'data>. This probably doesn't matter that much, but it is technically possible to create a weird SpanContents implementation that does not return consistent values from its getters, which the current implementation of NamedSource does not handle correctly.

Cons:

  • Makes NamedSource and some user-defined SpanContents wrappers slightly less efficient. As discussed above, NamedSource and other user-defined wrappers are now forced to retain the inner SpanContents instead of copying the relevant data out of it, requiring the addition of an indirection if the inner SpanContents' type is not statically known. In practice many such usages will retain similar performance (at least in release builds) due to vtable inlining, but the fact remains that this change will result in equivalent or worse performance.
  • This is a breaking change, requiring the user to make major changes if they wrap SpanContents in a way similar to the current implementation of NamedSource (see the changes to NamedSource for an example of what this entails).

Semantic differences

Here are the implementations with all of their lifetime parameters unelided, current:

trait SpanContents<'a> {
    fn data<'elided>(&'elided self) -> &'a [u8];
    ...
}

and proposed:

trait SpanContents {
    fn data<'elided>(&'elided self) -> &'elided [u8];
    ...
}

The key difference in semantics between the two is that the former has an independent borrow lifetime which may outlive Self, i.e. the current definition can be used as below, while my proposed implementation cannot.

fn example<'a, 'b, T: SpannedIter<'b> + 'a>(val: T) -> &'b [u8] where 'b: 'a {
   let data = val.data();
   drop(val);
   return data;
}

However, this flexibility w.r.t. the lifetime of the borrow comes at a cost, forcing every implementation of SpanContents to borrow its data from somewhere else that lives for 'a because 'a may outlive Self.

This is the current signature of the source of all SpanContents, SourceCode::read_span:

fn read_span<'a>(
        &'a self,
        span: &SourceSpan,
        context_lines_before: usize,
        context_lines_after: usize,
    ) -> Result<Box<dyn SpanContents<'a> + 'a>, MietteError>;

Because of the + 'a bound on the trait object, the lifetime of SpanContents' borrow must be equal to the lifetime of Self. With this added constraint, the two approaches above become almost equivalent, most importantly they are both valid to call data on for the exact same lifetime ('a in the context of read_span). The only difference is that the borrow returned by data is now allowed to originate within the SpanContents (causing the caveats I mentioned above).

Errata

  • Is NamedSource intended to override the language all the time, or only when language is not None? (the previous behavior is all of the time but this feels like the wrong behavior)

I ran cargo test locally with these changes and all tests pass.

@Random-Scientist Random-Scientist changed the title Allow SpanContents to possibly own its backing data Allow implementations of SpanContents to own their backing data Nov 22, 2024
@zkat zkat added the breaking A semver-major breaking change label Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking A semver-major breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants