
Noob question: performance tradeoffs vs. the MLStyle approach to sum types #29

@lukemerrick


Hi there! I'm really impressed with the technical depth of this package, and I'm keen to deepen my understanding of how it's designed and why.

My question

Assuming my understanding of the SumTypes and MLStyle approaches to implementing sum types is correct (see below), my general question is "what are the performance tradeoffs between these approaches?"

Sub questions:

  • Are the structs defined by SumTypes fixed size no matter which conceptual type they represent, or does their generic typing afford flexibility there?
    • If so, does that trade off potentially higher memory usage?
    • If so, does it enable performance wins for things like an Array, since indexing is simpler?
    • If not, does that trade off performance for things like Arrays?
  • In what kinds of use cases would one of the two approaches offer substantially better performance than the other, and in what kinds would the performance difference be small?
  • In general (beyond the world of sum types), how bad is it to use an abstract type vs. a union of concrete types vs. a single concrete type for things like Arrays? (There's a toy example of what I mean right after this list.)
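
To make that last sub-question concrete, here's a toy example of the three cases I have in mind (Number, Int, and Float64 are just stand-ins here, not the sum types discussed below):

# Three ways to type an array holding "mixed" values:
abstract_vec = Number[1, 2.0, 3]               # abstract element type
union_vec    = Union{Int, Float64}[1, 2.0, 3]  # small Union of concrete types
concrete_vec = Float64[1.0, 2.0, 3.0]          # single concrete type

# All three support the same operations; my question is how differently they perform.
sum(abstract_vec), sum(union_vec), sum(concrete_vec)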

My (possibly faulty) understanding of how we can implement sum types in Julia

As far as I can tell, there are two main options for offering an algebraic datatype experience in Julia.

First, there's SumTypes, which (like Unityper) defines a single struct that can store and access any one of several different "conceptual" types.

E.g.

# Let's look under the hood of SumTypes.@sum_type.
macroexpand(
    @__MODULE__,
    :(
        SumTypes.@sum_type Example begin
            Stringy(value::String)
            Symbolic(name::Symbol)
        end
    )
)

# We get this.
($(Expr(:toplevel, quote
    #= /home/luke/.julia/packages/SumTypes/qpNRC/src/sum_type.jl:260 =#
    struct Example{var"#N#", var"#M#", var"#FT#"}
        bits::(Tuple{Vararg{T, N}} where {N, T}){var"#N#", UInt8}
        ptrs::(Tuple{Vararg{T, N}} where {N, T}){var"#M#", Any}
        var"#tag#"::var"#FT#"
        1 + 1
    end
...  # <truncated>

The rest of the generated code is pretty involved, so I don't follow it completely, but my understanding is that it does some pretty heavy lifting to essentially reimplement structs at a fairly low level (including unsafe memory viewing?). My guess is that this is done for performance reasons, but I don't really understand where the performance gains come from.
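
If I squint, the generated code looks morally like a hand-rolled tagged union. Here's my own (very simplified) approximation of the idea, just to check my understanding; the real package seems to pack non-pointer fields into raw bits rather than keeping one field per variant like I do here:

# My simplified approximation of the "one struct + tag" idea (not what SumTypes generates).
struct ExampleLike
    tag::UInt8        # 0x01 => Stringy, 0x02 => Symbolic
    value::String     # only meaningful when tag == 0x01
    name::Symbol      # only meaningful when tag == 0x02
end

stringy_like(s::String)  = ExampleLike(0x01, s, Symbol(""))
symbolic_like(n::Symbol) = ExampleLike(0x02, "", n)

# Every variant shares one concrete type, so containers of them stay concretely typed:
xs = [stringy_like("hi"), symbolic_like(:foo)]
typeof(xs)   # Vector{ExampleLike}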

On the other hand, MLStyle seems to just provide syntactic sugar around what I assume will eventually boil down to isa checks on multiple concrete implementations of an abstract type. In other words, it still uses Julia's type system to define an abstract type and several concrete subtypes (one for each possible variant of the sum type).

# Let's look under the hood of MLStyle.@data.
macroexpand(
    @__MODULE__,
    :(
        MLStyle.@data Example begin 
            Stringy(value::String)
            Symbolic(name::Symbol)
        end
    )
)

quote
    abstract type Example end
    struct Stringy{} <: Example
        #= REPL[7]:5 =#
        value::String
    end
    nothing
    begin
        #= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:95 =#
        #= REPL[7]:5 =#
        #= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:96 =#
        function (MLStyle).pattern_uncall(t::Type{<:Stringy}, self::Function, type_params::Any, type_args::Any, args::Any)
            #= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:96 =#
            #= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:103 =#
            #= REPL[7]:5 =#
            #= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:104 =#
            (MLStyle.Record._compile_record_pattern)(t, self, type_params, type_args, args)
        end
    end
    struct Symbolic{} <: Example
        #= REPL[7]:6 =#
        name::Symbol
    end
    nothing
    begin
        #= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:95 =#
        #= REPL[7]:6 =#
        #= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:96 =#
        function (MLStyle).pattern_uncall(t::Type{<:Symbolic}, self::Function, type_params::Any, type_args::Any, args::Any)
            #= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:96 =#
            #= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:103 =#
            #= REPL[7]:6 =#
            #= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:104 =#
            (MLStyle.Record._compile_record_pattern)(t, self, type_params, type_args, args)
        end
    end
end

This seems like it offers fewer "sharp edges" with respect to matching the standard mental model of types in Julia, but I'm guessing it trades off performance somewhere, and I'm hoping to better understand where!
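
For what it's worth, my current hunch is that the biggest difference shows up in containers: with the abstract-type-plus-subtypes encoding, a vector of mixed variants has an abstract element type, whereas a single tagged struct keeps the element type concrete. Here's a rough, hand-rolled sketch of how I'd poke at that (toy types again, not the real macros):

# "MLStyle-style" encoding: abstract supertype with one concrete subtype per variant.
abstract type AbsExample end
struct AbsStringy  <: AbsExample; value::String; end
struct AbsSymbolic <: AbsExample; name::Symbol;  end

# "SumTypes-style" encoding, heavily simplified: one concrete struct with a tag.
struct TaggedExample
    tag::UInt8
    value::String
    name::Symbol
end

abs_vec    = AbsExample[AbsStringy("hi"), AbsSymbolic(:foo)]
tagged_vec = [TaggedExample(0x01, "hi", Symbol("")), TaggedExample(0x02, "", :foo)]

isconcretetype(eltype(abs_vec))     # false: the compiler only knows "some AbsExample"
isconcretetype(eltype(tagged_vec))  # true: every element has the same known layout

Is that roughly the right way to think about where the performance difference comes from?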
