-
Notifications
You must be signed in to change notification settings - Fork 9
Description
Hi there! I'm really impressed with the technical depth of this package, and I'm keen on trying to deepen my understanding of how it's designed and why.
My question
Assuming my understanding of the SumTypes and MLStyle approaches to implementing sum types is correct (see below), my general question is "what are the performance tradeoffs between these approaches?"
Sub questions:
- Are the structs defined by SumTypes fixed size no matter which conceptual type they represent, or does their generic typing afford flexibility there?
- If so, does that trade off potentially higher memory usage?
- If so, does it enable performance wins for things like an Array, since indexing is simpler?
- If not, does that trade off performance for things like Arrays?
- What kinds of use cases would one of the two approaches offer substantially better performance than the other, and in which kind of use cases would the performance difference be small?
- In general (beyond the world of sum types), how bad is it to use an abstract type vs. a union on concrete types vs. a single concrete type for things like Arrays?
My (possibly faulty) understanding of how we can implement sum types in Julia
As far as I can tell, there are two main options to offering an algebraic datatype experience in Julia.
First there's SumTypes, which like Unityper defines a single struct which can serve to store and access one of several different "conceptual" types.
E.g.
# Let's look under the hood of SumTypes.@sum_type.
macroexpand(
@__MODULE__,
:(
SumTypes.@sum_type Example begin
Stringy(value::String)
Symbolic(name::Symbol)
end
)
)
# We get this.
($(Expr(:toplevel, quote
#= /home/luke/.julia/packages/SumTypes/qpNRC/src/sum_type.jl:260 =#
struct Example{var"#N#", var"#M#", var"#FT#"}
bits::(Tuple{Vararg{T, N}} where {N, T}){var"#N#", UInt8}
ptrs::(Tuple{Vararg{T, N}} where {N, T}){var"#M#", Any}
var"#tag#"::var"#FT#"
1 + 1
end
... # <truncated>The rest of the generated code is pretty involved, so I don't follow completely, but my understanding is that we're doing some pretty heavy lifting to somewhat reimpliment structs at a pretty low level (including unsafe memory viewing?). My guess is that we do this for performance reasons, but I don't really understand where the performance gains come from.
On the other hand, MLStyle seems to just be providing syntactic sugar around what I'm assuming will eventually be equivalent to isa checks on multiple different concrete implementations of an abstract type. In other words, still using Julia's type system to define an abstract type and several concrete subtypes (one for each of the possible types of the sum type).
# Let's look under the hood of MLStyle.@data.
macroexpand(
@__MODULE__,
:(
MLStyle.@data Example begin
Stringy(value::String)
Symbolic(name::Symbol)
end
)
)
quote
abstract type Example end
struct Stringy{} <: Example
#= REPL[7]:5 =#
value::String
end
nothing
begin
#= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:95 =#
#= REPL[7]:5 =#
#= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:96 =#
function (MLStyle).pattern_uncall(t::Type{<:Stringy}, self::Function, type_params::Any, type_args::Any, args::Any)
#= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:96 =#
#= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:103 =#
#= REPL[7]:5 =#
#= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:104 =#
(MLStyle.Record._compile_record_pattern)(t, self, type_params, type_args, args)
end
end
struct Symbolic{} <: Example
#= REPL[7]:6 =#
name::Symbol
end
nothing
begin
#= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:95 =#
#= REPL[7]:6 =#
#= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:96 =#
function (MLStyle).pattern_uncall(t::Type{<:Symbolic}, self::Function, type_params::Any, type_args::Any, args::Any)
#= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:96 =#
#= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:103 =#
#= REPL[7]:6 =#
#= /home/luke/.julia/packages/MLStyle/SLOsr/src/Record.jl:104 =#
(MLStyle.Record._compile_record_pattern)(t, self, type_params, type_args, args)
end
end
endThis seems like it offers fewer "sharp edges" with respect to matching the standard mental model of types in Julia, but I'm guessing it is trading off performance somewhere, and I'm hoping to better understand where!