Add a flag to control the use of views in gradients #223
penelopeysm wants to merge 3 commits into backport-0.6
Conversation
Force-pushed from 7965cee to 83b8bd7
AdvancedVI.jl documentation for PR #223 is available at:
What I don't really understand is that the same failure should have happened with DynamicPPL 0.38, because gradient prep there was also done with a Vector... So I don't know why CI is failing on TuringLang/Turing.jl#2702 but not TuringLang/Turing.jl#2699 :/
Ahh, it's because DPPL@0.38 has an extra line in

So an alternative fix would be to add that back to DynamicPPL. It doesn't cause issues elsewhere because I think everywhere else that we use LogDensityFunction we just pass it a vector.

An alternative fix would be that, in DynamicPPL, we could add a specialisation:

```julia
function LogDensityProblems.logdensity_and_gradient(ldf, x::SubArray)
    LogDensityProblems.logdensity_and_gradient(ldf, [v for v in x])
end

function LogDensityProblems.logdensity_and_gradient(ldf, x::AbstractVector{<:Real})
    ...
end
```

That might actually be easier.
The specialisation on `SubArray`:

```julia
julia> arr = randn(3, 3, 3);

julia> convert(Vector{Float64}, @view arr[1, :, 1])
3-element Vector{Float64}:
  1.7808290397025959
 -0.5444292197945666
 -0.04197939645098293
```

It seems like a waste though, to copy the view just because the gradient prep isn't ready for a view. Would it be overkill to make multiple

EDIT: Or rather, type stability itself would probably be fine, but the cost to pay would be that every
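(For context, here is a rough, non-authoritative sketch of the "one preparation per input type" idea being floated, assuming DifferentiationInterface is the preparation layer. The `CachedGradient` type and `cached_value_and_gradient` function are hypothetical names for illustration, not AdvancedVI or DynamicPPL code.)

```julia
using DifferentiationInterface
using ADTypes: AutoForwardDiff
import ForwardDiff

# Hypothetical cache holding one gradient preparation per concrete input type.
struct CachedGradient{F,B}
    f::F
    backend::B
    preps::Dict{DataType,Any}
end
CachedGradient(f, backend) = CachedGradient(f, backend, Dict{DataType,Any}())

function cached_value_and_gradient(c::CachedGradient, x::AbstractVector{<:Real})
    # Look up (or build) a preparation keyed by the concrete type of `x`.
    # The Dict lookup is untyped, so this is where type stability is lost.
    prep = get!(c.preps, typeof(x)) do
        prepare_gradient(c.f, c.backend, x)
    end
    return value_and_gradient(c.f, prep, c.backend, x)
end

c = CachedGradient(x -> sum(abs2, x), AutoForwardDiff())
arr = randn(3, 3)
cached_value_and_gradient(c, randn(3))         # preparation for Vector{Float64}
cached_value_and_gradient(c, @view arr[:, 1])  # separate preparation for the SubArray type
```

The `Dict` lookup keyed by `typeof(x)` is what gives up type stability, which is the trade-off raised in the EDIT above.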
That sounds like quite a lot of work to try, for uncertain pay-off. It may well be that once the gradient computation starts in earnest the view will have to get copied regardless. So very happy to just copy for now, and see if this can be improved later.
Yup, agreed on all counts.
Using `LogDensityProblems.logdensity_and_gradient` with views, as in e.g. AdvancedVI.jl/src/algorithms/fisherminbatchmatch.jl (lines 94 to 97 in 9e6948e), is incompatible with `DynamicPPL.LogDensityFunction`, because its `prepare_gradient` is run with a `Vector` argument. If you then attempt to call `logdensity_and_gradient` with a `SubArray`, it will error; see e.g. https://github.com/TuringLang/Turing.jl/actions/runs/19878777583/job/56972119769?pr=2702.
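For illustration, a minimal sketch of the underlying mismatch (not AdvancedVI or DynamicPPL code), assuming the preparation is done via DifferentiationInterface with a standalone function `f` and a ForwardDiff backend:

```julia
using DifferentiationInterface
using ADTypes: AutoForwardDiff
import ForwardDiff

f(x) = sum(abs2, x)
backend = AutoForwardDiff()

x = randn(4)
prep = prepare_gradient(f, backend, x)   # preparation done with a plain Vector

arr = randn(4, 2)
xv = @view arr[:, 1]                     # a SubArray, as produced by the batching code

# A preparation is only guaranteed to be valid for arguments of the same type it was
# built with; depending on the backend, this call can throw, as in the CI failure
# linked above.
value_and_gradient(f, prep, backend, xv)
```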
This PR therefore offers a way to disable this by means of overloading a method on `prob`.

Unfortunately, this is not a very clean solution: it would be nicer to have a field on `prob`. However, since the only requirement is that `prob` is an object that satisfies the LogDensityProblems interface, I don't think we can do any better. It could be put into a keyword argument of `algorithm`, but I think that's wrong, as this is a property of the `prob`.
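As a hypothetical illustration of the "overload a method on `prob`" pattern (the names below, including `supports_view_gradients`, are made up for this sketch and may differ from what the PR actually adds): a trait function with a permissive default that a specific problem type can override.

```julia
# Hypothetical names for illustration only; not the API added by this PR.

# Default: assume `prob` can handle gradient calls with SubArray (view) inputs.
supports_view_gradients(prob) = true

# A problem whose gradient preparation was done with a plain Vector opts out
# by overloading the method for its own type.
struct MyLogDensityProblem{F}
    logdensity::F
end
supports_view_gradients(::MyLogDensityProblem) = false

# Caller side: only pass the view through if the problem says it can handle it.
function gradient_input(prob, x::AbstractVector{<:Real})
    return supports_view_gradients(prob) ? x : collect(x)
end
```

Because the overload dispatches on the type of `prob`, it can live next to the problem definition rather than requiring AdvancedVI to add a field to an object it does not own.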