Conversation
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Codecov Report
@@            Coverage Diff             @@
##           master     #121      +/-   ##
==========================================
+ Coverage   93.61%   94.41%   +0.79%
==========================================
  Files           5        5
  Lines         329      376      +47
==========================================
+ Hits          308      355      +47
  Misses         21       21
Continue to review full report at Codecov.
# pseudo-observation covariance matrix `Ŝ` such that ``\hat{C} = C - C (C + \hat{S})^{-1} C``.
#
# However, this is not necessarily a problem: if the likelihood used in the model is
# log-concave then the optimal choice for `Ĉ` can always be represented using this
I was wondering about that. I've definitely seen the result floating around (and it's easy enough to prove) -- will have a hunt.
if it's that easy to prove, just do it here in the docs 😂
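For concreteness, here's a tiny numerical sketch of that identity (plain LinearAlgebra; the matrices are made up for illustration):

```julia
using LinearAlgebra

# Ĉ = C - C (C + Ŝ)⁻¹ C, with Ŝ the pseudo-observation covariance.
C = Symmetric([2.0 0.5; 0.5 1.0])   # prior covariance (made-up values)
Ŝ = Symmetric([0.1 0.0; 0.0 0.2])   # pseudo-observation covariance (made-up values)
Ĉ = C - C * ((C + Ŝ) \ C)
isposdef(Symmetric(Ĉ))              # true: Ĉ is a valid covariance
```

By the Woodbury identity, `Ĉ = (C⁻¹ + Ŝ⁻¹)⁻¹`, so any positive-definite `Ŝ` yields a positive-definite `Ĉ`.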
# The reason to think that this parametrisation will do something sensible is this property.
# Obviously when ``\mathbf{v} \neq \mathbf{x}`` the optimal approximate posterior cannot be
# recovered; however, the hope is that there exists a small pseudo-dataset which gets
# close to the optimum.
what's the advantage of this parametrisation?
    return AbstractGPs.elbo(sva, l_fx, ys; kwargs...)
end

_get_prior(approx::SparseVariationalApproximation) = approx.fz.f
How about just calling it `prior`?

- _get_prior(approx::SparseVariationalApproximation) = approx.fz.f
+ prior(approx::SparseVariationalApproximation) = approx.fz.f
I think it shows up in a bunch more places (within AbstractGPs' VFE/DTC code too)...
PseudoObsSparseVariationalApproximation(
    f::AbstractGP,
    z::AbstractVector,
    S::AbstractMatrix{<:Real},
Hm, it's got to be a symmetric, pos-def matrix, right?
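(For illustration only, with made-up values and nothing taken from this PR: one standard way to guarantee that by construction is a Cholesky-style parametrisation.)

```julia
using LinearAlgebra

# Sketch: S = L * L' is symmetric positive-definite whenever L is
# lower-triangular with a positive diagonal. Values are illustrative.
L = LowerTriangular([1.0 0.0; 0.3 0.8])
S = Symmetric(L * L')
isposdef(S)   # true
```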
    S::AbstractMatrix{<:Real}, v::AbstractVector, y::AbstractVector{<:Real}
)

Chooses `likelihood(u) = N(y; f(v), S)` where `length(y)` need not be equal to the number
I don't understand how this makes any sense. On the left-hand side you have `u`, on the right-hand side you don't. How are the two coupled?
Ah, so it's implicit via `f`: `u := f(z)`, so by making noisy observations of `f(v)`, you learn something about `f(z)`. Not clear from what I wrote though.
...so then the inducing points are actually `v`? why are `u`/`z` needed then?
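A small sketch of that coupling, using only AbstractGPs' standard interface (the kernel and inputs here are made up):

```julia
using AbstractGPs

f = GP(SEKernel())
z = range(0.0, 1.0; length=5)   # inducing inputs: u := f(z)
v = range(0.2, 0.8; length=3)   # pseudo-observation inputs
# u and f(v) are jointly Gaussian under the prior, with generally
# nonzero cross-covariance, so observing N(y; f(v), S) constrains u.
cov(f, z, v)                    # 5×3 cross-covariance matrix
```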
    y = approx.likelihood.y
    S = approx.likelihood.S
    v = approx.likelihood.v
    return posterior(AbstractGPs.VFE(f(z, 1e-9)), f(v, S), y)
Could we name all these magic constants floating around? Should they be consistent? Should they be configurable? Here it's 1e-9, above it's zero, below it's 1e-18....
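One possible shape for that, purely as a hedged sketch (the constant name and keyword argument are hypothetical, not part of the package):

```julia
using AbstractGPs

# Hypothetical: name the jitter once and make it configurable.
const DEFAULT_JITTER = 1e-9   # made-up name, not defined in the PR

function _posterior_sketch(approx, f, z; jitter=DEFAULT_JITTER)
    y = approx.likelihood.y
    S = approx.likelihood.S
    v = approx.likelihood.v
    return posterior(AbstractGPs.VFE(f(z, jitter)), f(v, S), y)
end
```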
approx_centered = SparseVariationalApproximation(
    Centered(), f(z, 1e-12), qu
)
approx_post_centered = posterior(approx_centered)
approx_centered = SparseVariationalApproximation(
    Centered(), f(z, 1e-12), qu
)
Suggested change:

  approx_centered = SparseVariationalApproximation(
      Centered(), f(z, 1e-12), qu
  )
  approx_post_centered = posterior(approx_centered)
- approx_centered = SparseVariationalApproximation(
-     Centered(), f(z, 1e-12), qu
- )
approx_centered = SparseVariationalApproximation(
    Centered(), f(z, 1e-12), qu
)
Suggested change:

- approx_centered = SparseVariationalApproximation(
-     Centered(), f(z, 1e-12), qu
- )
I'm rather more hesitant about it, because any added code incurs a maintenance burden for the rest of its lifetime... so if you'd like to add these, I'd like to be more convinced that they're genuinely useful. Are they easier/faster to optimise? What do you gain from using them?
Whether or not we end up adding these specific parameterisations, I think it would still be good to add the …
As discussed on the call the other day, I've implemented a couple of pseudo-observation parametrisations.
The first has been used in a few places now, so I really think that we should support it. The thing I refer to as the "decoupled pseudo-observation" parametrisation is quite non-standard (I've not actually seen it used anywhere before, just been using it in my own work), but it's a really obvious extension, so I figure why not?
I've had to add an additional abstract type to make it possible to have different fields from those in `SparseVariationalApproximation`; that's why there are quite a lot of LOC changes (roughly the split sketched below).
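Roughly, the split looks like the following. This is a hedged sketch: the abstract type name and the pseudo-observation struct's fields are my reconstruction from this thread, not necessarily the exact code in the PR.

```julia
# Sketch only: names and fields are illustrative reconstructions.
abstract type AbstractSparseVariationalApproximation end

# Existing parametrisation: prior at the inducing inputs plus q(u).
struct SparseVariationalApproximation{Tfz,Tq} <: AbstractSparseVariationalApproximation
    fz::Tfz   # f(z), the prior at the inducing inputs
    q::Tq     # variational distribution over u = f(z)
end

# The new parametrisation carries different fields, hence the abstract supertype.
struct PseudoObsSparseVariationalApproximation{Tf,Tl} <: AbstractSparseVariationalApproximation
    f::Tf           # the prior GP
    likelihood::Tl  # pseudo-observations, e.g. (y, S, v)
end
```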
Also, I've expanded on the parametrisation explanations, and provided some code examples that actually run and are now CI'd, so they won't go stale.
Keen to know what people make of this.