Reconciliation of overlapping Gaussian processes
Combining Gaussian processes on the same domain; consistency; coherence; generalized phase retrieval
August 12, 2024
Suppose I have two random functions \(f_1(x)\) and \(f_2(x)\), each of which is a Gaussian process, i.e. a random function whose values at any finite set of points are jointly Gaussian. The two functions are defined on the same domain \(x \in \mathcal{X}\), and I want to reconcile them in some way. This could be because they are noisy observations of the same underlying function, or because they are two different models of the same phenomenon. This kind of thing arises often in spatiotemporal modeling, and there is also a close connection to Gaussian Belief Propagation. In either case, I want to find a way to combine them into a single random function \(f(x)\) that captures the information in both. Ideally, it should behave like a standard Bayesian update: if both components agree with high certainty, then the combined function should also be highly certain; if they disagree, or are individually uncertain, then the combined function should reflect that by having low certainty.
I am sure that this must be well-studied, but it is one of those things that is rather hard to google for and ends up being easier to work out by hand, which is what I do here.
1 Overlapping GPs
Suppose I have two GP priors \(f_1(x) \sim \mathcal{GP}(m_1(x), k_1(x, x'))\) and \(f_2(x) \sim \mathcal{GP}(m_2(x), k_2(x, x'))\) defined over the same index set. They could be two posteriors arising from different likelihood updates, or two expert priors, or whatever.
How do we reconcile these two GPs into a single GP \(f(x)\)?
The standard answer for Gaussians is to find a new one whose density is proportional to the product of the densities of the two components.
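Concretely, restrict attention to any finite set of index points, where the two processes have marginals \(\mathcal{N}(m_1, K_1)\) and \(\mathcal{N}(m_2, K_2)\). The normalized product of the two densities is again Gaussian, with
\[
K = \left(K_1^{-1} + K_2^{-1}\right)^{-1}, \qquad
m = K\left(K_1^{-1} m_1 + K_2^{-1} m_2\right),
\]
i.e. precisions add and the means combine precision-weighted, as in the familiar product of Gaussians.

Here is a minimal NumPy sketch of that fusion on a shared grid. The squared-exponential kernel, the means, and the grid are placeholder choices for illustration, not part of the recipe.

```python
import numpy as np

def rbf(x, xp, ell=1.0, sigma=1.0):
    """Squared-exponential kernel; an arbitrary choice for illustration."""
    d = x[:, None] - xp[None, :]
    return sigma**2 * np.exp(-0.5 * (d / ell) ** 2)

# A shared finite grid of index points on which to fuse the two GPs.
x = np.linspace(0.0, 5.0, 40)

# Marginals of the two (made-up) GPs on the grid.
m1, K1 = np.sin(x), rbf(x, x, ell=1.0)
m2, K2 = np.cos(x), rbf(x, x, ell=2.0, sigma=0.5)

eye = np.eye(len(x))
P1 = np.linalg.inv(K1 + 1e-6 * eye)  # jitter for numerical stability
P2 = np.linalg.inv(K2 + 1e-6 * eye)

# Product of the two Gaussian densities: precisions add,
# means combine precision-weighted.
K = np.linalg.inv(P1 + P2)
m = K @ (P1 @ m1 + P2 @ m2)
```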
2 Enveloping overlapping GPs
3 Connection to classical methods
This idea bears a resemblance to phase recovery by Griffin-Lim iteration, where we have two overlapping signals and we want to combine them into a single signal that is consistent with both (see the sketch after this list). That case is somewhat different because it assumes
- a point estimate will do; we do not talk about random functions, and
- the covariance kernels are implicitly stationary, which we do not assume here (and as such, there is not necessarily a “phase” to “recover”).
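For concreteness, here is what classical Griffin-Lim looks like in practice. This is a minimal sketch using librosa's stock implementation on one of its bundled example clips; the clip and iteration count are arbitrary.

```python
import numpy as np
import librosa

# Take the magnitude of a signal's STFT, discarding the phase.
y, sr = librosa.load(librosa.ex('trumpet'))
S = np.abs(librosa.stft(y))

# Griffin-Lim alternates between enforcing the observed magnitudes
# and consistency among the overlapping STFT frames, iterating
# towards a phase (and hence a waveform) compatible with both.
y_hat = librosa.griffinlim(S, n_iter=32)
```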