This sampling step works with variables in their original space. To create
the sampling step we thus need to:
Create the initial value for the variables to sample;
Create a log-density graph that works in the transformed space, and
build a NUTS kernel that uses this graph;
Apply the default transformations to the initial values;
Apply the NUTS kernel to the transformed initial values;
Apply the backward transformation to the updated values.
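The transform/update/back-transform flow above can be sketched in NumPy with hypothetical stand-ins (the kernel here is a single gradient step, not a real NUTS update, and the Exp(1) target is only an example):

```python
import numpy as np

def forward(x):
    # default transform for a positive variable: map to the real line
    return np.log(x)

def backward(z):
    # inverse transform: map back to the original (positive) space
    return np.exp(z)

def kernel_update(z, grad_logp, step=0.1):
    # stand-in for the NUTS kernel: one gradient step in transformed space
    return z + step * grad_logp(z)

def grad_logp_transformed(z):
    # gradient of the Exp(1) log-density in transformed space, including
    # the log-Jacobian term: d/dz [-exp(z) + z] = 1 - exp(z)
    return 1.0 - np.exp(z)

x0 = 2.0                                       # initial value, original space
z0 = forward(x0)                               # apply the default transform
z1 = kernel_update(z0, grad_logp_transformed)  # update in transformed space
x1 = backward(z1)                              # apply the backward transform
```

Whatever the kernel does in the transformed space, the backward transform guarantees the updated value remains in the variable's support.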
Parameters:
rvs_to_samples – A dictionary that maps the random variables whose posterior
distribution we wish to sample from to their initial values.
realized_rvs_to_values – A dictionary that maps the random variables not sampled by NUTS to their
realized value. These variables can either correspond to observations, or
to variables whose value is set by a different sampler.
Returns:
A NUTS sampling step for each random variable, their initial values, the
for discrete states in the sequence \(S_t\), \(t \in \{0, \dots, T\}\)
and observations \(y_t\).
The argument gamma_0 corresponds to \(p(S_0)\), Gammas to
\(p(S_{t+1} \mid S_t)\), and log_lik to \(\log p(y_t \mid S_t)\).
The latter is used to derive the forward-filtered quantities
\(p(S_t \mid y_{1:t})\).
This implementation is most similar to the one in [nr], which forgoes
expensive log-space calculations by using fixed precision bounds and
re-normalization/scaling.
Parameters:
gamma_0 – The initial state probabilities as an array of base shape (M,).
Gammas – The transition probability matrices. This array should take the base
shape (N,M,M), where N is the state sequence length and M
is the number of distinct states. If N is 1, the single
transition matrix will broadcast across all elements of the state sequence.
log_lik – An array of base shape (M,N) consisting of the log-likelihood
values for each state value at each point in the sequence.
lower_prec_bound – A number indicating when forward probabilities are too small and need
to be re-normalized/scaled.
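A minimal NumPy sketch of the forward-filtering pass with the re-scaling described above (the function name and layout are illustrative, not aemcmc's implementation):

```python
import numpy as np

def forward_filter(gamma_0, Gammas, log_lik, lower_prec_bound=1e-30):
    # gamma_0: (M,), Gammas: (N, M, M) or (1, M, M), log_lik: (M, N)
    M, N = log_lik.shape
    lik = np.exp(log_lik)
    alphas = np.empty((M, N))
    alpha = gamma_0 * lik[:, 0]
    alphas[:, 0] = alpha
    for t in range(1, N):
        # a single transition matrix broadcasts across the whole sequence
        Gamma_t = Gammas[t] if Gammas.shape[0] > 1 else Gammas[0]
        alpha = lik[:, t] * (Gamma_t.T @ alpha)
        if alpha.sum() < lower_prec_bound:
            # forward probabilities are getting too small: re-scale
            alpha = alpha / alpha.sum()
        alphas[:, t] = alpha
    return alphas
```

Re-scaling leaves the backward-sampling step unchanged, since it only ever uses the forward quantities up to a constant of proportionality.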
Returns:
A tuple containing the tensor representing the FFBS sampled states and the
Fruehwirth-Schnatter, Sylvia. 2001. "Markov Chain Monte Carlo Estimation of Classical and Dynamic Switching and Mixture Models." Journal of the American Statistical Association 96: 194–209.
Press, William H., Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. 2007. Numerical Recipes 3rd Edition: The Art of Scientific Computing. Cambridge University Press.
Sample from the posterior of a normal prior and normal observation model.
\[\begin{split}\begin{align*}
Z &\sim \operatorname{N}\left( X \beta, \Omega^{-1} \right) \\
\beta &\sim \operatorname{N}\left( 0, \tau^2 \Lambda \right)
\end{align*}\end{split}\]
where \(X \in \mathbb{R}^{n \times p}\),
\(\Lambda = \operatorname{diag}\left(\lambda^2_1, \dots, \lambda^2_p\right)\), and
\(\Omega = \operatorname{diag}\left(\omega_1, \dots, \omega_n\right)\).
The posterior distribution of \(\beta\) is given by
\[\begin{split}\begin{align*}
\left( \beta \mid Z = z \right) &\sim
\operatorname{N}\left( A^{-1} X^{\top} \Omega z, A^{-1} \right) \\
A &= X^{\top} \Omega X + \Lambda^{-1}_{*} \\
\Lambda_{*} &= \tau^2 \Lambda
\end{align*}\end{split}\]
This function chooses the best sampler for \(\beta \mid z\) based on
the dimensions of \(X\).
Parameters:
srng – The random number generator used to draw samples.
omega – The observation model diagonal precision values \(\omega_i\).
In other words, \(\operatorname{diag}\left(\Omega\right)\).
lmbdatau_inv – The inverse \(\beta\) std. dev. values \(\tau^{-1} \lambda^{-1}_i\).
In other words, \(\operatorname{diag}\left(\Lambda^{-1/2}_{*}\right)\).
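For small \(p\), the posterior draw can be sketched directly in NumPy, assuming the conjugate form \(A = X^{\top} \Omega X + \Lambda^{-1}_{*}\) with a Cholesky factorization; the helper name is illustrative, not aemcmc's API:

```python
import numpy as np

def sample_beta_posterior(rng, X, z, omega, lmbdatau_inv):
    # omega holds diag(Omega); lmbdatau_inv holds diag(Lambda_*^{-1/2})
    A = X.T @ (omega[:, None] * X) + np.diag(lmbdatau_inv**2)
    L = np.linalg.cholesky(A)                     # A = L L^T
    mean = np.linalg.solve(A, X.T @ (omega * z))  # A^{-1} X^T Omega z
    eps = rng.standard_normal(X.shape[1])
    # L^{-T} eps has covariance L^{-T} L^{-1} = A^{-1}
    return mean + np.linalg.solve(L.T, eps)
```

When \(p\) is large relative to \(n\), forming and factoring \(A\) is the bottleneck, which is why a dimension-dependent choice of sampler matters.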
Gibbs kernel to sample from the posterior distributions of the horseshoe prior shrinkage parameters.
This kernel generates samples from the posterior distribution of the local
and global shrinkage parameters of a horseshoe prior, respectively \(\lambda\)
and \(\tau\) in the following model:
The graphs constructed by this function are \(\lambda \mid \beta, \tau\) and
\(\tau \mid \lambda\), respectively.
We use the following observations [1]_ to sample from the posterior
conditional probability of \(\tau\) and \(\lambda\):
1. The half-Cauchy distribution can be interpreted as a mixture of inverse-gamma
distributions;
2. If \(Z \sim \operatorname{InvGamma}(1, a)\), then \(1/Z \sim \operatorname{Exp}(a)\).
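These two observations reduce every update to inverse-gamma draws with unit shape, which can be taken as reciprocals of exponential draws. A NumPy sketch of the \(\lambda^2\) update in the style of Makalic and Schmidt's auxiliary-variable scheme (names and the exact conditioning terms are illustrative):

```python
import numpy as np

def sample_inv_gamma_one(rng, a):
    # Observation 2: Z ~ InvGamma(1, a) is the reciprocal of an Exp(a) draw
    return 1.0 / rng.exponential(scale=1.0 / a)

def lambda2_step(rng, beta, sigma2, lambda2, tau2):
    # Observation 1 introduces an auxiliary variable per coefficient:
    # nu_i | lambda_i^2 ~ InvGamma(1, 1 + 1/lambda_i^2)
    nu = sample_inv_gamma_one(rng, 1.0 + 1.0 / lambda2)
    # lambda_i^2 | beta_i, sigma^2, tau^2, nu_i
    #   ~ InvGamma(1, 1/nu_i + beta_i^2 / (2 sigma^2 tau^2))
    return sample_inv_gamma_one(rng, 1.0 / nu + beta**2 / (2.0 * sigma2 * tau2))
```

The \(\tau^2\) update follows the same pattern with its own auxiliary variable, conditioning on all the \(\lambda_i\) at once.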
Parameters:
srng – The random number generating object to be used during sampling.
beta – Regression coefficients.
sigma2 – Variance of the regression coefficients.
lambda2 – Square of the local shrinkage parameters.
tau2 – Square of the global shrinkage parameters.
Return type:
Posteriors for \(\lambda\) and \(\tau\), respectively.
References
aemcmc.gibbs.nbinom_dispersion_posterior(srng, h, p, a, b, y)
Sample the conditional posterior for the dispersion parameter under a negative-binomial and gamma prior.
In other words, this draws a sample from \(h \mid Y = y\) per
\[\begin{split}\begin{align*}
Y_i &\sim \operatorname{NB}(h, p_i) \\
h &\sim \operatorname{Gamma}(a, b)
\end{align*}\end{split}\]
where \(\operatorname{NB}\) is a negative-binomial distribution.
The conditional posterior sample step is derived from the following decomposition:
where \(\operatorname{Log}\) is the logarithmic distribution. Under a
gamma prior, \(h\) is conjugate to \(l\). We draw samples from
\(l\) according to \(l \sim \operatorname{CRT}(y, h)\), where
\(y\) is a sample from \(y \sim Y\).
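The CRT (Chinese restaurant table) draw itself is cheap: \(l\) is a sum of \(y\) Bernoulli variables with success probabilities \(h/(h + j - 1)\), \(j = 1, \dots, y\). An illustrative NumPy sketch, not aemcmc's implementation:

```python
import numpy as np

def sample_crt(rng, y, h):
    # l ~ CRT(y, h): sum of y Bernoulli draws with decaying success odds;
    # j below runs over j - 1 = 0, ..., y - 1
    j = np.arange(y)
    return int((rng.random(y) < h / (h + j)).sum())
```

Given \(l\), conjugacy of the gamma prior then yields a gamma-distributed update for \(h\).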
where \(\operatorname{NB}\) is a negative-binomial distribution.
Parameters:
srng – The random number generator from which samples are drawn.
beta – The current/previous value of the regression parameter \(\beta\).
beta_std – The std. dev. of the regression parameter \(\beta\).
X – The regression matrix.
h – The \(h\) parameter in the negative-binomial distribution of \(Y\).
y – A sample from the observation distribution \(y \sim Y\).
Return type:
A sample from the posterior \(\beta \mid y\).
Notes
The \(z\) expression in Section 2.2 of [1]_ seems to
omit division by the Polya-Gamma auxiliary variables whereas [2] and [3]
explicitly include it. We found that including the division results in
accurate posterior samples for the regression coefficients. It is also
worth noting that the \(\sigma^2\) parameter is not sampled directly
in the negative binomial regression problem and thus set to 1 [2].