arviz_stats.loo_approximate_posterior
- arviz_stats.loo_approximate_posterior(data, log_p, log_q, pointwise=None, var_name=None, log_jacobian=None)
Compute PSIS-LOO-CV for approximate posteriors.
Estimates the expected log pointwise predictive density (elpd) using Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO-CV) for approximate posteriors (e.g., from variational inference). Requires log-densities of the target (log_p) and proposal (log_q) distributions.
The PSIS-LOO-CV method is described in [1] and [2]. The approximate posterior correction is computed using the method described in [3].
See the EABM chapter on Model Comparison for Large Data for more details.
- Parameters:
- data
xarray.DataTree or InferenceData Input data. It should contain the log_likelihood group corresponding to samples drawn from the proposal distribution (q).
- log_p
ndarray or xarray.DataArray The (target) log-density evaluated at the S samples drawn from the proposal distribution (q), that is, at the same draws used for log_q. If ndarray, should be a vector of length S where S is the number of samples. If DataArray, should have dimensions matching the sample dimensions ("chain", "draw").
- log_q
ndarray or xarray.DataArray The (proposal) log-density evaluated at S samples from the proposal distribution (q). If ndarray, should be a vector of length S where S is the number of samples. If DataArray, should have dimensions matching the sample dimensions ("chain", "draw").
- pointwise
bool, optional If True, returns pointwise values. Defaults to rcParams["stats.ic_pointwise"].
- var_name
str, optional The name of the variable in log_likelihood groups storing the pointwise log likelihood data to use for loo computation.
- log_jacobian
xarray.DataArray, optional Log-Jacobian adjustment for variable transformations. Required when the model was fitted on transformed response data \(z = T(y)\) but you want to compute ELPD on the original response scale \(y\). The value should be \(\log|\frac{dz}{dy}|\) (the log absolute value of the derivative of the transformation). Must be a DataArray with dimensions matching the observation dimensions.
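As an illustration of the log_jacobian values, here is a small sketch for two common response transformations. The response values and variable names are made up for the example; only the formula \(\log|\frac{dz}{dy}|\) comes from the parameter description.

```python
import numpy as np

# Hypothetical positive responses on the original scale y.
y = np.array([0.5, 1.0, 2.0, 4.0])

# Log transform z = log(y): dz/dy = 1/y, so log|dz/dy| = -log(y).
log_jacobian_log = -np.log(y)

# Square-root transform z = sqrt(y): dz/dy = 1 / (2 * sqrt(y)),
# so log|dz/dy| = -log(2) - 0.5 * log(y).
log_jacobian_sqrt = -np.log(2.0) - 0.5 * np.log(y)
```

To pass such values to the function they would still need to be wrapped in an xarray.DataArray whose dimensions match the observation dimensions of the log likelihood variable.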
- Returns:
ELPDData Object with the following attributes:
kind: "loo"
elpd: expected log pointwise predictive density
se: standard error of the elpd
p: effective number of parameters
n_samples: number of samples
n_data_points: number of data points
scale: "log"
warning: True if the estimated shape parameter of the Pareto distribution is greater than good_k.
good_k: For a sample size S, the threshold is computed as min(1 - 1/log10(S), 0.7)
elpd_i: DataArray with the pointwise predictive accuracy, only if pointwise=True
pareto_k: DataArray with Pareto shape values, only if pointwise=True
approx_posterior: True (approximate posterior correction applied)
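The good_k threshold formula can be evaluated directly. The sketch below only restates min(1 - 1/log10(S), 0.7) in plain Python; the function name `good_k_threshold` is hypothetical and not part of the library.

```python
import math


def good_k_threshold(n_samples: int) -> float:
    """Evaluate min(1 - 1/log10(S), 0.7) for a sample size S (hypothetical helper)."""
    return min(1.0 - 1.0 / math.log10(n_samples), 0.7)


# Thresholds for a few sample sizes.
for s in (100, 1000, 4000):
    print(s, round(good_k_threshold(s), 3))
```

The threshold grows with the sample size and is capped at 0.7 once S exceeds roughly 2,200 draws, so smaller sample sizes are held to a stricter Pareto shape limit.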
See also
loo: Standard PSIS-LOO-CV.
loo_subsample: Sub-sampled PSIS-LOO-CV.
compare: Compare models based on their ELPD.
References
[1] Vehtari et al. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5) (2017) https://doi.org/10.1007/s11222-016-9696-4 arXiv preprint https://arxiv.org/abs/1507.04544
[2] Vehtari et al. Pareto Smoothed Importance Sampling. Journal of Machine Learning Research, 25(72) (2024) https://jmlr.org/papers/v25/19-556.html arXiv preprint https://arxiv.org/abs/1507.02646
[3] Magnusson, M., Riis Andersen, M., Jonasson, J., & Vehtari, A. Bayesian Leave-One-Out Cross-Validation for Large Data. Proceedings of the 36th International Conference on Machine Learning, PMLR 97:4244–4253 (2019) https://proceedings.mlr.press/v97/magnusson19a.html arXiv preprint https://arxiv.org/abs/1904.10679