Stan Reference
Free reference guide: Stan Reference
About Stan Reference
The Stan Reference is a searchable cheat sheet for the Stan probabilistic programming language used for Bayesian statistical inference via Hamiltonian Monte Carlo (HMC) and the No-U-Turn Sampler (NUTS). It covers the complete 7-block program structure: functions, data, transformed data, parameters, transformed parameters, model, and generated quantities blocks with typed variable declarations and constraint syntax.
The distributions section documents normal, bernoulli/binomial, poisson/negative binomial, cauchy/student_t, beta/dirichlet, multi_normal, and lkj_corr_cholesky distributions with prior specification patterns. Model examples include Bayesian linear regression, logistic regression, hierarchical/multilevel models, and non-centered parameterization for improved convergence in funnel geometries.
Interface sections cover RStan (R), PyStan (Python), and CmdStan (command-line) with compilation, sampling, and posterior extraction workflows. MCMC diagnostics include Rhat convergence, effective sample size (n_eff), divergent transitions, and max treedepth. Model comparison tools include LOO-CV (loo package), prior predictive checks, and posterior predictive checks with bayesplot visualization.
Key Features
- Complete 7-block Stan program structure: functions, data, transformed data, parameters, transformed parameters, model, generated quantities
- Distribution reference: normal, bernoulli_logit, poisson_log, cauchy, student_t, beta, dirichlet, multi_normal, LKJ
- Model examples: Bayesian linear regression, logistic regression, hierarchical models, non-centered parameterization
- Prior specification patterns: weakly informative priors, regularizing priors, half-Cauchy for scale parameters
- Interface guides for RStan, PyStan, and CmdStan with compilation and sampling workflows
- MCMC convergence diagnostics: Rhat, n_eff, divergent transitions, max treedepth, E-BFMI
- Model comparison: LOO-CV with loo package, prior predictive checks, posterior predictive checks with bayesplot
- Searchable by category with dark mode support across desktop, tablet, and mobile devices
Frequently Asked Questions
What is Stan and what is it used for?
Stan is an open-source probabilistic programming language (BSD license) for Bayesian statistical inference. It uses Hamiltonian Monte Carlo (HMC) and the No-U-Turn Sampler (NUTS) with automatic differentiation to efficiently sample from complex posterior distributions. It is widely used in social sciences, epidemiology, pharmacometrics, and machine learning research.
What are the main blocks in a Stan program?
A Stan program has 7 blocks, all optional, which must appear in this order: functions (user-defined functions), data (observed data declarations), transformed data (data transformations), parameters (unknowns to estimate), transformed parameters (parameter transformations), model (prior distributions and likelihood), and generated quantities (posterior predictions, log-likelihood for model comparison).
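As a rough illustration of the block order described above, the following Python sketch holds a small linear-regression program in a string (the form PyStan accepts) and lists the blocks it declares. The model itself, the Stan names linpred, x_c, log_lik, and y_rep, and the Python helper declared_blocks are illustrative assumptions, not part of this reference.

```python
import re

# A small Stan program (simple linear regression) that uses all seven blocks.
# The model and every name in it are hypothetical, chosen for illustration.
STAN_PROGRAM = """
functions {
  // user-defined helper: linear predictor
  vector linpred(vector x, real alpha, real beta) {
    return alpha + beta * x;
  }
}
data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
}
transformed data {
  vector[N] x_c = x - mean(x);  // centered predictor
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;          // lower bound makes the Cauchy prior half-Cauchy
}
transformed parameters {
  vector[N] mu = linpred(x_c, alpha, beta);
}
model {
  alpha ~ normal(0, 10);
  beta ~ normal(0, 1);
  sigma ~ cauchy(0, 2.5);
  y ~ normal(mu, sigma);
}
generated quantities {
  vector[N] log_lik;            // pointwise log-likelihood for LOO-CV
  vector[N] y_rep;              // posterior predictive draws
  for (n in 1:N) {
    log_lik[n] = normal_lpdf(y[n] | mu[n], sigma);
    y_rep[n] = normal_rng(mu[n], sigma);
  }
}
"""

# Matches a block name at the start of a line, followed by an opening brace.
BLOCK_PATTERN = re.compile(
    r"^(functions|transformed data|data|transformed parameters|parameters"
    r"|model|generated quantities)\s*\{",
    re.MULTILINE,
)


def declared_blocks(program_code):
    """Return the names of the top-level blocks in the order they appear."""
    return BLOCK_PATTERN.findall(program_code)


print(declared_blocks(STAN_PROGRAM))
```

The helper only scans for block headers at the start of a line; it does not validate the program. Compiling with one of the Stan interfaces is the real check.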
How do I specify prior distributions in Stan?
Priors are specified in the model block using the ~ operator. Common patterns include: alpha ~ normal(0, 10) for a weakly informative prior, beta ~ normal(0, 1) for regularization, sigma ~ cauchy(0, 2.5) for a half-Cauchy prior on a scale parameter (it is half-Cauchy only when sigma is declared with a lower bound, as in real<lower=0> sigma), and theta ~ beta(1, 1) for a uniform prior on probabilities.
What is non-centered parameterization and when should I use it?
Non-centered parameterization reparameterizes hierarchical models by sampling alpha_raw ~ std_normal() and computing alpha = mu + tau * alpha_raw in transformed parameters. This removes the funnel-shaped posterior geometry that arises when the group-level scale (tau) is small, which is a common cause of divergent transitions, and so improves MCMC sampling efficiency.
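The reparameterization can be sketched in two parts: a Stan program (held in a Python string) that samples alpha_raw and derives alpha, and a small numeric check that mu + tau * z with z drawn from a standard normal has mean mu and standard deviation tau. The data layout (J groups with known standard errors), the priors on mu and tau, and the Python names are assumptions made for this example.

```python
import random
from statistics import mean, stdev

# Hypothetical hierarchical model in non-centered form. The sampler works on
# alpha_raw (standard normal) and alpha is derived deterministically.
NON_CENTERED_MODEL = """
data {
  int<lower=1> J;
  vector[J] y;
  vector<lower=0>[J] sigma;
}
parameters {
  real mu;
  real<lower=0> tau;
  vector[J] alpha_raw;
}
transformed parameters {
  vector[J] alpha = mu + tau * alpha_raw;
}
model {
  mu ~ normal(0, 5);
  tau ~ cauchy(0, 5);
  alpha_raw ~ std_normal();
  y ~ normal(alpha, sigma);
}
"""


def non_centered_draws(mu, tau, n, seed=0):
    """Draw alpha = mu + tau * z with z ~ Normal(0, 1), as the model does."""
    rng = random.Random(seed)
    return [mu + tau * rng.gauss(0.0, 1.0) for _ in range(n)]


draws = non_centered_draws(mu=2.0, tau=0.5, n=20000)
# The sample mean and standard deviation should land close to mu and tau.
print(round(mean(draws), 2), round(stdev(draws), 2))
```

The numeric check only shows that the two forms imply the same distribution for alpha. The sampling benefit comes from the sampler exploring alpha_raw, whose scale does not depend on tau.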
How do I run Stan models in R and Python?
In R, use RStan: fit <- stan(file="model.stan", data=stan_data, chains=4, iter=2000). In Python, use PyStan 3: import stan; posterior = stan.build(model_code, data=data); fit = posterior.sample(num_chains=4). CmdStan is the command-line interface for compiling and running models, and CmdStanR and CmdStanPy wrap it for R and Python.
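A minimal sketch of the PyStan workflow, assuming PyStan 3 is installed and using a hypothetical Bernoulli model: build the data dictionary, compile with stan.build, sample, and pull out the draws for one parameter. The names BERNOULLI_MODEL, make_data, and fit_bernoulli are illustrative; the stan import is deferred so the file still loads where PyStan is absent.

```python
# Hypothetical model: estimate a success probability theta from 0/1 outcomes.
BERNOULLI_MODEL = """
data {
  int<lower=0> N;
  array[N] int<lower=0, upper=1> y;
}
parameters {
  real<lower=0, upper=1> theta;
}
model {
  theta ~ beta(1, 1);   // uniform prior on the probability
  y ~ bernoulli(theta);
}
"""


def make_data(outcomes):
    """Build the dictionary that the model's data block expects."""
    y = [int(v) for v in outcomes]
    return {"N": len(y), "y": y}


def fit_bernoulli(outcomes, num_chains=4, num_samples=1000, seed=1):
    """Compile and sample the model, returning the posterior draws of theta."""
    import stan  # PyStan 3; imported here so the module loads without it

    posterior = stan.build(
        BERNOULLI_MODEL, data=make_data(outcomes), random_seed=seed
    )
    fit = posterior.sample(num_chains=num_chains, num_samples=num_samples)
    return fit["theta"]


print(make_data([0, 1, 1, 0, 1]))
```

Calling fit_bernoulli([0, 1, 1, 0, 1]) would trigger compilation, which can take a while on first use. Inside a Jupyter notebook, PyStan 3 may additionally need an event-loop workaround such as nest_asyncio.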
What MCMC diagnostics should I check after fitting a Stan model?
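Check Rhat (below 1.01 suggests the chains have converged), n_eff (aim for an effective sample size of at least about 100 per chain), divergent transitions (ideally 0; even a few can signal biased estimates), max treedepth warnings (raise max_treedepth if the sampler keeps hitting the limit), and E-BFMI (values below 0.3 indicate poor energy exploration). In RStan, use stan_trace() for trace plots and pairs() for parameter correlation diagnostics.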
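To make the Rhat check concrete, here is a sketch of the classic split-Rhat calculation in plain Python: each chain is split in half, and the between-half variance is compared with the within-half variance. This is the textbook formula, not the rank-normalized variant that recent Stan interfaces report, so its values can differ slightly from theirs. The function name and the simulated chains are assumptions for the example.

```python
import random
from statistics import mean, variance


def split_rhat(chains):
    """Classic split-Rhat for one parameter.

    chains: list of equal-length lists of draws, one list per chain.
    """
    halves = []
    for chain in chains:
        half = len(chain) // 2
        halves.append(chain[:half])
        halves.append(chain[half:2 * half])  # drops one draw if length is odd
    n = len(halves[0])
    within = mean([variance(h) for h in halves])         # W
    between = n * variance([mean(h) for h in halves])    # B
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5


rng = random.Random(42)
# Four simulated chains drawn from the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Same setup, but one chain is stuck around a different value.
stuck = [list(c) for c in mixed[:3]] + [[x + 3.0 for x in mixed[3]]]

print(round(split_rhat(mixed), 3), round(split_rhat(stuck), 3))
```

Well-mixed chains give a value very close to 1, while the shifted chain pushes the statistic well above the 1.01 threshold.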
How do I compare Bayesian models in Stan?
Compute the pointwise log-likelihood in the generated quantities block (extract_log_lik() looks for a variable named log_lik by default), then use the loo R package for LOO-CV: loo1 <- loo(extract_log_lik(fit)). Compare models with loo_compare(loo1, loo2), which reports elpd differences and their standard errors. Also use prior predictive checks to validate priors and posterior predictive checks (for example, bayesplot's ppc_dens_overlay) to assess model fit.
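The comparison step can be sketched in Python as follows, assuming each model already has one pointwise elpd value per observation (in practice these come from the loo package). The difference is the sum of the pointwise differences, and its standard error is sqrt(n) times their sample standard deviation, which mirrors how loo_compare summarizes two models. The function name and the input numbers are made up for illustration.

```python
from math import sqrt
from statistics import stdev


def elpd_difference(pointwise_a, pointwise_b):
    """Return (elpd_diff, se_diff) for model A minus model B.

    Both inputs hold one elpd contribution per observation, in the same order.
    """
    if len(pointwise_a) != len(pointwise_b):
        raise ValueError("both models need values for the same observations")
    diffs = [a - b for a, b in zip(pointwise_a, pointwise_b)]
    n = len(diffs)
    return sum(diffs), sqrt(n) * stdev(diffs)


# Hypothetical pointwise values for four observations.
model_a = [-1.0, -1.2, -0.8, -1.1]
model_b = [-1.5, -1.3, -1.0, -1.6]
diff, se = elpd_difference(model_a, model_b)
print(round(diff, 3), round(se, 3))
```

A positive difference favors model A, but a difference that is small relative to its standard error is weak evidence either way.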
Is this Stan reference free to use?
Yes, this Stan Reference is completely free with no account required. All content runs in your browser with zero server processing. It is part of liminfo.com's collection of free statistics and data science reference tools.