New paper on Bayesian workflow

(Thanks to Daniel Lee for setting up this blog. We’ve been discussing the idea of a Stan blog for a long time and finally someone took the initiative to set it up! If you want to help out with content, check out Daniel’s initial post for details.)

Andrew just posted about this on his blog, but I’ll quickly mention it again here because it will hopefully be of interest to Stan users. A bunch of us recently released a preprint of a new paper on Bayesian workflow. Here’s the abstract:

The Bayesian approach to data analysis provides a powerful way to handle uncertainty in all observations, model parameters, and model structure using probability theory. Probabilistic programming languages make it easier to specify and fit Bayesian models, but this still leaves us with many options regarding constructing, evaluating, and using these models, along with many remaining challenges in computation. Using Bayesian inference to solve real-world problems requires not only statistical skills, subject matter knowledge, and programming, but also awareness of the decisions made in the process of data analysis. All of these aspects can be understood as part of a tangled workflow of applied Bayesian statistics. Beyond inference, the workflow also includes iterative model building, model checking, validation and troubleshooting of computational problems, model understanding, and model comparison. We review all these aspects of workflow in the context of several examples, keeping in mind that in practice we will be fitting many models for any given problem, even if only a subset of them will ultimately be relevant for our conclusions.

Andrew Gelman, Aki Vehtari, Daniel Simpson, Charles C. Margossian, Bob Carpenter, Yuling Yao, Paul-Christian Bürkner, Lauren Kennedy, Jonah Gabry, Martin Modrák

The paper is quite long (70+ pages) and although we’d love for you to read all of it, if that’s not possible we hope many of the sections will be valuable on their own (there’s a table of contents to make it easier to navigate).

One particularly interesting aspect of working on this paper was the disagreement among the authors about various important topics. I’m not sure how much of it comes through in the text: in most cases the disagreements were minor and we were able to present a unified perspective, but there are definitely issues where differences remain. For example, I think all of us more or less agree on most of the benefits and limitations of approximate inference algorithms, but the extent to which we’re actually comfortable using them (and in which circumstances) varies quite a bit from person to person based on different experiences, philosophies, and priorities.

Also, I want to acknowledge that other people did most of the work on this paper, so I’m really just the messenger here. There has also been a lot of other work on Bayesian workflow by people in the Stan community. This includes our previous paper Visualization in Bayesian workflow; many great case studies from Michael Betancourt (some specifically about workflow, many others that touch on the topic); a recent case study on Bayesian workflow for disease transmission modeling by Leo Grinsztajn, Elizaveta Semenova, Charles Margossian, and Julien Riou; and many other examples (please share any I’m forgetting or unaware of in the comments!).
