Introduction and course overview

From an epidemiological line list to informing decisions in real-time

Aim of the course

In this course we will address how can we use data typically collected in an outbreak, or in routine surveillance, to answer questions like

What is the number of cases now (nowcasting)
Are infections rising/falling and by how much (\(R_t\) estimation)
What does this mean for the near future (forecasting)

To answer these questions, we need to understand the epidemiological processes that create the kinds of data that we typically have available for outbreak analysis and infectious disease surveillance.

There are particular challenges when trying to do these analyses in real time (i.e. whilst transmission and data collection is ongoing) rather than retrospectively, which we will address in turn.

Let’s look at infectious disease surveillance data from the perspective of an individual infection. There are two types of processes happening:

upwards, from an individual infection through to being recorded in surveillance data; and
outwards, from each infection spreading to cause new infections in the population.

Both of these processes involve time delays which makes analysing data in real time especially tricky.

In this course, we first focus on making sense of the delays and biases in the data we are able to access in real time as part of infectious disease surveillance. We’ll combine this with the process of infectious disease transmission, with the reproduction number as a key component. With that, we can start to interpret the present (nowcasting) and predict the future (forecasting).

Why this course?

These are common questions in outbreak response and disease surveillance
Accounting for underlying processes can get surprisingly complicated quickly ¹, and it’s easy to make mistakes
There’s currently (at the time of devising this course) no comprehensive training resource that links these common questions and challenges

Approach

Throughout the course we will

use models to simulate data sets in R (thus introducing the generative model)
apply the generative model to the simulated data in the probabilistic programming language stan, to
- learn about the system (conduct inference)
- make predictions (nowcasting/forecasting)

Each session in the course:

builds on the previous one so that participants will have an overview of the real-time analysis workflow by the end of the course;
starts with a short introductory talk;
mainly consists of interactive content that participants will work through;
marks optional and additional material clearly (boxes labelled optional, plus a “Going further” section at the end of each session). Skip these on a first pass and come back to them later or after the course; the main material comes first;

For those attending the in-person version the course also:

has multiple instructors ready to answer questions about this content; if several people have a similar question we may pause the session and discuss it with the group;
pauses at several points during each session to discuss observations and learnings together and address any questions;
ends with a wrap-up and discussion where we review the sessions material.

Timeline for the course

The course was created to be taught in-person for 2.5 days but of course if you are studying this on your own using the web site you can go through the material at your own pace and in your own time. Broadly, the intended timeline is:

delay distributions and how to estimate them (day 1)
\(R_t\) estimation and the generation interval (day 1)
nowcasting (day 2)
forecasting and evaluation (day 2)
ensemble methods and applications (day 3)

Let’s get started!

Have a more detailed look at the learning objectives
Introduction to the course and the instructors
If you haven’t already, start with getting set up for the course
Once you’re all set up, we begin with delay distributions.
For the probability distributions and Bayesian inference with stan that we rely on throughout the course, work through the Stan reference guide as self-study (also shared on Slack); it is not taught as a separate session.

Footnotes

Time travel is messy stuff ↩︎