Introduction and course overview

From an epidemiological line list to informing decisions in real-time

Aim of the course

In this focused half-day course we will address how we can use data typically collected in an outbreak, or in routine surveillance, to answer questions such as:

  • Are infections rising or falling, and by how much (\(R_t\) estimation)?
  • What is the number of cases now (nowcasting)?
  • How can we estimate both when reporting delays are unknown (joint nowcasting)?

To answer these questions, we need to understand the epidemiological processes that create the kinds of data that we typically have available for outbreak analysis and infectious disease surveillance.

There are particular challenges when trying to do these analyses in real time (i.e. whilst transmission and data collection are ongoing) rather than retrospectively, which we will address in turn.

Let’s look at infectious disease surveillance data from the perspective of an individual infection. There are two types of processes happening:

  • upwards, from an individual infection through to being recorded in surveillance data; and

  • outwards, from each infection spreading to cause new infections in the population.

Both of these processes involve time delays, which makes analysing data in real time especially tricky.
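To make the effect of delays concrete, here is a minimal R sketch. It is not part of the course material: the variable names, the uniform spread of infections over 60 days, and the gamma-distributed delay from infection to report are all assumptions chosen purely for illustration. It computes, for each day of infection, the share of infections that would already have been reported by "today".

```r
set.seed(123)  # for reproducibility

n <- 1000
# Hypothetical infection days, spread uniformly over a 60-day outbreak
infection_day <- sample(1:60, n, replace = TRUE)

# Assumed delay from infection to report in days
# (a gamma distribution with mean 8 is an illustrative choice only)
report_delay <- round(rgamma(n, shape = 4, rate = 0.5))
report_day <- infection_day + report_delay

# In real time we only see what has been reported up to today
today <- 60
observed <- report_day <= today

# Share of infections already reported, by day of infection
reported_share <- tapply(observed, infection_day, mean)
round(tail(reported_share, 10), 2)
```

Under these assumptions, infections from early in the outbreak are almost all reported, while the most recent days look artificially low because many of their reports have not arrived yet. This is the incompleteness that nowcasting tries to correct for.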

In this course, we focus on understanding the transmission process, with the reproduction number as a key component. We’ll then explore how to interpret the present state of an outbreak (nowcasting) when we have incomplete surveillance data due to reporting delays.

Why this course?

  • These are common questions in outbreak response and disease surveillance
  • Accounting for underlying processes can get surprisingly complicated quickly¹, and it’s easy to make mistakes
  • There’s currently (at the time of devising this course) no comprehensive training resource that links these common questions and challenges

Approach

Throughout the course we will

  1. use models to simulate data sets in R (thus introducing the generative model)
  2. apply the generative model to the simulated data in the probabilistic programming language Stan, to
    • learn about the system (conduct inference)
    • make nowcasts (estimating current state)
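As a flavour of step 1, the following is a minimal R sketch of a generative model for infections based on a renewal process, where each day's infections are the reproduction number times a weighted sum of recent infections. It is an illustration under stated assumptions rather than the course's own code: the function name `simulate_renewal`, the three-day generation time distribution, and the step change in the reproduction number are all hypothetical, and the simulation is deterministic (expected values only, no observation noise).

```r
# Hypothetical helper: simulate expected infections with a renewal process
simulate_renewal <- function(I0, R, gen_time_pmf) {
  n <- length(R)
  infections <- numeric(n)
  infections[1] <- I0  # seed infections on day 1
  for (t in 2:n) {
    # Look back at most as far as the generation time distribution reaches
    max_lag <- min(t - 1, length(gen_time_pmf))
    # Past infections, most recent first (lag 1, lag 2, ...)
    past <- infections[(t - 1):(t - max_lag)]
    infections[t] <- R[t] * sum(past * gen_time_pmf[1:max_lag])
  }
  infections
}

# Assumed generation time distribution over lags of 1 to 3 days
gen_time_pmf <- c(0.2, 0.5, 0.3)
# Assumed reproduction number: above 1 for 20 days, then below 1
R <- c(rep(1.5, 20), rep(0.8, 20))

infections <- simulate_renewal(I0 = 10, R = R, gen_time_pmf = gen_time_pmf)
round(infections)
```

Simulating from a known reproduction number first means that, when the same model is later fitted to the simulated data, the estimates can be checked against the values that generated them.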

Each session in the course:

  • builds on the previous one so that participants will have an overview of the real-time analysis workflow by the end of the course;
  • starts with a short introductory talk;
  • mainly consists of interactive content that participants will work through;
  • has optional/additional material that can be skipped or completed after the course ends.

For those attending the in-person version, the course also:

  • has multiple instructors ready to answer questions about this content; if several people have a similar question we may pause the session and discuss it with the group;
  • follows a stop-and-review approach where we pause after each section of self-guided material to discuss and review together and address any questions;
  • ends with a wrap-up and discussion where we review the session’s material.

Timeline for the course

This EMBL-EBI course is designed as a focused half-day workshop covering the essential methods for real-time infectious disease surveillance. The timeline is:

  • \(R_t\) estimation and the renewal equation (session 1)
  • nowcasting concepts (session 2)
  • joint nowcasting with unknown reporting delays (session 3)

Let’s get started!

Footnotes

  1. Time travel is messy stuff