Summer School Epidemiology 2023
Models, Statistics and Genetics: Understanding Epidemics
Where?
Eldoret, Kenya
When?
March 12 to 24, 2023
Organized by
KNUST (Ghana), Moi University (Kenya), TUM (Germany)
Topic
The current COVID-19 pandemic has demonstrated both the immediate and long-term public health impacts of epidemics on populations worldwide. Long-established infectious diseases, such as malaria, HIV and tuberculosis, continue to contribute significantly to mortality and morbidity, particularly in underserved regions of Africa. Although scientific efforts to meet the public health challenges posed by epidemics have increased dramatically due to COVID-19, there remains a need to recruit future mathematical epidemiologists and data scientists and to equip them with the tools to combat novel and endemic epidemics. In response to this need, the summer school will provide an opportunity for young researchers and doctoral students from Africa and Europe to collaborate and deepen their understanding of the mathematical fundamentals of infectious disease modelling. The workshops aim to cover the material from the fundamentals to current state-of-the-art modelling efforts, to allow students and practitioners to identify current challenges and knowledge gaps, and to support networking that advances the theory and practice of epidemiology, especially using available genomic data.
This summer school will bring together scientists from different fields to discuss and review recent developments in interaction with young researchers. The objective is to introduce and present three topics in depth: deterministic and stochastic mechanistic mathematical models, statistical methods, and methods from population genomics.
Confirmed Speakers
Prof. George Lawi (Masinde Muliro University of Science and Technology, Kenya)
Prof. Ann Mwangi (Moi University, Kenya)
Prof. Johannes Müller (TU München, Germany)
Dr. Hannes Petermeier (TU München, Germany)
Prof. Aurélien Tellier (TU München, Germany)
Program
We plan five lecture series, small interactive discussions in mentored project groups, scientific talks by senior scientists and poster presentations by participants. The five lecture series cover:
 Modern biostatistical data handling of epidemiological data
 Evolutionary genomics for parasites and statistical methods
 Model-based analysis of epidemiological data
 Deterministic epidemiological models and optimal control
 Stochastic epidemiological models and contact tracing
Target Group
The summer school is aimed at PhD students and postdocs from TUM and Sub-Saharan Africa who are interested in the analysis of infectious disease data, genomic data and infectious disease modelling. Participants should have background knowledge in statistics, mathematics, physics or computer science. Female candidates are encouraged to apply.
Financial support
The summer school is supported by the VolkswagenStiftung (Germany); most of the local costs, particularly accommodation, will be covered. Participants can apply for financial support for travel expenses.
Dates, Deadlines and Application
The deadline for submission of applications is November 30, 2022. Successful applicants will be notified by mid-December 2022.
Please register via our web form.
We plan for an in-person meeting. If COVID-19 forces us to change the format of the summer school, participants will be informed as soon as possible.
In case of inquiries, please contact johannes.mueller@mytum.de or beryl.musundi@tum.de.
Details of the Lecture Series
 Statistics in R for Life Sciences with Data Science for Infectious Diseases (Hannes Petermeier)
 Theoretical and mathematical theory of evolutionary genomics with special application to human parasites (Aurélien Tellier)
 Modelling infectious diseases using R (Ann Mwangi)
 Conceptualization and Analysis of Pragmatic Optimal Control models for infectious diseases (George Lawi)
 Stochastic models and contact tracing (Johannes Müller)
Statistics in R for Life Sciences with Data Science for Infectious Diseases
Hannes Petermeier
Aim: This hands-on course brings students up to speed on how to use the R statistical software package for performing state-of-the-art quantitative research in the life sciences, with emphasis on data science applied to infectious diseases.
Contents:
 Lecture 1, Basic statistics review: statistical estimation, confidence intervals, and hypothesis testing
 Lecture 2, Visualization of multivariate data: two- and three-dimensional graphs, principal component analysis (PCA), cluster analysis
 Lecture 3, Categorical data: two-way tables, correspondence analysis, high-dimensional multiple correspondence analysis (MCA)
 Lecture 4, ANOVA and experimental design: analysis of variance, block randomization, sample size and power calculations
 Lecture 5, Linear regression: simple and multiple linear regression, statistical inference, percent variation explained, residual analysis
 Lecture 6, Multiple linear regression: main effects versus interactions, multicollinearity, transformations, model selection, information criteria, automatic model selection procedures
 Lecture 7, Logistic and Poisson regression: inference and interpretation, likelihood ratio tests, adjustment for rates
 Lecture 8, Survival regression: life tables, Kaplan-Meier estimation and hypothesis testing, Cox proportional hazards
 Lecture 9, Linear mixed-effects models: specification, longitudinal data, residual analysis
Organization: Students will have access to lecture videos for home viewing before the course. For four days, they will attend lectures in the morning and work on exercises in small groups in the afternoon. On the last day of the week, they will carry out the data science lab. All course materials, including lecture slides, lecture videos, exercises, and exercise solutions, will be provided in electronic form. To extend the course’s reach and viability, the materials will remain at Moi University upon completion of the course and will be used in teaching the course in subsequent years. To this end, top-performing students who volunteer will be appointed as ambassadors to act as local teaching assistants for the course, with remote mentorship provided by Prof. Ankerst and her TUM teaching staff.
Data science lab: After completion of the lectures and R labs, students will perform their own infectious disease analysis from start to finish. They will identify online data on infectious diseases suitable for web scraping, such as datasets on COVID-19 by country or region. They will carry out a literature search for previously published work on the region and summarize the models and approaches used. They will then perform analyses in R using these earlier models and summarize their findings.
References:

R Peck, C Olsen, JL Devore. Statistics & Data Analysis, 5th edition, 2016.

B Abraham, J Ledolter. Introduction to Regression Modeling, 2006.

T Hastie, R Tibshirani, J Friedman. The Elements of Statistical Learning, 2nd ed., 2009.

F Husson, S Le, J Pages. Exploratory Multivariate Analysis by Example Using R, 2017.

G Van Belle, LD Fisher, PJ Heagerty, T Lumley. Biostatistics, 2004.

D Collett. Modelling Survival Data in Medical Research, 2015.

GM Fitzmaurice, NM Laird, JH Ware. Applied Longitudinal Analysis, 2012.
Theoretical and mathematical theory of evolutionary genomics with special application to human parasites
Aurélien Tellier
Aim: This module gives participants an overview of the current evolutionary genomics theory underlying the genomic analysis of human parasites, with a specific focus on malaria and the COVID-19 virus. After this module, the participants will be able to 1) explain and demonstrate the properties of the key stochastic models of evolutionary genomics, 2) apply the correct analysis framework to different types of human parasites based on the model assumptions, and 3) understand the basics of genomic data analysis, infer past demography and reconstruct phylogenies from simple small datasets. This knowledge is useful for studying any neglected tropical disease.
Content: We will describe and demonstrate the properties of the main stochastic models used in neutral population genomics (Wright-Fisher model, coalescent theory). We will also build explicit models for the action of recombination and selection (textbooks used: Charlesworth and Charlesworth 2010, Hein, Schierup and Wiuf 2004, Wakeley 2008). We apply this framework to two types of human parasites. First, we focus on parasites with larger genomes (e.g. malaria, helminths) that undergo sexual reproduction. We will describe the theory and apply coalescent approaches such as the sequentially Markovian coalescent to genome-wide data to infer parameters (Sellinger et al. 2020, 2021). We will show how stochastic models of neutral evolution are used to infer past demographic history and how a Bayesian inference method (Märkle and Tellier 2020) makes it possible to detect selective events in the genomes of Plasmodium falciparum. Second, we derive models with restricted assumptions that are valid for clonal organisms, such as viruses and bacteria. We will practice the use of skyline plots and phylogenetic approaches to estimate the history of a disease, as well as short-term estimates of the R0 parameter of an epidemic, using the large number of sequences available for COVID-19 (Leventhal et al. 2014).
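As an illustration of the neutral Wright-Fisher model named above, the following is a minimal sketch in Python (the course itself uses R and Python libraries; this code is not taken from the course materials). It assumes a haploid population of constant size, no mutation, no selection and no recombination; the function name `wright_fisher` and its parameters are hypothetical.

```python
import random


def wright_fisher(pop_size, init_count, generations, seed=None):
    """Simulate allele counts under the neutral Wright-Fisher model.

    Assumptions (illustrative only): haploid population of constant size
    pop_size, non-overlapping generations, no mutation or selection.
    Returns the list of allele counts, one entry per generation.
    """
    rng = random.Random(seed)
    counts = [init_count]
    for _ in range(generations):
        p = counts[-1] / pop_size  # current allele frequency
        # Each offspring copies the allele of a uniformly chosen parent,
        # so the next count is a Binomial(pop_size, p) draw.
        next_count = sum(1 for _ in range(pop_size) if rng.random() < p)
        counts.append(next_count)
    return counts


if __name__ == "__main__":
    trajectory = wright_fisher(pop_size=50, init_count=25, generations=100, seed=1)
    print(trajectory[:10])
```

Once the count reaches 0 or `pop_size`, the allele is lost or fixed and the trajectory stays there, which is the genetic drift behaviour that coalescent theory describes backwards in time.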
Organization: Each morning session contains lectures and small integrated exercises such as computer simulations and computations of the properties of the mathematical models. The afternoon session consists of a small hands-on practice of data analysis (overview of a VCF file, inference with SMC/Bayesian methods, skyline plots and phylogeny) based on the software R and available Python libraries installed on a virtual machine on our cluster in Munich. Finally, the participants will read and prepare critical presentations of several scientific papers on malaria and COVID-19 evolution/epidemiology to 1) exemplify the gain of knowledge resulting from the use of genome sequence data, and 2) critically evaluate the potential pitfalls of these approaches.
Topic lecture: In the specific topic lecture, we will present an extension of the evolutionary genomics framework to parasites that exhibit dormancy and quiescence. Such parasites have a metabolically inactive state (inside or outside the host) from which they can be reactivated at a later point in time; this is a feature of many tropical diseases and viruses. We will show the mathematical models used to describe this phenomenon and how they differ from the classic population genomics models studied in the lecture. We will also explain the importance of dormancy for parasite survival and how to take this feature into account in disease management strategies. Finally, we will provide some examples of inferring dormancy from genome data using machine learning methods currently being developed in the Tellier lab.
References:

Charlesworth, B., and D. Charlesworth. 2010. Elements of evolutionary genetics. Roberts and Company, Greenwood Village, CO, USA.

Hein, J., Schierup, M., and C. Wiuf. 2004. Gene genealogies, variation and evolution: a primer in coalescent theory. Oxford University Press, Oxford, UK.

Leventhal, G.E., et al. 2014. Using an epidemiological model for phylogenetic inference reveals density dependence in HIV transmission. Molecular Biology and Evolution 31: 6-17.

Märkle H., and A. Tellier, 2020, Inference of coevolutionary dynamics and parameters from host and parasite polymorphism data of repeated experiments, PLoS computational biology, 16(3), p.e1007668.

Sellinger T.P.P., D. Abu-Awad, M. Moest, and A. Tellier. 2020. Inference of past demography, dormancy and self-fertilization rates from whole genome sequence data, PLoS Genetics, 16(4), p.e100869.

Sellinger, T.P.P., Abu-Awad, D. and A. Tellier. 2021, Limits and convergence properties of the sequentially Markovian coalescent. Mol Ecol Resour, 21: 2231-2248

Wakeley, J., 2008, Coalescent Theory: An Introduction. Roberts and Company, Greenwood Village, CO, USA
Modelling infectious diseases using R
Ann Mwangi
Aim: The emergence of COVID-19 has demonstrated the importance of modelling infectious diseases for improving our understanding of transmission and the impact of interventions. This hands-on course aims to build mathematical modelling capacity among graduate students and mid-career scientists by equipping them with a basic understanding of infectious disease modelling and of how to write a mathematical model and analyse its dynamics under different scenarios and interventions using the R statistical software package. After the module, participants will independently be able to 1) identify the basic concepts of epidemiology, 2) explore the variety of models for different infectious diseases, 3) formulate simple mathematical models for infectious diseases, 4) fit simple mathematical models for infectious diseases in R under different assumptions.
Content: Basic concepts of epidemiology and mathematical models; basic concepts of population dynamics; compartmental models of infectious disease dynamics (the SIS and SIR models); Basic reproduction number R0; Implementation of mathematical models in R; calibrating models against epidemiological data to estimate key model parameters.
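To make the SIR model and the basic reproduction number concrete, here is a minimal sketch. The course implements such models in R; this sketch uses plain Python with no external packages and is not taken from the course materials. It assumes a closed population of constant size N with frequency-dependent transmission, dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I, for which R0 = beta/gamma. The function names and parameter values are hypothetical.

```python
def simulate_sir(beta, gamma, s0, i0, t_end=160.0, dt=0.1):
    """Integrate the SIR equations with a fixed-step RK4 scheme.

    Returns a list of (t, S, I, R) tuples; R is computed as N - S - I.
    """
    n = s0 + i0  # total population, assumed constant (no births or deaths)

    def deriv(s, i):
        new_infections = beta * s * i / n
        return -new_infections, new_infections - gamma * i

    s, i, t = float(s0), float(i0), 0.0
    trajectory = [(t, s, i, n - s - i)]
    for _ in range(int(round(t_end / dt))):
        k1 = deriv(s, i)
        k2 = deriv(s + 0.5 * dt * k1[0], i + 0.5 * dt * k1[1])
        k3 = deriv(s + 0.5 * dt * k2[0], i + 0.5 * dt * k2[1])
        k4 = deriv(s + dt * k3[0], i + dt * k3[1])
        s += dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        i += dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
        trajectory.append((t, s, i, n - s - i))
    return trajectory


def basic_reproduction_number(beta, gamma):
    """R0 for this SIR model: transmission rate times mean infectious period."""
    return beta / gamma


if __name__ == "__main__":
    traj = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10)
    peak_time, _, peak_infected, _ = max(traj, key=lambda row: row[2])
    print("R0 =", basic_reproduction_number(0.3, 0.1))
    print("peak of I at t =", round(peak_time, 1), "with I =", round(peak_infected, 1))
```

With R0 above 1 the number of infected individuals first rises and then declines; with R0 below 1 it declines from the start, which is the threshold behaviour the module discusses.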
Organisation:
Participants will be asked to bring their own laptops with a pre-installed version of the R software, which can be downloaded at www.r-project.org.
References

Driessche P., Wu J. (eds) Mathematical Epidemiology. Lecture Notes in Mathematics, vol 1945. Springer, Berlin, Heidelberg.

R.M. Anderson and R.M. May, Infectious Diseases of Humans: Dynamics and Control, Oxford University Press.

O. Diekmann, H. Heesterbeek and T. Britton, Mathematical Tools for Understanding Infectious Disease Dynamics, 2013, Princeton University Press.

Marion, Introduction to Mathematical Modeling, 2008. https://people.maths.bris.ac.uk/~madjl/course_text.pdf.

Garnett GP, Cousens S, Hallett TB, Steketee R, Walker N. 2011. Mathematical models in the evaluation of health programmes. Lancet, 378(9790): 515-25.

Anderson RM, Garnett GP. 2000. Mathematical models of the transmission and control of sexually transmitted diseases. Sexually Transmitted Diseases, 27(10): 6364.
Conceptualization and Analysis of Pragmatic Optimal Control models for infectious diseases
George Lawi
Aim: The mathematical modelling of infectious diseases usually aims at understanding the transmission dynamics. Models continue to be developed at between-host, within-host and immuno-epidemiological (multilevel) scales, largely with interventions [1,2,3,4,5]. Over the decades, mathematical models have been used reliably and efficiently in formulating control strategies against infectious diseases [6,7]. Economically speaking, it is desirable to apply these controls in a manner that minimizes the cost but maximizes the desired outcome. The recent community transmission of COVID-19 in many nations has awakened interest in optimal control models. This hands-on course aims to equip graduate students in mathematical epidemiology with the critical skills of conceptualization and analysis of pragmatic optimal control models for infectious diseases. By the end of the training program, the students should be able to a) develop deterministic models describing the transmission dynamics of infectious diseases with controls, b) formulate and solve pragmatic optimal control problems for deterministic models of infectious diseases and c) perform numerical simulations on the impact of different combinations of control strategies on the transmission dynamics of infectious diseases.
Content: Basic mathematical epidemiology: description and formulation of models; stability and sensitivity analysis of mathematical models: the role of the basic reproduction number [3,4]; optimal control theory and its application to mathematical epidemiology: Pontryagin’s maximum principle (PMP), formulating and solving the optimal control problem, numerical simulations on the impact of different scenarios of control strategies on the transmission dynamics of infectious diseases [5,6].
Organization: Each morning session consists of lectures and integrated exercises. The afternoon session consists of hands-on guided small group/individual exercises.
Topic lecture: In the specific topic lecture, we will present an exposition of optimal control theory and its application to mathematical epidemiology. From the sensitivity analysis of the reproduction number, we will identify key drivers of infection and conceptualize pragmatic optimal control strategies with application to COVID-19 in the Kenyan context. We will thereafter carry out optimal control analysis and numerical analysis of the different control scenarios.
References

G.O. Lawi, J.Y.T. Mugisha and N. Omolo-Ongati, Modelling Co-infection of Paediatric Malaria and Pneumonia, Int. Journal of Math. Analysis, Vol. 7, 2013, no. 9, 413-424

Musundi O. Beryl, Lawi O. George and Nyamwala O. Fredrick, Mathematical Analysis of a Cholera Transmission Model Incorporating Media Coverage, IJPAM, Vol. 111, No. 2, 2016, 219-231, ISSN 1311-8080

B. Mobisa, G. O. Lawi and J. K. Nthiiri, Modelling In Vivo HIV Dynamics under Combined Antiretroviral Treatment, Journal of Applied Mathematics, Vol. 2018, Article ID 8276317, 11 pages, doi.org/10.1155/2018/8276317 ISSN: 1110-757X (Print); ISSN: 1687-0042 (Online)

Rachel A. Nyang’inja, George O. Lawi, Mark O. Okongo and Titus O. Orwa. Stability Analysis of Rotavirus-Malaria Co-Epidemic Model with Vaccination. Dynamic Systems and Applications, 28, No. 2 (2019), 371-407 ISSN: 1056-2176

F. K. Tireito, G. O. Lawi and C. A. Okaka, Mathematical Analysis of HIV/AIDS Prophylaxis Treatment Model, Applied Mathematical Sciences, Vol. 12, 2018, No. 18, 893-902. ISSN 1314-7552

A. B. Gumel, S. Ruan, T. Day et al., “Modelling Strategies for Controlling SARS Outbreaks,” Proceedings of the Royal Society of London. Series B: Biological Sciences, vol. 271, no. 1554, pp. 2223–2232, 2004.

C. E. Madubueze, A. R. Kimbir, and T. Aboiyar, “Global Stability of Ebola Virus Disease Model with Contact Tracing and Quarantine,” Applications & Applied Mathematics, vol. 13, no. 1, pp. 382–403, 2018.

M Li, An Introduction to Mathematical Modeling of Infectious Diseases, Springer, 2018.

Diekmann O and Heesterbeek J.A.P, Mathematical Epidemiology of Infectious Diseases: Model Building, Analysis and Interpretation Dynamics, Wiley, 2000.

Pontryagin LS, Boltyanskii VG, Gamkrelidze RV, Mishchenko EF. The mathematical theory of optimal processes. New York: Wiley; 1962.

Lewis FL, Vrabie D, Syrmos VL. Optimal control. New York: Wiley; 2012.
Stochastic models and contact tracing
Johannes Müller
Aim: The module aims to introduce the main stochastic modelling approaches for epidemics, particularly in view of control measures. At the end of the module, the participants will be able to 1) choose an appropriate stochastic modelling framework for a question at hand, 2) perform a basic theoretical analysis of the model, 3) perform individual-based stochastic simulations, and 4) formulate and deal with models for contact tracing with applications to COVID-19.
Content: While deterministic models cover the time course of infections in large populations and predict the effect of mass vaccination and screening very well, stochastic models are required for smaller populations and for individual-based control measures such as contact tracing and ring vaccination. We will begin by discussing the stochastic SIR model in a homogeneous, finite population based on the Sellke construction [1]. We will obtain information on the probability of a major outbreak and the final size of an epidemic. We will proceed to discuss the time to extinction in different parameter regions as introduced by I. Nåsell [2]. Afterwards, we will leave homogeneous populations and discuss (random) graph-based models. Here, we will particularly consider the fundamental theorem in random graphs on the emergence of a giant component, its relation to large outbreaks [3] and its connection to the probability of a major outbreak. It will be natural to study the effect of vaccination in this model. We will proceed towards individual-based simulation models, which permit taking into account the contact structure of single individuals. Here, we will discuss an appropriate version of the Gillespie algorithm.
Organization: Emphasis will be on the interactive elements of the lecture. After the introduction, theoretical results will be derived by the participants in small working groups through guided exercises. In the case of the simulation model, we will provide an object-oriented implementation in R of the Gillespie algorithm as a skeleton, which will be modified by the participants to enable the testing of theoretical results in stochastic individual-based epidemic simulation models.
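For orientation, the following is a minimal sketch of the Gillespie algorithm applied to the stochastic SIR model in a homogeneous, finite population. It is written in plain Python and is not the object-oriented R skeleton provided in the course; it tracks only population counts rather than single individuals. It assumes an infection rate of beta*S*I/N and a recovery rate of gamma*I; the function name `gillespie_sir` and the parameter values are hypothetical.

```python
import random


def gillespie_sir(beta, gamma, s0, i0, seed=None):
    """Simulate one realisation of the stochastic SIR model.

    Events: infection at rate beta*S*I/N, recovery at rate gamma*I.
    Returns a list of (t, S, I, R) states, one entry per event, ending
    when no infectious individuals remain.
    """
    rng = random.Random(seed)
    n = s0 + i0
    t, s, i, r = 0.0, s0, i0, 0
    states = [(t, s, i, r)]
    while i > 0:
        rate_infection = beta * s * i / n
        rate_recovery = gamma * i
        total_rate = rate_infection + rate_recovery
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total_rate)
        # Choose the event type with probability proportional to its rate.
        if rng.random() * total_rate < rate_infection:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        states.append((t, s, i, r))
    return states


if __name__ == "__main__":
    # Final sizes of repeated runs started with a single infective.
    final_sizes = [gillespie_sir(0.3, 0.1, 99, 1, seed=k)[-1][3] for k in range(200)]
    minor = sum(1 for size in final_sizes if size < 10)
    print("fraction of runs with fewer than 10 cases:", minor / len(final_sizes))
```

Repeating the simulation many times from a single initial infective separates runs that die out quickly from major outbreaks, which is one way to compare simulations with the theoretical outbreak probability and final size.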
Topic lecture: The last part of the lecture series will be devoted to contact tracing. Several approaches have been presented in the literature [4]. After a brief overview, we will focus on the branching process approach [5]. The most recent developments are studies of contact tracing in the case of superspreaders [6], which are known to be of special importance in the spread of COVID-19. The question of proving rigorous thresholds for the branching tracing process is only partially solved. We will discuss several recent ideas on this open problem [7,8].
Literature:

[1] H. Andersson, T. Britton. Stochastic Epidemic Models and Their Statistical Analysis. Springer, 2000

[2] Ingemar Nåsell, Extinction and Quasi-Stationarity in the Stochastic Logistic SIS Model. Springer, 2011

[3] R. Durrett, Random Graph Dynamics. Cambridge Univ. Press, 2006

[4] J. Müller, M. Kretzschmar. Contact Tracing – Old Models and New Challenges. Infectious Disease Modelling, 6 (2021), 22223

[5] J. Müller, M. Kretzschmar, and K. Dietz. Contact tracing in deterministic and stochastic models. Math. Biosc., 164:39-64, 20

[6] J. Müller, V. Hösel. Contact tracing & super-spreaders in the branching-process model. JOMB, accepted 2021; arXiv preprint 2010.049

[7] M.T. Barlow, A branching process with contact tracing, 2020. arXiv preprint 2007.16182v

[8] D. Zhang, T. Britton. Analysing the Effect of Test-and-Trace Strategy in an SIR Epidemic Model. 2021, arXiv preprint 2110.07220