man/wp.Rd



% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/wp.R
\name{wp}
\alias{wp}
\title{White and Pagano (WP)}
\usage{
wp(
  cases,
  mu = NA,
  serial = FALSE,
  grid_length = 100,
  max_shape = 10,
  max_scale = 10
)
}
\arguments{
\item{cases}{Vector of case counts. The vector must be of length at least two
and only contain positive integers.}

\item{mu}{Mean of the serial distribution. This must be a positive number or
\code{NA}. If a number is specified, it must be expressed in the same time units
as the case counts. For example, if case counts are weekly and the serial
distribution has a mean of seven days, then \code{mu} should be set to \code{1}. If
case counts are daily and the serial distribution has a mean of seven days,
then \code{mu} should be set to \code{7}.}

\item{serial}{Whether to return the estimated serial distribution in addition
to the estimate of R0. This must be a value identical to \code{TRUE} or \code{FALSE}.}

\item{grid_length}{The length of the grid in the grid search (defaults to
100). This must be a positive integer. It will only be used if \code{mu} is set
to \code{NA}. The grid search will go through all combinations of the shape and
scale parameters for the gamma distribution, which are \code{grid_length} evenly
spaced values from \code{0} (exclusive) to \code{max_shape} and \code{max_scale}
(inclusive), respectively. Note that larger values will result in a longer
search time.}

\item{max_shape}{The largest possible value of the shape parameter in the
grid search (defaults to 10). This must be a positive number. It will only
be used if \code{mu} is set to \code{NA}. Note that larger values will result in a
longer search time, and may cause numerical instabilities.}

\item{max_scale}{The largest possible value of the scale parameter in the
grid search (defaults to 10). This must be a positive number. It will only
be used if \code{mu} is set to \code{NA}. Note that larger values will result in a
longer search time, and may cause numerical instabilities.}
}
\value{
If \code{serial} is identical to \code{TRUE}, a list containing the following
components is returned:
\itemize{
\item \code{r0} - the estimate of R0
\item \code{supp} - the support of the estimated serial distribution
\item \code{pmf} - the probability mass function of the estimated serial
distribution
}

Otherwise, if \code{serial} is identical to \code{FALSE}, only the estimate of R0 is
returned.
}
\description{
This function implements the R0 estimation method of White and Pagano
(Statistics in Medicine, 2008). The method is based on maximum likelihood
estimation in a Poisson transmission model. See details for important
implementation notes.
}
\details{
This method is based on a Poisson transmission model, and hence may be most
valid at the beginning of an epidemic. In the White and Pagano model, the serial
distribution is assumed to be discrete with a finite number of possible
values. In this implementation, if \code{mu} is not \code{NA}, the serial distribution
is taken to be a discretized version of a gamma distribution with shape
parameter \code{1} and scale parameter \code{mu} (and hence mean \code{mu}). When \code{mu} is
\code{NA}, the function implements a grid search algorithm to find the maximum
likelihood estimator over all possible gamma distributions with unknown shape
and scale, restricting these to a prespecified grid (see the parameters
\code{grid_length}, \code{max_shape} and \code{max_scale}). In both cases, the largest value
of the support is chosen so that the cumulative distribution function of the
original (pre-discretized) gamma distribution is no more than 0.999 at this
value.

When the serial distribution is known (i.e., \code{mu} is not \code{NA}), sensitivity
testing of \code{mu} is strongly recommended. If the serial distribution is
unknown (i.e., \code{mu} is \code{NA}), the likelihood function can be flat near the
maximum, resulting in numerical instability of the optimizer. When \code{mu} is
\code{NA}, the implementation takes considerably longer to run. Users should take
care that \code{mu} and the case counts use the same units of time (e.g., are
counts observed daily or weekly?).

The model developed in White and Pagano (2008) is discrete, and hence the
serial distribution is finite discrete. In our implementation, the input
value \code{mu} is that of a continuous distribution. The algorithm discretizes
this input, so the mean of the estimated serial distribution returned (when
\code{serial} is set to \code{TRUE}) will differ somewhat from \code{mu}. This difference
is expected and is caused by the discretization.
}
\examples{
# Weekly data.
cases <- c(1, 4, 10, 5, 3, 4, 19, 3, 3, 14, 4)

# Obtain R0 when the serial distribution has a mean of five days.
wp(cases, mu = 5 / 7)

# Obtain R0 when the serial distribution has a mean of three days.
wp(cases, mu = 3 / 7)
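
# Sensitivity check (a minimal sketch): re-estimate R0 over a few candidate
# serial means. The candidate values of three, five and seven days are
# illustrative assumptions, converted to weeks to match the weekly counts.
mu_days <- c(3, 5, 7)
sapply(mu_days / 7, function(m) wp(cases, mu = m))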

# Obtain R0 when the serial distribution is unknown.
# Note that this will take longer to run than when `mu` is known.
wp(cases)

# Same as above, but specify custom grid search parameters. The larger any of
# the parameters, the longer the search will take, but with potentially more
# accurate estimates.
wp(cases, grid_length = 40, max_shape = 4, max_scale = 4)
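
# A sketch of the grid described for `grid_length`, `max_shape` and
# `max_scale`, assuming evenly spaced values from 0 (exclusive) to the maximum
# (inclusive). This only illustrates the documented search space and is not
# necessarily how `wp` builds it internally.
shape_grid <- seq(0, 4, length.out = 40 + 1)[-1]
scale_grid <- seq(0, 4, length.out = 40 + 1)[-1]
nrow(expand.grid(shape = shape_grid, scale = scale_grid))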

# Return the estimated serial distribution in addition to the estimate of R0.
estimate <- wp(cases, serial = TRUE)

# Display the estimate of R0, as well as the support and probability mass
# function of the estimated serial distribution returned by the grid search.
estimate$r0
estimate$supp
estimate$pmf
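
# Compare a supplied `mu` with the mean of the returned discretized serial
# distribution; some difference is expected because of the discretization.
# This sketch assumes `pmf` gives the probabilities at the points in `supp`.
known <- wp(cases, mu = 5 / 7, serial = TRUE)
sum(known$supp * known$pmf)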
}
\references{
\href{https://doi.org/10.1002/sim.3136}{White and Pagano (Statistics in Medicine, 2008)}
}