# Bootstrap replicate weights for sequential Poisson sampling

Source:`R/sps_repweights.R`

`sps_repweights.Rd`

Produce bootstrap replicate weights that are appropriate for Poisson sampling, and therefore approximately correct for sequential Poisson sampling.

## Arguments

- w
A numeric vector of design (inverse probability) weights for a (sequential) Poisson sample.

- replicates
A positive integer that gives the number of bootstrap replicates (1,000 by default). Non-integers are truncated towards 0.

- tau
A number greater than or equal to 1 that gives the rescale factor for the bootstrap weights. Setting to 1 (the default) does not rescale the weights.

- dist
A function that produces random deviates with mean 0 and standard deviation 1, such as

`rnorm()`

. The default uses the pseudo-population method from section 4.1 of Beaumont and Patak (2012); see details.

## Value

A matrix of bootstrap replicate weights with `replicates`

columns (one for
each replicate) and `length(w)`

rows (one for each unit in the sample), with
the value of `tau`

as an attribute.

## Details

Replicate weights are constructed using the generalized bootstrap method by Beaumont and Patak (2012). Their method takes a vector of design weights \(w\), finds a vector of adjustments \(a\) for each bootstrap replicate, and calculates the replicate weights as \(a w\).

There are two ways to calculate the adjustments \(a\). The default
pseudo-population method randomly rounds \(w\) for each replicate to
produce a collection of integer weights \(w'\) that are used to generate a
random vector \(b\) from the binomial distribution. The vector of
adjustments is then \(a = 1 + b - w' / w\). Specifying a
deviates-generating function for `dist`

uses this function to produce a
random vector \(d\) that is then used to make an adjustment \(a = 1 + d
\sqrt{1 - 1 / w}\).

The adjustments can be rescaled by a value \(\tau \geq 1\) to prevent negative replicate weights. With this rescaling, the adjustment becomes \((a + \tau - 1) / \tau\). If \(\tau > 1\) then the resulting bootstrap variance estimator should be multiplied by \(\tau^2\).

## Note

As an alternative to the bootstrap, Ohlsson (1998, equations 2.13) proposes an analytic estimator for the variance of the total \(\hat Y = \sum wy\) (for the take-some units) under sequential Poisson sampling: $$V(\hat Y) = \frac{n}{n - 1} \sum \left(1 - \frac{1}{w}\right) \left(wy - \frac{\hat Y}{n}\right)^2.$$ See Rosén (1997, equation 3.11) for a more general version of this estimator that can be applied to other order sampling schemes. Replacing the left-most correction by \(n / (m - 1)\), where \(m\) is the number of units in the sample, gives a similar estimator for the total under ordinary Poisson sampling, \(\hat Y = n / m \sum wy\).

## References

Beaumont, J.-F. and Patak, Z. (2012). On the Generalized
Bootstrap for Sample Surveys with Special Attention to Poisson Sampling.
*International Statistical Review*, 80(1): 127-148.

Ohlsson, E. (1998). Sequential Poisson Sampling.
*Journal of Official Statistics*, 14(2): 149-162.

Rosén, B. (1997). On sampling with probability proportional to size.
*Journal of Statistical Planning and Inference*, 62(2): 159-191.

## See also

`sps()`

for drawing a sequential Poisson sample.

`bootstrapFP()`

(with `method = "wGeneralised"`

) in the bootstrapFP
package for calculating the variance of Horvitz-Thompson estimators using
the generalized bootstrap.

## Examples

```
# Make a population with units of different size
x <- c(1:10, 100)
# Draw a sequential Poisson sample
(samp <- sps(x, 5))
#> [1] 4 6 8 9 11
# Make some bootstrap replicates
dist <- list(
pseudo_population = NULL,
standard_normal = rnorm,
exponential = \(x) rexp(x) - 1,
uniform = \(x) runif(x, -sqrt(3), sqrt(3))
)
lapply(dist, sps_repweights, w = weights(samp), replicates = 5, tau = 2)
#> $pseudo_population
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 4.8750000 5.375000 1.937500 5.375000 3.656250
#> [2,] 2.4375000 1.291667 1.291667 3.583333 1.291667
#> [3,] 2.0781250 1.578125 2.437500 1.578125 2.078125
#> [4,] 0.5277778 1.291667 1.291667 1.791667 1.027778
#> [5,] 1.0000000 1.000000 1.000000 1.000000 1.000000
#> attr(,"tau")
#> [1] 2
#>
#> $standard_normal
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1.308878 4.195764 6.222635 5.04479605 4.051434
#> [2,] 2.536105 2.813617 3.408599 0.04442534 3.206192
#> [3,] 2.461942 1.657655 2.134877 1.63222601 2.301553
#> [4,] 1.634049 1.605084 1.777511 1.72258501 1.510670
#> [5,] 1.000000 1.000000 1.000000 1.00000000 1.000000
#> attr(,"tau")
#> [1] 2
#>
#> $exponential
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 2.313069 6.386048 2.740352 5.424610 2.707660
#> [2,] 1.491443 1.577468 2.009637 1.699790 2.856261
#> [3,] 1.310694 1.633221 1.455132 1.276392 2.967271
#> [4,] 1.088787 1.170839 1.690410 1.219775 1.183783
#> [5,] 1.000000 1.000000 1.000000 1.000000 1.000000
#> attr(,"tau")
#> [1] 2
#>
#> $uniform
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 2.970496 4.307107 4.721201 4.575343 1.1823646
#> [2,] 3.359473 1.908055 2.419320 2.701965 2.5187027
#> [3,] 1.752788 2.523467 1.216850 1.519328 1.1756872
#> [4,] 2.273226 1.804588 1.541610 2.242415 0.9458692
#> [5,] 1.000000 1.000000 1.000000 1.000000 1.0000000
#> attr(,"tau")
#> [1] 2
#>
```