InclusionProb

InclusionProb(x, n, *, alpha=0.001, cutoff=np.inf, sort_method='stable')

First-order inclusion probabilities for units in the population.

Parameters

x : ArrayLike

Sizes for units in the population. Should be a flat array of positive numbers.

n : int

Sample size.

alpha : float = 0.001

A number between 0 and 1 such that units with inclusion probabilities greater than or equal to 1 - alpha are set to 1. The default is slightly larger than 0. See Ohlsson (1998) for details.

cutoff : float = np.inf

A number such that all units with size greater than or equal to cutoff get an inclusion probability of 1. The default does not apply a cutoff.

sort_method : (stable, partial) = 'stable'

Sorting method to use when allocating take-all units. The default uses a stable sort. Using a partial sort can be faster if there are no duplicates in x.
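
To make the roles of alpha and cutoff concrete, here is a minimal NumPy sketch of one common way to compute inclusion probabilities proportional to size, along the lines of the iterative take-all allocation found in sampling texts such as Tillé (2006). It is not the pysps implementation. The function name inclusion_prob_sketch is hypothetical, the handling of cutoff (pre-assigning large units and sharing the remaining sample among the rest) is an assumption, there is no input validation, and sort_method is not modeled.

```python
import numpy as np


def inclusion_prob_sketch(x, n, alpha=0.001, cutoff=np.inf):
    """Illustrative only; a hypothetical helper, not part of pysps."""
    x = np.asarray(x, dtype=float)

    # Assumption: units at or above the cutoff are take-all from the start.
    take_all = x >= cutoff
    pi = np.ones_like(x)

    while True:
        rest = ~take_all
        n_rest = n - take_all.sum()

        # Probabilities proportional to size among the remaining units.
        pi[rest] = n_rest * x[rest] / x[rest].sum()

        # Units at or above 1 - alpha move to the take-all stratum.
        new = rest & (pi >= 1 - alpha)
        if not new.any():
            break
        take_all |= new

    pi[take_all] = 1.0
    return pi


pi_default = inclusion_prob_sketch([1, 2, 3, 4, 5], 3)
pi_cutoff = inclusion_prob_sketch([1, 2, 3, 4, 5], 3, cutoff=4)
```

For x = [1, 2, 3, 4, 5] and n = 3, this sketch should give the same probabilities as the Examples section. With cutoff=4, the fourth unit would also become take-all under the assumption stated above, and the remaining three units would share a sample of size 1.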

Attributes

values : Array

Vector of inclusion probabilities.

n : int

Sample size.

take_all : Array

Indices of the take-all units (those with an inclusion probability of 1).

take_some : Array

Indices of the take-some units (those with an inclusion probability less than 1).

References

Ohlsson, E. (1998). Sequential Poisson Sampling. Journal of Official Statistics, 14(2): 149-162.

Tillé, Y. (2006). Sampling Algorithms. Springer.

Examples

import numpy as np
import pysps

x = [1, 2, 3, 4, 5]
pi = pysps.InclusionProb(x, 3)
pi
InclusionProb(array([0.2, 0.4, 0.6, 0.8, 1. ]), 3)
# Units 1-4 belong to the take-some stratum, and unit 5 belongs to
# the take-all stratum

pi.take_some
array([0, 1, 2, 3])
pi.take_all
array([4])
# Calculate design weights for a PPS sampling scheme

1 / pi.values
array([5.        , 2.5       , 1.66666667, 1.25      , 1.        ])
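
As a rough illustration of how these design weights might be used, the sketch below applies them in a Horvitz-Thompson style estimate of a population total. It uses NumPy only, with the probabilities copied from the output above. The sample indices are hypothetical and chosen by hand, not drawn by pysps.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
pi = np.array([0.2, 0.4, 0.6, 0.8, 1.0])  # copied from the output above
weights = 1 / pi

# The inclusion probabilities add up to the sample size.
expected_sample_size = pi.sum()

# Hypothetical sample of size 3; index 4 is the take-all unit,
# so it appears in every sample.
sample = np.array([1, 3, 4])

# Weighted estimate of the total of x from the sampled units only.
estimate = np.sum(weights[sample] * x[sample])
```

Because the take-some probabilities are proportional to x, every sample of two take-some units plus the take-all unit would give the same estimate here, equal to the true total of x.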