InclusionProb

InclusionProb(x, n, *, alpha=0.001, cutoff=np.inf, sort_method='stable')

First-order inclusion probabilities for units in the population.

Parameters

x : ArrayLike: Sizes for units in the population. Should be a flat array of positive numbers.
n : int: Sample size.
alpha : float = 0.001: A number between 0 and 1. Units with an inclusion probability greater than or equal to 1 - alpha have that probability set to 1. The default is slightly larger than 0. See Ohlsson (1998) for details.
cutoff : float = np.inf: A number such that all units with size greater than or equal to cutoff get an inclusion probability of 1. The default does not apply a cutoff.
sort_method : (stable, partial) = 'stable': Sorting method to use when allocating take-all units. The default uses a stable sort. Using a partial sort can be faster if there are no duplicates in x.
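To illustrate how alpha and cutoff interact, here is a minimal NumPy sketch of the textbook iterative approach to probability-proportional-to-size inclusion probabilities. It is not pysps's implementation (in particular, it ignores sort_method and may differ in edge cases); the function name inclusion_prob_sketch is hypothetical and exists only for this illustration.

```python
import numpy as np


def inclusion_prob_sketch(x, n, alpha=0.001, cutoff=np.inf):
    """Illustrative only: iterative PPS inclusion probabilities.

    Hypothetical helper, not part of pysps.
    """
    x = np.asarray(x, dtype=float)
    pi = np.zeros_like(x)
    # Units at or above the cutoff are take-all from the start.
    take_all = x >= cutoff
    while True:
        remaining = n - int(take_all.sum())
        if remaining < 0:
            raise ValueError("more take-all units than the sample size")
        rest = ~take_all
        pi[take_all] = 1.0
        if remaining == 0 or not rest.any():
            pi[rest] = 0.0
            break
        # Allocate the remaining sample proportionally to size.
        pi[rest] = remaining * x[rest] / x[rest].sum()
        # Move units with pi >= 1 - alpha to the take-all stratum.
        new_take_all = rest & (pi >= 1 - alpha)
        if not new_take_all.any():
            break
        take_all |= new_take_all
    return pi


pi_sketch = inclusion_prob_sketch([1, 2, 3, 4, 5], 3)
```

Under these assumptions the probabilities sum to the sample size n, and any unit whose proportional share reaches 1 - alpha is promoted to the take-all stratum before the rest of the sample is reallocated.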

Ohlsson, E. (1998). Sequential Poisson Sampling. Journal of Official Statistics, 14(2): 149-162.

Tillé, Y. (2006). Sampling Algorithms. Springer.

import numpy as np
import pysps

x = [1, 2, 3, 4, 5]
pi = pysps.InclusionProb(x, 3)
pi

InclusionProb(array([0.2, 0.4, 0.6, 0.8, 1. ]), 3)

# Units 1-4 belong to the take-some stratum, and unit 5 belongs to
# the take-all stratum

pi.take_some
pi.take_all

array([4])

# Calculate design weights for a PPS sampling scheme

1 / pi.values

array([5.        , 2.5       , 1.66666667, 1.25      , 1.        ])
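As a sketch of what design weights are typically used for, the snippet below weights a sample to estimate a population total. It uses plain NumPy with the inclusion probabilities from the example above hard-coded; the variable y and the choice of sampled units are hypothetical and not drawn with pysps.

```python
import numpy as np

# Inclusion probabilities from the example above, hard-coded here.
pi_values = np.array([0.2, 0.4, 0.6, 0.8, 1.0])
weights = 1 / pi_values

# Hypothetical variable of interest, proportional to the sizes x.
y = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# Hypothetical sample of 3 units (0-based indices); unit 5 is take-all,
# so it appears in every sample.
sample = np.array([1, 3, 4])

# Weighted estimate of the population total of y.
estimated_total = np.sum(weights[sample] * y[sample])
```

Because y is exactly proportional to x in this toy setup, each sampled unit contributes the same weighted amount and the estimate matches the true total of 150, up to floating-point rounding.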