Decomposing diversity indexes

Index numbers

A diversity index is a way to measure the prevalence of different species in an ecosystem. I show how methods from the price-index literature can be used to decompose a diversity index into the contribution of each species towards overall diversity.

Author

Steve Martin

Published

October 19, 2025

Modified

October 28, 2025

DOI

10.59350/dfntc-yfd90

A diversity index is a way to measure the prevalence of different species in an ecosystem. Although these arise naturally in ecology, diversity indexes show up elsewhere as well. In economics, for example, we can think of diversity as related to the market concentration of firms (species) in an industry (ecosystem). What I want to show here is how we can use some of the machinery from the world of price and quantity indexes to decompose a diversity index into the contribution of each species towards overall diversity.

What is diversity, anyway?

Diversity is a tricky concept to formalize. Up until writing this post, I would have said it was just the number of species in an ecosystem. In his seminal paper, Hill (1973) considers a family of diversity indexes that measure the effective number of species in an ecosystem, that is, the number of species that would be present in an ecosystem if all species were equally prevalent. Mathematically, if there are \(n\) species and the \(i\)-th species appears with probability \(p_{i}\), then the diversity index of order \(\alpha\) is

\[ N_{\alpha}(p_{1}, \ldots, p_{n}) = \left(\sum_{i=1}^{n}p_{i}^{\alpha}\right)^{\frac{1}{1 - \alpha}}. \]

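Different values for \(\alpha\) yield different indexes; for example, \(N_{0}\) measures diversity by the total number of species \(n\), also known as the richness of the ecosystem. (The formula isn't defined at \(\alpha = 1\); there the index is taken to be its limit \(N_{1} = \exp(-\sum_{i=1}^{n} p_{i} \log p_{i})\), the exponential of Shannon entropy.) As a function of \(\alpha\), \(N_{\alpha}\) is continuous and decreasing, mapping \([0, \infty)\) onto \((1 / \max(p_{1}, \ldots, p_{n}), n]\), unless all species are equally abundant, in which case it always equals \(n\). We can see this with an example of an ecosystem with 10 species.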

diversity_index <- function(x, alpha) {
  if (alpha != 1) {
    sum(x^alpha)^(1 / (1 - alpha))
  } else {
    # Limit as alpha -> 1: the exponential of Shannon entropy.
    exp(-sum(x * log(x)))
  }
}

set.seed(54321)

p <- sort(gpindex::scale_weights(rlnorm(10)))

hist(p, main = "Abundance of species in an ecosystem")

alphas <- seq(0, 5, 0.25)
index <- sapply(alphas, \(a) diversity_index(p, a))

plot(
  alphas,
  index,
  ylim = c(0, 10),
  xlab = "𝛼",
  ylab = "Index",
  main = "Diversity decreases with 𝛼"
)
# Horizontal line at the limiting value as alpha grows.
abline(h = 1 / max(p), lty = "dashed")
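The behaviour at the two ends of the \(\alpha\) range can also be checked numerically. The sketch below is self-contained base R; `hill_number()` and the abundances `q` are made up for illustration and aren't part of the example above.

```r
# Hypothetical helper: Hill's diversity index of order alpha.
hill_number <- function(p, alpha) {
  if (alpha == 1) {
    exp(-sum(p * log(p)))
  } else {
    sum(p^alpha)^(1 / (1 - alpha))
  }
}

# Made-up abundances for five species (they sum to 1).
q <- c(0.05, 0.10, 0.15, 0.30, 0.40)

orders <- c(0, 0.5, 1, 2, 5, 50, 500)
values <- sapply(orders, \(a) hill_number(q, a))

# Starts at the number of species, keeps falling, and
# approaches 1 / max(q) for large orders.
print(round(values, 4))
print(1 / max(q))
```

The largest order here is only a stand-in for the limit, so the last value sits slightly above `1 / max(q)` rather than on it.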

With a bit of rearranging, we can see that \(N_\alpha\) is the reciprocal of the generalized mean of order \(\alpha - 1\) of \((p_{1}, \ldots, p_{n})\), with these same values as weights:

\[ N_{\alpha}(p_{1}, \ldots, p_{n}) = 1 / \left(\sum_{i=1}^{n} p_{i}^{\alpha - 1} p_{i}\right)^{\frac{1}{\alpha - 1}}. \]
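As a quick sanity check of this rearrangement, the two expressions can be compared numerically. This is a base-R sketch; `weighted_power_mean()` is a hypothetical helper written for this check (it isn't a {gpindex} function), and the abundances are made up.

```r
# Hypothetical helper: generalized (power) mean of x of order r != 0,
# with weights w that sum to 1.
weighted_power_mean <- function(x, w, r) {
  sum(w * x^r)^(1 / r)
}

# Made-up abundances for five species (they sum to 1).
q <- c(0.05, 0.10, 0.15, 0.30, 0.40)

for (alpha in c(0, 0.5, 2, 5)) {
  direct <- sum(q^alpha)^(1 / (1 - alpha))
  via_mean <- 1 / weighted_power_mean(q, q, alpha - 1)
  # Both forms give the same diversity index.
  stopifnot(isTRUE(all.equal(direct, via_mean)))
}
```

The order \(\alpha = 1\) is left out because the mean of order 0 needs the geometric-mean limit instead of this formula.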

Formulating \(N_{\alpha}\) as a generalized mean shows a clear link between diversity indexes and concentration indexes, as \(N_{2}\) is the reciprocal of the well-known Simpson index (or Herfindahl–Hirschman index if you’re an economist). (It also shows a link with measures of entropy, as \(\log(N_{\alpha})\) is the Rényi entropy, a generalization of Shannon entropy.) What makes a diversity index different from a measure of concentration (or entropy) is that it expresses diversity in terms of the effective size of the ecosystem that would give rise to a particular concentration of species if species were all equally abundant. Intuitively, \(1 / p_{i}\) gives the effective size of the ecosystem if all species were as prevalent as species \(i\). Rather than considering a single species, \(N_{\alpha}\) uses the average abundance across all species to arrive at a measure of diversity.
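These links are easy to confirm with a few lines of base R. In this sketch the market shares are made up and the variable names are only for illustration.

```r
# Made-up market shares for three firms (they sum to 1).
shares <- c(0.5, 0.3, 0.2)

# Order 2: the reciprocal of the Simpson / Herfindahl-Hirschman index.
hhi <- sum(shares^2)
n2 <- sum(shares^2)^(1 / (1 - 2))

# Order 1 (as a limit): the exponential of Shannon entropy.
shannon <- -sum(shares * log(shares))
n1 <- exp(shannon)

# With equally abundant species, the effective number is the actual number.
even <- rep(1 / 4, 4)
n2_even <- 1 / sum(even^2)

print(c(hhi = hhi, n2 = n2, n1 = n1, n2_even = n2_even))
```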

Decomposing diversity

Hill (1973) notes that different choices for \(\alpha\) imply different sensitivities to rare versus abundant species in an ecosystem. Setting \(\alpha = 0\) means that diversity depends only on the number of species, no matter how rare some may be, whereas as \(\alpha \rightarrow \infty\) only the most prevalent species influences diversity. We can go one step further by decomposing \(N_{\alpha}\) so that it’s represented as an arithmetic mean of the effective sizes \(1 / p_{i}\) implied by each species, which lets us see the contribution of each species towards total diversity. We’ll do this by borrowing some of the machinery from price and quantity indexes to derive weights \((w_{1}, \ldots, w_{n})\) such that

\[ N_{\alpha}(p_{1}, \ldots, p_{n}) = \sum_{i=1}^{n} w_{i} / p_{i}. \]

The core tool to do this comes from my {gpindex} package.

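diversity_weights <- function(x, alpha) {
  # Turn the weights for a generalized mean of order alpha - 1 into weights
  # for a harmonic mean (order -1) of x that gives the same value.
  gpindex::transmute_weights(alpha - 1, -1)(x, x)
}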
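To make the identity concrete without relying on the package, here is a base-R sketch of one set of weights that satisfies it for \(\alpha \neq 1\). Writing \(M = 1 / N_{\alpha}\) for the generalized mean, the definition implies \(\sum_{i} p_{i} (p_{i}^{\alpha - 1} - M^{\alpha - 1}) = 0\); factoring \(p_{i} - M\) out of each term shows that weights proportional to \(p_{i}^{2} (p_{i}^{\alpha - 1} - M^{\alpha - 1}) / (p_{i} - M)\) work. This is an illustration, not necessarily how {gpindex} computes its weights, and `sketch_weights()` and the abundances are hypothetical.

```r
# Hypothetical helper: weights w with sum(w / p) equal to the diversity
# index of order alpha (alpha != 1). An illustration only.
sketch_weights <- function(p, alpha) {
  stopifnot(alpha != 1)
  r <- alpha - 1
  m <- sum(p * p^r)^(1 / r) # generalized mean, so the index is 1 / m
  # Slope of t^r between p[i] and m, using the derivative when they coincide.
  close <- abs(p - m) < sqrt(.Machine$double.eps)
  slope <- ifelse(close, r * m^(r - 1), (p^r - m^r) / (p - m))
  w <- p^2 * slope
  w / sum(w)
}

# Made-up abundances for five species (they sum to 1).
q <- c(0.05, 0.10, 0.15, 0.30, 0.40)

for (alpha in c(0, 0.5, 2, 5)) {
  w <- sketch_weights(q, alpha)
  index <- sum(q^alpha)^(1 / (1 - alpha))
  # The weights sum to 1 and reproduce the index as a mean of 1 / q.
  stopifnot(
    isTRUE(all.equal(sum(w), 1)),
    isTRUE(all.equal(sum(w / q), index))
  )
}
```

Weights that satisfy a single identity like this aren't unique in general, so the sketch shows that a decomposition exists rather than pinning down the only one.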

When \(\alpha = 0\), each species contributes the same amount to overall diversity.

diversity_weights(p, 0) / p

 [1] 1 1 1 1 1 1 1 1 1 1

When \(\alpha\) increases to 2, the measure of diversity decreases and more weight shifts towards the more abundant species.

diversity_weights(p, 2) / p

 [1] 0.02204630 0.02298373 0.04003915 0.04733940 0.05467209 0.07962440
 [7] 0.10013170 0.14336078 0.48334916 1.48242257
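At \(\alpha = 2\) the identity has a particularly simple solution: weights proportional to \(p_{i}^{2}\) give contributions \(p_{i} / \sum_{j} p_{j}^{2}\), that is, each species' abundance divided by the Simpson index. The ratios in the output above look consistent with this, though that is an inference from the printed values rather than a documented property of {gpindex}. A base-R sketch with made-up abundances:

```r
# Made-up abundances for five species (they sum to 1).
q <- c(0.05, 0.10, 0.15, 0.30, 0.40)

# Weights proportional to squared abundance.
w <- q^2 / sum(q^2)

# Each species' contribution is its abundance over the Simpson index.
contributions <- w / q
print(contributions)

# The contributions add up to the diversity index of order 2.
stopifnot(isTRUE(all.equal(sum(contributions), 1 / sum(q^2))))
```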

As \(\alpha\) becomes larger, rare species get a small weight and contribute little towards overall diversity.

diversity_weights(p, 5)

 [1] 4.429620e-05 4.817851e-05 1.481826e-04 2.083463e-04 2.795183e-04
 [6] 6.049503e-04 9.729488e-04 2.068354e-03 3.261787e-02 9.630074e-01

diversity_weights(p, 5) / p

 [1] 0.004974804 0.005190128 0.009163420 0.010897035 0.012658720 0.018811300
 [7] 0.024058228 0.035722328 0.167085953 1.608432485

This gives a different way to view the prevalence of a species: not just by its abundance, but by how that abundance contributes towards overall diversity in an ecosystem.

References

Hill, Mark O. 1973. “Diversity and Evenness: A Unifying Notation and Its Consequences.” Ecology 54 (2): 427–32. https://doi.org/10.2307/1934352.

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{martin2025,
  author = {Martin, Steve},
  title = {Decomposing Diversity Indexes},
  date = {2025-10-19},
  url = {https://marberts.github.io/blog/posts/2025/diversity/},
  doi = {10.59350/dfntc-yfd90},
  langid = {en}
}

For attribution, please cite this work as:

Martin, Steve. 2025. “Decomposing Diversity Indexes.” October 19, 2025. https://doi.org/10.59350/dfntc-yfd90.