Computing Weighted Correlation Functions

Every clustering statistic in Corrfunc accepts an array of weights that can be used to compute weighted correlation functions. The API reference for each clustering statistic (Corrfunc.theory.xi, Corrfunc.mocks.DDrppi_mocks, etc.) contains examples of how to do this. The interface is standard across functions: the inputs are a weights array and a weight_type string that specifies how to use the “point weights” to compute a “pair weight”. Currently, the only supported weight_type is pair_product, in which the pair weight is the product of the point weights (but see Implementing Custom Weight Functions for how to write your own function).

Warning

The computation of the weighted result is susceptible to loss of floating point precision, especially in single precision. If you are using single precision, make sure you test double precision as well (by casting all pos and weight input arrays to type np.float64, for example) and check that the difference with the single-precision result is acceptable.

If weight_type and weights (or weights1 and weights2 for cross-correlations) are given, the mean pair weight in a separation bin will be given in the weightavg field of the output. This field is 0.0 if weights are disabled.

Pair counts (i.e. the npairs field in the results array) are never affected by weights. For theory functions like Corrfunc.theory.xi and Corrfunc.theory.wp that actually return a clustering statistic, the statistic is weighted. For pair_product, the distribution used to compute the expected bin weight from an unclustered particle set (the RR term) is taken to be a spatially uniform particle set where every particle has the mean weight. See RR in Weighted Clustering Statistics for more discussion.

Running with weights incurrs a modest performance hit (around 20%, similar to enabling ravg). Weights are supported for all instruction sets (SSE, AVX, and fallback).

Consider the following simple example adapted from the Corrfunc.theory.xi docstring, in which we assign a weight of 0.5 to every particle and get the expected average pair weight of 0.25 (last column of the output). Note that xi (fourth column) is also weighted, but the case of uniform weights is equivalent to the unweighted case.

>>> from __future__ import print_function
>>> import numpy as np
>>> from os.path import dirname, abspath, join as pjoin
>>> import Corrfunc
>>> from Corrfunc.theory.xi import xi
>>> binfile = pjoin(dirname(abspath(Corrfunc.__file__)),
...                 "../theory/tests/", "bins")
>>> N = 100000
>>> boxsize = 420.0
>>> nthreads = 4
>>> seed = 42
>>> np.random.seed(seed)
>>> X = np.random.uniform(0, boxsize, N)
>>> Y = np.random.uniform(0, boxsize, N)
>>> Z = np.random.uniform(0, boxsize, N)
>>> weights = np.full_like(X, 0.5)
>>> results = xi(boxsize, nthreads, binfile, X, Y, Z, weights=weights, weight_type='pair_product', output_ravg=True)
>>> for r in results: print("{0:10.6f} {1:10.6f} {2:10.6f} {3:10.6f} {4:10d} {5:10.6f}"
...                         .format(r['rmin'], r['rmax'],
...                         r['ravg'], r['xi'], r['npairs'], r['weightavg']))
...                   
      0.167536   0.238755   0.226592  -0.205733          4   0.250000
      0.238755   0.340251   0.289277  -0.176729         12   0.250000
      0.340251   0.484892   0.426819  -0.051829         40   0.250000
      0.484892   0.691021   0.596187  -0.131853        106   0.250000
      0.691021   0.984777   0.850100  -0.049207        336   0.250000
      0.984777   1.403410   1.225112   0.028543       1052   0.250000
      1.403410   2.000000   1.737153   0.011403       2994   0.250000
      2.000000   2.850200   2.474588   0.005405       8614   0.250000
      2.850200   4.061840   3.532018  -0.014098      24448   0.250000
      4.061840   5.788530   5.022241  -0.010784      70996   0.250000
      5.788530   8.249250   7.160648  -0.001588     207392   0.250000
      8.249250  11.756000  10.207213  -0.000323     601002   0.250000
     11.756000  16.753600  14.541171   0.000007    1740084   0.250000
     16.753600  23.875500  20.728773  -0.001595    5028058   0.250000