.. _weighted_correlations:

Computing Weighted Correlation Functions
========================================

Every clustering statistic in ``Corrfunc`` accepts an array
of weights that can be used to compute weighted correlation
functions. The API reference for each clustering statistic
(:py:mod:`Corrfunc.theory.xi`, :py:mod:`Corrfunc.mocks.DDrppi_mocks`,
etc.) contains examples of how to do this.  The interface is standard across functions: the
inputs are a ``weights`` array and a ``weight_type`` string
that specifies how to use the "point weights" to compute a "pair weight".
Currently, the only supported ``weight_type`` is ``pair_product``,
in which the pair weight is the product of the point weights
(but see :ref:`custom_weighting` for how to write your own
function).

.. warning::
    The computation of the weighted result is susceptible to loss of floating
    point precision, especially in single precision.  If you are using single
    precision, make sure you test double precision as well (by casting all
    pos and weight input arrays to type ``np.float64``, for example)
    and check that the difference with the single-precision result
    is acceptable.

If ``weight_type`` and ``weights`` (or ``weights1`` and ``weights2``
for cross-correlations) are given, the mean pair weight in a
separation bin will be given in the ``weightavg`` field of the
output.  This field is 0.0 if weights are disabled.

Pair counts (i.e. the ``npairs`` field in the ``results`` array)
are never affected by weights.  For theory functions like
:py:mod:`Corrfunc.theory.xi` and :py:mod:`Corrfunc.theory.wp`
that actually return a clustering statistic, the statistic is weighted.
For ``pair_product``, the distribution used to compute the
expected bin weight from an unclustered particle set (the ``RR`` term)
is taken to be a spatially uniform particle set where every particle
has the mean weight.  See :ref:`weighted_rr` for more discussion.

Running with weights incurrs a modest performance hit (around
20%, similar to enabling ``ravg``).  Weights are supported for
all instruction sets (SSE, AVX, and fallback).

Consider the following simple example adapted from the :py:mod:`Corrfunc.theory.xi`
docstring, in which we assign a weight of 0.5 to every particle and get
the expected average pair weight of 0.25 (last column of the output).
Note that ``xi`` (fourth column) is also weighted, but the case of uniform
weights is equivalent to the unweighted case.

::

    >>> from __future__ import print_function
    >>> import numpy as np
    >>> from os.path import dirname, abspath, join as pjoin
    >>> import Corrfunc
    >>> from Corrfunc.theory.xi import xi
    >>> binfile = pjoin(dirname(abspath(Corrfunc.__file__)),
    ...                 "../theory/tests/", "bins")
    >>> N = 100000
    >>> boxsize = 420.0
    >>> nthreads = 4
    >>> seed = 42
    >>> np.random.seed(seed)
    >>> X = np.random.uniform(0, boxsize, N)
    >>> Y = np.random.uniform(0, boxsize, N)
    >>> Z = np.random.uniform(0, boxsize, N)
    >>> weights = np.full_like(X, 0.5)
    >>> results = xi(boxsize, nthreads, binfile, X, Y, Z, weights=weights, weight_type='pair_product', output_ravg=True)
    >>> for r in results: print("{0:10.6f} {1:10.6f} {2:10.6f} {3:10.6f} {4:10d} {5:10.6f}"
    ...                         .format(r['rmin'], r['rmax'],
    ...                         r['ravg'], r['xi'], r['npairs'], r['weightavg']))
    ...                   # doctest: +NORMALIZE_WHITESPACE
          0.167536   0.238755   0.226592  -0.205733          4   0.250000
          0.238755   0.340251   0.289277  -0.176729         12   0.250000
          0.340251   0.484892   0.426819  -0.051829         40   0.250000
          0.484892   0.691021   0.596187  -0.131853        106   0.250000
          0.691021   0.984777   0.850100  -0.049207        336   0.250000
          0.984777   1.403410   1.225112   0.028543       1052   0.250000
          1.403410   2.000000   1.737153   0.011403       2994   0.250000
          2.000000   2.850200   2.474588   0.005405       8614   0.250000
          2.850200   4.061840   3.532018  -0.014098      24448   0.250000
          4.061840   5.788530   5.022241  -0.010784      70996   0.250000
          5.788530   8.249250   7.160648  -0.001588     207392   0.250000
          8.249250  11.756000  10.207213  -0.000323     601002   0.250000
         11.756000  16.753600  14.541171   0.000007    1740084   0.250000
         16.753600  23.875500  20.728773  -0.001595    5028058   0.250000