Corrfunc.mocks package

Wrapper for all clustering statistic calculations on galaxies in a mock catalog.

Corrfunc.mocks.DDrppi_mocks(autocorr, cosmology, nthreads, pimax, binfile, RA1, DEC1, CZ1, weights1=None, RA2=None, DEC2=None, CZ2=None, weights2=None, is_comoving_dist=False, verbose=False, output_rpavg=False, fast_divide_and_NR_steps=0, xbin_refine_factor=2, ybin_refine_factor=2, zbin_refine_factor=1, max_cells_per_dim=100, copy_particles=True, enable_min_sep_opt=True, c_api_timer=False, isa=u'fastest', weight_type=None)[source]

Calculate the 2-D pair-counts corresponding to the projected correlation function, \(\xi(r_p, \pi)\). Pairs which are separated by less than the rp bins (specified in binfile) in the X-Y plane, and less than pimax in the Z-dimension are counted. The input positions are expected to be on-sky co-ordinates. This module is suitable for calculating correlation functions for mock catalogs.

If weights are provided, the resulting pair counts are weighted. The weighting scheme depends on weight_type.

Returns a numpy structured array containing the pair counts for the specified bins.

Note

that this module only returns pair counts and not the actual correlation function \(\xi(r_p, \pi)\) or \(wp(r_p)\). See the utilities Corrfunc.utils.convert_3d_counts_to_cf and Corrfunc.utils.convert_rp_pi_counts_to_wp for computing \(\xi(r_p, \pi)\) and \(wp(r_p)\) respectively from the pair counts.

Parameters:
  • autocorr (boolean, required) – Boolean flag for auto/cross-correlation. If autocorr is set to 1, then the second set of particle positions are not required.
  • cosmology (integer, required) –

    Integer choice for setting cosmology. Valid values are 1->LasDamas cosmology and 2->Planck cosmology. If you need arbitrary cosmology, easiest way is to convert the CZ values into co-moving distance, based on your preferred cosmology. Set is_comoving_dist=True, to indicate that the co-moving distance conversion has already been done.

    Choices:
    1. LasDamas cosmology. \(\Omega_m=0.25\), \(\Omega_\Lambda=0.75\)
    2. Planck cosmology. \(\Omega_m=0.302\), \(\Omega_\Lambda=0.698\)

    To setup a new cosmology, add an entry to the function, init_cosmology in ROOT/utils/cosmology_params.c and re-install the entire package.

  • nthreads (integer) – The number of OpenMP threads to use. Has no effect if OpenMP was not enabled during library compilation.
  • pimax (double) –

    A double-precision value for the maximum separation along the Z-dimension.

    Distances along the \(\pi\) direction are binned with unit depth. For instance, if pimax=40, then 40 bins will be created along the pi direction. Only pairs with 0 <= dz < pimax are counted (no equality).

  • binfile (string or an list/array of floats) –

    For string input: filename specifying the rp bins for DDrppi_mocks. The file should contain white-space separated values of (rpmin, rpmax) for each rp wanted. The bins need to be contiguous and sorted in increasing order (smallest bins come first).

    For array-like input: A sequence of rp values that provides the bin-edges. For example, np.logspace(np.log10(0.1), np.log10(10.0), 15) is a valid input specifying 14 (logarithmic) bins between 0.1 and 10.0. This array does not need to be sorted.

  • RA1 (array-like, real (float/double)) –

    The array of Right Ascensions for the first set of points. RA’s are expected to be in [0.0, 360.0], but the code will try to fix cases where the RA’s are in [-180, 180.0]. For peace of mind, always supply RA’s in [0.0, 360.0].

    Calculations are done in the precision of the supplied arrays.

  • DEC1 (array-like, real (float/double)) –

    Array of Declinations for the first set of points. DEC’s are expected to be in the [-90.0, 90.0], but the code will try to fix cases where the DEC’s are in [0.0, 180.0]. Again, for peace of mind, always supply DEC’s in [-90.0, 90.0].

    Must be of same precision type as RA1.

  • CZ1 (array-like, real (float/double)) –

    Array of (Speed Of Light * Redshift) values for the first set of points. Code will try to detect cases where redshifts have been passed and multiply the entire array with the speed of light.

    If is_comoving_dist is set, then CZ1 is interpreted as the co-moving distance, rather than cz.

  • weights1 (array_like, real (float/double), optional) – A scalar, or an array of weights of shape (n_weights, n_positions) or (n_positions,). weight_type specifies how these weights are used; results are returned in the weightavg field. If only one of weights1 and weights2 is specified, the other will be set to uniform weights.
  • RA2 (array-like, real (float/double)) –

    The array of Right Ascensions for the second set of points. RA’s are expected to be in [0.0, 360.0], but the code will try to fix cases where the RA’s are in [-180, 180.0]. For peace of mind, always supply RA’s in [0.0, 360.0].

    Must be of same precision type as RA1/DEC1/CZ1.

  • DEC2 (array-like, real (float/double)) –

    Array of Declinations for the second set of points. DEC’s are expected to be in the [-90.0, 90.0], but the code will try to fix cases where the DEC’s are in [0.0, 180.0]. Again, for peace of mind, always supply DEC’s in [-90.0, 90.0].

    Must be of same precision type as RA1/DEC1/CZ1.

  • CZ2 (array-like, real (float/double)) –

    Array of (Speed Of Light * Redshift) values for the second set of points. Code will try to detect cases where redshifts have been passed and multiply the entire array with the speed of light.

    If is_comoving_dist is set, then CZ2 is interpreted as the co-moving distance, rather than cz.

    Must be of same precision type as RA1/DEC1/CZ1.

  • weights2 (array-like, real (float/double), optional) – Same as weights1, but for the second set of positions
  • is_comoving_dist (boolean (default false)) – Boolean flag to indicate that cz values have already been converted into co-moving distances. This flag allows arbitrary cosmologies to be used in Corrfunc.
  • verbose (boolean (default false)) – Boolean flag to control output of informational messages
  • output_rpavg (boolean (default false)) –

    Boolean flag to output the average rp for each bin. Code will run slower if you set this flag.

    If you are calculating in single-precision, rpavg will suffer suffer from numerical loss of precision and can not be trusted. If you need accurate rpavg values, then pass in double precision arrays for the particle positions.

  • fast_divide_and_NR_steps (integer (default 0)) – Replaces the division in AVX implementation with an approximate reciprocal, followed by fast_divide_and_NR_steps of Newton-Raphson. Can improve runtime by ~15-20% on older computers. Value of 0 uses the standard division operation.
  • (xyz)bin_refine_factor (integer, default is (2,2,1); typically within [1-3]) – Controls the refinement on the cell sizes. Can have up to a 20% impact on runtime.
  • max_cells_per_dim (integer, default is 100, typical values in [50-300]) – Controls the maximum number of cells per dimension. Total number of cells can be up to (max_cells_per_dim)^3. Only increase if rpmax is too small relative to the boxsize (and increasing helps the runtime).
  • copy_particles (boolean (default True)) –

    Boolean flag to make a copy of the particle positions If set to False, the particles will be re-ordered in-place

    New in version 2.3.0.

  • enable_min_sep_opt (boolean (default true)) –

    Boolean flag to allow optimizations based on min. separation between pairs of cells. Here to allow for comparison studies.

    New in version 2.3.0.

  • c_api_timer (boolean (default false)) – Boolean flag to measure actual time spent in the C libraries. Here to allow for benchmarking and scaling studies.
  • isa (string, case-insensitive (default fastest)) –

    Controls the runtime dispatch for the instruction set to use. Possible options are: [fastest, avx512f, avx, sse42, fallback]

    Setting isa to fastest will pick the fastest available instruction set on the current computer. However, if you set isa to, say, avx and avx is not available on the computer, then the code will revert to using fallback (even though sse42 might be available).

    Unless you are benchmarking the different instruction sets, you should always leave isa to the default value. And if you are benchmarking, then the string supplied here gets translated into an enum for the instruction set defined in utils/defs.h.

  • weight_type (string, optional (default None)) – The type of weighting to apply. One of [“pair_product”, None].
Returns:

  • results (Numpy structured array) – A numpy structured array containing [rpmin, rpmax, rpavg, pimax, npairs, weightavg] for each radial bin specified in the binfile. If output_ravg is not set, then rpavg will be set to 0.0 for all bins; similarly for weightavg. npairs contains the number of pairs in that bin and can be used to compute the actual \(\xi(r_p, \pi)\) or \(wp(rp)\) by combining with (DR, RR) counts.
  • api_time (float, optional) – Only returned if c_api_timer is set. api_time measures only the time spent within the C library and ignores all python overhead.

Example

>>> from __future__ import print_function
>>> import numpy as np
>>> from os.path import dirname, abspath, join as pjoin
>>> import Corrfunc
>>> from Corrfunc.mocks.DDrppi_mocks import DDrppi_mocks
>>> import math
>>> binfile = pjoin(dirname(abspath(Corrfunc.__file__)),
...                 "../mocks/tests/", "bins")
>>> N = 100000
>>> boxsize = 420.0
>>> seed = 42
>>> np.random.seed(seed)
>>> X = np.random.uniform(-0.5*boxsize, 0.5*boxsize, N)
>>> Y = np.random.uniform(-0.5*boxsize, 0.5*boxsize, N)
>>> Z = np.random.uniform(-0.5*boxsize, 0.5*boxsize, N)
>>> weights = np.ones_like(X)
>>> CZ = np.sqrt(X*X + Y*Y + Z*Z)
>>> inv_cz = 1.0/CZ
>>> X *= inv_cz
>>> Y *= inv_cz
>>> Z *= inv_cz
>>> DEC = 90.0 - np.arccos(Z)*180.0/math.pi
>>> RA = (np.arctan2(Y, X)*180.0/math.pi) + 180.0
>>> autocorr = 1
>>> cosmology = 1
>>> nthreads = 2
>>> pimax = 40.0
>>> results = DDrppi_mocks(autocorr, cosmology, nthreads,
...                        pimax, binfile, RA, DEC, CZ,
...                        weights1=weights, weight_type='pair_product',
...                        output_rpavg=True, is_comoving_dist=True)
>>> for r in results[519:]: print("{0:10.6f} {1:10.6f} {2:10.6f} {3:10.1f}"
...                               " {4:10d} {5:10.6f}".format(r['rmin'], r['rmax'],
...                               r['rpavg'], r['pimax'], r['npairs'], r['weightavg']))
...                         # doctest: +NORMALIZE_WHITESPACE
 11.359969  16.852277  14.285169       40.0     104850   1.000000
 16.852277  25.000000  21.181246        1.0     274144   1.000000
 16.852277  25.000000  21.190844        2.0     272876   1.000000
 16.852277  25.000000  21.183321        3.0     272294   1.000000
 16.852277  25.000000  21.188486        4.0     272506   1.000000
 16.852277  25.000000  21.170832        5.0     272100   1.000000
 16.852277  25.000000  21.165379        6.0     271788   1.000000
 16.852277  25.000000  21.175246        7.0     270040   1.000000
 16.852277  25.000000  21.187417        8.0     269492   1.000000
 16.852277  25.000000  21.172066        9.0     269682   1.000000
 16.852277  25.000000  21.182460       10.0     268266   1.000000
 16.852277  25.000000  21.170594       11.0     268744   1.000000
 16.852277  25.000000  21.178608       12.0     266820   1.000000
 16.852277  25.000000  21.187184       13.0     266510   1.000000
 16.852277  25.000000  21.184937       14.0     265484   1.000000
 16.852277  25.000000  21.180184       15.0     265258   1.000000
 16.852277  25.000000  21.191504       16.0     262952   1.000000
 16.852277  25.000000  21.187746       17.0     262602   1.000000
 16.852277  25.000000  21.189778       18.0     260206   1.000000
 16.852277  25.000000  21.188882       19.0     259410   1.000000
 16.852277  25.000000  21.185684       20.0     256806   1.000000
 16.852277  25.000000  21.194036       21.0     255574   1.000000
 16.852277  25.000000  21.184115       22.0     255406   1.000000
 16.852277  25.000000  21.178255       23.0     252394   1.000000
 16.852277  25.000000  21.184644       24.0     252220   1.000000
 16.852277  25.000000  21.187020       25.0     251668   1.000000
 16.852277  25.000000  21.183827       26.0     249648   1.000000
 16.852277  25.000000  21.183121       27.0     247160   1.000000
 16.852277  25.000000  21.180872       28.0     246238   1.000000
 16.852277  25.000000  21.185251       29.0     246030   1.000000
 16.852277  25.000000  21.183488       30.0     242124   1.000000
 16.852277  25.000000  21.194538       31.0     242426   1.000000
 16.852277  25.000000  21.190702       32.0     239778   1.000000
 16.852277  25.000000  21.188985       33.0     239046   1.000000
 16.852277  25.000000  21.187092       34.0     237640   1.000000
 16.852277  25.000000  21.185515       35.0     236256   1.000000
 16.852277  25.000000  21.190278       36.0     233536   1.000000
 16.852277  25.000000  21.183240       37.0     233274   1.000000
 16.852277  25.000000  21.183796       38.0     231628   1.000000
 16.852277  25.000000  21.200668       39.0     230378   1.000000
 16.852277  25.000000  21.181153       40.0     229006   1.000000
Corrfunc.mocks.DDtheta_mocks(autocorr, nthreads, binfile, RA1, DEC1, weights1=None, RA2=None, DEC2=None, weights2=None, link_in_dec=True, link_in_ra=True, verbose=False, output_thetaavg=False, fast_acos=False, ra_refine_factor=2, dec_refine_factor=2, max_cells_per_dim=100, copy_particles=True, enable_min_sep_opt=True, c_api_timer=False, isa=u'fastest', weight_type=None)[source]

Function to compute the angular correlation function for points on the sky (i.e., mock catalogs or observed galaxies).

Returns a numpy structured array containing the pair counts for the specified angular bins.

If weights are provided, the resulting pair counts are weighted. The weighting scheme depends on weight_type.

Note

This module only returns pair counts and not the actual correlation function \(\omega( heta)\). See Corrfunc.utils.convert_3d_counts_to_cf for computing \(\omega( heta)\) from the pair counts returned.

Parameters:
  • autocorr (boolean, required) – Boolean flag for auto/cross-correlation. If autocorr is set to 1, then the second set of particle positions are not required.
  • nthreads (integer) – Number of threads to use.
  • binfile (string or an list/array of floats. Units: degrees.) –

    For string input: filename specifying the theta bins for DDtheta_mocks. The file should contain white-space separated values of (thetamin, thetamax) for each theta wanted. The bins need to be contiguous and sorted in increasing order (smallest bins come first).

    For array-like input: A sequence of theta values that provides the bin-edges. For example, np.logspace(np.log10(0.1), np.log10(10.0), 15) is a valid input specifying 14 (logarithmic) bins between 0.1 and 10.0 degrees. This array does not need to be sorted.

  • RA1 (array-like, real (float/double)) –

    The array of Right Ascensions for the first set of points. RA’s are expected to be in [0.0, 360.0], but the code will try to fix cases where the RA’s are in [-180, 180.0]. For peace of mind, always supply RA’s in [0.0, 360.0].

    Calculations are done in the precision of the supplied arrays.

  • DEC1 (array-like, real (float/double)) – Array of Declinations for the first set of points. DEC’s are expected to be in the [-90.0, 90.0], but the code will try to fix cases where the DEC’s are in [0.0, 180.0]. Again, for peace of mind, always supply DEC’s in [-90.0, 90.0]. Must be of same precision type as RA1.
  • weights1 (array_like, real (float/double), optional) – A scalar, or an array of weights of shape (n_weights, n_positions) or (n_positions,). weight_type specifies how these weights are used; results are returned in the weightavg field. If only one of weights1 and weights2 is specified, the other will be set to uniform weights.
  • RA2 (array-like, real (float/double)) – The array of Right Ascensions for the second set of points. RA’s are expected to be in [0.0, 360.0], but the code will try to fix cases where the RA’s are in [-180, 180.0]. For peace of mind, always supply RA’s in [0.0, 360.0]. Must be of same precision type as RA1/DEC1.
  • DEC2 (array-like, real (float/double)) – Array of Declinations for the second set of points. DEC’s are expected to be in the [-90.0, 90.0], but the code will try to fix cases where the DEC’s are in [0.0, 180.0]. Again, for peace of mind, always supply DEC’s in [-90.0, 90.0]. Must be of same precision type as RA1/DEC1.
  • weights2 (array-like, real (float/double), optional) – Same as weights1, but for the second set of positions
  • link_in_dec (boolean (default True)) – Boolean flag to create lattice in Declination. Code runs faster with this option. However, if the angular separations are too small, then linking in declination might produce incorrect results. When running for the first time, check your results by comparing with the output of the code for link_in_dec=False and link_in_ra=False.
  • link_in_ra (boolean (default True)) –

    Boolean flag to create lattice in Right Ascension. Setting this option implies link_in_dec=True. Similar considerations as link_in_dec described above.

    If you disable both link_in_dec and link_in_ra, then the code reduces to a brute-force pair counter. No lattices are created at all. For very small angular separations, the brute-force method might be the most numerically stable method.

  • verbose (boolean (default false)) – Boolean flag to control output of informational messages
  • output_thetaavg (boolean (default false)) –

    Boolean flag to output the average `` heta`` for each bin. Code will run slower if you set this flag.

    If you are calculating in single-precision, thetaavg will suffer from numerical loss of precision and can not be trusted. If you need accurate thetaavg values, then pass in double precision arrays for RA/DEC.

    Code will run significantly slower if you enable this option. Use the keyword fast_acos if you can tolerate some loss of precision.

  • fast_acos (boolean (default false)) –

    Flag to use numerical approximation for the arccos - gives better performance at the expense of some precision. Relevant only if output_thetaavg==True.

    Developers: Two versions already coded up in utils/fast_acos.h, so you can choose the version you want. There are also notes on how to implement faster (and less accurate) functions, particularly relevant if you know your theta range is limited. If you implement a new version, then you will have to reinstall the entire Corrfunc package.

    Note: Tests will fail if you run the tests with``fast_acos=True``.

  • (radec)_refine_factor (integer, default is (2,2); typically within [1-5]) –

    Controls the refinement on the cell sizes. Can have up to a 20% impact on runtime.

    Only two refine factors are to be specified and these correspond to ra and dec (rather, than the usual three of (xyz)bin_refine_factor for all other correlation functions).

  • max_cells_per_dim (integer, default is 100, typical values in [50-300]) – Controls the maximum number of cells per dimension. Total number of cells can be up to (max_cells_per_dim)^3. Only increase if thetamax is too small relative to the boxsize (and increasing helps the runtime).
  • copy_particles (boolean (default True)) –

    Boolean flag to make a copy of the particle positions If set to False, the particles will be re-ordered in-place

    New in version 2.3.0.

  • enable_min_sep_opt (boolean (default true)) –

    Boolean flag to allow optimizations based on min. separation between pairs of cells. Here to allow for comparison studies.

    New in version 2.3.0.

  • c_api_timer (boolean (default false)) – Boolean flag to measure actual time spent in the C libraries. Here to allow for benchmarking and scaling studies.
  • isa (string, case-insensitive (default fastest)) –

    Controls the runtime dispatch for the instruction set to use. Options are: [fastest, avx512f, avx, sse42, fallback]

    Setting isa to fastest will pick the fastest available instruction set on the current computer. However, if you set isa to, say, avx and avx is not available on the computer, then the code will revert to using fallback (even though sse42 might be available).

    Unless you are benchmarking the different instruction sets, you should always leave isa to the default value. And if you are benchmarking, then the string supplied here gets translated into an enum for the instruction set defined in utils/defs.h.

  • weight_type (string, optional (default None)) – The type of weighting to apply. One of [“pair_product”, None].
Returns:

  • results (Numpy structured array) – A numpy structured array containing [thetamin, thetamax, thetaavg, npairs, weightavg] for each angular bin specified in the binfile. If output_thetaavg is not set then thetavg will be set to 0.0 for all bins; similarly for weightavg. npairs contains the number of pairs in that bin.
  • api_time (float, optional) – Only returned if c_api_timer is set. api_time measures only the time spent within the C library and ignores all python overhead.

Example

>>> from __future__ import print_function
>>> import numpy as np
>>> import time
>>> from math import pi
>>> from os.path import dirname, abspath, join as pjoin
>>> import Corrfunc
>>> from Corrfunc.mocks.DDtheta_mocks import DDtheta_mocks
>>> binfile = pjoin(dirname(abspath(Corrfunc.__file__)),
...                 "../mocks/tests/", "angular_bins")
>>> N = 100000
>>> nthreads = 4
>>> seed = 42
>>> np.random.seed(seed)
>>> RA = np.random.uniform(0.0, 2.0*pi, N)*180.0/pi
>>> cos_theta = np.random.uniform(-1.0, 1.0, N)
>>> DEC = 90.0 - np.arccos(cos_theta)*180.0/pi
>>> weights = np.ones_like(RA)
>>> autocorr = 1
>>> for isa in ['AVX', 'SSE42', 'FALLBACK']:
...     for link_in_dec in [False, True]:
...         for link_in_ra in [False, True]:
...             results = DDtheta_mocks(autocorr, nthreads, binfile,
...                         RA, DEC, output_thetaavg=True,
...                         weights1=weights, weight_type='pair_product',
...                         link_in_dec=link_in_dec, link_in_ra=link_in_ra,
...                         isa=isa, verbose=True)
>>> for r in results: print("{0:10.6f} {1:10.6f} {2:10.6f} {3:10d} {4:10.6f}".
...                         format(r['thetamin'], r['thetamax'],
...                         r['thetaavg'], r['npairs'], r['weightavg']))
...                         # doctest: +NORMALIZE_WHITESPACE
  0.010000   0.014125   0.012272         62   1.000000
  0.014125   0.019953   0.016978        172   1.000000
  0.019953   0.028184   0.024380        298   1.000000
  0.028184   0.039811   0.034321        598   1.000000
  0.039811   0.056234   0.048535       1164   1.000000
  0.056234   0.079433   0.068385       2438   1.000000
  0.079433   0.112202   0.096631       4658   1.000000
  0.112202   0.158489   0.136834       9414   1.000000
  0.158489   0.223872   0.192967      19098   1.000000
  0.223872   0.316228   0.272673      37848   1.000000
  0.316228   0.446684   0.385344      75520   1.000000
  0.446684   0.630957   0.543973     150938   1.000000
  0.630957   0.891251   0.768406     301854   1.000000
  0.891251   1.258925   1.085273     599896   1.000000
  1.258925   1.778279   1.533461    1200238   1.000000
  1.778279   2.511886   2.166009    2396338   1.000000
  2.511886   3.548134   3.059159    4775162   1.000000
  3.548134   5.011872   4.321445    9532582   1.000000
  5.011872   7.079458   6.104214   19001930   1.000000
  7.079458  10.000000   8.622400   37842502   1.000000
Corrfunc.mocks.vpf_mocks(rmax, nbins, nspheres, numpN, threshold_ngb, centers_file, cosmology, RA, DEC, CZ, RAND_RA, RAND_DEC, RAND_CZ, verbose=False, is_comoving_dist=False, xbin_refine_factor=1, ybin_refine_factor=1, zbin_refine_factor=1, max_cells_per_dim=100, copy_particles=True, c_api_timer=False, isa=u'fastest')[source]

Function to compute the counts-in-cells on points on the sky. Suitable for mock catalogs and observed galaxies.

Returns a numpy structured array containing the probability of a sphere of radius up to rmax containing 0--numpN-1 galaxies.

Parameters:
  • rmax (double) – Maximum radius of the sphere to place on the particles
  • nbins (integer) – Number of bins in the counts-in-cells. Radius of first shell is rmax/nbins
  • nspheres (integer (>= 0)) – Number of random spheres to place within the particle distribution. For a small number of spheres, the error is larger in the measured pN’s.
  • numpN (integer (>= 1)) –

    Governs how many unique pN’s are to returned. If numpN is set to 1, then only the vpf (p0) is returned. For numpN=2, p0 and p1 are returned.

    More explicitly, the columns in the results look like the following:

    numpN Columns in output
    1 p0
    2 p0 p1
    3 p0 p1 p2
    4 p0 p1 p2 p3

    and so on…

    Note: p0 is the vpf

  • threshold_ngb (integer) – Minimum number of random points needed in a rmax sphere such that it is considered to be entirely within the mock footprint. The command-line version, mocks/vpf/vpf_mocks.c, assumes that the minimum number of randoms can be at most a 1-sigma deviation from the expected random number density.
  • centers_file (string, filename) –

    A file containing random sphere centers. If the file does not exist, then a list of random centers will be written out. In that case, the randoms arrays, RAND_RA, RAND_DEC and RAND_CZ are used to check that the sphere is entirely within the footprint. If the file does exist but either rmax is too small or there are not enough centers then the file will be overwritten.

    Note: If the centers file has to be written, the code will take significantly longer to finish. However, subsequent runs can re-use that centers file and will be faster.

  • cosmology (integer, required) –

    Integer choice for setting cosmology. Valid values are 1->LasDamas cosmology and 2->Planck cosmology. If you need arbitrary cosmology, easiest way is to convert the CZ values into co-moving distance, based on your preferred cosmology. Set is_comoving_dist=True, to indicate that the co-moving distance conversion has already been done.

    Choices:
    1. LasDamas cosmology. \(\Omega_m=0.25\), \(\Omega_\Lambda=0.75\)
    2. Planck cosmology. \(\Omega_m=0.302\), \(\Omega_\Lambda=0.698\)

    To setup a new cosmology, add an entry to the function, init_cosmology in ROOT/utils/cosmology_params.c and re-install the entire package.

  • RA (array-like, real (float/double)) –

    The array of Right Ascensions for the first set of points. RA’s are expected to be in [0.0, 360.0], but the code will try to fix cases where the RA’s are in [-180, 180.0]. For peace of mind, always supply RA’s in [0.0, 360.0].

    Calculations are done in the precision of the supplied arrays.

  • DEC (array-like, real (float/double)) –

    Array of Declinations for the first set of points. DEC’s are expected to be in the [-90.0, 90.0], but the code will try to fix cases where the DEC’s are in [0.0, 180.0]. Again, for peace of mind, always supply DEC’s in [-90.0, 90.0].

    Must be of same precision type as RA.

  • CZ (array-like, real (float/double)) –

    Array of (Speed Of Light * Redshift) values for the first set of points. Code will try to detect cases where redshifts have been passed and multiply the entire array with the speed of light.

    If is_comoving_dist is set, then CZ is interpreted as the co-moving distance, rather than (Speed Of Light * Redshift).

  • RAND_RA (array-like, real (float/double)) –

    The array of Right Ascensions for the randoms. RA’s are expected to be in [0.0, 360.0], but the code will try to fix cases where the RA’s are in [-180, 180.0]. For peace of mind, always supply RA’s in [0.0, 360.0].

    Must be of same precision type as RA/DEC/CZ.

  • RAND_DEC (array-like, real (float/double)) –

    Array of Declinations for the randoms. DEC’s are expected to be in the [-90.0, 90.0], but the code will try to fix cases where the DEC’s are in [0.0, 180.0]. Again, for peace of mind, always supply DEC’s in [-90.0, 90.0].

    Must be of same precision type as RA/DEC/CZ.

  • RAND_CZ (array-like, real (float/double)) –

    Array of (Speed Of Light * Redshift) values for the randoms. Code will try to detect cases where redshifts have been passed and multiply the entire array with the speed of light.

    If is_comoving_dist is set, then CZ2 is interpreted as the co-moving distance, rather than (Speed Of Light * Redshift).

    Note: RAND_RA, RAND_DEC and RAND_CZ are only used when the
    centers_file needs to be written out. In that case, the RAND_RA, RAND_DEC, and RAND_CZ are used as random centers.
  • verbose (boolean (default false)) – Boolean flag to control output of informational messages
  • is_comoving_dist (boolean (default false)) – Boolean flag to indicate that cz values have already been converted into co-moving distances. This flag allows arbitrary cosmologies to be used in Corrfunc.
  • (xyz)bin_refine_factor (integer, default is (1, 1, 1); typically in [1-2]) –

    Controls the refinement on the cell sizes. Higher numbers might have a negative impact on runtime.

    Note: Since the counts in spheres calculation is symmetric in all 3 dimensions, the defaults are different from the clustering routines.

  • max_cells_per_dim (integer, default is 100, typical values in [50-300]) – Controls the maximum number of cells per dimension. Total number of cells can be up to (max_cells_per_dim)^3. Only increase if rmax is too small relative to the boxsize (and increasing helps the runtime).
  • copy_particles (boolean (default True)) –

    Boolean flag to make a copy of the particle positions If set to False, the particles will be re-ordered in-place

    New in version 2.3.0.

  • c_api_timer (boolean (default false)) – Boolean flag to measure actual time spent in the C libraries. Here to allow for benchmarking and scaling studies.
  • isa (string, case-insensitive (default fastest)) –

    Controls the runtime dispatch for the instruction set to use. Options are: [fastest, avx512f, avx, sse42, fallback]

    Setting isa to fastest will pick the fastest available instruction set on the current computer. However, if you set isa to, say, avx and avx is not available on the computer, then the code will revert to using fallback (even though sse42 might be available).

    Unless you are benchmarking the different instruction sets, you should always leave isa to the default value. And if you are benchmarking, then the string supplied here gets translated into an enum for the instruction set defined in utils/defs.h.

Returns:

  • results (Numpy structured array) – A numpy structured array containing [rmax, pN[numpN]] with nbins elements. Each row contains the maximum radius of the sphere and the numpN elements in the pN array. Each element of this array contains the probability that a sphere of radius rmax contains exactly N galaxies. For example, pN[0] (p0, the void probibility function) is the probability that a sphere of radius rmax contains 0 galaxies.
  • api_time (float, optional) – Only returned if c_api_timer is set. api_time measures only the time spent within the C library and ignores all python overhead.

Example

>>> from __future__ import print_function
>>> import math
>>> from os.path import dirname, abspath, join as pjoin
>>> import numpy as np
>>> import Corrfunc
>>> from Corrfunc.mocks.vpf_mocks import vpf_mocks
>>> rmax = 10.0
>>> nbins = 10
>>> numbins_to_print = nbins
>>> nspheres = 10000
>>> numpN = 6
>>> threshold_ngb = 1  # does not matter since we have the centers
>>> cosmology = 1  # LasDamas cosmology
>>> centers_file = pjoin(dirname(abspath(Corrfunc.__file__)),
...                      "../mocks/tests/data/",
...                      "Mr19_centers_xyz_forVPF_rmax_10Mpc.txt")
>>> N = 1000000
>>> boxsize = 420.0
>>> seed = 42
>>> np.random.seed(seed)
>>> X = np.random.uniform(-0.5*boxsize, 0.5*boxsize, N)
>>> Y = np.random.uniform(-0.5*boxsize, 0.5*boxsize, N)
>>> Z = np.random.uniform(-0.5*boxsize, 0.5*boxsize, N)
>>> CZ = np.sqrt(X*X + Y*Y + Z*Z)
>>> inv_cz = 1.0/CZ
>>> X *= inv_cz
>>> Y *= inv_cz
>>> Z *= inv_cz
>>> DEC = 90.0 - np.arccos(Z)*180.0/math.pi
>>> RA = (np.arctan2(Y, X)*180.0/math.pi) + 180.0
>>> results = vpf_mocks(rmax, nbins, nspheres, numpN, threshold_ngb,
...                     centers_file, cosmology,
...                     RA, DEC, CZ,
...                     RA, DEC, CZ,
...                     is_comoving_dist=True)
>>> for r in results:
...     print("{0:10.1f} ".format(r[0]), end="")
...     # doctest: +NORMALIZE_WHITESPACE
...     for pn in r[1]:
...         print("{0:10.3f} ".format(pn), end="")
...         # doctest: +NORMALIZE_WHITESPACE
...     print("") # doctest: +NORMALIZE_WHITESPACE
   1.0      0.999      0.001      0.000      0.000      0.000      0.000
   2.0      0.992      0.007      0.001      0.000      0.000      0.000
   3.0      0.982      0.009      0.005      0.002      0.001      0.000
   4.0      0.975      0.006      0.006      0.005      0.003      0.003
   5.0      0.971      0.004      0.003      0.003      0.004      0.003
   6.0      0.967      0.003      0.003      0.001      0.003      0.002
   7.0      0.962      0.004      0.002      0.003      0.002      0.001
   8.0      0.958      0.004      0.002      0.003      0.001      0.002
   9.0      0.953      0.003      0.003      0.002      0.003      0.001
  10.0      0.950      0.003      0.002      0.002      0.001      0.002
Corrfunc.mocks.DDsmu_mocks(autocorr, cosmology, nthreads, mu_max, nmu_bins, binfile, RA1, DEC1, CZ1, weights1=None, RA2=None, DEC2=None, CZ2=None, weights2=None, is_comoving_dist=False, verbose=False, output_savg=False, fast_divide_and_NR_steps=0, xbin_refine_factor=2, ybin_refine_factor=2, zbin_refine_factor=1, max_cells_per_dim=100, copy_particles=True, enable_min_sep_opt=True, c_api_timer=False, isa=u'fastest', weight_type=None)[source]

Calculate the 2-D pair-counts corresponding to the projected correlation function, \(\xi(s, \mu)\). The pairs are counted in bins of radial separation and cosine of angle to the line-of-sight (LOS). The input positions are expected to be on-sky co-ordinates. This module is suitable for calculating correlation functions for mock catalogs.

If weights are provided, the resulting pair counts are weighted. The weighting scheme depends on weight_type.

Returns a numpy structured array containing the pair counts for the specified bins.

Note

This module only returns pair counts and not the actual correlation function \(\xi(s, \mu)\). See the utilities Corrfunc.utils.convert_3d_counts_to_cf for computing \(\xi(s, \mu)\) from the pair counts.

New in version 2.1.0.

Parameters:
  • autocorr (boolean, required) – Boolean flag for auto/cross-correlation. If autocorr is set to 1, then the second set of particle positions are not required.
  • cosmology (integer, required) –

    Integer choice for setting cosmology. Valid values are 1->LasDamas cosmology and 2->Planck cosmology. If you need arbitrary cosmology, easiest way is to convert the CZ values into co-moving distance, based on your preferred cosmology. Set is_comoving_dist=True, to indicate that the co-moving distance conversion has already been done.

    Choices:
    1. LasDamas cosmology. \(\Omega_m=0.25\), \(\Omega_\Lambda=0.75\)
    2. Planck cosmology. \(\Omega_m=0.302\), \(\Omega_\Lambda=0.698\)

    To setup a new cosmology, add an entry to the function, init_cosmology in ROOT/utils/cosmology_params.c and re-install the entire package.

  • nthreads (integer) – The number of OpenMP threads to use. Has no effect if OpenMP was not enabled during library compilation.
  • mu_max (double. Must be in range [0.0, 1.0]) –

    A double-precision value for the maximum cosine of the angular separation from the line of sight (LOS). Here, mu is defined as the angle between s and l. If \(v_1\) and \(v_2\) represent the vectors to each point constituting the pair, then \(s := v_1 - v_2\) and \(l := 1/2 (v_1 + v_2)\).

    Note: Only pairs with \(0 <= \cos(\theta_{LOS}) < \mu_{max}\) are counted (no equality).

  • nmu_bins (int) – The number of linear mu bins, with the bins ranging from from (0, \(\mu_{max}\))
  • binfile (string or an list/array of floats) –

    For string input: filename specifying the s bins for DDsmu_mocks. The file should contain white-space separated values of (smin, smax) specifying each s bin wanted. The bins need to be contiguous and sorted in increasing order (smallest bins come first).

    For array-like input: A sequence of s values that provides the bin-edges. For example, np.logspace(np.log10(0.1), np.log10(10.0), 15) is a valid input specifying 14 (logarithmic) bins between 0.1 and 10.0. This array does not need to be sorted.

  • RA1 (array-like, real (float/double)) –

    The array of Right Ascensions for the first set of points. RA’s are expected to be in [0.0, 360.0], but the code will try to fix cases where the RA’s are in [-180, 180.0]. For peace of mind, always supply RA’s in [0.0, 360.0].

    Calculations are done in the precision of the supplied arrays.

  • DEC1 (array-like, real (float/double)) –

    Array of Declinations for the first set of points. DEC’s are expected to be in the [-90.0, 90.0], but the code will try to fix cases where the DEC’s are in [0.0, 180.0]. Again, for peace of mind, always supply DEC’s in [-90.0, 90.0].

    Must be of same precision type as RA1.

  • CZ1 (array-like, real (float/double)) –

    Array of (Speed Of Light * Redshift) values for the first set of points. Code will try to detect cases where redshifts have been passed and multiply the entire array with the speed of light.

    If is_comoving_dist is set, then CZ1 is interpreted as the co-moving distance, rather than cz.

  • weights1 (array_like, real (float/double), optional) – A scalar, or an array of weights of shape (n_weights, n_positions) or (n_positions,). weight_type specifies how these weights are used; results are returned in the weightavg field. If only one of weights1 or weights2 is specified, the other will be set to uniform weights.
  • RA2 (array-like, real (float/double)) –

    The array of Right Ascensions for the second set of points. RA’s are expected to be in [0.0, 360.0], but the code will try to fix cases where the RA’s are in [-180, 180.0]. For peace of mind, always supply RA’s in [0.0, 360.0].

    Must be of same precision type as RA1/DEC1/CZ1.

  • DEC2 (array-like, real (float/double)) –

    Array of Declinations for the second set of points. DEC’s are expected to be in the [-90.0, 90.0], but the code will try to fix cases where the DEC’s are in [0.0, 180.0]. Again, for peace of mind, always supply DEC’s in [-90.0, 90.0].

    Must be of same precision type as RA1/DEC1/CZ1.

  • CZ2 (array-like, real (float/double)) –

    Array of (Speed Of Light * Redshift) values for the second set of points. Code will try to detect cases where redshifts have been passed and multiply the entire array with the speed of light.

    If is_comoving_dist is set, then CZ2 is interpreted as the co-moving distance, rather than cz.

    Must be of same precision type as RA1/DEC1/CZ1.

  • weights2 (array-like, real (float/double), optional) – Same as weights1, but for the second set of positions
  • is_comoving_dist (boolean (default false)) – Boolean flag to indicate that cz values have already been converted into co-moving distances. This flag allows arbitrary cosmologies to be used in Corrfunc.
  • verbose (boolean (default false)) – Boolean flag to control output of informational messages
  • output_savg (boolean (default false)) – Boolean flag to output the average s for each bin. Code will run slower if you set this flag. Also, note, if you are calculating in single-precision, savg will suffer from numerical loss of precision and can not be trusted. If you need accurate savg values, then pass in double precision arrays for the particle positions.
  • fast_divide_and_NR_steps (integer (default 0)) – Replaces the division in AVX implementation with an approximate reciprocal, followed by fast_divide_and_NR_steps of Newton-Raphson. Can improve runtime by ~15-20% on older computers. Value of 0 uses the standard division operation.
  • (xyz)bin_refine_factor (integer, default is (2,2,1); typically within [1-3]) – Controls the refinement on the cell sizes. Can have up to a 20% impact on runtime.
  • max_cells_per_dim (integer, default is 100, typical values in [50-300]) – Controls the maximum number of cells per dimension. Total number of cells can be up to (max_cells_per_dim)^3. Only increase if rpmax is too small relative to the boxsize (and increasing helps the runtime).
  • copy_particles (boolean (default True)) –

    Boolean flag to make a copy of the particle positions If set to False, the particles will be re-ordered in-place

    New in version 2.3.0.

  • enable_min_sep_opt (boolean (default true)) –

    Boolean flag to allow optimizations based on min. separation between pairs of cells. Here to allow for comparison studies.

    New in version 2.3.0.

  • c_api_timer (boolean (default false)) – Boolean flag to measure actual time spent in the C libraries. Here to allow for benchmarking and scaling studies.
  • isa (string, case-insensitive (default fastest)) –

    Controls the runtime dispatch for the instruction set to use. Options are: [fastest, avx512f, avx, sse42, fallback]

    Setting isa to fastest will pick the fastest available instruction set on the current computer. However, if you set isa to, say, avx and avx is not available on the computer, then the code will revert to using fallback (even though sse42 might be available).

    Unless you are benchmarking the different instruction sets, you should always leave isa to the default value. And if you are benchmarking, then the string supplied here gets translated into an enum for the instruction set defined in utils/defs.h.

  • weight_type (string, optional (default None)) – The type of weighting to apply. One of [“pair_product”, None].
Returns:

  • results (Numpy structured array) – A numpy structured array containing [smin, smax, savg, mumax, npairs, weightavg]. There are a total of nmu_bins in mu for each separation bin specified in the binfile, with mumax being the upper limit of the mu bin. If output_savg is not set, then savg will be set to 0.0 for all bins; similarly for weightavg. npairs contains the number of pairs in that bin and can be used to compute the actual \(\xi(s, \mu)\) by combining with (DR, RR) counts.
  • api_time (float, optional) – Only returned if c_api_timer is set. api_time measures only the time spent within the C library and ignores all python overhead.