This documentation is for astroML version 0.2

This page

Links

astroML Mailing List

GitHub Issue Tracker

Videos

Scipy 2012 (15 minute talk)

Scipy 2013 (20 minute talk)

Citing

If you use the software, please consider citing astroML.

6. Two-point Correlation FunctionsΒΆ

The N-point correlation function is a common technique used in astronomy to extract useful information from multi-dimensional datasets. This is especially true for spatial distributions of galaxies, because the theoretical description of the process of galaxy formation directly predicts the two-point correlation function.

The computation of the two-point correlation is a generalized N-body problem similar to Nearest Neighbor searches and Kernel Density estimation (see Unsupervised Learning: Density Estimation)

AstroML implements a fast correlation function estimator based on the scikit-learn BallTree and KDTree data structures. Here is an example of computing the correlation function (with bootstrap estimates) over the SDSS spectroscopic galaxies within the redshift range 0.08 < z < 0.12:

../_images/fig_correlation_function_11.png

The correlation function interface is very straightforward. We’ll construct some random data in two dimensions and compute the two-point correlation here:

>>> from astroML.correlation import two_point
>>> np.random.seed(0)
>>> X = np.random.random((5000, 2))
>>> bins = np.linspace(0, 1, 20)
>>> corr = two_point(X, bins)
>>> np.allclose(corr, 0, atol=0.02)
True

For uniformly-distributed data, the correlation function is zero (that is, there is no excess over a uniformly-distributed background). If we wish to find the error on the result, this can be done via a bootstrap approach:

>>> from astroML.correlation import bootstrap_two_point
>>> corr, dcorr = bootstrap_two_point(X, bins, Nbootstrap=5)
>>> np.allclose(corr, 0, atol=2 * dcorr)
True

These are contrived examples, but they show how easy the computation can be. For a more realistic example, see the source code associated with the above figure.