This documentation is for astroML version 0.2

Textbook Figures

This section makes available the source code used to generate every figure in the book Statistics, Data Mining, and Machine Learning in Astronomy. Many of the figures are fairly self-explanatory, though some will be harder to follow without the book as a reference. The table of contents of the book can be seen here (pdf).

Getting Started/Frequently Asked Questions

There is so much here: where to begin?

  1. Getting SDSS and other data, and quick analysis and plotting:
    • How do I access SDSS imaging data and plot various color-color diagrams? Chapter 1.
    • How do I access an SDSS spectrum and plot it? Chapter 1.
    • How do I plot data in pixelated sky projections? Chapter 1.
    • How can I visualize a four-dimensional data set and its intrinsic correlations? Chapter 1.
  2. Basic statistical tools:
    • How do I use Python to evaluate and plot various statistical distributions, such as Cauchy, Laplace, etc.? Chapter 3.
    • How do I robustly estimate location and scale parameters of a one-dimensional data set? Chapter 3.
    • How do I robustly estimate parameters of a two-dimensional Gaussian? Chapter 3.
    • How do I account for selection effects (e.g. luminosity functions)? Chapter 4.
    • How do I generate a simulated sample drawn from an arbitrary distribution? Chapter 4.
    • How do I choose the optimal bin width for a histogram? Do bins need to be the same size? Chapters 4 and 5.
    • How do I fit y(x) when y has non-Gaussian uncertainties? Chapter 8.
    • How do I fit y(x) when both x and y have non-negligible uncertainties? Chapter 8.
  3. Non-trivial data mining and other tools:
    • How do I run PCA on many SDSS spectra? Chapter 7.
    • How do I fit a multi-component Gaussian (or any other function) to my histogram? Chapter 5.
    • How do I decide if I have a “detection”? Chapters 4, 5, 8.
    • How do I fit a multi-component Gaussian (while accounting for errors) to my multi-dimensional data? Chapter 6.
    • How do I justify the use of, for example, a parabola instead of a straight line to fit my data? Chapter 5.
    • How do I use Markov Chain Monte Carlo to fit a complex function to my multi-dimensional data? Chapter 5.
    • How do I estimate the underlying density traced by a finite-size sample of points? Chapter 6.
    • How do I find clusters (over-densities, classes, features) in my data set? Chapters 6 and 9.
    • How do I estimate a light curve period (Lomb-Scargle)? Chapter 10.
    • How do I analyze a non-periodic light curve? Chapter 10.
    • How do I estimate the power spectrum for unevenly sampled data with large heteroscedastic uncertainties? Chapter 10.
    • How do I use detection times for individual photons to estimate the exponential decay time? Chapter 10.
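
As a taste of the kind of question listed above, here is a minimal sketch of one common approach to robustly estimating the location and scale of a one-dimensional data set: the median for location and a rescaled interquartile range for scale. It uses only NumPy and is not taken from the book's figure code; the function name `robust_location_scale`, the simulated data, and the outlier fraction are all assumptions made for illustration.

```python
import numpy as np


def robust_location_scale(x):
    """Return (median, sigma_G) for a 1-D sample (hypothetical helper).

    sigma_G rescales the interquartile range so that it matches the
    standard deviation when the data are Gaussian: the factor 0.7413
    is 1 / 1.349, where 1.349 is the interquartile range of a unit
    Gaussian.
    """
    x = np.asarray(x, dtype=float)
    q25, q50, q75 = np.percentile(x, [25, 50, 75])
    sigma_g = 0.7413 * (q75 - q25)
    return q50, sigma_g


# Simulated data (an assumption for this sketch): a Gaussian with
# mean 10 and standard deviation 2, with 1% of points replaced by
# gross outliers.
rng = np.random.default_rng(0)
data = rng.normal(loc=10.0, scale=2.0, size=10_000)
data[:100] = 1e3

median, sigma_g = robust_location_scale(data)
print("mean, std:       ", np.mean(data), np.std(data))
print("median, sigma_G: ", median, sigma_g)
```

The mean and standard deviation are pulled far from the underlying values by the outliers, while the median and the quartile-based width stay close to 10 and 2. Chapter 3 of the figure code is the place to look for the book's own treatment.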