AstroML: Machine Learning and Data Mining for Astronomy

AstroML is a Python module for machine learning and data mining built on numpy, scipy, scikit-learn, matplotlib, and astropy, and distributed under the 3-clause BSD license. It contains a growing library of statistical and machine learning routines for analyzing astronomical data in Python, loaders for several open astronomical datasets, and a large suite of examples of analyzing and visualizing astronomical datasets.

The goal of astroML is to provide a community repository for fast Python implementations of common tools and routines used for statistical data analysis in astronomy and astrophysics, and to provide a uniform and easy-to-use interface to freely available astronomical datasets. We hope this package will be useful to researchers and students of astronomy. If you have an example you’d like to share, we are happy to accept a contribution via a pull request to the project's GitHub repository.



The astroML project was started in 2012 to accompany the book Statistics, Data Mining, and Machine Learning in Astronomy by Zeljko Ivezic, Andrew Connolly, Jacob VanderPlas, and Alex Gray, published by Princeton University Press. The table of contents is available as a PDF, and the book can be previewed or purchased on Amazon.

Did you find a mistake or typo in the book? We maintain an up-to-date list of errata, which you can view on GitHub. If you find a mistake that is not yet noted on that page, please let us know by email or GitHub pull request!

Citing astroML

If you make use of any of these datasets, tools, or examples in a scientific publication, please consider citing astroML. You may reference the following paper:

  • Introduction to astroML: Machine learning for astrophysics, Vanderplas et al., Proc. of CIDU, pp. 47-54, 2012.

    Recipient of the best paper award for CIDU 2012

    Bibtex entry:

     author={{Vanderplas}, J.T. and {Connolly}, A.J.
             and {Ivezi{\'c}}, {\v Z}. and {Gray}, A.},
     booktitle={Conference on Intelligent Data Understanding (CIDU)},
     title={Introduction to astroML: Machine learning for astrophysics},
     pages={47--54},

You may also reference the accompanying textbook:

  • Statistics, Data Mining, and Machine Learning in Astronomy, Ivezic et al., 2014

    Bibtex entry:

     title={Statistics, Data Mining, and Machine Learning in Astronomy},
     author={{Ivezi{\'c}}, {\v Z}. and {Connolly}, A.J.
             and {Vanderplas}, J.T. and {Gray}, A.},
     publisher={Princeton University Press},
     location={Princeton, NJ},