scikit-learn 0.12 was released September 2012. Find out more at

astroML 0.1 was released October 2012. Find out more at

Video Links

PyData 2012: 75-minute version of this tutorial

SciPy 2012: 3-hour version of this tutorial

PyData NYC 2012: 45-minute version of this tutorial


All material is open source, released under the BSD license (3 clause).



Giving credit

Please consider citing scikit-learn if you use it.

Tutorial: Machine Learning for Astronomy with Scikit-learn


For more information on machine learning for astronomy, see the astroML code and examples.

Machine Learning for Astronomy with scikit-learn

This tutorial offers a brief introduction to the fields of machine learning and statistical data analysis, and their application to several problems in the field of astronomy. These learning tasks are enabled by the tools available in the open-source package scikit-learn.

scikit-learn is a Python module integrating classic machine learning algorithms in the tightly-knit world of scientific Python packages (numpy, scipy, matplotlib). It aims to provide simple and efficient solutions to learning problems that are accessible to everybody and reusable in various contexts: machine-learning as a versatile tool for science and engineering.
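As a rough illustration of what "simple and efficient" looks like in practice, the sketch below fits a nearest-neighbors classifier to synthetic two-dimensional data. The data, the variable names, and the choice of `KNeighborsClassifier` are assumptions made for this example only and are not taken from the tutorial; the calls follow the fit/predict pattern of recent scikit-learn releases and may differ in detail from version 0.11.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data (an assumption for illustration): two well-separated
# Gaussian blobs in 2D, 50 points each, centered at (0, 0) and (3, 3).
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 2)),
    rng.normal(3.0, 0.5, size=(50, 2)),
])
y = np.array([0] * 50 + [1] * 50)

# Fit a k-nearest-neighbors classifier on the labeled points.
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X, y)

# Predict labels for one new point near each blob center.
X_new = np.array([[0.0, 0.0], [3.0, 3.0]])
pred = clf.predict(X_new)
print(pred)
```

Most scikit-learn estimators expose the same `fit` and `predict` methods, so a different classifier can usually be swapped in by changing only the line that constructs `clf`.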

Many of the examples and exercises in this tutorial require the IPython notebook, a tool that provides an intuitive web-based interactive environment for scientific Python. Some of the notebook material is duplicated in the following pages, but the IPython notebook is required for some parts. For information on how to download the associated notebooks, see the Tutorial Setup and Installation page.


This document is meant to be used with scikit-learn version 0.11+. Find the latest version here.