11.4.9. astroML.datasets.fetch_sdss_sspp

astroML.datasets.fetch_sdss_sspp(data_home=None, download_if_missing=True, cleaned=False)

Loader for SDSS SEGUE Stellar Parameter Pipeline data

Parameters
data_home : optional, default=None

Specify another download and cache folder for the datasets. By default, all astroML data is stored in ‘~/astroML_data’ subfolders.

download_if_missing : bool, optional, default=True

If False, raise an IOError if the data is not locally available instead of trying to download it from the source site.

cleaned : bool, optional, default=False

If True, return a cleaned catalog from which objects with extreme values have been removed.

Returns
data : recarray, shape = (327260,)

Record array containing the pipeline parameters.
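
The `download_if_missing` parameter makes it possible to check for a local copy without touching the network. The following is a minimal sketch of that pattern, not part of astroML: `load_cached_or_none` is a hypothetical helper that takes the loader as an argument and relies only on the documented behaviour that an IOError is raised when the data is not on disk.

```python
def load_cached_or_none(fetch, **kwargs):
    """Return the dataset if it is already cached locally, else None.

    `fetch` is any loader that accepts `download_if_missing`, for example
    astroML.datasets.fetch_sdss_sspp. This helper is hypothetical and is
    not provided by astroML.
    """
    try:
        # Never download: rely on the documented IOError for missing data.
        return fetch(download_if_missing=False, **kwargs)
    except IOError:
        return None
```

Called as `load_cached_or_none(fetch_sdss_sspp, cleaned=True)`, this would return the cleaned catalog when it is already cached and `None` otherwise, leaving the decision to download to the caller.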

Notes

Here are the comments from the FITS file header:

Imaging data and spectrum identifiers for a sample of 327,260 stars with SDSS spectra, selected as:

  1. available SSPP parameters in SDSS Data Release 9 (SSPP rerun 122, file from Y.S. Lee)

  2. 14 < r < 21 (psf magnitudes, uncorrected for ISM extinction)

  3. 10 < u < 25 & 10 < z < 25 (same as above)

  4. errors in ugriz well measured (>0) and <10

  5. 0 < u-g < 3 (all color cuts based on psf mags, dereddened)

  6. -0.5 < g-r < 1.5 & -0.5 < r-i < 1.0 & -0.5 < i-z < 1.0

  7. -200 < pmL < 200 & -200 < pmB < 200 (proper motion in mas/yr)

  8. pmErr < 10 mas/yr (proper motion error)

  9. 1 < log(g) < 5

  10. TeffErr < 300 K

Teff and TeffErr are given in Kelvin, radVel and radVelErr in km/s. (ZI, Feb 2012, ivezic@astro.washington.edu)
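
Several of the cuts listed above translate directly into a boolean mask on a NumPy structured array. The sketch below applies cuts 2, 7, 8, 9 and 10 to a small synthetic array rather than to the real catalog. The field names `pmL`, `pmB`, `pmErr` and `TeffErr` are taken from the header comments; `rpsf` and `logg` are assumed names for the r-band psf magnitude and log(g), and all values are made up for illustration.

```python
import numpy as np

# Synthetic stand-in for the catalog. 'rpsf' and 'logg' are assumed field
# names; every value below is invented for illustration only.
dtype = [("rpsf", "f4"), ("pmL", "f4"), ("pmB", "f4"),
         ("pmErr", "f4"), ("logg", "f4"), ("TeffErr", "f4")]
stars = np.array([
    (17.2,   5.0, -3.0, 2.0, 4.3, 120.0),  # passes every cut
    (22.5,   1.0,  1.0, 1.0, 4.0, 100.0),  # fails cut 2: r too faint
    (16.0, 250.0,  0.0, 3.0, 4.5,  80.0),  # fails cut 7: pmL out of range
    (18.1,  -8.0, 12.0, 4.0, 0.5, 150.0),  # fails cut 9: log(g) too low
    (15.5,   0.0,  0.0, 1.5, 3.9, 450.0),  # fails cut 10: TeffErr too large
], dtype=dtype)


def selection_mask(cat):
    """Boolean mask implementing cuts 2, 7, 8, 9 and 10 from the header."""
    return (
        (cat["rpsf"] > 14) & (cat["rpsf"] < 21)                        # cut 2
        & (np.abs(cat["pmL"]) < 200) & (np.abs(cat["pmB"]) < 200)    # cut 7
        & (cat["pmErr"] < 10)                                         # cut 8
        & (cat["logg"] > 1) & (cat["logg"] < 5)                       # cut 9
        & (cat["TeffErr"] < 300)                                      # cut 10
    )


mask = selection_mask(stars)
selected = stars[mask]  # rows that satisfy all five cuts
```

Because the published catalog was already selected with these criteria, a mask of this kind should be true for every row of the real data, provided the assumed field names match the actual ones; it is mainly useful as a sanity check or as a template for tighter cuts.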

Examples

>>> from astroML.datasets import fetch_sdss_sspp
>>> data = fetch_sdss_sspp()  # doctest: +IGNORE_OUTPUT
>>> data.shape  # number of objects in dataset
(327260,)
>>> print(data.dtype.names[:5])  # names of the first five columns
('ra', 'dec', 'Ar', 'upsf', 'uErr')
>>> print(data['ra'][:1])  # first RA value
[49.6275024]
>>> print(data['dec'][:1])  # first DEC value
[-1.04175591]
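
The returned array is indexed by column name, so common manipulations can be tried without downloading the catalog. The sketch below builds a three-row toy array with the five column names shown above. Only the `ra` and `dec` values in the first row come from the example output; the remaining values are placeholders.

```python
import numpy as np

# Toy array with the first five column names of the catalog.
# Only the first row's ra/dec come from the doctest; the rest is made up.
dtype = [("ra", "f8"), ("dec", "f8"), ("Ar", "f8"),
         ("upsf", "f8"), ("uErr", "f8")]
toy = np.array([
    (49.6275024, -1.04175591, 0.10, 19.5, 0.03),
    (50.0,       -0.5,        0.20, 20.1, 0.05),
    (51.5,        0.25,       0.15, 18.7, 0.02),
], dtype=dtype)

# Access a single column by name, as in the doctest above.
first_ra = toy["ra"][0]

# Stack two columns into a plain (N, 2) float array, e.g. for plotting.
coords = np.column_stack([toy["ra"], toy["dec"]])

# A boolean mask built from one column selects whole rows.
bright = toy[toy["upsf"] < 20.0]
```

The same indexing applies to the array returned by `fetch_sdss_sspp()`, with `data` in place of `toy`.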