11.4.10. astroML.datasets.fetch_sdss_specgals

astroML.datasets.fetch_sdss_specgals(data_home=None, download_if_missing=True)[source]

Loader for SDSS Galaxies with spectral information

Parameters
data_homeoptional, default=None

Specify another download and cache folder for the datasets. By default all scikit learn data is stored in ‘~/astroML_data’ subfolders.

download_if_missingoptional, default=True

If False, raise a IOError if the data is not locally available instead of trying to download the data from the source site.

Returns
datarecarray, shape = (327260,)

record array containing pipeline parameters

Notes

These were compiled from the SDSS database using the following SQL query:

SELECT
  G.ra, G.dec, S.mjd, S.plate, S.fiberID, --- basic identifiers
  --- basic spectral data
  S.z, S.zErr, S.rChi2, S.velDisp, S.velDispErr,
  --- some useful imaging parameters
  G.extinction_r, G.petroMag_r, G.psfMag_r, G.psfMagErr_r,
  G.modelMag_u, modelMagErr_u, G.modelMag_g, modelMagErr_g,
  G.modelMag_r, modelMagErr_r, G.modelMag_i, modelMagErr_i,
  G.modelMag_z, modelMagErr_z, G.petroR50_r, G.petroR90_r,
  --- line fluxes for BPT diagram and other derived spec. parameters
  GSL.nii_6584_flux, GSL.nii_6584_flux_err, GSL.h_alpha_flux,
  GSL.h_alpha_flux_err, GSL.oiii_5007_flux, GSL.oiii_5007_flux_err,
  GSL.h_beta_flux, GSL.h_beta_flux_err, GSL.h_delta_flux,
  GSL.h_delta_flux_err, GSX.d4000, GSX.d4000_err, GSE.bptclass,
  GSE.lgm_tot_p50, GSE.sfr_tot_p50, G.objID, GSI.specObjID
INTO mydb.SDSSspecgalsDR8 FROM SpecObj S CROSS APPLY
  dbo.fGetNearestObjEQ(S.ra, S.dec, 0.06) N, Galaxy G,
  GalSpecInfo GSI, GalSpecLine GSL, GalSpecIndx GSX, GalSpecExtra GSE
WHERE N.objID = G.objID
  AND GSI.specObjID = S.specObjID
  AND GSL.specObjID = S.specObjID
  AND GSX.specObjID = S.specObjID
  AND GSE.specObjID = S.specObjID
  --- add some quality cuts to get rid of obviously bad measurements
  AND (G.petroMag_r > 10 AND G.petroMag_r < 18)
  AND (G.modelMag_u-G.modelMag_r) > 0
  AND (G.modelMag_u-G.modelMag_r) < 6
  AND (modelMag_u > 10 AND modelMag_u < 25)
  AND (modelMag_g > 10 AND modelMag_g < 25)
  AND (modelMag_r > 10 AND modelMag_r < 25)
  AND (modelMag_i > 10 AND modelMag_i < 25)
  AND (modelMag_z > 10 AND modelMag_z < 25)
  AND S.rChi2 < 2
  AND (S.zErr > 0 AND S.zErr < 0.01)
  AND S.z > 0.02
  --- end of query ---

Examples

>>> from astroML.datasets import fetch_sdss_specgals
>>> data = fetch_sdss_specgals()  # doctest: +IGNORE_OUTPUT
>>> data.shape  # number of objects in dataset
(661598,)
>>> data.dtype.names[:5]  # first five column names
('ra', 'dec', 'mjd', 'plate', 'fiberID')
>>> print(data['ra'][:3])  # first three RA values
[ 146.71419105  146.74414186  146.62857334]
>>> print(data['dec'][:3])  #  first three declination values
[-1.04127639 -0.6522198  -0.7651468 ]