EGS master catalogue

Preparation of Legacy Survey data

The catalogue comes from dmu0_LegacySurvey.

In the catalogue, we keep:

  • The identifier (it's unique in the catalogue);
  • The position;
  • The stellarity;
  • The aperture fluxes. Are these aperture corrected?
  • The kron magnitude to be used as total magnitude (no “auto” magnitude is provided).

We don't know when the maps have been observed. We will use the year of the reference paper.

In [1]:
from herschelhelp_internal import git_version
print("This notebook was run with herschelhelp_internal version: \n{}".format(git_version()))
This notebook was run with herschelhelp_internal version: 
44f1ae0 (Thu Nov 30 18:27:54 2017 +0000)
In [2]:
%matplotlib inline
#%config InlineBackend.figure_format = 'svg'

import matplotlib.pyplot as plt
plt.rc('figure', figsize=(10, 6))
plt.style.use('ggplot')

from collections import OrderedDict
import os

from astropy import units as u
from astropy import visualization as vis
from astropy.coordinates import SkyCoord
from astropy.table import Column, Table
import numpy as np

from herschelhelp_internal.flagging import  gaia_flag_column
from herschelhelp_internal.masterlist import nb_astcor_diag_plot, nb_plot_mag_ap_evol, \
    nb_plot_mag_vs_apcor, remove_duplicates
from herschelhelp_internal.utils import astrometric_correction, mag_to_flux, aperture_correction, flux_to_mag
In [3]:
OUT_DIR =  os.environ.get('TMP_DIR', "./data_tmp")
try:
    os.makedirs(OUT_DIR)
except FileExistsError:
    pass

RA_COL = "legacy_ra"
DEC_COL = "legacy_dec"
In [4]:
# Pritine LS catalogue
orig_legacy = Table.read("../../dmu0/dmu0_LegacySurvey/data/LegacySurvey-dr4_EGS.fits")
WARNING: UnitsWarning: '1/deg^2' did not parse as fits unit: Numeric factor not supported by FITS [astropy.units.core]
WARNING: UnitsWarning: 'nanomaggy' did not parse as fits unit: At col 0, Unit 'nanomaggy' not supported by the FITS standard.  [astropy.units.core]
WARNING: UnitsWarning: '1/nanomaggy^2' did not parse as fits unit: Numeric factor not supported by FITS [astropy.units.core]
WARNING: UnitsWarning: '1/arcsec^2' did not parse as fits unit: Numeric factor not supported by FITS [astropy.units.core]

I - Aperture correction

To compute aperture correction we need to dertermine two parametres: the target aperture and the range of magnitudes for the stars that will be used to compute the correction.

Target aperture: To determine the target aperture, we simulate a curve of growth using the provided apertures and draw two figures:

  • The evolution of the magnitudes of the objects by plotting on the same plot aperture number vs the mean magnitude.
  • The mean gain (loss when negative) of magnitude is each aperture compared to the previous (except for the first of course).

As target aperture, we should use the smallest (i.e. less noisy) aperture for which most of the flux is captures.

Magnitude range: To know what limits in aperture to use when doing the aperture correction, we plot for each magnitude bin the correction that is computed and its RMS. We should then use the wide limits (to use more stars) where the correction is stable and with few dispersion.

In [5]:
bands = ["g", "r", "z"]
apertures      = [0,      1,   2,   3,    4,   5,   6,   7] 
aperture_sizes = [0.5, 0.75, 1.0, 1.5,  2.0, 3.5, 5.0, 7.0] #arcsec aperture sizes

flux = {}
flux_errors ={}
magnitudes = {}
flux_errors ={}
magnitude_errors = {}
stellarities = {}

flux_to_mag_vect = np.vectorize(flux_to_mag)

for band in bands:
    flux[band] = np.transpose(np.array( orig_legacy["apflux_{}".format(band)], dtype=np.float )) 
    flux_errors[band] = np.transpose(np.array( orig_legacy["apflux_ivar_{}".format(band)], dtype=np.float  ))
    
    magnitudes[band], magnitude_errors[band] = flux_to_mag_vect(flux[band] * 3.631e-6 ,flux_errors[band] * 3.631e-6)
    
    stellarities[band] = np.full(len(orig_legacy),0., dtype='float32')
    stellarities[band][np.array( orig_legacy["type"]) == "PSF" ] = 1.
    
    # Some sources have an infinite magnitude
    mask = np.isinf(magnitudes[band])
    magnitudes[band][mask] = np.nan
    magnitude_errors[band][mask] = np.nan
    

    
    
mag_corr = {}
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:76: RuntimeWarning: invalid value encountered in log10
  magnitudes = 2.5 * (23 - np.log10(fluxes)) - 48.6
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:76: RuntimeWarning: divide by zero encountered in log10
  magnitudes = 2.5 * (23 - np.log10(fluxes)) - 48.6
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:80: RuntimeWarning: invalid value encountered in double_scalars
  errors = 2.5 / np.log(10) * errors_on_fluxes / fluxes

I.a - g band

In [6]:
nb_plot_mag_ap_evol(magnitudes['g'], stellarities['g'], labels=apertures)

We will use aperture 5 as target.

In [7]:
nb_plot_mag_vs_apcor(magnitudes['g'][4], 
                     magnitudes['g'][5], 
                     stellarities['g'])

We will use magnitudes between 17.0 and 18.5

In [8]:
# Aperture correction
mag_corr['g'], num, std = aperture_correction(
    magnitudes['g'][4], magnitudes['g'][5], 
    stellarities['g'],
    mag_min=17.0, mag_max=18.5)
print("Aperture correction for g band:")
print("Correction: {}".format(mag_corr['g']))
print("Number of source used: {}".format(num))
print("RMS: {}".format(std))
Aperture correction for g band:
Correction: -0.09441636798258202
Number of source used: 1084
RMS: 0.01606272315484677

I.b - r band

In [9]:
nb_plot_mag_ap_evol(magnitudes['r'], stellarities['r'], labels=apertures)

We will use aperture 5 as target.

In [10]:
nb_plot_mag_vs_apcor(magnitudes['r'][4], 
                     magnitudes['r'][5], 
                     stellarities['r'])

We use magnitudes between 17.0 and 18.5.

In [11]:
# Aperture correction
mag_corr['r'], num, std = aperture_correction(
    magnitudes['r'][4], magnitudes['r'][5], 
    stellarities['r'],
    mag_min=17.0, mag_max=18.5)
print("Aperture correction for r band:")
print("Correction: {}".format(mag_corr['r']))
print("Number of source used: {}".format(num))
print("RMS: {}".format(std))
Aperture correction for r band:
Correction: -0.08614945231256144
Number of source used: 1513
RMS: 0.03438196144974404

I.c - z band

In [12]:
nb_plot_mag_ap_evol(magnitudes['z'], stellarities['z'], labels=apertures)

We will use aperture 4 as target.

In [13]:
nb_plot_mag_vs_apcor(magnitudes['z'][4], 
                     magnitudes['z'][4], 
                     stellarities['z'])

We use magnitudes between 16.0 and 17.5.

In [14]:
# Aperture correction
mag_corr['z'], num, std = aperture_correction(
    magnitudes['z'][4], magnitudes['z'][5], 
    stellarities['z'],
    mag_min=16.0, mag_max=17.5)
print("Aperture correction for z band:")
print("Correction: {}".format(mag_corr['z']))
print("Number of source used: {}".format(num))
print("RMS: {}".format(std))
Aperture correction for z band:
Correction: -0.033215372542812815
Number of source used: 1183
RMS: 0.012679290276947705

II - Stellarity

Legacy Survey does not provide a 0 to 1 stellarity so we replace items flagged as PSF accpording to the following table:

\begin{equation*} P(star) = \frac{ \prod_{i} P(star)_i }{ \prod_{i} P(star)_i + \prod_{i} P(galaxy)_i } \end{equation*}

where $i$ is the band, and with using the same probabilities as UKDISS:

HSC flag UKIDSS flag Meaning P(star) P(galaxy) P(noise) P(saturated)
-9 Saturated 0.0 0.0 5.0 95.0
-3 Probable galaxy 25.0 70.0 5.0 0.0
-2 Probable star 70.0 25.0 5.0 0.0
0 -1 Star 90.0 5.0 5.0 0.0
0 Noise 5.0 5.0 90.0 0.0
1 +1 Galaxy 5.0 90.0 5.0 0.0
In [15]:
stellarities['g'][np.isclose(stellarities['g'], 1.)] = 0.9
stellarities['g'][np.isclose(stellarities['g'], 0.)] = 0.05
In [16]:
orig_legacy.add_column(Column(data=stellarities['g'], name="stellarity")) #Stelarites computed earlier

II - Column selection

In [17]:
imported_columns = OrderedDict({
        "objid": "legacy_id",
        "ra": "legacy_ra",
        "dec": "legacy_dec",
        "flux_g": "f_bass_g",
        "flux_ivar_g": "ferr_bass_g",
        "apflux_g": "f_ap_bass_g",
        "apflux_ivar_g": "ferr_ap_bass_g",
        "flux_r": "f_bass_r",
        "flux_ivar_r": "ferr_bass_r",
        "apflux_r": "f_ap_bass_r",
        "apflux_ivar_r": "ferr_ap_bass_r",
        "flux_z": "f_bass_z",
        "flux_ivar_z": "ferr_bass_z",
        "apflux_z": "f_ap_bass_z",
        "apflux_ivar_z": "ferr_ap_bass_z",
        "stellarity": "legacy_stellarity"
    })


catalogue = orig_legacy[list(imported_columns)]
for column in imported_columns:
    catalogue[column].name = imported_columns[column]

epoch = 2017

# Clean table metadata
catalogue.meta = None
In [18]:
# Adding flux and band-flag columns
for col in catalogue.colnames:
    if col.startswith('f_'):
        
        errcol = "ferr{}".format(col[1:])
        
        #First we take aperture 4 if it is an aperture flux
        if 'ap' in col:
            catalogue[col] = catalogue[col][:, 4] 
            catalogue[errcol] = catalogue[errcol][:, 4] 
            
        #Convert nanomaggies to uJy
        # 1 nanomaggy = 1.e-9 maggy
        # 1 maggy = 3631 Jy
        # 1 nanomaggy = 3.631×10-6 Jy
        catalogue[col] = catalogue[col] * 3.631 #* 1.e9
        catalogue[errcol] = (1/np.sqrt(catalogue[errcol])) * 3.631 #* 1.e9
        catalogue[col].unit = u.microjansky
        catalogue[errcol].unit = u.microjansky
        
        mag, magerror = flux_to_mag(np.array(catalogue[col])* 1.e-6, np.array(catalogue[errcol])* 1.e-6)
        
        if 'ap' in col:
            mag += mag_corr[col[-1]]
            catalogue[col],catalogue[errcol] = mag_to_flux(mag,magerror)
        
        # Add magnitudes
        catalogue.add_column(Column(mag , name="m{}".format(col[1:])))
        catalogue.add_column(Column(magerror , name="m{}".format(errcol[1:])))
        
        
        # Band-flag column
        if 'ap' not in col:
            catalogue.add_column(Column(np.zeros(len(catalogue), dtype=bool), name="flag{}".format(col[1:])))
        
# TODO: Set to True the flag columns for fluxes that should not be used for SED fitting.
/opt/anaconda3/envs/herschelhelp_internal/lib/python3.6/site-packages/ipykernel/__main__.py:17: RuntimeWarning: divide by zero encountered in true_divide
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:76: RuntimeWarning: divide by zero encountered in log10
  magnitudes = 2.5 * (23 - np.log10(fluxes)) - 48.6
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:76: RuntimeWarning: invalid value encountered in log10
  magnitudes = 2.5 * (23 - np.log10(fluxes)) - 48.6
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:43: RuntimeWarning: invalid value encountered in multiply
  errors = np.log(10)/2.5 * fluxes * errors_on_magnitudes
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:80: RuntimeWarning: divide by zero encountered in true_divide
  errors = 2.5 / np.log(10) * errors_on_fluxes / fluxes
In [19]:
catalogue[:10].show_in_notebook()
Out[19]:
<Table length=10>
idxlegacy_idlegacy_ralegacy_decf_bass_gferr_bass_gf_ap_bass_gferr_ap_bass_gf_bass_rferr_bass_rf_ap_bass_rferr_ap_bass_rf_bass_zferr_bass_zf_ap_bass_zferr_ap_bass_zlegacy_stellaritym_bass_gmerr_bass_gflag_bass_gm_ap_bass_gmerr_ap_bass_gm_bass_rmerr_bass_rflag_bass_rm_ap_bass_rmerr_ap_bass_rm_bass_zmerr_bass_zflag_bass_zm_ap_bass_zmerr_ap_bass_z
degdeguJyuJyuJyuJyuJyuJy
04212.9778736951.62513077078.906530.2537655.70184e-069.72997e-0815.58390.7207771.09631e-052.96819e-0721.50061.350241.4819e-051.02269e-060.0521.52570.0309348False22.010.018527720.91830.0502168False21.30020.029395620.56890.0681844False20.97290.0749284
15212.97655293751.62753522974.775970.1498114.67149e-069.72992e-0812.26250.3853531.34196e-052.96821e-0726.82840.6424232.66382e-051.02269e-060.0522.20230.034057False22.22640.022614121.17860.0341197False21.08070.024014820.32850.0259986False20.33620.0416833
26212.97800880851.62698196122.949330.2361081.76984e-069.72997e-088.380540.622755.00338e-062.96819e-0719.87251.110981.07404e-051.02268e-060.0522.72570.0869184False23.28020.0596921.59180.08068False22.15180.0644120.65440.0606984False21.32250.103383
37212.88640744451.62526544720.5615840.1536554.44873e-071.21383e-072.065350.2848771.69335e-062.96819e-073.348160.4325053.13047e-061.02269e-060.924.52650.297068False24.77940.29624223.11250.149757False23.32810.19031422.5880.140252False22.6610.354697
48212.91921982251.62612622372.970490.2257271.91153e-069.73e-087.549030.6053925.72076e-062.9682e-0720.28091.123821.41014e-051.02269e-060.0522.71790.082505False23.19650.055265821.70530.0870702False22.00640.056333120.63230.0601638False21.02680.0787417
59212.91983246751.62417630831.116480.1714151.10276e-061.14914e-070.6818840.356651.06306e-062.96819e-072.198540.608134.13827e-061.02269e-060.0523.78040.166695False23.79380.1131424.31570.567879False23.83360.3031523.04470.300322False22.3580.268318
610212.90859896751.62482588230.2714940.1523315.24418e-071.06499e-070.5631830.2790485.6112e-072.9682e-075.059140.4329563.96118e-061.02268e-060.925.31560.609189False24.60080.22049124.52340.537964False24.52740.5743322.13980.0929161False22.40540.280311
712213.0005263151.62410597122.298320.1379762.38101e-061.00536e-073.210710.2940192.91559e-062.9682e-075.208670.409292.93118e-061.02269e-060.922.99650.0651804False22.95810.045844122.63350.0994257False22.73820.11053322.10820.0853156False22.73240.378814
813213.01503824651.62396391160.9502610.1485431.27126e-069.90667e-080.5016660.3608459.52239e-072.96821e-07-0.8654080.565681.52094e-061.02269e-060.0523.95540.169721False23.63940.08460924.6490.780963False23.95310.338433nan-0.709699False23.44470.730058
914213.014040251.6243687550.7337510.1258469.01264e-079.78857e-081.349690.285911.84723e-062.96818e-073.136650.392973.13573e-061.02269e-060.924.23610.186215False24.01290.11792123.57440.229995False23.23370.1744622.65880.136025False22.65920.354103

III - Removal of duplicated sources

We remove duplicated objects from the input catalogues.

In [20]:
SORT_COLS = [
        'merr_ap_bass_g', 'merr_ap_bass_r', 'merr_ap_bass_z']
FLAG_NAME = 'legacy_flag_cleaned'

nb_orig_sources = len(catalogue)

catalogue = remove_duplicates(
    catalogue, RA_COL, DEC_COL, 
    sort_col= SORT_COLS,
    flag_name=FLAG_NAME)

nb_sources = len(catalogue)

print("The initial catalogue had {} sources.".format(nb_orig_sources))
print("The cleaned catalogue has {} sources ({} removed).".format(nb_sources, nb_orig_sources - nb_sources))
print("The cleaned catalogue has {} sources flagged as having been cleaned".format(np.sum(catalogue[FLAG_NAME])))
The initial catalogue had 166758 sources.
The cleaned catalogue has 163539 sources (3219 removed).
The cleaned catalogue has 3171 sources flagged as having been cleaned

III - Astrometry correction

We match the astrometry to the Gaia one. We limit the Gaia catalogue to sources with a g band flux between the 30th and the 70th percentile. Some quick tests show that this give the lower dispersion in the results.

In [21]:
gaia = Table.read("../../dmu0/dmu0_GAIA/data/GAIA_EGS.fits")
gaia_coords = SkyCoord(gaia['ra'], gaia['dec'])
In [22]:
nb_astcor_diag_plot(catalogue[RA_COL], catalogue[DEC_COL], 
                    gaia_coords.ra, gaia_coords.dec)
In [23]:
delta_ra, delta_dec =  astrometric_correction(
    SkyCoord(catalogue[RA_COL], catalogue[DEC_COL]),
    gaia_coords
)

print("RA correction: {}".format(delta_ra))
print("Dec correction: {}".format(delta_dec))
RA correction: -0.003632177259760283 arcsec
Dec correction: 0.004341531230522833 arcsec
In [24]:
catalogue[RA_COL].unit = u.deg
catalogue[DEC_COL].unit = u.deg
catalogue[RA_COL] = catalogue[RA_COL] +  delta_ra.to(u.deg)
catalogue[DEC_COL] = catalogue[DEC_COL] +  delta_dec.to(u.deg)
In [25]:
nb_astcor_diag_plot(catalogue[RA_COL], catalogue[DEC_COL], 
                    gaia_coords.ra, gaia_coords.dec)

IV - Flagging Gaia objects

In [26]:
catalogue.add_column(
    gaia_flag_column(SkyCoord(catalogue[RA_COL], catalogue[DEC_COL]), epoch, gaia)
)
In [27]:
GAIA_FLAG_NAME = "legacy_flag_gaia"

catalogue['flag_gaia'].name = GAIA_FLAG_NAME
print("{} sources flagged.".format(np.sum(catalogue[GAIA_FLAG_NAME] > 0)))
9145 sources flagged.

V - Saving to disk

In [28]:
catalogue.write("{}/LegacySurvey.fits".format(OUT_DIR), overwrite=True)