Lund University Publications

Using Multivariate Imputation by Chained Equations to Predict Redshifts of Active Galactic Nuclei

Gibson, Spencer James ; Narendra, Aditya ; Dainotti, Maria Giovanna ; Bogdan, Malgorzata LU ; Pollo, Agnieszka ; Poliszczuk, Artem ; Rinaldi, Enrico and Liodakis, Ioannis (2022) In Frontiers in Astronomy and Space Sciences 9.
Abstract

Redshift measurement of active galactic nuclei (AGNs) remains a time-consuming and challenging task, as it requires follow-up spectroscopic observations and detailed analysis. Hence, there exists an urgent requirement for alternative redshift estimation techniques. The use of machine learning (ML) for this purpose has been growing over the last few years, primarily due to the availability of large-scale galactic surveys. However, due to observational errors, a significant fraction of these data sets often have missing entries, rendering that fraction unusable for ML regression applications. In this study, we demonstrate the performance of an imputation technique called Multivariate Imputation by Chained Equations (MICE), which rectifies the issue of missing data entries by imputing them using the available information in the catalog. We use the Fermi-LAT Fourth Data Release Catalog (4LAC) and impute 24% of the catalog. Subsequently, we follow the methodology described in Dainotti et al. (ApJ, 2021, 920, 118) and create an ML model for estimating the redshift of 4LAC AGNs. We present results that highlight the positive impact of the MICE imputation technique on the machine learning models' performance and on the obtained redshift estimation accuracy.
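
The chained-equations idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a deterministic simplification (full MICE draws imputed values stochastically and usually produces several completed data sets), the function names `solve`, `fit_linear` and `chained_imputation` are hypothetical, and the toy table is made up rather than taken from 4LAC. Missing entries are marked with `None`, initialised with column means, and then re-predicted column by column from a linear regression on the other columns.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def others(row, j):
    """All entries of a row except column j (the predictors for column j)."""
    return [v for c, v in enumerate(row) if c != j]


def chained_imputation(data, n_iter=10):
    """Fill None entries by cycling per-column regressions on the other columns."""
    n_cols = len(data[0])
    missing = [[v is None for v in row] for row in data]
    filled = [list(row) for row in data]  # work on a copy; the input is untouched
    # Step 1: initialise every missing entry with the mean of its column.
    for j in range(n_cols):
        observed = [row[j] for row in data if row[j] is not None]
        mean = sum(observed) / len(observed)
        for i, row in enumerate(filled):
            if missing[i][j]:
                row[j] = mean
    # Step 2: repeatedly re-predict each incomplete column from the others,
    # fitting only on rows where that column was actually observed.
    for _ in range(n_iter):
        for j in range(n_cols):
            if not any(flags[j] for flags in missing):
                continue
            train = [r for i, r in enumerate(filled) if not missing[i][j]]
            coef = fit_linear([others(r, j) for r in train], [r[j] for r in train])
            for i, row in enumerate(filled):
                if missing[i][j]:
                    preds = others(row, j)
                    row[j] = coef[0] + sum(c * v for c, v in zip(coef[1:], preds))
    return filled


# Toy table (made up): the second column equals 2 * first + 1 where observed.
toy = [[1.0, 3.0], [2.0, 5.0], [3.0, None], [4.0, 9.0], [None, 11.0], [6.0, 13.0]]
completed = chained_imputation(toy)
```

Because the observed toy rows lie exactly on one line, the two imputed entries should settle close to the values that line implies (7 and 5). Real catalog data are noisy, which is why practical MICE implementations add randomness to each draw and pool results over several imputed data sets.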

author
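Gibson, Spencer James ; Narendra, Aditya ; Dainotti, Maria Giovanna ; Bogdan, Malgorzata ; Pollo, Agnieszka ; Poliszczuk, Artem ; Rinaldi, Enrico and Liodakis, Ioannis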
type
Contribution to journal
publication status
published
keywords
AGNs, BLLs, FERMI 4LAC, FSRQs, imputation, machine learning regressors, MICE, redshift
in
Frontiers in Astronomy and Space Sciences
volume
9
article number
836215
publisher
Frontiers Media S. A.
external identifiers
  • scopus:85127149465
ISSN
2296-987X
DOI
10.3389/fspas.2022.836215
language
English
LU publication?
yes
id
e45f602e-3346-48d2-8a8f-a7b6f781951c
date added to LUP
2022-06-02 11:20:09
date last changed
2022-11-18 18:59:27
@article{e45f602e-3346-48d2-8a8f-a7b6f781951c,
  abstract     = {{Redshift measurement of active galactic nuclei (AGNs) remains a time-consuming and challenging task, as it requires follow-up spectroscopic observations and detailed analysis. Hence, there exists an urgent requirement for alternative redshift estimation techniques. The use of machine learning (ML) for this purpose has been growing over the last few years, primarily due to the availability of large-scale galactic surveys. However, due to observational errors, a significant fraction of these data sets often have missing entries, rendering that fraction unusable for ML regression applications. In this study, we demonstrate the performance of an imputation technique called Multivariate Imputation by Chained Equations (MICE), which rectifies the issue of missing data entries by imputing them using the available information in the catalog. We use the Fermi-LAT Fourth Data Release Catalog (4LAC) and impute 24\% of the catalog. Subsequently, we follow the methodology described in Dainotti et al. (ApJ, 2021, 920, 118) and create an ML model for estimating the redshift of 4LAC AGNs. We present results that highlight the positive impact of the MICE imputation technique on the machine learning models' performance and on the obtained redshift estimation accuracy.}},
  author       = {{Gibson, Spencer James and Narendra, Aditya and Dainotti, Maria Giovanna and Bogdan, Malgorzata and Pollo, Agnieszka and Poliszczuk, Artem and Rinaldi, Enrico and Liodakis, Ioannis}},
  issn         = {{2296-987X}},
  keywords     = {{AGNs; BLLs; FERMI 4LAC; FSRQs; imputation; machine learning regressors; MICE; redshift}},
  language     = {{eng}},
  publisher    = {{Frontiers Media S. A.}},
  journal      = {{Frontiers in Astronomy and Space Sciences}},
  title        = {{Using Multivariate Imputation by Chained Equations to Predict Redshifts of Active Galactic Nuclei}},
  url          = {{http://dx.doi.org/10.3389/fspas.2022.836215}},
  doi          = {{10.3389/fspas.2022.836215}},
  volume       = {{9}},
  year         = {{2022}},
}