HOLISMOKES. XI. Evaluation of supervised neural networks for strong-lens searches in ground-based imaging surveys

While supervised neural networks have become the state of the art for identifying rare strong gravitational lenses in large imaging data sets, their selection remains significantly affected by the large number and diversity of nonlens contaminants. This work systematically evaluates and compares the performance of neural networks in order to move towards a rapid selection of galaxy-scale strong lenses with minimal human input in the era of deep, wide-scale surveys. We used multiband images from PDR2 of the Hyper Suprime-Cam (HSC) Wide survey to build test sets mimicking an actual classification experiment, with 189 strong lenses previously found over the HSC footprint and 70,910 nonlens galaxies in COSMOS covering representative lens-like morphologies. Multiple networks were trained on different sets of realistic strong-lens simulations and nonlens galaxies, with various architectures and data pre-processing, mainly using the deepest gri bands. Most networks reached excellent area under the receiver operating characteristic (ROC) curve on the test set of 71,099 objects, and we determined the ingredients that optimize the true positive rate for a total number of false positives equal to 0 or 10 (TPR0 and TPR10).
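
As an illustration of the TPR0 and TPR10 metrics, the following minimal sketch computes the true positive rate at a fixed number of false positives from network scores. The function name `tpr_at_fp`, the strict-threshold convention, and the toy scores are assumptions made for this example and are not taken from the analysis itself.

```python
import numpy as np

def tpr_at_fp(scores, labels, max_fp):
    """True positive rate when at most `max_fp` nonlenses are selected.

    Assumed convention: objects are selected when their score lies strictly
    above the (max_fp + 1)-th highest nonlens score, so that no more than
    `max_fp` nonlenses pass the cut.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)      # True for lenses
    nonlens = np.sort(scores[~labels])[::-1]     # nonlens scores, descending
    lens = scores[labels]
    if max_fp >= nonlens.size:
        return 1.0                               # every object can be selected
    threshold = nonlens[max_fp]
    return float(np.mean(lens > threshold))

# Toy example with three lenses and three nonlenses (hypothetical scores).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
tpr0 = tpr_at_fp(scores, labels, max_fp=0)
tpr1 = tpr_at_fp(scores, labels, max_fp=1)
```

On the test set described above, the same function would be called with `max_fp=0` and `max_fp=10` on the 71,099 scores returned by a network.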


The overall performance strongly depends on the construction of the ground-truth training data, and it typically, but not systematically, improves with our baseline residual network architecture. TPR0 tends to be higher for ResNets (≃ 10–40%) than for AlexNet-like networks or G-CNNs. Improvements are found when applying random shifts to the image centroids and square-root stretches to the pixel values, adding the z band, or using random viewpoints of the original images, but not when adding g − αi difference images (where α is a tuned constant) to subtract emission from the central galaxy. The most significant gain is obtained with committees of networks trained on different data sets and showing a moderate overlap between their populations of false positives. Nearly perfect invariance to image quality can be achieved by using realistic PSF models in our lens simulation pipeline, and by training networks either with a large number of bands or jointly with the PSF and science frames. Overall, we show that it is possible to reach a TPR0 as high as 60% for the test sets under consideration, which opens promising prospects for the pure selection of strong lenses without human input using the Rubin Observatory and other forthcoming ground-based surveys.
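
The pre-processing steps and the committee approach mentioned above can be sketched as follows. This is only an illustrative outline: the helper names, the clipping of negative pixels before the square root, the wrap-around shift via `np.roll`, and the plain averaging of scores are all assumptions of this example rather than the exact choices made in the analysis.

```python
import numpy as np

def sqrt_stretch(image):
    # Assumption: negative (noise-dominated) pixels are clipped to zero
    # before taking the square root.
    return np.sqrt(np.clip(image, 0.0, None))

def random_shift(image, max_shift, rng):
    # `image` has shape (bands, ny, nx); the same offset is applied to all
    # bands. np.roll wraps pixels around the edges, a simplification that
    # stands in for re-cutting the stamp at a shifted centroid.
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(image, shift=(int(dy), int(dx)), axis=(-2, -1))

def committee_score(score_list):
    # Assumption: the committee score is the unweighted mean of the scores
    # predicted by the individual networks for each object.
    return np.mean(np.stack(score_list, axis=0), axis=0)

rng = np.random.default_rng(0)
stamp = rng.normal(size=(3, 60, 60))             # mock gri cutout
augmented = sqrt_stretch(random_shift(stamp, max_shift=5, rng=rng))
combined = committee_score([[0.2, 0.8], [0.4, 0.6]])
```

Averaging is only one way to combine networks; the benefit reported above relies on the members producing partly different sets of false positives.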