Abstract
Space objects can be characterised using non-resolved images obtained by ground-based telescopes. However, detecting space objects in these images is a laborious task. Deep image recognition technology can be leveraged to detect space objects in telescope images with accuracies over 90%. The proposed solution leverages the Feature Pyramid Network (FPN), a convolutional neural network architecture, for semantic segmentation. The backbone that extracts features from the images is EfficientNet-B7 pre-trained on ImageNet. A simple preprocessing step is applied to overexposed images to scale the input pixel values. Training has two phases: in the first phase, models are trained on the training data with a 5-fold cross-validation split; in the second phase, training uses the complete dataset, including the test data, by means of pseudo-labelling (masks for the test data are generated by the best model from the first phase). A custom post-processing method based on vector mathematics and the Hungarian algorithm is developed to remove false positives and to reduce false negatives by discovering missing objects. An ensemble approach and denoising of the training labels (all objects should be labelled even if they are detected only once in the sequence, and all objects should be removed if there is no associated signal in the image) are not utilised in the solution; these will be investigated in future work. In addition, the post-processing method based on vector mathematics performs well (it discovers all structures accurately as long as they are detected in two frames) for all but some crowded scenes where objects overlap with each other in different frames; these corner cases will be addressed by improving the technique towards a more generalisable approach. The following paragraphs provide details about data preprocessing and model building. Analysis and conclusions are currently under investigation.
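The abstract does not spell out how the vector mathematics and the Hungarian algorithm are combined, so the following is only a minimal sketch of one plausible reading. It assumes a Euclidean distance cost between detections in two consecutive frames and a constant velocity between frames; the function names `match_detections` and `extrapolate` are hypothetical. To stay dependency-free, the optimal assignment is found by brute force over permutations, which gives the same optimum as the Hungarian algorithm for a handful of objects but does not scale; a library routine such as `scipy.optimize.linear_sum_assignment` solves the same problem in polynomial time.

```python
from itertools import permutations
from math import hypot


def match_detections(frame_a, frame_b):
    """Return index pairs (i, j) that minimise the total distance.

    Brute-force stand-in for the Hungarian algorithm (assumption: only a
    few objects per frame, and len(frame_a) <= len(frame_b)).
    """
    best_cost, best_pairs = float("inf"), []
    for perm in permutations(range(len(frame_b)), len(frame_a)):
        cost = sum(
            hypot(frame_a[i][0] - frame_b[j][0], frame_a[i][1] - frame_b[j][1])
            for i, j in enumerate(perm)
        )
        if cost < best_cost:
            best_cost, best_pairs = cost, list(enumerate(perm))
    return best_pairs


def extrapolate(p_a, p_b, steps=1):
    """Predict the position `steps` frames after p_b.

    Assumes constant velocity, i.e. the displacement vector p_b - p_a
    repeats in every later frame.
    """
    vx, vy = p_b[0] - p_a[0], p_b[1] - p_a[1]
    return (p_b[0] + steps * vx, p_b[1] + steps * vy)


# Two objects detected in frames 1 and 2 (listed in a different order).
frame_1 = [(10.0, 10.0), (50.0, 20.0)]
frame_2 = [(53.0, 21.0), (12.0, 13.0)]

pairs = match_detections(frame_1, frame_2)
# Where each object would be expected in frame 3 if it kept moving.
predicted_frame_3 = [extrapolate(frame_1[i], frame_2[j]) for i, j in pairs]
```

Under these assumptions, a predicted position with no nearby detection in the third frame would be a candidate missing object, and a detection that fits no trajectory would be a candidate false positive.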
The labels for space objects do not account for the fact that some labelled objects have no corresponding signal in the image due to cloud cover, atmospheric/weather effects, light pollution, sensor noise/defects, and star occlusions. Since the proposed approach is a data-driven model, such noisy labels should be corrected. However, robust label denoising requires manual effort. Instead, the author used a mask generation approach that places each labelled point in a 4 by 4 bounding box, normalises the pixel values within the box, thresholds them at 0.5, and finally dilates the point with a kernel of 1, keeping a 2 by 2 window only. This approach is intended to generate a mask for the region of interest that is noticeable in the close vicinity of the object. The proposed mask generation is used in both phases of training.
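The normalise-and-threshold part of the mask generation can be sketched as follows. This is an illustration under stated assumptions rather than the author's implementation: the box is assumed to span two pixels before and two pixels after the labelled point along each axis, normalisation is assumed to be min-max scaling, and the function name `threshold_patch` is hypothetical. The final step involving the kernel of 1 and the 2 by 2 window is left out because the text does not specify it precisely enough to reproduce.

```python
def threshold_patch(image, row, col, box=4, threshold=0.5):
    """Binarise the box around a labelled point (row, col).

    `image` is a list of rows of pixel values. Returns the binary patch
    and the (row, col) of its top-left corner in the full image.
    Assumptions: min-max normalisation, and a box covering
    [row - box // 2, row + box // 2), clipped at the image border.
    """
    height, width = len(image), len(image[0])
    half = box // 2
    r0, r1 = max(row - half, 0), min(row + half, height)
    c0, c1 = max(col - half, 0), min(col + half, width)

    patch = [image[r][c0:c1] for r in range(r0, r1)]
    lo = min(min(line) for line in patch)
    hi = max(max(line) for line in patch)
    span = (hi - lo) or 1.0  # a flat patch yields an empty mask

    binary = [
        [1 if (value - lo) / span > threshold else 0 for value in line]
        for line in patch
    ]
    return binary, (r0, c0)


# A 6 x 6 background of 10 with a small bright object near (3, 3).
image = [[10] * 6 for _ in range(6)]
image[2][2], image[2][3] = 110, 90
image[3][2], image[3][3] = 50, 70

patch, origin = threshold_patch(image, row=3, col=3)
```

Because the normalisation is local to the box, a faint object is thresholded relative to its own neighbourhood rather than to the brightest star in the frame, which appears to be the intent of restricting the mask to the close vicinity of the label.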
Feature pyramid networks are convolutional neural networks that are used for object detection at different scales. Feature pyramid networks are top-down architectures with lateral connections to generate high-level semantic feature maps. Geo-FPN leverages the EfficientNet-B7 architecture for the bottom-up pyramid on the left as the "backbone" to extract features. EfficientNets are a family of convolutional neural networks that are scaled efficiently in depth, width and resolution using a compound coefficient for increasing the accuracy of models while making them faster and smaller.
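The compound scaling idea can be made concrete with a few lines of arithmetic. The sketch below uses the base multipliers reported in the EfficientNet paper (1.2 for depth, 1.1 for width, 1.15 for resolution), raised to the power of a compound coefficient `phi`; the function names are hypothetical, and the released model configurations do not necessarily match these powers exactly, so the numbers indicate the trend only.

```python
# Base multipliers reported in the EfficientNet paper (found by grid search
# on the smallest model); treated here as given constants.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution


def compound_scale(phi):
    """Return depth, width and resolution multipliers for coefficient phi."""
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }


def flops_factor(phi):
    """Approximate growth in compute relative to phi = 0.

    Compute grows linearly with depth and quadratically with width and
    resolution, so each unit of phi multiplies it by roughly 2.
    """
    scale = compound_scale(phi)
    return scale["depth"] * scale["width"] ** 2 * scale["resolution"] ** 2


baseline = compound_scale(0)
one_step = flops_factor(1)
```

The single coefficient keeps the three dimensions in balance, so a larger backbone becomes deeper, wider and higher-resolution together instead of growing along one axis only.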