This Jupyter notebook provides a method for detecting outliers in astronomical images using deep learning. It uses pre-trained EfficientNet models and a nearest-neighbors algorithm to identify images that deviate significantly from the majority.
- Ensure you have the required libraries installed:
  - torchvision 0.13.1
  - torch 1.12.1
  - numpy 1.23.5
  - matplotlib 3.8.4
  - astropy 5.1
  - pillow 9.0.1
  - efficientnet_pytorch 0.7.1
  - scikit-learn 1.4.1
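To confirm that the installed packages match the versions listed above, a small check along these lines can be run before opening the notebook. This is only a sketch: `check_versions` and `REQUIRED` are hypothetical names introduced here and are not part of the notebook.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this README.
REQUIRED = {
    "torchvision": "0.13.1",
    "torch": "1.12.1",
    "numpy": "1.23.5",
    "matplotlib": "3.8.4",
    "astropy": "5.1",
    "pillow": "9.0.1",
    "efficientnet_pytorch": "0.7.1",
    "scikit-learn": "1.4.1",
}


def check_versions(required):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_versions(REQUIRED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

The comparison is an exact string match, so a build with a local suffix (for example a CUDA-specific wheel) may be reported as a mismatch even though the base version is the same.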
- Data Preparation:
  - Organize your astronomical image data in folders, one folder per object.
  - The stacked image must be the first file in alphabetical order in each folder.
  - Update the `rootpath` and `objpath` variables to point to the directory containing your image data.
  - Set the `size` parameter to the image size; images are assumed to be square, `size` x `size` pixels.
  - `plot` is a boolean variable that enables or disables the plots.
  - `timeStamp` is a boolean variable that enables or disables the printing of timestamps.
- Outlier Detection:
  - Run the notebook cell by cell to execute the code.
  - The notebook displays the detected outliers along with their images and filenames.
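The folder convention from the Data Preparation step can be sketched as follows. The variable names `rootpath`, `objpath`, `size`, `plot`, and `timeStamp` come from the notebook, but the example values, the way `rootpath` and `objpath` are joined, and the helper `list_object_images` are assumptions made for illustration.

```python
from pathlib import Path

# Variable names follow the notebook; the values below are hypothetical.
rootpath = "data"        # assumed: parent directory of all object folders
objpath = "object_001"   # assumed: folder holding the images of one object
size = 64                # assumed value; images are size x size pixels
plot = True              # enable plots
timeStamp = False        # disable timestamp printing


def list_object_images(rootpath, objpath):
    """Return (stacked_image, other_images) for one object folder.

    The stacked image is taken to be the first file in alphabetical order.
    """
    folder = Path(rootpath) / objpath
    files = sorted((p for p in folder.iterdir() if p.is_file()),
                   key=lambda p: p.name)
    if not files:
        raise FileNotFoundError(f"no images found in {folder}")
    return files[0], files[1:]
```

Prefixing the stacked image's filename with something like `00_` is one simple way to make sure it sorts first.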
- `fits_numpy`: Converts FITS-format astronomical images to NumPy arrays.
- `normalize`: Normalizes image data.
- `NumpyDataset`: Custom dataset class for loading NumPy arrays.
- `get_latent_vectors`: Extracts latent vectors from images using a pre-trained EfficientNet model.
- `blockPrinting`: Decorator that suppresses printed output.
- `get_features`: Extracts features from images using the EfficientNet model.
- `get_nns`: Finds the nearest neighbors of a query image in the feature space.
- `searchOutliers`: Identifies outliers in the image dataset based on nearest-neighbor distances.
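To illustrate the idea behind nearest-neighbor outlier detection in a feature space, here is a minimal sketch using scikit-learn. It is not the notebook's implementation: the function names `knn_outlier_scores` and `flag_outliers`, the choice of mean k-NN distance as the score, and the median-plus-MAD threshold are all assumptions, and random vectors stand in for EfficientNet features.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def knn_outlier_scores(features, k=5):
    """Mean distance from each feature vector to its k nearest neighbors."""
    nn = NearestNeighbors(n_neighbors=k).fit(features)
    # Calling kneighbors() without arguments queries the fitted points
    # and excludes each point from its own neighbor list.
    distances, _ = nn.kneighbors()
    return distances.mean(axis=1)


def flag_outliers(scores, n_sigma=3.0):
    """Indices whose score exceeds median + n_sigma * scaled MAD (assumed rule)."""
    median = np.median(scores)
    mad = 1.4826 * np.median(np.abs(scores - median))
    return np.flatnonzero(scores > median + n_sigma * mad)


# Synthetic stand-in for image features: 50 vectors, one shifted far away.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 8))
features[7] += 25.0

scores = knn_outlier_scores(features)
outliers = flag_outliers(scores)
```

An image whose features lie far from every other image receives a large score, so the shifted vector at index 7 ends up among the flagged indices. The threshold rule is a design choice; a fixed percentile of the scores would work in the same framework.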
- This notebook assumes a GPU is available for efficient computation. It sets the device (`cuda:0` or `cpu`) based on your hardware, but execution time depends strongly on which device is used.
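The device selection described above is commonly written in PyTorch as shown below. This is a sketch of the usual pattern, not a copy of the notebook's cell; the tensor shape is a hypothetical placeholder.

```python
import torch

# Use the first GPU when CUDA is available, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Hypothetical batch of one 3-channel 64 x 64 image, created on that device.
model_input = torch.zeros(1, 3, 64, 64, device=device)
print(f"Running on {device}")
```

Models and input tensors must live on the same device, so a model would be moved with `.to(device)` before inference.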
For further details and updates, refer to the paper Cavuoti, De Cicco et al. arxiv:xxx