
A framework for testing and benchmarking machine learning methods on astronomical data

Hello Universe is a new project at MAST designed to help astronomers develop machine learning (ML) methods for astronomical discovery. ML will be an essential tool for analyzing the rich data sets of the upcoming decade, and Hello Universe provides a framework for testing ML algorithms and new techniques. Each entry in the Hello Universe collection includes: 

  • Data: a high-level science product (HLSP) data set for testing and benchmarking ML algorithms 
  • Code: a tutorial Jupyter notebook that provides step-by-step examples of how to apply an ML technique to the data

Although these data sets are chosen with novice data science learners in mind, they support a wide range of tasks. Hello Universe entries include examples of:

  • analyzing 2D (image) and 1D (vector or light curve) data sets.
  • applying techniques for regression and for classification.
  • developing supervised and unsupervised learning models.
  • using best practices for training and optimizing models.
  • selecting metrics for assessing model performance.
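
As a rough illustration of the last point, the sketch below computes a binary confusion matrix along with precision, recall, and accuracy in plain Python. It is not taken from any Hello Universe notebook: the helper names (`confusion_counts`, `precision_recall`) and the labels are made up for illustration, with 1 standing in for a hypothetical "merger" class and 0 for "non-merger".

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    # Count each (is_truly_positive, is_predicted_positive) combination.
    pairs = Counter((t == positive, p == positive) for t, p in zip(y_true, y_pred))
    tp = pairs[(True, True)]
    fp = pairs[(False, True)]
    fn = pairs[(True, False)]
    tn = pairs[(False, False)]
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Return (precision, recall), using 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels for illustration only: 1 = "merger", 0 = "non-merger".
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(tp, fp, fn)
accuracy = (tp + tn) / len(y_true)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```

Accuracy alone can mislead when one class is rare, which is one reason to inspect the full confusion matrix and to report precision and recall alongside it.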

Entries

Classifying JWST/HST galaxy mergers with CNNs
neural networks | 2d data | classification | overfitting | confusion matrix
Classifying TESS stellar flares with CNNs
neural networks | 1d data | classification | prediction
Predicting 3D-HST redshift with decision trees
decision trees | 1d data | regression | cross-validation
Classifying Pan-STARRS with (un)supervised learning
classification | 1d data | PCA | tSNE | k-means | SGD | unsupervised | supervised

Contribute to Hello Universe!

Have an idea for a data set + notebook pair? We welcome your contributions to Hello Universe! Please contact archive@stsci.edu to get started.