A framework for testing and benchmarking machine learning methods on astronomical data
Hello Universe is a new project at MAST designed to help astronomers develop machine learning (ML) methods for astronomical discovery. ML will be an essential tool for analyzing the rich data sets of the upcoming decade, and Hello Universe provides a framework for testing ML algorithms and new techniques. Each entry in the Hello Universe collection includes:
- Data: a high-level science product (HLSP) data set for testing and benchmarking ML algorithms
- Code: a tutorial Jupyter notebook that provides step-by-step examples of how to apply an ML technique to the data
Although these data sets were chosen with novice data science learners in mind, they support a wide range of tasks. Hello Universe entries include examples of:
- analyzing 2D (image) and 1D (vector or light curve) data sets.
- applying techniques for regression and for classification.
- developing supervised and unsupervised learning models.
- using best practices for training and optimizing models.
- selecting metrics for assessing model performance.
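As a rough illustration of the last item, the sketch below builds a confusion matrix and derives precision and recall for a two-class problem. It is not taken from any Hello Universe notebook: the labels, the class names ("merger", "isolated"), and the helper functions `confusion_matrix` and `precision_recall` are all hypothetical, and it uses only the Python standard library.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return nested counts indexed as matrix[true_label][predicted_label]."""
    counts = Counter(zip(y_true, y_pred))
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}


def precision_recall(matrix, positive):
    """Compute precision and recall for one class treated as 'positive'."""
    tp = matrix[positive][positive]
    # False positives: other classes predicted as the positive class.
    fp = sum(matrix[t][positive] for t in matrix if t != positive)
    # False negatives: positive class predicted as something else.
    fn = sum(matrix[positive][p] for p in matrix[positive] if p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels invented for illustration only (not real survey data).
y_true = ["merger", "merger", "isolated", "isolated", "merger", "isolated"]
y_pred = ["merger", "isolated", "isolated", "merger", "merger", "isolated"]

matrix = confusion_matrix(y_true, y_pred, labels=["merger", "isolated"])
precision, recall = precision_recall(matrix, positive="merger")

for true_label, row in matrix.items():
    print(true_label, row)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Accuracy alone can mislead when one class is rare, which is why a per-class breakdown like this is often a useful starting point when choosing a metric.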
Classifying JWST/HST galaxy mergers with CNNs
neural networks | 2d data | classification | overfitting | confusion matrix
Classifying TESS stellar flares with CNNs
neural networks | 1d data | classification | prediction
Predicting 3D-HST redshift with decision trees
decision trees | 1d data | regression | cross-validation
Classifying Pan-STARRS with (un)supervised learning
classification | 1d data | PCA | tSNE | k-means | SGD | unsupervised | supervised
Contribute to Hello Universe!
Have an idea for a data set + notebook pair? We welcome your contributions to Hello Universe! Please contact firstname.lastname@example.org to get started.