Multi-mission Archive at Space Telescope (MAST)
(Optical/UV Science Archive Research Center)
Supporting Organization:
Space Telescope Science Institute
3700 San Martin Drive
Baltimore, MD 21218
Overall Mission:
MAST supports active and legacy mission datasets and related catalogs and surveys,
focusing primarily on data in the ultraviolet, optical, and near-IR spectral
regions.
Support includes curation of the data, providing expert support to users of the data,
providing access to data-specific calibration and analysis software, providing user support
for this software, and maintaining public access interfaces to the data. MAST works with new
mission teams in the supported wavelength regions to assist in the development of data management
plans, especially in the areas of data formats, descriptive metadata and standardization of keywords,
in the development of data access and delivery plans, and assuring data quality control.
This report covers data financially supported under the "MAST" contract.
Archive and distribution activities for HST data are supported under the HST contract.
Some HST statistics are included in this report, but more complete information on HST activities
can be found in the STScI Newsletters and in
the STScI Annual Reports.
Questions about HST can be directed to archive@stsci.edu.
Total MAST Holdings by volume as of June 30, 2005
MAST holdings without HST/GSC/DSS as of June 30, 2005
Services Provided: MAST provides support for
users seeking to understand the properties and instrumental signatures
of all archived datasets and assistance with the interfaces
to browse and retrieve these data. Access to non-HST mission
and instrument specific calibration and analysis software and assistance
in its use continues on a time-available basis. Full support
for HST related software is provided by the MAST Help Desk and
staff.
Non-HST Data Analysis Software Provided: IUE "RDAF"
package (IDL-based), IUE Final Archive processing software (IRAF port),
EUVE analysis software package (IRAF-based), Copernicus data
analysis software (IDL-based), UIT data reduction/analysis software
(FORTRAN, C, and IDL routines), WUPPE data analysis software
(FORTRAN routines requiring the FITSIO library), and HUT data
reduction software (IRAF-based) are available through MAST.
Mission Interfaces:
FUSE has deployed its CalFuse3 pipeline and has reprocessed 2/3 of the
archive with regenerated as well as new products. However, MAST has noted
that some of these data mistakenly include zero exposure times and zero
fluxes when scientifically valid observations had in fact been taken.
Delivery of new observations has essentially stopped with the failure of a
FUSE reaction wheel. The project is hopeful of recommencing observations
of objects in the polar region, but as of the close of the reporting period
few observations have been delivered. FUSE accepted a MAST suggestion during
the initial post-failure period that it use its temporarily lighter
workload to develop a merged data file product in its next pipeline.
This product has been created by merging 5 of the 8 FUSE detector "sides"
into a nearly continuous one-dimensional spectrum. This product will be
immediately applicable to first-generation VO software protocols.
Following the submission of FUSE proposals in September, 2004, users of
MAST's FUSE retrieval site (including one who had been unhappy over the
long retrieval times late in the 2003 FUSE GI proposal cycle) have paid
compliments to MAST on the greatly reduced times required for data delivery.
The GALEX archive represents the largest mission dataset supported by
the MAST contract (HST archiving is supported by the HST contract). MAST
participates in weekly and monthly telecons with the GALEX project to
receive updates on the schedule and contents of data deliveries. During
the review period, MAST received a considerable volume of data known as an
internal release (to the GALEX Science Team) for practice in ingesting and
populating the first database. This allowed the subsequent deliveries of
four subreleases of the first GALEX Data Release, GR1, to be ingested
more easily. The first of these releases made its
debut in December, 2004, and spectroscopic (grism) data were released in
April, 2005. Since the release of the imaging data, GALEX data
usage (see figure) has averaged over 23349 dataset downloads a month.
An Interface Control Document (version 3) was revised to
document changes imposed by this release in data interface protocols
and other internal arrangements between Caltech and MAST.
Over the last three years MAST has developed a GALEX database using
Microsoft SQLServer and .NET technology. This technology permits the
automated ingest of data, self-documentation, rapid browsing of the
entire database structure, including a listing of tables, keywords and
procedures used for qualified searches of object classes and listings
of all observations. An active helpdesk and a syndicated What's New
Page are maintained for user questions pertinent to the database and
data usage.
MAST continues to interface with the CHIPS Project concerning their
archiving plans. The project plans to transfer the CHIPS dataset and
auxiliary tools approximately one year after mission completion, or
(because of an expected add-on in funding at the time of writing), about
the middle of 2007. The total data volume will be less than 20 GB.
MAST supports the planning of the Data Management Center of the
Kepler mission through its attendance at local meetings and review
of design documentation. During the review period, the DMC won
approval of its Preliminary Design Review (PDR) document as a part
of the project's approval of its complete PDR.
In February NASA informed the Project of severe budget cuts. The new
funding profile will likely restrict the scope of MAST's support in
the project, possibly limiting the MAST website to basic retrieval
services.
Committee Participation within the STScI:
Levay and Kimball participated in a project to evaluate hardware
alternatives for the Kepler mission.
Levay participated in a project to recommend ways of meeting
the increased demand for bandwidth capacity.
The demand has been mitigated somewhat by the addition of an Internet 2 router
by the NASA Integrated Services Network (NISN), the internet service provider, for
data destined for other Internet 2 participants.
Levay is participating in an STScI project to study
a coordinated and long-term storage solution for all projects at STScI.
The plan is to choose a reasonably priced, secure hierarchical storage solution
that uses common management tools to save system administration costs.
ACTIVITIES AND MAJOR ACCOMPLISHMENTS OF THE LAST YEAR
MAST Data Ingest & Retrieval Activity
Date          Ingest Volume (GB)   Retrieval Volume (GB)   Retrieval Volume (GB)   Datasets Retrieved   Datasets Retrieved
              Active Missions      Active Missions         Legacy Missions         Active Missions      Legacy Missions
Jul 1 2004    344.701              1382.197                5.227                   54276                12473
Aug 1 2004    343.988              1076.612                142.030                 52357                24081
Sep 1 2004    430.302              1104.396                27.781                  65986                6315
Oct 1 2004    411.500              1685.704                91.422                  59276                28973
Nov 1 2004    439.260              1567.505                101.697                 65147                20529
Dec 1 2004    473.170              1557.429                21.437                  179084               5185
Jan 1 2005    503.518              1859.858                39.422                  56181                7298
Feb 1 2005    521.444              2816.290                23.474                  78923                40096
Mar 1 2005    517.580              2581.407                58.090                  96124                13943
Apr 1 2005    501.428              2532.458                41.562                  86220                8823
May 1 2005    394.318              2132.460                7.361                   65582                7503
Jun 1 2005    295.958              2598.555                1.786                   61649                6417
Total         5177.167             22894.870               561.292                 920805               181636
As MAST does not maintain retrieval statistics for DSS,
only the number of searches is displayed in order to show the general
interest level in these data. Previews are not available for VLA-FIRST data.
EUVE data are distributed from HEASARC.
This plot shows the number of datasets downloaded each month per mission during the reporting period.
Data Discovery and Search Tools:
MAST has several search tools that complement the individual mission searches.
The Cross-Mission Search looks for targets across selected missions. During the past year,
GALEX was added to the cross-mission search.
MAST maintains a database of journal articles using data archived at MAST.
If a paper uses a MAST image or spectrum, it is included even if the main dataset
may have come from a different source. If they can be determined, the individual datasets
(or in the case of HST papers the program) are identified and stored in the database also.
ADS and CDS link to MAST to display these identified data. Papers using an individual dataset or program
are also linked from the MAST search results pages.
MAST maintains a search page that makes it easy to search the VizieR catalog holdings and then either
use the selected catalog directly or cross-correlate it with selected MAST missions.
These searches are labeled as Cross-Correlation (catalog) in the usage plot.
MAST maintains two Simple Image Access Protocol (SIAP) services: one for all the MAST missions
except GALEX and the other for GALEX.
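As a rough illustration of what a client sends to a SIAP service, the sketch below builds a query URL from a sky position and a region size. The POS, SIZE and FORMAT parameter names come from the SIAP convention; the base URL is a placeholder and not the address of either MAST service, and the function name is an assumption made for this example.

```python
from urllib.parse import urlencode

# Placeholder address (an assumption); substitute the URL of a real SIAP service.
SIAP_BASE_URL = "https://example.org/siap"


def build_siap_query(ra_deg, dec_deg, size_deg, image_format="image/fits"):
    """Return a SIAP-style query URL for a region centered on (ra, dec).

    All angles are in decimal degrees.
    """
    if not 0.0 <= ra_deg < 360.0:
        raise ValueError("right ascension must be in [0, 360) degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("declination must be in [-90, 90] degrees")
    if size_deg <= 0.0:
        raise ValueError("region size must be positive")
    params = {
        "POS": f"{ra_deg},{dec_deg}",   # center of the region of interest
        "SIZE": f"{size_deg}",          # angular size of the region
        "FORMAT": image_format,         # requested image format
    }
    return f"{SIAP_BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_siap_query(10.68, 41.27, 0.2))
```

The service answers such a request with a table of matching images, each with an access reference that the client can then retrieve.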
SkyNode is the interface to a prototype of a "federated" database application called SkyQuery.
At the heart of SkyQuery is spatial matching, that is, finding the same objects listed in different catalogs.
GALEX has implemented a SkyNode for its holdings.
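Spatial matching of this kind can be sketched as a nearest-neighbor search within a fixed angular radius. The example below is a brute-force illustration only and is not how SkyQuery itself is implemented; the catalog layout (name, RA, Dec tuples) and the function names are assumptions made for the example.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def cross_match(catalog_a, catalog_b, radius_arcsec):
    """Pair each catalog_a source with its nearest catalog_b source within the radius.

    Each catalog is a list of (name, ra_deg, dec_deg) tuples.
    Brute force, O(N*M): adequate for a sketch, not for survey-sized catalogs.
    Returns (name_a, name_b, separation_arcsec) tuples.
    """
    radius_deg = radius_arcsec / 3600.0
    matches = []
    for name_a, ra_a, dec_a in catalog_a:
        best = None
        for name_b, ra_b, dec_b in catalog_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (name_b, sep)
        if best is not None:
            matches.append((name_a, best[0], best[1] * 3600.0))
    return matches
```

A production service would replace the inner loop with a spatial index so that only nearby candidates are compared.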
The service known as CasJobs permits GALEX users to maintain their own database of tables and to utilize
such tables for searches with the GALEX database and holdings. Since the service was implemented in late March, 30 users have submitted nearly 6000 queries through it.
Scrapbook updates: In March 2005, links to 2MASS images cached at the
NASA/IPAC Infrared Science Archive (IRSA) were included in the scrapbook.
The 2MASS data (available as both jpg images and FITS files) are 20'x20'
and are centered on the listed MAST observations.
New datasets are added to the scrapbook each month from FUSE and HST observations.
High Level Science Products:
MAST continued to solicit good-quality High Level Science Products this year. MAST staff members
consulted with several teams regarding the standards and procedures related to archiving HLSP at MAST.
The guidelines for submission of HLSP were modified to include new requirements related to making
these data available via MAST Virtual Observatory tools.
Several sets of High Level Science Products (HLSP) were delivered and implemented this year.
Since the HLSP are located in an anonymous FTP area,
MAST cannot precisely measure the number of distinct users downloading the data.
However, we can tabulate the number of distinct domains downloading the data.
During the reporting period, 6165 distinct domains downloaded HLSP.
The plot below shows the number of distinct domains for each set of HLSP.
The number of domains for those HLSP sets acquired during the reporting year is shown in pink.
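The tabulation of distinct domains can be sketched as follows. This is an illustration under stated assumptions, not MAST's actual log processing: the input is assumed to be (hostname, HLSP set) pairs already extracted from a transfer log, and a "domain" is assumed to be the last two labels of the hostname.

```python
from collections import defaultdict


def domain_of(hostname):
    """Reduce a hostname to its last two labels, e.g. 'a.b.example.edu' -> 'example.edu'.

    Assumption: this crude rule mislabels country-code domains such as 'x.ac.uk',
    and unresolved numeric IP addresses would need separate handling.
    """
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def distinct_domains_per_set(records):
    """Count distinct downloading domains for each HLSP set.

    records: iterable of (hostname, hlsp_set) pairs taken from a download log.
    """
    domains = defaultdict(set)
    for hostname, hlsp_set in records:
        domains[hlsp_set].add(domain_of(hostname))
    return {hlsp_set: len(found) for hlsp_set, found in domains.items()}
```

Counting domains rather than hosts gives a lower bound on the number of distinct users, since many users may share one domain.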
A complete listing of HLSP hosted at MAST is below. Although MAST
provides an interface to the WFPC2 Associations, the data are
held at CADC. MAST distributes the data via a proxy.
High Level Science Product Holdings
High Level Science Product Set                                Size        Number of Files
10 Lac Spectral Atlas (HST/GHRS)                              5.3 MB      67
AGN and Quasar Spectral Atlas                                 73.8 MB     451
CoolCAT - Atlas of Cool Stars                                 1.4 GB      1388
Copernicus Atlas of 6 Selected Stars                          3.6 MB      25
EUVE Spectral Atlas of Stars (EUVE)                           29.4 MB     490
GOODS: The Great Observatories Origins Deep Survey            96.3 GB     1342
GRAPES - Grism-ACS Program for Extragalactic Science          78 MB       1401
Grayscale of Time Variation of gamma Cas Near SiIV Doublet    5.0 MB      7
Hubble Deep Field                                             2.1 GB      181
Hubble Deep Field South                                       7.8 GB      178
Hubble Helix Observations                                     13.8 GB     32
Magellanic Cloud Planetary Nebulae                            721 MB      1620
OB Stars (Galactic): FUSE Spectral Atlas                      30 MB       184
OB Stars (Magellanic): FUSE Spectral Atlas                    1.2 MB      66
Pre-Main Sequence Stars: IUE Spectral Atlas                   10.7 MB     733
Procyon (FV-IV) Spectral Atlas                                1.2 MB      14
Quasar Spectrum HST/FOS                                       0.6 MB      4
Quasar Spectrum FUSE                                          19.3 MB     1
Search Field from a Search for Kuiper Belt Objects            3.7 GB      8
The Medium Deep Survey                                        11.9 GB     4726
Ultra Deep Field                                              30.7 GB     2252
Ultraviolet Images of Nearby Galaxies                         728.0 MB    334
WFPC2 Archival Parallels                                      16.6 GB     4087
alpha Ori Spectral Atlas                                      4.0 MB      60
chi Lupi Spectral Atlas                                       22.7 MB     156
TOTAL                                                         181.73 GB   19807
New plotting and graphical display tools:
MAST users may now bring up image previews in Aladin, an interactive software sky atlas
that allows the user to visualize digitized images of any part of the sky and to superimpose
entries from astronomical catalogs. Aladin is provided and maintained by the
"Centre de Données astronomiques de Strasbourg" (CDS).
New protocols and IVOA-related services:
MAST staff members have been using draft versions of the
VO SED data model and the Simple Spectral Access Protocol (SSAP)
to begin planning access to the MAST spectral holdings through the VO.
Working with colleagues from the CADC and the ST-ECF,
staff members are designing a "spectral container" that will standardize format for spectra
from different missions and conform to the evolving data model and SSAP.
MAST created an Image Scrapbook web service. The service is a variation of the MAST SIAP web service.
Database and software changes were implemented that flag and serve the representative observations that are
included in the Scrapbook. The MAST Image Scrapbook web service was then entered into the IVOA registry.
The new web service permits users and archival services to obtain images through an FTP get
request for incorporation into their tools.
IRSA is now using this service as a part of its IRSA-MAST Scrapbook tool.
New interface pages and search tools:
MAST has imported and tailored the "CasJobs" tool originally written for queries of SDSS objects.
CasJobs permits the posting in background of large queries and of lists of objects or coordinate positions.
The results may be searched and manipulated by users on a local database assigned to each (registered) user.
A tutorial has been written and maintained to allow novice users to navigate through many of the features
of the web site and includes links to sites where the user may learn to write SQL queries.
CasJobs is a package, adapted from a release by the JHU/SDSS team, that allows specialized SQL access
to the GALEX database, most commonly for the submission of long-running queries and for matching of GALEX
entries to input object names or coordinate lists. Users may keep and work with the results indefinitely
in a local "My Database" on the GALEX/CasJobs server. This allows users to work in My Database with matched
physical parameters of objects observed by both GALEX and SDSS (thanks to a preexisting list of cross-matched
objects observed by both missions).
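The workflow described above (upload a coordinate list, match it against a catalog, keep the result in a personal table) can be illustrated with a small stand-in. The sketch uses an in-memory SQLite database rather than the SQL Server system behind CasJobs, and every table name, column name and data row is invented for the demonstration; none of them reflects the real GALEX schema.

```python
import sqlite3


def build_demo_database():
    """Create an in-memory database with made-up catalog rows and a user target list."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE catalog_sources (
            objid INTEGER PRIMARY KEY, ra_deg REAL, dec_deg REAL, nuv_mag REAL);
        CREATE TABLE my_targets (name TEXT, ra_deg REAL, dec_deg REAL);
        CREATE TABLE my_matches (target TEXT, objid INTEGER, nuv_mag REAL);
        """
    )
    conn.executemany(
        "INSERT INTO catalog_sources VALUES (?, ?, ?, ?)",
        [(1, 150.0010, 2.0005, 19.8),
         (2, 150.2000, 2.3000, 21.1),
         (3, 151.0000, 2.9000, 20.4)],
    )
    conn.executemany(
        "INSERT INTO my_targets VALUES (?, ?, ?)",
        [("target_a", 150.0000, 2.0000),
         ("target_b", 151.0004, 2.9001),
         ("target_c", 10.0000, -5.0000)],
    )
    return conn


def match_targets(conn, box_deg):
    """Fill the user-owned my_matches table with catalog rows near each target.

    The match is a simple coordinate box; it ignores the cos(dec) factor and
    RA wrap-around, which a real positional match would have to handle.
    """
    conn.execute("DELETE FROM my_matches")
    conn.execute(
        """
        INSERT INTO my_matches (target, objid, nuv_mag)
        SELECT t.name, s.objid, s.nuv_mag
        FROM my_targets AS t
        JOIN catalog_sources AS s
          ON ABS(s.ra_deg - t.ra_deg) <= :box
         AND ABS(s.dec_deg - t.dec_deg) <= :box
        """,
        {"box": box_deg},
    )
    return conn.execute(
        "SELECT target, objid, nuv_mag FROM my_matches ORDER BY target, objid"
    ).fetchall()
```

Because the matches are stored in a table rather than only returned, later queries can reuse them without repeating the match.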
Many MAST searches can be run as web services or as HTTP GET requests.
A web page documenting the available web services is now available.
Enhancements of User Interface Pages and Tools:
MAST implemented a more sophisticated website design in December 2004.
The website was reorganized to make it easier for users to find items
of interest. Pull down menus made it possible to "flatten" the structure so that
more of the website is available from the top pages, thus reducing the
number of "clicks" needed to navigate the website.
As part of the new website, a tutorial was written that introduces MAST
and its various search tools and interfaces.
The tutorial includes some hints and techniques for using the tools for specialized searches.
MAST enhanced the mission cross-correlation search and the VizieR/Cross-correlation search routines
to include the GALEX data.
The user interface to the GALEX database is a web site that provides
the capability of querying the data by object name, coordinates, or
physical property either through a "MAST-style" simple query form or
by detailed sample queries written in SQL. The user may modify the
latter to generate customized queries. Results ("Explore") pages are
active, thereby allowing users to query objects in the
vicinity of the initially queried position. MAST has obtained a list
of SDSS objects contained in the GALEX Release 1 database, thereby
allowing cross correlations on properties from SDSS investigations,
such as optical magnitudes and redshifts.
The VizieR search and cross-correlation tool was improved using suggestions made by
MAST Users Group members. Calibration files are now removed from cross correlations
with HST instruments, the search radii used for each mission are now displayed,
and all the VizieR catalog values are displayed on the results page.
MAST has written a tutorial to demonstrate how to use the spectrum
combination/filtering functions of the Specview tool written by I. Busko.
This tool has been enhanced so that trusted applets can be downloaded
onto one's computer and can access local spectral data files. The thrust
of the tutorial is to show users how to combine (coadd and merge) spectra,
detrend them, and compute Fast Fourier filtered spectra of the results.
The results may be stored and reused in later sessions.
Outreach to the user community:
The MAST Users Group meets annually to provide input and guidance in
setting priorities for expanded services. This year the Users Group met in February, 2005.
Presentations to the MUG and the MUG reports received are posted on the web.
MAST conducted a comprehensive survey of user preferences,
attitudes, and search practices. Results and responses to some comments
were presented to the MAST Users Group and also
placed on the web.
MAST provides electronic newsletters as needed. This year one newsletter was distributed in April 2005.
MAST staff participated in several conferences or workshops (AAS, ADASS, IVOA and NVO).
The Astrophysical Data Centers Executive Council (ADEC):
ADEC representatives White and Kamp attended
an ADEC meeting in Pasadena on October 29, 2004. The status of implementation of
dataset identifiers at the various archive centers was discussed. The use of
ADS identifiers will be advertised in a variety of ways including
an announcement at AAS and by announcements in participating journals.
There was a brief discussion on the efficacy of reducing the number of
scripting languages used by the various ADEC members. The conclusion was
that this was not practical, but that ADEC members might share libraries
that could be of common use.
Kamp participated in an ADEC telecon on March 25, 2005. ADEC discussed the
possible joint NASA/NSF funding of an NVO/all datacenters proposal from FY07.
ADEC members planned to prepare slides to show how our major goals fit into the
NASA strategy planning. There was additional discussion of the dataset identifiers.
IRSA is setting up the tagging for complex Keck datasets. The link will actually
start a query that presents the whole data structure, meaning that one tag is no
longer attached to an individual file. Other issues such as the dismissal of a service
remain to be resolved.
MAST proposed that all data centers impose the FITS standard strictly.
Some centers do not have the manpower to impose the FITS standard and
use more relaxed criteria for flagging of datasets that are not correct.
For example, IRSA checks that the file can be read using the standard
cfitsio library from Goddard, while NED checks that files submitted to
them can be read by Aladin and other common tools (probably an even
weaker criterion). MAST prefers not to distribute data that fail when
tested with a more thorough tool such as fitsverify. We reached an
agreement to work towards a common datacenter policy.
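To make the range of possible checks concrete, the sketch below performs a few structural tests on the primary header of a FITS file: the 2880-byte block size, the mandatory SIMPLE card with value T in column 30, printable ASCII header cards, and the END card. It is a minimal illustration only and is far weaker than fitsverify or a read test with cfitsio; the function names are assumptions made for this example.

```python
BLOCK = 2880  # FITS files are organized in 2880-byte blocks
CARD = 80     # each header card is 80 bytes


def basic_fits_problems(data):
    """Return a list of structural problems found in the primary header of FITS bytes."""
    problems = []
    if len(data) == 0 or len(data) % BLOCK != 0:
        problems.append("file size is not a positive multiple of 2880 bytes")
    first = data[:CARD]
    if first[:10] != b"SIMPLE  = " or first[29:30] != b"T":
        problems.append("first card is not 'SIMPLE  =' with value T in column 30")
    end_found = False
    for offset in range(0, len(data) - CARD + 1, CARD):
        card = data[offset:offset + CARD]
        if any(byte < 32 or byte > 126 for byte in card):
            problems.append(f"non-printable byte in header card at offset {offset}")
            break
        if card[:8] == b"END     " and card[8:].strip() == b"":
            end_found = True
            break
    if not end_found:
        problems.append("no END card found in the primary header")
    return problems


def header_card(keyword, value):
    """Format a fixed-format card with the value right-justified to column 30."""
    return f"{keyword:<8}= {value:>20}".ljust(CARD)


def make_minimal_header():
    """Build a one-block, data-free primary header for demonstration."""
    cards = [
        header_card("SIMPLE", "T"),
        header_card("BITPIX", "8"),
        header_card("NAXIS", "0"),
        "END".ljust(CARD),
    ]
    return "".join(cards).ljust(BLOCK).encode("ascii")
```

A file passing these tests can still violate the standard in many other ways, which is why the choice of verification tool matters for a common policy.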
Coordination activities:
The web service for the MAST Scrapbook was created to provide an
easy and timely method for IRSA to see what data were marked as
"representative". They use it in a tool they developed based on
our Scrapbook. MAST also worked with IRSA to incorporate links
to the 2MASS previews within the MAST Scrapbook.
MAST Literature Links:
The publications database and the links between scientific papers
and the referenced MAST datasets are regularly updated
as new citations become available through the ADS.
Below is a plot showing the number of papers published in refereed
journals during the reporting period (July 2004 through June 2005).
During the past year, MAST began to track the number of citations per paper in the journal database.
We obtain the total number of citations per paper from the ADS. The count is updated at least once
per month.
Below is a chart that shows the average number of citations for papers published in each year.
The citation record for IUE from 1978 through 1990 is not included in this plot.
The "fall-off" in the average number of citations per paper for those articles published more recently
is due to the lag between publication of a paper and citation of a paper in a later publication.
Meylan, Madrid and Macchetto published a paper in
PASP,116:790 entitled "HST Science Metrics".
These authors state that the peak of the citation rate occurs about 2 years after publication.
We show below a plot of the average number of citations per paper over the
publication lifetime per mission. (If papers from the years 2004-2005 are
excluded, the average number of citations per paper increases.)
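The effect of excluding recent papers can be seen in a small worked example. The sketch below computes the average number of citations per paper for each mission, optionally leaving out chosen publication years; the record layout and function name are assumptions, and any counts fed to it here would be illustrative rather than MAST statistics.

```python
from collections import defaultdict


def average_citations_per_mission(papers, exclude_years=()):
    """Average citations per paper for each mission.

    papers: iterable of (mission, publication_year, citation_count) tuples.
    exclude_years: publication years to leave out of the average.
    """
    totals = defaultdict(lambda: [0, 0])  # mission -> [citation sum, paper count]
    for mission, year, citations in papers:
        if year in exclude_years:
            continue
        totals[mission][0] += citations
        totals[mission][1] += 1
    return {mission: total / count for mission, (total, count) in totals.items()}
```

Recent papers have had little time to accumulate citations, so dropping them removes mostly low counts and raises the mean.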
STAFFING CHANGES AT MAST
During the year Rachel Somerville left MAST and Alberto Conti scaled back participation in MAST activities.
Shui-Ay Tseng joined the staff. Conti and Tseng are partially salaried from the MAST contract.
PLANS FOR COMING YEAR
Future datasets:
MAST will receive no datasets from new missions during the coming
year. Data will continue to be released steadily from FUSE and, in
late 2005, as GALEX Release 2.
Future Services for Ongoing Missions:
MAST will provide a Spectral Services utility that permits users to
perform cone searches on objects for which GALEX grism spectra are
available. The service will permit users to manipulate spectra with
a coplotting tool. The MAST/GALEX group is also developing footprint
services that will permit users to perform general cross correlations
of objects in nonoverlapping GALEX sky fields and with objects observed
with GALEX and other survey missions. Shopping cart services are being
developed for GALEX public release data to permit users to create their
own customized FITS files of collections of objects of interest.
MAST hopes to be able to provide footprint services for a few legacy
missions or instruments by the end of the next reporting period.
MAST will ingest a new FUSE data product: a merged spectrum called
an "nvo" file. This addition will permit users to work with the entire
FUSE wavelength range from a single file and will also permit FUSE
data to be conveniently accessed by IVOA search tools.
MAST will develop the first-generation database for the testing
of Kepler data (Kepler launch has slipped to mid-2008).
MAST will distribute several new High Level Science Products from a
range of GI and HST-Treasury programs. These will include a (FUSE)
Wolf-Rayet spectral atlas, a FUSE atlas of galaxies,
GEMS (HST), HPOL data (in support of WUPPE),
GOODS (version 2; HST), COSMOS (HST), Helix Nebula (HST/NICMOS),
Eta Carinae (HST/STIS), and a few B-star UV spectral atlases (IUE).