

NASA Data Center Annual Program Plan

Program Year: Reporting period July 2004 through June 2005
Data Center/Service: Multi-mission Archive at Space Telescope (MAST)
(Optical/UV Science Archive Research Center) 
Supporting Organization: Space Telescope Science Institute
3700 San Martin Drive
Baltimore, MD 21218

Overall Mission:  MAST supports active and legacy mission datasets and related catalogs and surveys, focusing primarily on data in the ultraviolet, optical, and near-IR spectral regions. Support includes curation of the data, providing expert support to users of the data, providing access to data-specific calibration and analysis software, providing user support for this software, and maintaining public access interfaces to the data. MAST works with new mission teams in the supported wavelength regions to assist in the development of data management plans, especially in the areas of data formats, descriptive metadata and standardization of keywords, in the development of data access and delivery plans, and in assuring data quality control.

This report covers data financially supported under the "MAST" contract. Archive and distribution activities for HST data are supported under the HST contract. Some HST statistics are included in this report, but more complete information on HST activities can be found in the STScI Newsletters and in the STScI Annual Reports. Questions about HST can be directed to archive@stsci.edu.

Total MAST Holdings by volume as of June 30, 2005



MAST holdings without HST/GSC/DSS as of June 30, 2005



MAST Data Holdings

Name                    Size                    Number of Observations        Active Mission Duration
ACTIVE MISSIONS
FUSE                    673.53 GB               3985                          1999-
GALEX                   429 GB (public)         3252 (public)                 2003-
                        173 GB (proprietary)    187 (proprietary)
                        75 GB (catalog)         19,730,767 (catalog objects)
HST                     23.996 TB               532628                        1990-
LEGACY MISSIONS
ORFEUS: BEFS            4.1 GB                  332                           Sept. 1993; Nov. 1996
ORFEUS: IMAPS           0.3 GB                  643                           Sept. 1993; Nov. 1996
ORFEUS: TUES            0.2 GB                  229                           Nov. 1996
EUVE                    96 GB                   1377                          1992-Jan. 2001
ASTRO: UIT              56 GB                   1442                          Dec. 1990; March 1995
ASTRO: HUT              0.6 GB                  516                           Dec. 1990; March 1995
ASTRO: WUPPE            0.1 GB                  238                           Dec. 1990; March 1995
IUE Final Archive       475 GB                  103,552                       1978-1996
IUE SIPS                125 GB                  104,296                       1978-1996
Copernicus              0.8 GB                  551                           1972-1981
CATALOGS & SURVEYS
VLA-FIRST               183.98 GB (compressed)  29,153                        1993-
Digitized Sky Surveys   5 TB                    n/a                           1950-58, 1975-99
GSC I, II               2 TB                    n/a                           1950-58, 1975-99

Services Provided: MAST provides support for users seeking to understand the properties and instrumental signatures of all archived datasets and assistance with the interfaces to browse and retrieve these data. Access to non-HST mission- and instrument-specific calibration and analysis software, and assistance in its use, continues on a time-available basis. Full support for HST-related software is provided by the MAST Help Desk and staff.

Non-HST Data Analysis Software Provided: IUE "RDAF" package (IDL-based), IUE Final Archive processing software (IRAF port), EUVE analysis software package (IRAF-based), Copernicus data analysis software (IDL-based), UIT data reduction/analysis software (FORTRAN, C, and IDL routines), WUPPE data analysis software (FORTRAN routines requiring the FITSIO library), and HUT data reduction software (IRAF-based) are available through MAST.

Mission Interfaces:

  • FUSE has deployed its CalFuse3 pipeline and has reprocessed 2/3 of the archive with regenerated as well as new products. However, MAST has noted that some of these data mistakenly include zero exposure times and zero fluxes when scientifically valid observations had in fact been taken. Delivery of new observations has essentially stopped with the failure of a FUSE reaction wheel. The project is hopeful of recommencing observations of objects in the polar region, but as of the close of the reporting period few observations have been delivered. During the initial post-failure period, FUSE accepted a MAST suggestion that it use its temporarily lightened workload to develop a merged data file product in its next pipeline. This product has been created by merging 5 of the 8 FUSE detector "sides" into a nearly continuous one-dimensional spectrum. It will be immediately applicable to first-generation VO software protocols. Following the submission of FUSE proposals in September 2004, users of MAST's FUSE retrieval site (including one who had been unhappy over the long retrieval times late in the 2003 FUSE GI proposal cycle) have complimented MAST on the greatly reduced times required for data delivery.

  • The GALEX archive represents the largest mission dataset supported by the MAST contract (HST archiving is supported by the HST contract). MAST participates in weekly and monthly telecons with the GALEX project to receive updates on the schedule and contents of data deliveries. During the review period, MAST received a considerable volume of data known as an internal release (to the GALEX Science Team) for practice in ingesting and populating the first database. This allowed the subsequent deliveries of four subreleases of the first GALEX Data Release, GR1, to be ingested more easily. The first of these releases made its debut in December 2004, and spectroscopic (grism) data were released in April 2005. Since the release of the imaging data, GALEX data usage (see figure) has averaged over 23,349 datasets downloaded per month. An Interface Control Document (version 3) was revised to document changes imposed by this release in data interface protocols and other internal arrangements between Caltech and MAST.

  • Over the last three years MAST has developed a GALEX database using Microsoft SQLServer and .NET technology. This technology permits the automated ingest of data, self-documentation, and rapid browsing of the entire database structure, including a listing of tables, keywords and procedures used for qualified searches of object classes and listings of all observations. An active helpdesk and a syndicated What's New page are maintained for user questions pertinent to the database and data usage.

  • MAST continues to interface with the CHIPS Project concerning their archiving plans. The project plans to transfer the CHIPS dataset and auxiliary tools approximately one year after mission completion, or (because of an expected add-on in funding at the time of writing), about the middle of 2007. The total data volume will be less than 20 GB.

  • MAST supports the planning of the Data Management Center (DMC) of the Kepler mission through its attendance at local meetings and review of design documentation. During the review period, the DMC won approval of its Preliminary Design Review (PDR) document as a part of the project's approval of its complete PDR. In February NASA informed the Project of severe budget cuts. The new funding profile will likely restrict the scope of MAST's support of the project, possibly limiting the MAST website to basic retrieval services.


Committee Participation within the STScI: 

  • Levay and Kimball participated in a project to evaluate hardware alternatives for the Kepler mission.

  • Levay participated in a project to recommend ways of meeting the increased demand for bandwidth capacity. The demand has been mitigated somewhat by the addition of an Internet 2 router by NASA Integrated Services Network (NISN), the internet service provider, for data destined for other Internet 2 participants.

  • Levay is participating in an STScI project to study a coordinated and long-term storage solution for all projects at STScI. The plan is to choose a reasonably priced, secure hierarchical storage solution that uses common management tools to save system administration costs.

ACTIVITIES AND MAJOR ACCOMPLISHMENTS OF THE LAST YEAR

MAST Data Ingest & Retrieval Activity

Date      Ingest Volume (GB)      Retrieval Volume (GB)   Retrieval Volume (GB)   Datasets Retrieved      Datasets Retrieved
          Active Missions         Active Missions         Legacy Missions         Active Missions         Legacy Missions
Jul 2004  344.701                 1382.197                5.227                   54276                   12473
Aug 2004  343.988                 1076.612                142.030                 52357                   24081
Sep 2004  430.302                 1104.396                27.781                  65986                   6315
Oct 2004  411.500                 1685.704                91.422                  59276                   28973
Nov 2004  439.260                 1567.505                101.697                 65147                   20529
Dec 2004  473.170                 1557.429                21.437                  179084                  5185
Jan 2005  503.518                 1859.858                39.422                  56181                   7298
Feb 2005  521.444                 2816.290                23.474                  78923                   40096
Mar 2005  517.580                 2581.407                58.090                  96124                   13943
Apr 2005  501.428                 2532.458                41.562                  86220                   8823
May 2005  394.318                 2132.460                7.361                   65582                   7503
Jun 2005  295.958                 2598.555                1.786                   61649                   6417
Total     5177.167                22894.870               561.292                 920805                  181636
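
The totals row can be cross-checked against the monthly figures. The following is a minimal Python sketch of that arithmetic, not part of any MAST tooling: the names `MONTHLY`, `REPORTED_TOTALS`, `column_sums`, and `check_totals` are illustrative, and the tolerance of 0.01 GB is an assumption chosen to absorb rounding of the monthly values to three decimal places.

```python
# Monthly values transcribed from the table above, in column order:
# (ingest GB, active retrieval GB, legacy retrieval GB,
#  active datasets retrieved, legacy datasets retrieved)
MONTHLY = {
    "Jul 2004": (344.701, 1382.197, 5.227, 54276, 12473),
    "Aug 2004": (343.988, 1076.612, 142.030, 52357, 24081),
    "Sep 2004": (430.302, 1104.396, 27.781, 65986, 6315),
    "Oct 2004": (411.500, 1685.704, 91.422, 59276, 28973),
    "Nov 2004": (439.260, 1567.505, 101.697, 65147, 20529),
    "Dec 2004": (473.170, 1557.429, 21.437, 179084, 5185),
    "Jan 2005": (503.518, 1859.858, 39.422, 56181, 7298),
    "Feb 2005": (521.444, 2816.290, 23.474, 78923, 40096),
    "Mar 2005": (517.580, 2581.407, 58.090, 96124, 13943),
    "Apr 2005": (501.428, 2532.458, 41.562, 86220, 8823),
    "May 2005": (394.318, 2132.460, 7.361, 65582, 7503),
    "Jun 2005": (295.958, 2598.555, 1.786, 61649, 6417),
}

# Totals row as printed in the report.
REPORTED_TOTALS = (5177.167, 22894.870, 561.292, 920805, 181636)


def column_sums(rows):
    """Sum each column across all monthly rows."""
    return tuple(sum(column) for column in zip(*rows))


def check_totals(rows, reported, tol=0.01):
    """Return one boolean per column: does the sum agree with the reported total?

    tol is an assumed allowance for rounding of the monthly values.
    """
    return [abs(s - r) <= tol for s, r in zip(column_sums(rows), reported)]


if __name__ == "__main__":
    for total, ok in zip(column_sums(MONTHLY.values()),
                         check_totals(MONTHLY.values(), REPORTED_TOTALS)):
        print(f"{total:12.3f}  {'agrees' if ok else 'differs'}")
```

Under these assumptions each of the five column sums agrees with the printed totals row to within the rounding tolerance.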

As MAST does not maintain retrieval statistics for DSS, only the number of searches is displayed in order to show the general interest level in these data. Previews are not available for VLA-FIRST data. EUVE data is distributed from HEASARC.



This plot shows the number of datasets downloaded each month per mission during the reporting period










Data Discovery and Search Tools:

MAST has several search tools that complement the individual mission searches.

  • The Cross-Mission Search looks for targets across selected missions. During the past year, GALEX was added to the cross-mission search.
  • MAST maintains a database of journal articles that use data archived at MAST. If a paper uses a MAST image or spectrum, it is included even if the main dataset came from a different source. Where they can be determined, the individual datasets (or, in the case of HST papers, the program) are identified and also stored in the database. ADS and CDS link to MAST to display these identified data. Papers using an individual dataset or program are also linked from the MAST search results pages.
  • MAST maintains a search page that makes it easy to search the VizieR catalog holdings and then either to search and use the catalog or to cross-correlate it with selected MAST missions. These searches are labeled as Cross-Correlation (catalog) in the usage plot.
  • MAST maintains two Simple Image Access Protocol (SIAP) services: one for all the MAST missions except GALEX and the other for GALEX.
  • SkyNode is the interface to a prototype of a "federated" database application called SkyQuery. At the heart of SkyQuery is spatial matching, that is, finding the same objects listed in different catalogs. GALEX has implemented a SkyNode for its holdings.
  • The casJobs service permits GALEX users to maintain their own database of tables and to use such tables in searches of the GALEX database and holdings. Since the service was implemented in late March, 30 users have submitted nearly 6000 queries.
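
The spatial matching mentioned in the SkyQuery item can be illustrated with a small Python sketch. It is not the SkyQuery algorithm itself, which this report does not describe; it is a brute-force nearest-neighbour match under stated assumptions: each catalog is a list of hypothetical `(id, ra_deg, dec_deg)` tuples, and the function names and match radius are invented for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine form, stable for small angles)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    d_ra = ra2 - ra1
    d_dec = dec2 - dec1
    a = (math.sin(d_dec / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin(d_ra / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cross_match(catalog_a, catalog_b, radius_arcsec):
    """For each source in catalog_a, find the nearest catalog_b source within the radius.

    Catalogs are lists of (id, ra_deg, dec_deg); returns (id_a, id_b, sep_arcsec) tuples.
    This is an O(N*M) scan, adequate only for small illustrative lists.
    """
    matches = []
    for id_a, ra_a, dec_a in catalog_a:
        best = None
        for id_b, ra_b, dec_b in catalog_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b) * 3600.0
            if sep <= radius_arcsec and (best is None or sep < best[1]):
                best = (id_b, sep)
        if best is not None:
            matches.append((id_a, best[0], best[1]))
    return matches


if __name__ == "__main__":
    # Hypothetical positions, not taken from any real catalog.
    uv_sources = [("uv-1", 10.0, 20.0), ("uv-2", 50.0, -30.0)]
    optical_sources = [("opt-1", 10.0, 20.0005), ("opt-2", 200.0, 5.0)]
    print(cross_match(uv_sources, optical_sources, radius_arcsec=3.0))
```

A production service would use a spatial index rather than comparing every pair, but the matching criterion (angular separation below a chosen radius) is the same idea.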


Scrapbook updates: In March 2005, links to 2MASS images cached at the NASA/IPAC Infrared Science Archive (IRSA) were included in the scrapbook. The 2MASS data (available as both jpg images and FITS files) are 20'x20' and are centered on the listed MAST observations. New datasets are added to the scrapbook each month from FUSE and HST observations.

High Level Science Products:

MAST continued to solicit good-quality High Level Science Products this year. MAST staff members consulted with several teams regarding the standards and procedures related to archiving HLSP at MAST. The guidelines for submission of HLSP were modified to include new requirements related to making these data available via MAST Virtual Observatory tools.

Several sets of High Level Science Products (HLSP) were delivered and implemented this year.

Since the HLSP are located in an anonymous FTP area, MAST cannot precisely measure the number of distinct users downloading the data. However, we can tabulate the number of distinct domains downloading the data. During the reporting period, 6165 distinct domains downloaded HLSP. The plot below shows the number of distinct domains for each set of HLSP. The numbers of domains for those HLSP sets acquired during the reporting year are shown in pink.
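
The tabulation of distinct domains can be sketched in Python as follows. This is only an illustration of the counting idea. The record layout `(hlsp_set, client_host)`, the host names, and the rule of reducing a host to its last two labels are all assumptions, since the report does not describe the FTP log format or how a domain is defined.

```python
from collections import defaultdict


def domain_of(host):
    """Reduce a host name to its last two labels (a crude, assumed definition of 'domain')."""
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])


def distinct_domains(records):
    """Count distinct client domains per HLSP set and overall.

    records: iterable of (hlsp_set, client_host) pairs (hypothetical layout).
    Returns (per_set_counts, overall_count).
    """
    per_set = defaultdict(set)
    for hlsp_set, host in records:
        per_set[hlsp_set].add(domain_of(host))
    overall = set().union(*per_set.values())
    return {name: len(domains) for name, domains in per_set.items()}, len(overall)


if __name__ == "__main__":
    # Invented download records for illustration only.
    sample = [
        ("GOODS", "ftp1.example.edu"),
        ("GOODS", "ftp2.example.edu"),
        ("GOODS", "host.example.org"),
        ("UDF", "pc.example.edu"),
    ]
    print(distinct_domains(sample))
```

Note that the overall count is taken over the union of domains, so a domain that downloads several HLSP sets is counted once in the overall figure but once per set in the per-set figures.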



A complete listing of HLSP hosted at MAST is below. Although MAST provides an interface to the WFPC2 Associations, the data are held at CADC. MAST distributes the data via a proxy.

High Level Science Product Holdings

High Level Science Product Set                                Size        Number of Files
10 Lac Spectral Atlas (HST/GHRS)                              5.3 MB      67
AGN and Quasar Spectral Atlas                                 73.8 MB     451
CoolCAT - Atlas of Cool Stars                                 1.4 GB      1388
Copernicus Atlas of 6 Selected Stars                          3.6 MB      25
EUVE Spectral Atlas of Stars (EUVE)                           29.4 MB     490
GOODS: The Great Observatories Origins Deep Survey            96.3 GB     1342
GRAPES - Grism-ACS Program for Extragalactic Science          78 MB       1401
Grayscale of Time Variation of gamma Cas Near SiIV Doublet    5.0 MB      7
Hubble Deep Field                                             2.1 GB      181
Hubble Deep Field South                                       7.8 GB      178
Hubble Helix Observations                                     13.8 GB     32
Magellanic Cloud Planetary Nebulae                            721 MB      1620
OB Stars (Galactic): FUSE Spectral Atlas                      30 MB       184
OB Stars (Magellanic): FUSE Spectral Atlas                    1.2 MB      66
Pre-Main Sequence Stars: IUE Spectral Atlas                   10.7 MB     733
Procyon (FV-IV) Spectral Atlas                                1.2 MB      14
Quasar Spectrum HST/FOS                                       0.6 MB      4
Quasar Spectrum FUSE                                          19.3 MB     1
Search Field from a Search for Kuiper Belt Objects            3.7 GB      8
The Medium Deep Survey                                        11.9 GB     4726
Ultra Deep Field                                              30.7 GB     2252
Ultraviolet Images of Nearby Galaxies                         728.0 MB    334
WFPC2 Archival Parallels                                      16.6 GB     4087
alpha Ori Spectral Atlas                                      4.0 MB      60
chi Lupi Spectral Atlas                                       22.7 MB     156
TOTAL                                                         181.73 GB   19807

New plotting and graphical display tools:

  • MAST users may now bring up image previews in Aladin, an interactive software sky atlas that allows the user to visualize digitized images of any part of the sky and to superimpose entries from astronomical catalogs. Aladin is provided and maintained by the "Centre de Données astronomiques de Strasbourg" (CDS).

New protocols and IVOA-related services:

  • MAST staff members have been using draft versions of the VO SED data model and the Simple Spectral Access Protocol (SSAP) to begin planning access to the MAST spectral holdings through the VO. Working with colleagues from the CADC and the ST-ECF, staff members are designing a "spectral container" that will standardize format for spectra from different missions and conform to the evolving data model and SSAP.

  • MAST created an Image Scrapbook web service. The service is a variation of the MAST SIAP web service. Database and software changes were implemented that flag and serve the representative observations that are included in the Scrapbook. The MAST Image Scrapbook web service was then entered into the IVOA registry. The new web service permits users and archival services to obtain images through an ftp-get request for incorporation into their tools. IRSA is now using this service as a part of its IRSA-MAST Scrapbook tool.

New interface pages and search tools:

  • MAST has imported and tailored the "CasJobs" tool, adapted from a release by the JHU/SDSS team, where it was originally written for queries of SDSS objects. CasJobs allows specialized SQL access to the GALEX database, most commonly for submitting large or long-running queries in the background and for matching GALEX entries to input lists of object names or coordinate positions. Users may keep the results indefinitely, and search and manipulate them, in a local "My DataBase" assigned to each registered user on the GALEX/CasJobs server. This allows one to work in My DataBase with matches of physical parameters on objects observed by GALEX and SDSS (thanks to a preexisting list of cross-matched objects observed by both missions). A tutorial has been written and is maintained to help novice users navigate through many of the features of the web site; it includes links to sites where the user may learn to write SQL queries.

  • Many MAST searches can be run as web services or as HTTP GET requests. A web page documenting the available web services is now available.
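
As an illustration of what running a search as an HTTP GET request involves, the Python sketch below assembles a query URL in the style of a Simple Image Access Protocol request. The `POS`, `SIZE`, and `FORMAT` parameter names follow the SIAP convention; the base URL is a placeholder and not a real MAST endpoint, and the function name `build_siap_url` is invented for this example. No network request is made.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_siap_url(base_url, ra_deg, dec_deg, size_deg, image_format="image/fits"):
    """Build a SIAP-style GET URL for a position (degrees) and search-region size (degrees)."""
    if not (0.0 <= ra_deg < 360.0) or not (-90.0 <= dec_deg <= 90.0):
        raise ValueError("coordinates out of range")
    params = {
        "POS": f"{ra_deg},{dec_deg}",  # right ascension, declination
        "SIZE": f"{size_deg}",         # angular size of the search region
        "FORMAT": image_format,        # requested image format
    }
    # urlencode percent-escapes the comma and slash in the parameter values.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder endpoint; substitute the documented service URL.
    url = build_siap_url("https://archive.example.invalid/siap", 150.1, 2.2, 0.2)
    print(url)
    print(parse_qs(urlsplit(url).query))
```

Because the whole query is carried in the URL, the same request can be issued from a browser, a script, or another archive's tools.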


Enhancements of User Interface Pages and Tools:

  • MAST implemented a more sophisticated website design in December 2004. The website was reorganized to attempt to make it easier for users to find items of interest. Pull down menus made it possible to "flatten" the structure so that more of the website is available from the top pages, thus reducing the number of "clicks" needed to navigate the website.

    As part of the new website, a tutorial was written with an introduction to MAST, the various search tools and interfaces. The tutorial includes some hints and techniques for using the tools for specialized searches.

    MAST enhanced the mission cross-correlation search and the VizieR/Cross-correlation search routines to include the GALEX data.

  • The user interface to the GALEX database is a web site that provides the capability of querying the data by object name, coordinates, or physical property either through a "MAST-style" simple query form or by detailed sample queries written in SQL. The user may modify the latter to generate customized queries. Results ("Explore") pages are active, thereby allowing users to query objects in the vicinity of the initially queried position. MAST has obtained a list of SDSS objects contained in the GALEX Release 1 database, thereby allowing cross correlations on properties from SDSS investigations, such as optical magnitudes and redshifts.

  • The VizieR search and cross-correlation tool was improved using suggestions made by MAST Users Group members. Calibration files are now properly removed from cross correlations with HST instruments, the search radii used for each mission are now displayed, and all the VizieR catalog values are displayed on the results page.

    MAST has written a tutorial to demonstrate how to use the spectrum combination/filtering functions of the Specview tool written by I. Busko. This tool has been enhanced to allow the use of trusted applets to be downloaded onto one's computer and to access local spectral data files. The thrust of the tutorial is to show users how to combine (coadd and merge) spectra, detrend them, and to compute Fast Fourier filtered spectra of these results. The results may be stored and reused for later sessions.


Outreach to the user community:

  • The MAST Users Group (MUG) meets annually to provide input and guidance in setting priorities for expanded services. This year the Users Group met in February 2005. Presentations to the MUG and the MUG reports are posted on the web.
  • MAST conducted a comprehensive survey of user preferences, attitudes, and search practices. Results and responses to some comments were presented to the MAST Users Group and also placed on the web.
  • MAST provides electronic newsletters as needed. This year one newsletter was distributed in April 2005.
  • MAST staff participated in several conferences or workshops (AAS, ADASS, IVOA and NVO).


The Astrophysical Data Centers Executive Council (ADEC): 

ADEC representatives White and Kamp attended an ADEC meeting in Pasadena on October 29, 2004. The status of implementation of dataset identifiers at the various archive centers was discussed. The use of ADS identifiers will be advertised in a variety of ways, including an announcement at AAS and announcements in participating journals. There was a brief discussion on the efficacy of reducing the number of scripting languages used by the various ADEC members. The conclusion was that this was not practical, but that ADEC members might share libraries that could be of common use.

Kamp participated in an ADEC telecon on March 25, 2005. ADEC discussed the possible joint NASA/NSF funding of an NVO/all datacenters proposal from FY07. ADEC members planned to prepare slides to show how our major goals fit into the NASA strategy planning. There was additional discussion of the dataset identifiers. IRSA is setting up the tagging for complex Keck datasets. The link will actually start a query that presents the whole data structure, meaning that one tag is no longer attached to an individual file. Other issues such as the dismissal of a service remain to be resolved. MAST proposed that all data centers impose the FITS standard strictly. Some centers do not have the manpower to impose the FITS standard and use more relaxed criteria for flagging datasets that are not correct. For example, IRSA checks that the file can be read using the standard cfitsio library from Goddard, while NED checks that files submitted to them can be read by Aladin and other common tools (probably an even weaker criterion). MAST prefers not to distribute data that fail when tested with a more thorough tool such as fitsverify. We reached an agreement to work towards a common datacenter policy.
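
To make the idea of graded FITS checks concrete, the Python sketch below tests three basic structural properties of a FITS file: the length is a multiple of the 2880-byte block size, the first 80-character card is `SIMPLE = T`, and an `END` card is present. It is far weaker than fitsverify and is not how cfitsio, Aladin, or any data center validates files; the function and helper names are invented for illustration.

```python
BLOCK_SIZE = 2880  # FITS files are organized in 2880-byte blocks
CARD_SIZE = 80     # each header record ("card") is 80 characters


def make_card(keyword, value):
    """Build a fixed-format header card: keyword in columns 1-8, value right-justified to column 30."""
    return f"{keyword:<8}= {value:>20}".ljust(CARD_SIZE)


def build_header(cards):
    """Join cards and pad with spaces to a whole number of blocks (test helper)."""
    text = "".join(cards)
    padded_len = -(-len(text) // BLOCK_SIZE) * BLOCK_SIZE
    return text.ljust(padded_len).encode("ascii")


def basic_fits_check(data):
    """Return a list of problems found; an empty list means these basic checks passed."""
    problems = []
    if len(data) == 0 or len(data) % BLOCK_SIZE != 0:
        problems.append("file length is not a positive multiple of 2880 bytes")
    first = data[:CARD_SIZE].decode("ascii", errors="replace")
    if not (first.startswith("SIMPLE  =") and first[10:30].strip() == "T"):
        problems.append("first card is not SIMPLE = T")
    # Scan card by card for the END record that closes the header.
    found_end = False
    for offset in range(0, len(data) - CARD_SIZE + 1, CARD_SIZE):
        card = data[offset:offset + CARD_SIZE].decode("ascii", errors="replace")
        if card.rstrip() == "END":
            found_end = True
            break
    if not found_end:
        problems.append("no END card found")
    return problems


if __name__ == "__main__":
    header = build_header([make_card("SIMPLE", "T"), make_card("BITPIX", "8"),
                           make_card("NAXIS", "0"), "END".ljust(CARD_SIZE)])
    print(basic_fits_check(header))
```

A file can pass checks of this kind and still violate the standard in many ways, which is the gap between "readable by common tools" and "passes a thorough verifier" that the discussion above turns on.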

Coordination activities:

The web service for the MAST Scrapbook was created to provide an easy and timely method for IRSA to see what data were marked as "representative". They use it in a tool they developed based on our Scrapbook. MAST also worked with IRSA to incorporate links to the 2MASS previews within the MAST Scrapbook.


MAST Literature Links:  

The publications database and the links between scientific papers and the referenced MAST datasets are regularly updated as new citations become available through the ADS. Below is a plot showing the number of papers published in refereed journals during the reporting period (July 2004 through June 2005).

During the past year, MAST began to track the number of citations per paper in the journal database. We obtain the total number of citations per paper from the ADS. The count is updated at least once per month. Below is a chart that shows the average number of citations for papers published in each year. The citation record for IUE from 1978 through 1990 is not included in this plot. The "fall-off" in the average number of citations per paper for more recently published articles is due to the lag between publication of a paper and its citation in a later publication. Meylan, Madrid and Macchetto published a paper in PASP, 116, 790, entitled "HST Science Metrics". These authors state that the peak of the citation rate occurs about 2 years after publication.

We show below a plot of the average number of citations per paper over the publication lifetime per mission. (If papers from the years 2004-2005 are excluded, the average number of citations per paper increases.)
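
The effect noted in the parenthetical remark is simple arithmetic: recent papers have had little time to accumulate citations, so removing them raises the mean. The Python sketch below shows the calculation with invented placeholder numbers; they are not MAST or ADS statistics, and the function name `mean_citations` is illustrative.

```python
def mean_citations(papers, exclude_years=()):
    """Average citations per paper, optionally skipping some publication years.

    papers: iterable of (publication_year, citation_count) pairs.
    """
    counts = [count for year, count in papers if year not in exclude_years]
    if not counts:
        raise ValueError("no papers left after exclusion")
    return sum(counts) / len(counts)


if __name__ == "__main__":
    # Placeholder values chosen only to illustrate the citation lag.
    papers = [(2001, 30), (2002, 24), (2003, 18), (2004, 4), (2005, 0)]
    print(mean_citations(papers))
    print(mean_citations(papers, exclude_years=(2004, 2005)))
```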

STAFFING CHANGES AT MAST

During the year Rachel Somerville left MAST and Alberto Conti scaled back participation in MAST activities. Shui-Ay Tseng joined the staff. Conti and Tseng are partially salaried from the MAST contract.


PLANS FOR COMING YEAR

Future datasets:

MAST will receive no datasets from new missions during the coming year. Data will continue to be released steadily from FUSE, and GALEX data will be released as GALEX Release 2 in late 2005.

Future Services for Ongoing Missions:

  • MAST will provide a Spectral Services utility that permits users to perform cone searches on objects for which GALEX grism spectra are available. The service will permit users to manipulate spectra with a coplotting tool. The MAST/GALEX group is also developing footprint services that will permit users to perform general cross correlations of objects in nonoverlapping GALEX sky fields and of objects observed with GALEX and other survey missions. Shopping cart services are being developed for GALEX public release data to permit users to create their own customized FITS files of collections of objects of interest.

  • MAST hopes to be able to provide footprint services for a few legacy missions or instruments by the end of the next reporting period.

  • MAST will ingest a new FUSE data product: a merged spectrum called an "nvo" file. This addition will permit users to work with the entire FUSE wavelength range from a single file and will also permit FUSE data to be conveniently accessed by IVOA search tools.

  • MAST will develop the first-generation database for the testing of Kepler data (Kepler launch has slipped to mid-2008).

  • MAST will distribute several new High Level Science Products from a range of GI and HST-Treasury programs. These will include a (FUSE) Wolf-Rayet spectral atlas, a FUSE atlas of galaxies, GEMS (HST), HPOL data (in support of WUPPE), GOODS (version 2; HST), COSMOS (HST), Helix Nebula (HST/NICMOS), Eta Carinae (HST/STIS), and a few B-star UV spectral atlases (IUE).