Preface

In the beginning was the Hubble Data Archive (HDA), which contained data from HST. Users searched the catalog, sometimes called the DADS catalog, using StarView, a software package developed at STScI. Data were retrieved through the Data Archiving and Distribution System (DADS) and written to tapes, which were mailed to the user. As networks developed and became robust, the demand for web-based search and retrieval grew. A web interface was first made available in 1994. A few years later, the Multimission Archive at STScI (MAST), funded under a separate contract, was created to host data from optical, UV and IR space-based missions. At about the same time, on-the-fly calibration (OTFC), later replaced by on-the-fly recalibration (OTFR), was developed for some HST data. At STScI a user could find data from many missions, some available for immediate download, some delayed by OTFR processing. StarView could be used to search for data from some missions, while the web interface, sometimes called MAST web, could be used to search all missions. Some complicated searches of the HST catalog, which StarView supports, are not supported by the web. The alphabet soup of acronyms continued to grow as more missions and tools were added to the archive. The distinction between HDA and MAST lessened except in funding.

In this document the holdings at STScI, whether archived and maintained under the Hubble contract or the MAST contract, are treated as one archive, called the archive, as that's what it looks like to an outside user. No distinction is made between datasets stored on spinning disk and those stored in archive appliances. Where it is important, the distinctions between HDA and MAST are delineated. For example, there are some differences when retrieving HST or FUSE data, and some acknowledgement distinctions are made, as required by the funding sources.

While this document was in revision, work on the Hubble Legacy Archive (HLA) began. As the HLA is under development, it will not be discussed in this version of the Archive Manual.

1. Introduction to the Archive

Archives are an important component of astronomical research programs. Indeed, the Hubble Space Telescope (HST) program regularly calls for proposals for funded archival research (by U.S. investigators using HST data) as part of the HST Call for Proposals. However, research utilizing any data in the archive does not have to be funded by the archival research program. The archive is available to any individual with the interest and hardware capabilities required to analyze data from any mission in the archive. Note: The archive is not a repository of pretty, heavily processed pictures; HST press releases can be downloaded from the Web site maintained by the Office for Public Outreach (OPO) at http://hubblesite.org.

1.1 Data in the Archive

As noted above, the archive primarily contains data from UV, optical and IR space-based missions, some of which are still active (e.g., HST and GALEX). A complete listing of the missions is available on the MAST web site, http://archive.stsci.edu/missions.html. Table 1.1 lists the archive holdings as of November 2007. Data from the Kepler Mission, scheduled for launch in 2008, and the EPOCh (Extrasolar Planet Observation and Characterization) portion of the EPOXI mission will be part of the archive. See the EPOXI site, http://epoxi.astro.umd.edu/, for information on the mission.

Table 1.1: Archive Holdings
Mission      Instrument  Description
ASTRO        -           ASTRO Observatory
ASTRO        HUT         Hopkins Ultraviolet Telescope
ASTRO        UIT         Ultraviolet Imaging Telescope
ASTRO        WUPPE       Wisconsin Ultraviolet Photo-Polarimeter Experiment
Copernicus   -           Copernicus
DSS          -           Digitized Sky Survey
EUVE         -           Extreme Ultraviolet Explorer
FUSE         -           Far Ultraviolet Spectroscopic Explorer
GALEX        -           Galaxy Evolution Explorer
GSC          -           Guide Star Catalogs
HPOL         -           Halfwave Spectropolarimeter
HST          -           Hubble Space Telescope
IUE          -           International Ultraviolet Explorer
ORFEUS       -           Orbiting Retrievable Far and Extreme Ultraviolet Spectrometers-SPAS
ORFEUS       BEFS        Berkeley Extreme and Far-UV Spectrometer
ORFEUS       IMAPS       Interstellar Medium Absorption Profile Spectrograph
ORFEUS       TUES        Tübingen Ultraviolet Echelle Spectrometer
VLA-FIRST    -           Very Large Array - Faint Images of the Radio Sky at Twenty-cm
XMM-OM       -           X-ray Multi-Mirror Telescope - Optical Monitor data

For HST and FUSE, the archive contains the calibration reference files, such as flat fields.

For HST, the archive contains engineering files (aka observation logs or jitter files) that may be useful for diagnosing some questions about observations, and the spacecraft ephemeris.

Retrieved data from all missions are primarily in FITS format.

The archive has searchable catalogs of the data, consisting of information about the observations and targets. For HST data, the catalog is populated from the header keywords of the data files and is quite extensive.
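
As an illustration of what "header keywords" means in practice, the following Python sketch reads keyword/value pairs from the start of a FITS file using only the standard library. It relies on the FITS convention that a header is a sequence of 80-character records ending with an END record. The function name and the example header are made up for this sketch, all values are returned as plain strings, and a dedicated FITS library would be the better choice for real work.

```python
CARD_LENGTH = 80  # every FITS header record ("card") is 80 ASCII characters


def read_header_keywords(raw):
    """Parse the first FITS header in `raw` (bytes) into a keyword -> string dict."""
    header = {}
    for offset in range(0, len(raw), CARD_LENGTH):
        card = raw[offset:offset + CARD_LENGTH].decode("ascii", errors="replace")
        keyword = card[:8].strip()
        if keyword == "END":        # END marks the last card of the header
            break
        if card[8:10] != "= ":      # skip COMMENT, HISTORY and blank cards
            continue
        value = card[10:].strip()
        if value.startswith("'"):
            # Quoted string: keep the text up to the next quote.
            # Simplification: doubled quotes inside strings are not handled.
            closing = value.find("'", 1)
            value = value[1:closing] if closing != -1 else value[1:]
            value = value.rstrip()
        else:
            # Numbers and logicals: drop the optional "/ comment" part.
            value = value.split("/", 1)[0].strip()
        header[keyword] = value
    return header


def make_card(text):
    """Pad a header line to a full 80-character card (helper for the example)."""
    return text.ljust(CARD_LENGTH).encode("ascii")


# Made-up header, for illustration only.
example = b"".join([
    make_card("SIMPLE  =                    T / file conforms to FITS standard"),
    make_card("TELESCOP= 'HST     '           / telescope used"),
    make_card("COMMENT made-up header for illustration"),
    make_card("END"),
])
keywords = read_header_keywords(example)
```

For a retrieved file, the same function could be applied to the bytes returned by open(path, "rb").read().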

The archive also holds community-contributed high-level science products (HLSPs). These are fully processed images and spectra that are ready for scientific analysis. A complete list is available at http://archive.stsci.edu/hlsp.

1.2 Other Archives Containing HST Data

Copies of the HST data and the archive catalog are maintained at the European Space Astronomy Centre (ESAC) in Madrid, Spain (http://www.cosmos.esa.int/web/hst/) and at the Canadian Astronomy Data Centre (CADC) in Victoria, Canada (http://cadcwww.dao.nrc.ca/). There is a significant amount of collaboration and coordination between STScI, ST-ECF and CADC to ensure that the data held in common are identical and the basic services provided are similar. However, the archives are not identical. Therefore, European and Canadian astronomers should consult the ST-ECF and CADC Web pages or contact the ST-ECF (stdesk@eso.org) or CADC (cadc@hia.nrc.ca) for information about using their archive systems.

1.3 Proprietary Data

HST data become available to the astronomical community upon the expiration of a proprietary period. For HST, most general observer (GO) and guaranteed time observer (GTO) observations have proprietary periods of a year, but some observations have shorter or longer proprietary periods. Nearly all calibration observations are made public immediately upon receipt. The archive catalog contains information on these proprietary observations.

Proprietary datasets may be retrieved only by GOs and GTOs with the appropriate authorization. See the section on Registration/STScI Single Sign On (SSO) for account information.

1.4 Publication of Archival Research Results

The results of investigations with archival data are generally published in the scientific literature. Archive staff examine the literature for such publications in order to include them in the reference database, and display them as part of the search results. Clearly indicating what datasets are used in your publication will improve the accuracy of this task. Some publishers require authors to use the ADS (Astrophysics Data System) naming convention for datasets. The archive participates in this effort and provides a link, http://archive.stsci.edu/pub_dsn.html, on its main web page to information on the naming convention and verification tools.

1.4.1 HST Data

All publications based on HST archival data should carry the following footnote:

"Based on observations made with the NASA/ESA Hubble Space Telescope, obtained from the data archive at the Space Telescope Science Institute. STScI is operated by the Association of Universities for Research in Astronomy, Inc. under NASA contract NAS 5-26555."

In addition, if the archival research was supported by a grant from STScI, the publication should also carry the following acknowledgment at the end of the text:

"Support for this work was provided by NASA through grant number ________ from the Space Telescope Science Institute, which is operated by AURA, Inc., under NASA contract NAS 5-26555."

Please send one preprint or reprint of each refereed publication based on HST archival research to the following address:

Librarian
Space Telescope Science Institute
3700 San Martin Drive
Baltimore, MD 21218 USA

1.4.2 All Other Data

Publications based on all other data from the archive should carry the following acknowledgment:

"Some/all of the data presented in this paper were obtained from the Multimission Archive at the Space Telescope Science Institute (MAST). STScI is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS5-26555. Support for MAST for non-HST data is provided by the NASA Office of Space Science via grant NAG5-7584 and by other grants and contracts."

See the MAST Data Use Policy for the current MAST grant number.

1.5 Registration/STScI Single Sign On

Starting in April 2015, STScI is moving to a Single Sign On (SSO) system for access to various systems, including the archive. Users who do not already have an STScI SSO account (e.g., non-STScI employees, or occasional users who have not used their archive account in the past few years) should obtain STScI SSO credentials via STScI's SSO Portal. PIs and Co-Is on all accepted proposals already have SSO credentials, which were used to submit the proposal. These same credentials are used for archive access.

NB: An STScI SSO account is not required to search the Archive and/or retrieve public data. An anonymous user has full access to the archive catalog, including previews of public data. All public data may be retrieved anonymously. However, only STScI SSO credentialed and authorized users may retrieve proprietary data.

PIs, GOs and GTOs can retrieve their proprietary data from the archive. To do so, they must have STScI SSO credentials (see above) and be authorized users. PIs should request authorization for themselves when they register for their STScI SSO account. Only PIs may authorize others to access their data. If a Co-I wishes to access the data, the PI on the proposal must send e-mail to archive@stsci.edu stating the proposal ID number and the identities of everyone who should be able to retrieve the data.

For questions/issues/problems regarding your STScI SSO Portal credentials, please email support@stsci.edu. For questions about archive issues, please email archive@stsci.edu.

1.6 Searching for Data

There are many ways of searching the archive for data of interest. These include, but are not limited to, StarView, MAST (web based), SpecView, Aladin and many Virtual Observatory (VO) services. There are also special interfaces or tools, such as the CASJobs implementation for searching GALEX data. Some of these interfaces, applications and tools are described here, others are described in the MAST chapter. Still others are not discussed at all. See the MAST chapter and the on-line help for details.

1.6.1 StarView

StarView is a web-based tool that provides knowledgeable users the opportunity to build their own archive catalog search pages. The pages may be saved, so they persist from session to session, and may be shared with other users.

1.6.2 MAST

The archive holdings can be accessed via the web interface at http://archive.stsci.edu. Most users will find the Web interface more convenient than StarView to use, as it can be accessed by any Web browser, and requires no special software. Note: while the Web interface does not provide the special purpose functions of StarView as regards HST data, it does provide access to additional functions, including previews and links to references.

Through the web interface, MAST provides a separate search page for each mission. These pages are accessible via the main MAST webpage at http://archive.stsci.edu. In addition, MAST provides several tools (cross-mission search, catalog cross correlation (VizieR), spectral co-plotters, etc.) to give a more global look at the archive holdings.

The web interfaces can also be accessed using HTTP GET requests. The GET request allows search parameters to be included in the URL. As such, they can be called from within programs to automate data searches. The results can be returned in a variety of formats including HTML, VOTable XML format, Excel spreadsheet format and comma-separated values (CSV), which can simplify utilization by user-written programs. See http://archive.stsci.edu/VO/mast_services.html for more information. Also see the MAST chapter for more tools and services.
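
The idea of a GET-based search can be sketched in Python as follows. The endpoint path and the parameter names (target, outputformat) are assumptions chosen for illustration, and the sample CSV text is made up; the services page cited above should be consulted for the parameters actually supported. The sketch only builds the URL and parses a CSV body, so it runs without network access.

```python
import csv
import io
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, used only for illustration.
BASE_URL = "http://archive.stsci.edu/hst/search.php"


def build_search_url(base_url, **params):
    """Return a GET URL with the search parameters encoded in the query string."""
    return base_url + "?" + urlencode(params)


def parse_csv_results(text):
    """Turn a CSV response body into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


url = build_search_url(BASE_URL, target="M101", outputformat="CSV")

# Made-up response body standing in for what a search might return.
sample_body = "Dataset,Target Name\nEXAMPLE01,M101\n"
rows = parse_csv_results(sample_body)
```

In a real program the response body would come from urllib.request.urlopen(url) rather than from a sample string.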

1.6.3 Virtual Observatory Services

The Virtual Observatory (VO) project is creating standards to facilitate the discovery and joining of data among astronomical archives. Participating projects include the US National Virtual Observatory (NVO), which is a member of the International Virtual Observatory Alliance (IVOA). MAST is working to make the archive data compliant with the current VO-defined standards and protocols.

An example of a VO interface is the VO DataScope Data Inventory Service, commonly known as DataScope. Using this interface, the archive holdings can be searched simultaneously with data from many other surveys and missions. DataScope was developed for the NVO and is hosted at HEASARC. The survey and mission data accessed include SDSS, 2MASS, RASS, HST, Chandra and EGRET.

1.6.4 CASJobs

In addition to a standard web search form, for GALEX (GALaxy Evolution eXplorer), a separate interface, called CASJobs, exists. CASJobs is a Batch Query Service that allows SQL access to the GALEX databases. Registered CASJobs users receive local storage on the database server, where tables may be created using the "select into" statement. This storage is called MYDB. Tables created in MYDB may be extracted to FITS, VOTable, or CSV using the extract page. Each user controls their own MYDB, which means the user can drop tables in their MYDB to make more space. MYDB is a proper database: tables in MYDB may be joined with tables in any GALEX target database. For more details on CASJobs, go to http://galex.stsci.edu/casjobs/. See the GALEX chapter for more information on GALEX.
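
A "select into" query of the kind described above might be composed as in the following Python sketch. The table name, column names and selection condition are hypothetical, and the sketch assumes the destination table is addressed as mydb.<table>; the CASJobs help pages should be checked for the actual schema and syntax.

```python
def select_into_mydb(columns, source_table, dest_table, where=None):
    """Compose a 'select ... into mydb.<table>' query string for a batch query."""
    sql = "SELECT {} INTO mydb.{} FROM {}".format(
        ", ".join(columns), dest_table, source_table
    )
    if where:
        sql += " WHERE " + where
    return sql


# All table and column names below are made up for illustration.
query = select_into_mydb(
    ["objid", "ra", "dec"],
    source_table="example_catalog",
    dest_table="bright_sources",
    where="nuv_mag < 18",
)
```

The resulting string would be pasted into, or submitted through, the CASJobs query page; the new table would then appear in the user's MYDB.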

1.7 Retrieving Data

For most missions in the archive, retrieval can be as simple as clicking on the dataset name. However, for HST and FUSE data the procedure is different.

As noted above, public data may be retrieved anonymously. Only STScI SSO credentialed (registered) and authorized users (see section 1.5, above) can retrieve proprietary data. Once you have STScI SSO credentials, you can retrieve your proprietary data by using MAST to select your datasets and choose the delivery mode.

1.7.1 HST and FUSE Datasets

Whatever means is used to search the archive, when retrieval is requested for HST or FUSE data the Retrieval Options page is displayed. You are required to select a delivery option for the data. Current options are:

  • FTP/SFTP, which will send the data directly to your home computer.

  • STAGE, which will place the data on the archive computer staging disk. You must log into this computer and FTP your data to your home computer.

  • DVD/external hard disk, which will be mailed to you after the data are retrieved.

For every request you will be asked for your STScI SSO credentials. For anonymous retrieval, the default, enter your e-mail address. For an SFTP/FTP retrieval, you will be required to give your home username and password so that the retrieved data can be written to your disk.

When retrieving a large number of datasets, it is better to submit several smaller requests than one large one. To avoid delay, please keep each request to no more than about 100 to 200 datasets.
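
Splitting a long list of dataset names into smaller requests can be sketched in Python as follows. The function name, the default batch size of 100 and the dataset names are illustrative choices, not archive requirements.

```python
def split_into_requests(dataset_names, max_per_request=100):
    """Split a list of dataset names into consecutive batches of limited size."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    return [
        dataset_names[start:start + max_per_request]
        for start in range(0, len(dataset_names), max_per_request)
    ]


# Made-up dataset names, for illustration only.
names = ["DATASET{:04d}".format(i) for i in range(250)]
batches = split_into_requests(names)
```

Each batch would then be submitted as its own retrieval request.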

For large volume requests (more than 500 Megabytes) or slow Internet connections, consider requesting a DVD of the data. For very large requests (discussed below), contact the hotseat at archive@stsci.edu to make arrangements to transfer your data via external hard disk.

System resources required for OTFR may significantly delay availability of the data to programs requesting large volumes of data. Even smaller requests may sometimes encounter delays because competing requests fluctuate greatly. If you are making a large request (greater than 350 ACS, 700 STIS, 700 NICMOS, or 1500 WFPC2 datasets at one time), please submit the requests early on a Friday (Eastern Time) for weekend processing to avoid peak processing times.

Large requests can only be started by Archive staff. Operations is not staffed after hours or during the weekend, so make sure to submit large requests during business hours. To avoid a logjam of multiple large requests during a given weekend, please contact the Archive Hotseat (410-338-4547 or archive@stsci.edu) prior to submitting your request (weekend time will be granted on a first come, first served basis).

As a guide to the system capabilities, Table 1.2 gives the maximum number of datasets that should be submitted in a single large request and the maximum number of datasets that can be processed per weekend (above the "normal" load). See the large request information page for current guidelines on what is a large or very large request.

Table 1.2: Definition of Large Datasets
Instrument  Datasets per request  Maximum datasets per weekend
ACS         500                   1000
NICMOS      500                   2000
WFPC2       750                   4500
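
As a rough planning aid, the figures in Table 1.2 can be turned into an estimate of how many requests and weekends a large retrieval would need. The Python sketch below copies the table values as printed here; the function name is made up, and the large request information page remains the authority on current limits.

```python
import math

# (datasets per request, maximum datasets per weekend), copied from Table 1.2.
TABLE_1_2 = {
    "ACS": (500, 1000),
    "NICMOS": (500, 2000),
    "WFPC2": (750, 4500),
}


def plan_large_retrieval(instrument, n_datasets):
    """Return (requests needed, weekends needed) for a large retrieval."""
    per_request, per_weekend = TABLE_1_2[instrument]
    requests_needed = math.ceil(n_datasets / per_request)
    weekends_needed = math.ceil(n_datasets / per_weekend)
    return requests_needed, weekends_needed


acs_plan = plan_large_retrieval("ACS", 1200)
```

Instruments that do not appear in the table raise a KeyError, which is a reminder to ask the Archive Hotseat rather than guess.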

After a request is submitted, the archive system processes it. An e-mail notification is sent immediately when the system has accepted the request and again when the system has completed processing the request, indicating whether or not the transfer was successful. The messages will go to the e-mail address specified on the Retrieval Options page (for anonymous retrieval) or, if an STScI SSO account was used, to the e-mail address associated with that STScI SSO account.

How long a retrieval takes depends on a variety of factors, including the type of data in the request, the size of the request, the number of requests in the system at the time, and the destination of the request (the Internet connections between STScI and some sites, especially those overseas, are sometimes a significant source of delay). If everything is running smoothly, one should expect a median turnaround time of an hour. If it takes more than one day, and you do not think any of the factors listed above are playing a significant role, please contact us at archive@stsci.edu.

If the data are retrieved to the staging disk (STAGE option), the data will be written to a subdirectory. Each data retrieval request will be in its own subdirectory, identified by the request ID number, which will be included in the notification message you will receive. To find your data, from your home account type:

% ftp archive.stsci.edu
% login: STScI SSO credentials
% password: ******
ftp> binary
ftp> cd /stage/username/request_ID_number
ftp> ls

Note the use of binary in the above example. Not all ftp clients automatically set the data mode. Any attempt to transfer FITS files in ASCII mode will result in corrupted data files, with no errors from ftp.
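
The same transfer can be scripted. The Python sketch below uses the standard ftplib module, whose retrbinary method always transfers in binary mode, so the ASCII-mode problem described above cannot occur. The function names, the local directory name and the login placeholders are assumptions for illustration; the staging path follows the pattern shown in the session above.

```python
import os
from ftplib import FTP


def fetch_staged_files(ftp, remote_dir, local_dir):
    """Download every entry listed in remote_dir in binary mode; return the names."""
    os.makedirs(local_dir, exist_ok=True)
    ftp.cwd(remote_dir)
    names = ftp.nlst()  # assumes the request directory holds only plain files
    for name in names:
        with open(os.path.join(local_dir, name), "wb") as out:
            # retrbinary switches the connection to binary (image) mode itself.
            ftp.retrbinary("RETR " + name, out.write)
    return names


def main():
    """Example session; replace the placeholders with real values before use."""
    ftp = FTP("archive.stsci.edu")
    ftp.login("username", "password")  # placeholders, not real credentials
    fetch_staged_files(ftp, "/stage/username/request_ID_number", "retrieved_data")
    ftp.quit()
```

Nothing is contacted until main() is called explicitly.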

For anonymous retrievals, the current default, enter your e-mail address. The subdirectory will be /stage/anonymous/request_ID_number.

After locating your data on the archive host, transfer them across the Internet using FTP as described in a later section. Because the disk space available to each user within the data directory is limited, the files created for you are temporary and are deleted automatically after a few days. Please transfer your data promptly and delete your data from the staging area after the transfer has completed. When disk space is tight, we would appreciate notification that you have completed copying over your retrieved files so that we may delete them. (Send email to archive@stsci.edu.)

1.7.2 MAST Data

All non-HST/FUSE data retrieved through the MAST web search interface are downloaded directly to the user's system as a tar or zip file. The data are also all available via anonymous ftp (archive.stsci.edu, cd pub/mission/data). Users will need to know the desired dataset names when using this option. More information on MAST retrievals is provided in the MAST Chapter of this manual.

1.8 User Support

The Operations and Engineering Division (OED) at STScI is committed to providing outstanding and timely support to archive researchers. We provide assistance and advice on methods and strategies for finding information in the archives and provide a hotseat staff for researchers who have specific problems or questions about using the archive. Archive researchers who need extensive advice on search strategies or help analyzing their astronomical data can visit STScI.

1.8.1 Support for Archival Research

The Operations and Engineering Division (OED) at STScI is responsible for the management, scientific and technical oversight, and operation of the STScI archives. OED staff also support astronomers who wish to use public data from the archives for their own research. To provide assistance for archive researchers, the OED staff includes archive specialists (with bachelor's- or master's-level degrees in physics or astronomy) and archive scientists (Ph.D. astronomers). The support provided by the OED includes:

  • Answering specific questions about data in the archives and methods for retrieving those data.
  • Maintaining a Web site for archive services.
  • Providing advice on strategies for searching the archives.
  • Responding to problems identified by users.
  • Authorizing GOs and GTOs who wish to retrieve their own proprietary data.
  • Writing DVDs and, for large retrievals, external hard disks for users.
  • Providing support for users who visit STScI.

OED staff will not normally do an astronomer's archive search, generate requests for data, or reanalyze data from the archive. OED staff will provide assistance and documentation so that archive researchers can perform these tasks.

1.8.2 MAST Newsletter

Archive users should consult the Web page at http://archive.stsci.edu for up-to-date archive information. The MAST newsletter, another source of archive information, is accessible from this web site. The newsletter is also distributed electronically via a mailing list. STScI archive users are encouraged to subscribe by sending an e-mail message to archive_news-request@stsci.edu. The single word SUBSCRIBE should be included in the body of the message.

1.8.3 Archive Helpdesk

You can obtain help or answers to any questions that you may have about archive issues by sending e-mail to archive@stsci.edu, or by telephoning (410)338-4547 Monday through Friday, 9 a.m. to 5 p.m. Eastern time.

The helpdesk staff will respond to questions concerning the StarView and Web user interfaces, the archive and archive databases, and the DVDs and hard disks provided by STScI. Helpdesk personnel can authorize access to proprietary data. The helpdesk staff will also provide advice concerning basic strategies, and will investigate and document all problem reports. The archive helpdesk staff may not always know how to solve a problem, but they are responsible for finding out who does know the answer and for continuing to work with you until the problem is resolved. All initial communication from the user community to the archive - both inside and outside of STScI - should be directed to the archive helpdesk.

1.8.4 Questions and Comments

We welcome your comments and questions about the archive in general or about archive user support. As discussed above, communication regarding all aspects of the archive should normally be directed to the archive helpdesk (e-mail: archive@stsci.edu, or telephone (410) 338-4547). This will allow Archive Branch staff to respond to your requests even when individual members of the group are away. If you feel your needs are not being adequately addressed through the helpdesk, place a message in the Suggestion Box located on the main archive page, http://archive.stsci.edu.

1.8.5 When a Retrieval Fails

Occasionally, a retrieval fails because of a network timeout, inadequate disk space, or other reasons. If this happens, contact the archive staff at archive@stsci.edu with any questions about the request.

1.8.6 Documentation

Documentation is available on-line for all archive holdings. The main archive page provides links to a general introduction to MAST and a "getting started" page. Each mission page has links to mission-specific information (About ...) and a mission-specific "getting started" page. MAST's HST page contains similar links. Under the About HST link are links to documentation on HST, its instruments and their calibration, proposal instructions, the Archive Manual and much more.

The Archive Manual (i.e., this document) is also available, as either a PostScript or a PDF file, from the archive via anonymous ftp. To get a copy of the PostScript version, follow these instructions. For the PDF version, substitute pdf for ps in the file name.

>ftp archive.stsci.edu
>login: anonymous
>passwd: your e-mail address
ftp> cd pub
ftp> cd manuals
ftp> dir [lists the available files]
ftp> binary
ftp> mget archive_manual*.ps.gz
ftp> bye

Those files that end in .gz were compressed with the utility gzip and can be uncompressed with the utility gunzip.
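
Where the gunzip utility is not available, the files can be uncompressed with Python's standard gzip module instead. The following is a minimal sketch; the function name is made up, and the output name is derived by dropping the .gz suffix.

```python
import gzip
import shutil


def gunzip_file(src_path, dest_path=None):
    """Uncompress a .gz file; by default write next to it without the .gz suffix."""
    if dest_path is None:
        if not src_path.endswith(".gz"):
            raise ValueError("cannot derive an output name from " + src_path)
        dest_path = src_path[:-3]
    with gzip.open(src_path, "rb") as src, open(dest_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream the data instead of loading it all
    return dest_path
```

Unlike gunzip, this sketch leaves the compressed file in place.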

1.9 Using FTP

If you have used the STAGE option to retrieve your HST or FUSE data to the archive host computer (archive.stsci.edu) you must transfer your files to your local computer via ftp. See Table 1.3 for examples of ftp sessions.

Datasets from other MAST missions are stored online and can be retrieved via anonymous ftp or through the browser using wget. See the MAST chapter for more information.

Final calibrated data for the HST instruments STIS, GHRS, FOS and FOC are available on disk in the hstonline area. See archive.stsci.edu/hstonline/ for details.

Table 1.3: File Access Commands for Archive Users

Sample function: Retrieve HST data from the data directory as named in the acknowledgement e-mail - in this case, "dir0129" (e.g., data retrieval was requested using an STScI SSO credentialed account).

Unix commands:

% ftp archive.stsci.edu (or stdatu.stsci.edu)
(login as "anonymous")
ftp> cd staging
ftp> dir
ftp> binary
ftp> prompt
ftp> mget x*
ftp> bye

Sample function: Retrieve PostScript and/or text versions of manuals, abstracts, catalogs, and general information. Files that end in .gz can be uncompressed by using the command "gunzip". The uncompressed files are also available.

Unix commands:

% ftp archive.stsci.edu
(login as "anonymous")
ftp> cd pub/manuals (or pub/catalogs or pub/hdf etc.)
ftp> ascii (see note 1)
ftp> ls
ftp> mget *.ps.Z (or whatever)
ftp> bye

Note 1: As appropriate. In the case of the Hubble Deep Field images, "binary" would be the appropriate datatype. Compressed files (names ending in .gz or .Z) also require "binary".

1.10 Using STSDAS/IRAF to Analyze Your Data

You can analyze or recalibrate HST datasets using STSDAS, which can be installed under IRAF. A comprehensive discussion of STSDAS and IRAF features or a tutorial on how to use the software is beyond the scope of this manual. Contact the STSDAS hotseat (help@stsci.edu) for answers to any STSDAS-related questions that you may have. To learn how to use this data analysis package, you can request copies of the documentation by sending e-mail to help@stsci.edu. Various STSDAS and software related documentation is also available on-line at http://www.stsci.edu/hst/HST_overview/documents.



1 External users may prefer to specify an anonymous incoming ftp site at their institution. Notice however that, when using our Web interface, we protect your destination information with the same kind of secure Web mechanism that is used by commercial sites.


