The Download Area

All JGI portal sites offer an area for downloading primary sequence, annotation, and other data. For older genome assemblies, data is provided through individual download pages which simply list available data files with direct links to download each file. In these cases, please follow any special instructions given on the page.

For newer genomes, and for all Genome Groups, many more file types are now supported, including RAW data, assemblies, quality control reports, and various analysis files. These files can be accessed in several ways, each designed with a particular class of user in mind.

The download pages are set up for easy downloads of individual projects or genomes; they do not provide a means to download whole collections of data. For faster and more convenient downloads, you can use Globus or the API. Our raw reads are also published to the SRA at NCBI for bulk download needs.

Access to downloads is provided in three ways that are designed to serve three distinct needs:

I. Download page with web UI

In the web-driven approach, you find a portal of interest and click the Download tab of that portal. Downloads are presented in a tree structure that divides the files into logical groups, so that you can download RAW files, assemblies, and so on in a single operation.

Available download files are presented in a tree based on the data type (assembly, masked assembly, protein models, ESTs...). To expand or collapse all items in the tree, click Expand All or Collapse All. You can download files in one of two ways:

  1. To download multiple files at once, select the checkboxes to the left of file sections or individual files, and click the Download button. The selected items will be packaged into a .zip archive file and downloaded to your browser. Note that you can select or deselect all download files by toggling the checkbox at the root of the tree.

  2. To download an individual file, click on the desired single item in the tree.

New feature for downloads: "Organize By File Type". This option is available on the download pages of Projects, Proposals, and Groups. To use it:

  1. Go to the download page.
  2. Check the box "Organize By File Type".
  3. The folders in the download tree will be reorganized by file type.

II. Download with Globus service (fast, reliable, convenient)

To download a large number of files you can use the Globus service.

How to set up your Globus account:

  1. Go to the www.globus.org website.
  2. Create an account. Follow the instructions in the email from Globus to activate your account.
  3. Globus transfers occur between two endpoints. The link to the JGI endpoint will be sent to you by email once you initiate the download. The other endpoint is your institution's Globus endpoint. To transfer to your local machine, you can install the Globus Connect Personal program.

How to use Globus services for your downloads:

  1. Go to the download section of your project/portal/proposal/group.
  2. Click the "Download via Globus (v.2)" button under the main navigation of the portal.
  3. In the dialog box that appears, provide your Globus account name.
  4. Submit the request and wait for the email from our service telling you that your files are staged and ready for download.
  5. Click the link in the email you receive.
  6. Globus then provides a user-friendly interface for big data transfers.
  7. Provide the second endpoint that you plan to transfer to.
  8. Authenticate to your second endpoint.
  9. Select files and start the transfer.
  10. Behind the scenes, the transfers are performed using GridFTP, a parallel transfer protocol and program. GridFTP has built-in checks that ensure the integrity of the transfers and guarantee that the files reach their destination intact.

III. Download with API

A third way to retrieve data from the JGI is available for users who need to download data using scripts or programs. The details of the API XML schema (XSD) are available for your review.
  1. Identify the name of the portal you want to download from.

    You can find it using the JGI Portal search on the home page. Use any search terms necessary to find the portal you want, click the "Download" link in the "Resources" column, then make a note of the short portal name in the URL. It is located between the second and third "/" characters in the path after the web host. For example, in the URL https://genome.jgi.doe.gov/portal/Aurpu_var_sub1/... the portal name to use for API download is "Aurpu_var_sub1".

    You can also export the full search results in CSV format by clicking "Project Overview Report"; you can then iterate over all your projects. The short portal name is given in the "Portal ID" column.
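The rule above (the portal name sits between the second and third "/" of the path) can be scripted. This is a minimal sketch, assuming the URL includes the https:// scheme; the function name portal_name_from_url and the trailing "download" path segment in the usage line are hypothetical and chosen only for illustration.

```shell
# Print the short portal name from a JGI portal URL.
# Splitting on "/" gives: "https:", "", host, "portal", NAME, ...
# so the portal name is field 5. Assumes the URL starts with a scheme.
portal_name_from_url() {
  printf '%s\n' "$1" | cut -d/ -f5
}

# Hypothetical URL shape; prints Aurpu_var_sub1
portal_name_from_url 'https://genome.jgi.doe.gov/portal/Aurpu_var_sub1/download'
```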

  2. Log in using the following command.

    curl 'https://signon-old.jgi.doe.gov/signon/create' --data-urlencode 'login=USER_NAME' --data-urlencode 'password=USER_PASSWORD' -c cookies > /dev/null

    Replace USER_NAME and USER_PASSWORD with the appropriate values.
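Because the login command sends its output to /dev/null, a failed login is silent. One way to notice it, sketched here under the assumption that a successful login stores at least one cookie, is to inspect the cookie file that curl wrote with -c. The function name cookie_jar_has_cookies is hypothetical. The check relies on curl's Netscape-format cookie file, in which each cookie is a line of seven tab-separated fields. An empty file points to a failed login; a non-empty one does not by itself prove success.

```shell
# Succeed if the curl cookie file contains at least one cookie line.
# Cookie lines have seven tab-separated fields; header comments and
# blank lines have fewer, so they are ignored.
cookie_jar_has_cookies() {
  # $1 = cookie file written by "curl -c"
  [ -f "$1" ] && awk -F '\t' 'NF >= 7 { found = 1 } END { exit !found }' "$1"
}

if cookie_jar_has_cookies cookies; then
  echo "cookie file contains at least one cookie"
else
  echo "no cookies stored; check USER_NAME and USER_PASSWORD" >&2
fi
```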

a) If you prefer to download directly from the Portal, follow steps 3 and 4 below.

3. Download a list of files available for the portal that you are interested in.

For example, for PhytozomeV10 the command will look like this:
curl 'https://genome.jgi.doe.gov/portal/ext-api/downloads/get-directory?organism=PhytozomeV10' -b cookies > files.xml
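To see what the listing contains before picking a file, the filename attributes can be pulled out of files.xml with standard text tools. This is a rough sketch, not an XML parser: it assumes every file entry carries a filename="..." attribute written with double quotes, and the function name list_filenames is hypothetical.

```shell
# Print one file name per line from a saved directory listing.
list_filenames() {
  # $1 = listing saved by the get-directory call (files.xml in the text)
  grep -o ' filename="[^"]*"' "$1" | cut -d'"' -f2
}

# Usage:
#   list_filenames files.xml | sort
```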

4. Find the file that you would like to download in the XML document and download it.

For example, if you are looking for "Alyrata_107_v1.0.annotation_info.txt", you will find the following entry in the file:
<file label="PhytozomeV10" filename="Alyrata_107_v1.0.annotation_info.txt" size="3 MB" sizeInBytes="3901148" timestamp="Sun Jan 12 17:46:56 PST 2014" url="/portal/ext-api/downloads/get_tape_file?blocking=true&amp;url=/PhytozomeV10/download/_JAMO/53112a9e49607a1be0055980/Alyrata_107_v1.0.annotation_info.txt" project="" library="" md5="b03b5173b0adabe4c0e37f82b4a7a2a1"/>

Then copy the value of the url attribute from the entry into the download curl command (make sure that you replace "&amp;" with "&"). The command to download the file would look like this:
curl 'https://genome.jgi.doe.gov/portal/ext-api/downloads/get_tape_file?blocking=true&url=/PhytozomeV10/download/_JAMO/53112a9e49607a1be0055980/Alyrata_107_v1.0.annotation_info.txt' -b cookies > Alyrata_107_v1.0.annotation_info.txt
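Copying the url attribute by hand and fixing "&amp;" is error-prone, so the lookup in step 4 can be scripted. This is a minimal sketch, assuming each file entry sits on a single line as in the example above; the helper name file_attr_from_listing is hypothetical. The same helper can read the md5 attribute, so the downloaded file can be compared against it (the usage comment assumes the GNU md5sum tool is installed).

```shell
# Print one attribute of the <file .../> entry for a given file name,
# with "&amp;" converted back to "&".
file_attr_from_listing() {
  # $1 = listing file, $2 = value of the filename attribute, $3 = attribute to print
  grep -F "filename=\"$2\"" "$1" \
    | sed -n "s/.* $3=\"\([^\"]*\)\".*/\1/p" \
    | sed 's/&amp;/\&/g' \
    | head -n 1
}

# Usage, after logging in and saving the listing as files.xml:
#   name='Alyrata_107_v1.0.annotation_info.txt'
#   url=$(file_attr_from_listing files.xml "$name" url)
#   curl "https://genome.jgi.doe.gov$url" -b cookies > "$name"
#   echo "$(file_attr_from_listing files.xml "$name" md5)  $name" | md5sum -c -
```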

b) Alternatively, if you would like to download with Globus, follow these steps instead:

3. Request the data to be staged for download via Globus.

For example, for PhytozomeV10 the command will look like this:
curl 'https://genome.jgi.doe.gov/portal/ext-api/downloads/globus/request' -b cookies --data-urlencode 'portal=PhytozomeV10' --data-urlencode 'globusName=UUUUU@DDDDD.NNN'

Replace UUUUU@DDDDD.NNN with your Globus user name.

You can add the following optional parameters to the above curl command:
--data-urlencode 'addedSince=YYYY-MM-DD' (if you are only interested in newer data)
--data-urlencode 'sendMail=true' (if you want to receive a notification via email when your data is ready)
--data-urlencode 'organizedByFileType=true' (if you want the data to be organized by file type)
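The optional parameters are easier to manage when the request is wrapped in a function. The sketch below is one possible arrangement, not part of the JGI tooling: the function name globus_request and the CURL override variable (useful for a dry run with CURL=echo) are hypothetical, while the URL and parameter names are the ones listed above. Always sending sendMail=true is a choice made for this sketch.

```shell
# Request Globus staging for a portal; the third argument is optional.
globus_request() {
  portal=$1
  globus_name=$2
  added_since=${3:-}
  # Build the curl argument list step by step.
  set -- 'https://genome.jgi.doe.gov/portal/ext-api/downloads/globus/request' \
    -b cookies \
    --data-urlencode "portal=$portal" \
    --data-urlencode "globusName=$globus_name" \
    --data-urlencode 'sendMail=true'
  # Only add addedSince when a date (YYYY-MM-DD) was given.
  if [ -n "$added_since" ]; then
    set -- "$@" --data-urlencode "addedSince=$added_since"
  fi
  # CURL is a hypothetical override; it defaults to the real curl.
  ${CURL:-curl} "$@"
}

# Dry run that prints the arguments instead of contacting the server:
#   CURL=echo globus_request PhytozomeV10 UUUUU@DDDDD.NNN 2020-01-01
```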

4. Check the status of your request.

The curl command in step 3 will return a link that you can use to check the status of your request, for example:
curl 'https://genome.jgi.doe.gov/portal/ext-api/downloads/globus/NNNNN-MM/status' -b cookies

The command will return a link to your data when it is ready, for example:
Download request completed.
Data URL: https://www.globus.org/app/transfer?origin_id=XXXXXX&origin_path=/NNNNN/MM/PhytozomeV10/&add_identity=UUUUUU

After that, you can either enter the URL in your Web browser or use the values of the "origin_id" and "origin_path" parameters of the URL with Globus API calls.
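The two values can be split out of the Data URL with a few lines of shell. This is a minimal sketch; the function name globus_url_param is hypothetical, and it assumes the values are not percent-encoded (decode them first if they are).

```shell
# Print the value of one query parameter from a Globus Data URL.
globus_url_param() {
  # $1 = Data URL, $2 = parameter name
  # Drop everything up to the first "?", put each parameter on its
  # own line, then print the value of the requested one.
  printf '%s\n' "${1#*\?}" | tr '&' '\n' | sed -n "s/^$2=//p"
}

data_url='https://www.globus.org/app/transfer?origin_id=XXXXXX&origin_path=/NNNNN/MM/PhytozomeV10/&add_identity=UUUUUU'
globus_url_param "$data_url" origin_id     # prints XXXXXX
globus_url_param "$data_url" origin_path   # prints /NNNNN/MM/PhytozomeV10/
```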

IMG annotation files in "IMG Data" Folders

Please note that IMG annotation files are bundled in download_bundle.tar.gz.

To learn about the contents of the tar bundle and how to extract them, please read the IMG guidelines.

Data Usage and Download Policy

For both new and old download pages, you are required to read and approve a JGI Data Usage Policy statement before accessing JGI data. This statement will appear on the first page you see when you enter the download area, and may vary by organism. To continue to the download page, click the "Agree" button after reviewing the policy. You may also select the checkbox next to the "Agree" button to bypass the usage statement the next time you visit the download area for the given organism or group. If you would like to review the policy again, use the "Show Data Usage Policy" button under the main navigation.