Open Science for Life in Space NASA GeneLab Open API
API version 4.1.11 (changelog)

Structure

This API provides the means to query the GeneLab database for annotation and data associated with datasets, assays, and samples hosted by GeneLab, and represents results in a number of formats.

              id ↴
                 accession → GLDS-239
                 assay name → mhu2_fskn_transcription_profiling_RNA_Sequencing_(RNA-Seq)
                 sample name → Mmus_C57-6J_FSKN_FLT_1G_JC_Rep3_F12
              study ↴
                    factor value ↴
                                 altered gravity → 1G by centrifugation
                                 diet → JAXA Chow
                                 spaceflight → Space Flight
                                 ...
                    characteristics ↴
                                    ...
                    parameter value ↴
                                    ...
              assay ↴
                    parameter value ↴
                                    read depth → 67001231 {read}
                                    read length → 149 {base pair}
                                    ...
                    ...
              file ↴
                   datatype → normalized counts
                   filename → GLDS-239_rna_seq_Normalized_Counts.csv
            

Names of categories and nested fields can be joined with a period (.) to form queries.
Request URLs are formed by combining an arbitrary number of queries, described below.
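
As a minimal sketch of the joining rule above, the helper below builds a dotted query key from category and field names. The name `field_path` is hypothetical and not part of the API; the category and field names are taken from the structure tree.

```python
def field_path(*parts: str) -> str:
    """Join category and nested field names with periods."""
    if not parts or any(not part for part in parts):
        raise ValueError("every part of a field path must be non-empty")
    return ".".join(parts)


# Names below come from the structure tree above.
print(field_path("study", "factor value", "spaceflight"))
# → study.factor value.spaceflight
print(field_path("file", "datatype"))
# → file.datatype
```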

You are here:
https://visualization.genelab.nasa.gov/GLOpenAPI

URL and query components

View level

This part of the URL is required.

  • The "assays" view generates overview tables; each column represents a property (such as a factor value), and each row represents one assay, followed by boolean values (True/False) denoting whether a property is associated with the assay.
    • The fields id.accession and id.assay name are always included;
    • There are two levels of column names:
      • The bottom one is the target nested field;
      • the top one is the preceding nested fields, joined by a period, e.g.:
        id → accession, study.characteristics → age.
    • This tabular data can be represented in CSV, TSV, JSON, and interactive formats.
  • The "samples" view generates annotation (metadata) tables; each column represents a property (such as a factor value), and each row represents one sample, followed by its property values.
    • The output format is analogous to the "assays" view.
    • This tabular data can be represented in CSV, TSV, JSON, and interactive formats.
  • The "data" view outputs data associated with sample(s).
    • All tabular formats (CSV, TSV, JSON) are supported if the underlying data is a table, and contain three levels of column names, from top to bottom:
      • accession (e.g., GLDS-1);
      • assay name (e.g., E-GEOD-53196_GeneChip_assay);
      • sample name (e.g., Dmel_OR_wo_FLT_uninfd_Rep1).
      Thus, the header corresponds to the leftmost three columns (index) of the "samples" view – the "id" fields.
    • Otherwise, if a single non-tabular file matches the query, it can be returned raw (see formats below).
Example:  /samples/ (we will build upon this below).
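
A minimal sketch of how the view level fits into a request URL, assuming the base URL shown earlier on this page. The helper name `view_url` is hypothetical; the three view names are the ones listed above.

```python
# Base URL as given earlier on this page.
BASE_URL = "https://visualization.genelab.nasa.gov/GLOpenAPI"
VIEWS = ("assays", "samples", "data")


def view_url(view: str) -> str:
    """Return the URL prefix for one of the three documented views."""
    if view not in VIEWS:
        raise ValueError(f"unknown view {view!r}; expected one of {VIEWS}")
    return f"{BASE_URL}/{view}/"


print(view_url("samples"))
# → https://visualization.genelab.nasa.gov/GLOpenAPI/samples/
```

Query components, described in the following sections, are appended after a "?" and separated by "&".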
Retrieval of entries from specific datasets and/or assays

This part of the URL, as well as all following parts, can be specified any number of times (including zero times).

  • The search can be constrained to only the samples from specific datasets (id.accession=ACCESSION) and/or assays (id.assay name=ASSAY)
  • As a shorthand, it is also possible to pass a mixed query, where accessions and assay names are separated by a forward slash, and multiple accessions / assay names are joined by a vertical pipe (id=ACCESSION_1/ASSAY_1A|ACCESSION_2)
Example:  /samples/?id.accession=GLDS-38
Example:  /samples/?id=GLDS-38/proteomics|GLDS-276
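
The mixed `id=` shorthand can be assembled programmatically. The sketch below is one way to do it; the helper name `id_query` is hypothetical. It percent-encodes the value but keeps "/" and "|" literal to mirror the examples above; whether the server also accepts them encoded is not stated on this page.

```python
from urllib.parse import quote


def id_query(*entries) -> str:
    """Build an id= query from accessions and (accession, assay name) pairs.

    Each entry is either an accession string or an (accession, assay_name)
    tuple; entries are joined by a vertical pipe.
    """
    terms = []
    for entry in entries:
        if isinstance(entry, str):
            terms.append(entry)
        else:
            accession, assay_name = entry
            terms.append(f"{accession}/{assay_name}")
    # Keep "/" and "|" unencoded, as in the documented examples.
    return "id=" + quote("|".join(terms), safe="/|")


print(id_query(("GLDS-38", "proteomics"), "GLDS-276"))
# → id=GLDS-38/proteomics|GLDS-276
```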
Retrieval of entries with metadata categories

Information on all values under a given ISA-Tab category can be retrieved:

  • Querying directly by any of these categories (e.g., assay.parameter value) will constrain the results to only the datasets, assays, and samples that have this category annotated.
  • Under view levels "assays" and "samples", the values of fields in the category will be reported in table columns.
Example:  /samples/?study.factor value
Retrieval of entries with metadata fields

Each ISA category contains multiple fields:

  • The logical AND (inclusion of several fields) is achieved by passing them at the same time (=x.a.b&=y.c.d).
  • The logical OR of several fields within a single category is achieved by passing the target fields joined by a vertical pipe (=x.a.b|d).
  • Under view levels "assays" and "samples", the values of any such field will be reported in table columns.
  • The leading "=" (=x.a.b) queries for existing values of the field(s), i.e., it constrains the results to only the datasets, assays, and samples that have this field annotated (with non-NaN values).
  • Without the leading "=" (x.a.b), the field(s) are included as columns but the rows are not filtered, so these columns may contain NaN values.
Example:  /samples/?=study.factor value.spaceflight
Example:  /samples/?study.factor value.spaceflight
Example:  /samples/?=study.factor value.radiation dose|absorbed radiation dose
Example:  /samples/?study.factor value.radiation dose|absorbed radiation dose
Example:  /samples/?study.factor value.radiation dose|absorbed radiation dose&=assay.factor value.radiation type
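
The AND, OR, and leading "=" rules above can be sketched as a small query builder. The helper name `field_query` is hypothetical; spaces are percent-encoded (as "%20") while "|" is kept literal, which is an assumption based on the examples on this page.

```python
from urllib.parse import quote


def field_query(category_path: str, fields, require: bool = True) -> str:
    """Build a metadata field query.

    category_path: dotted prefix, e.g. "study.factor value".
    fields: field names within that category, combined with logical OR.
    require: if True, prepend "=" so that only entries annotated with the
             field(s) are returned; if False, columns may contain NaN.
    """
    key = category_path + "." + "|".join(fields)
    return ("=" if require else "") + quote(key, safe="|")


# Logical AND: separate queries joined by "&".
query = "&".join([
    field_query("study.factor value",
                ["radiation dose", "absorbed radiation dose"], require=False),
    field_query("study.factor value", ["spaceflight"]),
])
print("/samples/?" + query)
```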
Retrieval of entries with metadata field values

The search can be constrained to only the samples that are annotated with specific value(s) of an ISA field.

  • The logical AND of several conditions is achieved by passing them at the same time (x=a&y=b).
  • The logical OR for a single condition is achieved by passing the target values joined by a vertical pipe (x=a|b).
  • Under view levels "assays" and "samples", the values of target field(s) will be reported in table columns.
Example:  /samples/?study.characteristics.genotype=WT
Example:  /samples/?study.characteristics.genotype=WT|TK6
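
A minimal sketch of value constraints, following the x=a|b and x=a&y=b forms above. The helper names `value_query` and `combine` are hypothetical, and the percent-encoding of spaces is an assumption; the field names and values are taken from this page.

```python
from urllib.parse import quote


def value_query(field: str, *values: str) -> str:
    """Constrain a field to one or more values (logical OR via "|")."""
    if not values:
        raise ValueError("at least one value is required")
    return quote(field) + "=" + quote("|".join(values), safe="|")


def combine(*queries: str) -> str:
    """Logical AND of several conditions: join them with "&"."""
    return "&".join(queries)


print(value_query("study.characteristics.genotype", "WT", "TK6"))
# → study.characteristics.genotype=WT|TK6
print(combine(
    value_query("study.characteristics.genotype", "WT"),
    value_query("study.factor value.spaceflight", "Space Flight"),
))
# → study.characteristics.genotype=WT&study.factor%20value.spaceflight=Space%20Flight
```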
Retrieval of files or data

For the resultant queried datasets, assays, and samples, there may be files associated.

  • All files with recognized datatypes can be queried by passing file.datatype.
  • The query can be constrained to files of particular datatypes by specifying these datatypes in the query (file.datatype=pca).
  • Finally, the query can be constrained by specifying a full file name of interest or a regular expression pattern enclosed in forward slashes.
Examples:
/samples/?file.datatype
/samples/?file.datatype=differential expression
/samples/?file.datatype=visualization table|pca
/samples/?file.filename=/trimmed\.fastq\.gz$/
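
The sketch below builds a file.filename query from either a full file name or a regular expression, and previews the pattern locally before sending it. The helper name `filename_query` and the sample file names are made up for illustration. The pattern is checked with Python's `re` module, which may differ from the regular expression dialect used by the server; the backslash is percent-encoded as "%5C", which is standard URL encoding but not something this page specifies.

```python
import re
from urllib.parse import quote


def filename_query(name_or_pattern: str, is_regex: bool = False) -> str:
    """Build a file.filename query; regex patterns are wrapped in slashes."""
    if is_regex:
        re.compile(name_or_pattern)  # raises re.error on a malformed pattern
        value = f"/{name_or_pattern}/"
    else:
        value = name_or_pattern
    return "file.filename=" + quote(value, safe="/$")


pattern = r"trimmed\.fastq\.gz$"
print(filename_query(pattern, is_regex=True))
# → file.filename=/trimmed%5C.fastq%5C.gz$/

# Local preview against made-up file names (not real GeneLab files).
names = ["GLDS-0_sample_R1_trimmed.fastq.gz", "GLDS-0_sample_R1_raw.fastq.gz"]
matches = [name for name in names if re.search(pattern, name)]
print(matches)
# → ['GLDS-0_sample_R1_trimmed.fastq.gz']
```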
Display formats

Data returned for the query can be formatted in multiple ways, depending on the output type.
Refer to the table below for a matrix of valid requested formats.

Note that annotation columns only appear in the "assays" and "samples" views, while files are only sourced for the "data" view.

In the table, "Annotation columns" is the number of annotation columns in the output, "Files" is the number of files the data is sourced from, and "Output type" is the resultant output type. A dash (-) marks a column that does not apply to the view.

View      Annotation  Files  Output  &format=                                                &schema=
          columns            type    csv*       tsv        json       raw  cls  gct          0*         1
/assays/  1           -      table   yes        yes        yes        no   yes  no           yes        yes
/assays/  >1          -      table   yes        yes        yes        no   no   no           yes        yes
/samples/ 1           -      table   yes        yes        yes        no   yes  no           yes        yes
/samples/ >1          -      table   yes        yes        yes        no   no   no           yes        yes
/data/    -           1      table   yes        yes        yes        yes  no   maybe [1]    yes        yes
/data/    -           >1     table   maybe [2]  maybe [2]  maybe [2]  no   no   maybe [1,2]  maybe [2]  maybe [2]
/data/    -           1      other   no         no         no         yes  no   no           no         no
/data/    -           >1     other   no         no         no         no   no   no           no         no

* Default
[1] Only for transcription profiling data
[2] Only for data that can be merged across assays (currently only unnormalized counts RNA-seq data)
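
For client-side checks before sending a request, the format columns of the matrix above can be transcribed into a lookup table. This is only a sketch: the names `SUPPORT` and `format_support` are hypothetical, the "maybe:1" and "maybe:2" strings stand for the two footnoted conditions, and the schema columns are omitted for brevity.

```python
FORMATS = ("csv", "tsv", "json", "raw", "cls", "gct")

# Keys: (view, multiple, output_type). "multiple" is True for ">1"
# annotation columns (assays/samples) or ">1" source files (data).
SUPPORT = {
    ("assays", False, "table"): ("yes", "yes", "yes", "no", "yes", "no"),
    ("assays", True, "table"): ("yes", "yes", "yes", "no", "no", "no"),
    ("samples", False, "table"): ("yes", "yes", "yes", "no", "yes", "no"),
    ("samples", True, "table"): ("yes", "yes", "yes", "no", "no", "no"),
    ("data", False, "table"): ("yes", "yes", "yes", "yes", "no", "maybe:1"),
    ("data", True, "table"): ("maybe:2", "maybe:2", "maybe:2", "no", "no", "maybe:1,2"),
    ("data", False, "other"): ("no", "no", "no", "yes", "no", "no"),
    ("data", True, "other"): ("no",) * 6,
}


def format_support(view: str, multiple: bool, output_type: str, fmt: str) -> str:
    """Return "yes", "no", or a "maybe:<footnotes>" marker for a format."""
    try:
        row = SUPPORT[(view, multiple, output_type)]
    except KeyError:
        raise ValueError(f"no such row: {(view, multiple, output_type)}") from None
    return row[FORMATS.index(fmt)]


print(format_support("data", False, "other", "raw"))
# → yes
print(format_support("data", True, "table", "gct"))
# → maybe:1,2
```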
  • Character-separated formats (&format=csv, &format=tsv):
    • Tabular results can be represented in comma-separated (CSV) or tab-separated (TSV) formats.
    • Header lines (column names) are preceded with a hash sign (#):
      • Annotation headers (i.e., for views "assays" and "samples") contain two levels of column names:
        • The bottom one is the target nested field
        • The top one is the preceding nested fields, joined by a period, e.g.:
          id → accession, study.characteristics → age
      • Headers of tabular data (for view "data") contain three levels of column names, from top to bottom:
        • accession (e.g., GLDS-1);
        • assay name (e.g., E-GEOD-53196_GeneChip_assay);
        • sample name (e.g., Dmel_OR_wo_FLT_uninfd_Rep1).
        Thus, the header corresponds to the leftmost three columns (index) of the "samples" view – the "id" fields.
    • In all remaining lines,
      • string values are always quoted with double quotes ("value"),
      • numeric and boolean values are never quoted,
      • and missing values are always displayed as NaN (unquoted).
    Example:   /data/?study.characteristics.organism=Homo sapiens&file.datatype=unnormalized counts&format=tsv
  • JSON format (&format=json):
    • JSON-formatted output is derived from the tabular output in split orientation, placing:
      • column names under the "columns" key,
      • index names under the "index" key,
      • cell values under the "data" key,
      • and additionally providing the "meta" key with index names.
    • The following are treated as indices:
      • The leftmost two columns of the "assays" view (i.e., the "id" columns);
      • The leftmost three columns of the "samples" view (i.e., the "id" columns);
      • The leftmost column of the "data" view (e.g., ENSEMBL gene names).
    • String values are always quoted in double quotes ("value"),
    • numeric and boolean values are never quoted,
    • and missing values are always displayed as NaN (unquoted).
    Example:   /samples/?study.factor value.spaceflight&id=GLDS-15&format=json
  • Schema (&schema=1):
    • Rather than retrieving the entire table, a description of tabular data can be requested.
    • The output contains the same header as the full table, followed by a single row, where each cell contains a string of form type[(minimum)..(maximum)|NaN].
      • If type is "str", minima and maxima are omitted;
      • if the column does not have any missing values, NaN is omitted.
    Example:   /data/?id=GLDS-4&file.datatype=differential%20expression&format=tsv&schema=1
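
Returning to the JSON format described above: a client can reassemble split-orientation output into per-row records. The sketch below is illustrative only. The payload is made up (it is not a real API response), its exact nesting of multi-level column and index names is an assumption, and the "meta" key is left out because its structure is not described here. Python's `json` module accepts the unquoted NaN literal by default, which is why missing values load as float NaN.

```python
import json
import math

# Hypothetical payload in split orientation; values are placeholders.
payload = """
{"columns": [["study.factor value", "spaceflight"]],
 "index": [["GLDS-0", "assay_A", "sample_1"],
           ["GLDS-0", "assay_A", "sample_2"]],
 "data": [["Space Flight"], [NaN]]}
"""


def split_to_records(doc: dict) -> dict:
    """Map each index entry to a {column name: value} dict.

    Multi-level column names are joined with periods; NaN becomes None.
    """
    columns = [".".join(c) if isinstance(c, list) else c for c in doc["columns"]]
    records = {}
    for index_entry, row in zip(doc["index"], doc["data"]):
        key = tuple(index_entry) if isinstance(index_entry, list) else index_entry
        records[key] = {
            column: None if isinstance(value, float) and math.isnan(value) else value
            for column, value in zip(columns, row)
        }
    return records


records = split_to_records(json.loads(payload))
for key, values in records.items():
    print(key, values)
```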