Open Science for Life in Space | NASA GeneLab Open API |
This API provides the means to query the GeneLab database for annotation and data associated with datasets, assays, and samples hosted by GeneLab, and represents results in a number of formats.
id ↴ accession → GLDS-239 assay name → mhu2_fskn_transcription_profiling_RNA_Sequencing_(RNA-Seq) sample name → Mmus_C57-6J_FSKN_FLT_1G_JC_Rep3_F12 study ↴ factor value ↴ altered gravity → 1G by centrifugation diet → JAXA Chow spaceflight → Space Flight ... characteristics ↴ ... parameter value ↴ ... assay ↴ parameter value ↴ read depth → 67001231 {read} read length → 149 {base pair} ... ... file ↴ datatype → normalized counts filename → GLDS-239_rna_seq_Normalized_Counts.csv
"id"
category always contains fields "accession"
,
"assay name"
, "sample name"
.
"investigation"
, "study"
, "assay"
are organized
according to the ISA-Tab model, and subsequently may
contain the following fields, each of which may contain further nested subfields:
investigation ↴ study ↴ ... study assays ↴ ... investigation ↴ ... study ↴ characteristics ↴ ... factor value ↴ ... parameter value ↴ ... assay ↴ characteristics ↴ ... factor value ↴ ... parameter value ↴ ...
"file"
category contains fields "filename"
,
"datatype"
and represents the information about the file(s) associated with a given sample.
Names of categories and nested fields can be joined with a period (.
) to form queries.
Request URLs are formed by combining an arbitrary number of queries, described below.
This part of the URL is required.
id.accession
and id.assay name
are always included;id → accession
, study.characteristics → age
.
GLDS-1
);E-GEOD-53196_GeneChip_assay
);Dmel_OR_wo_FLT_uninfd_Rep1
)."id"
fields.
/samples/
(we will build upon this below).
This part of the URL, as well as all following parts, can be specified any number of times (including zero times).
id.accession=ACCESSION
) and/or assays
(id.assay name=ASSAY
)
id=ACCESSION_1/ASSAY_1A|ACCESSION_2
)
/samples/?id.accession=GLDS-38
/samples/?id=GLDS-38/proteomics|GLDS-276
Information on all values under a given ISA-Tab category can be retrieved:
assay.parameters
) will constrain the results to only
the datasets, assays, and samples that have this category annotated.
/samples/?study.factor value
Each ISA category contains multiple fields:
=x.a.b&=y.c.d
).
=x.a.b|d
).
=x.a.b
) effectively queries for existing values of the field(s), i.e. constrains
the results to to only the datasets, assays, and samples that have this field annotated (with non-NaN values).
x.a.b
), the columns may contain NaN values (i.e., this constrains
to the columns, but not to values within columns).
/samples/?=study.factor value.spaceflight
Example: /samples/?study.factor value.spaceflight
Example: /samples/?=study.factor value.radiation dose|absorbed radiation dose
Example: /samples/?study.factor value.radiation dose|absorbed radiation dose
Example: /samples/?study.factor value.radiation dose|absorbed radiation dose&=assay.factor value.radiation type
The search can be constrained to only the samples that are annotated with specific value(s) of an ISA field.
x=a&y=b
).
x=a|b
).
/samples/?study.characteristics.genotype=WT
/samples/?study.characteristics.genotype=WT|TK6
For the resultant queried datasets, assays, and samples, there may be files associated.
file.datatype
.file.datatype=pca
).
/samples/?file.datatype
/samples/?file.datatype=differential expression
/samples/?file.datatype=visualization table|pca
/samples/?file.filename=/trimmed\.fastq\.gz$/
Data returned for the query can be formatted in multiple ways, depending on the output type.
Refer to the table below for a matrix of valid requested formats.
Note that annotation columns only appear in the "assays" and "samples" views, while files are only sourced from for the "data" view.
View |
Number of annotation columns in output |
Number of files data is sourced from |
Resultant output type |
&format= | &schema= | ||||||
csv* | tsv | json | raw | cls | gct | 0* | 1 | ||||
/assays/ | 1 | table | yes | yes | yes | no | yes | no | yes | yes | |
/assays/ | >1 | table | yes | yes | yes | no | no | no | yes | yes | |
/samples/ | 1 | table | yes | yes | yes | no | yes | no | yes | yes | |
/samples/ | >1 | table | yes | yes | yes | no | no | no | yes | yes | |
/data/ | 1 | table | yes | yes | yes | yes | no | maybe1 | yes | yes | |
/data/ | >1 | table | maybe2 | maybe2 | maybe2 | no | no | maybe1,2 | maybe2 | maybe2 | |
/data/ | 1 | other | no | no | no | yes | no | no | no | no | |
/data/ | >1 | other | no | no | no | no | no | no | no | no |
&format=csv
, &format=tsv
):
#
):
id → accession
,
study.characteristics → age
GLDS-1
);E-GEOD-53196_GeneChip_assay
);Dmel_OR_wo_FLT_uninfd_Rep1
)."id"
fields.
"value"
),NaN
(unquoted)./data/?study.characteristics.organism=Homo sapiens&file.datatype=unnormalized counts&format=tsv
&format=json
):
"id"
columns);"id"
columns);"value"
),NaN
(unquoted)./samples/?study.factor value.spaceflight&id=GLDS-15&format=json
&format=raw
):
/data/?id=GLDS-4&file.filename=/.*normPCA.png$/&format=raw
&format=cls
):
id
column (e.g., study.factor value.genotype
).
/samples/?study.factor value.genotype&format=cls
&format=gct
):
id
information (accession, assay name, sample name) is joined with a forward
slash (/
).
/data/?file.datatype=normalized counts&id=GLDS-38&format=gct
&schema=1
):
type[(minimum)..(maximum)|NaN]
.
type
is "str", minima and maxima are omitted;NaN
is omitted./data/?id=GLDS-4&file.datatype=differential%20expression&format=tsv&schema=1