INFRAFRONTIER programmatic data access

INFRAFRONTIER enriched strain data are accessible not only through the web interface but also via a RESTful API, which enables automated access and integration into external analysis pipelines. The API is powered by Apache Solr and supports Solr query syntax and Solr query parameters, including advanced filtering, faceting, and full-text search. Data are distributed across multiple Solr cores. Each core represents a distinct dataset or domain (e.g., cancer-related strains, COVID-19 strains) and can be queried independently. Each core defines its own schema, with dedicated fields that can be used both for targeted searching and for filtering results. The available cores, their fields, and usage examples are described in the Solr cores section below.
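As a rough illustration of how the standard Solr parameters combine in a request, the following Python sketch assembles a select URL. The endpoint is the one used in the examples on this page. The helper name build_select_url is hypothetical, and q, fq, rows, wt, facet, and facet.field are standard Solr parameters; which fields are useful to facet on in a given core is an assumption.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URLs on this page.
BASE_URL = "https://infrafrontier.eu/solr/disease/cancer_resource/select"


def build_select_url(query, filters=(), rows=10, facet_fields=()):
    """Build a Solr select URL (hypothetical helper, not part of the API)."""
    # q is the main query, rows limits the page size, wt selects JSON output.
    params = [("q", query), ("rows", rows), ("wt", "json")]
    # Each fq adds a filter query that narrows the result set.
    params += [("fq", f) for f in filters]
    if facet_fields:
        # Faceting returns value counts for each requested field.
        params.append(("facet", "true"))
        params += [("facet.field", f) for f in facet_fields]
    # urlencode percent-encodes quotes and colons and turns spaces into "+".
    return f"{BASE_URL}?{urlencode(params)}"


url = build_select_url(
    'do_term:"prostate cancer"',
    filters=["gene_symbol:Thrb"],
    rows=5,
    facet_fields=["mp_term"],  # assumption: mp_term is a sensible facet field
)
print(url)
```

The function only builds the URL and sends nothing, so the result can be inspected or pasted into a browser before it is used in a pipeline.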

Solr cores

EMMA cancer strains

This core contains information on EMMA strains associated with more than 50 different cancer types.

Accessible at emma_cancer_strains

Fields

Parameter	Type	Description
id_str	Integer	Unique identifier for the strain
name_str	String	Name of the strain
do_term	String	Disease Ontology (DO) term
do_id	String	Unique identifier for the disease (DOID)
gene_symbol	String	Symbol for the gene
mp_id	String	Unique identifier for the phenotype (MP ontology)
mp_term	String	Mammalian Phenotype term
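The fields above can be combined into a single query string using Solr's field:value syntax and boolean operators. The Python sketch below shows one possible way to do this. The helper names field_query and all_of are hypothetical. Values are wrapped in double quotes so that spaces, and characters that are special in Solr query syntax such as colons, are treated as part of the value.

```python
def field_query(field, value):
    """Return a Solr field:"value" clause (hypothetical helper)."""
    # Escape backslashes and double quotes so the value stays inside the phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def all_of(*clauses):
    """Join clauses with AND so that every condition must match."""
    return " AND ".join(f"({clause})" for clause in clauses)


query = all_of(
    field_query("do_term", "prostate cancer"),
    field_query("gene_symbol", "Thrb"),
)
print(query)
```

The resulting string is a plain Solr query and still has to be URL-encoded before it is placed in the q parameter of a request.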

Examples

Strain with strain id 1

id_str:1

https://infrafrontier.eu/solr/disease/cancer_resource/select?q=id_str:1

Strains with disease term “prostate cancer”

do_term:"prostate cancer"

https://infrafrontier.eu/solr/disease/cancer_resource/select?q=do_term:%22prostate%20cancer%22

Strains with gene Thrb

gene_symbol:Thrb

https://infrafrontier.eu/solr/disease/cancer_resource/select?q=gene_symbol:Thrb
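For use in a script, queries like the ones above can be sent with the Python standard library. This is a minimal sketch, assuming the endpoint returns Solr's standard JSON response layout (a response object with numFound and docs) when wt=json is requested. The function names fetch and extract_docs are hypothetical, and the sample payload is a hand-written stand-in, not real output from the service.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the example URLs above.
BASE_URL = "https://infrafrontier.eu/solr/disease/cancer_resource/select"


def fetch(query, rows=10, timeout=30):
    """Send a query to the core and return the decoded JSON response."""
    url = f"{BASE_URL}?{urlencode({'q': query, 'rows': rows, 'wt': 'json'})}"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def extract_docs(payload):
    """Return (total match count, documents) from a Solr JSON response."""
    response = payload.get("response", {})
    return response.get("numFound", 0), response.get("docs", [])


# Hand-written stand-in for a response, so the parsing step runs offline.
sample = {"response": {"numFound": 1, "start": 0, "docs": [{"gene_symbol": "Thrb"}]}}
total, docs = extract_docs(sample)
print(total, docs)
```

Against the live service, the call would look like extract_docs(fetch("gene_symbol:Thrb")). Note that numFound reports the total number of matches, while docs contains at most rows of them.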