Open Access Biomedical Resources

Cell Datasets

ACT

  • act logo
  • Description: ACT was developed based on the structural marker map and the cell type enrichment algorithm, with the aim of efficiently facilitating the process of cell type annotation.
  • URL: ACT-Annotation of Cell Types

Panglao DB

  • panglaodb logo
  • Description: Panglao DB is a database for the scientific community interested in exploration of single cell RNA sequencing experiments from mouse and human.
  • URL: Panglao DB

Cell Marker

  • cellmarker logo
  • Description: Cell Marker 2.0 is a database that integrates multi-tissue cell type markers and supports data browsing, searching, downloading, storage, visualization, and functional analysis.
  • URL: Cell Marker

Cell Taxonomy

  • celltaxonomy logo
  • Description: CellTaxonomy is a comprehensive and curated repository of cell types and associated cell markers encompassing a wide range of species, tissues and conditions.
  • URL: Cell Taxonomy

Human Cell Atlas

  • hca-logo
  • Description: HCA is a comprehensive, high-quality, and accessible human cell atlas database that supports research and applications in the fields of biology, medicine, and bioinformatics. It covers all major human organ and tissue types, including the brain, heart, lungs, liver, kidneys, pancreas, intestines, reproductive system, and more.
  • URL: Human Cell Atlas

Cell Landscape

  • cellland-logo
  • Description: Cell Landscape is an organism-level cell atlas, constructed using single-cell RNA sequencing technology, for mice, zebrafish, and fruit flies across multiple life stages.
  • URL: Cell Landscape

Knowledge Graphs

TarKG

  • tarkg-logo
  • Description: TarKG is a holistic knowledge graph tailored for target discovery. Centered around three core entity types (Disease, Gene, and Compound), TarKG integrates data from seven existing biomedical knowledge graphs, nine public databases, and various Traditional Chinese Medicine knowledge databases.
  • URL: TarKG

PrimeKG

  • primekg-logo
  • Description: PrimeKG is a precision medicine-oriented knowledge graph that provides a holistic view of diseases. PrimeKG integrates 20 high-quality resources to describe 17,080 diseases with 4,050,249 relationships representing ten major biological scales.
  • URL: PrimeKG

PharmKG

  • pharmkg-logo
  • Description: PharmKG is a multi-relational, attributed biomedical knowledge graph, composed of more than 500 thousand individual interconnections between genes, drugs and diseases, with 29 relation types over a vocabulary of ~8000 disambiguated entities.
  • URL: PharmKG

DisGeNET

  • disgenet-logo
  • Description: DisGeNET is a resource that integrates disease–gene associations from several sources. The linked repository extracts the associations for the diseases and genes used in Rephetio, a network for drug repurposing.
  • URL: DisGeNET

Hetionet

  • hetionet-logo
  • Description: Hetionet is a hetnet, a network with multiple node and edge (relationship) types, that encodes biology. The hetnet was designed for Project Rephetio, which aims to systematically identify why drugs work and predict new therapies for drugs.
  • URL: Hetionet

OpenBioLink

  • openbiolink-logo
  • Description: OpenBioLink is a resource and evaluation framework for evaluating link prediction models on heterogeneous biomedical graph data. It contains benchmark datasets as well as tools for creating custom benchmarks and evaluating models.
  • URL: OpenBioLink

BioKG

  • biokg-logo
  • Description: BioKG is a knowledge graph for relational learning on biological data.
  • URL: BioKG

DRKG

  • drkg-logo
  • Description: DRKG is a comprehensive biological knowledge graph relating genes, compounds, diseases, biological processes, side effects and symptoms.
  • URL: DRKG

iKraph

  • ikraph-logo
  • Description: iKraph is a comprehensive, large-scale biomedical knowledge graph for AI-powered, data-driven biomedical research.
  • URL: iKraph

OREGANO

  • oregano-logo
  • Description: The OREGANO project aims to build a holistic knowledge graph on drugs and to apply link prediction approaches for the discovery of possible drug–target relations for the purpose of drug repositioning.
  • URL: OREGANO

Monarch

  • monarch-logo
  • Description: The Monarch Knowledge Graph (KG) comprises the combined knowledge of 33 biomedical resources and biomedical ontologies, and is updated with the latest data from each source once a month.
  • URL: Monarch

Orphanet

  • orphanet-logo
  • Description: Orphanet has developed and maintains the Orphanet nomenclature of rare diseases, a multilingual, standardised, controlled medical terminology specific to rare diseases.
  • URL: Orphanet

Open Targets Platform

  • opentargets-logo
  • Description: The Open Targets Platform is a comprehensive data integration tool that supports systematic identification and prioritisation of potential therapeutic drug targets. By integrating publicly available datasets including data generated by the Open Targets experimental and informatics research programmes, the Platform provides data and services to assist in the task of therapeutic hypothesis building.
  • URL: Open Targets Platform

HCDT 2.0

  • hcdt-logo
  • Description: HCDT 2.0 is a comprehensive database that provides validated associations between drugs and targets (genes, RNAs, and pathways) and contains 1,284,353 high-confidence drug-target interactions.
  • URL: HCDT 2.0

DrugBank

  • drugbank-logo
  • Description: The DrugBank database is a comprehensive, freely accessible, online database containing information on drugs and drug targets created and maintained by the University of Alberta and The Metabolomics Innovation Centre located in Alberta, Canada. DrugBank combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information.
  • URL: DrugBank

PHAROS

  • pharos-logo
  • Description: Pharos is the user interface to the Knowledge Management Center (KMC) for the Illuminating the Druggable Genome (IDG) program. The goal of KMC is to develop a comprehensive, integrated knowledge-base for the Druggable Genome (DG) to illuminate the uncharacterized and/or poorly annotated portion of the DG.
  • URL: PHAROS

SIDER

  • sider-logo
  • Description: SIDER contains information on marketed medicines and their recorded adverse drug reactions. The information is extracted from public documents and package inserts. The available information includes side effect frequency, drug and side effect classifications, as well as links to further information, for example drug–target relations.
  • URL: SIDER

REACTOME

  • reactome-logo
  • Description: REACTOME is an open-source, open access, manually curated and peer-reviewed pathway database. Its goal is to provide intuitive bioinformatics tools for the visualization, interpretation and analysis of pathway knowledge to support basic and clinical research, genome analysis, modeling, systems biology and education.
  • URL: REACTOME

ChEMBL

  • chembl-logo
  • Description: ChEMBL is a manually curated database of bioactive molecules with drug-like properties. It brings together chemical, bioactivity and genomic data to aid the translation of genomic information into effective new drugs.
  • URL: ChEMBL

ClinVar

  • clinvar-logo
  • Description: ClinVar is a freely accessible, public archive of reports of human variations classified for diseases and drug responses, with supporting evidence. ClinVar thus facilitates access to and communication about the relationships asserted between human variation and observed conditions, and the history of those assertions.
  • URL: ClinVar

Ontologies & Terms

OMIM

  • omim-logo
  • Description: OMIM is a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily. The full-text, referenced overviews in OMIM contain information on all known Mendelian disorders and over 16,000 genes. OMIM focuses on the relationship between phenotype and genotype, and its entries contain copious links to other genetics resources.
  • URL: OMIM

MeSH

  • mesh-logo
  • Description: The Medical Subject Headings (MeSH) thesaurus is a controlled and hierarchically-organized vocabulary produced by the National Library of Medicine. It is used for indexing, cataloging, and searching of biomedical and health-related information. MeSH includes the subject headings appearing in MEDLINE/PubMed, the NLM Catalog, and other NLM databases.
  • URL: MeSH

Mondo

  • mondo-logo
  • Description: Mondo’s development is coordinated with the Human Phenotype Ontology (HPO), which describes the individual phenotypic features that constitute a disease. Like the HPO, Mondo provides a hierarchical structure which can be used for classification or “rolling up” diseases to higher level groupings. It provides mappings to other disease resources, but in contrast to other mappings between ontologies, each Mondo mapping is precisely annotated using strict semantics, making it clear when two disease names or identifiers are equivalent or one-to-one rather than simply closely related.
  • URL: Mondo

Human Phenotype Ontology

  • hpo-logo
  • Description: The Human Phenotype Ontology (HPO) provides a standardized vocabulary of phenotypic abnormalities encountered in human disease. Each term in the HPO describes a phenotypic abnormality, such as Atrial septal defect. The HPO is currently being developed using the medical literature, Orphanet, DECIPHER, and OMIM.
  • URL: The Human Phenotype Ontology

OBO Foundry

  • obo-logo
  • Description: The Open Biological and Biomedical Ontologies (OBO) provide a suite of high-quality, interoperable, free and open source tools for sharing scientific knowledge and making new discoveries.
  • URL: OBO Foundry

UMLS

  • umls-logo
  • Description: The UMLS integrates and distributes key terminology, classification and coding standards, and associated resources to promote creation of more effective and interoperable biomedical information systems and services, including electronic health records.
  • URL: UMLS

GeneOntology

  • geneontology-logo
  • Description: The GO ontology is the logical structure describing the full complexity of biology. It comprises the ‘classes’ (often referred to as ‘terms’) describing the many different types of molecular functions (Molecular Function), the pathways carrying out different biological programs (Biological Process), and the cellular locations where these occur (Cellular Component). The corpus of GO annotations consists of traceable (i.e., associated with scientific articles), evidence-based statements relating a specific gene product to a specific ontology term.
  • URL: GeneOntology

Genome Datasets

Bgee

  • bgee-logo
  • Description: Bgee is a database for retrieval and comparison of gene expression patterns across multiple animal species. It provides an intuitive answer to the question "where is a gene expressed?" and supports research in cancer and agriculture, as well as evolutionary biology.
  • URL: Bgee

CancerLivER

  • cancerliver-logo
  • Description: CancerLivER (Liver Cancer Expression Resource) is a database of liver cancer that maintains gene expression datasets and biomarkers curated from public repositories and literature, respectively. It contains three modules for extracting and analyzing data.
  • URL: CancerLivER

CLCA

  • clca-logo
  • Description: The Chinese Liver Cancer Atlas (CLCA) project provides deep whole-genome sequencing of 494 hepatocellular carcinomas and their matched normal samples.
  • URL: The Chinese Liver Cancer Atlas (CLCA)

TCGA-LIHC (Hepatocellular Carcinoma)

  • lihc-logo
  • Description: GDC TCGA Liver Cancer (LIHC)
  • URL: TCGA-LIHC

TCGA-LUAD (Lung Adenocarcinoma)

  • lihc-logo
  • Description: GDC TCGA Lung Adenocarcinoma (LUAD)
  • URL: TCGA-LUAD

TCGA-COAD (Colon Cancer)

  • lihc-logo
  • Description: GDC TCGA Colon Cancer (COAD)
  • URL: TCGA-COAD

TCGA-PAAD (Pancreatic Cancer)

  • lihc-logo
  • Description: GDC TCGA Pancreatic Cancer (PAAD)
  • URL: TCGA-PAAD