🗄️ Conversion of biomedical nomenclatures like HGNC to OBO

biopragmatics/obo-db-ingest

OBO Database Ingestion

This repository shows how databases can be formalized as OBO ontologies in the OBO flat file format, OWL format, and OBO Graph JSON format. A list of the databases whose controlled vocabularies and related content can be readily converted to OBO can be found in the sources/ folder of the PyOBO source code.

Further discussion:

Contents

Each resource gets a subdirectory in the export/ directory containing the following exports:

A manifest of all resources is available at manifest.yml.

Build

To generate all OBO, OWL, and OFN files, run the following shell command:

$ uv run --script build.py

To generate only a single resource, pass it with -x, for example:

$ uv run --script build.py -x spdx

To pin the version of a specific ontology, pass --version-override followed by the resource and the version, for example:

$ uv run --script build.py -x mesh --version-override mesh 2018
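When the build is driven from another script, the command lines above can be assembled programmatically. The following is a minimal sketch, not part of this repository: `build_command` is a hypothetical helper, and it only reproduces the flag usage shown above (`-x` with one resource, `--version-override` with one resource/version pair). Whether the flags can be repeated is not stated here, so the sketch does not assume it.

```python
import shlex


def build_command(resource=None, version_override=None):
    """Assemble the argv list for build.py (hypothetical helper).

    resource: optional resource name passed to -x, e.g. "spdx".
    version_override: optional (resource, version) pair, e.g. ("mesh", "2018").
    """
    cmd = ["uv", "run", "--script", "build.py"]
    if resource is not None:
        cmd += ["-x", resource]
    if version_override is not None:
        name, version = version_override
        cmd += ["--version-override", name, str(version)]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; to execute it, pass the
    # list to subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(build_command("mesh", ("mesh", "2018"))))
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` without shell quoting.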

PURLs

See PURL configuration at https://github.com/perma-id/w3id.org/tree/master/biopragmatics. This W3ID entry makes ontology artifacts in the "export" folder (https://github.com/biopragmatics/obo-db-ingest/tree/main/export) resolvable. Here are a few examples:

| Resource      | Version Type | Example PURL |
|---------------|--------------|--------------|
| Reactome      | Sequential   | https://w3id.org/biopragmatics/resources/reactome/83/reactome.obo |
| InterPro      | Major/Minor  | https://w3id.org/biopragmatics/resources/interpro/92.0/interpro.obo |
| DrugBank Salt | Semantic     | https://w3id.org/biopragmatics/resources/drugbank.salt/5.1.9/drugbank.salt.obo |
| MeSH          | Year         | https://w3id.org/biopragmatics/resources/mesh/2023/mesh.obo.gz |
| UniProt       | Year/Month   | https://w3id.org/biopragmatics/resources/uniprot/2022_05/uniprot.obo.gz |
| HGNC          | Date         | https://w3id.org/biopragmatics/resources/hgnc/2023-02-01/hgnc.obo |
| CGNC          | Unversioned  | https://w3id.org/biopragmatics/resources/cgnc/cgnc.obo |
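The example PURLs share a visible pattern: the base URL, the resource prefix, an optional version segment, and a file named after the prefix. The sketch below reconstructs that pattern from the examples alone. `make_purl` and its parameters are hypothetical names, and the pattern is inferred from the listed examples rather than taken from the W3ID configuration, so it may not hold for every resource.

```python
BASE = "https://w3id.org/biopragmatics/resources"


def make_purl(prefix, version=None, extension="obo"):
    """Build a PURL following the pattern seen in the examples above.

    prefix: resource prefix, e.g. "reactome" or "drugbank.salt".
    version: version string, or None for an unversioned resource.
    extension: file extension; some examples use "obo.gz".
    """
    parts = [BASE, prefix]
    if version is not None:
        parts.append(str(version))
    parts.append(f"{prefix}.{extension}")
    return "/".join(parts)


if __name__ == "__main__":
    print(make_purl("reactome", "83"))
    print(make_purl("mesh", "2023", extension="obo.gz"))
    print(make_purl("cgnc"))
```

Which resources are gzipped and which are unversioned has to be known in advance; the sketch does not look this up.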