Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2023.02.11.528088v1?rss=1
Authors: Stear, B. J., Ahooyi, T. M., Vasisht, S., Simmons, A., Beigel, K., Callahan, T. J., Silverstein, J. C., Taylor, D. M.
Abstract: The use of biomedical knowledge graphs (BMKG) for knowledge representation and data integration has increased drastically in the past several years due to the size, diversity, and complexity of biomedical datasets and databases. Data extraction from a single dataset or database is usually not particularly challenging. However, if a scientific question must rely on analysis across multiple databases or datasets, it can often take many hours to correctly and reproducibly extract and integrate data towards effective analysis. To overcome this issue, we created Petagraph, a large-scale BMKG that integrates biomolecular data into a schema incorporating the Unified Medical Language System (UMLS). Petagraph is instantiated on the Neo4j graph platform, and to date, has fifteen integrated biomolecular datasets. The majority of the data consists of entities or relationships related to genes, animal models, human phenotypes, drugs, and chemicals. Quantitative data sets containing values from gene expression analyses, chromatin organization, and genetic analyses have also been included. By incorporating models of biomolecular data types, the datasets can be traversed with hundreds of ontologies and controlled vocabularies native to the UMLS, effectively bringing the data to the ontologies. Petagraph allows users to analyze relationships between complex multi-omics data quickly and efficiently.
Copy rights belong to original authors. Visit the link for more info
Podcast created by Paper Player, LLC