Using BioNetDB
Populating BioNetDB
Before running queries and analyses on BioNetDB, you must populate the database using the administration command-line script, bionetdb-admin.sh.
$ ./build/bin/bionetdb-admin.sh
Program: BioNetDB (OpenCB)
Description: BioNetDB implements a storage engine to work with biological networks using a NoSQL Graph database
Usage: bionetdb-admin.sh [-h|--help] [--version] <command> [options]
Commands:
download Download all different data sources provided in the configuration.yml file
build Build the data models in CSV format files
import Import the built data models in format CSV files into the BioNetDB database
BioNetDB is designed to let users insert very large amounts of data. To make this process as efficient as possible, BioNetDB uses Neo4j's bulk import tool, neo4j-admin import, which loads large data sets from a collection of CSV files.
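To make the CSV-based import more concrete, the sketch below writes three tiny files in the header format that Neo4j's bulk importer reads (`:ID`, `:LABEL`, `:START_ID`, `:END_ID`, `:TYPE`) and prints a matching neo4j-admin import command. The file names, column names, labels and relationship type are illustrative assumptions, not the schema that BioNetDB's build step produces. The command uses Neo4j 4.x syntax, and other Neo4j versions spell it differently.

```shell
#!/bin/sh
# Sketch of the CSV layout consumed by Neo4j's bulk importer.
# All file names, columns and labels below are hypothetical examples.
set -eu

csv_dir="$(mktemp -d)"

# Node file: ":ID" marks the unique key, ":LABEL" the node label.
cat > "$csv_dir/genes.csv" <<'EOF'
geneId:ID,name,:LABEL
ENSG00000139618,BRCA2,GENE
EOF

cat > "$csv_dir/proteins.csv" <<'EOF'
proteinId:ID,name,:LABEL
P51587,BRCA2_HUMAN,PROTEIN
EOF

# Relationship file: ":START_ID" and ":END_ID" refer to node IDs above.
cat > "$csv_dir/gene_protein.csv" <<'EOF'
:START_ID,:END_ID,:TYPE
ENSG00000139618,P51587,ENCODES
EOF

# Build (but do not run) the bulk import command, Neo4j 4.x style.
import_cmd="neo4j-admin import --nodes=$csv_dir/genes.csv --nodes=$csv_dir/proteins.csv --relationships=$csv_dir/gene_protein.csv"
printf '%s\n' "$import_cmd"
```

The script only prints the command, so it is safe to run on a machine without Neo4j installed. In BioNetDB the import command of bionetdb-admin.sh takes care of this step.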
To populate BioNetDB, follow these steps:
Download the biological data, e.g., genes, proteins, disease panels, variants and pathways.
Create the CSV files from the biological data. This step is called build.
Import the CSV files into the BioNetDB database.
Let's see how to perform those steps using the bionetdb-admin.sh command line.
Download
The BioNetDB configuration file contains a section called download where users indicate the different locations to the biological data to download.
Execute the following command to download the biological data into the ~/data directory.
Build
Once the data is downloaded, it must be converted to CSV files before it can be imported. The CSV files are created using the build command:
Import
The CSV files created previously are loaded into the database by executing the import command:
In addition to populating the database, the import command creates indexes on the main nodes to speed up subsequent queries and analyses.
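The three steps can also be chained in a small wrapper that stops at the first failure. The following is a minimal sketch, not part of BioNetDB: the `populate` function and the `BIONETDB_ADMIN`, `DOWNLOAD_OPTS`, `BUILD_OPTS` and `IMPORT_OPTS` variables are hypothetical names. The options each command requires are not listed here, so they are left as placeholders to be filled in from the tool's help output.

```shell
#!/bin/sh
# Sketch: run download, build and import in order, stopping on the first error.
# The *_OPTS variables are hypothetical placeholders for each command's options.
set -eu

# Path to the admin script, as used earlier on this page.
BIONETDB_ADMIN="${BIONETDB_ADMIN:-./build/bin/bionetdb-admin.sh}"

populate() {
  for step in download build import; do
    case "$step" in
      download) opts="${DOWNLOAD_OPTS:-}" ;;
      build)    opts="${BUILD_OPTS:-}" ;;
      import)   opts="${IMPORT_OPTS:-}" ;;
    esac
    echo "Running step: $step" >&2
    # $opts is intentionally unquoted so it splits into separate arguments.
    "$BIONETDB_ADMIN" "$step" $opts || {
      echo "Step '$step' failed, aborting" >&2
      return 1
    }
  done
}
```

After setting the three option variables, call `populate`. Because each step consumes the output of the previous one, aborting early avoids importing from an incomplete build.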