Using BioNetDB

Populating BioNetDB

Before running queries and analyses on BioNetDB, you have to populate the database using the administration command line, i.e., the bionetdb-admin.sh script.

$ ./build/bin/bionetdb-admin.sh 

Program:     BioNetDB (OpenCB)

Description: BioNetDB implements a storage engine to work with biological networks using a NoQSL Graph database

Usage:       bionetdb-admin.sh [-h|--help] [--version] <command> [options]

Commands:
            download  Download all different data sources provided in the configuration.yml file
               build  Build the data models in CSV format files
              import  Import the built data models in format CSV files into the BioNetDB database

BioNetDB is designed to let users insert very large amounts of data. To make this process as efficient as possible, BioNetDB uses Neo4j's bulk import tool, neo4j-admin import, which loads large data sets from a collection of CSV files.

To populate BioNetDB, follow these steps:

  1. Download the biological data, e.g., genes, proteins, disease panels, variants and pathways.

  2. Create the CSV files from the biological data. This step is called build.

  3. Import the CSV files into the BioNetDB database.

Let's see how to perform those steps using the bionetdb-admin.sh command line.
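
The three steps can also be chained in a small wrapper so that a failure in one step stops the ones that follow. The sketch below uses only the download, build and import invocations shown on this page. The names ADMIN, DRY_RUN, run_step and populate_bionetdb are illustrative and are not part of BioNetDB. It also assumes that import reads from the directory that build wrote to.

```shell
#!/usr/bin/env bash
# Sketch: chain the three bionetdb-admin.sh steps, stopping at the first failure.
# ADMIN, DRY_RUN and the function names are illustrative, not BioNetDB options.

ADMIN="${ADMIN:-./build/bin/bionetdb-admin.sh}"

# Run a command, or only print it when DRY_RUN=1.
run_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# populate_bionetdb <data_dir> <csv_dir> <database>
populate_bionetdb() {
    local data_dir="$1" csv_dir="$2" database="$3"
    run_step "$ADMIN" download --output "$data_dir" &&
    run_step "$ADMIN" build --input "$data_dir" --output "$csv_dir" &&
    run_step "$ADMIN" import --input "$csv_dir" --database "$database"
}
```

Calling populate_bionetdb ~/data ~/csv neo4j with DRY_RUN=1 set only prints the three commands, which is a cheap way to review them before starting a long run.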

Download

The BioNetDB configuration file contains a section called download where users specify the locations of the biological data to download.

## Raw data download URLs
download:
    network:
      ## In this version we are loading Reactome as the default network
      host: http://resources.opencb.org/datasets/reactome/Homo_sapiens.owl
    ensemblGene:
      host: http://bioinfo.hpc.cam.ac.uk/downloads/cellbase/v5/homo_sapiens_grch38/build/gene.json.gz
    refSeqGene:
      host: http://bioinfo.hpc.cam.ac.uk/downloads/cellbase/v5/homo_sapiens_grch38/build/refseq.json.gz
    protein:
      host: http://bioinfo.hpc.cam.ac.uk/downloads/cellbase/v5/homo_sapiens_grch38/build/protein.json.gz
    panel:
      host: http://resources.opencb.org/opencb/opencga/disease-panels/panelapp/panels.json
    clinicalVariant:
      host: http://bioinfo.hpc.cam.ac.uk/downloads/cellbase/v5/homo_sapiens_grch38/build/clinical_variants.full.json.gz
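
To review which sources are configured before starting a download, the host entries can be listed from the configuration file. The sketch below is a minimal awk-based reader that assumes the layout shown above (a top-level download: key, one nested key per source, and a host: line under each). It is not a general YAML parser, and list_download_hosts is a hypothetical helper name.

```shell
# list_download_hosts <config_file>
# Prints "<source> <url>" for each entry under the download: section.
# Assumes the indentation layout shown above; not a general YAML parser.
list_download_hosts() {
    awk '
        /^download:/                       { in_section = 1; next }
        in_section && /^[^ #]/             { in_section = 0 }   # next top-level key ends the section
        in_section && /^ *##/              { next }             # skip comment lines
        in_section && /^ +[A-Za-z]+:[ ]*$/ { name = $1; sub(/:$/, "", name); next }
        in_section && /^ +host:/           { print name, $2 }
    ' "$1"
}
```

For example, list_download_hosts configuration.yml prints one line per source, such as the network entry followed by its Reactome URL.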

Execute the following command to download the biological data into the directory ~/data.

$ ./build/bin/bionetdb-admin.sh download --output ~/data

$ ls -lhtr ~/data/*
-rw-rw-r-- 1 jtarraga jtarraga 523M Nov 17 12:24 /home/jtarraga/bioinfo/bionetdb/download/gene.json.gz
-rw-rw-r-- 1 jtarraga jtarraga 236M Nov 17 12:25 /home/jtarraga/bioinfo/bionetdb/download/refseq.json.gz
-rw-rw-r-- 1 jtarraga jtarraga 124M Nov 17 12:26 /home/jtarraga/bioinfo/bionetdb/download/protein.json.gz
-rw-rw-r-- 1 jtarraga jtarraga  49M Nov 17 12:26 /home/jtarraga/bioinfo/bionetdb/download/panels.json
-rw-rw-r-- 1 jtarraga jtarraga 1.4G Nov 17 12:34 /home/jtarraga/bioinfo/bionetdb/download/clinical_variants.full.json.gz
-rw-rw-r-- 1 jtarraga jtarraga 248M Nov 17 12:35 /home/jtarraga/bioinfo/bionetdb/download/Homo_sapiens.owl
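
Since several of these files are large, an interrupted transfer can leave a truncated archive behind. As an optional sanity check before building, the compressed files can be tested with gzip -t. The helper below is a sketch, check_gz_files is a hypothetical name, and it only covers the .gz files (not Homo_sapiens.owl or panels.json).

```shell
# check_gz_files <dir>
# Runs `gzip -t` on every .gz file in <dir>; returns non-zero if any file fails.
check_gz_files() {
    local dir="$1" status=0 f
    for f in "$dir"/*.gz; do
        [ -e "$f" ] || continue          # the glob matched nothing
        if gzip -t "$f" 2>/dev/null; then
            echo "OK      $f"
        else
            echo "CORRUPT $f"
            status=1
        fi
    done
    return "$status"
}
```

Running check_gz_files ~/data prints one line per archive and returns a non-zero status if any of them fails the test.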

Build

Once the data is downloaded, it has to be converted to CSV files before it can be imported. The CSV files are created using the build command:

$ ./build/bin/bionetdb-admin.sh build --input ~/data --output ~/csv

$ ls -lhtr ~/csv/*
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 VARIANT_ANNOTATION.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 TRANSPORT.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 REGULATION_REGION.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 REFSEQ_TRANSCRIPT.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 REFSEQ_GENE.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 REFSEQ_EXON.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 PROTEIN_ANNOTATION.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 PHYSICAL_ENTITY.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 PANEL_VARIANT.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 PANEL_STR.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 PANEL_REGION.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 ONTOLOGY.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 INTERACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 GENE_ANNOTATION.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 ENSEMBL_TRANSCRIPT.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 ENSEMBL_GENE.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 ENSEMBL_EXON.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 COMPLEX_ASSEMBLY.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 TRANSCRIPT_ANNOTATION_FLAG.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 PHENOTYPE.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 ONTOLOGY_TERM.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 EXON_OVERLAP.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 DISORDER.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 CUSTOM.csv
-rw-rw-r-- 1 jtarraga jtarraga    0 Feb 15 15:01 ASSEMBLY.csv
drwxr-xr-x 2 jtarraga jtarraga 4.0K Feb 15 15:02 xref.proteins.rocksdb
drwxr-xr-x 2 jtarraga jtarraga 4.0K Feb 15 15:02 proteins.rocksdb
drwxr-xr-x 2 jtarraga jtarraga  32K Feb 15 15:11 genes.rocksdb
drwxr-xr-x 2 jtarraga jtarraga 4.0K Feb 15 15:14 xref.genes.rocksdb
drwxr-xr-x 2 jtarraga jtarraga 4.0K Feb 15 15:42 uidRocksDB
-rw-rw-r-- 1 jtarraga jtarraga 4.5M Feb 15 15:42 IS___TRANSCRIPT___PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga  17K Feb 15 15:42 REACTANT___REACTION___UNDEFINED.csv
-rw-rw-r-- 1 jtarraga jtarraga  133 Feb 15 15:42 PATHWAY_NEXT_STEP___PATHWAY___REGULATION.csv
-rw-rw-r-- 1 jtarraga jtarraga   51 Feb 15 15:42 HAS___STRUCTURAL_VARIATION___BREAKEND.csv
-rw-rw-r-- 1 jtarraga jtarraga 309K Feb 15 15:42 COMPONENT_OF_PHYSICAL_ENTITY_COMPLEX___PROTEIN___PHYSICAL_ENTITY_COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga 405M Feb 15 15:42 ANNOTATION___VARIANT___TRANSCRIPT_CONSTRAINT_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga  70M Feb 15 15:42 VARIANT_SAMPLE_DATA.csv
-rw-rw-r-- 1 jtarraga jtarraga 1.2G Feb 15 15:42 VARIANT_POPULATION_FREQUENCY.csv
-rw-rw-r-- 1 jtarraga jtarraga  271 Feb 15 15:42 VARIANT_FILE.csv
-rw-rw-r-- 1 jtarraga jtarraga 345M Feb 15 15:42 VARIANT.csv
-rw-rw-r-- 1 jtarraga jtarraga 259M Feb 15 15:42 VARIANT_CLASSIFICATION.csv
-rw-rw-r-- 1 jtarraga jtarraga  62M Feb 15 15:42 TRANSCRIPT.csv
-rw-rw-r-- 1 jtarraga jtarraga 8.7M Feb 15 15:42 TRANSCRIPT_CONSTRAINT_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga 510M Feb 15 15:42 TFBS.csv
-rw-rw-r-- 1 jtarraga jtarraga   41 Feb 15 15:42 TARGET___GENE___MIRNA_MATURE.csv
-rw-rw-r-- 1 jtarraga jtarraga 1.6K Feb 15 15:42 SO_TERM.csv
-rw-rw-r-- 1 jtarraga jtarraga 276K Feb 15 15:42 SMALL_MOLECULE.csv
-rw-rw-r-- 1 jtarraga jtarraga  164 Feb 15 15:42 SAMPLE.csv
-rw-rw-r-- 1 jtarraga jtarraga 6.0M Feb 15 15:42 REPEAT.csv
-rw-rw-r-- 1 jtarraga jtarraga  93K Feb 15 15:42 REGULATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 1.3M Feb 15 15:42 REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga 113K Feb 15 15:42 REACTANT___REACTION___PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga  22K Feb 15 15:42 REACTANT___REACTION___DNA.csv
-rw-rw-r-- 1 jtarraga jtarraga  35K Feb 15 15:42 PROTEIN_KEYWORD.csv
-rw-rw-r-- 1 jtarraga jtarraga 965M Feb 15 15:42 PROPERTY.csv
-rw-rw-r-- 1 jtarraga jtarraga 9.3K Feb 15 15:42 PRODUCT___REACTION___UNDEFINED.csv
-rw-rw-r-- 1 jtarraga jtarraga 2.5K Feb 15 15:42 PRODUCT___REACTION___RNA.csv
-rw-rw-r-- 1 jtarraga jtarraga 1.2M Feb 15 15:42 PHYSICAL_ENTITY_COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga  42K Feb 15 15:42 PATHWAY_NEXT_STEP___REGULATION___REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga  367 Feb 15 15:42 PATHWAY_NEXT_STEP___REGULATION___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga  46K Feb 15 15:42 PATHWAY_NEXT_STEP___REACTION___REGULATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 1.3K Feb 15 15:42 PATHWAY_NEXT_STEP___REACTION___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga 2.3K Feb 15 15:42 PATHWAY_NEXT_STEP___PATHWAY___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga  11M Feb 15 15:42 PANEL_GENE.csv
-rw-rw-r-- 1 jtarraga jtarraga  62M Feb 15 15:42 MIRNA_TARGET.csv
-rw-rw-r-- 1 jtarraga jtarraga 273K Feb 15 15:42 MIRNA_MATURE.csv
-rw-rw-r-- 1 jtarraga jtarraga 278K Feb 15 15:42 MIRNA.csv
-rw-rw-r-- 1 jtarraga jtarraga  59K Feb 15 15:42 MATURE___MIRNA___MIRNA_MATURE.csv
-rw-rw-r-- 1 jtarraga jtarraga   53 Feb 15 15:42 MATE___BREAKEND___BREAKEND_MATE.csv
-rw-rw-r-- 1 jtarraga jtarraga  400 Feb 15 15:42 IS___RNA___MIRNA.csv
-rw-rw-r-- 1 jtarraga jtarraga  31K Feb 15 15:42 IS___GENE___MIRNA.csv
-rw-rw-r-- 1 jtarraga jtarraga  16K Feb 15 15:42 IS___DNA___GENE.csv
-rw-rw-r-- 1 jtarraga jtarraga   87 Feb 15 15:42 INTERNAL_CONNFIG.csv
-rw-rw-r-- 1 jtarraga jtarraga  236 Feb 15 15:42 INDIVIDUAL.csv
-rw-rw-r-- 1 jtarraga jtarraga 161M Feb 15 15:42 HERITABLE_TRAIT.csv
-rw-rw-r-- 1 jtarraga jtarraga   52 Feb 15 15:42 HAS___VARIANT_FILE___SAMPLE.csv
-rw-rw-r-- 1 jtarraga jtarraga   58 Feb 15 15:42 HAS___INDIVIDUAL___SAMPLE.csv
-rw-rw-r-- 1 jtarraga jtarraga 105M Feb 15 15:42 HAS___CLINICAL_EVIDENCE___HERITABLE_TRAIT.csv
-rw-rw-r-- 1 jtarraga jtarraga  81M Feb 15 15:42 HAS___CLINICAL_EVIDENCE___EVIDENCE_SUBMISSION.csv
-rw-rw-r-- 1 jtarraga jtarraga 236M Feb 15 15:42 GENE_EXPRESSION.csv
-rw-rw-r-- 1 jtarraga jtarraga   32 Feb 15 15:42 FAMILY.csv
-rw-rw-r-- 1 jtarraga jtarraga  44K Feb 15 15:42 DNA.csv
-rw-rw-r-- 1 jtarraga jtarraga 5.2M Feb 15 15:42 DATA___VARIANT___VARIANT_FILE_DATA.csv
-rw-rw-r-- 1 jtarraga jtarraga 8.4K Feb 15 15:42 CYTOBAND.csv
-rw-rw-r-- 1 jtarraga jtarraga 3.8K Feb 15 15:42 CONTROLLER___REGULATION___SMALL_MOLECULE.csv
-rw-rw-r-- 1 jtarraga jtarraga   93 Feb 15 15:42 CONTROLLER___REGULATION___RNA.csv
-rw-rw-r-- 1 jtarraga jtarraga 8.1K Feb 15 15:42 CONTROLLER___REGULATION___PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga  22K Feb 15 15:42 CONTROLLER___REGULATION___PHYSICAL_ENTITY_COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga 4.5K Feb 15 15:42 CONTROLLER___CATALYSIS___UNDEFINED.csv
-rw-rw-r-- 1 jtarraga jtarraga  56K Feb 15 15:42 CONTROLLER___CATALYSIS___PHYSICAL_ENTITY_COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga  35K Feb 15 15:42 CONTROLLED___REGULATION___REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga   43 Feb 15 15:42 CONTROLLED___REGULATION___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga   45 Feb 15 15:42 CONTROLLED___REGULATION___CATALYSIS.csv
-rw-rw-r-- 1 jtarraga jtarraga 244K Feb 15 15:42 COMPONENT_OF_PATHWAY___REACTION___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga  44K Feb 15 15:42 COMPONENT_OF_PATHWAY___PATHWAY___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga 561M Feb 15 15:42 CLINICAL_EVIDENCE.csv
-rw-rw-r-- 1 jtarraga jtarraga 8.5K Feb 15 15:42 CELLULAR_LOCATION___REGULATION___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 125K Feb 15 15:42 CELLULAR_LOCATION___REACTION___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 277K Feb 15 15:42 CELLULAR_LOCATION___PHYSICAL_ENTITY_COMPLEX___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga  17K Feb 15 15:42 CELLULAR_LOCATION___DNA___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 7.0K Feb 15 15:42 CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga  41K Feb 15 15:42 CELLULAR_LOCATION___CATALYSIS___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga  172 Feb 15 15:42 BREAKEND.csv
-rw-rw-r-- 1 jtarraga jtarraga 150M Feb 15 15:42 ANNOTATION___VARIANT___VARIANT_FUNCTIONAL_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga   55 Feb 15 15:42 ANNOTATION___VARIANT___VARIANT_DRUG_INTERACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga 639M Feb 15 15:42 ANNOTATION___VARIANT___VARIANT_CONSEQUENCE_TYPE.csv
-rw-rw-r-- 1 jtarraga jtarraga  20M Feb 15 15:42 ANNOTATION___VARIANT___REPEAT.csv
-rw-rw-r-- 1 jtarraga jtarraga  76M Feb 15 15:42 ANNOTATION___VARIANT___CYTOBAND.csv
-rw-rw-r-- 1 jtarraga jtarraga 552M Feb 15 15:42 ANNOTATION___VARIANT_CONSEQUENCE_TYPE___TRANSCRIPT.csv
-rw-rw-r-- 1 jtarraga jtarraga 716M Feb 15 15:42 ANNOTATION___VARIANT_CONSEQUENCE_TYPE___SO_TERM.csv
-rw-rw-r-- 1 jtarraga jtarraga 244M Feb 15 15:42 ANNOTATION___VARIANT_CONSEQUENCE_TYPE___PROTEIN_VARIANT_ANNOTATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 199M Feb 15 15:42 ANNOTATION___VARIANT___CLINICAL_EVIDENCE.csv
-rw-rw-r-- 1 jtarraga jtarraga 176M Feb 15 15:42 ANNOTATION___TRANSCRIPT___XREF.csv
-rw-rw-r-- 1 jtarraga jtarraga 4.1M Feb 15 15:42 ANNOTATION___TRANSCRIPT___TRANSCRIPT_CONSTRAINT_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga  27M Feb 15 15:42 ANNOTATION___PROTEIN___XREF.csv
-rw-rw-r-- 1 jtarraga jtarraga 8.3M Feb 15 15:42 ANNOTATION___PROTEIN___PROTEIN_FEATURE.csv
-rw-rw-r-- 1 jtarraga jtarraga  16M Feb 15 15:42 ANNOTATION___MIRNA_MATURE___MIRNA_TARGET.csv
-rw-rw-r-- 1 jtarraga jtarraga 824K Feb 15 15:42 ANNOTATION___GENE___PANEL_GENE.csv
-rw-rw-r-- 1 jtarraga jtarraga  63M Feb 15 15:42 ANNOTATION___GENE___GENE_TRAIT_ASSOCIATION.csv
-rw-rw-r-- 1 jtarraga jtarraga  83K Feb 15 15:42 ANNOTATION___GENE___GENE_DRUG_INTERACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga  80K Feb 15 15:42 ANNOTATION___DRUG___GENE_DRUG_INTERACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga 295M Feb 15 15:42 XREF.csv
-rw-rw-r-- 1 jtarraga jtarraga 581M Feb 15 15:42 VARIANT_FUNCTIONAL_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga  77M Feb 15 15:42 VARIANT_FILE_DATA.csv
-rw-rw-r-- 1 jtarraga jtarraga   88 Feb 15 15:42 VARIANT_DRUG_INTERACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga 883M Feb 15 15:42 VARIANT_CONSERVATION_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga 4.4G Feb 15 15:42 VARIANT_CONSEQUENCE_TYPE.csv
-rw-rw-r-- 1 jtarraga jtarraga 103K Feb 15 15:42 UNDEFINED.csv
-rw-rw-r-- 1 jtarraga jtarraga  54M Feb 15 15:42 TRANSCRIPT_ANNOTATION_EVIDENCE.csv
-rw-rw-r-- 1 jtarraga jtarraga  52K Feb 15 15:42 STRUCTURAL_VARIATION.csv
-rw-rw-r-- 1 jtarraga jtarraga  23K Feb 15 15:42 RNA.csv
-rw-rw-r-- 1 jtarraga jtarraga 149K Feb 15 15:42 REACTANT___REACTION___SMALL_MOLECULE.csv
-rw-rw-r-- 1 jtarraga jtarraga 3.2K Feb 15 15:42 REACTANT___REACTION___RNA.csv
-rw-rw-r-- 1 jtarraga jtarraga 134K Feb 15 15:42 REACTANT___REACTION___PHYSICAL_ENTITY_COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga 516M Feb 15 15:42 PROTEIN_VARIANT_ANNOTATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 1.2G Feb 15 15:42 PROTEIN_SUBSTITUTION_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga  41M Feb 15 15:42 PROTEIN_FEATURE.csv
-rw-rw-r-- 1 jtarraga jtarraga 3.5M Feb 15 15:42 PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga 132K Feb 15 15:42 PRODUCT___REACTION___SMALL_MOLECULE.csv
-rw-rw-r-- 1 jtarraga jtarraga  62K Feb 15 15:42 PRODUCT___REACTION___PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga 148K Feb 15 15:42 PRODUCT___REACTION___PHYSICAL_ENTITY_COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga  21K Feb 15 15:42 PATHWAY_NEXT_STEP___REGULATION___REGULATION.csv
-rw-rw-r-- 1 jtarraga jtarraga  15K Feb 15 15:42 PATHWAY_NEXT_STEP___REGULATION___CATALYSIS.csv
-rw-rw-r-- 1 jtarraga jtarraga 205K Feb 15 15:42 PATHWAY_NEXT_STEP___REACTION___REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga  92K Feb 15 15:42 PATHWAY_NEXT_STEP___REACTION___CATALYSIS.csv
-rw-rw-r-- 1 jtarraga jtarraga  509 Feb 15 15:42 PATHWAY_NEXT_STEP___PATHWAY___REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga  240 Feb 15 15:42 PATHWAY_NEXT_STEP___PATHWAY___CATALYSIS.csv
-rw-rw-r-- 1 jtarraga jtarraga 8.8K Feb 15 15:42 PATHWAY_NEXT_STEP___CATALYSIS___REGULATION.csv
-rw-rw-r-- 1 jtarraga jtarraga  93K Feb 15 15:42 PATHWAY_NEXT_STEP___CATALYSIS___REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga  510 Feb 15 15:42 PATHWAY_NEXT_STEP___CATALYSIS___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga  55K Feb 15 15:42 PATHWAY_NEXT_STEP___CATALYSIS___CATALYSIS.csv
-rw-rw-r-- 1 jtarraga jtarraga 163K Feb 15 15:42 PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga   54 Feb 15 15:42 MOTHER_OF___INDIVIDUAL___INDIVIDUAL.csv
-rw-rw-r-- 1 jtarraga jtarraga 3.2G Feb 15 15:42 HGV.csv
-rw-rw-r-- 1 jtarraga jtarraga  24K Feb 15 15:42 HAS___VARIANT___STRUCTURAL_VARIANT.csv
-rw-rw-r-- 1 jtarraga jtarraga  55M Feb 15 15:42 HAS___TRANSCRIPT___EXON.csv
-rw-rw-r-- 1 jtarraga jtarraga 6.5M Feb 15 15:42 HAS___GENE___TRANSCRIPT.csv
-rw-rw-r-- 1 jtarraga jtarraga  14M Feb 15 15:42 HAS___FEATURE_ONTOLOGY_TERM_ANNOTATION___TRANSCRIPT_ANNOTATION_EVIDENCE.csv
-rw-rw-r-- 1 jtarraga jtarraga   42 Feb 15 15:42 HAS___FAMILY___INDIVIDUAL.csv
-rw-rw-r-- 1 jtarraga jtarraga 850K Feb 15 15:42 HAS___DISEASE_PANEL___PANEL_GENE.csv
-rw-rw-r-- 1 jtarraga jtarraga  90M Feb 15 15:42 HAS___CLINICAL_EVIDENCE___VARIANT_CLASSIFICATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 356M Feb 15 15:42 HAS___CLINICAL_EVIDENCE___PROPERTY.csv
-rw-rw-r-- 1 jtarraga jtarraga 300M Feb 15 15:42 HAS___CLINICAL_EVIDENCE___GENOMIC_FEATURE.csv
-rw-rw-r-- 1 jtarraga jtarraga 474M Feb 15 15:42 GENOMIC_FEATURE.csv
-rw-rw-r-- 1 jtarraga jtarraga  52M Feb 15 15:42 GENE_TRAIT_ASSOCIATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 287K Feb 15 15:42 GENE_DRUG_INTERACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga  19M Feb 15 15:42 GENE.csv
-rw-rw-r-- 1 jtarraga jtarraga  60M Feb 15 15:42 FEATURE_ONTOLOGY_TERM_ANNOTATION.csv
-rw-rw-r-- 1 jtarraga jtarraga   54 Feb 15 15:42 FATHER_OF___INDIVIDUAL___INDIVIDUAL.csv
-rw-rw-r-- 1 jtarraga jtarraga 307M Feb 15 15:42 EXON.csv
-rw-rw-r-- 1 jtarraga jtarraga 211M Feb 15 15:42 EVIDENCE_SUBMISSION.csv
-rw-rw-r-- 1 jtarraga jtarraga  73K Feb 15 15:42 DRUG.csv
-rw-rw-r-- 1 jtarraga jtarraga  55K Feb 15 15:42 DISEASE_PANEL.csv
-rw-rw-r-- 1 jtarraga jtarraga  21M Feb 15 15:42 DATA___VARIANT___VARIANT_SAMPLE_DATA.csv
-rw-rw-r-- 1 jtarraga jtarraga 3.2M Feb 15 15:42 DATA___VARIANT_FILE___VARIANT_FILE_DATA.csv
-rw-rw-r-- 1 jtarraga jtarraga  13M Feb 15 15:42 DATA___SAMPLE___VARIANT_SAMPLE_DATA.csv
-rw-rw-r-- 1 jtarraga jtarraga  279 Feb 15 15:42 CONTROLLER___REGULATION___UNDEFINED.csv
-rw-rw-r-- 1 jtarraga jtarraga  40K Feb 15 15:42 CONTROLLER___CATALYSIS___PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga 102K Feb 15 15:42 CONTROLLED___CATALYSIS___REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga  25K Feb 15 15:42 COMPONENT_OF_PHYSICAL_ENTITY_COMPLEX___UNDEFINED___PHYSICAL_ENTITY_COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga  38K Feb 15 15:42 COMPONENT_OF_PHYSICAL_ENTITY_COMPLEX___SMALL_MOLECULE___PHYSICAL_ENTITY_COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga 4.9K Feb 15 15:42 COMPONENT_OF_PHYSICAL_ENTITY_COMPLEX___RNA___PHYSICAL_ENTITY_COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga 153K Feb 15 15:42 COMPONENT_OF_PHYSICAL_ENTITY_COMPLEX___PHYSICAL_ENTITY_COMPLEX___PHYSICAL_ENTITY_COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga  12K Feb 15 15:42 COMPONENT_OF_PHYSICAL_ENTITY_COMPLEX___DNA___PHYSICAL_ENTITY_COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga  30K Feb 15 15:42 CELLULAR_LOCATION___UNDEFINED___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga  74K Feb 15 15:42 CELLULAR_LOCATION___SMALL_MOLECULE___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 7.8K Feb 15 15:42 CELLULAR_LOCATION___RNA___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 404K Feb 15 15:42 CELLULAR_LOCATION___PROTEIN___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 258K Feb 15 15:42 CATALYSIS.csv
-rw-rw-r-- 1 jtarraga jtarraga   51 Feb 15 15:42 BREAKEND_MATE.csv
-rw-rw-r-- 1 jtarraga jtarraga 251M Feb 15 15:42 ANNOTATION___VARIANT___VARIANT_POPULATION_FREQUENCY.csv
-rw-rw-r-- 1 jtarraga jtarraga 247M Feb 15 15:42 ANNOTATION___VARIANT___VARIANT_CONSERVATION_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga 674M Feb 15 15:42 ANNOTATION___VARIANT___HGV.csv
-rw-rw-r-- 1 jtarraga jtarraga 577M Feb 15 15:42 ANNOTATION___VARIANT_CONSEQUENCE_TYPE___GENE.csv
-rw-rw-r-- 1 jtarraga jtarraga  96M Feb 15 15:42 ANNOTATION___TRANSCRIPT___TFBS.csv
-rw-rw-r-- 1 jtarraga jtarraga  12M Feb 15 15:42 ANNOTATION___TRANSCRIPT___FEATURE_ONTOLOGY_TERM_ANNOTATION.csv
-rw-rw-r-- 1 jtarraga jtarraga 452M Feb 15 15:42 ANNOTATION___PROTEIN_VARIANT_ANNOTATION___PROTEIN_SUBSTITUTION_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga  87M Feb 15 15:42 ANNOTATION___PROTEIN_VARIANT_ANNOTATION___PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga 2.4M Feb 15 15:42 ANNOTATION___PROTEIN___PROTEIN_KEYWORD.csv
-rw-rw-r-- 1 jtarraga jtarraga  94M Feb 15 15:42 ANNOTATION___GENE___XREF.csv
-rw-rw-r-- 1 jtarraga jtarraga  17M Feb 15 15:42 ANNOTATION___GENE___MIRNA_TARGET.csv
-rw-rw-r-- 1 jtarraga jtarraga  47M Feb 15 15:42 ANNOTATION___GENE___GENE_EXPRESSION.csv
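
The listing is long, and a number of the files are empty (size 0). A short summary can make it easier to review. The sketch below reports, for each CSV file, whether it is empty and whether its name contains two triple-underscore separators. Reading such names as TYPE___SOURCE___TARGET relationship files, and all other names as node files, is an assumption inferred from the listing above rather than documented behaviour. summarize_csv_dir is a hypothetical helper name.

```shell
# summarize_csv_dir <dir>
# Prints "<kind> <state> <name>" for each CSV file in <dir>.
# Assumption: names of the form TYPE___SOURCE___TARGET are relationship files,
# all other names are node files (inferred from the listing, not documented).
summarize_csv_dir() {
    local dir="$1" f name kind state
    for f in "$dir"/*.csv; do
        [ -e "$f" ] || continue          # the glob matched nothing
        name="$(basename "$f" .csv)"
        case "$name" in
            *___*___*) kind="relationship" ;;
            *)         kind="node" ;;
        esac
        if [ -s "$f" ]; then state="non-empty"; else state="empty"; fi
        echo "$kind $state $name"
    done
}
```

For instance, summarize_csv_dir ~/csv | grep ' empty ' lists only the empty files, which can help spot a data source that produced no output.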

Import

The CSV files created previously are loaded into the database by executing the import command:

$ ./build/bin/bionetdb-admin.sh import --input ~/csv --database neo4j

In addition to populating the database, the import command creates indexes on the main nodes to speed up subsequent queries and analyses.
