Graph Database Overview

Life Sciences Graph Protecting Pharmaceutical Supply Chain (PPSC)
Graph databases are built on, and named after, graph theory in discrete mathematics.

Nodes (vertices) are nouns, and Relationships (edges) are verbs.

Labels and Properties can be placed on both:

  • Nodes
    • Label = Lab and Pharma
    • Property = name, short, start date, and end date
  • Relationship
    • Label = SENDS_DATA
    • Property = start date and end date

All four components (Nodes, Relationships, Labels, and Properties) are equal, named components in a "Property Graph Model".

The ability to query relationships directly is the main advantage over relational and hierarchical databases.
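
To make the point about relationship queries concrete, here is a minimal sketch in Cypher, Neo4j's query language. It assumes the Lab, Pharma, and SENDS_DATA example from the list above has already been loaded with name, start date, and end date properties stored as name, start, and end. The date filter is an illustrative choice, not a requirement of the model.

```cypher
// Which labs are currently sending data to which pharmaceutical companies?
// The relationship itself is matched and filtered; no join table is involved.
MATCH (l:Lab)-[r:SENDS_DATA]->(p:Pharma)
WHERE r.start <= date() AND date() < r.end
RETURN l.name AS lab, p.name AS pharma, r.start AS since;
```

In a relational design the same question would typically need a join through a link table; here the relationship and its properties are part of the pattern.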

Neo4j Bloom application displaying the "Property Graph Model":

The figure above shows a lab sending data to a pharmaceutical company. It uses the Neo4j Bloom visualizer, which displays properties better than the Neo4j Browser.

Both Bloom and the Browser can be launched from the Neo4j Desktop application:

Primary Transaction

The models described in this blog series are based on the primary transaction of a patient getting treatment from a doctor.

  1. A Patient PRESENTS himself to a Doctor
  2. The Doctor TREATS the Patient
The figure below shows the model in the Neo4j browser:

Doctor orders lab test

We expand the primary transaction to include ordering a lab test:

  1. A Patient PRESENTS himself to a Doctor
  2. The Doctor ORDERS_TEST from a Lab
  3. The Lab returns the TEST_RESULT to the Doctor
  4. The Doctor TREATS the Patient based on the TEST_RESULT
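
The four steps above can be sketched in Cypher as follows. This is only an illustration: the name property values are placeholders, and modeling TEST_RESULT as a relationship from the Lab to the Doctor is an assumption based on the step list, not necessarily how the full model stores results.

```cypher
// Placeholder sample data (hypothetical names)
MERGE (pt:Patient {name: 'Patient 1'})
MERGE (dr:Doctor  {name: 'Doctor 1'})
MERGE (lab:Lab    {name: 'Lab 1'})
MERGE (pt)-[:PRESENTS]->(dr)       // step 1
MERGE (dr)-[:ORDERS_TEST]->(lab)   // step 2
MERGE (lab)-[:TEST_RESULT]->(dr)   // step 3
MERGE (dr)-[:TREATS]->(pt);        // step 4

// Follow the whole workflow from the patient back to the same patient
MATCH path = (pt:Patient)-[:PRESENTS]->(dr:Doctor)-[:ORDERS_TEST]->(:Lab)
             -[:TEST_RESULT]->(dr)-[:TREATS]->(pt)
RETURN path;
```

Run the two statements separately if your client does not accept more than one statement at a time.
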
The figure below visualizes the Graph model within Neo4j Bloom:

Cypher, APOC, Browser, and Bloom

Cypher

The graph for this article was created with Cypher commands (Cypher plays the role that SQL plays for a relational database) run through the Neo4j Browser, a query tool comparable to Toad, SQL Navigator, or pgAdmin.

A simple create Cypher statement:

// Intro to graphs
CREATE (l:Lab    {seq: 1, name: 'Lab 1',    short: 'l1', start: date(), end: date('9000-01-01')})
CREATE (p:Pharma {seq: 2, name: 'Pharma 1', short: 'p1', start: date(), end: date('9000-01-01')})
CREATE (l)-[r:SENDS_DATA {start: date(), end: date('9000-01-01')}]->(p);

The code example above uses the core Cypher CREATE statement. In the real world, you would use MERGE instead to avoid creating duplicates. If you want more information, check out Neo4j's Cypher manual. Cypher is also a major input to Graph Query Language (GQL), the standard query language for property graphs.
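
As a rough illustration of the MERGE approach, the same Lab and Pharma data could be written as shown here. Treating short as the identifying property is an assumption made for this sketch; the article does not say which property is the key.

```cypher
// MERGE matches on the identifying property and creates only what is missing.
// ON CREATE SET fills in the remaining properties the first time only.
MERGE (l:Lab {short: 'l1'})
  ON CREATE SET l.seq = 1, l.name = 'Lab 1', l.start = date(), l.end = date('9000-01-01')
MERGE (p:Pharma {short: 'p1'})
  ON CREATE SET p.seq = 2, p.name = 'Pharma 1', p.start = date(), p.end = date('9000-01-01')
MERGE (l)-[r:SENDS_DATA]->(p)
  ON CREATE SET r.start = date(), r.end = date('9000-01-01');
```

The start date is deliberately kept out of the MERGE pattern. Because date() changes every day, matching on it would create a new node each day the statement is run.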

Neo4j Links:

Cypher Reference Card: https://neo4j.com/docs/cypher-refcard/current/

Cypher Manual: https://neo4j.com/docs/cypher-manual/current/introduction/

Cypher Create: https://neo4j.com/docs/cypher-manual/current/clauses/create/

Cypher Merge: https://neo4j.com/docs/cypher-manual/current/clauses/merge/

Cypher Style Guide: https://neo4j.com/developer/cypher-style-guide/

Graph Query Language (GQL): https://neo4j.com/blog/gql-standard-query-language-property-graphs/

Awesome Procedures On Cypher (APOC)

APOC Name History: Apoc was the technician and driver on board the Nebuchadnezzar in the movie The Matrix. He was killed by Cypher.

APOC was also the name of the first bundled "A Package Of Component" for Neo4j in 2009.

APOC name description taken from the Neo4j website: https://neo4j.com/docs/labs/apoc/current/introduction/

Basically, if there is an APOC procedure or function for the job, use it instead of raw Cypher. You can experiment with raw Cypher, but we all end up in the same place: APOC.

To create Nodes and Relationships:

// Merge the two metadata nodes, then the relationship between them
CALL apoc.merge.node(['Patient'], {metadata: true, short: 'md', name: 'MetaData'}, {}) YIELD node AS patient
CALL apoc.merge.node(['Doctor'],  {metadata: true, short: 'md', name: 'MetaData'}, {}) YIELD node AS doctor
CALL apoc.merge.relationship(patient, 'PRESENTS', {metadata: true, short: 'pr', name: 'PRESENTS'}, {}, doctor) YIELD rel
RETURN *;

Like Cypher's MERGE, the APOC merge procedures will not create a second Node or Relationship when a matching one already exists in the Graph. You can also specify which properties are set only when a Node or Relationship is created. For more information, see the Creating Data chapter of the APOC documentation: https://neo4j.com/docs/labs/apoc/current/graph-updates/data-creation/
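
A small sketch of the create-only properties mentioned above. The Doctor label comes from this article, but the property names and values (short: 'dr1', name, start, lastSeen) are made up for illustration, and the argument order should be checked against the documentation for your APOC version.

```cypher
CALL apoc.merge.node(
  ['Doctor'],
  {short: 'dr1'},                       // identifying properties used for the match
  {name: 'Doctor 1', start: date()},    // applied only when the node is created
  {lastSeen: date()}                    // applied only when the node already existed
) YIELD node
RETURN node;
```

Running this a second time should match the existing node rather than create a new one, which is why the identifying map is kept small and stable.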

Neo4j Browser

At LDC, we use the Browser to create the MetaData (labels) and Bloom to create instances of data. This split has more to do with the limitations of Bloom, or of the author, in creating labels in Bloom. Once all the nodes, labels, properties, and relationships are defined in the Graph metadata, you can create data through the Bloom user interface instead of writing APOC Cypher statements in the Neo4j Browser.

Here are some examples:

Create 1 node that has all the labels on it to create the MetaData within the Graph:

CALL apoc.merge.node(['Company',
                      'Hospital',
                      'Doctor',
                      'Government',
                      'Non-Government Organizations',
                      'Patient',
                      'Citizen',
                      'Virus',
                      'Drug',
                      'Test',
                      'Event',
                      'Regulations',
                      'Exemption',
                      'Waiver',
                      'Vaccine',
                      'Medical Device',
                      'Good',
                      'Service',
                      'Supply Chain',
                      'System',
                      'Governance',
                      'Workflow',
                      'Step',
                      'Risk',
                      'Counter Party Exposure',
                      'Alert',
                      'Info',
                      'Supplier',
                      'Location',
                      'Ingredient'
                     ], {seq: 9999, metadata: true, short: 'md', name: 'MetaData', start: date(), end: date('9000-01-01')}, {}) YIELD node
RETURN *;

Example of creating a :CodeList and its :CodeItem nodes:

//*** CDISC LBTEST – Analyte Codelist
    WITH [
    {relProps: {seq: 1}, nodeProps: {key: 'LBTEST', critical: 'NA', short: 'alt',      name: 'Alanine Aminotransferase'}},
    {relProps: {seq: 2}, nodeProps: {key: 'LBTEST', critical: 'NA', short: 'ast',      name: 'Aspartate Aminotransferase'}},
    {relProps: {seq: 3}, nodeProps: {key: 'LBTEST', critical: 'NA', short: 'creati',   name: 'Creatinine'}},
    {relProps: {seq: 4}, nodeProps: {key: 'LBTEST', critical: 'NA', short: 'tbili',    name: 'Total Bilirubin'}},
    {relProps: {seq: 5}, nodeProps: {key: 'LBTEST', critical: 'NA', short: 'wbc',      name: 'White Blood Cells'}},
    {relProps: {seq: 6}, nodeProps: {key: 'LBTEST', critical: 'NA', short: 'sarscov2', name: 'SARS-CoV-2 PCR'}}
    ] AS tarList
    CALL apoc.merge.node(['CodeList'], {key: 'LBTEST', short: 'Analyte Codelist', name: 'Analyte Codelist'}, {}) YIELD node AS src
    UNWIND tarList AS tar
        CALL apoc.merge.node(['CodeItem'], tar.nodeProps, {}) YIELD node
        CALL apoc.merge.relationship(src, 'CONTAINS', tar.relProps, {}, node) YIELD rel
    RETURN *;
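
Once the code list is loaded, it can be read back in sequence order. A minimal sketch, assuming the statement above has been run as written:

```cypher
// List the analytes in the LBTEST code list, ordered by the seq on the CONTAINS relationship
MATCH (c:CodeList {key: 'LBTEST'})-[r:CONTAINS]->(i:CodeItem)
RETURN r.seq AS seq, i.short AS short, i.name AS name
ORDER BY seq;
```

Keeping seq on the relationship rather than on the :CodeItem lets the same item appear in a different position in another list.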

Neo4j Bloom

Once all the MetaData and example data have been added to the Graph, you can add records via Bloom.

From Bloom, you can right-click existing data to duplicate it.

We get another John Smith Patient:

Now we can change the name and other properties as needed:

Click on the pencil icon to go into edit mode to change the :Patient.name:

Click on the check box to approve the changes.

For the next update, click on the Relationship tab and see that none of the Relationships were copied.

It is important that you click the Nodes in order: first the "from" Node, then the "to" Node, then "Create Relationship", and finally the relationship you want to create.

After the relationship is created, click on the mat or border to clear the selected Nodes. Next, click the previous "to" Node, which now becomes the "from" Node, and then the previous "from" Node to create the return relationship, if one is modeled.

That is how you can add test data. There is a lot more you can do within Bloom.

You can read more about Bloom here: https://neo4j.com/docs/bloom-user-guide/current/

Initial Data Exchange Hub Model

The high-level design of the Data Exchange Hub will be built using the initial COVID-19 timeline. These use cases will show how a Hub can be used to help communicate about an emerging outbreak.

Initial COVID-19 Timeline for modeling

Date | Description
December 1, 2019 | The symptom onset date of the first patient identified was "Dec 1, 2019".
December 6, 2019 | Wuhan doctors were finding cases that indicated the virus was spreading from one human to another.
December 21, 2019 | Doctors notice a "cluster of pneumonia cases with an unknown cause."

Initial Timeline Use Cases

  • New Virus alert
  • Human to Human transfer of a new Virus alert
  • Cluster of new cases indicating community spread

Some Considerations

  • Redundant messages about information to avoid single points of failure
  • Direct communications between linked partners (e.g. Dr. to Hospital)
  • Subscribe/Publish model in the Hub(s) to create alerts for interested parties
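
The subscribe/publish consideration could be modeled in the graph itself. The sketch below is an assumption for illustration only: the Topic label, the SUBSCRIBES_TO relationship, and the name values are hypothetical and are not part of the model described in this article.

```cypher
// Hypothetical schema: (subscriber)-[:SUBSCRIBES_TO]->(:Topic)
MERGE (t:Topic {name: 'New Virus'})
MERGE (g:Government {name: 'Government 1'})
MERGE (p:Pharma {name: 'Pharma 1'})
MERGE (g)-[:SUBSCRIBES_TO]->(t)
MERGE (p)-[:SUBSCRIBES_TO]->(t);

// When an alert is published on a topic, look up who should be notified
MATCH (s)-[:SUBSCRIBES_TO]->(:Topic {name: 'New Virus'})
RETURN labels(s) AS subscriberLabels, s.name AS subscriber;
```

The subscriber pattern carries no label, so any kind of party (Government, Pharma, Lab, and so on) can subscribe without changing the query.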

Graph Visualization in Neo4j’s Bloom

An example protocol:

  • Dr. creates a new virus alert
    • Sends to the Hub
    • Sends to local Hospital
    • Sends to Government
  • Lab creates a new virus alert
    • Based on lab results, Lab knows about new virus
    • Lab creates new virus alert for Hub
    • Lab sends alert to Government
  • Local Hospital forwards alert
    • Sends to Regional Hospital
    • Sends to Government
  • Government sends to WHO
  • Hub sends new virus alert to
    • Other Governments
    • WHO
  • WHO sends new virus alert
    • Pharmaceuticals
    • Labs
    • Other Governments

Redundancy is built in to ensure the alert is communicated. This increases the chances that the message gets out into the ecosystem. Identifiers will be used to reconcile duplicate messages.
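
One possible way to reconcile duplicates by identifier is sketched below. The alertId property, the RECEIVED relationship, and its via property are hypothetical names chosen for this example; the Alert and Government labels come from the MetaData list earlier in the article.

```cypher
// Hypothetical: every copy of an alert carries the same alertId.
// Here the same alert reaches the Government by three different routes.
UNWIND [
  {alertId: 'nv-001', via: 'Doctor'},
  {alertId: 'nv-001', via: 'Lab'},
  {alertId: 'nv-001', via: 'Local Hospital'}
] AS msg
MERGE (a:Alert {alertId: msg.alertId})        // one Alert node per identifier
MERGE (g:Government {name: 'Government 1'})
MERGE (g)-[r:RECEIVED {via: msg.via}]->(a)    // one relationship per route taken
RETURN a.alertId AS alert, count(r) AS copiesReceived;
```

Because the alert is merged on its identifier, redundant copies collapse into one node while the routes they travelled remain visible as separate relationships.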

Regarding COVID-19, Dr. Li Wenliang urged other Chinese doctors to protect themselves on 30-Dec-19. Dr. Li Wenliang became a whistleblower trying to alert the world about the virus. The Wuhan police summoned Dr. Li Wenliang and admonished him for making false comments on the internet on 3-Jan-20. Dr. Li Wenliang died on 7-Feb-20. The redundancy suggested above should help an alert survive attempts at suppression, not only technical difficulties.


About the Author: Tracy Sanders

Over thirty (30) years of experience in consulting Fortune 500 companies in the pharmaceutical (18+ years), health care (7 years), and insurance industries (5 years).