Mar 8, 2026
Analyzing a Healthcare Knowledge Graph with Cypher and Graph Data Science

How I explored a FAERS-style healthcare graph, moved from Cypher EDA to GDS workflows, and turned graph results into practical analytical insights.

Peter Mangoro

This assignment felt like a real analytics engagement: incomplete documentation, a complex healthcare graph, and the need to make method choices that I could justify.

Assignment Focus

I worked with a FAERS-inspired healthcare dataset restored from a Neo4j dump, then analyzed it in three passes:

  1. Cypher-based schema and data exploration
  2. Deeper analytical querying and aggregation
  3. GDS workflows for structural and similarity analysis

I ran the queries from Python, using the Neo4j driver and pandas, so that each experiment was quick to iterate on and reproducible.
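
A minimal sketch of what that loop can look like; it is an illustration, not the code from the assignment. `run_query` is a hypothetical helper name, and the only assumption about `session` is that it behaves like a Neo4j Python driver session: `run()` accepts a query plus a parameter dict and returns records that expose `.data()`.

```python
from typing import Any, Dict, List, Optional


def run_query(
    session: Any,
    query: str,
    parameters: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Run one Cypher query and return its rows as plain dicts.

    `session` is assumed to behave like a neo4j.Session: run() returns an
    iterable of records, and each record has a .data() method.
    """
    result = session.run(query, parameters or {})
    # Materialise the rows right away so they outlive the session.
    return [record.data() for record in result]
```

With the real driver, a session would come from `GraphDatabase.driver(uri, auth=(user, password)).session()`, and the returned list of dicts can be passed straight to `pandas.DataFrame(rows)` for inspection. Keeping the helper free of any pandas or driver imports also makes it easy to exercise with a stub session.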

What I Built

  • A practical EDA workflow for discovering labels, relationship patterns, and property quality
  • Analytical Cypher queries for non-trivial domain questions
  • GDS projections and algorithm runs aligned to assignment questions (instead of running algorithms blindly)

I documented not just the outputs but also why each query and algorithm choice matched the analytical goal.
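
To make the EDA step concrete, here is a hedged sketch of the kind of queries that cover labels, relationship patterns, and property quality. The constant and function names are hypothetical, and the label and property passed in would come from whatever the schema discovery step reveals. Because the helper interpolates the label and property name into the query text, it validates both against a conservative identifier pattern.

```python
import re

# Node counts per label.
LABEL_COUNTS = """
MATCH (n)
UNWIND labels(n) AS label
RETURN label, count(*) AS nodes
ORDER BY nodes DESC
"""

# Which label combinations are connected by which relationship types.
# This scans every relationship, so it can be slow on a large graph.
RELATIONSHIP_PATTERNS = """
MATCH (a)-[r]->(b)
RETURN labels(a) AS from_labels, type(r) AS rel_type,
       labels(b) AS to_labels, count(*) AS rels
ORDER BY rels DESC
"""

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def property_coverage_query(label: str, prop: str) -> str:
    """Build a query that counts how many nodes of `label` have `prop` set.

    count(n.prop) skips nulls, so `present / total` is the fill rate.
    """
    for name in (label, prop):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"MATCH (n:`{label}`) "
        f"RETURN count(n) AS total, count(n.`{prop}`) AS present"
    )
```

Running the coverage query for each label and property pair gives a quick picture of which properties are reliable enough to filter or weight on.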

Key Findings

  • Schema discovery first saved time later. Without that step, I would have built analyses on false assumptions.
  • Running Cypher before GDS led to better algorithm choices, because I understood the graph's semantics before picking an algorithm.
  • Projection design is the real GDS work: node sets, relationship directions, and weighting determine result quality.
  • Iterating through the Python driver made query refinement much faster than experimenting by hand in the browser alone.
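
The projection point is easiest to see when the three decisions (node sets, relationship direction, weighting) are written out explicitly. The sketch below builds the parameters for a native `gds.graph.project` call. The helper name is hypothetical, and the graph name, labels, relationship type, and weight property in the example are placeholders, not the dataset's actual schema.

```python
from typing import Any, Dict, List, Optional

PROJECT_QUERY = (
    "CALL gds.graph.project($graph_name, $node_labels, $relationships) "
    "YIELD graphName, nodeCount, relationshipCount"
)

VALID_ORIENTATIONS = {"NATURAL", "REVERSE", "UNDIRECTED"}


def projection_params(
    graph_name: str,
    node_labels: List[str],
    rel_type: str,
    orientation: str = "NATURAL",
    weight_property: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the parameter dict for PROJECT_QUERY.

    Every design decision is an explicit argument, so nothing about the
    projection is left to an unexamined default.
    """
    if orientation not in VALID_ORIENTATIONS:
        raise ValueError(f"unknown orientation: {orientation!r}")
    rel_config: Dict[str, Any] = {"orientation": orientation}
    if weight_property is not None:
        rel_config["properties"] = weight_property
    return {
        "graph_name": graph_name,
        "node_labels": list(node_labels),
        "relationships": {rel_type: rel_config},
    }


# Placeholder names, for illustration only.
example = projection_params(
    "drug-reaction",
    ["Drug", "Reaction"],
    "ASSOCIATED_WITH",
    orientation="UNDIRECTED",
    weight_property="report_count",
)
```

Passing `PROJECT_QUERY` and the resulting dict to a driver session creates the in-memory graph, and the yielded `nodeCount` and `relationshipCount` are a cheap sanity check against the counts from the Cypher EDA.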

Lessons Learned

This project taught me to treat graph analytics as a staged process: understand the graph, validate assumptions, then apply algorithms with purpose. I also learned how quickly “interesting but wrong” results can appear when projections are underspecified.
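
A toy illustration of how an underspecified projection can mislead, using plain Python in place of GDS. The edge list and node names are made up. The function mimics how a degree count depends on relationship orientation: with the natural direction only outgoing relationships count, so nodes that only receive relationships score zero even when they are heavily connected.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def degree(
    edges: Iterable[Tuple[str, str]], orientation: str = "NATURAL"
) -> Dict[str, int]:
    """Count degrees for directed (source, target) edges.

    NATURAL counts outgoing edges, REVERSE counts incoming edges, and
    UNDIRECTED counts both.
    """
    if orientation not in ("NATURAL", "REVERSE", "UNDIRECTED"):
        raise ValueError(f"unknown orientation: {orientation!r}")
    counts: Counter = Counter()
    for source, target in edges:
        # Register both endpoints so isolated-looking nodes show up as 0.
        counts[source] += 0
        counts[target] += 0
        if orientation in ("NATURAL", "UNDIRECTED"):
            counts[source] += 1
        if orientation in ("REVERSE", "UNDIRECTED"):
            counts[target] += 1
    return dict(counts)


# Made-up report-to-drug edges.
edges = [("case1", "drugA"), ("case2", "drugA"), ("case3", "drugB")]
natural = degree(edges, "NATURAL")
undirected = degree(edges, "UNDIRECTED")
```

Here `natural["drugA"]` is 0 while `undirected["drugA"]` is 2. Both are valid outputs, but only one of them answers the question "which drug appears in the most reports?", which is why the orientation has to be chosen deliberately.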

Skills I Gained

  • Cypher EDA for unfamiliar production-like graph datasets
  • GDS projection design and algorithm execution
  • Python-driven graph analysis workflow with Neo4j driver + pandas
  • Query/result interpretation with explicit methodological tradeoffs
