Hierarchical Data Model with MongoDB

Bipin Thite
3 min read · Sep 6, 2020


Are you working on a data model that has hierarchical or nested data relationships?

Are you looking for a way to store and manage them efficiently?

Do you need to store an organization’s data with a flexible number of hierarchy levels?


But, why am I talking about it?

Some time back, I was working on a solution in which the system had to manage an organization’s hierarchical data: legal entities, business units (BUs), sub-BUs, departments, teams, and so on. The organization could be a huge global corporation with a complex structure or a small startup with a flat one. Naturally, the system also had to support fast retrieval and easy changes to the hierarchy.

What does MongoDB offer?

MongoDB supports several ways to model such a hierarchy as a tree structure. The MongoDB manual describes five of them:

  1. Model Tree Structures with Parent References: each child node stores a reference to its parent node
  2. Model Tree Structures with Child References: each parent node stores references to its child nodes
  3. Model Tree Structures with an Array of Ancestors: each node stores a reference to its parent node plus an array of all its ancestors
  4. Model Tree Structures with Materialized Paths: each node stores a string containing the IDs of its ancestors, i.e. the path to the node
  5. Model Tree Structures with Nested Sets: each node stores left and right stop numbers assigned during a round-trip traversal of the tree (the Nested Set pattern)
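As a rough illustration of how the five patterns differ, the sketch below writes out one document per pattern for the hierarchy used later in this article (acme_inc > product_abc > technology > dev, ops, qa). The field names follow the examples in the MongoDB manual. The `left` and `right` values are an assumption based on a depth-first numbering of just those six nodes.

```javascript
// One example document per pattern. Plain objects only, no database needed.
const patterns = {
  // 1. The child points at its parent.
  parentReference: { _id: "dev", parent: "technology" },
  // 2. The parent lists its children.
  childReferences: { _id: "technology", children: ["dev", "ops", "qa"] },
  // 3. The node keeps its parent and every ancestor, root first.
  arrayOfAncestors: {
    _id: "dev",
    ancestors: ["acme_inc", "product_abc", "technology"],
    parent: "technology",
  },
  // 4. The ancestors are flattened into one delimited string.
  materializedPath: { _id: "dev", path: ",acme_inc,product_abc,technology," },
  // 5. Stop numbers from a round-trip walk of the six-node tree
  //    (assumed numbering: acme_inc 1-12, product_abc 2-11, technology 3-10).
  nestedSet: { _id: "dev", parent: "technology", left: 4, right: 5 },
};

for (const [name, doc] of Object.entries(patterns)) {
  console.log(name, JSON.stringify(doc));
}
```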

In this article, I will focus on data modeling using Materialized Paths.

Model Tree Structures with Materialized Paths

Before getting to the materialized paths approach, I will describe how we can model a collection to store organization data. An organization can contain many kinds of entities: global groups, regional legal entities, departments, business units, and smaller teams. We can think of all of these as teams because, in the end, every entity is a team of one or more members focused on specific goals. For that reason, we can model the organization data as a single collection of teams.

If we treat all these organization entities as teams, how do we manage their hierarchy? This is where tree structures with materialized paths help. The idea is that every node (i.e., every entity) stores a field containing the full path of the hierarchy, starting from the root node and covering all of its ancestors. The path string uses a delimiter such as a comma.
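To make the path format concrete, here is a minimal sketch of two helpers that convert between a list of ancestor names and a comma-delimited path string. `buildPath` and `parsePath` are hypothetical names introduced for illustration. The sketch assumes the root node stores `null` as its path and that names never contain the delimiter.

```javascript
const DELIMITER = ",";

// Build a materialized path from ancestor names ordered root first.
// A root node has no ancestors, so its path is null.
function buildPath(ancestors) {
  if (ancestors.length === 0) return null;
  return DELIMITER + ancestors.join(DELIMITER) + DELIMITER;
}

// Recover the ancestor names (root first) from a stored path.
function parsePath(path) {
  if (path === null) return [];
  return path.split(DELIMITER).filter((part) => part !== "");
}

console.log(buildPath(["acme_inc", "product_abc"]));
// prints: ,acme_inc,product_abc,
console.log(parsePath(",acme_inc,product_abc,technology,"));
// prints: [ 'acme_inc', 'product_abc', 'technology' ]
```

The leading and trailing delimiters matter: they let a query match a whole name such as `,technology,` without also matching a longer name that merely starts with the same characters.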

For example, consider a fictitious organization Acme Inc.

[Figure: A fictitious organization hierarchy]

Let’s add some hierarchy data for it, keeping things simple.

db.teams.insertMany([
  { name: "acme_inc", path: null },
  { name: "product_abc", path: ",acme_inc," },
  { name: "technology", path: ",acme_inc,product_abc," },
  { name: "dev", path: ",acme_inc,product_abc,technology," },
  { name: "ops", path: ",acme_inc,product_abc,technology," },
  { name: "qa", path: ",acme_inc,product_abc,technology," }
])

We could add more fields, such as a display name, entity type, or team owner ID, but we will leave them aside for now.
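To check that these flat documents really encode the intended tree, the following sketch rebuilds an indented outline from the `name` and `path` fields alone. It works on an in-memory copy of the six documents rather than a live collection, and `depthOf` and `fullPath` are hypothetical helper names.

```javascript
// In-memory copy of the documents inserted above.
const teams = [
  { name: "acme_inc", path: null },
  { name: "product_abc", path: ",acme_inc," },
  { name: "technology", path: ",acme_inc,product_abc," },
  { name: "dev", path: ",acme_inc,product_abc,technology," },
  { name: "ops", path: ",acme_inc,product_abc,technology," },
  { name: "qa", path: ",acme_inc,product_abc,technology," },
];

// Depth is the number of ancestors encoded in the path (0 for the root).
function depthOf(team) {
  if (team.path === null) return 0;
  return team.path.split(",").filter((part) => part !== "").length;
}

// A node's own full path: its ancestors' path plus its own name.
function fullPath(team) {
  return (team.path === null ? "," : team.path) + team.name + ",";
}

// Sorting by full path places every parent directly before its subtree.
const lines = [...teams]
  .sort((a, b) => (fullPath(a) < fullPath(b) ? -1 : fullPath(a) > fullPath(b) ? 1 : 0))
  .map((team) => "  ".repeat(depthOf(team)) + team.name);

console.log(lines.join("\n"));
// prints:
// acme_inc
//   product_abc
//     technology
//       dev
//       ops
//       qa
```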

Now, we have different ways to work with the hierarchical data.

Add a new entity to the existing hierarchy

Insert a Marketing department under the business unit Product ABC.

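// The parent's path already ends with a comma, so append only the parent's name and a trailing comma.
// A root parent stores null as its path, hence the fallback to ",".
const parent = db.teams.findOne({ name: "product_abc" })
const childPath = (parent.path || ",") + parent.name + ","
db.teams.insertOne({ name: "marketing", path: childPath })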
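One detail that is easy to get wrong when building the child path is the delimiter: the parent's stored path already ends with a comma, so prepending another one produces a double comma, and a root parent has no path at all. A small helper can hide both cases. This is only a sketch, and `childPathFor` is a hypothetical name, not part of MongoDB.

```javascript
// Compute the path a new child should store, given its parent document.
function childPathFor(parent) {
  // A root parent stores null, so start from a single delimiter instead.
  const prefix = parent.path === null ? "," : parent.path;
  return prefix + parent.name + ",";
}

const root = { name: "acme_inc", path: null };
const productAbc = { name: "product_abc", path: ",acme_inc," };

console.log(childPathFor(root));
// prints: ,acme_inc,
console.log(childPathFor(productAbc));
// prints: ,acme_inc,product_abc,
```

In the shell, the result could be passed straight to the insert, for example `db.teams.insertOne({ name: "marketing", path: childPathFor(db.teams.findOne({ name: "product_abc" })) })`.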

Find all descendants of the top-most entity

For retrieval, we can query the path field with a regular expression anchored at the start of the string.

db.teams.find( { path: /^,acme_inc,/ } )
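The same idea extends to any node, not just the root. The sketch below builds the anchored pattern for an arbitrary team and tries it against an in-memory copy of the example documents. `descendantsPattern` and `escapeRegExp` are hypothetical helpers. The escaping is a precaution for names that might contain regular-expression metacharacters, which the example names do not.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Every descendant's path starts with the team's own full path.
function descendantsPattern(team) {
  const ownPath = (team.path === null ? "," : team.path) + team.name + ",";
  return new RegExp("^" + escapeRegExp(ownPath));
}

const teams = [
  { name: "acme_inc", path: null },
  { name: "product_abc", path: ",acme_inc," },
  { name: "technology", path: ",acme_inc,product_abc," },
  { name: "dev", path: ",acme_inc,product_abc,technology," },
  { name: "ops", path: ",acme_inc,product_abc,technology," },
  { name: "qa", path: ",acme_inc,product_abc,technology," },
];

const technology = teams.find((team) => team.name === "technology");
const pattern = descendantsPattern(technology);

// In-memory stand-in for db.teams.find({ path: pattern }).
const descendants = teams
  .filter((team) => team.path !== null && pattern.test(team.path))
  .map((team) => team.name);

console.log(descendants);
// prints: [ 'dev', 'ops', 'qa' ]
```

Against a real collection, the resulting `RegExp` can be used directly as the query value, as in `db.teams.find({ path: pattern })`. To fetch only the direct children, an exact match on the parent's full path is enough, with no regular expression needed.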

Create an index to improve search performance

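Let’s create an index on the path field to speed up retrieval. Because the regular expression above is case-sensitive and anchored with ^ (a prefix expression), MongoDB can use this index efficiently.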

db.teams.createIndex( { path: 1 } )

Final Words

MongoDB offers different ways to model hierarchical data. No single approach is the best one; the right choice depends on factors such as ease of use, query performance, and ease of maintenance. Study them carefully and decide which one suits your use case.

