Hierarchical data stored in a relational database such as PostgreSQL can be tricky to work with. An organization is an example of a hierarchical structure. For example, let’s say that we need to represent employees:
-
Davy Crocket
-
Cindy Cruz
- George Jefferson
-
Blinky Bill
- Gustavo Frederickson
- Joan Smith
- Debora Ferreiro
- Harry Thompson
- Linda Vernsky
-
Abigail Quispe
- Jessica Ericson
- Matthew Haywood
- Angel Jones
- Chris Mackenzie
-
Cindy Cruz
Let’s say that this data is stored in a PostgreSQL database as example_employees
in the public
schema:
id |
name |
parent_id |
---|---|---|
1 |
Davy Crocket | NULL |
2 |
Cindy Cruz | 1 |
3 |
Harry Thompson | 1 |
4 |
George Jefferson | 2 |
5 |
Blinky Bill | 2 |
6 |
Linda Vernsky | 1 |
7 |
Abigail Quispe | 1 |
8 |
Jessica Ericson | 7 |
9 |
Matthew Haywood | 7 |
10 |
Angel Jones | 7 |
11 |
Chris Mackenzie | 7 |
12 |
Gustavo Frederickson | 5 |
13 |
Joan Smith | 5 |
14 |
Debora Ferreira | 5 |
If you want this example_employees
table you can create it by using the following:
CREATE TABLE public.example_employees AS SELECT 1 AS id, 'Davy Crockett' AS name, NULL AS parent_id UNION SELECT 2 AS id, 'Cindy Cruz' AS name, 1 AS parent_id UNION SELECT 3 AS id, 'Harry Thompson' AS name, 1 AS parent_id UNION SELECT 4 AS id, 'George Jefferson' AS name, 2 AS parent_id UNION SELECT 5 AS id, 'Blinky Bill' AS name, 2 AS parent_id UNION SELECT 6 AS id, 'Linda Vernsky' AS name, 1 AS parent_id UNION SELECT 7 AS id, 'Abigail Quispe' AS name, 1 AS parent_id UNION SELECT 8 AS id, 'Jessica Ericson' AS name, 7 AS parent_id UNION SELECT 9 AS id, 'Matthew Haywood' AS name, 7 AS parent_id UNION SELECT 10 AS id, 'Angel Jones' AS name, 7 AS parent_id UNION SELECT 11 AS id, 'Chris Mackenzie' AS name, 7 AS parent_id UNION SELECT 12 AS id, 'Gustavo Frederickson' AS name, 5 AS parent_id UNION SELECT 13 AS id, 'Joan Smith' AS name, 5 AS parent_id UNION SELECT 14 AS id, 'Debora Ferreira' AS name, 5 AS parent_id;
Now that we have our database setup correctly, we can use a query to pull this data in a way that each employee’s chain of command is listed as well:
SELECT t.employee_id, t.name, t.chain_of_command FROM ( SELECT t1.id AS employee_id, t1.name, unnest(ARRAY[t5.id, t4.id, t3.id, t2.id, t1.id]) AS unnested_id, array_remove(ARRAY[t5.name, t4.name, t3.name, t2.name, t1.name], NULL) AS chain_of_command FROM public.example_employees AS t1 LEFT JOIN public.example_employees AS t2 ON t1.parent_id = t2.id LEFT JOIN public.example_employees AS t3 ON t2.parent_id = t3.id LEFT JOIN public.example_employees AS t4 ON t3.parent_id = t4.id LEFT JOIN public.example_employees AS t5 ON t4.parent_id = t5.id ) t WHERE t.employee_id = t.unnested_id;
Running the above query will result in the following output:
employee_id |
name |
chain_of_command |
---|---|---|
1 |
Davy Crocket | {"Davy Crocket"} |
2 |
Cindy Cruz | {"Davy Crocket","Cindy Cruz"} |
3 |
Harry Thompson | {"Davy Crocket","Harry Thompson"} |
4 |
George Jefferson | {"Davy Crocket","Cindy Cruz","George Jefferson"} |
5 |
Blinky Bill | {"Davy Crocket","Cindy Cruz","Blinky Bill"} |
6 |
Linda Vernsky | {"Davy Crocket","Linda Vernsky"} |
7 |
Abigail Quispe | {"Davy Crocket","Abigail Quispe"} |
8 |
Jessica Ericson | {"Davy Crocket","Abigail Quispe","Jessica Ericson"} |
9 |
Matthew Haywood | {"Davy Crocket","Abigail Quispe","Matthew Haywood"} |
10 |
Angel Jones | {"Davy Crocket","Abigail Quispe","Angel Jones"} |
11 |
Chris Mackenzie | {"Davy Crocket","Abigail Quispe","Chris Mackenzie"} |
12 |
Gustavo Frederickson | {"Davy Crocket","Cindy Cruz","Blinky Bill","Gustavo Frederickson"} |
13 |
Joan Smith | {"Davy Crocket","Cindy Cruz","Blinky Bill","Joan Smith"} |
14 |
Debora Ferreiro | {"Davy Crocket","Cindy Cruz","Blinky Bill","Debora Ferreiro"} |
Example Using Salesforce Campaigns
Let’s say that you are cloning Salesforce data into a Postgres database. Salesforce campaigns have a hierarchical structure, meaning that campaigns can be parent campaigns, grandparent campaigns, etc.
For instance, imagine that the campaign
table into which we are cloning the data is setup as follows:
id
parent_id
⋮
If you wanted to query the hierarchical data within this table (in the salesforce
schema) you could use a query like the following:
SELECT t.campaign_id, t.lineage FROM ( SELECT c1.id AS campaign_id, unnest(ARRAY[c1.id, c2.id, c3.id, c4.id, c5.id]) AS lineage_campaign_id, array_remove(ARRAY[c1.id, c2.id, c3.id, c4.id, c5.id], NULL) AS lineage FROM salesforce.campaign AS c1 LEFT JOIN salesforce.campaign AS c2 ON c1.parent_id = c2.id LEFT JOIN salesforce.campaign AS c3 ON c2.parent_id = c3.id LEFT JOIN salesforce.campaign AS c4 ON c3.parent_id = c4.id LEFT JOIN salesforce.campaign AS c5 ON c4.parent_id = c5.id ) t WHERE t.campaign_id = t.lineage_campaign_id;
Using the above query would result in a list of all of the campaign IDs in the first column and an array containing any possible lineage in the second column. The following is a screenshot of running the above query on our campaign
table:
Concluding Words
It may be that your database is setup differently, but the result will be the same as long as you modify your query to match the IDs correctly between the parent and child tables. If you need to allow for a chain_of_command
or lineage
of more than 5 levels, you can continue to add LEFT JOIN
s.
There is a better way using a recursive CTE to get this data which is outlined here. Have fun and as always, happy coding!!! š