Hierarchical data stored in a relational database such as PostgreSQL can be tricky to work with. An organization is an example of a hierarchical structure. For example, let’s say that we need to represent employees:
-
Davy Crocket
-
Cindy Cruz
- George Jefferson
-
Blinky Bill
- Gustavo Frederickson
- Joan Smith
- Debora Ferreiro
- Harry Thompson
- Linda Vernsky
-
Abigail Quispe
- Jessica Ericson
- Matthew Haywood
- Angel Jones
- Chris Mackenzie
-
Cindy Cruz
Let’s say that this data is stored in a PostgreSQL database as example_employees
in the public
schema:
id |
name |
parent_id |
---|---|---|
1 |
Davy Crocket | NULL |
2 |
Cindy Cruz | 1 |
3 |
Harry Thompson | 1 |
4 |
George Jefferson | 2 |
5 |
Blinky Bill | 2 |
6 |
Linda Vernsky | 1 |
7 |
Abigail Quispe | 1 |
8 |
Jessica Ericson | 7 |
9 |
Matthew Haywood | 7 |
10 |
Angel Jones | 7 |
11 |
Chris Mackenzie | 7 |
12 |
Gustavo Frederickson | 5 |
13 |
Joan Smith | 5 |
14 |
Debora Ferreira | 5 |
If you want this example_employees
table you can create it by using the following:
CREATE TABLE public.example_employees AS
SELECT 1 AS id, 'Davy Crockett' AS name, NULL AS parent_id
UNION SELECT 2 AS id, 'Cindy Cruz' AS name, 1 AS parent_id
UNION SELECT 3 AS id, 'Harry Thompson' AS name, 1 AS parent_id
UNION SELECT 4 AS id, 'George Jefferson' AS name, 2 AS parent_id
UNION SELECT 5 AS id, 'Blinky Bill' AS name, 2 AS parent_id
UNION SELECT 6 AS id, 'Linda Vernsky' AS name, 1 AS parent_id
UNION SELECT 7 AS id, 'Abigail Quispe' AS name, 1 AS parent_id
UNION SELECT 8 AS id, 'Jessica Ericson' AS name, 7 AS parent_id
UNION SELECT 9 AS id, 'Matthew Haywood' AS name, 7 AS parent_id
UNION SELECT 10 AS id, 'Angel Jones' AS name, 7 AS parent_id
UNION SELECT 11 AS id, 'Chris Mackenzie' AS name, 7 AS parent_id
UNION SELECT 12 AS id, 'Gustavo Frederickson' AS name, 5 AS parent_id
UNION SELECT 13 AS id, 'Joan Smith' AS name, 5 AS parent_id
UNION SELECT 14 AS id, 'Debora Ferreira' AS name, 5 AS parent_id;
Now that we have our database setup correctly, we can use a query to pull this data in a way that each employee’s chain of command is listed as well:
SELECT
t.employee_id,
t.name,
t.chain_of_command
FROM (
SELECT
t1.id AS employee_id,
t1.name,
unnest(ARRAY[t5.id, t4.id, t3.id, t2.id, t1.id]) AS unnested_id,
array_remove(ARRAY[t5.name, t4.name, t3.name, t2.name, t1.name], NULL) AS chain_of_command
FROM public.example_employees AS t1
LEFT JOIN public.example_employees AS t2
ON t1.parent_id = t2.id
LEFT JOIN public.example_employees AS t3
ON t2.parent_id = t3.id
LEFT JOIN public.example_employees AS t4
ON t3.parent_id = t4.id
LEFT JOIN public.example_employees AS t5
ON t4.parent_id = t5.id
) t
WHERE t.employee_id = t.unnested_id;
Running the above query will result in the following output:
employee_id |
name |
chain_of_command |
---|---|---|
1 |
Davy Crocket | {"Davy Crocket"} |
2 |
Cindy Cruz | {"Davy Crocket","Cindy Cruz"} |
3 |
Harry Thompson | {"Davy Crocket","Harry Thompson"} |
4 |
George Jefferson | {"Davy Crocket","Cindy Cruz","George Jefferson"} |
5 |
Blinky Bill | {"Davy Crocket","Cindy Cruz","Blinky Bill"} |
6 |
Linda Vernsky | {"Davy Crocket","Linda Vernsky"} |
7 |
Abigail Quispe | {"Davy Crocket","Abigail Quispe"} |
8 |
Jessica Ericson | {"Davy Crocket","Abigail Quispe","Jessica Ericson"} |
9 |
Matthew Haywood | {"Davy Crocket","Abigail Quispe","Matthew Haywood"} |
10 |
Angel Jones | {"Davy Crocket","Abigail Quispe","Angel Jones"} |
11 |
Chris Mackenzie | {"Davy Crocket","Abigail Quispe","Chris Mackenzie"} |
12 |
Gustavo Frederickson | {"Davy Crocket","Cindy Cruz","Blinky Bill","Gustavo Frederickson"} |
13 |
Joan Smith | {"Davy Crocket","Cindy Cruz","Blinky Bill","Joan Smith"} |
14 |
Debora Ferreiro | {"Davy Crocket","Cindy Cruz","Blinky Bill","Debora Ferreiro"} |
Example Using Salesforce Campaigns
Let’s say that you are cloning Salesforce data into a Postgres database. Salesforce campaigns have a hierarchical structure, meaning that campaigns can be parent campaigns, grandparent campaigns, etc.
For instance, imagine that the campaign
table into which we are cloning the data is setup as follows:
id
parent_id
⋮
If you wanted to query the hierarchical data within this table (in the salesforce
schema) you could use a query like the following:
SELECT
t.campaign_id,
t.lineage
FROM (
SELECT
c1.id AS campaign_id,
unnest(ARRAY[c1.id, c2.id, c3.id, c4.id, c5.id]) AS lineage_campaign_id,
array_remove(ARRAY[c1.id, c2.id, c3.id, c4.id, c5.id], NULL) AS lineage
FROM salesforce.campaign AS c1
LEFT JOIN salesforce.campaign AS c2
ON c1.parent_id = c2.id
LEFT JOIN salesforce.campaign AS c3
ON c2.parent_id = c3.id
LEFT JOIN salesforce.campaign AS c4
ON c3.parent_id = c4.id
LEFT JOIN salesforce.campaign AS c5
ON c4.parent_id = c5.id
) t
WHERE t.campaign_id = t.lineage_campaign_id;
Using the above query would result in a list of all of the campaign IDs in the first column and an array containing any possible lineage in the second column. The following is a screenshot of running the above query on our campaign
table:
Concluding Words
It may be that your database is setup differently, but the result will be the same as long as you modify your query to match the IDs correctly between the parent and child tables. If you need to allow for a chain_of_command
or lineage
of more than 5 levels, you can continue to add LEFT JOIN
s.
There is a better way using a recursive CTE to get this data which is outlined here. Have fun and as always, happy coding!!! š