I recently wrote a post about how to query hierarchical data without recursion in PostgreSQL. Now it is time to talk about how to do it with recursion. 😎
CTEs (Common Table Expressions), AKA WITH
clauses, provide us the ability to not only use a query as if it were a table, they also give us the ability to recursively pull data! This is good because we don’t always know how many levels are in hierarchical data. With that said, let’s think about the following example employees table:
id | full_name | manager_id |
---|---|---|
1 |
Davy Crocket | NULL |
2 |
Cindy Cruz | 1 |
3 |
Harry Thompson | 1 |
4 |
George Jefferson | 2 |
5 |
Blinky Bill | 2 |
6 |
Linda Vernsky | 1 |
7 |
Abigail Quispe | 1 |
8 |
Jessica Ericson | 7 |
9 |
Matthew Haywood | 7 |
10 |
Angel Jones | 7 |
11 |
Chris Mackenzie | 7 |
12 |
Gustavo Frederickson | 5 |
13 |
Joan Smith | 5 |
14 |
Debora Ferreira | 5 |
Chain of Command Query
As you can see, we can match up the manager_id
to each manager’s id
to identify the manager-subordinate relationship. Since we are already using CTEs we are not only going to use them for recursion, but we are also going to use them to represent our employees table. The following SQL can be used to indicate the chain of command for each employee:
As I said before, since I am using a CTE here there is no need to create a table and add data to it. You can simply run the above query and you should get the following results:
id | full_name | chain_of_command_names |
---|---|---|
1 |
Davy Crockett | {"Davy Crockett"} |
2 |
Cindy Cruz | {"Davy Crockett","Cindy Cruz"} |
3 |
Harry Thompson | {"Davy Crockett","Harry Thompson"} |
4 |
George Jefferson | {"Davy Crockett","Cindy Cruz","George Jefferson"} |
5 |
Blinky Bill | {"Davy Crockett","Cindy Cruz","Blinky Bill"} |
6 |
Linda Vernsky | {"Davy Crockett","Linda Vernsky"} |
7 |
Abigail Quispe | {"Davy Crockett","Abigail Quispe"} |
8 |
Jessica Ericson | {"Davy Crockett","Abigail Quispe","Jessica Ericson"} |
9 |
Matthew Haywood | {"Davy Crockett","Abigail Quispe","Matthew Haywood"} |
10 |
Angel Jones | {"Davy Crockett","Abigail Quispe","Angel Jones"} |
11 |
Chris Mackenzie | {"Davy Crockett","Abigail Quispe","Chris Mackenzie"} |
12 |
Gustavo Frederickson | {"Davy Crockett","Cindy Cruz","Blinky Bill","Gustavo Frederickson"} |
13 |
Joan Smith | {"Davy Crockett","Cindy Cruz","Blinky Bill","Joan Smith"} |
14 |
Debora Ferreira | {"Davy Crockett","Cindy Cruz","Blinky Bill","Debora Ferreira"} |
Subordinates Query
What if we want to get an array containing each employee’s subordinates if he/she has any?
We can actually modify the CTE to indicate what level each employee is on. By level I mean to say that the leader of the organization has a level of 1, his/her subordinates has a level of 2, etc. We will also need to have an array of the ids for the employees in the chain of command. Finally, using the subordinates
CTE we will need to do a LEFT JOIN
to join subordinates
to itself to see if one employee id
is found in the chain of command id
list of another employee (who would thusly be the subordinate).
Perhaps looking at the actual SQL will make this a bit easier to understand:
Now we have the same list of all of the employees with their respective chain of command, but we also have their subordinates listed. Again, since we have used CTEs for all of the data you should be able to simply run the above query and see the following results:
id | full_name | chain_of_command_names | subordinates |
---|---|---|---|
1 |
Davy Crockett | {"Davy Crockett"} |
{"Abigail Quispe","Linda Vernsky","Harry Thompson","Blinky Bill","Cindy Cruz","George Jefferson","Matthew Haywood","Jessica Ericson","Angel Jones","Chris Mackenzie","Debora Ferreira","Gustavo Frederickson","Joan Smith"} |
2 |
Cindy Cruz | {"Davy Crockett","Cindy Cruz"} |
{"Joan Smith","Blinky Bill","George Jefferson","Debora Ferreira","Gustavo Frederickson"} |
3 |
Harry Thompson | {"Davy Crockett","Harry Thompson"} |
NULL |
4 |
George Jefferson | {"Davy Crockett","Cindy Cruz","George Jefferson"} |
NULL |
5 |
Blinky Bill | {"Davy Crockett","Cindy Cruz","Blinky Bill"} |
{"Debora Ferreira","Gustavo Frederickson","Joan Smith"} |
6 |
Linda Vernsky | {"Davy Crockett","Linda Vernsky"} |
NULL |
7 |
Abigail Quispe | {"Davy Crockett","Abigail Quispe"} |
{"Matthew Haywood","Jessica Ericson","Angel Jones","Chris Mackenzie"} |
8 |
Jessica Ericson | {"Davy Crockett","Abigail Quispe","Jessica Ericson"} |
NULL |
9 |
Matthew Haywood | {"Davy Crockett","Abigail Quispe","Matthew Haywood"} |
NULL |
10 |
Angel Jones | {"Davy Crockett","Abigail Quispe","Angel Jones"} |
NULL |
11 |
Chris Mackenzie | {"Davy Crockett","Abigail Quispe","Chris Mackenzie"} |
NULL |
12 |
Gustavo Frederickson | {"Davy Crockett","Cindy Cruz","Blinky Bill","Gustavo Frederickson"} |
NULL |
13 |
Joan Smith | {"Davy Crockett","Cindy Cruz","Blinky Bill","Joan Smith"} |
NULL |
14 |
Debora Ferreira | {"Davy Crockett","Cindy Cruz","Blinky Bill","Debora Ferreira"} |
NULL |
Final Words
Using recursive CTEs is a huge improvement over my previous solution which simply involved doing multiple LEFT JOIN
s and hoping that the number of levels would never exceed the number of LEFT JOIN
s. 😆
There are plenty of applications for recursive CTEs but I figured this is one of the most common uses. Let me know what you think and as always, happy coding! 😎