About myself
My technical
skills
My favorites
My picture album
Shortcut keys
My code library
VB resources
SQLServer
resources
SQLServer books
Replication FAQ
Scripting
resources
ASP resources
Search my site
Sign my
guestbook
Contact
information
SQL Server Articles New
|
|
NEW!!! Subscribe to my
newsletter:
Want to keep in touch with the
latest in SQL Server world? Email vyaskn@hotmail.com with
'subscribe' in the subject line |
Working with hierarchical data in SQL Server databases
Note: Information & code samples from this article are tested on SQL Server 2005 RTM (Yukon) and found to be working. Will update the article in case of any compatibility issues. |
In this article, I will show you a simple example that traverses a hierarchy and displays the output in indented format. I will also provide you with links and pointers for more advanced and in-depth information on processing hierarchies.
Representing data in the form of a hierarchy or a tree is a common requirement. The most common example I can think of, is an organizational chart that contains all the employees in a given organization and each employee is linked to his/her manager by his/her manager's ID. Since managers are also employees, their details also get stored in the same employees table.
A few more examples:
- Implementation of a permissions hierarchy.
- Implementation of a product catalog for an online store.
All the above examples need data to be stored in the form of hierarchies or trees (also loosely referred to as parent child relationships). But SQL Server is a relational database, not a hierarchical database. So, we have to store the data in normalized, relational tables, but come up with programming techniques to process this data in a hierarchical manner.
Unfortunately, there is no built-in support for processing hierarchies and tree structures in SQL Server's T-SQL implementation. (Oracle's PL/SQL implementation has a CONNECT BY syntax that makes it easier to work with hierarchical data. Let's hope we will see something similar in the next version of SQL Server, Yukon).
C programmers use recursive programming techniques for traversing tree structures. The same can be implemented in T-SQL using recursive stored procedure calls. I will demonstrate this in the following example.
Consider the employee table of an organization, that stores all the employee records. Each employee is linked to his/her manager by a manger ID. This table can be represented using the following CREATE TABLE statement:
CREATE TABLE dbo.Emp
(
EmpID int PRIMARY KEY,
EmpName varchar(30),
MgrID int FOREIGN KEY REFERENCES Emp(EmpID)
)
GO
Notice that, EmpID is decalred as a primary key, and the MgrID column is declared as a foreign key constraint, that references the EmpID column of the same table, that is, a self referencing table. This is so, because all employees and managers are stored in the same table.
Since EmpID is declared as a primary key, by default it will be implemented as a unique clustered index. Now, let's create a non-clustered index on MgrID column, to improve the query performance:
CREATE NONCLUSTERED INDEX NC_NU_Emp_MgrID ON dbo.Emp(MgrID)
GO
Let's populate the employee table with some data:
INSERT dbo.Emp SELECT 1, 'President', NULL
INSERT dbo.Emp SELECT 2, 'Vice President', 1
INSERT dbo.Emp SELECT 3, 'CEO', 2
INSERT dbo.Emp SELECT 4, 'CTO', 2
INSERT dbo.Emp SELECT 5, 'Group Project Manager', 4
INSERT dbo.Emp SELECT 6, 'Project Manager 1', 5
INSERT dbo.Emp SELECT 7, 'Project Manager 2', 5
INSERT dbo.Emp SELECT 8, 'Team Leader 1', 6
INSERT dbo.Emp SELECT 9, 'Software Engineer 1', 8
INSERT dbo.Emp SELECT 10, 'Software Engineer 2', 8
INSERT dbo.Emp SELECT 11, 'Test Lead 1', 6
INSERT dbo.Emp SELECT 12, 'Tester 1', 11
INSERT dbo.Emp SELECT 13, 'Tester 2', 11
INSERT dbo.Emp SELECT 14, 'Team Leader 2', 7
INSERT dbo.Emp SELECT 15, 'Software Engineer 3', 14
INSERT dbo.Emp SELECT 16, 'Software Engineer 4', 14
INSERT dbo.Emp SELECT 17, 'Test Lead 2', 7
INSERT dbo.Emp SELECT 18, 'Tester 3', 17
INSERT dbo.Emp SELECT 19, 'Tester 4', 17
INSERT dbo.Emp SELECT 20, 'Tester 5', 17
GO
Notice that the MgrID of President is NULL, that means, he is the super boss and has nobody managing him. As you can see, rest of the employees are linked to their respective managers using the MgrID column.
Now, let's create a stored procedure, that traverses this employee hierarchy recursively and displays the employees in the form of an indented tree structure. The following stored procedure has an input parameter named Root, that takes the ID of the employee (equivalent to the ID of a node in a tree) and displays all the employees managed by him and his sub-ordinates.
CREATE PROC dbo.ShowHierarchy
(
@Root int
)
AS
BEGIN
SET NOCOUNT ON
DECLARE @EmpID int, @EmpName varchar(30)
SET @EmpName = (SELECT EmpName FROM dbo.Emp WHERE EmpID = @Root)
PRINT REPLICATE('-', @@NESTLEVEL * 4) + @EmpName
SET @EmpID = (SELECT MIN(EmpID) FROM dbo.Emp WHERE MgrID = @Root)
WHILE @EmpID IS NOT NULL
BEGIN
EXEC dbo.ShowHierarchy @EmpID
SET @EmpID = (SELECT MIN(EmpID) FROM dbo.Emp WHERE MgrID = @Root AND EmpID > @EmpID)
END
END
GO
While creating the above stored procedure, you will receive the following warning:
Cannot add rows to sysdepends for the current stored procedure because it depends on the missing object 'ShowHierarchy'. The stored procedure will still be created.
Nothing to worry about. It's a just a warning, as SQL Server is a little confused about the recursive stored procedures.
As you can see, the above stored procedure calls itself recursively. You should be aware of a limitation imposed by SQL Server though. A stored procedure can nest itself upto a maximum of 32 levels. If you exceed this limit, you will receive the following error:
Server: Msg 217, Level 16, State 1, Procedure sdss, Line 1
Maximum stored procedure nesting level exceeded (limit 32).
Now let's run this procedure by passing the IDs of different employees and look at the output:
--Complete hierarchy
EXEC dbo.ShowHierarchy 1
GO
---President
------Vice President
---------CEO
---------CTO
------------Group Project Manager
---------------Project Manager 1
------------------Team Leader 1
---------------------Software Engineer 1
---------------------Software Engineer 2
------------------Test Lead 1
---------------------Tester 1
---------------------Tester 2
---------------Project Manager 2
------------------Team Leader 2
---------------------Software Engineer 3
---------------------Software Engineer 4
------------------Test Lead 2
---------------------Tester 3
---------------------Tester 4
---------------------Tester 5
--From Group Project Manager onwards
EXEC dbo.ShowHierarchy 5
GO
---Group Project Manager
------Project Manager 1
---------Team Leader 1
------------Software Engineer 1
------------Software Engineer 2
---------Test Lead 1
------------Tester 1
------------Tester 2
------Project Manager 2
---------Team Leader 2
------------Software Engineer 3
------------Software Engineer 4
---------Test Lead 2
------------Tester 3
------------Tester 4
------------Tester 5
--From Project Manager 1 onwards
EXEC dbo.ShowHierarchy 6
GO
---Project Manager 1
------Team Leader 1
---------Software Engineer 1
---------Software Engineer 2
------Test Lead 1
---------Tester 1
---------Tester 2
The above is just a simple example of traversing hierarchies. There's more to hierarchies than just traversing, that is, adding to, deleting from and modifying hierarchies. The following links deal with hierarchies in greater detail and propose different methodologies for managing hierarchies efficiently.
Read the topic titled Expanding Hierarchies in SQL Server Books Online.
|
Read the following article from Microsoft Knowledgebase:
HOW TO: Show Expanding Hierarchies by Using SQL Server (Q248915)
|
Joe Celko proposed the "Nested set model" for representing trees in SQL. He is one of those people, involved in the ANSI SQL standard. Apart from books, he wrote about this nested set model in magazine columns and newsgroups as well. Here are some links that contain good explanation and code samples:
Trees in SQL (From Intelligent Enterprise Magazine)
A Look at SQL Trees (From DBMS magazine, March 1996)
SQL Lessons (From DBMS magazine, April 1996)
Nontraditional Databases (From DBMS magazine, May 1996)
|
Read the article titled Maintaining Hierarchies
by Itzik Ben-Gan from the July 2000 issue of SQL Server Magazine.
|
| Joe Celko's SQL for Smarties: Advanced SQL Programming
Chapter 28: Adjacency List Model of Trees in SQL
Chapter 29: Nested Set Model of Trees in SQL
Click here for more info or to purchase this book from Amazon.com or Amazon.co.uk or Amazon.ca |
|
|
|
| Advanced Transact-SQL for SQL Server 2000
Chapter 16: Expanding Hierarchies
Click here for more info or to purchase this book from Amazon.com or Amazon.co.uk or Amazon.ca |
|
|
RAC
RAC is a native utility for MS SQL Server, designed exclusively for data manipulation. RAC uses a common framework that can be manipulated to summarize data, perform pivoting operations including crosstabulations, character operations including string replacement, splitting, concatenation and processing hierarcharical data.
|
|
|