Hierarchies and trees are a challenge in SQL.
This is an example of a very simple tree
Parent ID | ID | Name |
0001 | Earth | |
0001 | 0002 | EU |
0001 | 0003 | USA |
0002 | 0004 | UK |
0002 | 0005 | France |
0004 | 0006 | England |
0006 | 0007 | Yorkshire |
With our application the Parent ID and ID columns are both GUIDs. To help with readability I have used numbers in the examples.
This looks like
Earth
EU
France
UK
England
Yorkshire
USA
The table structure above will work very well in most cases. It's very easy to select the parent of an item and select the children of an item. You can easily add new items or delete items. You get problems when you need to find things like all the decedents of an item.
You can use CTE (common table expressions) but under heavy load and with lots of data it just does not work.
We looked at the nested model. This does not work with simple deletes and inserts. (http://www.developersdex.com/gurus/articles/112.asp )
The final approach we took was to add an incremental number to our table and a row path column. As follows :
Row bitint
RowPath varbinary(160)
When we generate the rowpath we pad it with zeros so each row number starts at the same place. We do not use delimiters we used fixed spaces.
This is a simple representation of how the table will look. This is simple as I am just padding the rowpath with 2 zeros (not 16).
Row | Parent ID | ID | Name | RowPath |
1 | 0001 | Earth | 01 | |
2 | 0001 | 0002 | EU | 0102 |
3 | 0001 | 0003 | USA | 0103 |
4 | 0002 | 0004 | UK | 010204 |
5 | 0002 | 0005 | France | 010205 |
6 | 0004 | 0006 | England | 01020406 |
7 | 0006 | 0007 | Yorkshire | 0102040607 |
This is an example of how the data looks in SQL with it padded with 16 zeros.
0x0000000000000795
0x00000000000007950000000000000796
0x0000000000000797
0x00000000000007970000000000000798
0x000000000000079700000000000007980000000000000799
0x00000000000007970000000000000798000000000000079A
0x00000000000007970000000000000798000000000000079B
To get the child items, then we need to run the following SQL :
@path is the path that we need to find the child items from
@pathplusone is our path plus one, see the function below for how to increment this.
Select Name from Table where RowPath > @path and RowPath < @pathplusone
We increment the path with the following SQL
CREATE FUNCTION [dbo].[IncrementRowPath]
(
@VarBinary VARBINARY(160) = 0x0000000000000000
)
RETURNS VARBINARY(160)
AS
BEGIN
DECLARE @Result VARBINARY(160)
-- If the data is the wrong size.
IF(@VarBinary IS NULL OR DATALENGTH(@VarBinary) % 8 != 0)
SET @Result = 0x0000000000000000
-- Increment the last 8 byte segment.
ELSE
BEGIN
DECLARE @LastSegment BIGINT
SET @LastSegment = CAST(SUBSTRING(@VarBinary, DATALENGTH(@VarBinary) - 7, 8) AS BIGINT)
SET @Result = SUBSTRING(@VarBinary, 1, DATALENGTH(@VarBinary) - 8) + CONVERT(VARBINARY(8), @LastSegment + 1)
END
RETURN @Result
END
I hope this helps you out with trees.
The only issue is that you need and index on rowpath, this index is going to be large, but in our testing and our live implementations it's not caused an issue.
This is really really great!!!! I had to stumble across something like the first exmaple in the past, it works, however, this is much much better.
ReplyDelete