Monday, September 01, 2008

Converting Delimited String To Separate Values In SQL

Ever need to split a comma delimited string for use in SQL? If so, this post will go through a simple Microsoft SQL Server 2005 example of converting delimited text to a SQL table variable.

CREATE FUNCTION [dbo].[split] (@csv nvarchar(max), @delim varchar(1))
RETURNS @entries TABLE (entry nvarchar(100))
AS
BEGIN
DECLARE @commaindex int
SELECT @commaindex = CHARINDEX(@delim, @csv)

IF @commaindex > 0
BEGIN
INSERT INTO @entries
-- insert left side
SELECT LTrim(RTrim(LEFT(@csv, @commaindex-1)))
-- pass right side recursively
UNION ALL
SELECT entry
FROM dbo.split(RIGHT(@csv, LEN(@csv) - @commaindex), @delim)
END
ELSE
BEGIN
INSERT INTO @entries
SELECT LTrim(RTrim(@csv))
END

RETURN
END
The above can be manipulated according to individual need as each entry may need to be more than a 100 character string or you may have a delimiter greater than one character. The key to this solution is the simplicity achieved through use of recursion. Instead of trying to parse through each position that delimiter could be in string from beginning to end, simply keep taking left portion off and passing right portion for further processing.

To take a string and get distinct values, the UNION ALL statement can be changed to UNION or simply select distinct in usage of function which is pretty straight forward as well.

SELECT DISTINCT entry FROM dbo.split('my;delimited;text;to;be;parsed', ';')
As you can see, the string doesn't have to be comma delimited either.

Of course this solution is very basic, but hopefully demonstrates some of the power of using user defined functions and recursion as a part of your SQL toolkit. Well keep learning and keep smiling.

1 comment:

theblackknight said...

A quick alteration if you cannot use recursion would be to change if statement to a WHILE loop and instead of recursively calling split function:

SET @csv = RIGHT(@csv, LEN(@csv) - @commaindex)
SET @commaindex = CHARINDEX(@delim, @csv)

The else portion of the if would just be a standalone statement after the while loop to add the last bit or add entire string if not delimited in the first place.

Happy coding!